cold plover Feb 20, 2026, 8:01 PM

#

or you take several instances into account before you take a step so is that what stochastic is doing? while regular does the former?

serene scaffold Feb 20, 2026, 8:02 PM

#

gradient descent is like climbing down a hill with a blindfold on. it's "stochastic" when you feel around a few spots with your foot before deciding which way to go.

#

@tidal bough do you like my analogy or no
it's okay if you don't

#

I barely get to think about neural networks anymore now that everything is """""agentic"""""

cold plover Feb 20, 2026, 8:04 PM

#

ah so regular is just taking the trip on a guess and then finding out that there could've been a more efficient route? vs stochastic is taking one step, looking around for the next like downhill slope and going that way?

serene scaffold Feb 20, 2026, 8:05 PM

#

also it occurs to me that my definition of "not stochastic" might be inverted

tidal bough Feb 20, 2026, 8:05 PM

#

It's easy to show that if you take the limit as the learning rate approaches zero, the average direction of the SGD gradient will match the true gradient (and the expectation value of the SGD gradient just is the true gradient, always). But as for why it works in practice for sizable learning rates... One way to explain that is that SGD doesn't go along the direction of the true gradient, it randomly deviates from it. The thing is that, for tasks with many local optima, like training neural networks with backpropagation, SGD typically works better than true gradient descent because of this. The randomness makes it better at global optimization, the same way many global search algorithms involve randomness.

#

i think I remember reading some blogpost that showed a cool result relating SGD to a more complex algorithm but I'm not sure how to find it...

cold plover Feb 20, 2026, 8:07 PM

#

right so instead of assumed/ill informed planned routing, improvised routing?

Like choosing a route on google maps that appears to have less traffic but then adhoc taking another exit/road to avoid the evident traffic and arriving at the same destination?

tidal bough Feb 20, 2026, 8:08 PM

#

Sure, that's maybe a decent analogy

cold plover Feb 20, 2026, 8:08 PM

#

i know stat quest is a really simplified version but this still confuddles me.

#

the way he explained regular descent didn't go back to the initial guesses either, the steps built up on top of each other?

tidal bough Feb 20, 2026, 8:12 PM

#

tidal bough i think I remember reading some blogpost that showed a cool result relating SGD ...

I may have been thinking of https://distill.pub/2017/momentum/ , though it only covers SGD briefly and mostly talks about momentum

Distill

Why Momentum Really Works

We often think of optimization with momentum as a ball rolling down a hill. This isn't wrong, but there is much more to the story.

tidal bough Feb 20, 2026, 8:15 PM

#

cold plover the way he explained regular descent didn't go back to the initial guesses eithe...

Consider taking that dataset, and plotting the MSE as a function of slope and intercept. That's the error function you're trying to minimize, and GD and SGD both do this by "walking" this landscape in small steps, evaluating the gradient at every location. That's maybe more understandable than looking at the dataset itself in (x,y) space with the curve shown.
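A minimal sketch of that picture, assuming a made-up dataset (`y = 2x + 1` plus noise) and an arbitrary learning rate; none of these numbers come from the video. `mse` is the height of the landscape at a given (slope, intercept), and full-batch GD walks downhill on it using every data point at each step.

```python
import numpy as np

# Hypothetical toy dataset: y = 2x + 1 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=200)

def mse(slope, intercept):
    """Height of the loss landscape at one (slope, intercept) location."""
    return float(np.mean((slope * x + intercept - y) ** 2))

def mse_gradient(slope, intercept):
    """Gradient of the MSE w.r.t. (slope, intercept), computed on ALL points."""
    residual = slope * x + intercept - y
    return 2.0 * np.mean(residual * x), 2.0 * np.mean(residual)

slope, intercept = 0.0, 0.0      # starting location on the landscape
learning_rate = 0.1              # assumed value, small enough to be stable here
start_loss = mse(slope, intercept)

for _ in range(500):
    g_slope, g_intercept = mse_gradient(slope, intercept)
    slope -= learning_rate * g_slope          # step downhill along each axis
    intercept -= learning_rate * g_intercept

final_loss = mse(slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} loss={final_loss:.4f}")
```

Each step starts from wherever the previous step ended, which is why the steps build on top of each other rather than going back to the initial guess.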

main fox Feb 20, 2026, 8:15 PM

#

tidal bough I may have been thinking of https://distill.pub/2017/momentum/ , though it only ...

Were you thinking of NAdam optimizer?

tidal bough Feb 20, 2026, 8:16 PM

#

I don't think so

cold plover Feb 20, 2026, 8:16 PM

#

cold plover i know stat quest is a really simplified version but this still confuddles me.

so in regular descent, the derivative of the loss function was calculated with ALL data in mind. a guess was taken for slope of loss function and intercept, multiplied into the learning rate and rinse and repeat until the loss function is close to zero.

i can understand how that would be a lot of computation for a LOT of data points for the parameters.

whereas stochastic takes only one data point for the parameters, finds a line of best fit, and outputs the parameters for step 1. then the subsequent steps take the previous parameter outputs, multiply by learning rate, and do the same thing which leads to a more efficient path to local optima?

#

i can sort of understand it as: each data point being allowed to pull the line of best fit toward itself for SGD until it closely fits in an arrangement where all data points are "happy", whereas RGD just shotguns a guess for slope for all data points and keeps adjusting that until "happy"

#

so i can see why SGD would be faster.

tidal bough Feb 20, 2026, 8:20 PM

#

cold plover so in regular descent, the derivative of the loss function was calculated with A...

Sure, note that for SGD you're effectively doing much less computation per step, so you can afford many more steps.
E.g. you have a dataset of N=10**6 points. You could do one step of GD, computing the gradient on the entire dataset. Or you could do SGD on minibatches of m=1000 samples each. 1000 samples is enough that the gradient computed this way would be a pretty good approximation of the true one, and for the same amount of compute, you'll be able to do N/m = 1000 steps, instead of one.
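A rough sketch of that budget comparison, using N and m from the message. The dataset (`y = 2x + 1` plus noise) and the learning rate are assumptions made up for the illustration. Both runs get one pass over the data: GD spends it on a single step, minibatch SGD on N // m steps.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 10**6, 1000                       # dataset size and minibatch size
x = rng.uniform(-1.0, 1.0, size=N)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=N)   # hypothetical data

def gradient(params, xb, yb):
    """MSE gradient w.r.t. (slope, intercept) on the given batch."""
    residual = params[0] * xb + params[1] - yb
    return np.array([2.0 * np.mean(residual * xb), 2.0 * np.mean(residual)])

def loss(params):
    """MSE over the whole dataset, used only for reporting."""
    return float(np.mean((params[0] * x + params[1] - y) ** 2))

learning_rate = 0.1                      # assumed value

# Budget: one pass over the data. Full-batch GD gets a single step.
gd_params = np.zeros(2)
gd_params -= learning_rate * gradient(gd_params, x, y)

# The same budget buys N // m minibatch steps.
sgd_params = np.zeros(2)
order = rng.permutation(N)               # visit every point once, in random order
n_steps = N // m
for step in range(n_steps):
    batch = order[step * m:(step + 1) * m]
    sgd_params -= learning_rate * gradient(sgd_params, x[batch], y[batch])

gd_loss, sgd_loss = loss(gd_params), loss(sgd_params)
print(f"GD  after 1 step: loss={gd_loss:.4f}")
print(f"SGD after {n_steps} steps: loss={sgd_loss:.4f}")
```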

cold plover Feb 20, 2026, 8:20 PM

#

so a compromise of optimization between accuracy and computational power

tidal bough Feb 20, 2026, 8:21 PM

#

tidal bough Sure, note that for SGD you're effectively doing much less computation per step,...

(you could spend some of that speedup on lowering the learning rate for SGD to make it more like GD, but that's not necessarily what you want, because of that consideration I mentioned where randomness can improve convergence)

cold plover Feb 20, 2026, 8:21 PM

#

i feel like my interpretation here makes sense, could you correct me if I am wrong?

#

sorta like throwing one thing out of balance in the favor of another until you get close as opposed to finding the best arrangement for the initial data point and then optimizing off that to get a close arrangement that works for all.

tidal bough Feb 20, 2026, 8:24 PM

#

Yeah, that describes SGD, sure

#

However I should note that... 2d dataset fitting is a toy problem, and for it, SGD is, I think, objectively not at all better than GD.

#

because the loss function here is smooth and convex. GD would just go directly to the minimum, whereas SGD would wander a little. SGD has pretty much no advantage here, so trying to understand why SGD can be good by only studying this task won't go well.

cold plover Feb 20, 2026, 8:26 PM

#

tidal bough because the loss function here is smooth and convex. GD would just go directly t...

this was the only video I could find that didn't give an unnecessarily collegiate, detailed explanation without any actual conceptual explanation.

#

like it didn't just spout variables and calculus but rather explained the underlying reasoning/process.

#

but thanks for the concise explanations whilst i stumbled through my thought process!

#

ill be back with more questions...soonish.

iron basalt Feb 20, 2026, 9:16 PM

#

cold plover so in regular descent, the derivative of the loss function was calculated with A...

SGD wiggles around which lets it generalize better since it will jump out of steep local minima holes. Not having to go over the whole dataset matters for performance reasons. In practice batches are used in deep learning which is kind of in between the two, and this is done there because doing just one sample at a time would be slow as computers want to work on chunks of things at a time. So the batches act as chunks for better performance.

#

In non-toy, non-convex problems SGD is used for these properties.

cold plover Feb 20, 2026, 9:35 PM

#

gotcha

cold plover Feb 20, 2026, 9:54 PM

#

how does one determine what kind of cross validation method to use based on the amount of training data?

#

for example if it goes 50,100,200,400,500,1000 etc?

dreamy solstice Feb 20, 2026, 10:04 PM

#

gm

Please, is there a paper or an article that explains how word embedding captures meaning from training? I recently finished learning linear and logistic regression and multiclass with softmax, so I'm planning on building a sentiment analysis. I'm planning on Word2vec embedding. Training numerical data is simpler because your X is the input data, but from what I've learned so far, the linear model takes the parameter h as an input which is the avg of all the vectors of each word in a sentence.

And it trains and trains and changes the word vectors (The beginning of the confusion), how will the changing of the vectors make the model understand the words? Since the input is a mean vector and to find the h gradient, you won't use the parameters of the h you first passed.

I know I can use packages for embedding, but it's something I want to write from scratch, so if any paper or blog can help, cause I don't even think I understand what I'm trying to ask any longer.

Thank you.

cold plover Feb 20, 2026, 10:13 PM

#

@tidal bough is there a ML course you would recommend?

#

my current class is an elective that is...poorly taught IMO. would like something more clarity and structure.

serene scaffold Feb 20, 2026, 10:25 PM

#

dreamy solstice gm Please, is there a paper or an article that explains how word embedding capt...

They only represent meaning in the sense that a given word's vector is expected to be closer to vectors for words with similar meanings
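To make "closer" concrete, here is a tiny sketch using cosine similarity. The three vectors are invented by hand for illustration; they are not trained embeddings, and real ones have many more dimensions.

```python
import numpy as np

# Made-up 3-d vectors for illustration only; real embeddings are learned.
vectors = {
    "good":  np.array([0.9, 0.1, 0.0]),
    "great": np.array([0.8, 0.2, 0.1]),
    "awful": np.array([-0.7, 0.1, 0.3]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, -1 = opposite."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_good_great = cosine_similarity(vectors["good"], vectors["great"])
sim_good_awful = cosine_similarity(vectors["good"], vectors["awful"])
print(f"good vs great: {sim_good_great:.3f}")
print(f"good vs awful: {sim_good_awful:.3f}")
```

Training nudges the vectors so that words which help predict the same labels or contexts end up pointing in similar directions.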

half pulsar Feb 21, 2026, 4:35 AM

#

cold plover my current class is an elective that is...poorly taught IMO. would like somethin...

Practice more math, it helps build a better mental model for that kind of stuff

#

A good way to start is to take a specific algorithm and work through it end-to-end, from the derivation of the equations to a full implementation in code. Translating the math into something executable builds intuition much faster than passively watching lectures imo.

final kiln Feb 21, 2026, 7:06 AM

#

seems like they finally cracking arc agi

fringe thicket Feb 21, 2026, 7:09 PM

#

Hi

#

Guys I Just completed Python And I want To enter in data science field I don't know what to do Now
Can Anyone help me ?

serene scaffold Feb 21, 2026, 7:20 PM

#

fringe thicket Guys I Just completed Python And I want To enter in data science field I don't k...

What formal education do you have and what country are you in?

rare bane Feb 21, 2026, 7:33 PM

#

Wait can we put links to GitHub repos for data science projects in this channel?

serene scaffold Feb 21, 2026, 7:33 PM

#

rare bane Wait can we put links to GitHub repos for data science projects in this channel?

There's #1468524576479641744

rare bane Feb 21, 2026, 7:34 PM

#

serene scaffold There's <#1468524576479641744>

Ok then noted and thanks

#

For the more data science inclined python devs, when working with model predictions, is it like a procedural thing, or you just have to work an entire new brand of logic to get what you want

For example: if I'm working with a small dataset I usually use a linear regression model and then load the dataset, clean/wrangle the dataset, then select my features and my target, run some metric scores and visualize. And I go about it that way almost every time. Is that the standard case? I know different datasets and features to predict require nuance and different predictive models as it's not a one size fits all scenario, so I'm just asking if it's as procedural in data science as when you're making an omelette, where you know exactly what to do and the process doesn't change

sullen urchin Feb 21, 2026, 7:45 PM

#

Is there anyone who is looking for a dev?

elfin stratus Feb 21, 2026, 8:47 PM

#

Hey guys! I got very interested in coding and especially data science in the past year. I learnt python pretty decently and started learning other tools and libraries with kaggle.

I am ambitious but the path is unclear. I would be happy to get a little clarification about the best way to build out decent data science skills. Like a roadmap.

Thanks in advance

main fox Feb 21, 2026, 9:05 PM

#

rare bane For the more data science inclined python devs, when working with model predict...

There are well described patterns/workflows for standard ML work, yes

Logically, you:

Understand the ask, and determine an approach

Identify relevant data sources, if you're working with more than one source of data

Understand the data, clean and reshape for your purposes

And determine if you need an ML model

If so, establish a baseline, and note general transformations that you may need (OHE, missingness, standardization)

Test different models and feature engineering steps

See if performance and complexity are justified

Determine deployment strategy

Like you noted, these steps are not all-encompassing, and every problem has its own details
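A small sketch of the "establish a baseline" step, on a hypothetical synthetic dataset. The baseline here is "always predict the training mean" and the candidate is ordinary least squares; both choices are assumptions for the illustration, not something prescribed above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dataset: 3 features, a linear signal plus noise.
X = rng.normal(size=(500, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=500)

# Hold out the last 100 rows for evaluation.
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

# Baseline: always predict the mean of the training targets.
baseline_mse = mse(np.full_like(y_test, y_train.mean()), y_test)

# Candidate: ordinary least squares with an intercept column.
A_train = np.column_stack([X_train, np.ones(len(X_train))])
A_test = np.column_stack([X_test, np.ones(len(X_test))])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
model_mse = mse(A_test @ coef, y_test)

print(f"baseline MSE={baseline_mse:.3f}  model MSE={model_mse:.3f}")
```

If a more complex model barely beats the baseline, that is a hint the extra complexity may not be justified.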

rare bane Feb 21, 2026, 9:08 PM

#

Huh, so basically if the output is looking somewhat reasonable for a complex dataset, after using multiple models, then that is probably ok?

Well I appreciate the detailed answer if anything. T for thanks

main fox Feb 21, 2026, 9:10 PM

#

rare bane Huh, so basically if the output is looking somewhat reasonable for a complex dat...

If you've tested various models and feature engineering steps, then you may determine that, with the current dataset, you've reached a performance limit

#

You'll make the call if performance is acceptable

rare bane Feb 21, 2026, 9:11 PM

#

main fox You'll make the call if performance is acceptable

I guess that is why it is data science. It's still in a discovery phase of sorts

main fox Feb 21, 2026, 9:12 PM

#

Usually within industry you can tie e.g. dollar amounts to the events you're trying to predict to help you determine if the model is worth implementing

proven pier Feb 21, 2026, 9:54 PM

#

This may potentially violate rule 9, but I think I'd like to look into potential options for learning AI by paying for a curriculum or some sort of teaching service. - I suppose I'd say, not asking for anybody here to do it specifically, but perhaps somebody here knows of some reputable sources for such things? I think what I need is a good curriculum to follow then I could be more confident in the directions I'm taking

serene scaffold Feb 21, 2026, 10:10 PM

#

proven pier This may potentially violate rule 9, but I think I'd like to look into potential...

You can ask about suggestions for learning resources that one has to pay for, as long as you don't try to solicit an exchange of money between people on this server.

#

!resources data science

arctic wedgeBOT Feb 21, 2026, 10:11 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

umbral dove Feb 21, 2026, 11:25 PM

#

https://theconversation.com/openai-has-deleted-the-word-safely-from-its-mission-and-its-new-structure-is-a-test-for-whether-ai-serves-society-or-shareholders-274467

The Conversation

OpenAI has deleted the word ‘safely’ from its mission – and i...

OpenAI’s restructuring may serve as a test case for how society oversees the work of organizations with the potential to both provide benefits and harm humanity.

ocean hinge Feb 22, 2026, 3:08 AM

#

Hello can anyone look into my issue on python-help?

serene scaffold Feb 22, 2026, 3:10 AM

#

ocean hinge Hello can anyone look into my issue on python-help?

you could at least link it

ocean hinge Feb 22, 2026, 3:23 AM

#

serene scaffold you could at least link it

Like this?

https://discord.com/channels/267624335836053506/1474955204314267649

jaunty helm Feb 22, 2026, 4:38 AM

#

rare bane For the more data science inclined python devs, when working with model predict...

> you know exactly what to do and the process doesn't change
generally each dataset has its own quirks you have to work with, so nah
I mean as a very exaggerated example - would you treat a tabular dataset the same way you do an image dataset? I hope not

#

it's also seldom 'procedural' as in one-and-done, after a model is trained you inspect its predictions and if it's not good enough, go to one of the previous many steps to try and improve it

copper kindle Feb 22, 2026, 10:37 AM

#

I just can't get over this concept. Easy to grasp W_(target node, source node) and easy to compute Wx+b.
But I've seen some other examples online that take into account W_(source node, target node) and then transpose the weight matrix before the dot product. Why would we need to do that? Is that a special case based on the input's dimension or something else?

final kiln Feb 22, 2026, 10:47 AM

#

copper kindle I just can't get over this concept. Easy to grasp **W_(target node, source node)...

if I had to guess, it just depends on how the author chose to construct the weight matrix in the first place and has no additional meaning

#

in the second image the weights of each "neuron" are organized in column vectors instead of rows
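A quick sketch showing that the two layouts compute the same thing. The shapes and values are arbitrary, and the variable names are made up for the example. For reference, PyTorch's `nn.Linear` stores its weight with shape `(out_features, in_features)`, while Keras's `Dense` stores its kernel with shape `(input_dim, units)`, so both conventions show up in popular libraries.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 3, 2
x = rng.normal(size=n_in)       # one input vector
b = rng.normal(size=n_out)      # one bias per output neuron

# Layout A: W[target, source], shape (n_out, n_in).
# Each ROW holds the weights of one neuron.
W_ts = rng.normal(size=(n_out, n_in))
out_a = W_ts @ x + b

# Layout B: W[source, target], shape (n_in, n_out).
# Each COLUMN holds the weights of one neuron.
W_st = W_ts.T
out_b = W_st.T @ x + b          # transpose back, then multiply as before
out_c = x @ W_st + b            # or treat x as a row vector and skip the transpose

print(np.allclose(out_a, out_b), np.allclose(out_a, out_c))
```

So the transpose is bookkeeping for how the matrix was stored, not a different computation.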

#

there's stuff about row vs column vectors which may be meaningful if you're in some advanced text, but I haven't seen it a lot outside of research papers

https://en.wikipedia.org/wiki/Dual_space

but yea not worth digging into this if you're just learning grad descent and other basics

Dual space

In mathematics, any vector space V has a corresponding dual vector space (or just dual space for short) consisting of all linear forms on V, together with the vector space structure of pointwise addition and scala...

copper kindle Feb 22, 2026, 10:59 AM

#

final kiln if I had to guess, it just depends on the author chose to construct the weight m...

I'll stick with W_(target, source). It's in accordance with matrix multiplication rules. Same approach is used in PyTorch and Keras. The other example is just verbose in my opinion. You take the transpose again, that's an example for maybe a teaching exercise or something.

copper kindle Feb 22, 2026, 11:00 AM

#

final kiln there's stuff about line vs column vectors which may be meaningful if you're in ...

I've got experience in DL, ADL. I was just wondering the "WHY" of doing it that way.

final kiln Feb 22, 2026, 11:01 AM

#

copper kindle I'll stick with W_(target, source). its in accordance with matrix multiplication...

yea I agree, this is what makes sense and what is done by convention as far as I know

final kiln Feb 22, 2026, 11:03 AM

#

copper kindle I've got experience in DL, ADL. I was just wondering the "WHY" of doing it that ...

authors preferences, context of the text, or plain random chance somehow

copper kindle Feb 22, 2026, 11:04 AM

#

final kiln there's stuff about line vs column vectors which may be meaningful if you're in ...

This was spot on. The transpose comes from the dual space view, I'll stick with the primal view and ctrl + alt + del that I ever saw the dual space way of doing neural networks lol.

#

@final kiln Thanks.

final kiln Feb 22, 2026, 11:42 AM

#

any time

fringe thicket Feb 22, 2026, 1:48 PM

#

serene scaffold What formal education do you have and what country are you in?

Currently I'm doing BCA computer science And I'm from India

wet dome Feb 22, 2026, 5:12 PM

#

What are the most common data visualisation tools? Is it power bi and tableau?
If I was to learn one what should i choose?

waxen kindle Feb 22, 2026, 8:59 PM

#

They are very similar to one another

#

The free alternatives are too

rich moth Feb 23, 2026, 2:57 AM

#

Has anyone played around with WorkClaw?

serene scaffold Feb 23, 2026, 3:25 AM

#

!warn @hardy fractal your message was removed for listing a job, which is not allowed.

arctic wedgeBOT Feb 23, 2026, 3:25 AM

#

:incoming_envelope: :ok_hand: applied warning to @hardy fractal.

hardy fractal Feb 23, 2026, 3:28 AM

#

serene scaffold !warn <@1436030369261682861> your message was removed for listing a job, which i...

Ok, I am sorry.

#

What should I do for that?

obsidian talon Feb 23, 2026, 3:50 AM

#

wet dome What are the most common data visualisation tools? Is it power bi and tableau? I...

Coding or non coding?

#

Tableau, Power BI, Looker Studio are the top 3 most common, but there are others

#

Programming languages can make data visualizations too though

serene scaffold Feb 23, 2026, 4:38 AM

#

hardy fractal What should I do for that?

go to a hiring website

rich moth Feb 23, 2026, 7:45 AM

#

I built a "brain" map of my AI system. Just wanted to see the inner workings a bit, but it will be really helpful for debugging. Since its initial origin point it's built and connected these nodes over roughly 2 weeks; it's organizing its thoughts. But so far so good, no hairball. It's currently integrating itself into WorkClaw, it's really fun to watch it just go. I don't even watch TV anymore.

#

its pretty incredible what some of these local models can do these days

rare bane Feb 23, 2026, 9:23 AM

#

wet dome What are the most common data visualisation tools? Is it power bi and tableau? I...

Yes and even excel, however, I really dislike having to choose one as an ultimatum. It's like going to a party with a random assortment of food and deciding you'll only take one item of food, despite no explicit instruction saying you should do so. So go ahead and have fun with both

glass temple Feb 23, 2026, 12:24 PM

#

does SMOTE work with high dimensional sparse matrices like TF-IDF matrices?

#

If not, then should I reduce the dimension of the TF-IDF matrix by limiting max_features, or use Truncated SVD to reduce the dimensions? What's a good rule of thumb to the max dimensions to use for SMOTE?

wooden sail Feb 23, 2026, 12:32 PM

#

glass temple If not, then should I reduce the dimension of the TF-IDF matrix by limiting max_...

neither smote nor svd work for sparse data

#

general linear combinations aren't sparsity-preserving

glass temple Feb 23, 2026, 12:34 PM

#

I'm guessing there's no over sampling method for text data that uses just ML then?

full granite Feb 23, 2026, 4:46 PM

#

I’m not sure if this is something I should be asking here, but do you think that having written my own programs to conduct materials analysis during university and graduate school could be considered one of my strengths when it comes to job hunting?

final kiln Feb 23, 2026, 5:04 PM

#

yes

prime cliff Feb 23, 2026, 5:08 PM

#

Hlo

unkempt apex Feb 23, 2026, 5:31 PM

#

final kiln yes

how's health?

final kiln Feb 23, 2026, 6:21 PM

#

unkempt apex how's health?

better

waxen kindle Feb 23, 2026, 6:49 PM

#

full granite I’m not sure if this is something I should be asking here, but do you think that...

Absolutely

wind breach Feb 23, 2026, 7:20 PM

#

guys, is there an idea to have tokens that only the LLM can generate and that are not displayed to the user but are used for reasoning (not just reasoning tokens, but tokens that are not in the dataset, not in human texts, not mapped to real text, only unique id tokens)?

daring matrix Feb 23, 2026, 7:52 PM

#

any bioinfos

fallow coyote Feb 23, 2026, 8:46 PM

#

I'm thinking about doing a little programming project for my workplace where I'll make a program that can automatically detect and measure wear on a tool. My company is a tooling company and when I've been tasked to record tool wear for a project, it's a tedious task: taking a photo with a specialised microscope cam, positioning the tool and then trying to get a decent enough measurement. I want to first automate identifying significant tool wear and obtaining the measurement for it. How do I go about it?

rich moth Feb 24, 2026, 4:48 AM

#

wind breach guys, is there that idea to have tokens that only the LLM can generate and they ...

I feel like you're describing coconut by meta? https://arxiv.org/pdf/2412.06769

jaunty helm Feb 24, 2026, 8:20 AM

#

wind breach guys, is there that idea to have tokens that only the LLM can generate and they ...

is your emphasis on hiding stuff from the user?
if so just host your own chat interface that the users must use to interact with your model

torn hill Feb 24, 2026, 10:07 AM

#

Hi Guys

I have been experimenting with sentence relevancy for the past few weeks.

So I made Scout , its an experimental attention model that slightly modifies the standard Transformer attention architecture to learn directional relevance between sentences instead of tokens. Instead of asking "are these similar?", it asks "does sentence B actually help sentence A?"

Still a small model trained on ~4,500 synthetic pairs. The deeper question that I am trying to answer is whether attention mechanics can encode functional utility rather than just contextual compatibility. But early results are interesting.

Do check it out and tell me what you think

https://github.com/samyak112/Scout

GitHub

GitHub - samyak112/Scout: Modified transformer that learns directio...

Modified transformer that learns directional information gain between sentences. - samyak112/Scout

wind breach Feb 24, 2026, 12:30 PM

#

jaunty helm is your emphasis on hiding stuff from the user? if so just host your own chat in...

no, I think just tokens would push model to be more token-eff

jaunty helm Feb 24, 2026, 12:33 PM

#

wind breach no, I think just tokens would push model to be more token-eff

then the other person linked you something relevant I think
a quick search also found this survey on latent reasoning which may or may not be of interest

arXiv.org

A Survey on Latent Reasoning

Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, especially when guided by explicit chain-of-thought (CoT) reasoning that verbalizes intermediate steps. While CoT improves both interpretability and accuracy, its dependence on natural language reasoning limits the model's expressive bandwidth. Latent reasoning tac...

wind breach Feb 24, 2026, 12:34 PM

#

jaunty helm then the other person linked you something relevant I think a quick search also ...

that's something different, what I suggest is to add tokens to the vocab that can only be generated by the LLM and are not present in any datasets

#

okay, not only that but also force LLM to use only them for reasoning by showing to the user all tokens except these

wind breach Feb 24, 2026, 12:36 PM

#

rich moth I feel like you're describing coconut by meta? https://arxiv.org/pdf/2412.06769

latent reasoning as I just said is something different

jaunty helm Feb 24, 2026, 1:00 PM

#

wind breach okay, not only that but also force LLM to use only them for reasoning by showing...

can you explain how that will work exactly? if you want:

'reasoning' with values not in the vocabulary
humans can't see it, the model 'reasons internally'
how is that different than the normal no reasoning mode, where magical stuff happens internally in the llm, with intermediate vectors not corresponding to any particular word in the vocabulary, and also humans can't really 'see' it

wind breach Feb 24, 2026, 3:32 PM

#

> how is that different than the normal no reasoning mode, where magical stuff happens internally in the llm, with intermediate vectors not corresponding to any particular word in the vocabulary, and also humans can't really 'see' it

allows to save progress between forward passes

#

normal non-reasoning LLMs can only "think" inside a single forward pass, which is extremely limited

jaunty helm Feb 24, 2026, 4:12 PM

#

wind breach > how is that different than the normal no reasoning mode, where magical stuff h...

so you want stateful llms? a model where some internal state is modified based on input, and said state also affects outputs?
actually before transformers stateful models were more prominent, like recurrent models, but they largely fell out of favor due to being hard to train/scale

#

you can check RWKV which is a novel architecture that's been devved for some years now (though to no larger scale or adoption), I think it has some close ideas

wind breach Feb 24, 2026, 4:30 PM

#

jaunty helm so you want stateful llms? a model where some internal state is modified based o...

no more stateful than they currently are

jaunty helm Feb 24, 2026, 4:32 PM

#

wind breach no more stateful than they currently are

elaborate?

wind breach Feb 24, 2026, 4:33 PM

#

jaunty helm elaborate?

My idea doesn't add any new state

#

the only state is previous tokens

#

and it is used anyway

#

actually, I wouldn't call it state, that's just part of LLM input (LLMs are auto-regressive)
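A toy sketch of the idea as described so far, assuming a scripted stand-in for the model: `toy_forward`, the vocabulary, and the reserved ids are all hypothetical, and nothing here is a real LLM. The point it illustrates is that scratch tokens are fed back into the context like any other token, so they carry information from one forward pass to the next, while only text-mapped tokens are shown to the user.

```python
# Toy sketch: the "model" is a scripted stand-in, not a real LLM.
TEXT_VOCAB = {0: "the", 1: "answer", 2: "is", 3: "42", 4: "<eos>"}
SCRATCH_IDS = {100, 101}      # hypothetical ids with no text mapping
EOS_ID = 4

def toy_forward(context):
    """One 'forward pass': map the full token context to the next token id.

    Scripted for illustration; a real model would compute this from the context.
    """
    script = [100, 101, 0, 1, 100, 2, 3, EOS_ID]
    return script[len(context)]

context = []                  # the only thing carried between forward passes
while not context or context[-1] != EOS_ID:
    # Scratch tokens are appended to the context just like text tokens.
    context.append(toy_forward(context))

# Only tokens with a text mapping (and not <eos>) are displayed.
visible = " ".join(
    TEXT_VOCAB[t] for t in context if t not in SCRATCH_IDS and t != EOS_ID
)
print(context)
print(visible)
```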

jaunty helm Feb 24, 2026, 4:38 PM

#

wind breach the only state is previous tokens

and these 'tokens' are invisible to the outside?
how would these invisible tokens be meaningfully different from internal states?

wind breach Feb 24, 2026, 7:55 PM

#

jaunty helm and these 'tokens' are invisible to the outside? how would these invisible token...

internal states reset after each forward pass

#

one forward pass is only one token

wind breach Feb 24, 2026, 7:56 PM

#

jaunty helm and these 'tokens' are invisible to the outside? how would these invisible token...

not necessary, but they should not be mapped to some text

jaunty helm Feb 25, 2026, 1:11 AM

#

wind breach not necessary, but they should not be mapped to some text

any data can be mapped to text; like if you opened a png in notepad you'd see text characters

#

I feel like I'm just trying and failing to guess what you're actually trying to describe
can you like say exactly what you mean and how it's different from existing architectures

empty dragon Feb 25, 2026, 4:50 AM

#

Ok 👍

pliant steppe Feb 25, 2026, 9:30 AM

#

anyone got experience with gpus and OOM errors? i'm doing a project where we have to teach a neural network to play from a baseline of data and then improve from selfplay (baseline is generated by a MCTS agent and then a PUCT agent trains on it and starts the selfplay phase). the problem is that when i got to training the neural net i set a batch size of 64 for the data and it keeps causing an OOM error. i've only managed to train the first model on 10k games from the MCTS agent, then i generated another 10k games using the PUCT agent but now when i go to train the 2nd neural net on all the previous data it keeps causing an OOM error, i'm not sure why

bronze wyvern Feb 25, 2026, 2:12 PM

#

Hello, quick question, our lecturer told us to split our data from the very beginning and to always work on the train data, does anyone have any idea why? E.g. what I'm used to doing is preprocessing everything and only splitting the data at the very end.

if the test data is unprocessed and we test our algorithm, say linear regression, is that a problem?

jaunty helm Feb 25, 2026, 2:23 PM

#

bronze wyvern Hello, quick question, our lecturer told us to split our data from the very begi...

specifically, anything that needs to be 'learned' - you can only learn it from training data
this ranges from the obvious, like your linear regression model, to something not as obvious, such as StandardScaler needing to learn the mean and the std of the dataset - you can only use the training set to find the mean and std

bronze wyvern Feb 25, 2026, 2:24 PM

#

when using standardScaler for e.g, this normally should be applied on whole dataset, no? Like to scale everything to a specific range?

jaunty helm Feb 25, 2026, 2:25 PM

#

bronze wyvern when using standardScaler for e.g, this normally should be applied on whole data...

it's in 2 steps
you should scaler.fit() only on the training set, but then you can do scaler.transform() on whatever

#

so e.g. you may do

```py
train_X, test_X = train_test_split(X)
scaler = StandardScaler()
train_X = scaler.fit_transform(train_X)
test_X = scaler.transform(test_X)
... # more processing
```

however you can not do

```py
scaler = StandardScaler()
X = scaler.fit_transform(X)
train_X, test_X = train_test_split(X)
```
because your scaler had the information from the test set to learn a better mean and std from, your model will seem better than it is on true, unseen data

#

I recommend using a sklearn.pipeline if you have a lot of steps

scikit-learn

sklearn.pipeline

Utilities to build a composite estimator as a chain of transforms and estimators. User guide. See the Pipelines and composite estimators section for further details.
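
a minimal sketch of what that could look like, assuming a made-up numeric dataset (the synthetic `X` and `y`, the coefficients, and the choice of `LinearRegression` are placeholders for illustration, not the housing data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not a real dataset)
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

train_X, test_X, train_y, test_y = train_test_split(X, y, random_state=0)

# fit() runs the scaler's fit_transform and then the model's fit,
# both on the training data only
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(train_X, train_y)

# score() applies scaler.transform (with the training mean/std) before predicting
test_r2 = model.score(test_X, test_y)
scaler_mean = model.named_steps["standardscaler"].mean_
print(test_r2)
```

the nice part is that the fit/transform split is handled for you: calling `fit` on the pipeline never touches the test rows, so it's harder to leak by accident.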

bronze wyvern Feb 25, 2026, 2:31 PM

#

jaunty helm so e.g. you may do ```py train_X, test_X = train_test_split(X) scaler = Standard...

oh ok, I see

jaunty helm Feb 25, 2026, 2:31 PM

#

also the above extends to manual processing as well
e.g. for some unsplit numerical dataset `arr` you cannot do:

```py
arr_standardized = (arr - arr.mean()) / arr.std()
... # split later
```
because again the `.mean()` and `.std()` used information from what should be the test set
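
a leak-free version of the same manual standardization, as a rough sketch (the array contents and the plain 80/20 cut are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.normal(loc=10.0, scale=3.0, size=100)  # made-up data

# Split first (a plain 80/20 cut here; shuffle beforehand if the order matters)
train, test = arr[:80], arr[80:]

# Learn the statistics from the training part only
train_mean, train_std = train.mean(), train.std()

# Apply the same training statistics to both parts
train_standardized = (train - train_mean) / train_std
test_standardized = (test - train_mean) / train_std
```

the test part will usually not come out with mean exactly 0 and std exactly 1, and that's expected, since it's being measured against the training statistics.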

bronze wyvern Feb 25, 2026, 2:31 PM

#

jaunty helm I recommend using a [`sklearn.pipeline`](https://scikit-learn.org/stable/api/skl...

it's for the California housing prices dataset, it's not really a big dataset with lots of steps, more a dataset to learn the fundamentals

#

yep I see, I will read a bit on what you mentioned and come back but I understand the gist of it; we want the test data to be truly unseen

jaunty helm Feb 25, 2026, 2:35 PM

#

bronze wyvern yep I see, I will read a bit on what you mentioned and come back but I understan...

yeah
again, anything that requires stuff to be learned from data, you must do after splitting
there are other operations that don't, in which case you can do it whenever
for example, you often train the model on the log of house prices (then exp the model predictions back to get the actual predictions), because house prices are usually very skewed and taking the log can make them fall into a nicer distribution
since log doesn't need to know anything about the dataset, you can do y = log(y) before splitting it into train_y and test_y no problem
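
a small sketch of that log-then-exp round trip, assuming one made-up feature and a synthetic skewed price (the numbers and the straight-line fit via `np.polyfit` are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 5.0, size=200)  # one made-up feature
# Synthetic, heavily skewed target (an assumption, not real prices)
price = np.exp(0.8 * x + 10.0 + rng.normal(scale=0.05, size=200))

# log is a fixed function: nothing is learned from the data,
# so applying it before the split does not leak anything
y = np.log(price)

train_x, test_x = x[:150], x[150:]
train_y, test_y = y[:150], y[150:]

# Fit a straight line to log-price using the training part only
slope, intercept = np.polyfit(train_x, train_y, deg=1)

# Predict in log space, then exp back to get actual prices
pred_price = np.exp(slope * test_x + intercept)
```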

bronze wyvern Feb 25, 2026, 2:37 PM

#

oh ok

#

I will dig a bit into what you mentioned, read up on it, and come back. Thanks for the insights

bronze wyvern Feb 25, 2026, 3:09 PM

#

yeah I have a better understanding now, thanks !

limber plover Feb 25, 2026, 4:47 PM

#

Can I share an idea I had about an AI system I was building, and ask how sci-fi it is or whether it could work?

#

Anyone here?

serene scaffold Feb 25, 2026, 4:48 PM

#

limber plover Can I share my idea I had about Ai system I was building and how scifi is this o...

no need to wait for permission.

limber plover Feb 25, 2026, 4:49 PM

#

I just want to be courteous to the topic at hand they are having and not spam.

#

A long while ago I was doing research and I had this idea, though I don't know how practical it would be to implement.

Parallel Forest: A Mesh Network-Enabled Model for Diverse Task Management
Overview of Model Architecture
The model utilizes a unique architecture that combines forest clusters with a mesh network to efficiently handle a diverse set of tasks. Each cluster, resembling a revolver drum, contains dense trees arranged in a matrix formation. This matrix arrangement enables parallel processing and efficient computation within each cluster, allowing for simultaneous data processing and pattern recognition.
In addition, the forests (Random) are interconnected through a mesh network, which provides redundancy and fault tolerance. This interconnected nature ensures that the model can continue functioning even if one forest goes down, as data can be rerouted and processed through alternative paths, maintaining the overall functionality of the model. This scalability allows the Random forest to expand based on the tasks given.
By leveraging the matrix formation for parallel processing and the mesh network for fault tolerance and redundancy, the model aims to achieve robustness, efficiency, and continuous operation across a wide range of tasks.

This is a somewhat general overview.

serene scaffold Feb 25, 2026, 4:52 PM

#

limber plover A long while I was doing research and I had this idea, though I don't know how p...

can you give an example of something a model of this architecture could be trained to do?
also why would one forest go down?

limber plover Feb 25, 2026, 4:54 PM

#

I have more of this, but I did work on some code. I started out with a basic binary classification problem, valid versus not valid words, which I was training a NN for, though it also had decision trees. When I say go down, I mean a power outage or a problem with a node. If I remember right, I simply wanted a system using a mesh network for certain ML tasks, so that if, for example, one node is down it switches to another. Sorry, this was A LONG while ago.

#

I don't have an example, as this was a road map for me to start on, but I never did get properly started on it. The idea is basically multi-core processing: one cluster would have just math, so when math is needed that cluster would be activated FOR that task, and it would then revolve again for, say, language processing. Sort of like an LLM with a multi-hat prompt, except each node does ONE task only. The master controller then manages which node is used, acting as input and output. I have more of this but I am afraid to spam it.

#

I unfortunately have no math and no code at the moment that I can find. However here is more info on this.

#

Decision Tree Details
The dense trees arranged in a matrix formation within each forest would appear as a grid or array of interconnected nodes, representing the individual decision trees. Each row in the matrix could correspond to a specific feature or attribute, while each column could represent a different split or decision point within the tree. The nodes within the matrix would be interconnected to facilitate parallel processing and information exchange, allowing for efficient computation and pattern recognition within each cluster. This matrix arrangement enables simultaneous data processing and collaborative learning among the dense trees, contributing to the model's overall performance and robustness.
Just for the decision tree.

#

Basically, if I remember right, LLMs have a hard time with math as they are mostly language models. Suppose you want a math problem solved: the language node and math node would talk and get you the solver for the problem at hand. I used a matrix because it's the best for math processing. At the time I was working on a chip called MALU (matrix arithmetic logic unit). I thought I came up with it, but apparently this has existed since the 70s or earlier.

#

sorry not sure what is going on and double ttt textt, my keyboard being wireless is having problems just a sec.

glacial socket Feb 25, 2026, 5:06 PM

#

Hi!

So I am new to Python. My teacher told us that the second chance at the exam is in May/June. I suddenly got a notification and now our exam is on Friday. It's a multiple choice exam: you pick which code is right. For some questions you need to fill in the blanks, and some are math Python stuff. I have had a rough half year and I had to love across the country and I really don't need a good grade. I just need to pass. He said everything is allowed except talking to people and AI. Do you guys have any tips for me? It would really help me, I need all the tips I can get, even the basic ones. I am not good at coding and our teacher has been very absent. The whole class is failing and we know almost nothing.

#

had to move across*

limber plover Feb 25, 2026, 5:08 PM

#

eden you have question to help you with?

glacial socket Feb 25, 2026, 5:10 PM

#

Not really a question, I am sorry! More like, do you know any websites that could help me during the exam? Something like that. He said we can use the internet but not AI

limber plover Feb 25, 2026, 5:11 PM

#

Right, to help you with what exactly? I am not sure what will be on your exam. Do you have notes that tell you what you should study?

glacial socket Feb 25, 2026, 5:12 PM

#

I have python for beginners. Not anything difficult. Why the exam is hard, is because the exam is full of very long code, and we need to either fill in or to pick between code lines. Give me 1 minute and I will show you how far we have come into python

limber plover Feb 25, 2026, 5:14 PM

#

Ok. Otherwise this reads as if.."my teacher told me something to do with the room and he wants it to look certain way, but I am not sure what to do about the room, we have been doing stuff in the room but he said we cannot talk to people or use ai to do something with the room....

glacial socket Feb 25, 2026, 5:14 PM

#

Oh ok! Well I am sorry, never mind just forget it

limber plover Feb 25, 2026, 5:15 PM

#

I am not discouraging you, I am trying to get you to ask me or us for specifics. What was your topic about?

#

Oh good, now I am the bad guy...

#

@glacial socket Still there?

main fox Feb 25, 2026, 5:26 PM

#

glacial socket Hi! So I am new to python. My teacher told us that the second chance at exam is...

Check the syllabus / exam guide if you've been given one and look up the topics.
If you like books, Automate the Boring Stuff is decent and free. If you like videos, I think FreeCodeCamp has a Python intro.

#

Since you're asking in the data science channel, let us know if your exam is specifically about data science content

limber plover Feb 25, 2026, 5:27 PM

#

Maybe I am just bad with people. Thanks Twiibz

#

That said, is anyone interested in chiming in about what I wrote regarding my road map for making this system? For example, I can see maybe doing small nodes for certain tests. Perhaps making a binary decision tree but breaking up how it is accessed, with a master controller managing it. Perhaps have a node that just stores words, and then have a binary decision tree tell you whether said words are valid or not, for example if you ask whether ABC or bob is valid.

#

@serene scaffold Any input on this topic?

#

@serene scaffold Do you have no opinion on this? Should I move this to some other place for talks? I am open to any ideas or questions.

serene scaffold Feb 25, 2026, 5:37 PM

#

limber plover @serene scaffold Do you have no opinon on this? Should I move this to some ...

I'm working right now and can't get into the level of depth that your topic requires
this is the best channel on this server for your topic.

#

https://discord.gg/FwEw8PKX

limber plover Feb 25, 2026, 5:37 PM

#

@serene scaffold Ah ok my apology

#

Does not look like they are active there. Maybe I will come back this evening.

half pulsar Feb 25, 2026, 6:08 PM

#

serene scaffold https://discord.gg/FwEw8PKX

That server sucks absolute shit

#

Instant banned for starting a talk in General chat

#

Redditor level mods

limber plover Feb 25, 2026, 6:08 PM

#

What the hell?

#

I was having a conversation about my idea I had while back and I got banned for it, never told why until I had to ask.

half pulsar Feb 25, 2026, 6:09 PM

#

limber plover I was having a conversation about my idea I had while back and I got banned for ...

You were pasting LLM Text

#

Clear as day

limber plover Feb 25, 2026, 6:09 PM

#

LLM text? you mean REGULAR text? what is LLM text?

half pulsar Feb 25, 2026, 6:10 PM

#

limber plover LLM text? you mean REGULAR text? what is LLM text?

Just don't talk to me you just got me banned for even being near you

limber plover Feb 25, 2026, 6:10 PM

#

I never used GPT to think out my thoughts, Now I was pasting large block of text, BUT ASKED if could.

half pulsar Feb 25, 2026, 6:10 PM

#

Unlike you I actually care, build and have had a passion for it.

limber plover Feb 25, 2026, 6:10 PM

#

Holy hell, how childish.

#

I was A asking if this something possible and B have a conversation about it. What did I have to gain by using GPT for this?

half pulsar Feb 25, 2026, 6:11 PM

#

Whatever. That was on you. Don't copy and paste ChatGPT shit.

limber plover Feb 25, 2026, 6:11 PM

#

ITS NOT GPT MY god

#

I typed this out

half pulsar Feb 25, 2026, 6:12 PM

#

I got banned in Affiliation

half pulsar Feb 25, 2026, 6:12 PM

#

limber plover I typed this out

BS it was a wall of text now don't argue no more

limber plover Feb 25, 2026, 6:12 PM

#

YES it was a wall of text, I ASKED IF I CAN DO THIS. You said sure.

#

I could have done it line by line why would I though?

#

I was getting my thoughts out. What a weird reaction.

half pulsar Feb 25, 2026, 6:13 PM

#

I said SURE cause I just was looking for conversation in something I'm passionate about, I was literally about to tell you to stop the LLM Spam.

#

It was already getting on my Nerves just type it out next time.

limber plover Feb 25, 2026, 6:14 PM

#

Ok, lets do this, what part of it was LLM specifically.

#

The idea of multi core system for specific tasks?

half pulsar Feb 25, 2026, 6:15 PM

#

How about we start off where we left off, It was about KGs

limber plover Feb 25, 2026, 6:15 PM

#

Ok tell me more about KGs

half pulsar Feb 25, 2026, 6:15 PM

#

No copy and paste walls of text.

#

Just regular talk, so have you built anything?

limber plover Feb 25, 2026, 6:15 PM

#

All right, I won't do that, though I was not trying to spam. I actually asked if this was ok, and it seemed like it was.

#

I did build something, though I no longer have the code. I started out, AS I was saying, with a simple binary decision tree classifier, if I remember right, for valid and not valid words. I was training it to look at input like "dgo" and "dog" and tell me if it was valid or not, and I was going to try to make it into my multi-core system. I abandoned this idea.

half pulsar Feb 25, 2026, 6:17 PM

#

limber plover All right I wont do that, though I was not trying to spam. I actualy asked if th...

Just a reminder put yourself in their shoes if you were running a server about AI, you'd be quickly flooded with just bots, No tolerance for that shit in the AI communities, no second chances in servers like that when you first join.

limber plover Feb 25, 2026, 6:18 PM

#

Fine, I am no longer interested to ever be near that server. I was suggested to go to it for deeper talks about it.

#

That was all, since the person who I was having a conversation with did not have the time for it.

#

I continued spamming text to get my full thought out. I could type it out but that would be really strange. The mod who handled this could have reached out and talked to me but nope.

#

Fine by me. Never using that server.

half pulsar Feb 25, 2026, 6:19 PM

#

limber plover I did build something though I no longer have the code, I started out, AS I was ...

Build it again trust me you'd keep building it over and over for years realizing how hard it is, The main issue is stability at scale when we're talking KG

limber plover Feb 25, 2026, 6:21 PM

#

RIGHT, that is why I stopped, as it would require rather a lot of work and processing. That said, it would be interesting to implement it on a smaller scale, like using simple binary classification and having nodes communicate in a mesh network doing specific tasks. I did not say it was a good idea, I was curious how practical it was, I even asked this.

#

I cannot now tell if anyone was even reading anything about by in between inputs.

half pulsar Feb 25, 2026, 6:22 PM

#

If I can do it you probably can

#

Like I said if you want to build it you're gonna need to learn a lot more about "experimental" equations for AI

#

If you're not good at math it's going to be a struggle.

limber plover Feb 25, 2026, 6:24 PM

#

Tell me more about your system you built

half pulsar Feb 25, 2026, 6:26 PM

#

limber plover Tell me more about your system you built

It’s a governed hybrid system focused on determinism and bounded reasoning at scale.

#

Telling anymore than that would be revealing

limber plover Feb 25, 2026, 6:30 PM

#

Yes, I did not know you can make a hybrid system that is why I wanted to talk about it, what did yours solve when you built it?

#

Sorry, I am having a conversation with the mod right now and they are REALLY weird. I honestly cannot tell whether I am talking to an AI.

#

The one that banned us..

half pulsar Feb 25, 2026, 6:30 PM

#

I think that's as much as I'm going to share so don't expect much more cause anything else would be systems level revealing.

limber plover Feb 25, 2026, 6:31 PM

#

Are the nodes hardware or did you make software? I was thinking the cluster would be servers...more or less doing certain task IF I was to scale this.

#

I mostly started with emergent behavior systems .

half pulsar Feb 25, 2026, 6:32 PM

#

limber plover Are the nodes hardware or did you make software? I was thinking the cluster woul...

I mean you can network it.

#

Okay enough is enough 🤷‍♂️

#

And if you wanted to see a visualization of it(limited though cause it'd crash)

limber plover Feb 25, 2026, 6:36 PM

#

That is EXACTLY how I saw it in my head, though I also viewed it from a top view

half pulsar Feb 25, 2026, 6:36 PM

#

That's a limited view of it

#

https://klipy.com/gifs/math-graph

Klipy

Rotating 3D Math Graph

#

Looks more like that

limber plover Feb 25, 2026, 6:37 PM

#

To give you an idea of similar to what I had was something like this processing https://www.parallax.com/propeller-multicore-concept/

Parallax

Propeller Multicore Concept - Parallax

The Propeller family of microcontrollers are designed to perform multiple tasks simultaneously, without the need for interrupts or the dictates of an onboard

#

honestly that is incredible

half pulsar Feb 25, 2026, 6:39 PM

#

Thanks

limber plover Feb 25, 2026, 6:44 PM

#

Man, I am telling you the mod that banned us is REALLY odd, I feel like they are an AI. We were talking about how I knew a lot of the stuff I talked about and they said "we like cs50p". WHAT?

#

It's some Harvard course but it's not relevant to the topic...

half pulsar Feb 25, 2026, 6:44 PM

#

limber plover Man, I am telling you the mod that banned us is REALLY odd, I feel like they are...

Doesn't matter to me anyways better if I'm not on there

limber plover Feb 25, 2026, 6:47 PM

#

Yeah dude, I don't know, really odd. Anyway, stay around, I do want to talk more about it. Let me ask this: was the terminology I used in my "spam" wrong? Was there something I missed or said the wrong way?

half pulsar Feb 25, 2026, 6:48 PM

#

limber plover Yeah dude I don't know really odd, anyway stay around I do want to talk more bou...

Let me be real with you here I'm done talking about KG stuff, for me I don't want to say something revealing.

limber plover Feb 25, 2026, 6:49 PM

#

Ah ok your own private project then?

half pulsar Feb 25, 2026, 6:49 PM

#

That's as much as I'll say aloud publicly. Yeah

half pulsar Feb 25, 2026, 6:49 PM

#

limber plover Ah ok your own private project then?

KGs are a hot topic right now

#

If I truly had something stable at scale and everything I said it was I'm sitting on Gold.

#

If you're interested in KGs then start now

#

If you just want a hobby project just take it easy

limber plover Feb 25, 2026, 6:53 PM

#

All right, anyway sorry for getting you banned but that was really odd.

limber plover Feb 25, 2026, 7:13 PM

#

@half pulsar Hey I found this, this seems like what you were talking about https://pmc.ncbi.nlm.nih.gov/articles/PMC11316662/?

PubMed Central (PMC)

Knowledge-graph-based explainable AI: A systematic review

In recent years, knowledge graphs (KGs) have been widely applied in various domains for different purposes. The semantic model of KGs can represent knowledge through a hierarchical structure based on classes of entities, their properties, and their ...

bronze wyvern Feb 25, 2026, 7:40 PM

#

hello, quick question, when we normalize our independent variables, say for a linear regression algorithm, do we need to normalize the dependent variable also, that is the target variable?

limber plover Feb 25, 2026, 7:52 PM

#

@thin sky Can we have a conversation? I am really confused what you do not get? I am actually really curious about this. Perhaps call?

rich moth Feb 25, 2026, 8:32 PM

#

half pulsar KGs are a hot topic right now

What did you make? I made a KG too.

limber plover Feb 25, 2026, 8:34 PM

#

Yeah I know nothing about KGs I just got introduced into them.

#

I presented my idea and turns out the concept I have is similar to what I had in mind. I have not built anything yet. Long road there.

half pulsar Feb 25, 2026, 8:35 PM

#

rich moth What did you make? I made a KG too.

Now scale it to 500 million nodes. In Seconds. No LLM.

limber plover Feb 25, 2026, 8:38 PM

#

OK this may sound dumb but are you guys interested in having this compressed into smaller nodes?

#

Could one use matrix for this? I am trying to remember this again.

rich moth Feb 25, 2026, 8:40 PM

#

half pulsar Now scale it to 500 million nodes. In Seconds. No LLM.

I'm using Postgres + pgvector with a typed graph schema . the data model already maps to what enterprise kg systems use. but the main thing i havent done yet is flip postgres from mirror to primary store.

half pulsar Feb 25, 2026, 8:40 PM

#

limber plover Could one use matrix for this? I am trying to remember this again.

No

limber plover Feb 25, 2026, 8:41 PM

#

@half pulsar Is there no way to do this

```py
# Compress ONE text into 2x2 matrix, then reconstruct
text = "HELLO"

# Split & store in matrix (list of dicts)
matrix = [
    [{"char": "H", "pos": 0}, {"char": "E", "pos": 1}],
    [{"char": "L", "pos": 2}, {"char": "LO", "pos": 3}]
]

# RECONSTRUCT (concatenate)
original = ''.join(cell["char"] for row in matrix for cell in row)
print(f"Original: '{original}'")  # "HELLO"
```
but on larger scale and with nodes?

#

I am thinking node in hardware terms here as a sort of another brain sort of like that one chip.

half pulsar Feb 25, 2026, 8:45 PM

#

rich moth I'm using Postgres + pgvector with a typed graph schema . the data model already...

Pretty common stack

limber plover Feb 25, 2026, 8:48 PM

#

Sorry, can't a matrix be used for storing data and then concatenating it back out? If so, can't you use that for maybe not the node but the data it has?

half pulsar Feb 25, 2026, 8:49 PM

#

limber plover Sorry, can't matrix used for storing data if yes and then concatenate out? If so...

It wouldn't hold up in scale.

limber plover Feb 25, 2026, 8:49 PM

#

Would that be just too much to compute?

half pulsar Feb 25, 2026, 8:50 PM

#

Not only just that I wouldn't expect it to even be stable.

limber plover Feb 25, 2026, 8:50 PM

#

Why?

#

I thought matrices are designed for fast processing, specifically of numerical data, no?

half pulsar Feb 25, 2026, 8:51 PM

#

There is a lot more to it than just trying to store information in a huge "matrix".

#

And why would you store something useless like Hello.

limber plover Feb 25, 2026, 8:52 PM

#

The hello was an example

half pulsar Feb 25, 2026, 8:52 PM

#

Get better examples, it'd help you build a better structure, you need a goal.

limber plover Feb 25, 2026, 8:52 PM

#

I also thought it's breaking it up so that it's not processing everything at once but in pieces

#

Though I guess the overhead of concatenating it back out would be the problem. Is that why it would not scale?

half pulsar Feb 25, 2026, 8:53 PM

#

Yeah overhead would be the biggest problem there.

limber plover Feb 25, 2026, 8:54 PM

#

Yeah, I remember trying to do something with compression and even trying to compress all the indices that have to go on to under what you compressed was a problem.

#

This is partly why I left the project, I simply did not have the time to learn the exact math for it.

half pulsar Feb 25, 2026, 8:54 PM

#

You need an algorithm.

limber plover Feb 25, 2026, 8:55 PM

#

That is true but what algo would be good to work with that?

half pulsar Feb 25, 2026, 8:56 PM

#

I'm not telling you that XD

limber plover Feb 25, 2026, 8:56 PM

#

So need to make a custom one then?

half pulsar Feb 25, 2026, 8:56 PM

#

Pretty much

limber plover Feb 25, 2026, 8:56 PM

#

Interesting...

half pulsar Feb 25, 2026, 8:56 PM

#

Think in layers here

limber plover Feb 25, 2026, 8:57 PM

#

Yeah..but for matrix would it not be for x and y and z? You would basically need to make a cube?

#

If you layer it, that is what it would become yes?

half pulsar Feb 25, 2026, 8:58 PM

#

These are questions you can find answers for online, there's a lot of research papers on that type of stuff.

limber plover Feb 25, 2026, 8:59 PM

#

I have read them but a long while ago, I am just remembering some stuff I read a long time ago.

half pulsar Feb 25, 2026, 8:59 PM

#

limber plover I have read them but a long while ago, I am just remembering some stuff I read a...

Yeah this stuff has been around since the beginning

limber plover Feb 25, 2026, 8:59 PM

#

Layering would be an algo you would have to use.

#

I am just thinking out loud... it would be best if I was in a voice chat group about this.

#

I have no experience with this or not much of it, but I am just thinking this out visually what it would look like and what you could possibly use for it.

half pulsar Feb 25, 2026, 9:02 PM

#

The problem is that you're overthinking the wrong thing. It's not as simple as "how do I store information", it's more "how can I retain that information at scale, with stability, and do it efficiently". It takes many "Layers" to get to that point, and even then you'd hit a wall of complexity creep, every time.

half pulsar Feb 25, 2026, 9:03 PM

#

limber plover I have no experience with this or not much of it, but I am just thinking this ou...

It has to be thought of as Layers and Systems, its pure math here, if you need to I highly recommend refreshing math from the basics, Its about Repetition here, once you know the basics of math you can build up to actually implementing it into structured code.

#

Nobody is good at math, You need to practice regularly.

limber plover Feb 25, 2026, 9:05 PM

#

Ok, I am reading this again as it's been a while. Now humor me, as I am using an arithmetic logic unit for this that would be implemented in hardware form, something like: "The core of the MAC is the Matrix Arithmetic Logic Unit (MALU). The overall functionality of the MALU is to perform the matrix operations and write the output to memory." Could one not do something like this same idea for nodes?

half pulsar Feb 25, 2026, 9:06 PM

#

limber plover Ok I am reading this again as its been a while not humor me as I am using arithm...

Once again overhead, tell me what do you want to do clearly here?

limber plover Feb 25, 2026, 9:06 PM

#

What I have saved was this link https://www.ece.ualberta.ca/~elliott/ee552/projects/1998f/matrix_calculator/MALU.htm now I get that this is NOT quite the same thing..but the concept seems like you could do it...I mean high level abstract computation is just down to binary anyway...

half pulsar Feb 25, 2026, 9:06 PM

#

Like give me a demo of what your project would do. In text

limber plover Feb 25, 2026, 9:07 PM

#

The link I gave you is what I made a while back, but using Verilog, I think that is the name... implementing a chip. For a 16 bit system.

#

I wanted this because an ALU would be much faster FOR numbers specifically. I then wondered if one can use this for higher levels. Mostly I wanted to see if one can fit ML into a 16 bit system and what it can do.

#

I never did finish 16bit computer as I got distracted thinking about this...then Ai.. My mind loves to move fast..a lot

half pulsar Feb 25, 2026, 9:10 PM

#

limber plover I never did finish 16bit computer as I got distracted thinking about this...then...

I'm so confused right now.

#

So you're building a 16 bit computer?

limber plover Feb 25, 2026, 9:10 PM

#

Yeah sorry topic jump, its my history why I am so hung up on matrix.

half pulsar Feb 25, 2026, 9:11 PM

#

That makes more sense now personally I don't see the value in proceeding with that.

limber plover Feb 25, 2026, 9:11 PM

#

I was building it, I have a lot of ideas like these as I am expressing and I get half way through them and then leave them....

half pulsar Feb 25, 2026, 9:11 PM

#

Like if you want to build AI you don't need to build a 16 bit computer

#

Talking about two different things here KGs and 16 bit computing. Its having me confused

limber plover Feb 25, 2026, 9:12 PM

#

No, but I do want to build one since I would know exactly what it's doing. I would then try to fit a FORM of LM into this system to see how strained it could get. Perhaps it would then need more RAM, but I'm curious what sort of algo you would use to compress it further down.

#

Think of it this way: you can clearly implement a KG ON a computer, yes?

#

How low of bits can we make this to fit it.

#

Maybe even just one node.

half pulsar Feb 25, 2026, 9:13 PM

#

I don't see the benefit

limber plover Feb 25, 2026, 9:14 PM

#

Think of the moon lander...

#

They had to make do with very little, yet they were able to get A LOT done. Now what sort of algo did they have to use? It's impressive. Same idea, though we are NOT landing on the moon.

half pulsar Feb 25, 2026, 9:15 PM

#

Giving me 80s/90s vibings NOT a good thing

limber plover Feb 25, 2026, 9:16 PM

#

I mean, this is what Nvidia is doing: they are able to have large processing tasks done on a chip.

#

You can implement your KGs, I bet, much better on their chips.

#

I am sort of thinking along those lines, though I thought a matrix would be involved

#

Anyway, I digress....

#

My main question is, HOW small can you make the nodes to do what you need them to fit on embedded systems.

#

Why not? A node does not have to be with KGs..it could be some other processing node for say classification problem yes?

half pulsar Feb 25, 2026, 9:20 PM

#

Even me telling you my project does hundreds of nodes in seconds is revealing

limber plover Feb 25, 2026, 9:20 PM

#

Hundreds of nodes in seconds, on how large of a system, and how much RAM do you need?

#

You don't have to answer that, but suppose you do it with some other simple problem

#

96gb is a lot of ram, though context here is how much processing are you doing with data, so it must be impressive, now assume you try to do this with even less ram. Would you think you could?

half pulsar Feb 25, 2026, 9:22 PM

#

Answering that would tell you the magic behind the curtain

limber plover Feb 25, 2026, 9:22 PM

#

Is that the active project then you are working on?

#

Ah see, we think alike. I am just approaching it from another way. Clearly you cannot out-engineer complexity at scale; eventually you do need to add more RAM.

#

So I guess I am not crazy or stupid in this yeah?

#

Can you tell me what task you are trying to solve with AI?

#

Ok let me ask this, can you access your nodes procedurally?

#

Meaning it only accesses said nodes within a certain given task, and when it's done it does not proceed with the data until it needs more?

#

Not sure if I made that right.

#

No...hold on let me think on this;...

lunar heart Feb 25, 2026, 9:27 PM

#

hi

limber plover Feb 25, 2026, 9:28 PM

#

You have a node, and its doing some data or needs to access some data yes?

#

Ok wait, could you build nodes procedurally instead of by hand? if you know what I mean...

#

I feel like that is something you would want....

#

If I need more tasks done I would extend nodes but I have an algo that just builds them as needed yes?

half pulsar Feb 25, 2026, 9:30 PM

#

I'm not going to describe the architecture

limber plover Feb 25, 2026, 9:30 PM

#

Sure, you don't have to, but you get the idea of what I am saying. Am I on the right track in thinking that, generally?

#

I am not looking at your project or needing some info from yours, I am just reasoning all of this out, I am not even doing any math about it lol

#

The problem is the overhead of having to build these nodes as needed. The processing would be hard, so I guess you would have to control rendering for said task. I could think of using a piecewise function for this, where it predicts future use needs. Or some algo along those lines.

half pulsar Feb 25, 2026, 9:33 PM

#

Like this is as much as I'll let myself say aloud, You just gotta take it and make what you want from it

limber plover Feb 25, 2026, 9:33 PM

#

This seems strange that I am actually that close to your project surely this all seems BS to you?

half pulsar Feb 25, 2026, 9:34 PM

#

No you're way off-track but thats fine don't try to do what I did, it's up to you to build whats right.

limber plover Feb 25, 2026, 9:34 PM

#

Oh ok good

half pulsar Feb 25, 2026, 9:35 PM

#

Don't try to follow others don't poison yourself, Just like I said the answers become clear when you know the math

limber plover Feb 25, 2026, 9:35 PM

#

Are you afraid that if you correct me it will leak how you are thinking, so I might start developing along your lines?

half pulsar Feb 25, 2026, 9:36 PM

#

I'm not going to tell you what ways or any other pointers, except for scale and stability. watch for complexity creep.

#

That's the only thing you should focus on after learning the math behind it

limber plover Feb 25, 2026, 9:36 PM

#

What math would I need to know?

#

So linear algebra I suppose

half pulsar Feb 25, 2026, 9:37 PM

#

limber plover Are you afraid that if you correct me it will leak out on how you are thinking s...

No, because everything I've said is just so vague and high level and you can make any thing out of it.

limber plover Feb 25, 2026, 9:38 PM

#

Let me ask this: can you make procedural nodes? Suppose I do not care about processing for now.

#

Were you using some other architecture besides KGs. Meaning did you start the project with that in mind or was there some other you made before but did not work so you rebuilt it?

#

Anyway, I am actually not going to pursue this, its interesting to think about. I am just going to do some more python programming for whatever ideas I have, like procedural nodes that interests me now.

#

This is actually how I got my project here done, I was thinking this much and just implemented it, it worked but not sure how efficient it was, though I am not really good at programming.

half pulsar Feb 25, 2026, 9:43 PM

#

This is something I've been working on since 2014. I don't expect people to try and pursue it.

limber plover Feb 25, 2026, 9:43 PM

#

Wow a long time.

#

Congrats

#

Anyway, good luck, I am out for now.

keen wind Feb 25, 2026, 10:49 PM

#

is it possible to convert chemical structures and their information into vectors for machine learning
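
One very simple way to turn a chemical formula into a fixed-length vector is to count atoms per element. This is only a minimal sketch: the `ELEMENTS` vocabulary and the `formula_to_vector` helper are made up for illustration, and the approach throws away the actual structure (isomers with the same formula get the same vector). Real pipelines usually rely on cheminformatics toolkits for fingerprints or graph encodings.

```python
import re

ELEMENTS = ["C", "H", "N", "O"]  # assumption: a tiny fixed element vocabulary


def formula_to_vector(formula):
    """Count atoms of each element in a simple formula such as 'C2H6O'."""
    counts = dict.fromkeys(ELEMENTS, 0)
    # Each match is (element symbol, optional count), e.g. ("C", "2") or ("O", "").
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol in counts:  # elements outside the vocabulary are ignored
            counts[symbol] += int(number) if number else 1
    return [counts[e] for e in ELEMENTS]


print(formula_to_vector("C2H6O"))      # [2, 6, 0, 1]
print(formula_to_vector("C8H10N4O2"))  # [8, 10, 4, 2]
```

The resulting lists can be stacked into a feature matrix for any standard model.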

rare bane Feb 25, 2026, 10:50 PM

#

umbral dove https://theconversation.com/openai-has-deleted-the-word-safely-from-its-mission-...

With regards to this, Anthropic has followed suit and removed "safely" from its mission statements as well, and the current chatter is that this is due to the upper hierarchy of Anthropic frowning at the usage of their AI in military operations. Of course they use Grok and OpenAI's ChatGPT, but it seems Claude appears to be a cut above the rest. And with Pete Hegseth leaning towards labelling Anthropic as a "supply chain risk", I guess it was bound to happen
https://www.linkedin.com/news/story/anthropic-shifts-stance-on-ai-safety-7047916

Anthropic shifts stance on AI safety | LinkedIn

The AI firm has revised its safety commitments as competition intensifies among major industry players.

rich moth Feb 25, 2026, 11:15 PM

#

rare bane With regards to this anthropic has followed suit and removed safely from its mis...

Its a sad day, all integrity out the window with corporate America and the Trump admin.

lime grove Feb 26, 2026, 3:36 AM

#

The military would demand this regardless of who's the President.

rich moth Feb 26, 2026, 3:56 AM

#

Like a parade?

#

I get it, the USA and CHINA in a new space race, AI. Everything the admin does is political. But its not presidential the way its going down.

#

Theres no normalizing this dude.

#

anyways! im done

lime grove Feb 26, 2026, 4:36 AM

#

It's just the nature of the Pentagon. I'm not saying I agree with this.

#

Consider nukes, for instance. There's no guarantee that they will not use them at some point, and any external control over their ability to use them would be unthinkable to them.
This whole topic is f****d

mossy blaze Feb 26, 2026, 7:12 AM

#

Currently 21/400 ARC training tasks solved with unified approach! https://github.com/Julien-Livet/aicpp

Artificial intelligence with a network of connected neurons - Julien-Livet/aicpp

urban heart Feb 26, 2026, 9:32 AM

#

I'm running a flow for my chess model where I let codex: 1. validate lichess datasets (using python-chess) 2. upload to hf 3. start a runpod to train the model, 4. get it back 5. start and run the model with a bot account to compete against other bots on lichess. (all this done by codex running commands, scripts, and following my specs)

Anyone interested in giving me some pointers or talking about this?

The model I'm training is a phase-aware (early game, mid game, late game) LSTM next-move model, trained on elite game PGNs (from lichess) capped at 4 random moves in the game.

smoky robin Feb 26, 2026, 10:55 AM

#

guys i need help with something

#

I am building an NER model for the NCBI Disease Corpus. My text abstracts (the input sequences) are variable in length. I plan to use an LSTM for this task initially and I am using TensorFlow as the framework. Problem is:

How do I handle the variable-length input sequences?

gritty vessel Feb 26, 2026, 10:59 AM

#

smoky robin I am building a NER for NCBI disease Corpus. my text abstract or the input seque...

You can pad the variable length to maxlength seq?
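
A minimal sketch of that padding step in plain NumPy, assuming token id 0 is reserved for padding (`PAD_ID` and `pad_sequences` here are illustrative names, not a library API). It also returns a boolean mask so the padded positions can be ignored later.

```python
import numpy as np

PAD_ID = 0  # assumption: id 0 is reserved for padding


def pad_sequences(seqs, pad_id=PAD_ID):
    """Right-pad lists of token ids to the length of the longest one."""
    max_len = max(len(s) for s in seqs)
    padded = np.full((len(seqs), max_len), pad_id, dtype=np.int64)
    mask = np.zeros((len(seqs), max_len), dtype=bool)
    for i, s in enumerate(seqs):
        padded[i, :len(s)] = s
        mask[i, :len(s)] = True  # True marks real tokens, False marks padding
    return padded, mask


batch = [[5, 8, 2], [7, 3], [4, 9, 6, 1, 2]]
padded, mask = pad_sequences(batch)
print(padded)
print(mask.sum(axis=1))  # real lengths per row
```

In Keras, `pad_sequences` together with `Embedding(mask_zero=True)` should cover the same ground, but check the docs for the version you have installed.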

smoky robin Feb 26, 2026, 11:00 AM

#

gritty vessel You can pad the variable length to maxlength seq?

ok i can give this a shot

gritty vessel Feb 26, 2026, 11:01 AM

#

smoky robin ok i can give this a shot

Also I can be wrong but you can ignore the added info(padding) during training

smoky robin Feb 26, 2026, 11:02 AM

#

ok how do i do that? isnt that a feature available in pytorch?

#

another thing, i am having issues tokenizing the sequences. i tokenized the labels but for the input i still have no idea what the right approach is
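
One common starting point for the input side is a word-level vocabulary with reserved `<PAD>` and `<UNK>` entries. The sketch below assumes whitespace tokenization and made-up example sentences; `build_vocab` and `encode` are illustrative helpers. For NER the input must be split the same way the labels were, so that every token lines up with exactly one tag.

```python
from collections import Counter

PAD, UNK = "<PAD>", "<UNK>"


def build_vocab(texts, min_count=1):
    """Map each lowercased whitespace token to an integer id."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    vocab = {PAD: 0, UNK: 1}  # ids 0 and 1 are reserved
    for tok, c in sorted(counts.items()):
        if c >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn a text into ids; unseen tokens fall back to <UNK>."""
    return [vocab.get(tok, vocab[UNK]) for tok in text.lower().split()]


# Made-up training sentences, only for illustration.
train_texts = ["colorectal cancer is a disease", "the gene is linked to cancer"]
vocab = build_vocab(train_texts)
print(encode("the rare cancer", vocab))  # [9, 1, 3]
```

Build the vocabulary from the training split only, so that validation and test words the model never saw map to `<UNK>`.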

gritty vessel Feb 26, 2026, 11:06 AM

#

I don't have any good resources for that but when I train something like convlstm or unet on images

#

Sometimes satellites don't capture data so it gives fillvalue or nans

#

I remember there is a way to ignore this fill values during training

#

I will share it as I find it

#

criterion = torch.nn.CrossEntropyLoss(ignore_index=255)

#

Something like this

smoky robin Feb 26, 2026, 11:09 AM

#

yeah i researched it a bit apparently you pad the sequence by <PAD> and then mask it so it doesn't affect the training
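
To make the masking idea concrete, here is a NumPy sketch of a cross-entropy loss that skips padded positions, which is roughly what an `ignore_index` argument does. The function name and the `-100` marker are choices made for this example (the snippet above used 255; the value is arbitrary as long as it is not a real class id).

```python
import numpy as np


def masked_cross_entropy(logits, labels, ignore_index=-100):
    """Mean cross-entropy over positions whose label is not ignore_index.

    logits: (batch, time, classes) float array
    labels: (batch, time) int array; padded positions hold ignore_index
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    keep = labels != ignore_index
    safe_labels = np.where(keep, labels, 0)  # placeholder class id, zeroed out below
    picked = np.take_along_axis(log_probs, safe_labels[..., None], axis=-1)[..., 0]
    return -(picked * keep).sum() / keep.sum()


logits = np.zeros((1, 3, 2))
logits[0, 2] = [100.0, -100.0]     # extreme scores at the padded position
labels = np.array([[0, 1, -100]])  # third step is padding
print(masked_cross_entropy(logits, labels))  # log(2): only the two real steps count
```

Whatever the model outputs at the padded step has no effect on the loss, so it contributes no gradient either.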

gritty vessel Feb 26, 2026, 11:09 AM

#

Great

smoky robin Feb 26, 2026, 11:12 AM

#

there is another thing i worked on recently but it was never cleared up. let's say i train an RNN model on patient EEG sessions. in testing i pass variable-length input, e.g. patient 1 has 5 sessions, patient 2 has 20 sessions, and so on. based on that, how do you predict the rest of the sequence?

fleet tendon Feb 26, 2026, 11:14 AM

#

What do you think is the best way to track these types of stuff?

I tried doing something like a cup tracking but the resnet18 model I tried using to track coordinates of every cup at all frames is having trouble with the occluded samples (returning coordinates of nothing because it can't see the 2 hidden cups). Because of this, the network i trained to connect the coordinates per frame is making mistakes

Resnet18 might be lacking in resolution but i can't do heavier model as this is meant to run realtime

gritty vessel Feb 26, 2026, 11:15 AM

#

smoky robin there is another thing i worked on recently but it was never cleared up. lets sa...

Maybe autoregressively? Like taking past few input x1,x2,x3 to predict x4 then x2,x3,x4 to predict x5 and so on
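
The sliding-window idea can be sketched as a rollout loop that feeds each prediction back in as input. `rollout` and `toy_model` are hypothetical names; `toy_model` is just a stand-in for a trained network and simply continues a linear trend.

```python
import numpy as np


def rollout(predict_next, history, steps, window=3):
    """Forecast `steps` values by feeding each prediction back in as input."""
    seq = list(history)
    for _ in range(steps):
        nxt = predict_next(np.asarray(seq[-window:]))  # last `window` values only
        seq.append(float(nxt))
    return seq[len(history):]


def toy_model(window):
    # Hypothetical stand-in for a trained model: extend the last difference.
    return window[-1] + (window[-1] - window[-2])


print(rollout(toy_model, [1.0, 2.0, 3.0], steps=4))  # [4.0, 5.0, 6.0, 7.0]
```

Because later steps are built on earlier predictions, errors can compound the further out the rollout goes.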

smoky robin Feb 26, 2026, 11:20 AM

#

gritty vessel Maybe autoregressively? Like taking past few input x1,x2,x3 to predict x4 then x...

so if i train for 60 session and in testing phase i pass only 20 session will the LSTM throw any kind of error for not having fixed dimension? Linear regression and XGBoost did so unless i am understanding something different here
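
On the fixed-dimension worry: a recurrent cell reuses the same weights at every time step, so the number of steps is not baked into the parameters the way the number of columns is for linear regression or XGBoost. The sketch below uses a plain tanh RNN cell in NumPy rather than a real LSTM (an LSTM adds gates but shares this property); the sizes and random weights are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden = 4, 8  # per-step feature count must stay fixed
W_x = rng.normal(scale=0.1, size=(n_features, n_hidden))
W_h = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)


def run_rnn(sequence):
    """Apply the same cell at every step; return the final hidden state."""
    h = np.zeros(n_hidden)
    for x_t in sequence:  # loop length follows the input, not the weights
        h = np.tanh(x_t @ W_x + h @ W_h + b)
    return h


long_seq = rng.normal(size=(60, n_features))   # e.g. 60 sessions
short_seq = rng.normal(size=(20, n_features))  # e.g. 20 sessions
print(run_rnn(long_seq).shape, run_rnn(short_seq).shape)  # (8,) (8,)
```

Padding only becomes necessary when sequences of different lengths are stacked into one batch tensor.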

gritty vessel Feb 26, 2026, 11:21 AM

#

I will stop commenting as I am not qualified enough to comment on this. I work on forecasting, so there I take the past 8 images and then predict the next 4 images, so it's a little bit similar but still not enough to guide you. I will let someone else take over from here

gritty vessel Feb 26, 2026, 11:21 AM

#

smoky robin so if i train for 60 session and in testing phase i pass only 20 session will th...

Sessions? What is each session made up of? I am curious

smoky robin Feb 26, 2026, 11:27 AM

#

average of a EEG recording in a single session i believe

dire surge Feb 26, 2026, 12:55 PM

#

hello I made a data cleaner program but I need someone to test it... can anyone help?
https://github.com/Mohammed-Musab/Lazy-Data-Cleaner/releases
(Note that it might be unstable since I added GUI recently)

tardy haven Feb 26, 2026, 1:01 PM

#

Hii everyone I'm new here

dire surge Feb 26, 2026, 1:04 PM

#

hello

granite zephyr Feb 26, 2026, 1:50 PM

#

This may be a stupid question, but I want to use `LinearRegression()` from sklearn and its fit function. My teacher has written for us to use `model.fit(X=x_train, y=x_train)` and not `model.fit(X=x_train, y=y_train)`. Is this a typo or am I misunderstanding something? Also, is there a reason why the MSE of `model.fit(X=x_train, y=x_train)` is 500+?

serene scaffold Feb 26, 2026, 1:55 PM

#

granite zephyr This may be a stupid question, but i want to use `LinearRegression()` from sklea...

That's definitely a typo.
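
To see why fitting `x_train` against itself cannot be right, here is a small least-squares sketch in NumPy with made-up data where the true relation is y = 3x + 2. `fit_line` is an illustrative helper, not the sklearn API.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 10, size=(50, 1))
y_train = 3.0 * x_train[:, 0] + 2.0  # made-up target: y = 3x + 2


def fit_line(X, y):
    """Ordinary least squares with an intercept; returns (slope, intercept)."""
    A = np.hstack([X, np.ones((len(X), 1))])  # add a column of ones for the bias
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1]


good = fit_line(x_train, y_train)        # recovers roughly (3.0, 2.0)
bad = fit_line(x_train, x_train[:, 0])   # roughly (1.0, 0.0): it just copies x

pred_bad = bad[0] * x_train[:, 0] + bad[1]
print(good, bad)
print(np.mean((pred_bad - y_train) ** 2))  # large error against the real target
```

A model fitted with the features as the target learns the identity mapping, so its error against the real `y` is large no matter how well the fit itself went.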

#

Do you understand x and y, and train and test?

granite zephyr Feb 26, 2026, 1:57 PM

#

Yes, also forget the question about MSE, I used the y_val set.

serene scaffold Feb 26, 2026, 1:57 PM

#

Great

granite zephyr Feb 26, 2026, 1:58 PM

#

serene scaffold Great

Thank you

west wing Feb 26, 2026, 2:42 PM

#

any one worked with equinox

serene scaffold Feb 26, 2026, 3:04 PM

#

west wing any one worked with equinox

just ask your actual question

opaque condor Feb 26, 2026, 3:34 PM

#

Has anyone ever used the hand drawn number and letter data set to generate a message using the dataset

tardy haven Feb 26, 2026, 3:40 PM

#

Guyzz listen I want to create a bot for instagram gc any body knows how to make ?? So olzz help me

#

I want to impress my crush

opaque condor Feb 26, 2026, 3:42 PM

#

tardy haven Guyzz listen I want to create a bot for instagram gc any body knows how to make ...

If you want to impress your crush be yourself you don't need some type of machine

Because people will just think it's too good to be true, and when they find out it is, the resentment and hurt are worse than if you had just been yourself

tardy haven Feb 26, 2026, 3:43 PM

#

Yuppp bro but I want to do something crazy for her

#

I want to make something by programming

opaque condor Feb 26, 2026, 3:47 PM

#

How long have you been working with python?

tardy haven Feb 26, 2026, 3:50 PM

#

opaque condor How long have you been working with python?

Just a few months 😔

opaque condor Feb 26, 2026, 3:50 PM

#

Specifically because that gives us more of an understanding of the question

#

Do you want an AI or something like a regular program?

rich moth Feb 26, 2026, 3:54 PM

#

tardy haven Just a few months 😔

You gonna let that stop you?

#

This is your crush we're talking about. Take a page out of Nikes book and just do it. 🙂

tardy haven Feb 26, 2026, 3:55 PM

#

opaque condor Specifically because that gives us more of understanding the question

Ahh got it

opaque condor Feb 26, 2026, 3:56 PM

#

I'm sorry I just need more information before I can really give a response

tardy haven Feb 26, 2026, 3:56 PM

#

opaque condor Do you want in AI or something like a regular program

AI-based is also fine, but I want it to impress her and look cute

tardy haven Feb 26, 2026, 3:57 PM

#

rich moth This is your crush we're talking about. Take a page out of Nikes book and just ...

Haha 😅 you’re right, I should just go for it

tardy haven Feb 26, 2026, 3:58 PM

#

opaque condor I'm sorry I just need more information before I can really give a response

I’ll give you more details then.

opaque condor Feb 26, 2026, 3:58 PM

#

tardy haven AI-based is also fine, but I want it to impress her and look cute

So you were thinking of making a large language model
Or an ai that can generate a poem from a image?

tardy haven Feb 26, 2026, 4:01 PM

#

I actually wanted to make a welcome bot for an Instagram group chat, but it’s quite difficult

#

But now I'm thinking of making that trending blooming-flower thing from reels for my crush and hosting it on something like Netlify

opaque condor Feb 26, 2026, 4:04 PM

#

I'm sorry I wish I could help you

#

I don't really know how to use social media I wish I could help

tardy haven Feb 26, 2026, 4:05 PM

#

It's okk broo

#

Bro, can you make something good for me on your schedule or way??

lyric vale Feb 26, 2026, 4:19 PM

#

can anyone suggest a neural network project for my resume

limber plover Feb 26, 2026, 4:28 PM

#

@lyric vale How simple or complex do you want the project? Why not make a simple binary classifier and train the NN on dictionary words, valid vs. not valid? Something like dgo and dog: one is valid and one is not. You could then extend the list to add more words. Kinda overkill using an NN, but why not.

lyric vale Feb 26, 2026, 4:29 PM

#

limber plover <@902960224044978186> How simple or complex do you want the project? Why not mak...

i just started learning 1 month ago so something not too complex would be better

limber plover Feb 26, 2026, 4:30 PM

#

@tardy haven You can make something better, if you are really into them you can simply say, here is thing I tried making, I was going to program this whole complex thing, but I am not really good at it, but I tried my best. I am rather sure they will appreciate the effort.

lyric vale Feb 26, 2026, 4:30 PM

#

limber plover <@902960224044978186> How simple or complex do you want the project? Why not mak...

right now i am working on human written text predictions which is almost over

limber plover Feb 26, 2026, 4:30 PM

#

@lyric vale Projects start with what you know. What do you know?

lyric vale Feb 26, 2026, 4:31 PM

#

limber plover <@902960224044978186> Projects starts with what you know. What do you know?

basic concepts of neural network

limber plover Feb 26, 2026, 4:31 PM

#

Predicting just text like 1 2 ..4 the what is missing is? Or something like context aware?

lyric vale Feb 26, 2026, 4:31 PM

#

lyric vale right now i am working on human written text predictions which is almost over

i implemented this without any inbuilt functions so i can understand how it works

tardy haven Feb 26, 2026, 4:32 PM

#

limber plover <@1476549841671684226> You can make something better, if you are really into the...

I had made something before and they really liked it, but now I want to do something crazy, which is beyond my limits

limber plover Feb 26, 2026, 4:32 PM

#

@lyric vale That is fine. Since you know basic things about NNs, you can just look up what a binary classification problem is. It is not any more complicated than predicting text.

lyric vale Feb 26, 2026, 4:32 PM

#

limber plover Predicting just text like 1 2 ..4 the what is missing is? Or something like cont...

just predicting 0...9 a...z A...Z

limber plover Feb 26, 2026, 4:33 PM

#

@tardy haven If it is beyond your limits then how can you possibly make it?

lyric vale Feb 26, 2026, 4:34 PM

#

lyric vale just predicting 0...9 a...z A...Z

after completing this i want to make it to smth advanced like complete page human written text to pdf

#

is it good idea or should i make smth else

limber plover Feb 26, 2026, 4:34 PM

#

@tardy haven Start with what you know and then build on it creatively. "Something amazing" is very subjective to the individual; calling it amazing means nothing beyond what you see in it.

tardy haven Feb 26, 2026, 4:35 PM

#

limber plover <@1476549841671684226> If it is beyond your limits then how can you possibly mak...

That’s why I’ve been giving my best for 2 weeks, but it’s still not working

tardy haven Feb 26, 2026, 4:36 PM

#

limber plover <@1476549841671684226> Start with what you know and then make it creatively, som...

Ahh got it , I’ll start with what I know and try to make it creative

limber plover Feb 26, 2026, 4:36 PM

#

@lyric vale Well, how close is that to what you know? Try it and see. However, binary classification problem is not much different, you are classifying two choices, is 0..9 a number yes? valid, is abc a number no? then not valid, I am being abstract here but this can help you with other topics later on.

lyric vale Feb 26, 2026, 4:36 PM

#

limber plover <@902960224044978186> Well, how close is that to what you know? Try it and see. ...

ohh got it

limber plover Feb 26, 2026, 4:37 PM

#

@lyric vale That is just one way of using an NN; you can also use it to build basic logic. Something like NAND/AND/OR/XOR gates.

lyric vale Feb 26, 2026, 4:38 PM

#

limber plover <@902960224044978186> Well, how close is that to what you know? Try it and see. ...

yes i know binary classification

lyric vale Feb 26, 2026, 4:39 PM

#

limber plover <@902960224044978186> That is just one way of using NN, you can also use it to b...

yes

limber plover Feb 26, 2026, 4:39 PM

#

Now this is simple, right? But try building an ALU just using NNs. It's a ridiculous project but an interesting exercise.

lyric vale Feb 26, 2026, 4:39 PM

#

limber plover Now this is simple right? But try building ALU just using NN. Its a ridiculous p...

arithmetic logic unit right

limber plover Feb 26, 2026, 4:39 PM

#

Yes

#

Which you can build from previous gates. You can start with XOR or use NAND. Nand is mostly used as it is a bit faster.

#

You can then rearrange NAND into any basic gate; from there you build your structure for things like dmux and mux and so on. However, in your case you are using an NN, or several of them, doing just that.
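
A small sketch of that construction: one threshold neuron acts as NAND, and every other gate is wired up from it. The weights here are hand-picked for illustration rather than learned, and all function names are made up for this example.

```python
def step_neuron(inputs, weights, bias):
    """One neuron with a hard threshold activation."""
    return int(sum(w * x for w, x in zip(weights, inputs)) + bias > 0)


def nand(a, b):
    # Hand-picked weights (an assumption, not trained): fires unless both inputs are 1.
    return step_neuron((a, b), weights=(-2, -2), bias=3)


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))  # classic four-NAND XOR


def half_adder(a, b):
    """Return (sum, carry) for two bits."""
    return xor(a, b), and_(a, b)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```

Replacing the hand-picked weights with ones found by training is the natural next step, and chaining half adders into full adders gets you toward the arithmetic part of an ALU.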

#

Its not going to be efficient, but that is not the point.

lyric vale Feb 26, 2026, 4:41 PM

#

limber plover Its not going to be efficient, but that is not the point.

yes you are right. i will try making this

#

should i make it from scratch or use inbuilt functions?

limber plover Feb 26, 2026, 4:42 PM

#

Make a simple one. You can try training a single NN on several chips or use several of them. You will eventually have to combine them, using one after the other. NN-based computers have been done before, back in 1960

#

Depends on what you are interested in. Are inbuilt functions going to abstract away too much of your learning?

lyric vale Feb 26, 2026, 4:44 PM

#

probably i should use inbuilt functions cause if i make it from scratch then it will take too much time

limber plover Feb 26, 2026, 4:46 PM

#

@lyric vale True, but you would learn more. However, if you know in general what the inbuilt functions do, then it's OK to have that abstracted for you and treat it like a black box. In computer engineering, the engineer is not really interested in exactly how the transistors are arranged inside the chip; they are only interested in what the chip is doing. You leave the rest to the hardware engineer. So you can think of it that way in this small case.

lyric vale Feb 26, 2026, 4:47 PM

#

limber plover <@902960224044978186> true but you would learn more. However, if you know in gen...

@limber plover thank you. btw where do you work and what is your role if you like to share.

limber plover Feb 26, 2026, 4:47 PM

#

@lyric vale Its hobby for me, I don't work any place, I do like collecting knowledge though.

#

Mostly I set out to learn how to learn, but real applications I leave to someone else.

lyric vale Feb 26, 2026, 4:48 PM

#

limber plover <@902960224044978186> Its hobby for me, I don't work any place, I do like collec...

wow.

limber plover Feb 26, 2026, 4:48 PM

#

I have a lot of info but very little depth

lyric vale Feb 26, 2026, 4:49 PM

#

limber plover I have a lot of info but very little depths

are you working on any project?

limber plover Feb 26, 2026, 4:50 PM

#

@lyric vale I have, yes, though I never finish them. Mostly because the current question I have about something gets answered, then I don't really continue it or have to.

#

For example, I have built an ALU before, but I never used it for anything. I got the general idea of what it was doing, but I lost interest in the rest of the 16-bit system I was using.

lyric vale Feb 26, 2026, 4:51 PM

#

nice but you should make smth that is useful to people

limber plover Feb 26, 2026, 4:52 PM

#

That is subjective. I cannot possibly know what is useful to people unless they say so. I can make something for me, that is useful and then hope someone finds it interesting.

wind breach Feb 26, 2026, 4:52 PM

#

jaunty helm I feel like I'm just trying and failing to guess what you're actually trying to ...

it is not a change in arch, only in vocab, add N (I would try 256) tokens without mapping them to text, that's all

lyric vale Feb 26, 2026, 4:53 PM

#

limber plover That is subjective. I cannot possibly know what is useful to people unless they ...

you are right

limber plover Feb 26, 2026, 4:54 PM

#

This also helps me keep my mind steady and not criticize myself too much. I can see the imperfections, like how efficient is it really to build an NN computer... but then I am too close to the subject. A layman might think it is impressive and someone might want to do something with it. Next thing you know, you are selling a product you had no idea had this sort of use. Someone found it, though.

#

Anyway, I digress try that project and see how far it goes..use inbuilt functions don't use it, who cares.

tardy haven Feb 26, 2026, 5:07 PM

#

Guyzz help me what should I do to impress my crush

limber plover Feb 26, 2026, 5:11 PM

#

That is a loaded question. Sounds like you don't know your crush well to impress her. For example does she like programming? If not then why do you want to use programming as a tool to impress. If you are telling her, look how smart I am on what I did, then that is a bit egocentric and you might have problems later down the road. Maybe just ask her out. And let her ask the questions on what you like then she might be impressed. Since you are not the one saying hey look I code.

#

Just know that this is AI and data science section. If you want advice ask general python group that is active.

tardy haven Feb 26, 2026, 5:17 PM

#

She liked it last time, that’s the thing. I made something for her and she really liked it, and now I want to do something similar again. Can you help me?

limber plover Feb 26, 2026, 5:20 PM

#

How would I know what you made? And what she liked. I have no idea what I would help you with but then ask yourself this. When you impress her, should I also step in and say oh yeah I made this as well.

#

I gave you some advice, good or bad, that is the best I can do.

tardy haven Feb 26, 2026, 5:24 PM

#

According to you, give me something nice that girls would like. I can make it in a way that she will definitely like it 100%

limber plover Feb 26, 2026, 5:26 PM

#

Sure, though you can just go to Python discussion and ask for help there. Show your previous work and ask how you can improve on it.

tardy haven Feb 26, 2026, 5:30 PM

#

My previous work wasn’t done very well, that’s why I’m asking for your help, sir

limber plover Feb 26, 2026, 5:43 PM

#

That is fine, however, this is not the right sub unless you are asking about AI and data science. Python discussion is what that is for. There are Python experts there to help, so they say, so ask them. But you have to be exact: what is your goal, what have you made, and how do you see it improving?

quasi pier Feb 26, 2026, 6:36 PM

#

not sure if this is the right place for this question: Is anyone here familiar with Reinforcement Learning on Farama Foundation's highway-env ?
I'm having trouble getting decent results using DicreteActions with DQN.

cursive schooner Feb 26, 2026, 11:45 PM

#

heyo

rich moth Feb 27, 2026, 12:57 AM

#

do economic, environmental, and competitive pressures improve llm code patch quality? im about to run a controlled study with contamination auditing using the qwen3.5-35b-a3b model to test this theory.

#

i honestly think it will, but i was wondering about your guys opinions

#

ill share the visuals it produces nevertheless, for science!

barren gulch Feb 27, 2026, 6:40 AM

#

is there no demand for Data Science / Data Analysis in the healthcare sector? i haven't seen a single DA/DS job in healthcare
am i cooked?

waxen kindle Feb 27, 2026, 6:42 AM

#

rich moth do economic, environmental, and competitive pressures improve llm code patch qua...

Probably yes, I am curious to know your methodology to test this

waxen kindle Feb 27, 2026, 6:42 AM

#

barren gulch is there no demand of Data Science / Data analysis in Healthcare sector? i haven...

Market is tough

#

You can look for data scientist jobs outside of healthcare

barren gulch Feb 27, 2026, 6:45 AM

#

yeh, but healthcare/medicine is the only thing that sparks my interest and sustains long-term attention

so perhaps i should use healthcare datasets to really learn DS and then apply to DS jobs outside of healthcare

waxen kindle Feb 27, 2026, 6:46 AM

#

Yeah, or if you really wanna stay in healthcare, look at non data scientist jobs

barren gulch Feb 27, 2026, 6:46 AM

#

right. i think i will do just that, use healthcare datasets to learn Data Science and after having learnt enough of DS, will apply outside of healthcare

#

skills would be transferable right? (my head says so, but i think i still need confirmation)

waxen kindle Feb 27, 2026, 6:49 AM

#

Yes

barren gulch Feb 27, 2026, 6:51 AM

#

alright, thanks!

half pulsar Feb 27, 2026, 7:36 AM

#

barren gulch is there no demand of Data Science / Data analysis in Healthcare sector? i haven...

No but the demand is currently growing though as we speak

barren gulch Feb 27, 2026, 7:36 AM

#

I seee, how much is it expected to grow in next 2 year? like till 2028 😅

half pulsar Feb 27, 2026, 7:37 AM

#

It'd probably be another 1 - 2 years before you'll start seeing things pop up for it

barren gulch Feb 27, 2026, 7:38 AM

#

half pulsar It'd probably be another 1 - 2 years before you'll start seeing things pop up fo...

I seeeee, apparently this is the perfect time for me to start learning DS for the healthcare sector then, it will undoubtedly take me like 2 years to learn enough data science using healthcare datasets for a job

#

thank you!

lusty rune Feb 27, 2026, 2:12 PM

#

Am I doing something wrong here?
I'm following along with the Python One-Liners book, and I got to the neural network section. In his examples he gets a low Finxter score for doing 0 hours of Python coding in the input data, while mine gets a higher Finxter score for 0 hours and a lower one for more hours. I'm getting the complete opposite behavior from his with the same dataset and inputs.

My first two responses were from running the model twice at 0 hours, the next response was once at 20, the next was once at 50. The Finxter score goes down the more weekly hours of Python I input, but the book has the complete opposite behavior

#

But the code is the same and so is the dataset

#

I get why it thinks more hours means a lower score, since the lowest score is someone who says they code 35 hours a week

limber plover Feb 27, 2026, 3:36 PM

#

@lusty rune Were you able to resolve it?

odd shell Feb 27, 2026, 3:37 PM

#

How important is it knowing when to actually use dataframes or series? Cause the syntax is murdering my sanity. Example:

print(cars.loc[:, 'drives_right'])
print(cars['drives_right'])
print(cars[['drives_right']])
(python, pandas)
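
For reference, the three lines differ mainly in what type comes back. The sketch below uses a tiny made-up `cars` frame (the real one in the question is not shown here) to check this.

```python
import pandas as pd

# Made-up stand-in for the `cars` frame in the question.
cars = pd.DataFrame(
    {"cars_per_cap": [809, 731, 588], "drives_right": [True, False, False]},
    index=["US", "AUS", "JPN"],
)

a = cars.loc[:, "drives_right"]  # label-based selection -> Series
b = cars["drives_right"]         # single label -> Series (same data as `a`)
c = cars[["drives_right"]]       # list of labels -> one-column DataFrame

print(type(a).__name__, type(b).__name__, type(c).__name__)
print(a.equals(b), c.shape)
```

The first two are interchangeable for a single column; the double-bracket form matters when later code expects a 2-D table, for example a feature matrix.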

lusty rune Feb 27, 2026, 3:37 PM

#

limber plover <@677718946270543882> Were you able to resolve it?

Not yet

odd shell Feb 27, 2026, 3:38 PM

#

Feel like there are too many ways to do something, or I'm doing something too many ways

odd shell Feb 27, 2026, 3:40 PM

#

lusty rune Am I doing something wrong here? I'm following along the python one liners book,...

This looks a lot more fun than fundamentals 😭

limber plover Feb 27, 2026, 3:42 PM

#

@odd shell I would use AI to answer that question if you want to research it. You don't need to let it code for you, but having it answer basic questions will give you a general idea. Also, you could look at the sources it's using.

odd shell Feb 27, 2026, 3:43 PM

#

I've limited my GPT on purpose using study methods, enforcing docs/community assistance feedback 😉

limber plover Feb 27, 2026, 3:43 PM

#

@lusty rune What book are you reading?

lusty rune Feb 27, 2026, 3:43 PM

#

odd shell This looks a lot more fun than fundamentals 😭

It's so fun but I already have the fundamentals down I've been coding for a couple of years now just not very consistent

lusty rune Feb 27, 2026, 3:43 PM

#

limber plover <@677718946270543882> What book are you reading?

odd shell Feb 27, 2026, 3:44 PM

#

I've leaned too heavily previously, realising it was causing damage to my learning. Though, I agree, GPT/LLM's can be potent if used right!

lusty rune Feb 27, 2026, 3:44 PM

#

All I can think is that I put in the wrong data

limber plover Feb 27, 2026, 3:44 PM

#

I am not familiar with that book, I would have to look into the exercise.

lusty rune Feb 27, 2026, 3:44 PM

#

But I spent 2 hours last night re checking the data set

odd shell Feb 27, 2026, 3:44 PM

#

lusty rune

I love their books..

lusty rune Feb 27, 2026, 3:45 PM

#

limber plover I am not familiar with that book, I would have too look into the exercise.

This is the code example (I took these so I can go over the problem at lunch while I'm at my construction job not trying to pirate)

limber plover Feb 27, 2026, 3:46 PM

#

Yeah just looking at it, it seems data set you used would be the problem, I don't see any problem with the code...

odd shell Feb 27, 2026, 3:46 PM

#

I wonder if capital X is used due to grammar or purposefully

serene scaffold Feb 27, 2026, 3:47 PM

#

odd shell I wonder if capital X is used due to grammar or purposefully

it's just a thing that X is capitalized and y isn't. but I don't capitalize it in my code.

lusty rune Feb 27, 2026, 3:47 PM

#

limber plover Yeah just looking at it, it seems data set you used would be the problem, I don'...

I re input the data like 4 times last night, maybe the MLPRegressor algorithm updated since the book was published is the only other thing I can think of

odd shell Feb 27, 2026, 3:47 PM

#

I don't work with scikit, but this is honestly interesting lol

serene scaffold Feb 27, 2026, 3:47 PM

#

except that array shouldn't even be called X because it has the y in the last column
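
When a single array holds the features and the target together like that, the usual move is to slice it before fitting. A short NumPy sketch with made-up rows (the column meanings are placeholders, not the book's dataset):

```python
import numpy as np

# Made-up rows: the first columns are features, the last column is the target.
data = np.array([
    [9, 1, 5, 2, 40],
    [4, 8, 0, 7, 55],
    [6, 3, 2, 9, 61],
])

X = data[:, :-1]  # every column except the last -> feature matrix
y = data[:, -1]   # last column only -> target vector

print(X.shape, y.shape)  # (3, 4) (3,)
```

Naming the pieces `X` and `y` after the split keeps the capital-X, lowercase-y convention meaningful: a 2-D matrix of inputs and a 1-D vector of targets.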

lusty rune Feb 27, 2026, 3:48 PM

#

odd shell I love their books..

I love their books

#

Also got the automate the boring stuff with python book

#

The secret life of programs is probably my favorite

odd shell Feb 27, 2026, 3:49 PM

#

Of course SQL is the thickest.... ☠️

lusty rune Feb 27, 2026, 3:49 PM

#

The powershell one I bought for my buddy's birthday

odd shell Feb 27, 2026, 3:49 PM

#

whenever I see powershell, I think, maybe I should switch to Linux completely? 😂

lusty rune Feb 27, 2026, 3:49 PM

#

This one is pretty thick too

odd shell Feb 27, 2026, 3:51 PM

#

Yea, I know they're useful books, but too general 🙁

limber plover Feb 27, 2026, 3:51 PM

#

@lusty rune That is actually what I was thinking too, I was about to ask when this book was published... is there a way to look up the library and its updates?

odd shell Feb 27, 2026, 3:52 PM

#

I think they often come with repos?

lusty rune Feb 27, 2026, 3:53 PM

#

limber plover @lusty rune That is actually what I was thinking too, I was about to as...

Apparently it has been, I thought that was an issue as well when I ran into problems with the Kmeans algorithm but it was actually my data input, this one however seems to be the algorithm

limber plover Feb 27, 2026, 3:53 PM

#

I finished looking at the array but I do not see a discrepancy between yours and theirs...so it has to be the library

lusty rune Feb 27, 2026, 3:54 PM

#

odd shell I think they often come with repos?

The automate the boring stuff has a python package made by AL that you use

lusty rune Feb 27, 2026, 3:54 PM

#

limber plover I finished looking at the array but I do not see a discrepancy between yours and...

I'll play with the data a bit to see if I can get similar behavior with different training inputs

limber plover Feb 27, 2026, 3:55 PM

#

The best thing to do I suspect, but you said you know why its behaving the way it is, so you generaly understand what the book is talking about, I would not get too hung up on it.

odd shell Feb 27, 2026, 3:55 PM

#

I played once with scikit on data I pulled from a videogame(eve online) 4 million rows or so, and still ended up with a lot of overfitting

#

not familiar truly with the math/stats how to properly use it

#

pandas doesnt like that amount of data either lol

lusty rune Feb 27, 2026, 3:57 PM

#

limber plover The best thing to do I suspect, but you said you know why its behaving the way i...

Yea I was going crazy last night before bed making sure the dataset was the same 😭 I'll play with the inputs for an hour when I get home before I move onto the next section to get more familiar with the algorithm

#

It's still super fun

lusty rune Feb 27, 2026, 3:57 PM

#

odd shell I played once with scikit on data I pulled from a videogame(eve online) 4 millio...

I don't think I've ever dealt with that much data before

limber plover Feb 27, 2026, 3:57 PM

#

ML is always interesting...I wish I kept my code from when I was doing classification problems

odd shell Feb 27, 2026, 3:58 PM

#

lusty rune I don't think I've ever dealt with that much data before

It was my first project 2 months after career switching from art 😂

#

I was in way over my head, but had a lot of fun

limber plover Feb 27, 2026, 3:58 PM

#

I got to a point where I kept building the dictionary data set in text and then parsing it for training so that then it could tell me valid and not valid words based on examples.

#

Example is cta and cat, cta is not valid but cat is, however, it was interesting that when I did tac which was not part of the binary decision, it still said not valid. Which makes sense but I did not program it for that.

odd shell Feb 27, 2026, 4:04 PM

#

This lib seems pretty interesting on say historical market data 🤔

lusty rune Feb 27, 2026, 4:08 PM

#

limber plover Example is cta and cat, cta is not valid but cat is, however, it was interesting...

Unsupervised learning is so cool

#

When I get done with this section I wanna teach an AI how to play blackjack or poker

limber plover Feb 27, 2026, 4:08 PM

#

Yeah you do get emergent behaviors from them.

lusty rune Feb 27, 2026, 4:08 PM

#

I'm working on a little RPG game and it would be cool to teach my NPC enemies how to make the best moves based on players decisions

lusty rune Feb 27, 2026, 4:10 PM

#

odd shell This lib seems pretty interesting on say historical market data 🤔

The first exercise in the ML section was using linear regression to predict a little stock market sample, it's been really cool learning about the different algorithms. This stuff used to be so intimidating to me but the way the book breaks it down is easy to understand and when I don't understand something too well I do more research on it

limber plover Feb 27, 2026, 4:10 PM

#

That sounds like you will need a lot of data for that as players play. I can see them getting smarter over time, but it will take a while for that.

odd shell Feb 27, 2026, 4:11 PM

#

lusty rune When I get done with this section I wanna teach an AI how to play blackjack or p...

Isn't that pretty deterministic?

#

Im trying this method now on a pokemon dataset and see if I can match later* generations with only earlier generation data

limber plover Feb 27, 2026, 4:12 PM

#

My favorite book I used was Grokking Algorithms..

lusty rune Feb 27, 2026, 4:16 PM

#

odd shell Isn't that pretty deterministic?

Yea it would be simple to hard code

lusty rune Feb 27, 2026, 4:16 PM

#

limber plover My favorite book I used was Grokking Algorithms..

I'm pretty sure that book has been mentioned in this one

#

I might have to check it out

odd shell Feb 27, 2026, 4:17 PM

#

Alright, wonder what happens if I do this regression consecutively for gens? 😄

#

i guess meta-shifts from designers prolly make it harder, unless they stick to their philosophy/methodology?

#

or lack of data and we get goey? 😂

limber plover Feb 27, 2026, 4:19 PM

#

I am more interested in the library directly and its math. I never used it, I just set out making my own at some point.

odd shell Feb 27, 2026, 4:20 PM

#

yea, i get that. i need to start digging into math more

#

instead of building stuff i dont understand in the end truly

lusty rune Feb 27, 2026, 4:21 PM

#

limber plover I am more interested in the library directly and its math. I never used it, I ju...

I did my own manual implementation of the KMeans algorithm to learn it better but I think it's fine to use libraries if you understand what it's doing under the hood

limber plover Feb 27, 2026, 4:22 PM

#

Yeah, though I have not done this in a while so I forgot a lot about it now.

#

I do electronics mostly, so I hardly have to deal with this high level programming. Mostly low embedded system programming.

#

I am only coming back because I don't have the budget to continue it and software is a lot simpler to get into.

odd shell Feb 27, 2026, 4:24 PM

#

yeah..the tools to do data analysis is so damn accessible

lusty rune Feb 27, 2026, 4:26 PM

#

odd shell Alright, wonder what happens if I do this regression consecutively for gens? 😄

I loved learning about hardware in The Secret Life of Programs book

#

Whoops

#

Wrong reply

limber plover Feb 27, 2026, 4:26 PM

#

The fact that python is free is amazing to me

lusty rune Feb 27, 2026, 4:26 PM

#

odd shell Alright, wonder what happens if I do this regression consecutively for gens? 😄

Let me know how it goes!

lusty rune Feb 27, 2026, 4:27 PM

#

limber plover The fact that python is free is amazing to me

Wait are there paid programming languages? HUH

odd shell Feb 27, 2026, 4:27 PM

#

something.microsoft?

limber plover Feb 27, 2026, 4:27 PM

#

Well, yes, for licenses.

#

If you plan on using it in commercial setting then yeah. For example there is programming language forth. Well, swift forth that is rather limiting until you pay.

odd shell Feb 27, 2026, 4:28 PM

#

the problem is rather where do you store all the data from the languages 😂

limber plover Feb 27, 2026, 4:28 PM

#

Same with something like "true basic"

#

If I remember right some charge you for compilers and such...language dependent

odd shell Feb 27, 2026, 4:55 PM

#

@lusty rune Did a "prediction" on type, if x stats = fire or water? then added +1 on every consecutive generation to see if it improved actual confidence. and eh, yea lol. they changed how they defined types. but also missing lots of nuance. ofc. anyhow, fun stuff

#

could improve the model (or worsen) if we consider every type, or more

rancid thorn Feb 27, 2026, 5:00 PM

#

Guys im trying to make an LSTM model but for some reason the loss flatlines and the outputs end up being all the same

#

(or very similar)

#

https://paste.pythondiscord.com/N2DA

#

Also while training

#

https://paste.pythondiscord.com/PQQQ

limber plover Feb 27, 2026, 5:08 PM

#

are you training a sensor on different weather types ?

rancid thorn Feb 27, 2026, 5:08 PM

#

Train loss decreases slowly

rancid thorn Feb 27, 2026, 5:09 PM

#

limber plover are you training a sensor on different weather types ?

No its supposed to be a timeseries forecaster

#

Basically I give it the last, say, 20 days, and it tells me the weather of tomorrow

rancid thorn Feb 27, 2026, 5:09 PM

#

rancid thorn Train loss decreases slowly

val loss goes up

#

No real improvement overall

limber plover Feb 27, 2026, 5:10 PM

#

Wow, that is going to be hard but interesting.

rancid thorn Feb 27, 2026, 5:10 PM

#

limber plover Wow, that is going to be hard but interesting.

well it's supposedly a textbook application of LSTMs

#

But for some reason it doesnt work

limber plover Feb 27, 2026, 5:11 PM

#

Yeah, I am not sure , I would have to look at the book you are using

rancid thorn Feb 27, 2026, 5:11 PM

#

textbook application doesnt mean it comes from a textbooks

#

It means its common/classic

limber plover Feb 27, 2026, 5:11 PM

#

OH I thought you were working it out from textbook sorry.

#

Are you using a library for this?

rancid thorn Feb 27, 2026, 5:13 PM

#

PyTorch

#

pretty standard for AI

limber plover Feb 27, 2026, 5:13 PM

#

Has PyTorch been updated recently?

rancid thorn Feb 27, 2026, 5:14 PM

#

limber plover Has PyTorch been updated recently?

Probably

#

in this AI era I really doubt its left to itself

#

OpenAI uses it

#

All big AI firms do too

limber plover Feb 27, 2026, 5:14 PM

#

You can look up recent updates, it might be something with this if you are sure your data is right

rancid thorn Feb 27, 2026, 5:15 PM

#

No I doubt they messed up LSTMs

limber plover Feb 27, 2026, 5:15 PM

#

Not sure then, I have never used it, but I thought from a systems approach it might be that.

#

@rancid thorn Have you tried asking Ai on this?

rancid thorn Feb 27, 2026, 5:22 PM

#

what do you mean?

limber plover Feb 27, 2026, 5:24 PM

#

Asking ai on the problem you are having you said something "loss flatlines and the outputs end up being all the same?"

rancid thorn Feb 27, 2026, 5:25 PM

#

wont help

#

ai sucks at this kind of stuff

limber plover Feb 27, 2026, 5:25 PM

#

Well, what I got, not sure if makes any sense to you but "the model has collapsed to a trivial solution, like predicting the mean or mode of targets across all timesteps"

#

Using science direct as its source though \

#

Maybe that is too general?

rancid thorn Feb 27, 2026, 5:31 PM

#

no that doesnt really seem to be the issue

limber plover Feb 27, 2026, 5:31 PM

#

Were you just testing it or you know for sure?

#

I mean another one I got was "Unnormalized inputs or targets cause exploding/vanishing gradients, forcing the model to output safe constant values" uncertain if this helps.

rancid thorn Feb 27, 2026, 5:33 PM

#

Im clipping the values and the inputs are already normalized

limber plover Feb 27, 2026, 5:34 PM

#

All right what about "Learning Rate Problems
Too-low LR traps the optimizer in flat loss regions; too-high causes oscillations ending in constant predictions"

#

This sources is reddit so..take that with a grain of salt

rancid thorn Feb 27, 2026, 5:36 PM

#

Nope learning rate is a normal one

limber plover Feb 27, 2026, 5:36 PM

#

Have you ever made LSTM from scratch?

rancid thorn Feb 27, 2026, 5:36 PM

#

this is my first time

limber plover Feb 27, 2026, 5:36 PM

#

No I mean the library that you are using which use LSTMs yes?

rancid thorn Feb 27, 2026, 5:37 PM

#

yeah pytorch

jaunty helm Feb 27, 2026, 5:37 PM

#

rancid thorn Guys im trying to make an LSTM model but for some reason the loss flatlines and ...

NNs fail silently all the time, I doubt anyone can tell you what's wrong only looking at the predictions and losses
there's a very nice though a bit outdated recipe on training NNs, maybe something in there can help you

A Recipe for Training Neural Networks

Musings of a Computer Scientist.

limber plover Feb 27, 2026, 5:38 PM

#

Well, the other suggestion I got was dead neurons. Not sure how accurate that is, and I don't even know if you can see or test this.

#

Sorry, not really helping as I don't know much about them. I just never use libraries for this and take apart what LSTM is. Maybe the math on how it actually does it.

#

For me its more of "its fine if you don't want to know what a brick is to lay it down, but if you want to know why it keeps crumbling, you better get to know the chemicals of it"

#

Why not look into pytorch forums if they have any, maybe they had a problem like yours

rancid thorn Feb 27, 2026, 5:50 PM

#

limber plover Sorry, not really helping as I don't know much about them. I just never use libr...

I dont think you should make AI libraries yourself

#

I mean understand what you're doing, necessary

#

Making it yourself from scratch, will probably end up doing worse

jaunty helm Feb 27, 2026, 5:50 PM

#

also on the topic of lstm (or deep learning in general) for time series:
every other week, some new hot sophisticated dl architecture for ts will come out boasting sota performance
but also, don't sleep on "traditional and outdated" methods like arima/ets, which are still surprisingly competitive in certain scenarios

rancid thorn Feb 27, 2026, 5:51 PM

#

jaunty helm also on the topic of lstm (or deep learning in general) for time series: every o...

buzzword buzzword buzzword

#

What are "sota" and "arima"?

limber plover Feb 27, 2026, 5:51 PM

#

@rancid thorn TRUE but it would help you see the picture better.

rancid thorn Feb 27, 2026, 5:52 PM

#

rancid thorn I mean understand what you're doing, necessary

.

tardy kayak Feb 27, 2026, 5:52 PM

#

hello

jaunty helm Feb 27, 2026, 5:52 PM

#

rancid thorn What are "sota" and "arima"?

"state of the art," which means best of the best
and "autoregressive integrated moving average," a very very important traditional method of doing time series forecasting

#

describing them in detail on discord is probably not very effective, you can search them up when you want to learn more

rancid thorn Feb 27, 2026, 5:53 PM

#

oh okay i will thanks

#

This is the source data (a slice)

#

Bigger slice

#

From a purely visual standpoint, Id say theres some pattern

#

So it can work

limber plover Feb 27, 2026, 5:58 PM

#

So that is the original data?

rancid thorn Feb 27, 2026, 5:59 PM

#

yes

#

Not the whole data

limber plover Feb 27, 2026, 5:59 PM

#

What is the predicted then?

#

Sorry, I am going to ask basic questions, as I am now reading up on what LSTM is

#

I am reading that the limit for them is "Manual Optimization: Requires tuning for best performance" How do you know you optimized it well?

rancid thorn Feb 27, 2026, 6:02 PM

#

Its supposed to take the features of a series of days and predict the next day/sequence of days

rancid thorn Feb 27, 2026, 6:02 PM

#

limber plover I am reading that the limit for them is "Manual Optimization: Requires tuning fo...

before tuning you need to have a model that works at least a bit

#

If it doesnt then thats not the issue

limber plover Feb 27, 2026, 6:03 PM

#

What do you mean a bit? Why not indefinite?

#

Is there a way to plot out the predicted data vs original?

#

One of the things I see is "PyTorch provides a clean and flexible API to build and train LSTM models" Yet they also state "Version Gaps: API changes may affect older code"

rancid thorn Feb 27, 2026, 6:06 PM

#

limber plover What do you mean a bit? Why not indefinite?

what?

rancid thorn Feb 27, 2026, 6:06 PM

#

limber plover Is there a way to plot out the predicted data vs original?

yeah

#

Also you can check how good the model is from loss

limber plover Feb 27, 2026, 6:09 PM

#

@rancid thorn I did not understand the way you phrased it. "before tuning you need to have a model that works at least a bit" what does this mean exactly? works at least a bit is like I guess it should work....seems uncertain..

rancid thorn Feb 27, 2026, 6:09 PM

#

If the model has some fundamental flaw that renders it completely useless fine tuning is useless

#

You need rough tuning before fine tuning

limber plover Feb 27, 2026, 6:10 PM

#

Ahh ok, and you have to do this all manually?

rancid thorn Feb 27, 2026, 6:11 PM

#

Well you have to make the code

#

Make code that works

limber plover Feb 27, 2026, 6:11 PM

#

I thought tuning meant specific weight adjustments?

#

You have it coded so would there not be some constants you can adjust?

warm dune Feb 27, 2026, 6:14 PM

#

limber plover I thought tuning meant specific weight adjustments?

i thought the 'tuning' it's about preprocessing

#

to 'improve' the data and model can get a better accuracy

limber plover Feb 27, 2026, 6:17 PM

#

Yeah nevermind I got them confused, its been a while, yeah you would not adjust that as that is what the NN is doing ,you would adjust the learning speed and so on...

#

Also last NN I worked with was VERY simple XOR problem that I remember right now so I could do it manually

#

Anyway, I am losing interest now, since I am doing some other project. Hope someone can help you.

warm dune Feb 27, 2026, 6:21 PM

#

limber plover Anyway, I am losing interest now, since I am doing some other project. Hope some...

i dont get the problem

#

are u trying to improve a model?

limber plover Feb 27, 2026, 6:21 PM

#

I am not the one that had the problem StraReal did, you will have to scroll up to read their specific problem.

warm dune Feb 27, 2026, 6:23 PM

#

limber plover I am not the one that had the problem StraReal did, you will have to scroll up t...

lol I hadn't seen it

limber plover Feb 27, 2026, 6:23 PM

#

Well, you did join in midway so there is that.

#

@rancid thorn Do you know if you can express LSTM like this F = x'y' + x'y + xy' = y'(x' + x) + x'y? Similar to sum of products in digital logic?

#

Sorry I just had to ask, as I am curious

rancid thorn Feb 27, 2026, 6:27 PM

#

limber plover @rancid thorn Do you know if you can express LSTM like this F = x'y' + x...

An LSTM has more operations than that Im pretty sure

limber plover Feb 27, 2026, 6:27 PM

#

Yeah I know based on how large it is.

rancid thorn Feb 27, 2026, 6:27 PM

#

I mean theres two main variables, the short term and long term memory

#

+the input

#

Then they go through the forget gate, the input gate and the output gate

limber plover Feb 27, 2026, 6:28 PM

#

I was trying to see based on what I could find, it looks like something you can express similar to logic gates

#

I know a bit more about digital electronics and if I can connect my thinking that way, maybe I can understand it more

rancid thorn Feb 27, 2026, 6:29 PM

#

This is an LSTM expressed as a mathematical formula

limber plover Feb 27, 2026, 6:30 PM

#

Interesting

warm dune Feb 27, 2026, 6:30 PM

#

rancid thorn An LSTM has more operations than that Im pretty sure

an LSTM it's a type of rnn to avoid vanishing gradient?

rancid thorn Feb 27, 2026, 6:30 PM

#

Yeah

#

And exploding gradient

#

And in doing that it adds long term memory which is really cool

limber plover Feb 27, 2026, 6:31 PM

#

So similar to programmable memory?

rancid thorn Feb 27, 2026, 6:33 PM

#

Not sure what that is

warm dune Feb 27, 2026, 6:33 PM

#

rancid thorn And in doing that it adds long term memory which is really cool

how that works?

#

it's like the RMSProp?

rancid thorn Feb 27, 2026, 6:34 PM

#

warm dune how that works?

Basically long term memory gets carried from each LSTM to the next and it passes through a forget gate, which decides what % of it to remember, and an input gate, which decides what to add to the long term memory

#

It is never outright deleted

#

And short term memory is carried from one LSTM to the next but it doesnt go to the one after it too
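
A rough sketch of that description for a single scalar LSTM unit, in plain Python. The parameter names in `p` are made up for illustration and do not come from PyTorch or any other library; the gate equations are the standard textbook ones.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM unit.

    x: current input, h_prev: short-term memory, c_prev: long-term memory.
    p: dict of weights and biases (hypothetical names, illustration only).
    """
    # Forget gate: what fraction (0..1) of the long-term memory to keep.
    f = sigmoid(p["w_fx"] * x + p["w_fh"] * h_prev + p["b_f"])
    # Input gate: what fraction of the new candidate to add.
    i = sigmoid(p["w_ix"] * x + p["w_ih"] * h_prev + p["b_i"])
    # Candidate value that could be written into long-term memory.
    g = math.tanh(p["w_gx"] * x + p["w_gh"] * h_prev + p["b_g"])
    # Output gate: how much of the long-term memory becomes short-term memory.
    o = sigmoid(p["w_ox"] * x + p["w_oh"] * h_prev + p["b_o"])

    c = f * c_prev + i * g   # long-term memory is scaled and added to, not reset
    h = o * math.tanh(c)     # new short-term memory
    return h, c


# With all-zero parameters every gate is sigmoid(0) = 0.5 and the candidate is tanh(0) = 0.
params = {name: 0.0 for name in (
    "w_fx", "w_fh", "b_f", "w_ix", "w_ih", "b_i",
    "w_gx", "w_gh", "b_g", "w_ox", "w_oh", "b_o")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=params)
print(h, c)
```

In this toy call the forget gate keeps half of the old long-term memory and nothing new is added, which is the "decides what % of it to remember" part in miniature.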

warm dune Feb 27, 2026, 6:35 PM

#

rancid thorn Basically long term memory gets carried from each LSTM to the next and it passes...

so it's like a nn to remember things?

rancid thorn Feb 27, 2026, 6:36 PM

#

Yeah I guess

#

LSTM literally means Long Short-Term Memory

warm dune Feb 27, 2026, 6:36 PM

#

rancid thorn LSTM literally means Long Short-Term Memory

got it

#

rn i'm in the optimizers

#

dont see the nn

limber plover Feb 27, 2026, 6:37 PM

#

Basically it programs its own memory, it cannot be really expressed with bool logic but if you make it over time then you can sort of get it. The expression I showed was F = x'y' + x'y + xy' = y'(x' + x) + x'y = x' + y'. You can use this to minimize gate use and reduce to using, say, 3 AND gates instead of 10. However, an LSTM is not exactly like this, but it does basically have an expression over time. Seems to use sigmoid a lot.
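
The boolean side of this can be checked by brute force. A small sketch (function names are made up here) that runs the sum-of-products form through its truth table and compares it with x' + y', i.e. NAND:

```python
from itertools import product


def f_sum_of_products(x, y):
    # F = x'y' + x'y + xy'
    return ((not x) and (not y)) or ((not x) and y) or (x and (not y))


def f_reduced(x, y):
    # x' + y', which is NAND(x, y)
    return (not x) or (not y)


table = [
    (x, y, f_sum_of_products(x, y), f_reduced(x, y))
    for x, y in product([False, True], repeat=2)
]
for x, y, full, reduced in table:
    print(int(x), int(y), int(full), int(reduced))
```

The two output columns agree on all four rows, so the three-term expression needs only one gate's worth of logic.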

warm dune Feb 27, 2026, 6:39 PM

#

limber plover Basically it programs its own memory, it cannot be really expressed with bool lo...

i got it the LSTM, i just don't understand yet, what is it

#

it's like a optimizer for rnn?

#

to resolve the vanishing and exploding gradient problem

limber plover Feb 27, 2026, 6:40 PM

#

Well, I may not be accurate, this is just my understanding: it's not pure or static logic like that, so you cannot really use it. You cannot use digital logic gate expressions from input to output for this. It changes over time as needed, so in digital logic it's 1 or 0 for input, but with this it's way more complicated than that.

rancid thorn Feb 27, 2026, 6:41 PM

#

warm dune it's like a optimizer for rnn?

Its an improvement of basic RNNs

#

But it is an RNN

limber plover Feb 27, 2026, 6:41 PM

#

Ok anyway, I really need to stop thinking about this for now.

rancid thorn Feb 27, 2026, 6:41 PM

#

https://www.youtube.com/watch?v=YCzL96nL7j0

YouTube

StatQuest with Josh Starmer

Long Short-Term Memory (LSTM), Clearly Explained

Basic recurrent neural networks are great, because they can handle different amounts of sequential data, but even relatively small sequences of data can make them difficult to train. This is where Long Short-Term Memory (LSTM) saves the day. Long Short-Term Memory is a type of recurrent neural network that can handle much larger sequences of dat...

▶ Play video

#

You should watch this

#

Its really good

warm dune Feb 27, 2026, 6:42 PM

#

rancid thorn You should watch this

thx

#

and your rnn its for what?

rancid thorn Feb 27, 2026, 6:42 PM

#

in this case time series forecasting

#

Which is basically having a series of values and predicting the next

warm dune Feb 27, 2026, 6:43 PM

#

rancid thorn in this case time series forecasting

real project or for study?

rancid thorn Feb 27, 2026, 6:43 PM

#

Also if you dont know exactly what RNNs are watch this
https://www.youtube.com/watch?v=AsNTP8Kwu80

YouTube

StatQuest with Josh Starmer

Recurrent Neural Networks (RNNs), Clearly Explained!!!

When you don't always have the same amount of data, like when translating different sentences from one language to another, or making stock market predictions from different companies, Recurrent Neural Networks come to the rescue. In this StatQuest, we'll show you how Recurrent Neural Networks work, one step at a time, and then we'll show you th...

▶ Play video

rancid thorn Feb 27, 2026, 6:43 PM

#

warm dune real project or for study?

Well I was hoping to apply it to the market

#

But in the worst case scenario I guess it will be study

warm dune Feb 27, 2026, 6:46 PM

#

rancid thorn Well I was hoping to apply it to the market

oh yes

#

i'm not at that level yet.

#

i want to start studying NN until may, cuz currently I'm focusing more on the intermediate level, such as problems and gradient types and their optimizers (LR, regularization, ADAM, RMSprop, LR scheduler)

rancid thorn Feb 27, 2026, 6:47 PM

#

warm dune i want to start studying NN until may, cuz currently I'm focusing more on the in...

Honestly I think you should do the opposite

#

Learn the basics of NNs

#

Backpropagation, the chain rule, RNNs, what an NN even is

#

And then go into the details of every step

warm dune Feb 27, 2026, 6:48 PM

#

rancid thorn Backpropagation, the chain rule, RNNs, what an NN even is

i know backpropagation, the chain rule, how it's calculated with derivatives and more

rancid thorn Feb 27, 2026, 6:48 PM

#

Oh yeah then youre good lol

warm dune Feb 27, 2026, 6:48 PM

#

activations functions and more

#

but just for simples like linear models

#

i never code a nn

#

just doing model with 1 layer (linear models)

rancid thorn Feb 27, 2026, 7:00 PM

#

warm dune i never code a nn

you should code as soon as you learn something new

#

really get it printed into memory

#

Input

#

Output...

warm dune Feb 27, 2026, 7:10 PM

#

rancid thorn Output...

so the model predicts are bad in rainfall, and sunshine?

rancid thorn Feb 27, 2026, 7:10 PM

#

No no its bad in everything

#

theres no variation

warm dune Feb 27, 2026, 7:10 PM

#

rancid thorn No no its bad in everything

but it's worse in that 2 features right?

rancid thorn Feb 27, 2026, 7:10 PM

#

not really

warm dune Feb 27, 2026, 7:11 PM

#

rancid thorn not really

the date feature are the day? or something else?

rancid thorn Feb 27, 2026, 7:11 PM

#

the date feature is the period of the year

#

at start and end of the year its 0

#

and towards the middle its 1

#

Just a cosine wave made to wrap from 0 to 1
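
A minimal sketch of a feature like that, assuming a 365-day year with day 0 at January 1. The function name and exact formula are illustrative, not the code from the paste.

```python
import math


def season_feature(day_of_year, days_in_year=365):
    """Map a day index to [0, 1]: 0 at the turn of the year, 1 at mid-year."""
    angle = 2.0 * math.pi * day_of_year / days_in_year
    # cos runs 1 -> -1 -> 1 over the year; flip and rescale it to 0 -> 1 -> 0.
    return 0.5 * (1.0 - math.cos(angle))


for day in (0, 91, 182.5, 274, 365):
    print(day, round(season_feature(day), 3))
```

Because the value is the same on December 31 and January 1, the model sees no artificial jump at the year boundary.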

warm dune Feb 27, 2026, 7:12 PM

#

rancid thorn Just a cosine wave made to wrap from 0 to 1

do you have the number for loss?

#

in train set

rancid thorn Feb 27, 2026, 7:12 PM

#

I have these
Epoch 02 | train loss: 1.00803 | val loss: 0.96275
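
One way to read a loss that sits near 1.0: if the targets are z-score standardized and the criterion is MSE (both are assumptions here, the log only says the data is normalized), then a model that always outputs the mean scores almost exactly 1.0. A small sketch with made-up numbers:

```python
import statistics

# Made-up raw target values, standing in for one weather feature.
raw_targets = [12.0, 15.5, 9.0, 20.0, 18.5, 11.0, 14.0, 16.0]

mean = statistics.fmean(raw_targets)
std = statistics.pstdev(raw_targets)          # population standard deviation
z_targets = [(t - mean) / std for t in raw_targets]

# A "collapsed" model: the same prediction for every sample.
constant_prediction = statistics.fmean(z_targets)   # about 0.0
baseline_mse = statistics.fmean(
    (t - constant_prediction) ** 2 for t in z_targets
)
print(baseline_mse)   # close to 1.0
```

Under those assumptions, a train loss that stays around 1.0 means the network is doing no better than predicting the average.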

warm dune Feb 27, 2026, 7:13 PM

#

do you check if overfitting?

warm dune Feb 27, 2026, 7:13 PM

#

rancid thorn I have these Epoch 02 | train loss: 1.00803 | val loss: 0.96275

that's not a bad loss, but we can improve

rancid thorn Feb 27, 2026, 7:13 PM

#

The thing is it doesnt improve

warm dune Feb 27, 2026, 7:14 PM

#

rancid thorn The thing is it doesnt improve

the 2 epoch loss are the lower?

rancid thorn Feb 27, 2026, 7:14 PM

#

no I just took this one as sample

rancid thorn Feb 27, 2026, 7:14 PM

#

rancid thorn https://paste.pythondiscord.com/PQQQ

Here

warm dune Feb 27, 2026, 7:14 PM

#

can you send the first, the middle and the last?

warm dune Feb 27, 2026, 7:15 PM

#

rancid thorn Here

there's a little overfitting happening

rancid thorn Feb 27, 2026, 7:15 PM

#

No no

#

with overfitting the train loss would go down by a lot

#

It would end up looking right

#

Sure it wouldnt be useful

#

But it would end up looking right

#

But it doesnt

warm dune Feb 27, 2026, 7:16 PM

#

rancid thorn with overfitting the train loss would go down by a lot

but in 98 epoch

#

the train loss down and the val loss increase

#

thats not a overfitting?

rancid thorn Feb 27, 2026, 7:16 PM

#

over time, it slightly is, but overall its not

#

#

What's really happening here

#

Is that its as if the model wasnt even being trained

#

Its as if it was back to the start at every epoch

warm dune Feb 27, 2026, 7:17 PM

#

rancid thorn Its as if it was back to the start at every epoch

using pytorch right?

rancid thorn Feb 27, 2026, 7:18 PM

#

yes

warm dune Feb 27, 2026, 7:18 PM

#

rancid thorn yes

did you forget some code?

rancid thorn Feb 27, 2026, 7:18 PM

#

checked it a thousand times

#

```python
for epoch in range(1, NUM_EPOCHS + 1):
    model.train()
    train_losses = []

    for seq, target in train_loader:
        seq = seq.permute(1, 0, 2)

        optimizer.zero_grad()
        pred = model(seq)
        loss = criterion(pred, target)
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.5)
        optimizer.step()
        train_losses.append(loss.item())

    # ---- validation ----
    model.eval()
    val_losses = []
    with torch.no_grad():
        for seq, target in val_loader:
            seq = seq.permute(1, 0, 2)
            pred = model(seq)
            loss = criterion(pred, target)
            val_losses.append(loss.item())

    avg_train = np.mean(train_losses)
    avg_val   = np.mean(val_losses)

    print(f'Epoch {epoch:02d} | train loss: {avg_train:.5f} | val loss: {avg_val:.5f}')

    if avg_val < best_val:
        best_val = avg_val
        torch.save(model.state_dict(), 'best_weather_lstm.pt')
        cprint(' -> checkpoint saved', 'm')
```

warm dune Feb 27, 2026, 7:20 PM

#

i see in somewhere the thinking to avoid that

#

i'ts like

#

the loss don't decrease, so the weights dont change
the weights dont change, so we need to check the weights numbers (check the optimizer)
why the weights dont change? (lr lower/higher, the model find a local minimum) and more
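
A hedged sketch of how the "are the weights even changing?" question could be checked directly in PyTorch. The helper names `snapshot` and `update_report` are made up, and the tiny `nn.Linear` model and random batch are stand-ins for the real LSTM and data loader.

```python
import torch
from torch import nn


def snapshot(model):
    """Copy every parameter so it can be compared after an update."""
    return {name: p.detach().clone() for name, p in model.named_parameters()}


def update_report(model, before):
    """Per parameter: (gradient norm, largest absolute change since `before`)."""
    report = {}
    for name, p in model.named_parameters():
        grad_norm = 0.0 if p.grad is None else p.grad.norm().item()
        max_change = (p.detach() - before[name]).abs().max().item()
        report[name] = (grad_norm, max_change)
    return report


torch.manual_seed(0)
model = nn.Linear(4, 1)                         # toy stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(16, 4), torch.randn(16, 1)   # made-up batch

before = snapshot(model)
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()

report = update_report(model, before)
for name, (grad_norm, max_change) in report.items():
    print(f"{name}: grad norm {grad_norm:.4f}, max weight change {max_change:.6f}")
```

If a gradient norm is zero or a parameter does not move after `optimizer.step()`, the problem is in the wiring (detached tensors, wrong parameters passed to the optimizer) rather than in the hyperparameters.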

#

i think in this case the model find a local minimum, do you check that?

rancid thorn Feb 27, 2026, 7:23 PM

#

warm dune the loss don't decrease, so the weights dont change the weights dont change, so ...

I tried making the LR crazy high amounts

#

doesnt work

warm dune Feb 27, 2026, 7:23 PM

#

rancid thorn I tried making the LR crazy high amounts

what optimizer?

rancid thorn Feb 27, 2026, 7:23 PM

#

warm dune the loss don't decrease, so the weights dont change the weights dont change, so ...

Also not how it works

rancid thorn Feb 27, 2026, 7:23 PM

#

warm dune what optimizer?

Adam

warm dune Feb 27, 2026, 7:23 PM

#

try the AdamW

warm dune Feb 27, 2026, 7:24 PM

#

rancid thorn Also not how it works

i really think that the problems it's a local minimum

rancid thorn Feb 27, 2026, 7:24 PM

#

No, if it was that it would do a little optimization before flatlining

#

here it flatlines at the start and just stops

warm dune Feb 27, 2026, 7:26 PM

#

rancid thorn No, if it was that it would do a little optimization before flatlining

the train loss decrease with time but the val loss dont

#

well

#

try to increase the batch size

#

or add an LR scheduler

#

or idk try to increase the alpha regularization terms

rancid thorn Feb 27, 2026, 7:27 PM

#

already tried changing all the hyper parameters

warm dune Feb 27, 2026, 7:27 PM

#

the data have outliers?

rancid thorn Feb 27, 2026, 7:27 PM

#

I really dont know what could possibly be the issue

rancid thorn Feb 27, 2026, 7:27 PM

#

warm dune the data have outliers?

no

warm dune Feb 27, 2026, 7:28 PM

#

rancid thorn I really dont know what could possibly be the issue

the lr have a scheduler?

#

like starts in 0.00001 and goes to 0.001
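
For reference, a warmup like that could be sketched with PyTorch's built-in `LinearLR` scheduler. The model, batch, and the choice of 5 warmup steps are placeholders; only the 0.00001 to 0.001 range comes from the message above.

```python
import torch
from torch import nn

model = nn.Linear(4, 1)                        # stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # target LR

warmup_steps = 5                               # arbitrary, for illustration
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer,
    start_factor=0.01,                         # 0.01 * 1e-3 = 1e-5 at the start
    end_factor=1.0,
    total_iters=warmup_steps,
)

lrs = [scheduler.get_last_lr()[0]]
for _ in range(warmup_steps):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).pow(2).mean()   # dummy loss
    loss.backward()
    optimizer.step()
    scheduler.step()                           # advance the warmup by one step
    lrs.append(scheduler.get_last_lr()[0])

print(lrs)
```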

rancid thorn Feb 27, 2026, 7:28 PM

#

no but it doesnt really matter if with both high and low values it doesnt work

warm dune Feb 27, 2026, 7:29 PM

#

rancid thorn no but it doesnt really matter if with both high and low values it doesnt work

the data it's scaled?

rancid thorn Feb 27, 2026, 7:30 PM

#

yep, its normalized

warm dune Feb 27, 2026, 7:31 PM

#

rancid thorn yep, its normalized

i saw that in reddit

#

" In your model (in the LSTM/RNN definition), is the batch_first parameter set to False? If it's set to True and you perform this permute, you are training the model with batch instead of time. This would explain why the validation loss never stabilizes: the model is trying to find temporal patterns in dimensions that are, in fact, different samples. "

rancid thorn Feb 27, 2026, 7:31 PM

#

"This would explain why the validation loss never stabilizes" not the issue at hand

#

Also its set to False
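
A quick shape check for that setting, assuming the loader yields batches laid out as (batch, time, features), which is what the `permute(1, 0, 2)` in the training loop suggests. The sizes are made up.

```python
import torch
from torch import nn

batch, seq_len, n_features, hidden = 4, 20, 5, 8
x = torch.randn(batch, seq_len, n_features)    # assumed loader layout

# batch_first=False (the default): the LSTM expects (time, batch, features),
# so the permute is required.
lstm_tf = nn.LSTM(input_size=n_features, hidden_size=hidden, batch_first=False)
out_tf, (h_tf, c_tf) = lstm_tf(x.permute(1, 0, 2))
print(out_tf.shape)    # (seq_len, batch, hidden)

# batch_first=True: the LSTM takes (batch, time, features) directly, no permute.
lstm_bf = nn.LSTM(input_size=n_features, hidden_size=hidden, batch_first=True)
out_bf, (h_bf, c_bf) = lstm_bf(x)
print(out_bf.shape)    # (batch, seq_len, hidden)

# The last time step sits on a different axis in each layout.
last_step_tf = out_tf[-1]       # (batch, hidden)
last_step_bf = out_bf[:, -1]    # (batch, hidden)
```

With `batch_first=False` and the permute in place the layouts agree, so this part of the loop looks consistent. The same check applies to whichever slice the model takes as its final output.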

warm dune Feb 27, 2026, 7:35 PM

#

rancid thorn Also its set to False

the better epoch it's 37

#

and after goes down

#

idk if it's a overfitting problem that are invisible or something

rancid thorn Feb 27, 2026, 8:32 PM

#

Can someone help with my help thread in #1035199133436354600 ?

limber plover Feb 27, 2026, 8:47 PM

#

@rancid thorn Still on that problem huh? Have you tried asking the pytorch community? I am sure they have a discord.

tawdry heart Feb 28, 2026, 4:31 AM

#

@rancid thorn

#

oh nvm i cant send the invite mb

rare bane Feb 28, 2026, 6:46 AM

#

Ooh do we reckon openai has a massive lifeline, now that they've signed with the US secretary of defense?

#

I know they were leaking money bad, but surely that's on the up for them from here

acoustic fjord Feb 28, 2026, 9:44 AM

#

Working on Kaggle right now, didn't even know this could happen lol

high rivet Feb 28, 2026, 12:18 PM

#

hi everyone, Tooba here. I am a SWE Junior, currently in 6th semester.

#

Need your help: I am currently studying "Data Science for SE" course and for its 2nd assignment, I have to effectively visualize big data, which is basically the data of 4 parameters: No2, O3, PM2.5, PM10 of 100 stations hourly data across the world for the year 2025. The problem is that the data is so huge that I am unable to visualize it meaningfully so as to convey anything properly. See the image attached. Well, I need your suggestions or maybe yt tutorials to effective visualizations of such big data....thank you : )

west wing Feb 28, 2026, 1:42 PM

#

high rivet Need your help: I am currently studying "Data Science for SE" course and for its...

find important features with respect to each other and plot their graph, find whose correlation is higher or lower for effective visualization

#

find their RFE

#

if its a linear data

#

you can also check their VIF score to find multicollinearity
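A rough sketch of the correlation and VIF checks suggested above, in plain Python. The pollutant readings are invented, and the function names are hypothetical. For exactly two features, VIF reduces to 1 / (1 - r^2), where r is their Pearson correlation; with more features you would regress each one on all the others instead.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def vif_two_features(xs, ys):
    """Variance inflation factor for the two-feature case only."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r * r)

# Made-up readings, one value per time step.
pm25 = [12.0, 18.0, 25.0, 31.0, 40.0]
pm10 = [20.0, 33.0, 41.0, 55.0, 70.0]
no2 = [30.0, 12.0, 25.0, 18.0, 22.0]

print("PM2.5 vs PM10:", round(pearson(pm25, pm10), 3))
print("PM2.5 vs NO2 :", round(pearson(pm25, no2), 3))
print("VIF PM2.5/PM10:", round(vif_two_features(pm25, pm10), 1))
```

Pairs with a very high correlation carry mostly the same information, so plotting one of them (or their ratio) is usually enough.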

high rivet Feb 28, 2026, 2:20 PM

#

west wing find important features with respect to each other and plot their graph, find wh...

thank you!

rancid thorn Feb 28, 2026, 5:30 PM

#

What its trying to predict

#

Cyclically repeats, always in the exact same way

#

What it did

#

So the code must be wrong

#

Because i do know for a fact that an LSTM can predict this

left tartan Feb 28, 2026, 5:32 PM

#

rancid thorn What its trying to predict

I would think that you'd want a single "time" dimension, not separate date & time.

rancid thorn Feb 28, 2026, 5:33 PM

#

What do you mean?

left tartan Feb 28, 2026, 5:33 PM

#

And, even then, if the intervals are constant, that dimension doesn't provide anything useful since it's ordered.

rancid thorn Feb 28, 2026, 5:33 PM

#

theres no separate date and time dimensions

#

Theres one feature

#

date

#

The dataframe is ordered of course, but that doesn't indicate seasonality in any way: if the weather snapshots were taken, say, 2 days apart, then the formula would have to change
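One common way to make a date feature carry seasonality regardless of the sampling interval is a cyclical encoding of the day of year. This is a minimal sketch, not taken from the pasted code; `encode_day_of_year` is a hypothetical helper and 365.25 is an approximation of the year length.

```python
import math
from datetime import date

def encode_day_of_year(d):
    """Map a date to a point on the unit circle so Dec 31 sits next to Jan 1."""
    angle = 2.0 * math.pi * (d.timetuple().tm_yday - 1) / 365.25
    return math.sin(angle), math.cos(angle)

for d in (date(2025, 1, 1), date(2025, 7, 2), date(2025, 12, 31)):
    s, c = encode_day_of_year(d)
    print(d.isoformat(), round(s, 3), round(c, 3))
```

Because the encoding depends only on the calendar date, it stays the same whether snapshots are taken daily or every two days.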

wet dome Feb 28, 2026, 6:04 PM

#

Is anyone working as a data engineer here?

left tartan Feb 28, 2026, 6:04 PM

#

rancid thorn Then the dataframe is ordered of course, but that doesnt indicate in any way sea...

Did you share your code anywhere?

rancid thorn Feb 28, 2026, 6:14 PM

#

Yeah, I'll send it again

#

!paste

arctic wedgeBOT Feb 28, 2026, 6:15 PM

#

Pasting large amounts of code

So that everyone can easily read your code, you can paste it in this website:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

pliant steppe Feb 28, 2026, 6:15 PM

#

Anyone here could help with a computer vision task? Im supposed to perform transfer learning on a pretrained model like YOLOv8-seg which i did, and the end result is very bad, i lowered the confidence threshold to 0.05 and it only gave me a single prediction of a small piece of a dendrite as seen in the image.
the labels it was trained on are manually done by me and my friend and we took a good amount of time to do them correctly so that isnt the problem either, im suspecting the biggest problem is the image size i set when training and triggering inference which is 1024 but the real image dimensions are:
(2188, 3072, 3) h, w, channels
(1094, 1536, 3)

Maybe because the dendrites are so thin in some areas or faint it ends up squishing it out of existence when resizing.
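A back-of-the-envelope check of that suspicion, assuming the longer image side is scaled down to the training size and that thin dendrites are only a few pixels wide in the original (both are assumptions, as are the stroke widths and the helper names). It also sketches overlapping tiles as one possible alternative to downscaling.

```python
def resized_width_px(stroke_px, image_hw, imgsz):
    """Width of a stroke after the longer image side is scaled to imgsz."""
    scale = imgsz / max(image_hw)
    return stroke_px * scale

def tile_origins(length, tile, overlap):
    """Start offsets of overlapping tiles that cover `length` pixels."""
    step = tile - overlap
    origins = list(range(0, max(length - tile, 0) + 1, step))
    if origins[-1] + tile < length:
        origins.append(length - tile)  # last tile is shifted back to fit
    return origins

full = (2188, 3072)  # h, w from the message above
for stroke in (2, 4, 8):  # hypothetical dendrite widths in pixels
    print(stroke, "px ->", round(resized_width_px(stroke, full, 1024), 2), "px")

print("x offsets:", tile_origins(3072, 1024, 128))
print("y offsets:", tile_origins(2188, 1024, 128))
```

At a 3x reduction, anything thinner than about 3 px ends up below one pixel wide, which fits the idea that faint structures get squished away.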

rancid thorn Feb 28, 2026, 6:15 PM

#

https://paste.pythondiscord.com/7V2Q

#

On line 50 and 51 theres a commented line with the actual features id wanna use

left tartan Feb 28, 2026, 6:29 PM

#

rancid thorn Yeah, I'll send it again

#

So, I reduced your problem down to 1 dimension: max temp

rancid thorn Feb 28, 2026, 6:30 PM

#

Howd you make this visualization

#

can you send the code?

left tartan Feb 28, 2026, 6:30 PM

#

Plotly, I will, just cleaning it.

#

This is how I like to visualize predictions

rancid thorn Feb 28, 2026, 6:31 PM

#

left tartan

Did you change the actual ai code to get this?

left tartan Feb 28, 2026, 6:33 PM

#

I think your lr was probably the biggest problem

rancid thorn Feb 28, 2026, 6:33 PM

#

whatd you do to it?

left tartan Feb 28, 2026, 6:33 PM

#

Changed it to 1e-3, from 1e-1

#

With 1e-1: Epoch 20 | train loss: 0.98093 | val loss: 1.14488

#

At 1e-3: Epoch 20 | train loss: 0.45296 | val loss: 0.49629

rancid thorn Feb 28, 2026, 6:34 PM

#

But its not 1e-1

#

its 1e-3

left tartan Feb 28, 2026, 6:35 PM

#

Oh, 1e-5 is also Epoch 20 | train loss: 0.97913 | val loss: 1.11007

#

There's a bunch of hyperparameters to play around with here

#

layers, dropout, lr, etc
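A toy illustration of why both a too-large and a too-small learning rate can leave the loss high after the same number of steps. This is plain gradient descent on f(w) = w^2, not the LSTM from the paste, and the specific rates are chosen for the toy problem only.

```python
def final_loss(lr, steps=20, w=1.0):
    """Run gradient descent on f(w) = w**2 and return the final loss."""
    for _ in range(steps):
        w -= lr * 2.0 * w  # gradient of w**2 is 2w
    return w * w

for lr in (1.1, 1e-1, 1e-5):
    print(f"lr={lr:g}  loss after 20 steps: {final_loss(lr):.6g}")
```

The large rate overshoots and diverges, the tiny rate barely moves from the starting loss, and the middle one converges, which is the same pattern as the three epoch-20 results quoted above.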

rancid thorn Feb 28, 2026, 6:36 PM

#

Well but the lr was 1e-3 in the code I sent

left tartan Feb 28, 2026, 6:36 PM

#

I'm looking at your code from yesterday https://paste.pythondiscord.com/KYORVZPCXVGMSN6PWSRHGWPQSE

rancid thorn Feb 28, 2026, 6:36 PM

#

Why? Thats worse

#

I also removed the weight decay now

left tartan Feb 28, 2026, 6:37 PM

#

Anyway, all I'm saying is, reduce to a single parameter: features = ['MaxTemp']

rancid thorn Feb 28, 2026, 6:37 PM

#

Can you send the code to plot it like you did?

left tartan Feb 28, 2026, 6:38 PM

#

Yup, one sec

pliant steppe Feb 28, 2026, 6:42 PM

#

pliant steppe Anyone here could help with a computer vision task? Im supposed to perform trans...

my learning rate is set to 0.01 default and i have early stopping on 5 epochs if the loss doesnt change and it always stops at 5 epochs like its not learning anything
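A generic sketch of patience-based early stopping, to show why a loss that never improves ends the run almost immediately. This is not the YOLO trainer's own implementation; the class, the `epochs_run` helper and the loss values are illustrative.

```python
class EarlyStopping:
    """Stop once the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

def epochs_run(losses, patience=5):
    """Number of epochs executed before the stopper fires."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(losses)

print(epochs_run([1.0] * 10))                          # flat loss
print(epochs_run([1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]))  # improving loss
```

With a completely flat loss, this version stops one epoch after the patience window, so an early stop right at the patience limit usually points to the model not learning rather than to the stopper itself.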

rancid thorn Feb 28, 2026, 6:53 PM

#

@left tartan ?

left tartan Feb 28, 2026, 6:54 PM

#

rancid thorn @left tartan ?

I gotcha, I needed sustenance

rancid thorn Feb 28, 2026, 6:54 PM

#

oh sure lol

left tartan Feb 28, 2026, 6:56 PM

#

I made a few small changes to your code, mainly changing lr... but also had to set device in a few places for cpu/gpu. The main thing in my case is I changed features to features = ["MaxTemp"].

#

https://paste.pythondiscord.com/QDWQ

main grail Feb 28, 2026, 8:23 PM

#

Hey friends. I'm developing a stocks trading app that learns from trading behavior. It can describe any market condition. It is something like a state machine. But I have no experience with ML. If someone is interested in having a look. I'd love some feedback.

opaque condor Feb 28, 2026, 9:28 PM

#

When getting the device that PyTorch can run on, is there a way of limiting how much of it is used?

warm dune Feb 28, 2026, 10:09 PM

#

main grail Hey friends. I'm developing a stocks trading app that learns from trading behavi...

Can I? dm

warm fossil Feb 28, 2026, 10:28 PM

#

Hi, is leetcode good for practicing for interviews?

waxen kindle Feb 28, 2026, 10:29 PM

#

It's good to practice leetcode questions, which may or may not be asked by interviewers

serene scaffold Mar 1, 2026, 12:26 AM

#

warm fossil Hi, is leetcode good for practicing for interviews?

this is the data science and AI channel. I was never asked a single leetcode-style question when I interviewed for DS/AI positions.
you can ask for job hunting advice in #career-advice, and you can ask leetcode questions in #algos-and-data-structs

warm fossil Mar 1, 2026, 12:28 AM

#

serene scaffold this is the data science and AI channel. I was never asked a single leetcode-sty...

okay thanks mate

warm fossil Mar 1, 2026, 12:28 AM

#

waxen kindle It's good to practice leetcode questions, which may or may not be asked by inter...

okay thanks

sterile sierra Mar 1, 2026, 1:30 AM

#

https://github.com/GriffinCanCode/Callosum anyone like this?

GitHub

GitHub - GriffinCanCode/Callosum: A language for defining AI person...

A language for defining AI personalities. Contribute to GriffinCanCode/Callosum development by creating an account on GitHub.

serene scaffold Mar 1, 2026, 1:32 AM

#

sterile sierra https://github.com/GriffinCanCode/Callosum anyone like this?

what is it? why should people like it?

sterile sierra Mar 1, 2026, 1:33 AM

#

personality DSL for agents, compatible w langchain, lets you be deterministic about ai personalities

#

compiler's in OCaml and is lightning fast

jaunty helm Mar 1, 2026, 3:05 AM

#

sterile sierra https://github.com/GriffinCanCode/Callosum anyone like this?

quickly skimming through it, I'm not sure how this isn't just a glorified system prompt swapper
I doubt those presets work as well as advertised too, instead of helpfulness: 0.90 you might as well just say high helpfulness and the latter is probably way more understandable

sterile sierra Mar 1, 2026, 3:34 AM

#

jaunty helm quickly skimming through it, I'm not sure how this isn't just a glorified system...

The output is a system prompt because that's the interface LLMs expose, and calling it a "prompt swapper" is like calling TypeScript a "glorified JS writer." Behind that output is an OCaml compiler with a real lexer/parser, typed AST, semantic analysis (cycle detection, conflicting modifiers, contradictory rules), and multi-target codegen (JSON, Lua, SQL, Cypher - not just prompts). The numeric values aren't for the LLM to interpret literally; however, they drive the DSL's rule system: behavioral conditionals, cross-trait interactions, evolution deltas, and compile-time validation that "high helpfulness" can't participate in.

jaunty helm Mar 1, 2026, 3:56 AM

#

sterile sierra The output is a system prompt because that's the interface LLMs expose and calli...

behavioral conditionals, cross-trait interactions, evolution deltas, and compile-time validation
literally what does that mean? I mean I can guess, but it sounds like you understand it more, so please go ahead?

I guess my thoughts are like, ts is a lot more complicated than js, but it provides very visible benefits (like, well I mean, types)
the dsl thing is a lot more complicated, but in the end needs me to do the same amount of work as just writing a system prompt without it anyway?

#

ig if it works for you great, though currently I'm not seeing like too much benefit

rich moth Mar 1, 2026, 4:54 AM

#

https://github.com/plunder707/muon-curiosity/tree/main

I have an experiment that my AI system and I wanted to run. Anyone have the resources?

GitHub

GitHub - plunder707/muon-curiosity: Muon vs AdamW fine-tuning exper...

Muon vs AdamW fine-tuning experiment for Qwen 3.5 35B-A3B — autonomously designed by a local AI agent after reading arXiv:2502.16982 - plunder707/muon-curiosity

pliant steppe Mar 1, 2026, 2:56 PM

#

oh my god it workeddd

#

from artificial stupidity to intelligence 📈

rancid thorn Mar 1, 2026, 4:09 PM

#

main grail Hey friends. I'm developing a stocks trading app that learns from trading behavi...

hey

rancid thorn Mar 1, 2026, 4:09 PM

#

pliant steppe oh my god it workeddd

woah i saw your question yesterday and wow its so good

pliant steppe Mar 1, 2026, 4:36 PM

#

rancid thorn woah i saw your question yesterday and wow its so good

yeah there was some loss function explosion, when i fixed it and fixed my data augmentation it went well 🕺

jaunty helm Mar 1, 2026, 4:40 PM

#

<@&831776746206265384> looks like ad ^?

zenith nova Mar 1, 2026, 4:40 PM

#

!cleanban 906481045044625428 ads

arctic wedgeBOT Mar 1, 2026, 4:41 PM

#

:incoming_envelope: :ok_hand: applied ban to @pliant temple permanently.

warm dune Mar 1, 2026, 5:09 PM

#

rancid thorn woah i saw your question yesterday and wow its so good

and your problem? did u fix it?

rancid thorn Mar 1, 2026, 5:09 PM

#

nah

past bramble Mar 1, 2026, 9:05 PM

#

Where can I find data for code of multiple programming langs, preferably labelled in large amounts?

worldly dawn Mar 1, 2026, 10:17 PM

#

past bramble Where can I find data for code of multiple programming langs, preferably labelle...

github?

serene scaffold Mar 1, 2026, 10:19 PM

#

past bramble Where can I find data for code of multiple programming langs, preferably labelle...

Are you looking for the same logic implemented in multiple languages?

past bramble Mar 2, 2026, 4:25 AM

#

worldly dawn github?

i need ones outside github

past bramble Mar 2, 2026, 4:25 AM

#

serene scaffold Are you looking for the same logic implemented in multiple languages?

nope any random source code

worldly dawn Mar 2, 2026, 4:27 AM

#

past bramble i need ones outside github

why?

past bramble Mar 2, 2026, 4:28 AM

#

worldly dawn why?

the guy I'm working with said so not really sure why

worldly dawn Mar 2, 2026, 4:28 AM

#

past bramble the guy I'm working with said so not really sure why

sounds like something worth asking

final kiln Mar 2, 2026, 5:36 AM

#

y did they have to choose json for tool definition in LLMs and for structured output

#

is supa token intensive

#

even if the LLM is cheap it just reduces performance cuz of all the cluttering in the input prompt
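A quick way to see where some of that overhead comes from: the same tool definition serialized with and without whitespace. Character counts are only a rough proxy, since real token counts depend on the model's tokenizer, and the `get_weather` tool here is a made-up example.

```python
import json

# Hypothetical tool definition in the usual JSON-schema style.
tool = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "units": {"type": "string", "enum": ["metric", "imperial"]},
        },
        "required": ["city"],
    },
}

pretty = json.dumps(tool, indent=2)
compact = json.dumps(tool, separators=(",", ":"))

print("pretty :", len(pretty), "chars")
print("compact:", len(compact), "chars")
```

Compact serialization removes the indentation cost but not the quotes and braces, which is the part a different format would have to address.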

spring field Mar 2, 2026, 7:53 AM

#

why does it have to be structured at all? does it make a difference to the LLM?

#

maybe it does make a difference if you want to parse the output

#

(and subsequently deterministically change said output to feed it back to the LLM)
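A minimal sketch of that parse, change, feed-back loop, using only the standard `json` module. The raw model output, the required keys and the `parse_tool_call` helper are all assumptions made for the example.

```python
import json

def parse_tool_call(raw, required=("name", "arguments")):
    """Return the parsed call, or None if it is not valid JSON or lacks keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or any(key not in data for key in required):
        return None
    return data

# Made-up model output.
raw_output = '{"name": "get_weather", "arguments": {"city": "Lisbon"}}'

call = parse_tool_call(raw_output)
# Deterministic post-processing before the result goes back to the model.
call["arguments"]["city"] = call["arguments"]["city"].lower()
feedback = json.dumps(call, separators=(",", ":"))
print(feedback)
```

Free-form text gives no reliable place to hook in a step like this, which is the main practical argument for a structured format.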

arctic silo Mar 2, 2026, 9:49 AM

#

I'm building a QA agent and I'm handling the context, so how can I split the codebase and index it to be used later on, or is there an MCP server that handles all of that?

#

in general, how do I handle the context when it contains hundreds of lines of code?

jaunty helm Mar 2, 2026, 10:26 AM

#

arctic silo in general how to handle the context it contains hundred line of codes?

well, first I'd ask whether answering some of these questions requires understanding all of those lines
if yes... well I mean that means you must send all of that into the llm context
if not then you can think about it more

#

there have been code-oriented embedding models coming out lately that you could try to RAG with
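One possible way to split a Python codebase before indexing it: one chunk per top-level function or class, using the standard `ast` module. This is a sketch under assumptions: the chunk fields are arbitrary, and the naive keyword `search` only stands in for an embedding lookup.

```python
import ast

def chunk_python_source(source, path="<memory>"):
    """Split a module into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "path": path,
                "name": node.name,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "text": ast.get_source_segment(source, node),
            })
    return chunks

def search(chunks, keyword):
    """Naive keyword lookup standing in for an embedding search."""
    return [chunk["name"] for chunk in chunks if keyword in chunk["text"]]

sample = '''
def add(a, b):
    return a + b

class Greeter:
    def greet(self, name):
        return "hello " + name
'''

chunks = chunk_python_source(sample, "sample.py")
for chunk in chunks:
    print(chunk["path"], chunk["name"], chunk["start_line"], chunk["end_line"])
print(search(chunks, "hello"))
```

Keeping the path and line range with each chunk lets the agent pull only the relevant pieces into context and still point back to the original file.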

jaunty helm Mar 2, 2026, 10:40 AM

#

final kiln even if the LLM is cheap it just reduces performance cuz of all the cluttering i...

I'd imagine json is one of the most trained formats for modern llms; anything custom and accuracy probably worsens
additionally I think integration with other tools is less of a headache when you just have plain ol' json

#

that said tho I went looking and landed here
I might try yaml instead 🤔

final kiln Mar 2, 2026, 10:45 AM

#

im impressed that yaml is actually better than json

final kiln Mar 2, 2026, 10:46 AM

#

jaunty helm that said tho I went looking and landed [here](https://www.improvingagents.com/b...

tho tbf they did use gpt 4.1, I'd be more interested in the latest gen of models

#

wonder if they have the code for their bench

#

interesting stuff

jaunty helm Mar 2, 2026, 10:48 AM

#

jaunty helm that said tho I went looking and landed [here](https://www.improvingagents.com/b...

well actually yeah, tool integration is gonna be a headache
everyone including the providers and the frameworks seem to have json only 😔

final kiln Mar 2, 2026, 10:49 AM

#

jaunty helm well actually yeah, tool integration is gonna be a headache everyone including t...

I might see if I can hack it into pydantic-ai somehow

jaunty helm Mar 2, 2026, 10:50 AM

#

final kiln I might see if I can hack it into pydantic-ai somehow

that would be awesome though I'm not sure how it would work exactly
considering the providers also only use json

#

ig just prompt for yaml output? probably a lot less reliable

#

local might work ig
llamacpp grammar keeps winning
-# never used the others so wouldn't know about them

final kiln Mar 2, 2026, 10:50 AM

#

jaunty helm ig just prompt for yaml output? probably a lot less reliable

yea, id have to have a custom prompt type thing, both for structured output and for tool usage

#

long term Id probably just use json cuz they are actively training the models for it

jaunty helm Mar 2, 2026, 10:51 AM

#

honestly I wouldn't be surprised if newer models have/will have tokenizers specifically optimizing for json token length

fair aspen Mar 2, 2026, 12:24 PM

#

hey guys

#

what's the best way to run llms locally with python?

#

I've tried many different things but none of them worked (vllm, transformers)

serene scaffold Mar 2, 2026, 12:45 PM

#

fair aspen what's the best way to run llms locally with python?

You have to have enough RAM, preferably on a GPU. Otherwise, you can't.

Saying that something "didn't work" doesn't communicate anything. What did you try to do, and what happened that was different from what you expected?

fair aspen Mar 2, 2026, 12:46 PM

#

serene scaffold You have to have enough RAM, preferably on a GPU. Otherwise, you can't. Saying ...

I have 16gb of vram

#

When I say it didn't work, I mean that I kept getting error after error and got burned out

serene scaffold Mar 2, 2026, 12:48 PM

#

You have to show the code and the whole error message for people to be able to help you.

grand minnow Mar 2, 2026, 12:49 PM

#

fair aspen what's the best way to run llms locally with python?

Ollama

serene scaffold Mar 2, 2026, 12:50 PM

#

(but if you just try to run them with ollama, and you don't know why you were getting errors before, you'll probably get errors with ollama.)

fair aspen Mar 2, 2026, 12:52 PM

#

serene scaffold You have to show the code and the whole error message for people to be able to h...

is this a support channel?

grand minnow Mar 2, 2026, 12:53 PM

#

fair aspen is this a support channel?

sure

fair aspen Mar 2, 2026, 12:53 PM

#

yay

#

I think I'm having an issue with ROCm

#

print(torch.cuda.is_available()) returns false

#

I have an RX 9060 XT by the way

grand minnow Mar 2, 2026, 1:08 PM

#

fair aspen ``print(torch.cuda.is_available())`` returns false

How did you install pytorch?

fair aspen Mar 2, 2026, 1:09 PM

#

grand minnow How did you install pytorch?

from the AUR:
yay -S python-pytorch

grand minnow Mar 2, 2026, 1:15 PM

#

fair aspen from the AUR: ``yay -S python-pytorch``

that seems like you've only installed the CPU only version

grand minnow Mar 2, 2026, 1:17 PM

#

fair aspen from the AUR: ``yay -S python-pytorch``

The official docs show how to install with ROCm, with CUDA, and so on

#

https://pytorch.org/get-started/locally/

final kiln Mar 2, 2026, 1:24 PM

#

jaunty helm honestly I wouldn't be surprised if newer models have/will have tokenizers speci...

ya

I figure YAML is currently better for LLMs cuz humans understand it better and hence, there's more internet text correctly discussing big complex yaml than big complex json

#

is the whole "LLMs are a reflection of ourselves" typa thing

#

at least me personally, I'd rank yaml > XML > json as for ease of understanding

fair aspen Mar 2, 2026, 1:27 PM

#

grand minnow The official docs shows how to install with ROCm or with CUDA or such and such

oh it's because I should've installed python-pytorch-rocm now it works

#

thanks
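A small diagnostic sketch for this kind of problem. It relies on ROCm builds of PyTorch exposing the GPU through the `torch.cuda` API and setting `torch.version.hip`; `describe_torch_backend` is a hypothetical helper, and it falls back to a message if torch is not installed.

```python
import importlib.util

def describe_torch_backend():
    """Summarize which backend the installed PyTorch build can use."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch
    if torch.cuda.is_available():
        # torch.version.hip is set on ROCm builds and None on CUDA builds.
        kind = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
        return f"GPU available via {kind}: {torch.cuda.get_device_name(0)}"
    return "torch is installed but sees no GPU (possibly a CPU-only build)"

print(describe_torch_backend())
```

If this reports no GPU on a machine that has one, the installed package is the first thing to check, as it turned out to be here.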

jaunty helm Mar 2, 2026, 2:00 PM

#

final kiln ya I figure YAML is currently better for LLMs cuz humans understand it better a...

honestly, shrug
to my knowledge LLMs are getting more and more RL and other post-training alignment, who knows what goes in there

#data-science-and-ml
