#data-science-and-ml


pallid badge
#

@half pulsar Would you also happen to know good readings, example about federated hybrid architecture ? The idea is to know where data is but without creating a local database.

serene scaffold
#

!warn 1398328649857372222 your message was removed for hiring, which is not allowed

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied warning to @dull glade.

hollow frigate
#

how long is going through "10 minutes to pandas" actually supposed to take

#

have been working on ts for the past 2h

#

10 minutes my ass 😭

proven pier
#

I'm curious about parallelizing training experiments, and the effects/bottlenecks it has on performance/training time

#

For my current models, I have an excessive amount of VRAM - I could train maybe 6-8 at a time. Right now I'm training 2, but it seems to have slowed down quite a bit

#

I know very little about GPU hardware. At least when it comes to CPU, I can monitor performance and I understand there's a difference between "normal" threads and hyperthreads. When you start maxing out bandwidth and all of the hyperthreads, you're going to start noticing a nonlinear drop in performance (I think). I have no idea how to monitor my gpu besides nvidia-smi

#

All that gives is memory usage, temperature, power, "Volatile GPU-Util", and process by process memory usage

#

Oh, it appears "Volatile GPU-Util" may be akin to what CPU thread usage is. Oh boy, i do see that nearing 100% 😂
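A minimal sketch of logging those numbers from Python instead of watching `nvidia-smi` by hand. It assumes the `--query-gpu` and `--format=csv,noheader,nounits` options of `nvidia-smi`; the helper names `parse_gpu_stats` and `sample_gpu` are made up for illustration.

```python
import shutil
import subprocess

QUERY_FIELDS = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one dict per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (float(field) for field in line.split(","))
        stats.append({"util_pct": util, "mem_used_mib": used, "mem_total_mib": total})
    return stats


def sample_gpu():
    """Return one reading per GPU, or None if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return None
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_gpu_stats(out)
```

Calling `sample_gpu()` every few seconds during a single-model run and again during a two-model run gives a rough comparison. Note that `utilization.gpu` reports the share of time in which at least one kernel was running, so it is only a loose analogue of CPU thread usage; if it is already near 100% with one job, spare VRAM alone probably will not buy parallel speedup.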

jaunty helm
warm dune
#

so the data when placed in a multi dimensional graph, the data get a shape, but we can’t visualize, so we put in a way that humans can see (2d or 3d)?

proven pier
half pulsar
#

Agreed

half pulsar
#

It's high in Complexity and the risk of it losing stability at scale is high.

jaunty helm
half pulsar
#

Manifolds are great

proven pier
#

At least practical experience gives me a little more perspective, if hardware prices ever come down again so I can build a dream server 🥲

half pulsar
#

You're pretty much right with not being able to ever fully visualize it. I can only draw it in 3D space. But It runs in 96 Dimensions

warm dune
#

the first picture I understand, but the second one i get lost

obtuse mauve
#

How are y'all today

proven pier
#

Me waiting 5 hours for 2 models to train in parallel because I guess my GPU's compute throughput is a lot worse than I thought

#

(without knowing metrics that's meaningless, but usually 1 model takes <= 2 hours)

dusky hemlock
obtuse mauve
#

1660 ti?😭

#

how old is that thing?

dusky hemlock
#

dang idk, my last card was a 1050 Ti. My buddy gave me the 1660 when he upgraded his setup

obtuse mauve
#

Damn bro

#

nf btw

dusky hemlock
#

yeah it's tough out here, but at least the card keeps up with most modern games which is fairly surprising for only 6 GB VRAM

#

wym by nf, i'm kinda lost lol

proven pier
#

My GPU is only 1 year younger than 1660 ti lmao

#

Oh wait, no. It's ONE YEAR OLDER

dusky hemlock
#

whaaaaa

proven pier
#

I mean idk what you're trying to train but obviously can't train anything like an LLM. There's always a limit

obtuse mauve
proven pier
#

Mine does seem to have better hardware specs though

obtuse mauve
dusky hemlock
#

And the output even when training on TinyStories was very low quality

proven pier
#

Yeah dude, training an LLM is not worth your time

dusky hemlock
#

I know D: it is sad

proven pier
#

According to the techpowerup benchmark, the 1660 Ti is roughly 71% as performant as mine. So I would still have a terrible time training LLMs

#

Why? LLM training sounds boring

dusky hemlock
#

I just want to do it, that's literally all

proven pier
#

The literal entire industry runs on consuming 90% of electricity on the grid

#

That's like being interested in racing yachts

#

Not something I could ever have any interest in

dusky hemlock
#

That's the great thing about personal interests, everyone has different ones!

obtuse mauve
#

I am training a gemini api backed llm, my original goal was to run it fully local now it's gonna be mostly gemini because it's really time consuming. Privacy gone but I have an ai assistant😂

proven pier
#

I have local LLMs I didn't train them at all

#

You can download open source ones and just let them do inference without training

dusky hemlock
#

I don't think the point is having an LLM that is already trained, at least not for me

#

I can run models locally no problem already

obtuse mauve
#

at least I had that experience

iron basalt
proven pier
#

If their models can't run good locally, I presume anything we'd train would run even worse

dusky hemlock
proven pier
#

Unless you had like 5 million dollars to blow

iron basalt
dusky hemlock
#

LMAO

#

bro i am on a 1660 Ti. A decent model to me is one that runs on my machine without exploding. Appreciate the input though

iron basalt
dusky hemlock
proven pier
#

Can one run pytorch with AMD gpu's or is it all cuda based

iron basalt
#

Asterisk.

#

Need modern AMD GPUs, and support is sometimes shaky, but fixed mostly by the OSS community. Vulkan versions (not ROCm) exist too and have been making a lot of progress.

#

Specific machines with 128GB RAM needed are sold for this purpose. They often use the AMD Ryzen AI 395+.

#

Unified RAM/VRAM.

#

They go for about 2600-4000 USD.

proven pier
#

GPU, price at launch: $1200, 2022
Price today: $1600
saltdRiot

obtuse mauve
#

A gemini api runs on raspberry pi 5 with 8 gig ram, right?😌

iron basalt
#

Which is why AMD, Nvidia, Apple, etc are selling these SoC boxes with a lot of RAM now.

#

You don't need as much compute, just a lot of RAM (for inference).

#

(Big model, and big context window)

iron basalt
#

Also it's pretty big then, need a giant box for the case.

half pulsar
#

Last year was the last year you could get 3090s for 500 bucks, now they're double that

narrow tiger
#

are there any open source llm models I can run on 1660ti (6gb) gpu, I remember trying some using ollama but it was very bad,
I mainly want to use it to generate text for maybe a description for youtube video or a tweet maybe U know basic stuff but to automate it.

dusky hemlock
narrow tiger
#

I have 16gb ram,
not sure what/how quantization works. though..
Is Qwen3 decent?

#

last time I ran a small model locally it hallucinated pretty bad like it was drunk

dusky hemlock
#

I think for the purpose of what you are trying to do it should be plenty? I'm honestly 50/50 on it, i've only used the model a short while so far. Quantization is a bit out of my repertoire for readily available explanations, if someone else can explain

#

the one I am using has reasoning and tool calls, so if that helps you decide

narrow tiger
#

nw, thanks I'll google quantization. iirc it has something to do with float sizes, like converting them from 16-bit to 8-bit, but I'll search whether I have to do this manually or whether it is something handled by ollama/models

dusky hemlock
#

No no no

#

you don't need to do manually

#

you can download the model in that format, but yes it does work with the float sizes!
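A rough sketch of the "float sizes" idea: store each weight as a small integer plus one shared scale, then multiply back at inference time. This is a simplified symmetric int8 scheme chosen for illustration; the quantized formats that actual downloadable models use are more elaborate (for example, they work block by block), and the function names here are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.33, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored value is off by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The trade is memory for precision: each weight takes one byte instead of two or four, at the cost of a rounding error bounded by `scale / 2`.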

narrow tiger
dusky hemlock
#

I cannot run that model at a realistic output rate to make it worth running to begin with, but yes that is a solid model for the card you and i have

narrow tiger
#

Thanks alot will check this one out

dusky hemlock
#

Sounds good

mossy osprey
#

Yay i fixed it i split the ai into multiple AI’s that each control one part of the trading logic and it basically works now

dim fossil
#

9+

half pulsar
#

@limber plover How's it been?

lime grove
#

I am finding the sklearn.feature_selection.SequentialFeatureSelector to be incredibly sensitive to the choice in its hyperparameters. Is it right to basically start brute forcing the aspect of the machine learning problem and accept whatever gives you the best outcomes?

#

Mainly, you have the scorer, the estimator, the direction of the search, a tolerance. But there are dozens of scorers, and dozens of estimators, and decisions of which is best rely too much on over-interpretive thought processes.

#

This is a sample

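# Import the submodules explicitly; `import sklearn as skl` alone does not load them.
import sklearn as skl
import sklearn.feature_selection
import sklearn.model_selection
import sklearn.naive_bayes

# df_work holds the 38 candidate features, df_trgt the target labels.
df_work_trn, df_work_tst, df_trgt_trn, df_trgt_tst = skl.model_selection.train_test_split(
    df_work,
    df_trgt,
    test_size=0.2,
    random_state=42
)
# Backward search: start from all features and drop one at a time.
selector = skl.feature_selection.SequentialFeatureSelector(
    skl.naive_bayes.GaussianNB(),
    n_features_to_select='auto',
    direction='backward',
    scoring='balanced_accuracy',
    tol=0.0001,
    cv=5,
)
selector.fit(df_work_trn, df_trgt_trn)
# get_support() returns a boolean mask over the feature columns.
feature_names = df_work.columns[selector.get_support()]
print(feature_names)
print(len(feature_names))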
#

there are a total of 38 features, but I can end up with anything between 3 and 33 features depending on these hyperparameter selections

#

for instance, I can replace 'balanced_accuracy' w/ 'roc_auc', or replace GaussianNB with XGBoost. And so on

#

and note that prior to this I performed an alternative feature selection using statistical tests. Got p-values from Logistic Regression and chi-squared tests for independence. Those produced features that resulted in a certain f1-score found in the high 80s

#

But, again, the feature set was as seemingly arbitrary as anything produced by the SFS approach
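One hedged way to make the comparison less arbitrary is to treat the selector's settings like any other hyperparameter: put the selector and the model in one `Pipeline` and cross-validate the whole thing, so selection is redone inside each fold and every configuration is judged on data it never saw. The sketch below uses synthetic data as a stand-in for the real frame, and `score_config` is a hypothetical helper.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the real data (assumption: 8 features, 3 informative).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=3, n_redundant=0, random_state=42
)


def score_config(direction, scoring):
    """Cross-validate selector + model together; selection never sees the held-out fold."""
    pipeline = Pipeline([
        ("select", SequentialFeatureSelector(
            GaussianNB(),
            n_features_to_select=3,
            direction=direction,
            scoring=scoring,
            cv=3,
        )),
        ("model", GaussianNB()),
    ])
    return cross_val_score(pipeline, X, y, scoring="balanced_accuracy", cv=5)


# Mean held-out score for each (direction, inner scorer) combination.
results = {
    (direction, scoring): score_config(direction, scoring).mean()
    for direction in ("forward", "backward")
    for scoring in ("balanced_accuracy", "roc_auc")
}
```

If several configurations land within noise of each other on the outer score, the differing feature sets are probably interchangeable for prediction, and the simpler one is a reasonable pick.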

bronze wyvern
#

Hi, quick question, is there a recommended asymmetric semantic search model? Currently I'm using my base model of MSMARCO but just wondering if there is a newer/better one?

bronze wyvern
#

noted, will give that a look, ty !

jaunty helm
# narrow tiger are there any open source llm models I can run on 1660ti (6gb) gpu, I remember t...

well one, "description for video" and "description for text tweet" are in wildly different parks of complexity; the former requires vision capabilities. esp. since you're on very constrained hardware you should try to narrow down the scope then search for specialized models
and 2, personally I advise moving away from ollama to just llamacpp, the former sits on top of the latter anyways and uses different api from everyone else for some reason

bronze wyvern
bronze wyvern
#

noted, ty !

jaunty helm
#

or well, probably - what are you trying to do exactly?

bronze wyvern
#

like, I have an app, an animal welfare app, currently I look for exact keywords to return some results. For e.g, user type "dogs", anything that has keyword dogs is retrieved.

But if I write labrador, I want to also have dogs to be retrieved since they are semantically the same
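A toy sketch of what retrieval by meaning looks like, assuming each document and query has already been turned into a vector by some embedding model. The three-dimensional vectors below are invented by hand purely for illustration; only the ranking step is shown.

```python
import math

# Hypothetical embeddings; a real system would get these from an embedding model.
EMBEDDINGS = {
    "dogs": [0.9, 0.1, 0.0],
    "cats": [0.1, 0.9, 0.0],
    "adoption forms": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def search(query_vector, top_k=2):
    """Rank stored documents by cosine similarity to the query vector."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda name: cosine(query_vector, EMBEDDINGS[name]),
        reverse=True,
    )
    return ranked[:top_k]


labrador = [0.8, 0.2, 0.1]  # hypothetical embedding of the query "labrador"
top_hits = search(labrador)
```

Because "labrador" sits close to "dogs" in the vector space, it ranks first even though the two strings share no keyword.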

jaunty helm
#

yeah probably retrieval then

bronze wyvern
#

noted

#

question, have a look at this picture, this answer was generated by a LLM. Am I reading wrong or it's the LLM which is wrong? Normally as size of dataset increases, shouldn't we increase number of epoch?

I'm training a model based on a triplet dataset.

jaunty helm
bronze wyvern
#

yeah I see

bronze wyvern
jaunty helm
bronze wyvern
#

yeah I see

warm dune
#

Could someone correct me if I'm wrong, please? Basically, what a neural nn does is the following:

Each layer has its neurons, and each neuron in that layer generates a hyperplane with its own activation intensity. If the intensity is high, the hyperplane remains on the map; if it's low, it's deactivated, meaning it disappears. In the end, we would have several intersecting hyperplanes.

After that, for better visualization, we ignore the finiteness of the hyperplanes (but they would still be infinite), that is, where they intersect, they end there (so having a shape that represents how all the hyperplanes would look together) The final neurons observe which hyperplanes (neurons) are active and perform a linear function to check if these hyperplanes belong to them or not (basically, if the shape of the hyperplanes represents their "village"). If the output neuron perceives that certain neurons are active (that is, the shape represents its village), it understands that it belongs to it

stuck bluff
#

.

#

ok

iron basalt
#

And to solve XOR this is required, since a simple line can't separate it correctly.

#

XOR problem being to have the model mimic an XOR gate.

#

(One of the original problems that caused the search for something better and ultimately multi-layer, nonlinear, solutions (followed by deep learning (backpropagation) as a way to train it))
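A small sketch of why one hidden layer is enough for XOR. The weights are picked by hand rather than learned, and the ReLU activation is an assumption made for the example: two hidden units bend the input space so that a linear output can separate the cases.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def xor_net(x1, x2):
    """Two hidden ReLU units followed by a linear output unit."""
    h1 = relu(x1 + x2)        # active as soon as either input is on
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are on
    return h1 - 2.0 * h2      # the second unit cancels the (1, 1) case


truth_table = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
```

Without the nonlinearity the two hidden units would collapse into a single linear function of `x1 + x2`, which cannot produce 0, 1, 1, 0.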

warm dune
#

That's basically what I was trying to say: each neuron produces a line/hyperplane with its own slope, and by joining them we can form curves and more abstract figures

warm dune
#

in classification I understand that we need to separate the data with the lines, but in regression i get lost

bronze wyvern
#

Hi quick question, when building a neural network in pytorch for an nlp multiclass classification task, how do we determine what the model should look like, including the hidden layers?

serene scaffold
bronze wyvern
#

yep noted, ty !

iron basalt
#

Passing through/near the points.

#

Warping lets you pass through them all perfectly. But now you have another problem, how does it handle new points? A very warped thing wiggles and jumps all over the place, maybe the new points are just simply in between the others (linearly for example).

#

Also what happens if you keep adding more parameters?

warm dune
warm dune
iron basalt
#

The neurons are just part of the computation of that big function.

#

You could write it out as a big function.

#

The graphical representation with "neurons" is actually just a visual way to represent that function, specifically what is sometimes called a "compute graph" (a DAG).

#

For example you could represent f(x, y) = x + y like that in symbolic algebraic form. Or as a 2D visual graph with 3 nodes ("""neurons""" in a deep learning model), the first 2 are the x and y inputs, and the third is the plus. So it looks like two nodes feeding into a third (x and y into plus).

#

You can also represent it as a tree, which is what a parser of that expression does. Plus is the parent, and x and y are children.

#

Execution is then just left child, right child, parent in that order. Where encountering x means push x's value into the stack, same for y, and then when you get to plus it pops the top two off the stack, adds them, and pushes the result.

#

The compute graph is a more generalized form of that, using a directed acyclic graph, which is more flexible than a regular tree (neither has cycles, but in a DAG a node's output can feed several other nodes, so a shared value is computed once and reused).

#

f(a, b) = e = c * d = (a + b) * (b + 1)

#

The "neurons" are just part of the expression/function.
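The two views described above can be sketched in a few lines, using the same function f(a, b) = (a + b) * (b + 1). The names `forward` and `eval_postfix` are made up for the example: the first walks the graph node by node, the second runs the stack procedure on a postfix token list.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def forward(a, b):
    """Evaluate e = (a + b) * (b + 1) node by node, in topological order."""
    c = a + b   # node c takes inputs a and b
    d = b + 1   # node d reuses b, which is why this is a DAG and not a tree
    e = c * d   # output node
    return {"c": c, "d": d, "e": e}


def eval_postfix(tokens, env):
    """Stack evaluation: push operands, pop two on an operator, push the result."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(env[tok] if tok in env else float(tok))
    return stack.pop()


graph_result = forward(2, 3)["e"]
stack_result = eval_postfix(["a", "b", "+", "b", "1", "+", "*"], {"a": 2, "b": 3})
```

Both routes compute the same number; the graph form simply keeps the intermediate nodes `c` and `d` around, which is what backpropagation later relies on.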

half pulsar
#

Finally we're getting some math in here. Best way to think of it, only way.

subtle lotus
#

I didn't know that if I wanna build an app AI like Grok and ChatGPT I must use React Native.

#

I really thought that with python I can build my AI app multi-platform hybrid

bronze wyvern
#

Hi, quick question, padding and truncation in nlp text classification pipeline, when does it occur? I thought it was at the tokenization level but I think I was wrong, it occurs at another stage?

marsh mango
#

Hey guys, quick question: how do you remember Python code? I'm having a problem remembering code, if anyone can guide me

serene scaffold
bronze wyvern
#

yeah like before we feed into the model

#

I understand that we need it to have the same size/same tensor size
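A minimal sketch of that step, assuming the text has already been tokenized into integer ids and that id 0 is reserved for padding (both are assumptions for the example). Padding and truncation happen on the id sequences, just before they are stacked into one tensor; some libraries fold this into the tokenizer call itself, which may be why it looks like part of tokenization.

```python
PAD_ID = 0  # assumed id reserved for padding


def pad_or_truncate(token_ids, max_len, pad_id=PAD_ID):
    """Return exactly max_len ids: cut long sequences, right-pad short ones."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


batch = [[5, 8, 2], [7], [4, 9, 1, 3, 6]]
padded = [pad_or_truncate(seq, max_len=4) for seq in batch]
```

After this every row has length 4, so the batch can be turned into a single rectangular tensor.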

serene scaffold
serene scaffold
#

and by "usually", I can't think of a counterexample. but things in AI usually aren't absolute.

bronze wyvern
#

yep I see, small question, when do we add the unknown / out-of-vocabulary token?
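One common arrangement, sketched under assumptions: the vocabulary is built from the training split only, a special `<unk>` entry is reserved when the vocabulary is built, and the substitution itself happens at encoding time whenever a token is missing. The `min_count` threshold and the function names are illustrative choices.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"


def build_vocab(token_lists, min_count=2):
    """Assign ids to tokens seen at least min_count times; ids 0 and 1 are reserved."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    vocab = {PAD: 0, UNK: 1}
    for tok, n in counts.items():
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Look up each token, falling back to the <unk> id when it is missing."""
    return [vocab.get(tok, vocab[UNK]) for tok in tokens]


train = [["ocean", "s", "twelve"], ["ocean", "s", "eleven"]]
vocab = build_vocab(train)
encoded = encode(["ocean", "s", "thirteen"], vocab)
```

Rare training tokens ("twelve", "eleven") and unseen ones ("thirteen") all map to the same `<unk>` id, so the model sees some unknowns during training too.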

bronze wyvern
#

okk, ty !

#

Quick question, should I be bothered with the standalone "s"?

serene scaffold
bronze wyvern
#

in the dataset, it's written as "ocean s" instead of "ocean's"

#

ocean s twelve raids box office ocean s twelve the crime caper sequel starring george clooney brad pitt and julia roberts has gone straight to number one in the us box office chart. it took $40.8m (£21m) in weekend ticket sales according to studio estimates. the sequel follows the master criminals as they try to pull off three major heists across europe. it knocked last week s number one national treasure into third place. wesley snipes blade: trinity was in second taking $16.1m (£8.4m). rounding out the top five was animated fable the polar express starring tom hanks and festive comedy christmas with the kranks. ocean s twelve box office triumph marks the fourth-biggest opening for a december release in the us after the three films in the lord of the rings trilogy. the sequel narrowly beat its 2001 predecessor ocean s eleven which took $38.1m (£19.8m) on its opening weekend and $184m (£95.8m) in total. a remake of the 1960s film starring frank sinatra and the rat pack ocean s eleven was directed by oscar-winning director steven soderbergh. soderbergh returns to direct the hit sequel which reunites clooney pitt and roberts with matt damon andy garcia and elliott gould. catherine zeta-jones joins the all-star cast. it s just a fun good holiday movie said dan fellman president of distribution at warner bros. however us critics were less complimentary about the $110m (£57.2m) project with the los angeles times labelling it a dispiriting vanity project . a milder review in the new york times dubbed the sequel unabashedly trivial .

serene scaffold
bronze wyvern
#

is that an indication of a poor dataset? 😭

Should I switch to AG news instead of this dataset?

serene scaffold
bronze wyvern
serene scaffold
bronze wyvern
#

multi class classification... I was reading the rows of text, the rows feel strange tbh

#

have a look at this one:

s korean credit card firm rescued south korea s largest credit card firm has averted liquidation following a one trillion won ($960m; £499m) bail-out.  lg card had been threatened with collapse because of its huge debts but the firm s creditors and its former parent have stepped in to rescue it. a consortium of creditors and lg group  a family owned conglomerate  have each put up $480m to stabilise the firm. lg card has seven million customers and its collapse would have sent shockwaves through the country s economy.  the firm s creditors - which own 99% of lg card - have been trying to agree a deal to secure its future for several weeks. they took control of the company in january when it avoided bankruptcy only through a $4.5bn bail-out.  they had threatened to delist the company  a move which would have triggered massive debt redemptions and forced the company into bankruptcy  unless agreement was reached on its future funding.  lg card will not need any more financial aid after this   laah chong-gyu  executive director of korea development bank - one of the firm s creditors - said.  the agreement will see some 12 trillion won of debt converted into equity.  the purpose of the capital injection is to avoid delisting and the goal will be met   david kim  an analyst at sejong securities  told reuters. south korea s consumer credit market has been slowly recovering from a crisis in 2002 when a credit bubble burst and millions of consumers fell behind on their debt repayments. lg card returned to profit in september but needed further capital to avoid being thrown off the market. south korea s stock exchange can delist any firm if its debt exceeds its assets two years running.
serene scaffold
bronze wyvern
#

I mean sometime it doesn't make sense, like one of the firm s creditors, it seems it removed all 's

bronze wyvern
#

word2vec followed by a lstm

serene scaffold
#

it looks like in this dataset, when "trailing s" is used for pluralization (rather than possession), there's no space separation. so you don't need to worry about "isolated s" being overloaded (to mean both possession and pluralization)

#

so adding the apostrophe back doesn't make a difference.

balmy fog
#

hey guys!

#

i have a quick question

bronze wyvern
balmy fog
#

how do i make a counter on openCV so that as soon as I start the camera, the counter goes until the stopkey?

#

like a time counter mb
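A possible sketch: record a start time when the capture opens, then draw the elapsed time on every frame until the stop key is pressed. It uses `cv2.VideoCapture`, `cv2.putText`, `cv2.imshow`, and `cv2.waitKey`; the function names, the `"q"` stop key, and camera index 0 are assumptions.

```python
import time


def format_elapsed(seconds):
    """Format a duration in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def run_camera_timer(stop_key="q", camera_index=0):
    """Show the camera feed with a running timer until stop_key is pressed."""
    import cv2  # imported here so format_elapsed works without OpenCV installed

    capture = cv2.VideoCapture(camera_index)
    start = time.monotonic()
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            label = format_elapsed(time.monotonic() - start)
            cv2.putText(frame, label, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                        1.0, (0, 255, 0), 2)
            cv2.imshow("camera", frame)
            if cv2.waitKey(1) & 0xFF == ord(stop_key):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
    return time.monotonic() - start
```

Calling `run_camera_timer()` opens the window and returns the total number of seconds once the key is pressed. `time.monotonic()` is used instead of counting frames so the timer stays correct even if the frame rate varies.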

serene scaffold
#

!warn 1443252853874495528 your messages were removed for offering a form of employment, which is not allowed.

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied warning to @calm cargo.

opaque condor
#

If I want to teach an AI colors like the name of certain colors I would need to give it every color combination and type

final cobalt
opaque condor
#

Question

Let's say to make x color

You need this this and this do I have to type out each and every combination so it has multiple references to a variable that can change

serene scaffold
#

@opaque condor what does it mean to "teach a color to AI"? be as specific as possible.

final cobalt
#

It really depends on what the input is. I'll just say, this is a fascinating little problem that would be perfect as a beginner's project. I'm going to take it and flesh it out a bit, so the next time I encounter someone asking "what's a good beginner ML project" I can show them

#

So for starters, in the most practical sense, you don't need an AI for this. A color is just a number (usually RGB). Imagine a 3D space just like the X, Y, Z coordinates we live in. A color is just a set of three coordinates just like X, Y, and Z - it's RGB

#

Right? So in a purely informational / numerical sense, to pair "a color" the number with "a color" the name, you just need a dictionary

#

And if you want a little wiggle room, you just snap to the nearest named color
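The dictionary-plus-snapping idea might look like the sketch below. The palette is a small hand-picked set of names with their usual RGB values, and plain squared Euclidean distance in RGB is an assumption made for simplicity; it is not perceptually uniform, so a different color space could give nicer matches.

```python
# Small illustrative palette; a real one would hold many more names.
NAMED_COLORS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "orange": (255, 165, 0),
}


def squared_distance(c1, c2):
    """Squared Euclidean distance between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))


def nearest_name(rgb):
    """Snap an arbitrary RGB triple to the closest named color."""
    return min(NAMED_COLORS, key=lambda name: squared_distance(rgb, NAMED_COLORS[name]))


exact = nearest_name((255, 0, 0))
close_to_orange = nearest_name((250, 160, 10))
```

No training is involved: the "model" is just the table of names and a distance function.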

serene scaffold
#

I think Mechanical Fox wants to train a generative LLM to be able to answer the question "what colored paints would I need to mix together to create color x?"

final cobalt
#

Well I was circling in on this. So @opaque condor in your own words

opaque condor
#

It was going to be for
Mixing paints

Or subtractive colors
So I can mix pigments or paints

final cobalt
#

CMYK?

opaque condor
#

Yes

final cobalt
#

Well, I can't speak to the exact nature of pigments - I recommend you take what I'm about to tell you to someone with actual training in color theory

opaque condor
#

I do painting but I'm wondering since this is going to be in text do I have to type down each individual pigment color separately to mix together etc

half pulsar
#

You don't need a generative LLM to achieve that

final cobalt
#

^

#

Honestly, the standard approach to representing a color is by writing in RGB or some other format

opaque condor
#

Well I wanted to make a specific machine learning algorithm that could tell me how many milliliters so I get the perfect shade

final cobalt
#

That color, by definition, is the recipe

#

And if you need a mapping from color recipe to color name, just treat it as a vector space and snap to nearest name in the space for each set of coordinates (or vice versa)

half pulsar
#

This is just color theory + math. CMY/CMYK can approximate it, and you solve for pigment ratios with optimization.
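A toy sketch of "solve for pigment ratios with optimization". Everything numeric here is an assumption: the pigment CMY values are invented, and the mixing model is a plain weighted average, which real paints do not follow closely. It only shows the shape of the problem: pick ratios that sum to 1 and minimise the distance to the target color.

```python
from itertools import product

# Hypothetical pigments as (C, M, Y) strengths in [0, 1]; real values would be measured.
PIGMENTS = {
    "blue": (0.9, 0.4, 0.0),
    "red": (0.0, 0.9, 0.7),
    "yellow": (0.0, 0.1, 0.9),
}


def mix(ratios):
    """Linear toy model: the mix is the ratio-weighted average of pigment vectors."""
    return tuple(
        sum(r * PIGMENTS[name][i] for name, r in ratios.items())
        for i in range(3)
    )


def solve_ratios(target, steps=20):
    """Grid-search ratios (summing to 1) that minimise squared error to target."""
    names = list(PIGMENTS)
    best, best_err = None, float("inf")
    for counts in product(range(steps + 1), repeat=len(names)):
        if sum(counts) != steps:
            continue
        ratios = {n: c / steps for n, c in zip(names, counts)}
        err = sum((m - t) ** 2 for m, t in zip(mix(ratios), target))
        if err < best_err:
            best, best_err = ratios, err
    return best


def to_millilitres(ratios, total_ml):
    """Scale ratios to volumes for a batch of total_ml millilitres."""
    return {name: r * total_ml for name, r in ratios.items()}


known = {"blue": 0.25, "red": 0.5, "yellow": 0.25}
recovered = solve_ratios(mix(known))
```

A grid search is used only because it needs no extra libraries; with measured pigment data and a physical mixing model, a proper constrained optimizer would be the more likely choice.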

half pulsar
#

I wouldn't trust a LLM or Transformer here, as if you want real reliable results then you're gonna have to actually model it physically

ember nexus
#

hi - I'm trying to build some things tangentially around AI/ML workloads: what is it that is important to have in a language or elsewhere that would be of interest? I mean things that python or even C does not either make accessible or doesn't have as a built-in thing.
Secondly - where can I find code that these people running long-job inference or other things are running? I want to test things with comparable jobs to the real world

warm dune
iron basalt
#

(1 input, 2 hidden, 1 output)

#

Try making just f(x)=max(0, ax + b) in Desmos, and mess around with a and b as sliders.

#

Then combine multiple and mess around with the sliders.
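The same experiment can be tried in code. This sketch builds the (1 input, 2 hidden, 1 output) shape from two `max(0, a*x + b)` units and a weighted sum; the particular slopes and weights are picked by hand to show one recognisable result, the absolute value function.

```python
def relu_unit(x, a, b):
    """One hidden neuron: f(x) = max(0, a*x + b)."""
    return max(0.0, a * x + b)


def tiny_net(x, hidden, out_weights, out_bias=0.0):
    """1 input, len(hidden) ReLU units, 1 linear output."""
    return out_bias + sum(
        w * relu_unit(x, a, b) for w, (a, b) in zip(out_weights, hidden)
    )


# Two hinges facing opposite directions add up to |x|.
abs_hidden = [(1.0, 0.0), (-1.0, 0.0)]
abs_weights = [1.0, 1.0]

samples = [tiny_net(x, abs_hidden, abs_weights) for x in (-3.0, 0.0, 2.5)]
```

Each unit contributes one bend; changing `a` and `b` moves where the bend sits and how steep it is, and more units give more bends, which is how a piecewise-linear curve can follow regression data.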

warm dune
# iron basalt

I left out the bias to make it simpler to visualize, but now that's correct, right?

iron basalt
#

In the linear algebra sense. In calculus both are linear (contextual meaning).

warm dune
#

finally I understand

#

thks

final cobalt
#

I'm pleased

#

I've joined up on a server for AI researchers

#

And as best I can tell, these are the people publishing the groundbreaking work. As a general rule, they don't let people who don't publish join up unless they're willing to just lurk

#

So I shared with them the architecture I'm working on for an MtG AI

#

And for the most part, they tell me there's no obvious flaws

heavy dagger
#

where can i learn about fine-tuning a model? Yes there are a lot of videos out there but what resources do you recommend?

serene scaffold
#

The only wrong answer to this question is one that you look up.

#

(when we get to the end of this, you might even understand why there probably isn't a "good video" about fine tuning in general.)

ocean hinge
dusky hemlock
#

Hey folks, I have moved past the desire for a custom model of any kind, and am now looking to generate flowcharts or diagrams via python with Machine Learning or Reinforcement Learning based on human approval or rejection mechanics, does anything like this already exist, or can I theoretically build it in Python?

#

Basically the goal is to allow any amount of data to be input, in either .md, .txt and (maybe?) image formats, to enable the user (me) to programmatically generate or quickly iterate through designs with generative functionality, specifically for the purpose of creating layout diagrams for branching pathways of decision trees

#

ML and RL would come in (and maybe I am mistaken in what these are actually defined as, so please, correct me if I am wrong!) handy with constructing background memory (possibly a poor term) or context for the rest of the layout diagram, and allow me to quickly append, say, 3 branches of a decision's possible outcomes and then either approve of or reject the output of the generator (this, i think will be the easier part. I have a GUI for generating layouts in a different context so I can apply those lessons here). The approvals and rejections would "reinforce" things to avoid in that project's context, and things to lean-on more heavily.

Am i talking about a pipe dream/schizo or is this possible?

wooden sail
#

there are several tools that can already generate flowcharts and diagrams from md or md-like inputs in a deterministic fashion

dusky hemlock
#

I have tried mermaid, and various mcp-servers for LM Studio to integrate with an existing LLM interface, what kind of tools should I search for? Literally just "deterministic flowchart generator"?

wooden sail
#

an easy variation of what you're describing is asking an llm to generate, as an example, the latex source for a diagram based on markdown/text and images you provide

dusky hemlock
#

what does latex source get me? I am unfamiliar with it

wooden sail
#

latex (and the newer typst, if you're interested) allow you to generate pdf files with text, maths, diagrams, figures, etc based on text commands

#

it's a so-called "what you see is what you mean" pdf generator

dusky hemlock
#

oh wow! okay cool, I will do some experimentation with that before i ask anymore about my idea

#

thanks a ton!

wooden sail
#

let's see if this example works

strange elbowBOT
wooden sail
#

oof

#

maybe something smaller

#

.latex
\begin{tikzpicture}[
% select an arrow style
>=latex',
%
% set nodes to be 2cm wide, 0.8cm tall rectangles
every node/.style={
draw,
minimum width=2cm,
minimum height=0.8cm,
},
]
% place and name the nodes
\node (S1) {Step 1};
\node[below=of S1] (S2) {Step 2};
\node[below=of S2] (S3) {Step 3};

    % draw connecting arrows
    \draw[->] 
          (S1) edge (S2)
          (S2) edge (S3);
\end{tikzpicture}
strange elbowBOT
dusky hemlock
#

oooh that's just what i need!

wooden sail
#

@dusky hemlock something like that

#

maybe typst is easier to work with, but latex is more "well-established"

#

describing your desired diagram to an llm and asking it to generate latex for it is a pretty standard task. typst might not work so well because it's newer, but latex has decades of stackoverflow content on which llms have trained

dusky hemlock
#

Okay wonderful I am gonna give this a try

#

before i do though

#

you were mentioning that it could use markdown as input already? is that the case with LLMs in the context of what we're discussing?

#

I know normal LLMs use markdown as a standard, i mean for input do I have to do anything special for the markdown to be accepted as an input for the latex to function?

wooden sail
#

i'm fairly confident an llm should be able to handle it

dusky hemlock
#

Fantastic thanks again for the tips

wooden sail
#

though i think pandoc lets you convert markdown to latex by specifying some sort of configuration

#

emacs org mode does something similar

lime grove
#

I am coming across an issue

#

when scaling data (StandardScaler, RobustScaler, whatever)

#

is it best practice to scale the entire dataset, i.e. before the train/test split, or is it best practice to scale the individual sub-datasets after the split?

#

it looks like scaling before train/test split might cause data leakage, but this is model dependent

wooden sail
#

you would usually determine either a scaling factor based solely on the training data, or you would apply some procedure that independently brings any dataset you'll feed into the network into an expected distribution/scale

#

e.g. you can normalize every input so that the largest value is 1 or the magnitude is 1

#

or each data set to be normally distributed with a given mean and variance (e.g. standard values or values determined from the training set)

#

the approach that works best depends on the nature of the data and the network you use

lime grove
#

can you clarify?

#

it sounds like you are just saying "it depends"

#

the usual workflow is

  1. clean up the data
  2. train/test split
  3. scale the data
  4. apply the classification / regression model
#

I am just asking if I should change the order of steps 2 & 3

#

it really seems that the order above is right, because model evaluation should be performed on test data that is nominally independent from the training data. Scaling the whole dataset in one go removes that independence

wooden sail
#

what you should never do is scale the training data based on the test data. in that sense, you can keep your numbered points in the order they already are

#

your options will rather be whether to scale the data completely independently, or to make a scaling scheme based on the training data and apply it to the test data (possibly without ever looking at the test data's properties)
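The second option (a scaling scheme learned from the training data and then applied to the test data) can be sketched with the standard library alone. The numbers are made up; with scikit-learn the equivalent pattern would be calling `fit` on the training split and `transform` on both splits.

```python
from statistics import mean, pstdev


def fit_scaler(train_values):
    """Learn the scaling parameters from the training split only."""
    return mean(train_values), pstdev(train_values)


def transform(values, params):
    """Apply previously learned parameters; never re-fit on the data being scaled."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]


train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 11.0]

params = fit_scaler(train)                # uses train only
train_scaled = transform(train, params)
test_scaled = transform(test, params)     # reuses the train mean and std
```

The test values end up on the training scale without contributing to it, so a test point outside the training range (11.0 here) simply maps to a value larger than any scaled training point instead of shifting the statistics.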

brave berry
#

How should I start learning ML any recommended videos and reps to start?

restive bay
#

Hi everyone, i have a good knowledge of python, streamlit(dashboards) , ML models and now i want to learn more and explore through projects so if anyone of you have any projects where i can contribute please share

grand minnow
warm dune
#

Hey guys, are there any important certifications for machine and deep learning?

drifting gust
#

I can assure you it increases stamina and throughput, as well as lowers risk

#

Definitally penetrates the market in ways that need more exploration

serene scaffold
ocean hinge
#

Hey guys.

I want to know how much I know, and how much I don't. I have done courses in data science and gone through Andrew Ng's course on Coursera. I want to focus on machine learning or computer vision. What topics should I check out? So far, the projects I have done are image classification using Qwen and training a model to detect embryo quality using TensorFlow. I learnt about YOLO, PaddleOCR, and EasyOCR.

#

So do I know atleast 10% of the subject? I can perform EDA, perform analytics by using pandas, matplotlib, seaborn, numpy. Currently learning how to scrape data from a site using beautifulsoup and playwright.

valid fractal
#

I just started out on that.

spring field
ocean hinge
iron basalt
# ocean hinge Hey guys. I want to know how much I know, and how much I dont. I have done cou...

It seems like you may want deep knowledge. So let me explain what I mean by that. I generally split knowledge into two categories, deep and shallow. Shallow knowledge is like memorizing certain libraries/APIs, or memorizing how to add two numbers on paper using the standard algorithm learned in schools for addition. It's what is needed to actually get real work done, and do so efficiently/quickly (practicing the actual real world things). Deep knowledge is like understanding the math behind the models you are using, or knowing why the addition algorithm learned in school works, the structure of numbers as strings of digits. Deep knowledge is needed to generalize well, and be able to do things like invent new types of models, and algorithms. Right now you listed what can be categorized as a lot shallow knowledge. You can accomplish many real practical tasks using many existing tools. But if your end goal is to "learn everything," then you probably want a ton of deep knowledge too. To that end I recommend getting books that dig into how all those tools you have been using work, from the ground up. If you feel like you could have invented some of these tools yourself, you have acquired deep knowledge.

#

Note that complete lack of shallow knowledge can hurt your attempt at getting deep knowledge, as you lack a bunch of practical tools to test your deep knowledge understanding.

#

This is why in schools they often teach arithmetic first via just memorization and lots of practice before getting into any deep mathematical knowledge (if you struggle with multiplying two numbers in your head your journey through that deep knowledge will be a slog).

drifting gust
#

I can't resist the idea of pushing the front-line of research

#

I think im like beethoven here with my bachelors comp sci degree and lifetime nerd hacking exp

deep summit
#

@serene scaffold in what sense do you think yesn't is being used?

i mean n't is a negation clitic, right? and in english a negation clitic can only attach to auxiliary verbs. yes is not an auxiliary verb, it's not even a verb, so any assumption that *yesn't is a valid phonological word is False!!?

half pulsar
#

They are a dime a dozen and mean nothing.

#

Hiring is focused on two things: experience and degrees.

#

There are no shortcuts.

ocean hinge
worldly dawn
#

They are there to give students opportunities, but it's up to them to take them or to limit themselves as a way to only get a job

ocean hinge
worldly dawn
#

if you want to know how much you know or don't, it may be useful to look at some textbooks or papers published in your area(s) of interest

ocean hinge
ocean hinge
#

Also, isn't it better to have a wide range of options? Focusing on only one position doesn't seem right, what if there are no openings for that position? I'd have to wait months for one to appear.

steel harness
#

Hello!! Someone use Kaggle for data science?

serene scaffold
#

also the problem with "yesn't" isn't phonological. it's that the semantics are incoherent.

final badge
#

i struggle with calculating central tendency on paper. I know the formulas and everything but i make a lot of careless mistakes when doing it on paper

#

is that a problem

serene scaffold
serene scaffold
#

the point is whether you understand what they mean

final badge
#

i struggle with doing long calculations using +,-,* and /, when it requires a lot of steps

#

i somehow end up making a mistake

serene scaffold
#

you need to be good at algebra.

#

in terms of figuring out what formula represents a problem

final badge
serene scaffold
final badge
serene scaffold
final badge
#

I might know how to solve it but i mess up the calculation part

#

especially when its tedious

ocean hinge
serene scaffold
# ocean hinge Still not answered. Anyone?

a few people responded. And Squiggle is especially knowledgeable, so I'd give what they said at least one additional read.

end goal would be to learn everything
you pretty much can't learn everything about ML. Just pick a direction that looks interesting and run towards it.

ocean hinge
wooden sail
#

suffice to say getting a masters or phd in mathematics can be a good starting point to go in full depth in the theory of some ML topics

ocean hinge
#

But are there topics you think every newbie needs to know? Should be a handful no?

wooden sail
#

that's a completely different question, yes

#

what you asked before encompassed like 100 years of mathematics and computer science

#

for newbies wanting to understand what they're doing, linear algebra, multivariable calculus and statistics are the starting point to gently move into optimization

ocean hinge
wooden sail
#

so you can do stuff like compute covariance matrices and do eigenvalue decompositions, low rank approximation, statistical parameter estimation?

ocean hinge
ocean hinge
wooden sail
#

well, not only

#

that's one version of it, the more modern deep learning approach

#

most problems don't require deep learning though

#

so convex and nonconvex optimization are generally useful topics to study, which can include deep learning but also include other stuff

wooden sail
#

what properties does your solution inherit if you use regularizers of different kinds

#

why do cnns work well for image processing

ocean hinge
wooden sail
#

you can look into books on optimization

#

stephen boyd has great free resources on convex optimization

ocean hinge
#

Is it okay to add you as friend? Only if you are comfortable. I just have these kind of questions.

wooden sail
#

i'd rather not. i'm usually lurking here anyway

#

i also like louis scharf's statistical signal processing

#

very enlightening

ocean hinge
drifting gust
#

@ocean hinge I'm not in the ML industry, but it all comes down to the knockout competition of winning a round of interviews. So whatever you can do to impress your employer and beat any code tests they throw at you will help you. A degree is in the impress category. So is having a portfolio of works you've done. I'm not willing to waste extra years pursuing a degree in a field moving so fast, so I'm going with portfolio and self study

#

You could probably get together your vision projects and make a portfolio of them and apply to jobs for a year and hopefully land one :p Also I'd say you should be getting AI to write your code, not avoiding it. Embrace it :p (but maximize your understanding and knowledge)

fair aspen
#

how do I install rocm and pytorch for gfx 1200 (rx 9060 xt) on a python venv?

drifting gust
#

this way I don't have to mess my systems drivers / torch up

#

  torchbox-custom-models:
    build:
      context: .
      dockerfile: Dockerfile
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    volumes:
      - /system/code/dir:/path/in/container
    environment:
      - HSA_OVERRIDE_GFX_VERSION=10.3.0
      - HIP_VISIBLE_DEVICES=0
    ipc: host
    network_mode: host
    shm_size: 16g
    cap_add:
      - SYS_PTRACE
    security_opt:
      - seccomp:unconfined
    group_add:
      - video    
#
FROM rocm/pytorch:latest

# Set working directory
WORKDIR /root/dockerx/custom_models/

# Install Python dependencies
COPY requirements.txt requirements.txt

RUN pip install -r requirements.txt

CMD ["/usr/bin/bash"]
fair aspen
drifting gust
#

You probably wanna check the card + rocm version. I'm not sure if that HSA_OVERRIDE_GFX_VERSION is the right one

fair aspen
#

yeah I think that's an rx 6800 xt or something

frail wing
#

who uses math libraries in python? have you ever implemented a bayesian hierarchical model with a horseshoe prior using variational inference and assessed convergence without mcmc?

waxen kindle
#

that is soooooo specific

#

but you can probably just ask your questions, even if people haven't done exactly the same thing, they will be able to answer

peak thorn
#

I'm trying to build an iris biometric verification system using python without any physical fingerprint sensor. I'm planning to capture images from a camera, detect fingerprint edges, store them, and then compare them for verification, but the current version is not working well: accuracy is very low.

If anyone knows something, please help me with solutions and suggestions. It would be very helpful.

waxen kindle
#

Wait, so you are using fingerprints or irises ?

mental bronze
#

I think he meant capturing the fingerprint using the camera

#

If that's even feasible, I think it's going to largely depend on the camera ability

waxen kindle
#

I don't think that's what they meant

waxen kindle
#

(Which is what I assume you are doing)

hazy vector
#

Bro, I am in my final year of AI and data science. I got a job offer as a customer care associate. Should I accept it, or are any companies offering AI roles for freshers? And which IT roles should I prepare for? I am confused about whether to take a course in Python development, Java development, or data analysis 😶‍🌫️

half pulsar
ocean hinge
#

Hello

I wrote a model that detects which species is there in a given image. I am getting an accuracy of 86%. How can I fine-tune it? Just increase my dataset?

serene scaffold
ocean hinge
#

@serene scaffold

ocean hinge
serene scaffold
#

so either the confusion matrix is wrong, or "accuracy of 86%" is wrong.
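
A quick way to check the two numbers against each other, as a minimal sketch: accuracy is the sum of the confusion matrix diagonal divided by the sum of all entries. The counts below are made up for illustration and are not the matrix from the conversation.

```python
def accuracy_from_confusion(cm):
    """Accuracy = correctly classified samples / all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))  # diagonal entries
    total = sum(sum(row) for row in cm)              # every entry
    return correct / total


# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted class).
cm = [
    [40, 5, 5],
    [4, 45, 1],
    [3, 2, 45],
]

acc = accuracy_from_confusion(cm)
print(f"accuracy = {acc:.3f}")
```

If the value computed this way disagrees with the reported accuracy, one of the two was produced from different data or with a bug.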

bronze wyvern
#

Hello, quick question... pytorch is the way to go now when it comes to DL/ML/NLP stuff.

So I was wondering, what would be the proper way to start learning how to implement a linear regression model using pytorch for e.g.

For example, say I learnt linear regression and now I want to implement it through code. How would you people recommed me to learn pytorch pls because it seems too big, I don't really know where to start.

serene scaffold
bronze wyvern
#

noted, will keep that in mind, ty !

wooden sail
bronze wyvern
#

yep noted

static shadow
#

im not the greatest at ai ml and things of that nature, but can somebody assure me if my understanding is correct?

I want to get the cosine similarity between A and B. B, let's say, is our target, and A is our input, which we are unsure about. Say B is one word and A is 15 words, but includes the word found in B. Because there are 14 other words in A, the similarity between these 2 vectors will be lower, right? Supposedly this is called dilution. Is that correct? I want an answer from a human, not AIs, as this is a domain I'm not well versed in and I feel like AIs could hallucinate too hard.

serene scaffold
#

I want an answer from a human not AIs
It is against the rules to copy and paste "answers" directly from AI in this server.

static shadow
#

assume that both A and B have the same dimension (384 for example), but let's say that before the vector embedding is done, B is literally "Hello" and A would be "Hello, to you there too... " etc, generally more words, that is the idea
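
A rough sketch of the dilution idea, assuming plain word-count vectors rather than a real 384-dimensional embedding model. With counts, a one-word target that appears once among n distinct words has cosine similarity 1/sqrt(n), so the score drops as the input gets longer. Dense sentence embeddings are not word counts, so the exact numbers will differ, but pooling many tokens into one vector tends to have a similar effect. The sentences are made up.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


target = bow("hello")
short_text = bow("hello there")
long_text = bow("hello there how are you doing today my good old friend from school")

sim_short = cosine(target, short_text)  # 1 / sqrt(2)
sim_long = cosine(target, long_text)    # 1 / sqrt(13)
print(f"short: {sim_short:.3f}  long: {sim_long:.3f}")
```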

rich moth
#

i made a new memory system, i just got it up on a web UI today, it's been in a terminal since development. It connects via API using OpenAI endpoints. I'm putting the finishing touches on the fast MCP server, but you can use cloud agents too

rich moth
#

well, it blows Letta/Mem0 off the shelves. it's rust and python, but i need to figure out how to package it

gentle girder
#

Hi friends

half pulsar
#

ok

spring field
#

why

ivory hare
#

How do I start making AI and stuff?

jaunty helm
fickle oxide
brave berry
#

anyone have recommended tutorial vids or reps to be able to learn about ML?

fickle oxide
#

i have text book with reps

grim wolf
#

does anyone have a background in (cv2, yolo, ctranslate2, edge-tts, piper-tts)? I am building a voice AI agent with OS-level navigation integrated. If you're interested, reach me by DM.

serene scaffold
#

@grim wolf it's not appropriate to use this server to recruit people into secret projects.

ocean hinge
#

Hello

Can anyone explain what PCA really is? I am not able to understand the google definition. I get you are decreasing the features, but how? You just ignore it?

pseudo lark
# ocean hinge Hello Can anyone explain what PCA really is? I am not able to understand the go...

Covariance:
Measurement that indicates how random variables change together, this gives us the direction and value of the relationship.

Covariance matrix:
Relationships among the features

Derived numbers from that for PCA selection -

Eigenvector: direction of component

Eigenvalue: amount of variance explained

You want the eigenvectors with the largest eigenvalues. These are the principal components. Hope that’s put simply enough
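
The steps above can be sketched in NumPy as follows. This is a minimal illustration on made-up data (three features, one of them nearly a copy of another), not a replacement for a library PCA. Note that features are not simply ignored: each kept component is a weighted combination of all original features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 3 features; feature 2 is almost a copy of feature 0.
x0 = rng.normal(size=200)
x1 = rng.normal(size=200)
x2 = x0 + 0.05 * rng.normal(size=200)
X = np.column_stack([x0, x1, x2])

# 1) Center the data, 2) covariance matrix among the features (3 x 3).
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# 3) Eigenvectors = directions, eigenvalues = variance along each direction.
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
order = np.argsort(eigvals)[::-1]       # re-sort to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 4) Keep the top-k directions and project the data onto them.
k = 2
components = eigvecs[:, :k]
X_reduced = Xc @ components             # shape (200, 2)

explained = eigvals[:k].sum() / eigvals.sum()
print(X_reduced.shape, f"variance kept: {explained:.4f}")
```

Because feature 2 carries almost no information beyond feature 0, two components keep nearly all of the variance here.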

ocean hinge
pseudo lark
#

Load a basic dataset in Python and run

import numpy as np
import pandas as pd

cov_matrix = np.cov(data, rowvar=False)  # data: rows = samples, columns = features
pd.DataFrame(cov_matrix)

Then compare that generated table to the shape of the dataset. You’ll see this behavior.

Edited: I mixed this up with the more traditional stats convention. The call needs rowvar=False in cov() so that columns are treated as the variables.

warm dune
#

the process of transforming images into matrices, does this preprocessing have a specific name?

iron basalt
warm dune
iron basalt
#

Unless you mean capturing an image physically with a camera?

warm dune
iron basalt
#

It's still numbers, just compressed.

warm dune
# iron basalt It's still numbers, just compressed.

u don't say, my point is that a model can't perform linear algebra on a compressed file header. It needs a structured numerical input, so I was asking for the industry term for that 'file-to-matrix' bridge, is it decompression? or other?

iron basalt
# warm dune u don't say, my point is that a model can't perform linear algebra on a compress...

So there are multiple things going on. The first is loading the file into main memory, then there is parsing the file, and then there is performing decompression to get the image data itself in uncompressed form. Then you are associating that data with an object in the programming language. You are giving that object ownership of that data. That object has associated operations (methods) it can perform on/with the data it references. So it looks something like: load image with Pillow, it reads the file into main memory (RAM), parses it using its JPEG parser, decompresses the image, and returns a Pillow image object that has ownership of that data. I am now going to make a bit of a guess here as to what I think you are referring to. You then give that Pillow object to Numpy and now you have a Numpy array that references that image data. This process does not exactly have a single name, it's just passing around a reference to the data to another library.

#

The type of object from a data structures POV is called a multi-dimensional array, or N-D array for short as numpy calls its arrays.

#

Matrices are specifically a 2D table of things. Most typically used as notation for linear transforms in linear algebra.

#

Images don't fit that, unless they are specifically greyscale or binary image formats (or any single channel format).

#

They have a third dimension, which are the color channels.
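
A small sketch of the shapes being described, using NumPy only. The arrays are built by hand with made-up sizes; with Pillow, something like `np.asarray(Image.open(path))` would give arrays of the same kind from a real file.

```python
import numpy as np

height, width = 4, 6

# Single-channel (greyscale) image: a 2-D array, so it can be read as a matrix.
gray = np.zeros((height, width), dtype=np.uint8)

# Colour image: a 3-D array, with the last axis holding the R, G, B channels.
rgb = np.zeros((height, width, 3), dtype=np.uint8)
rgb[..., 0] = 255  # fill the red channel

# Each individual channel is again a 2-D array.
red_channel = rgb[:, :, 0]

# Models usually take a batch of images, which adds one more axis in front.
batch = np.stack([rgb, rgb])

print(gray.shape, rgb.shape, red_channel.shape, batch.shape)
```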

warm dune
lime grove
#

another data science workflow question

#

earlier I asked about scaling the data prior to the train-test split, and the outcome of that query is that no, you do not scale the data before you perform any sort of n-fold validation

#

scaling the data before n-folding would cause leakage between the folds, which is malpractice

#

now, a different and slightly related question has arisen

#

when you are performing feature engineering, e.g. taking a numerical feature, and applying transformations on it. Multiplication of features with each other, or taking exponentials, or powers, etc., all in an effort to increase the number of features

#

should this be done before the train-test split, or after?

serene scaffold
#

also, if the operation affects how you represent data that goes into the model, you need to be able to perform the same operation on the X data of both train and test

#

otherwise, you won't be able to run the test data.
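
One way to picture the distinction, as a sketch on made-up data. Row-wise transforms (products, powers) use no information from other rows, so applying them before or after the split gives the same result. Anything that learns statistics, like a scaler's mean and standard deviation, is fit on the training rows only and then applied to both sets. The `add_features` helper is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(100, 2))  # toy data, 2 numeric features

# Split first.
idx = rng.permutation(len(X))
train_idx, test_idx = idx[:80], idx[80:]
X_train, X_test = X[train_idx], X[test_idx]


def add_features(X):
    """Row-wise feature engineering: each output row depends only on its own input row."""
    return np.column_stack([X, X[:, 0] * X[:, 1], X[:, 0] ** 2])


# The same function is applied to train and test.
X_train_fe = add_features(X_train)
X_test_fe = add_features(X_test)

# Scaling learns statistics, so they come from the training set only.
mean = X_train_fe.mean(axis=0)
std = X_train_fe.std(axis=0)
X_train_scaled = (X_train_fe - mean) / std
X_test_scaled = (X_test_fe - mean) / std  # reuses the train statistics

print(X_train_scaled.shape, X_test_scaled.shape)
```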

rich river
#

I heard some people say that anaconda will be replaced by uv, is it true?

serene scaffold
#

there were many years after anaconda should have completely died, before uv was even an idea

#

I've been working in data science and AI since 2017, and I've never used anaconda a single time, and I've never needed to use it in any way, for any reason, ever.

#

even on Windows.

#

@rich river is that clear?

rich river
#

I see, I just wondered if I can just uninstall it

serene scaffold
rich river
serene scaffold
#

That's a separate issue. Like I said, anaconda should have gone away many years ago.

#

I'd reckon that 2016 was the last year that using anaconda wasn't embarrassing.

glass temple
#

I have a kind of peculiar problem: I have 2 datasets that are mostly categorical values. That in and of itself isn't much of an issue, but the problem is that the unique values in the training and testing datasets are quite different.

The scoring metric is accuracy, so getting a decent score isn't much of an issue, but whenever I tried training the models, the validation accuracy comes out to 1.0, and the test score is ~0.94. I have no idea how to improve the model and get an actual useful result as the validation score is always max, even for base models...

#

So far, I tried one hot and ordinal for linear models, and ordinal for tree models. One hot with linear, and ordinal with tree models have a val score of 1.0, while ordinal with linear ones overfitted to a val score of ~0.99, but the test scores were ~0.89

glass temple
#

I mean, the leaderboard has a score of 1 after an hour of submissions...

jaunty helm
#

how large is the test set

glass temple
#

plus, it kinda feels like cheating as in, i didn't even do anything but train a base logistic regression

glass temple
#

the train set is 7k rows, but after dropping some missing data, it's around 6.8k

jaunty helm
#

that just means the dataset's easy to separate
I'd start by looking at the ones your model classified incorrectly

glass temple
#

I don't know what the cutoff score is gonna be, but it's not gonna be under 0.85 considering the number of people who scored over .9 after less than 24 hours

glass temple
jaunty helm
#

or can you not say due to competition rules or sth (if so, it's gonna be a bit harder to help)

glass temple
#

here's the data

#

I can send a link to the kaggle comp if you want

glass temple
jaunty helm
#

ehh might be slightly different actually
number_of_bruises is new

glass temple
#

ah, then there's no way I can get a higher score as I can't use external data 😅

jaunty helm
#

well it's not that you have to use external data to train your model to get better scores
but it probably will be a lot harder
esp. for the data that only appears in the testing set
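
A common way to at least keep the pipeline well defined for categories that only show up in the test set is to reserve an extra "unknown" slot when one-hot encoding. This is a sketch with made-up values, and it does not give the model any real knowledge about the unseen category.

```python
def fit_categories(values):
    """Collect the categories seen in the training data."""
    return sorted(set(values))


def one_hot(value, categories):
    """One-hot encode with one extra trailing slot for values not seen in training."""
    vec = [0] * (len(categories) + 1)
    idx = categories.index(value) if value in categories else len(categories)
    vec[idx] = 1
    return vec


# Hypothetical column: "purple" never appears in the training data.
train_colors = ["brown", "white", "brown", "yellow"]
test_colors = ["white", "purple"]

categories = fit_categories(train_colors)  # learned from train only
encoded = [one_hot(v, categories) for v in test_colors]
print(categories)
print(encoded)
```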

glass temple
#

yeah I guessed so

#

looking at the uci dataset, I'm pretty sure they just added the number_of_bruises as a numerical feature later on

#

I'm assuming that there's no way to cover the missing unique data from the test dataset?

jaunty helm
glass temple
#

looking at it, I might be able to reverse engineer how they made the test dataset

jaunty helm
#

at which point it's less a data science exercise, but sure if you want to, or its important to score high for whatever reason

glass temple
glass temple
#

thank you for finding out the real dataset and the help

ivory hare
bronze wyvern
#

Hello, just wanted to ask something.... I've seen all the influencers out there saying that the programmer/developer job is over. I wanted to know to what extent this is true. A friend of mine told me that even the creator of Node made a claim like that. I didn't really bother with the influencers, but if creators of things like the Node runtime start to talk about it, I want to know how "bad" it is for developers. Can we expect this to get worse?

In a time like that, do you people recommend to learn specific skills so that we can stand out from the others? I would really appreciate some advice 🙏

serene scaffold
bronze wyvern
#

yepp noted

tight stag
#

guys can anyone help me, im currently working with an anonymized dataset with a huge distribution shift. it's hft data, shuffled, and we are not supposed to order it by time. it's framed as a regression task on shuffled row-level samples.

does anyone know what i can do? I have tried regression and lgbm but my LB score is like -0.331

#

please ping me if replying

warm dune
pseudo lark
pseudo lark
# bronze wyvern Hello, just wanted to ask something.... I've seen all the influencers out there ...

I’ve recently started a small internship with the Florida Department of Transportation, and the skill that matters is shifting toward thought process rather than raw coding ability. Tasks like data cleaning/transformation used to be offloaded to entry-level people, but they no longer take enough time to justify that extra wage. Those tasks help new people gain domain knowledge, though. So entry-level openings are more limited, at least with our current economy.

Pipelines, workflows, etc? AI still booty cheeks at creativity. Let alone complex ensemble predictive models.

half pulsar
#

I'm seriously thinking at this rate we will not be getting DDR6 on consumer builds.

serene scaffold
#

It will probably take a few years for hardware prices to normalize

half pulsar
#

I regret selling my high capacity servers last year, that's all I gotta say

#

DDR4 was basically borderline ewaste last year

vague frost
#

In ai engineering what would u guyz suggest to learn first c++ or python

mellow vector
# vague frost In ai engineering what would u guyz suggest to learn first c++ or python

Obviously, you'll encounter bias for python on this server but the data ecosystem is top notch. Before you're writing models though, you'll want to spend a couple months, minimum, going over core python. NN code is very dense, you'll want to know how to read it going in. That said, once you're able to parse the code, you can whip up a working MNIST model in like half an hour and dive into AI coding.

glass temple
half pulsar
#

C/C#/C++ are some of the most fundamental languages one must know. You can easily translate your knowledge from there.

#

Later you will also find them VERY powerful, because Python has limits that C solves.

rich moth
#

I say learn rust, python and go but thats me 🙂

#

Separate yourself from the pack. Find your own path, I'd research it. No one can tell where your path will be in 5-10 years.

#

Not in these times.

plush shuttle
#

does anyone know a good dataset for a chatbot ai

#

or should i scrape it

plush shuttle
#

altho i mightve forgot rust and go

#

since i dont use them anymore

violet mauve
#

Looking for a beginner level ML/DS study partner. (3-5 only)
We’ll study for at least an hour in VC at night (IST) daily.
It's okay if you can't open your mic
Just dm me those who want to join

plush shuttle
limpid zenith
ocean veldt
#

guys who has the right tutorial which can teach me data sciences i know nothing

#

but i know python

ornate bone
#

hey developers, i am in my freshman yr and i started to learn python and its libraries. i wanted to ask a few questions. after learning a single concept, like if i learned OOP today, should i try to build something with it, or are the assignments i am doing with lectures fine, and i build a project when it's the right time (like after learning a lot and getting an idea)?

ocean veldt
ornate bone
ocean veldt
#

if you use ai this means it aint your work

#

that makes u a non python coder

ornate bone
#

like i am not asking it to code

#

just prompting: i know OOP and basic python, give me some project ideas to build on these topics

grand minnow
grand minnow
serene scaffold
wheat snow
#

for a coursework i have trained a DQN to balance a frictionless cart pole (the usual one from gymnasium https://gymnasium.farama.org/environments/classic_control/cart_pole/ with the usual rules: the episode terminates after 500 steps, or if the angle is above 24 degrees, or if the cart position is beyond ±4.8)

when generating a greedy policy slice of how my dqn acts (cart position frozen at 0, cart velocity frozen at 4 values, each of which gets a subplot), i struggle to understand those push-right islands in the lower 2 plots... could someone explain them to me?


cedar tusk
cedar tusk
#

was fun

#

didnt even use deep learning

wheat snow
#

my loss function is given

#

i can share teh coursework later

#

but i have more bad news... :/

#

mind helping me out in a bit? @cedar tusk

cedar tusk
wheat snow
#

i share the stuff in a bit

#

more like the slices

#

are super inconsistent

#

and i do not know how to answer the problem the best

#

this is first up given for the loss

#

this is the coursework

#

bruh i cant sent .pdf

fierce python
#

I'm researching a bit about TurboQuant and was quite amazed with their work.

On their 1-bit error correction, do they add the attention score (using quantized query and key, Q^ . K^) to both error corrections, i.e. (Q^ . K^) + ΔQ^ + ΔK^? Or just (Q^ . K^) + ΔQ^? Here ΔQ^ and ΔK^ are the scalar corrections.

tall garden
#

You need to do math for AI ????

limpid zenith
#

yes

limpid zenith
fierce python
# cedar tusk https://github.com/TheTom/turboquant_plus what in the fuuuuuuuuuuuuuuu

It's interesting on how they dequantize the vectors back to their approximate original form instead of adding an error correction after calculating the attention scores.

I think they can even further reduce the KV-Cache size by calculating error after attention scores are calculated. Since in the current implementation with dequantization, they require the projection matrix matching the Key/Query dimension which is dxd.

Theoretically, they can reduce the projection matrix to m x d where m is far smaller than d by calculating a scalar correction and modifying the attention scores instead.

cedar tusk
#

ai is math my dude

austere marsh
tall garden
#

Can you share a roadmap or something ?

austere marsh
#

wdym no

serene scaffold
austere marsh
#

ok well i use tensorflow

tidal bough
#

...tensorflow is outdated??

#

what happened to it?

austere marsh
serene scaffold
austere marsh
#

basically both tensorflow and pytorch are good but there really isn't one standard

serene scaffold
#

I've never seen a coworker use tensorflow ever in the last five years.

#

just use pytorch. be on the winning team.

austere marsh
#

team?

serene scaffold
#

the team of people who use pytorch

austere marsh
#

i give up

serene scaffold
#

you don't need to give up. you can start winning by using pytorch.

austere marsh
#

i give up with arguing

austere marsh
#

got nothing for that one huh

serene scaffold
#

I wasn't looking at this channel, but yeah, I agree

#

Every soul is free to choose their life and what they'll be.

austere marsh
serene scaffold
#

how so?

austere marsh
#

because i hate the first guy in the moderator list

#

yk

#

im from israel n stuff

#

we have some unwanted problems over here..

serene scaffold
#

@austere marsh I'm muting you if this continues. you can have whatever opinion you want, and you can put any country flag in your nickname, but we're not discussing this in the server.

austere marsh
#

well you asked i answered but ok..

serene scaffold
#

saying "except when it comes to politics" doesn't indicate that you're about to say you hate a member of the staff.

austere marsh
#

like people are hating each other and idk why..

#

let's just live all happily together.

serene scaffold
#

Great, we'll leave it at that.

austere marsh
serene scaffold
#

I said it because I didn't understand what you were trying to say.
Everything that you say is "your fault".
Send a message to @sonic vapor if you have any other questions or comments about this.

austere marsh
#

oh?

#

ok

austere marsh
#

can we ban this guy

#

@serene scaffold

serene scaffold
#

!ban 932187617288667138 antisemitism

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied ban to @warped vault permanently.

robust echo
#

You don't need to get insanely deep into those subjects to start, but it's worth having the basics in order to understand the properties of NNs and how they're designed.

#

i.e. in order to understand the effect of using different loss functions in a network, you have to understand its partial derivatives for a given input/output as a function of the weights and biases in the network.

#

So being able to do basic vector math and take derivatives is really useful.
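
To make the point about loss functions concrete, here is a minimal sketch with a single "neuron" y_hat = w*x + b and one made-up training example. It compares the hand-derived partial derivatives of squared error and absolute error with a numerical estimate. The squared-error gradient grows with the size of the error, while the absolute-error gradient only depends on its sign.

```python
def predict(w, b, x):
    return w * x + b


def mse(w, b, x, y):
    return (predict(w, b, x) - y) ** 2


def mse_grad(w, b, x, y):
    """d/dw and d/db of (w*x + b - y)^2, by the chain rule."""
    err = predict(w, b, x) - y
    return 2 * err * x, 2 * err


def mae(w, b, x, y):
    return abs(predict(w, b, x) - y)


def mae_grad(w, b, x, y):
    """d/dw and d/db of |w*x + b - y| (valid when the error is nonzero)."""
    sign = 1.0 if predict(w, b, x) - y > 0 else -1.0
    return sign * x, sign


def numeric_grad(loss, w, b, x, y, eps=1e-6):
    """Central finite differences, used only to check the formulas above."""
    dw = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
    db = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
    return dw, db


w, b, x, y = 0.5, 0.1, 2.0, 3.0  # made-up weight, bias, input, target
print("MSE grad:", mse_grad(w, b, x, y), numeric_grad(mse, w, b, x, y))
print("MAE grad:", mae_grad(w, b, x, y), numeric_grad(mae, w, b, x, y))
```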

iron basalt
#

Jax is the new shiny toy for them.

#

TF 1.0 was a broken rigid mess, then Pytorch came in and got all those users. They tried to make a comeback with TF 2.0, but it was too late by then.

#

(And now Google has moved on)

#

(Final nail in the coffin)

lime grove
glossy crow
half pulsar
#

Can't believe that just happened

summer mist
#

Someone has a good tutorial on binary classification, more particularly on svm and kernel trick ?

idle stone
#

i have hands-on machine learning 2023 with scikit-learn, tensorflow and keras, is this still one of the golden texts for machine learning even though everything i've seen is saying that pytorch is the dominant framework now? just curious if a lot of the material is relevant and would help me picking up pytorch while getting the fundamentals a lot quicker

serene scaffold
idle stone
#

i've had some experience with pytorch before, but i think this book will set the ground stage for everything and then i can tackle pytorch specifically once i finish this with some background

wide wing
#

I got a question How much should ik about MLOps if i want to become a machine learning engineer?

serene scaffold
# wide wing I got a question How much should ik about MLOps if i want to become a machine le...

it's going to depend on what role you end up getting and how that company/team decides to distribute labor. I have a coworker who on paper has the same job as me, but he pretty much only does MLOps.

Be prepared to learn as much about MLOps, or about whatever else, as ends up being needed of you in the future. But for the moment, I would get comfortable with Docker and its core concepts.

twilit prism
#

So... when it comes to training LLMs and inference, is MoE just a matter of having the autoregression take on the task of picking the feature-space of the embeddings to apply adjustments?

wide wing
serene scaffold
#

and current stuff ("""""agentic""""""""") isn't really NLP

wide wing
serene scaffold
wide wing
serene scaffold
#

you also won't be able to create your own LLM unless you get a job at an exceptionally powerful company. LLMs cost millions of dollars to create.

waxen kindle
peak lark
#

i am having trouble recreating john cramers sonification of post planck epoch radiowave data from 2013

#

because if you rotate the data by 90 degrees, and cross reference declassified 2003 gateway, theres already quite the uh, that, before you even have to cross reference current classified gateway

#

basically im tryna do what cramer did, but, finer detail for wave inspection

#

oh, you're currently inside a black hole btw. rather, EVERYTHING is a black hole. that much has been hashed, now im just tryna confirm if its recursive black holes.

this proposed formula for 'why everything' wasn't supposed to end up being a black hole annulus intentionally. /but/...

#

one is one instance, the next 100k instances, the last one million instances, you can think of the 100k as sort of a theta wave oscillator of the hippocampus, but staring down the cyclical bangs, the color is just phase density being used as a heatmap, the last is one million

#

one trillion.

#

oh, sorry, forgot the 100k for probability, and here's the full proposed formula for 'Chaos sequence'

#

almost had it earlier

fading wigeon
peak lark
fading wigeon
#

I know how LLMs work

peak lark
#

sorry, just saw the question, i dunno you personally

#

nice name btw

fading wigeon
#

They claimed that they don't know how to code, but have taken ML basics and know how they work. I was trying to gauge their level of understanding of the architecture through asking

#

And ty 🙂

peak lark
fading wigeon
#

I'll chime in more about my thoughts and the mechanisms after I hear from them, haha. Don't want to ask a question and give the answer away

#

Oh no 🙁

serene scaffold
#

!unmute 115751921813422082

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: pardoned infraction timeout for @peak lark.

serene scaffold
#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

So that everyone can easily read your code, you can paste it in this website:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

serene scaffold
#

you can attach more than one file at once in this paste bin.

peak lark
#

ty

peak lark
fading wigeon
#

On a different topic of handling missing data, how do you guys feel about model-based imputation? I know stacked models are a thing, but using a model to solve a problem with a model just seems..... maybe this is a me problem but it seems cursed, lol. Maybe it's just something I need to get over.

#

I have no problem with teacher/student modelling or transfer learning, for whatever reason

peak lark
fading wigeon
#

(For anyone unaware about model-based imputation, it's a more recent method of handling missing data by having.... a model try to guess at the missing value based on the other features.)
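
To make that concrete, here's a minimal sketch of the idea with one feature predicting another. The data is made up, and a plain least-squares line stands in for whatever model you'd actually use; a real pipeline would more likely reach for a library imputer.

```python
# Toy model-based imputation: predict a missing feature from another feature.
# The rows below are invented for illustration; None marks a missing value.
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, None), (4.0, 8.2), (5.0, None), (6.0, 12.1)]

complete = [(x, y) for x, y in rows if y is not None]
n = len(complete)
mean_x = sum(x for x, _ in complete) / n
mean_y = sum(y for _, y in complete) / n

# Ordinary least squares for y = slope * x + intercept, fitted on complete rows only.
slope = sum((x - mean_x) * (y - mean_y) for x, y in complete) / sum(
    (x - mean_x) ** 2 for x, _ in complete
)
intercept = mean_y - slope * mean_x

# Keep observed values as they are; fill each gap with the model's prediction.
imputed = [(x, y if y is not None else slope * x + intercept) for x, y in rows]
print(imputed)
```

The imputed values are only as good as the relationship the model found in the complete rows, which is part of why it can feel circular.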

fading wigeon
#

I would personally advocate for removing records with missing data, but if there's a systemic reason you're missing data you're going to bias your results

peak lark
#

i got into this whole mess chasing 3 body vector problem off a hunch for, at the time, a buckshot whim for orthogonal superphase.

#

now the only thing it DOESNT involve is string theory

fading wigeon
#

Wasn't the three body problem solved, even if it was ugly?

#

I haven't followed it too closely

peak lark
fading wigeon
#

No, I'm not talking about like... specific orbits. I mean I thought there was a closed form solution that was ugly but worked.

#

Oh. The solve is contested

peak lark
#

the reason why it happens i believe, is the universe is inherently uneven, 'lopsided'; both the cia model and my own conclude it to be product of phi

fading wigeon
#

Seems like we first have to agree on if the three masses are equal

peak lark
fading wigeon
#

I mean, I think the most famous example of 3 body problem is the sun, the moon, and the earth

#

Because it could, you know, kill us

peak lark
peak lark
peak lark
#

i had that formula hashed BEFORE gettin this stuff,

peak lark
#

the brain too is a 3 body system

#

pi/phi my guy, planck era took place essentially because pre non-deterministic reality following chaos sequence just, ran out of anything else that adhered to the demand,

#

this could be paraphrased as 'infinite novelty'

#

anything else would be a closed deterministic loop, in full, just one thing defined by nothing, resulting in 'final event horizon scenario' all over again, planck era.

#

in essence this would be clarifying boltzmann brain as not quite the case and expanding on hawking's otherwise monopole formula for time/space/minkowski space

#

it even clarifies antimatter/weak nuclear force violations

peak lark
#

fantastic, i managed to corrupt python

dusty valve
peak lark
#

numpy._core._exceptions._ArrayMemoryError: Unable to allocate 2.06 GiB for an array with shape (2048, 2817, 48) and data type float64
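
For reference, the 2.06 GiB in that error is just the element count times 8 bytes per float64. A quick sanity check in plain Python, assuming nothing beyond the shape in the message, also shows what dropping to float32 would save:

```python
# Size of the array from the error message: shape (2048, 2817, 48), float64.
shape = (2048, 2817, 48)
n_elements = 1
for dim in shape:
    n_elements *= dim

GIB = 1024 ** 3
bytes_float64 = n_elements * 8  # float64 uses 8 bytes per element
bytes_float32 = n_elements * 4  # float32 uses 4 bytes per element

print(f"float64: {bytes_float64 / GIB:.2f} GiB")  # float64: 2.06 GiB
print(f"float32: {bytes_float32 / GIB:.2f} GiB")  # float32: 1.03 GiB
```

Whether float32 is precise enough depends on the computation, so treat that as an option to test rather than a guaranteed fix.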

peak lark
#

so, we try something lighter

fading wigeon
#

Wait.... are you suggesting closed form solutions exist for 4+ n_body problems?

peak lark
fading wigeon
#

There may be some specific high-symmetry special cases, but generally I'm pretty sure it doesn't get easier at 4+, it's just 3 where it gets hard. Since 2 is solvable

peak lark
#

you're a sharp lad i'll give you that

#

as reminder, phase is the cia word for toroidal minkowski space

dusty valve
peak lark
#

i didnt make it very far, um, would anybody mind lending a hand if they have more ram than i do?

arctic wedgeBOT
peak lark
#

tryna do that, lots of freezing

#

pain without end

dusty valve
#

Someone on ts server has a 256gb ram hobbyists datascience setup

peak lark
#

did i do it right?

peak lark
shut vapor
#

Hi, I'm about to finish CS50 introduction to artificial intelligence and I started studying calculus 1-2. What should I study (and what resources could I use) for data analysis and machine learning? Ty

half pulsar
peak lark
# half pulsar Optimize it, RAM shouldn't be the limiting factor here.

did, found out there's more missing from the details about how john cramer did that, in fact, MOST EVERYTHING related to this is just 404 links not found
https://library.wolfram.com/infocenter/MathSource/5083/
https://wayback.archive-it.org/21834/20250903013453/https:/map.gsfc.nasa.gov/

between the classified stuff, these details, and the fact even hawking said it was the most hype of his entire career, then proceeding to make zero revision to his otherwise monopole formula? not hard to see a gag order when it happens, this isnt historically a novel concept, back when the iron curtain was a thing, you practically werent allowed to discuss water being wet

#

nasa even commissioned it as a bronze statue. why's this the first you're seeing it?

half pulsar
#

What are you even trying to do here?

peak lark
#

/currently/ im tryna figure out if common logic be the case, and its black holes all the way up/down/bifurcated

peak lark
# half pulsar What?

2003 declassified depiction of consciousness energy grid throughout a brain compared to bang wave, care to see the stuff regarding consciousness in here? they even go into what quantifies death and identity down to a geometry

half pulsar
#

Man did you take too many shrooms or something

peak lark
#

not for like 8 years at least i mean

#

been awhile

#

i got this stuff like, not even a day after hashing my own formula as posted earlier

half pulsar
#

You're thinking aloud here, sure I give you that they are pretty patterns, but I want to see a cool demo or something not speculative out of context stuff.

half pulsar
#

There is no structure or form to what you're talking about from my perspective. Keep things concise

half pulsar
peak lark
#

vector scope, basically looking down the barrel of an oscilloscope

#

its SUPPOSED to be used for doing stereoscopic phase relations/cancellation management,

peak lark
half pulsar
#

What's the relation here? I don't know what you're talking about

peak lark
#

380k years post bang to be specific, redshifted, shortened to 100/500 seconds

#

personally i found the notion of black hole theory rather edgy/cringe, and was not intending to end up with that as the take, however, run the formula for a bang cycles matter/probability, accounting for inflaton dark energy swap as a sort of threshold about a million times and you get a black hole annulus. then i get the cia stuff, which is literally just 'yeah black hole my dood'

#

they REALLY rely on geometry for everything though, fair i guess but, im more of a frequency guy, though the two are interchangeable

#

two of the files most predominantly conveying this

#

visual wise i mean

serene scaffold
peak lark
#

otherwise its just data related science, physics

#

if you want context for that i gotta dm, pdfs aint allowed

serene scaffold
#

I think all your messages and images are likely to displace messages that are questions about implementing data science and AI in Python, which is what this channel is mostly for, so I'd appreciate if you tone it down.

peak lark
#

that's impressive without any context, but sure

serene scaffold
#

thanks!

peak lark
#

wish i could do that

ocean hinge
#

Hello

Can anyone explain to me what a confidence interval is? Is it that 95% of the time I get the mean right?

waxen kindle
#

It's the bounds within which X% (usually 95%, can be something else) of the experiments' results fall

ocean hinge
waxen kindle
#

It's not necessarily 95% of confidence, you can make intervals with a confidence of 99.5% if you want, or 75% for example

#

Usually we pick 95% but we don't have to

#

It's the interval within which X% (can be 95%, can be something else) of the random experiments end up

fading wigeon
#

For confidence intervals, are you referring to the bars on graphs?

fading wigeon
#

A confidence interval is a range that shows our uncertainty about our estimate of a true population value, since in practice we generally don't have the luxury of getting that value.

Like let's say we sample 1000 python developers and ask their hourly pay and the average is $60/hr and we compute a 95% confidence interval of 50-70. This suggests that the true average pay for python developers is somewhere within that range.

The interval can be wider or narrower depending on sample size and variability within our sample.
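
If it helps, here's a rough sketch of how that kind of interval gets computed. The sample values are hypothetical (not real pay data), and it uses the normal approximation; with a sample this small a t-distribution would be the more careful choice.

```python
import statistics

# Hypothetical hourly rates in $/hr; invented for illustration only.
sample = [52, 61, 58, 70, 64, 55, 67, 59, 62, 48, 73, 60, 57, 66, 63, 54, 69, 61, 58, 65]

mean = statistics.mean(sample)
# Standard error of the mean: sample standard deviation over sqrt(n).
std_err = statistics.stdev(sample) / len(sample) ** 0.5
# Two-sided 95% interval -> 97.5th percentile of the standard normal (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

lower, upper = mean - z * std_err, mean + z * std_err
print(f"mean={mean:.1f}, 95% CI=({lower:.1f}, {upper:.1f})")
```

A bigger sample shrinks `std_err`, which narrows the interval; a noisier sample widens it.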

fading wigeon
#

(There’s more technical nuance to confidence intervals, confidence levels, and p-values, but I think this is a good enough high-level summary.)

ocean hinge
#

Hey @summer plover

I am going through gradient descent, and was wondering, are there any particular cases where we are supposed to use gradient descent and others where we use batch, mini-batch or stochastic gradient descent? Thanks!

cedar tusk
cedar tusk
#

but converges MUCH faster

#

time / compute tradeoff basically

cedar tusk
ocean hinge
cedar tusk
#

there is no difference between batch gd and normal gd, mini batch is used when you cant fit the data to ram

#

both gd methods can be used with batch or mini batch

#

oh wait

#

mini batch is normal gd, but with smaller batches the size of X doesn't get too big, so it's fast to compute the inverse solution

#

it still requires the data to load to ram
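
A small sketch of how the three variants differ, assuming plain gradient descent on a toy linear regression (the data, learning rate, and epoch count are all made up). The only thing that changes between them is how many samples feed each update:

```python
import random

# Synthetic, noise-free data for y = 2x + 1; values are illustrative.
xs = [i / 10 for i in range(20)]
data = [(x, 2.0 * x + 1.0) for x in xs]

def gradient(w, b, batch):
    """Gradient of mean squared error over the given batch."""
    n = len(batch)
    dw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    db = sum(2 * (w * x + b - y) for x, y in batch) / n
    return dw, db

def train(batch_size, epochs=1000, lr=0.05, seed=0):
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        # batch_size == len(data): batch GD; 1: stochastic GD; in between: mini-batch.
        for start in range(0, len(shuffled), batch_size):
            dw, db = gradient(w, b, shuffled[start:start + batch_size])
            w, b = w - lr * dw, b - lr * db
    return w, b

for name, size in [("batch", len(data)), ("mini-batch", 5), ("stochastic", 1)]:
    w, b = train(size)
    print(f"{name:>10}: w={w:.3f}, b={b:.3f}")
```

Smaller batches mean more (cheaper, noisier) updates per pass over the data; full batch means one exact update per pass.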

cedar tusk
#

for example this is the lbfgs method for linear regression

shut vapor
gilded depot
cedar tusk
#

but apparently it just uses less memory FOR the compute and not the data that compute uses

ocean hinge
cedar tusk
raw hare
#

Hi, I am a hs student who is going to uni next year for cs. I want to know what I should prepare for if I want to do ai. I am currently experimenting with decision trees

serene scaffold
raw hare
#

yes but I feel like it could be too hard

raw hare
serene scaffold
warped pike
#

could it be theoretically possible to compress some random string of digits by repeatedly 1) doing some reversible transform (and marking it down so we can reverse it on the decompression) and 2) applying some sort of naive (or not) compression on the resulting string? (obviously we want the step 1 to help step 2 as much as possible) im asking this because i need to compress some random string of a couple mb worth of integers into a couple of kb

fading wigeon
#

The problem is how/where you store the compression algorithms for reversal. Like if your goal is to shrink a string and rebuild it, but you need to pass the blueprints with the shrunken string and it takes up more space than what it reduced then it's unhelpful

#

You can try public compression algorithms, there are more complex compression algorithms, but the naive approaches are collapsing repeats (if your data has a lot of repeats) and/or looking for common substrings and attaching a translation dictionary to the file to decode it
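
A quick way to see why truly random data is the hard case: a general-purpose compressor has nothing to exploit. This sketch uses Python's `zlib` on two made-up payloads of the same length, one pseudo-random and one highly repetitive:

```python
import random
import zlib

rng = random.Random(0)
# 100 kB of pseudo-random bytes vs. 100 kB of a repeating pattern.
random_bytes = bytes(rng.getrandbits(8) for _ in range(100_000))
repetitive_bytes = b"0123456789" * 10_000

for name, payload in [("random", random_bytes), ("repetitive", repetitive_bytes)]:
    compressed = zlib.compress(payload, 9)  # 9 = maximum compression level
    print(f"{name:>10}: {len(payload)} -> {len(compressed)} bytes")
```

The repetitive payload shrinks dramatically while the random one does not shrink at all, so going from a couple of MB to a couple of KB only works if the data has structure to begin with.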

random jay
#

i'm making an open source llm inference product, if anybody would be open to me asking them questions to flesh it out/validate it please dm me. 🙏

serene scaffold
lime grove
gilded depot
#

but the only reason indexing into ascii characters would reduce size of your string is because string was not an efficient way to store digits in the first place
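
To put numbers on that: a decimal digit stored as ASCII takes 8 bits but only carries about 3.32 bits of information. A small sketch with an arbitrary digit string, packing it as one big integer:

```python
import math

digits = "31415926535897932384626433832795"  # arbitrary illustrative digit string

as_text = digits.encode("ascii")  # 1 byte (8 bits) per digit
as_int = int(digits)
as_binary = as_int.to_bytes((as_int.bit_length() + 7) // 8, "big")

print(len(as_text), "bytes as ASCII text")
print(len(as_binary), "bytes as a packed integer")
print(f"theoretical floor: {len(digits) * math.log2(10) / 8:.1f} bytes")

# Round trip works here; leading zeros would need the original length stored separately.
assert str(int.from_bytes(as_binary, "big")) == digits
```

That's a fixed gain of a bit under 60% from a better encoding, not real compression of the underlying randomness.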

atomic glade
#

Any recommendations to where I can learn how to create AI?

grand minnow
jagged stratus
serene scaffold
#

we have one person saying "making an AI is super difficult" and another saying "it's not that difficult" because they're using two definitions of "to make AI".

#

@atomic glade the word "AI" is way overloaded at this point. Describe as specifically as possible what you want to create, if you had to describe it to someone who had heard of computers but not AI.

jagged stratus
#

What do you guys know about Net2Net type of Ais?

serene scaffold
# jagged stratus What do you guys know about Net2Net type of Ais?

to avoid any confusion, we should call those neural networks instead of AIs.
the idea is that you use the output of one model, the "teacher", to be the target of another model, the "student". the teacher model is usually larger and more complex than the student.

#

this is useful if the training data for the teacher is no longer available, or if you want to "compress" the teacher.

jagged stratus
# serene scaffold to avoid any confusion, we should call those neural networks instead of AIs. the...

let me give you context. : I'm setting up my stock AI to learn continuously from the live market, but I have to expand its neural network first so it has room for the new data. I'm using a technique called Net2Net, which literally lets me add an extra layer to the neural network without wiping its memory. Instead of starting over, Net2Net mathematically copies the AI's existing knowledge into the new layers, giving it a bigger brain instantly. This lets the AI adapt to current market trends in real-time with its new capacity, while I still feed it historical data so it retains its memory of massive market crashes
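
For anyone following along, the "add a layer without wiping its memory" part can be sketched like this. It's a toy under stated assumptions: tiny made-up weights, no biases, ReLU hidden layers, and a new layer initialised to the identity so the network's output doesn't change at the moment of insertion.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def forward(layers, x):
    """Hidden layers use ReLU; the last layer is linear."""
    for W in layers[:-1]:
        x = relu(matvec(W, x))
    return matvec(layers[-1], x)

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Made-up "trained" weights: 2 inputs -> 3 hidden units -> 1 output.
W1 = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.9]]
W2 = [[1.0, -2.0, 0.5]]
original = [W1, W2]

# Deepen: insert an identity-initialised hidden layer after the first one.
# Since relu(I @ relu(h)) == relu(h), the output is unchanged until training resumes.
deeper = [W1, identity(3), W2]

x = [0.8, -0.4]
print(forward(original, x), forward(deeper, x))
```

The extra capacity only matters once the new layer is trained further; at insertion time the two networks compute the same function.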

serene scaffold
jagged stratus
serene scaffold
serene scaffold
jagged stratus
serene scaffold
jagged stratus
half pulsar
#

AI at this point is an abstract concept 🤣

#

Such a small word for a broad field

serene scaffold
ocean hinge
#

Hello

Can someone please explain backpropagation in a neural network? It is described as we are going back with the new weights. But going back where? We are in the same layer. it doesnt make any sense.

atomic glade
#

@jagged stratus

serene scaffold
#

Oh sorry, I see.

#

Thought you and courageandfire were the same person for a moment.

atomic glade
#

Oh its fine lol

serene scaffold
serene scaffold
atomic glade
serene scaffold
#

Also don't think about "being comfortable with sklearn". It's a general purpose toolbox. Focus on ML concepts.

ocean hinge
atomic glade
serene scaffold
#

Something about you twos' PFPs activates the same neurons in my brain

#

Maybe I only have one neuron for that.

ocean hinge
serene scaffold
#

They're both guys with similar haircuts looking in roughly the same direction

half pulsar
serene scaffold
# ocean hinge Hello Can someone please explain backpropagation in a neural network? It is des...

something that really helped me understand neural networks is understanding how they're actually really big composite functions. if you expanded out all the multiplications, summations, and activations, you'd have one really big function with lots of deeply nested parentheticals. Those parentheticals will repeat many times throughout the function, because of how interconnected the layers are.

and each time you want to adjust the weights, you have to compute the gradient for that whole function. which is a calculus derivative thing.

backpropagation is just a math trick to make it easier to compute the gradient at each layer, because the gradient at a given layer depends on the next layer. so you start with the last layer and work your way back.

#

and by "you", I don't really mean you. the whole point of libraries like pytorch or JAX is that they do that work automatically.
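
here's a tiny hand-rolled sketch of that "start at the last layer and work back" idea, on a made-up two-layer network with one weight per layer. the finite-difference check at the end is only there to confirm the chain rule was applied correctly.

```python
import math

def forward(w1, w2, x):
    h = math.tanh(w1 * x)  # layer 1
    y = w2 * h             # layer 2
    return h, y

def loss(w1, w2, x, target):
    _, y = forward(w1, w2, x)
    return (y - target) ** 2

def backprop(w1, w2, x, target):
    h, y = forward(w1, w2, x)
    dL_dy = 2 * (y - target)           # start at the output
    dL_dw2 = dL_dy * h                 # gradient for the last layer's weight
    dL_dh = dL_dy * w2                 # pass the gradient back through layer 2
    dL_dw1 = dL_dh * (1 - h ** 2) * x  # tanh'(z) = 1 - tanh(z)^2
    return dL_dw1, dL_dw2

w1, w2, x, target = 0.6, -1.2, 0.5, 0.3  # arbitrary example values
g1, g2 = backprop(w1, w2, x, target)

# Sanity check against central finite differences.
eps = 1e-6
n1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
n2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)
print(g1, n1)
print(g2, n2)
```

notice that `dL_dw1` reuses `dL_dh`, which was computed from the later layer. that reuse is the whole trick.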

peak lark
#

the shape never changes but the color for affinity rep does, probably would even support gradients

fading wigeon
#

I mean you're sort of describing an encoder. Transforming the input into a contextual representation of the input.

#

But the model you use is going to be highly dependent on what type of problem you're trying to solve. We've seen that decoder only models are sufficient for coming off convincingly as human created text through LLMs

#

If you're trying to explore existing concepts/data and group them together and use distance measurements, that are cleaner and more effective models

#

there are*

#

And there's no reason someone couldn't create a more contextualized distance measurement. We have Gower's distance for being able to work with a mix of categorical and numerical data, for instance

lime grove
#

p.s. the thing did, in fact, exit the ollama shell

fading wigeon
#

ridiculous, lol

devout talon
peak lark
#

enneagrams are already inherently encoding, the more you can consolidate the less itd take processing load wise

hasty lynx
#

Hello Im an AI researcher and I currently need a team, if you're interested text me please, I'm currently working on an algorithm that can significantly lower both the energy consumption and the compute cost of ai training

hasty lynx
#

not yet, that's what im working on

serene scaffold
#

@hasty lynx what would you tell them if they DMed you, so that they can decide if they're able to participate?

half pulsar
hasty lynx
#

im working on the demo, i need other's opinion for that to work

half pulsar
hasty lynx
#

im working on the core logic of this algorithm, currently im searching for similar reverse engineering algorithms, because my idea is that backpropagation is one of the ways to solve the complex problem of ai training, if we find a way to reverse engineer a model's weights, past the relu activation, we can give human qas to the algorithm and get the human's weights

#

this is really simply talking

half pulsar
hasty lynx
fading wigeon
#

I'm kind of confused. Why not just use a linear activation? (If all activations are linear, the network collapses to a single linear transform, so I'm unsure if you're saying your goal is to try to accomplish this)

#

ReLU zeroes out negative activations by design

#

and it became popular because it helps mitigate vanishing gradients compared to other activations
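
A quick numeric illustration of the collapse point, using small made-up weight matrices and no biases: two stacked linear layers give the same output as one merged matrix, and putting ReLU between them breaks that equivalence.

```python
def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def relu(v):
    return [max(0.0, x) for x in v]

W1 = [[1.0, -2.0], [0.5, 1.0], [-1.0, 3.0]]  # 2 -> 3
W2 = [[2.0, -1.0, 0.5]]                      # 3 -> 1
x = [1.0, 2.0]

two_linear_layers = matvec(W2, matvec(W1, x))
one_merged_layer = matvec(matmul(W2, W1), x)
with_relu = matvec(W2, relu(matvec(W1, x)))

print(two_linear_layers)  # [-6.0]
print(one_merged_layer)   # [-6.0]
print(with_relu)          # [0.0]
```

So the nonlinearity is what stops depth from being redundant, which is also what makes the weights hard to recover in closed form.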

#

Also, by AI training what are you referring to? Machine learning? Deep learning? LLM/transformers?

#

It seems like you're referring to LLMs

#

But backprop is not so simple in practice for transformers. Transformers go all in on the attention mechanism, and in the case of LLM use, allowing them to weight previous tokens in a sequence at different amounts in order to predict the next one.

That said, backprop is still the training mechanism. I'm not sure what alternative you're proposing.

#

I'm a little confused on your statement on using backprop to solve ai training. Or what human qas means in this context. Human involvement is often used at the fine tuning step already.

#

Can you define what you're doing mathematically, perhaps?

#

(Edits for clarity)

#

If it would help to explain it more clearly, feel free to go into more technical detail.

fading wigeon
#

I'm also a bit unclear on how any of this reduces either energy consumption or compute costs

robust echo
#

isn't the whole point of backprop that the cost/loss of the activation for a given set of inputs can be expressed as a function of the trainable params?

fading wigeon
#

I try to give the benefit of the doubt. I am a little confused on if the goal is to improve the model's "understanding" or if we're trying to reduce compute costs during training.

#

Like maybe avoiding full backprop in some layers or fewer parameters or maybe are trying to optimize the attention mechanism in some way since it is expensive, I believe it's O(n^2)
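
To show where the O(n^2) comes from, here's a bare-bones sketch of scaled dot-product attention weights in plain Python. The embeddings are dummy values, there's no masking or learned projection, and it only counts how many scores get computed as the sequence grows:

```python
import math

def attention_weights(queries, keys):
    """Scaled dot-product attention weights: one row of len(keys) scores per query."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

for n in (4, 8, 16):
    tokens = [[math.sin(i + j) for j in range(3)] for i in range(n)]  # dummy embeddings
    w = attention_weights(tokens, tokens)
    print(n, "tokens ->", sum(len(row) for row in w), "attention scores")
```

Doubling the sequence length quadruples the number of scores, which is why context length is such an expensive knob.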

#

Or better hardware for training, some way to reduce training steps, etc

#

It would help with clarity if we could isolate which computationally expensive step we're targeting and what the improvement would look like.

#

But also, there are tradeoffs to not doing the full process. And the only things that immediately come to mind for LLMs specifically are...... limiting the context window (truncating the amount of data that it's considering during token prediction) or freezing most of the weights and just doing like final layer tuning.

#

(And the tradeoffs are that the LLM will not be able to refer to earlier parts of the conversations and final layer retuning will not change its internal representations.)

#

But I'll stop pontificating until we hear more 🙂

fading wigeon
#

Especially if they manage to do so in a way that the frontier AI companies have not implemented

twilit prism
# fading wigeon Oh, sorry, thought I responded to this but I didn't. Essentially yes. And if s...

There is some sort of R&D secrecy act in the US that triggers on the condition of exceeding 20% performance range of existing tech beyond what the meta is. (Its pretty fringe, ik)

That being said...
Use nomials, and prime numbers to generate polynomials, leverage causal masking on the symbolically purposed nomial participation, and bam, you can just use that instead of BPE character/word token embeddings.

Something something... makes it turn into AGI.

Then you can encode skip grams with nomials purposed for skip gram index identifiers, role, anything really. You can also have nomial nucleation where the heads have training stages just permuting which poly/nomial theyre targeting and if one sticks, keep it, as nucleation. i hypothesize thats way better than using strictly just skipgram multihead attention.

fading wigeon
#

If there was some sort of R&D secrecy act with that sort of metric we wouldn't be discussing LLMs at all. Are you referring to the Invention Secrecy Act of 1951? That only covers inventions deemed to be "detrimental to national security"

twilit prism
#

But I wouldnt jump into fringe topics further than that on these types of discords for posterity. You know, to preserve my credibility and all XD

fading wigeon
#

It is important to maintain credibility, absolutely.

twilit prism
#

Im trying to figure out though, using nomials, whether to stick with embeds and weights and softmax, or use a twist on vectorization and nomials, and use that as a poor man's LLM

twilit prism
rich moth
#

What if we've been thinking about AGI all wrong? Artificial implies we're manufacturing something that doesn't exist in nature, but intelligence is natural. Pattern completion, feedback loops, reinforcement, decay. They are all natural things we all experience every day. I'm calling it "Augmented General Intelligence", instead. Real continual learning can only come from the topology, not the weights, but it needs external correction pressure. It needs to work with use, not gradient descent training each time, which is inefficient. You can't teach a neuron to be a different kind of neuron, but you can change which neurons connect to which, how strongly, and what gets reinforced vs what decays.

fading wigeon
#

Are you proposing automated human feedback retraining loops? Human feedback retraining already happens, but only comes into new model versions and I think that’s a sane practice to prevent model drift for existing implementations. (If my program is behaving less accurately today than yesterday, I don’t want changes in the model to be something I need to consider alongside other possible sources of drift)

What you’re describing with which neurons connect to which neurons is represented by the attention mechanism, and don’t quote me on this, but I believe it is the most computationally expensive part to train.

We also have final layer retraining, but this just affects the final output mapping, it does not fundamentally change the contextual representation of sentence structure and language that is part of the model's contextualized “understanding” of the data

rich moth
#

Hebbian learning
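
Roughly what I mean, as a toy sketch and not my actual implementation: connections between units that are active together get reinforced, and everything decays a little each step. The learning rate, decay, and activity pattern here are arbitrary.

```python
def hebbian_step(weights, activity, lr=0.1, decay=0.01):
    """One Hebbian update: strengthen links between co-active units, decay the rest."""
    n = len(activity)
    return [
        [
            0.0 if i == j else (1 - decay) * weights[i][j] + lr * activity[i] * activity[j]
            for j in range(n)
        ]
        for i in range(n)
    ]

weights = [[0.0] * 3 for _ in range(3)]
# Units 0 and 1 fire together repeatedly; unit 2 stays silent.
for _ in range(50):
    weights = hebbian_step(weights, [1.0, 1.0, 0.0])

print(weights[0][1], weights[0][2])
```

The 0-1 link grows with use while the links to the silent unit stay at zero, and no gradient is computed anywhere.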

fading wigeon
#

Sounds like it will increase inference time. What’s the payoff and what are you trying to replace/improve? Are you wanting an instance if the model to do this or should this affect the global model?

#

Of the

rich moth
fading wigeon
#

So you’re proposing skip connections for… which use case? To help an instance better adapt to a single user/chat/program?

rich moth
#

But the payoff is huge, continuity without retraining. Thats just a tip of the iceberg

fading wigeon
#

Can you define continuity in this context?

rich moth
# fading wigeon Can you define continuity in this context?

Sure, so the model doesn't remember you between sessions and every conversation starts cold. Continuity here means the memory graph persists, so the next session it already knows your projects, terminology, concepts. Even when other agents in the shared graph use tools, it's all saved to the graph.

fading wigeon
#

Okay so you want it consistent across a single user.

OpenAI has a few features that support this. They would be your competition.

They allow for certain “memories” to be stored to a long term bank (that humans can edit/remove) and also uses vectorized summaries of prior chats. The benefit of this approach is continuity in a way that is interpretable and modifiable.

Do you believe your approach has benefits to this approach?

rich moth
#

Ya I already made it.

#

Thats my local qwen 3.5 model, but I also have gemini, codex and claude connected to the same shared graph using fast MCP server

#

Still tinkering though, ESN Reservoir needs tweaking.

fading wigeon
#

So the idea is that you are storing contextualized memory in a shared graph that you are connecting multiple instances to? What is being stored in the nodes?

hasty lynx
#

ok sorry if I wasn't really clear, I don't talk English that much.
My goal currently is researching a way to skip backpropagation completely, speeding up the model's training exponentially.
Backpropagation is one of the methods to achieve the AI weights, but its really slow, think about it every step we move a bit closer to the final weights of the model after training.
I was reading a paper on reverse engineering a model's weights, so like you give an LLM a set of Questions and Answers and with an algorithm you get a map of the discovered weights. This is still limited by the ReLU activations, but I think I found a way to go past this: normally you would give a set of qas (questions and answers) and get the weights; the change I made is that we send those qas in batches of 8, 16, 32 so we can correlate the activations of the ReLU in groups of 16. My example of the AI training referred to this: when we train a model, every step we give it 16, 32 or 64 groups of data so that the training algorithm can find the most efficient way to lower the loss. Same here, we have more than one qa set per step of the algorithm.
Also I'm referring to this as an algorithm, but I'm currently considering the idea of switching to ML, it's easier to develop in my free time.

#

This isn't about making a step of the training faster, this would impact the whole training system

hasty lynx
arctic wedgeBOT
half pulsar
hasty lynx
#

alright but i was clear, even if gemini isnt right, you can prove im wrong

twilit prism
twilit prism
warm dune
#

guys, in machine learning are there specializations in problems or is it more general?

I was seeing some job openings and one of them said "+2 years of experience with Customer Churn"

are there such things, or was it a specific case?

fading wigeon
#

There are specializations in problems (like time series analysis, classification, regression) but this is also a common regression/time series problem that many businesses want to tackle

warm dune
fading wigeon
#

(minor disclaimer that I'm in medtech so while I'm aware of other roles in fields the examples will bend in that direction)

Computer vision is still pretty important. Medical imaging companies, retail theft companies, a lot of research labs all rely on computer vision.

NLP is still important for some contexts. Medical billing, anything where you need language processing against a ground truth so scanning medical insurance policy to try to find the relevant section of info vs what you're doing.

Time series processing/forecasting is also pretty big. Fin tech, business intelligence, even patient monitoring devices will rely on this.

Edge deployment is another thing that can come up that is important. Knowing how to deploy an optimized inference model with time or memory constraints.

Those are the ones that immediately come to mind.

#

Oh, consumer/SaaS companies will often use unsupervised learning to better understand their customer base to determine how best to market to them

#

And ofc classical/simple regression/classification problems are still useful in a variety of contexts

#

Do you know what sorts of companies/roles you're targeting? Or do you want a breadth of expertise that can flexibly be applied/target multiple companies?

warm dune
fading wigeon
#

Yep!

hasty lynx
warm dune
warm dune
fading wigeon
#

Gotcha. LSTMs are very useful for time series forecasting. Stock prices, weather data, medical data etc can use them.

The specific model that is best for any given problem will depend on problem constraints such as what type of data we're working with, how large the data set is, what sort of long-range dependencies we're looking at, etc.

#

And understanding RNNs and LSTMs helps you understand Transformers

#

Since the reason each of those models became popular is in part because they addressed the weakness in the prior model line

warm dune
fading wigeon
#

Definitely. And knowing a wide range of models also makes you flexible: you don't have to only apply to time series jobs if you know more than the temporal sequence models, you could apply to other roles as well.

warm dune
#

like, has to be graduated, or can be studying yet?

fading wigeon
#

Tech is in a rough place right now. It's hard to predict, but all I can say is that as of this moment specifically a degree is essentially a requirement for a job in tech and ML/AI specifically tends to be more degree gated than most. Enough so that I've gone back for my masters, but that might just be because medtech/medical research expects it. I am not 100% sure if this is all ML/AI domains or just medtech, though.

#

But I've heard enough chatter from others that it's pretty much all ML/AI domains, just haven't confirmed it myself directly

fading wigeon
# warm dune how old are u?

Old, like fossil old, archaeologists study me to understand how a decrepit skeleton is able to move and walk around

#

(Late 30s)

rich moth
sterile shuttle
#

is there here any faang employee

silk acorn
#

!warn 1417961964319015074 don't spam your program across channels, especially since it's not a python program

arctic wedgeBOT
#

:incoming_envelope: :ok_hand: applied warning to @chilly junco.

subtle lotus
#

Hi people

serene scaffold
rich moth
#

Im curious how the amount of neurons in the ESN should scale with the graph.

#

Anyone have any insights?

serene scaffold
#

@spark kernel your message was removed for asking for a job, which is not allowed

frank swallow
#

stop that

warm dune
#

guys, I was reviewing classic ML models, and when I came across Naive Bayes, a question arose.

I understand the model behind it and its advantages and disadvantages.

I'd like to better understand the process of transforming words into numbers. I've heard of bag of words and TF something; can someone explain it to me?

serene scaffold
warm dune
serene scaffold
#

This is a very basic way of doing it, since the vector is kind of meaningless, but it at least gives you a unique representation for each word.

#

And if you represented each word as just a number, it would seem like word #43 is "more" than word #25

warm dune
serene scaffold
#

Yes

warm dune
#

so we created a matrix (samples, n of words in 0 or 1)

serene scaffold
#

Then if you wanted to represent a whole sentence, you can just have a vector where there's a 1 in the index for each word that appears at least once.

#

Can you think of a reason for why that is helpful, but still not amazing?

warm dune
#

i just don't get one thing: when we create the features, is the number for each sample 0 or 1 (whether the word is present), or is it a count of how many times the word appears in the sample?

serene scaffold
#

It's usually 1 if the word is present at all, else 0, no matter how many times it appears

warm dune
serene scaffold
#

No, it's usually just 1 or 0.

warm dune
#

like for using multinomial naive bayes, i saw a video that used how many times each word appears

serene scaffold
#

I mean that's fine if they have a reason for doing it that way
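
Here's a rough sketch of both variants side by side, on a two-sentence toy corpus with naive whitespace tokenization (real tokenizers are fussier):

```python
from collections import Counter

docs = ["free money free prizes", "meeting about money tomorrow"]  # toy corpus

# Vocabulary: every distinct word, in a fixed (sorted) order so indices are stable.
vocab = sorted({word for doc in docs for word in doc.split()})

def binary_vector(doc):
    """1 if the word appears at all, else 0."""
    present = set(doc.split())
    return [1 if word in present else 0 for word in vocab]

def count_vector(doc):
    """Number of times each vocabulary word appears."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocab]

print(vocab)
for doc in docs:
    print(binary_vector(doc), count_vector(doc))
```

The only place the two differ here is "free" in the first sentence: 1 in the binary vector, 2 in the count vector. Stacking one row per document gives the (samples, vocabulary size) matrix.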

fading wigeon
#

What do you guys think of NLP? (Gen AI is a separate specialization, for clarity)

I'm looking over the elective courses for my masters and trying to determine which specializations I want to pick up.

#

I suppose I should try to select the most critical specializations first and then worry about extras afterwards, I may not have any spare. Still curious to hear your thoughts on classical NLP and how it pertains to stuff done today. (I know it's useful in RAG-type architecture when looking for similarity in vectorized large document DBs)

serene scaffold
fading wigeon
#

Very fair. You were one of the people I was hoping would answer. I appreciate the candor.

serene scaffold
#

a few months ago, a coworker gave a presentation to the department about improving generic model performance for language with non-concatenative morphologies such as arabic, and our department head was like "wow, we actually talked about linguistics"

fading wigeon
#

Hahaha. Out of curiosity, why do you think classical NLP isn't in demand anymore? Is it comparative familiarity with LLMs vs NLP? Is it that there aren't compute cost/inference time bottlenecks? I admit that I haven't deployed anything LLM related professionally so am not sure about how the token costs scale to compute costs of hosting a simpler NLP model. (Apologies if terminology is incorrect with NLP, I'm fairly new to it)

serene scaffold
#

Because many many fewer people need to truly understand NLP to produce LLMs, relative to the number of people who can effectively utilize them without knowing anything about NLP.

fading wigeon
#

Fair

#

So the bottleneck is amount of expertise in the person you're hiring 💀

hasty lynx
#

I found a way to completely skip backpropagation, i tried on small and medium transformer models, my generator model performs 99.8% with 8 layers only, it can generate weights of models now purely with sets of questions and answers
I need bigger testers to find out if this is truly bulletproof, I've tried models with 90 million parameters max for now.