#data-science-and-ml

gritty vessel
lofty thorn
#

can anyone tell me how much time it will take if i have to complete this project?

dusty forge
#

I might have missed it along the way, but the 'error', the distance between X(ᶦ) and the regression, does it have its own letter?

dim crane
#

In scikit-learn, linear regression models are typically trained using the least squares method by default. When you train a LinearRegression model in scikit-learn, it automatically fits the model using the least squares method to minimize the residual sum of squares (RSS). Afterwards, you can calculate and check this error metric: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_absolute_error.html
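
A minimal sketch of that workflow, assuming synthetic data (the slope, intercept and noise level below are made up for illustration). It fits a `LinearRegression`, computes the per-sample residuals, and then derives both the RSS and the MAE from them:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Synthetic data (an assumption for illustration): y = 3x + 2 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=50)

model = LinearRegression().fit(X, y)   # ordinary least squares fit
y_pred = model.predict(X)

residuals = y - y_pred                 # per-sample error, one value per X(i)
rss = float(np.sum(residuals ** 2))    # the quantity least squares minimizes
mae = mean_absolute_error(y, y_pred)   # mean of |residuals|

print(model.coef_[0], model.intercept_, rss, mae)
```

The residuals are the "errors" being asked about: one number per training point, measured vertically between the point and the fitted line.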

dim crane
gritty vessel
#

wow dude

dusty forge
dim crane
#

The letter for the error is epsilon

#

So if the coefficient has letter A and the intercept has letter B

#

They would have epsilon(A) or epsilon(B)

#

And as for MAE, it would be epsilon(Y)

dusty forge
dim crane
#

Np

versed pilot
#

The NIST handbook is a great resource

gritty vessel
mint palm
#

In my job,
I have worked on a data asset management system with Java, MongoDB and the open source framework Nuxeo.
I have also worked on an event streaming platform using Kafka, MongoDB and Java.

How can I write this in a resume so that it looks better? Preferably in one line.

stable grove
#

I want to make a database for training an AI model to predict the value of a house

I'm thinking of doing the following

Year the house was built ---date
Age of the house --integer
safety scores -- double
Total square footage of the house --Double
Number of bedrooms --integer
Number of bathrooms --integer
Lot size --double
Number of floors --integer
Presence of a garage (yes/no) and its size(double) ---unsure
Presence of a basement or attic(yes/no)
Presence of amenities (Text), as there could be multiple amenities like a fireplace, swimming pool

dusty forge
# stable grove I want to make a database for training an AI model to predict the value of a hou...

Based on the tutorials I've done so far, it seems that features (columns) with a format other than numeric should be avoided. My knowledge is limited so this might not be true at all. But in this case you could replace the garage with 0 for no garage, 1 for small, 2 for large (this way you combine the boolean with the double), the attic with 1 and 0, and the amenities with numbers. The amenities might cause issues because the model could interpret the number as a weight, for example that fireplace (1) is more valuable than swimming pool (5) because 1 comes earlier than 5, or that 5 is more valuable because it's a larger number. I had a similar situation in which I wanted to use simple numbers as my unique identifier, and someone explained that I need to convert them into vectors so the model doesn't get mixed up.

#

Actually for the garage it can be either a 0 or anything larger than 0

agile cobalt
# dusty forge Based on the tutorials I've done so far, it seems that features (columns) with f...

in general models literally only work with numbers, so yes you have to encode text and potentially other categorical features

it can either be Ordinal (one column with numerical values for sequential categories like no = 0, small = 1, large = 2 for the garage) or One-Hot (one column for each category, when the categories have no natural order, e.g. a boolean column for has fireplace True/False, a boolean column for has swimming pool True/False)
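
A small sketch of both encodings with pandas, assuming made-up rows and column names (`garage`, `amenities`) and a comma as the separator between amenities:

```python
import pandas as pd

# Hypothetical rows; column names and values are made up for illustration.
houses = pd.DataFrame({
    "garage": ["none", "small", "large", "small"],
    "amenities": ["fireplace", "fireplace,swimming pool", "", "swimming pool"],
})

# Ordinal: one column, because the categories have a natural order.
garage_order = {"none": 0, "small": 1, "large": 2}
houses["garage_ordinal"] = houses["garage"].map(garage_order)

# One-hot: one 0/1 column per amenity; a house can have several at once.
amenity_flags = houses["amenities"].str.get_dummies(sep=",").add_prefix("has_")

encoded = pd.concat([houses[["garage_ordinal"]], amenity_flags], axis=1)
print(encoded)
```

With this layout the model never sees "fireplace = 1, swimming pool = 5", so it can't read an order into the amenity numbers.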

stable grove
#

thank you

agile cobalt
#

garage size could be just one double though, with no feature for "has garage", no = 0, yes = >0

#

also we call them "floats" in python rather than doubles

dusty forge
agile cobalt
dusty forge
agile cobalt
#

the models literally can only operate on numbers, if you had some very weird setup it might be possible for something to automatically cast them as numbers by using something like ord(), but that is not the sort of thing that just happens automatically

dusty forge
#

@stable grove are you by any chance doing the Coursera course? Because Andrew indeed introduced us to the housing example 😉

stable grove
#

No, its a project idea I came up with.

stable grove
#

with a delimiter?

#

69,100,101,110,118,97,108,101 --Edenvale

agile cobalt
#

I would strongly recommend just using one-hot encoding first to get used to it before trying anything fancier

#

btw, dates also have to be encoded
technically you could use a unix timestamp, or just separate year/month/day into their own numerical columns (or do more operations like iso calendar week, summer/winter/fall/spring etc.)

but for this case, you might as well not include it whatsoever considering it would be 'overlapping' with age
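
A rough sketch of those date encodings using only the standard library. The build date and the reference year are assumptions for illustration:

```python
from datetime import date, datetime, timezone

built = date(1998, 7, 15)  # hypothetical build date

# Option 1: a single number, seconds since 1970-01-01 UTC.
unix_ts = int(datetime(built.year, built.month, built.day, tzinfo=timezone.utc).timestamp())

# Option 2: separate numerical columns.
features = {
    "year": built.year,
    "month": built.month,
    "day": built.day,
    "iso_week": built.isocalendar()[1],
}

# Derived feature: age in years relative to a reference year.
reference_year = 2024  # assumption; use the year the data was collected
features["age"] = reference_year - built.year

print(unix_ts, features)
```

Since `age` is just the reference year minus `year`, keeping both columns gives the model the same information twice, which is the overlap mentioned above.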

stable grove
#

thank you for the help

dim crane
#

Hello, I'm wondering in which cases I need to use the following techniques in my RandomForestClassifier script: MinMaxScaler, CalibratedClassifierCV and GridSearchCV (without slowing down the script's load time too much).

My first guess is that we don't need MinMaxScaler; we just need to run GridSearchCV once for each dataset and use CalibratedClassifierCV to check where the log loss is lowest (uncalibrated, sigmoid or isotonic)

Also, if you have any more suggestions to improve the results of my predict_proba output, I'd be really thankful
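
A sketch of the calibration comparison described above, assuming synthetic data from `make_classification` in place of the real dataset and a small forest to keep it fast. It leaves out MinMaxScaler, since tree splits don't depend on feature scale, and leaves out GridSearchCV, which could be run once offline with the chosen parameters saved:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic data stands in for the real dataset (assumption).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_held, y_train, y_held = train_test_split(X, y, test_size=0.3, random_state=0)

def make_forest():
    # Small forest so the comparison stays quick; sizes are placeholders.
    return RandomForestClassifier(n_estimators=50, random_state=0)

results = {}
plain = make_forest().fit(X_train, y_train)
results["uncalibrated"] = log_loss(y_held, plain.predict_proba(X_held))

for method in ("sigmoid", "isotonic"):
    calibrated = CalibratedClassifierCV(make_forest(), method=method, cv=3)
    calibrated.fit(X_train, y_train)
    results[method] = log_loss(y_held, calibrated.predict_proba(X_held))

best = min(results, key=results.get)  # variant with the lowest held-out log loss
print(results, best)
```

Which variant wins depends on the data, so the comparison needs to be rerun per dataset rather than assumed.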

dusty forge
#

by the way, my math is rusty as hell, super embarrassing, but what does this sign mean? (tried to find it in the Greek alphabet, wasn't there?)

agile cobalt
dusty forge
dusty forge
wooden sail
#

yeah, there it's a partial derivative

dusty forge
#

Ah alright, does this symbol have a name?

wooden sail
#

i'm actually not sure, looks like some sort of calligraphic d

#

in most places you'll find it as \partial

versed pilot
wooden sail
#

.latex $\partial$ $\del$ $\delta$ $\nabla$

tidal bough
#

!charinfo ∂

arctic wedgeBOT
tidal bough
wooden sail
#

i think del is the impostor

#

.latex $del$

strange elbowBOT
wooden sail
#

oof

dusty forge
#

Ok I made a note on that partial derivative/ partial differential, thank you.

agile cobalt
#

I'll go back to another one of Andrew's courses I'm watching now lol
edit; 2.5/10 wouldn't recommend that mini course
ping me if you have a follow up question on what I said earlier

dusty forge
#

I absolutely love his course and the way he teaches, by drawing, which works really well for me. The last time I had math, algebra, calculus was ages ago, always loved it, but didn't keep it up since there wasn't anything that required me to do so. Now starting with ML, I'm slowly getting that feeling back of joy seeing and understanding formulas. I'm sure that at some point I'll want to bang my head into the wall, perhaps when I arrive at matrices. Seems to be a rough part based on some Reddit posts I saw.

versed pilot
dusty forge
final kiln
#

Works extremely well locally

#

But it says it can automate the infra, which would be really really good rn

dusty forge
final kiln
#

I already have that, but coded manually by me, which is cool but also a lot to manage for one person

molten acorn
final kiln
#

Looks nice, I'd try to do some coding in parallel as you learn the math

#

A lot of math can be learned much better with the help of python, where you can just do your own experiments

#

Probability in particular can get super abstract if you go into the details of prob spaces and whatnot, code can keep you grounded, which is important when self studying

molten acorn
#

I'm in an odd place because I have over a decade of programming experience, in particular game dev, so I have exposure to linear algebra, I wonder if there's a topic or project that can fast forward my "onboarding" into data/ai

final kiln
#

Linear Algebra and calculus already put you in a good spot for ML imo, there's tools in stats that are important but if you know the others you can quickly pickup stats as you go

molten acorn
#

Or if there's any value to learning through training LLMs

molten acorn
final kiln
#

Training LLM will get you in direct contact with the thing and it will make you make all sort of mistakes

molten acorn
#

But when it comes to AI, I have a lot of unknown unknowns and don't know how they relate to my current skillset

final kiln
#

Which is good experience, but also probly the hard route

#

If you know math I'd go directly for doing a project that interests you and use it to learn

molten acorn
#

Ok, I view training LLMs as a top-down approach to learning AI, my other gut instinct is that learning neural networks will be a bottom-up approach to learning AI. Do you think this is accurate?

#

Does it make sense what I'm saying?

final kiln
molten acorn
#

I have a book

final kiln
#

Like a lot of it after the theory will be a sort of dark art where you kinda just experiment til you get it right

molten acorn
final kiln
#

And the way you do it, the tools you use, etc, it's all important stuff you only learn by making mistakes

molten acorn
#

Since my previous background was in games and graphics, I'm way more inclined to learn from this book, but only the latter chapters are dedicated to the topic, but maybe it's a good enough intro

#

I also like that it would bridge concepts for me

final kiln
#

I also come from a sort of graphics-y background

molten acorn
#

I have one project in mind, but because I have a lot of unknown unknowns, I have no idea how feasible it is for just me

final kiln
#

I used a lot of computer graphics concepts to do physics simulations

molten acorn
#

Oh dude, you're my new best friend

final kiln
#

Hehe, I just had an interview where the first questions were all computer graphics

molten acorn
#

Nice, which graphic libs are you familiar with?

final kiln
#

For MLE

final kiln
#

Ig I've used SDL once to do a ray tracer engine

molten acorn
#

Nice, SDL is nice to use

#

So here's my first project idea; I like watching basketball, but I don't have time to sit through a full televised game. It would be great if I could get AI to take a game recording and cut out all the boring parts like free throws and players walking the ball up the court

final kiln
#

Like usually the data I got back was not a visual output per se, it was physics data of energy deposition, prob distributions and whatnot

final kiln
#

No wait, you can build a classifier right

#

And apply it to every frame

molten acorn
final kiln
#

Is basketball, or is not basketball
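
Once every frame has a yes/no label, the remaining work is turning those labels into time ranges to keep. A rough sketch of that step, assuming the per-frame labels already exist (the classifier itself is not shown, and `keep_segments` is a hypothetical helper):

```python
from typing import List, Tuple

def keep_segments(is_game: List[bool], fps: float, min_seconds: float = 1.0) -> List[Tuple[float, float]]:
    """Turn per-frame keep/skip labels into (start, end) times in seconds."""
    segments = []
    start = None
    for i, keep in enumerate(list(is_game) + [False]):  # sentinel closes a trailing run
        if keep and start is None:
            start = i                                   # a run of kept frames begins
        elif not keep and start is not None:
            if (i - start) / fps >= min_seconds:        # drop runs that are too short
                segments.append((start / fps, i / fps))
            start = None
    return segments

# Hypothetical classifier output for 10 frames at 2 frames per second.
labels = [True, True, True, False, False, True, False, True, True, True]
print(keep_segments(labels, fps=2.0))  # [(0.0, 1.5), (3.5, 5.0)]
```

The `min_seconds` filter is a design choice to suppress single-frame flickers in the predictions; the threshold would need tuning on real footage.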

final kiln
molten acorn
#

Ok, what do I need? An M3 macbook?

final kiln
#

But I'd recommend getting an Encoder CNN with two output classes

molten acorn
#

Will the new laptop core ultra-9 work?

final kiln
#

You need GPU, and I'd avoid ARM

molten acorn
#

How many gigs of VRAM?

final kiln
#

Idk how the ARM support is rn, but last time it wasn't very good, might've changed

molten acorn
#

I have a 3070 that's not doing anything

#

But realistically, how long of a project is this if done on nights and weekends?

final kiln
final kiln
#

And compute power

#

I thought sentiment analysis was gonna be easy, been like a month and ton of infra

molten acorn
#

I'd have to really research into how to go about doing this. Somehow I have to train it to identify what "boring" moments are

#

Using audio and video

final kiln
#

I'd go for image data

#

Like, videos are just images

#

And multimodal models are very likely harder to train

molten acorn
#

Right

final kiln
#

I assume that you want to cut out ads and such

molten acorn
#

Exactly

final kiln
#

Then image should work out fairly well

molten acorn
#

a 2 and a half hour recording should turn into a 40-45 min thing

final kiln
#

Like, if you require temporal understanding, it's harder

#

What I mean is like, if you want to detect the times when someone scores

#

It might be harder than just detecting the ad bits

#

You might want to consider starting with a simpler classification problem and build up

molten acorn
#

Ok, that's a good suggestion

final kiln
#

And also this, it's indispensable

#

First thing you gotta do is find a dataset and evaluate it. The larger the better. You then split it into three:

  • train
  • validate
  • test
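
The three-way split above can be sketched with NumPy like this. The 70/15/15 proportions and the function name are assumptions, not a rule:

```python
import numpy as np

def split_indices(n_samples: int, val_frac: float = 0.15, test_frac: float = 0.15, seed: int = 0):
    """Shuffle indices once and cut them into disjoint train/validate/test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = round(n_samples * test_frac)
    n_val = round(n_samples * val_frac)
    test = order[:n_test]
    val = order[n_test:n_test + n_val]
    train = order[n_test + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting index arrays rather than the data itself makes it easy to check that no sample ends up in two sets.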
#

The train dataset is the one the network learns from: for each batch it performs a forward pass, a backward pass and then a gradient-based optimizer step

#

The test dataset contains data that the network has not seen during training

#

This is crucial because neural networks are functions with high capacity for memorization. So you want your network to perform generalization, or perhaps even better, you can call it compression.

#

To test that your network is not memorizing that dataset you test it against another dataset it had no chance of memorizing from

molten acorn
#

This course is interesting, but I guess I wouldn't necessarily be learning about AI

final kiln
#

If you accidentally use data from test to train, that's called a data leakage

#

The validation dataset is a bit more subtle, but it works like the test dataset: it's data the model never trains on

#

When you select a model, you have all these "hyper parameters", which decide the shape of the model, the shape of the batches, and a bunch other stuff

#

So like, in order to see which hyper parameters work best, you do a sort of search, you train the model on different combinations of the hyper parameters

#

And this is the subtle part

#

If you use your test data set to choose the best hyper parameters

#

That still constitutes a data leakage, because in a way, information is still flowing from the test dataset, but through you

#

So you get this validation dataset to choose your hyper parameters. Then you train the thing for longer periods of time, and then you test it against the test dataset

#

The test dataset is then used to measure how well the model will perform in the real world
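
A toy sketch of that workflow, using closed-form ridge regression in NumPy as a stand-in for a neural network (the data, the split sizes and the candidate `alpha` values are all made up). The only hyperparameter is `alpha`, it is chosen on the validation set, and the test set is touched once at the end:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
true_w = rng.normal(size=20)
y = X @ true_w + rng.normal(scale=2.0, size=300)

# Disjoint splits: 200 train, 50 validation, 50 test.
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:250], y[200:250]
X_test, y_test = X[250:], y[250:]

def fit_ridge(X, y, alpha):
    """Closed-form ridge regression: w = (X^T X + alpha I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Hyperparameter search: only the validation set is used to pick alpha.
candidates = [0.01, 0.1, 1.0, 10.0, 100.0]
val_scores = {alpha: mse(X_val, y_val, fit_ridge(X_train, y_train, alpha)) for alpha in candidates}
best_alpha = min(val_scores, key=val_scores.get)

# The test set is touched exactly once, after the choice is made.
final_w = fit_ridge(X_train, y_train, best_alpha)
test_mse = mse(X_test, y_test, final_w)
print(best_alpha, test_mse)
```

If `best_alpha` had been picked by looking at `test_mse` instead, the reported test number would be optimistically biased, which is the leak through the person doing the tuning.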

molten acorn
#

Interesting

final kiln
#

There's a ton of stuff, but this is the gist of it. The challenge, at least for me, has been the GPU shortage and finding the hyper parameters that don't overfit my model

#

In terms of intuition for why models work at all

#

The reason is that they are universal function approximators. Same way you can build up any function with Taylor series, or any image with Fourier components.

#

Everything in the world can be described with functions

#

And models are like these machines that you can tune to approximate any function. So they can do all these amazing things.
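
To make the series analogy concrete, here is a small sketch of approximating `sin(x)` with more and more Taylor terms. It is only an analogy for "more tunable terms, closer fit", not a demonstration of how neural networks approximate functions, and `taylor_sin` is a hypothetical helper:

```python
import math

def taylor_sin(x: float, n_terms: int) -> float:
    """Approximate sin(x) with the first n_terms of its Taylor series around 0."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1) for k in range(n_terms))

x = 1.0
# Absolute error for 1, 2, 4 and 8 terms; it shrinks as terms are added.
errors = [abs(taylor_sin(x, n) - math.sin(x)) for n in (1, 2, 4, 8)]
print(errors)
```

A network plays a similar game, except the "terms" are learned from data instead of derived from derivatives.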

molten acorn
#

Nice

#

I got my work cut out, I guess I'll start going through that book I showed you tomorrow and see how far that gets me

final kiln
#

Good luck

#

Also do check the pinned messages here

dusty forge
final kiln
# dusty forge can you drop some screenshots when you have it running, curious how clean and cl...

still work in progress, but this is how my code is turning out, I'll run it in a sec

import mlflow
import mlflow.pytorch
from prefect import flow
from torch import nn

# load_model, cleanup_memory, training_loop, perform_evaluation and log_metrics
# are project helpers defined elsewhere.

@flow
def main() -> None:
    model = load_model()
    loss_function = nn.CrossEntropyLoss()
    mlflow.set_tracking_uri("sqlite:///mlruns.db")
    mlflow.set_experiment("asa")

    eval_interval = 1
    save_interval = 1

    for epoch_step in range(100):
        cleanup_memory()
        for training_loop_step, model, loss in training_loop(model, loss_function, accumulation_steps=1):

            # observability into the training loop:

            mlflow.log_metric("training/loss", loss.item(), step=training_loop_step, synchronous=False)

            if training_loop_step % eval_interval == 0:
                metrics = perform_evaluation(model, loss_function, total_count=5)
                log_metrics(metrics, training_loop_step)

            if training_loop_step % save_interval == 0:
                mlflow.pytorch.log_model(model, "models")


if __name__ == "__main__":
    main.serve("train")
#

with the serve it runs a sort of worker that awaits for me to trigger, now I run the UI

#

it appears right here

dusty forge
#

oh this looks really great

#

how much mem does it use?

final kiln
final kiln
dusty forge
#

around 55% max it seems

final kiln
#

let me see if I can catch the process

dusty forge
#

not bad, I've also been looking at KNIME, which is a low-code pipeline platform, also open source and local, but the nodes can be fully SQL and Python

final kiln
#

no like, the max is 15gb

dusty forge
#

@final kiln this is what KNIME looks like

final kiln
#

also looks good, does it deploy to spot ?

final kiln
dusty forge
#

if you want to model the pipeline like that, sure

#

you can also just flunk everything in sequence like a notebook

final kiln
#

my main worry is infra tbh

#

if it manages aws for me im a happy person

dusty forge
# final kiln if it manages aws for me im a happy person
Amazon Web Services

To quickly build intelligent data-driven workflows, organizations need business analysts to work with data scientists and development teams to unlock useful insights from unstructured or semi-structured data. Learn how KNIME’s end-to-end data science product portfolio helps bridge the gap between the ideation and productionalization steps of dat...

final kiln
#
import torch
from prefect import task
from torch import Tensor

@task
def forward_pass(model, loss_function, rating_b, text_bw) -> tuple[Tensor, Tensor]:
    """ Forward pass through the model. Returns (logits, loss). """

    predicted_logits_b5 = model(text_bw.int())
    loss = loss_function(predicted_logits_b5, rating_b)
    return predicted_logits_b5, loss

@task
def backward_pass(loss_train, multiplier: float = 1.) -> None:
    """ Backward pass through the model. """

    # Scaling the loss keeps accumulated gradients comparable across accumulation steps.
    (loss_train * multiplier).backward()

@task(log_prints=False)
def gradient_descent(optimizer: torch.optim.Optimizer) -> None:
    """ Perform a gradient descent step. """

    optimizer.step()
    optimizer.zero_grad()

im a bit wary of having these in their own task

dusty forge
#

seems it can

final kiln
#

but code looks better for some reason

dusty forge
#

I think Prefect is a smaller install? KNIME is almost 1Gb

final kiln
#

oh damn

#

yeah it's a lot smaller

#

but knime is looking good

#

its like labview

dusty forge
#

I think it doesn't get much attention because it markets itself as low-code solution, even though you can put full code into every node

final kiln
#

visual programming languages are still code imo, labview can get pretty complicated, I think visual is better for pipelines

dusty forge
#

yep I agree, it makes learning ML much easier if I can see how it fits into the whole story. I understand that most courses focus on experiments, simply to avoid you getting lost, luckily I work in data so I know how to be lost properly 🤣

vernal quartz
#

hi guys, what is the expectation for an applicant applying for a junior NLP developer role?

final kiln
vernal quartz
#

natural language processing developer

#

for gold text processing and prediction

#

so what is the baseline for a junior data scientist role or deep learning engineer role?
I got a take-home assignment to write code that filters a large body of text talking about gold, and they asked me to calculate probabilities for the incoming text, but i realized i don't know shit about how to code a linear regression, so what is the expectation for a junior role?

final kiln
#
from typing import Callable, Iterator

import torch

# SentimentAnalysisModel, create_optimizer, load_dataset_from_npz, yield_batches,
# calculate_accuracy, forward_pass, backward_pass and gradient_descent are project helpers.

def training_loop(model: SentimentAnalysisModel, loss_function: Callable, accumulation_steps: int = 10):
    optimizer = create_optimizer(model)
    # NOTE: this loads test.npz, the same file evaluation_loop reads; training on it
    # leaks test data, so point this at the training split.
    training_data = load_dataset_from_npz("data/asa/test.npz")
    for step, rating_batch_b, text_batch_bw in yield_batches(training_data[0], training_data[1], 15):
        model.train()
        _, loss = forward_pass(model, loss_function, rating_batch_b, text_batch_bw)
        backward_pass(loss, multiplier=1/accumulation_steps)
        if step % accumulation_steps == 0:
            gradient_descent(optimizer)
            yield step, model, loss

def evaluation_loop(model: SentimentAnalysisModel, loss_function: Callable) -> Iterator[tuple[int, float, float]]:
    model.eval()
    test_data = load_dataset_from_npz("data/asa/test.npz")
    with torch.no_grad():
        for step, rating_batch_b, text_batch_bw in yield_batches(test_data[0], test_data[1], 15):
            predicted_logits_b5, loss = forward_pass(model, loss_function, rating_batch_b, text_batch_bw)
            accuracy = calculate_accuracy(predicted_logits_b5, rating_batch_b)
            yield step, loss, accuracy

I think I got myself a nice little pattern

final kiln
#

is that those roles don't exist, there are no entry level positions for MLE and data scientists

#

Or at least, they are not positions that lend themselves very well to being filled by entry level people

dusty forge
#

@final kiln Prefect is orchestration in general right, not specifically for ML? I'm turning the website inside out and don't see anything about experiment tracking or how the model behaves, etc?

final kiln
vernal quartz
final kiln
#

And my hope is that it helps with the infra part, they mention it a lot in one section

final kiln
#

But the part that I like is the yield statement

#

Which neatly separates my observability from the ML stuff

vernal quartz
#

okay, move away

final kiln
hollow furnace
#

hey guys, anyone know which models from hugging face to pick for object detection on a live camera feed on my mac, and how to set it up locally? im a newbie with setting up models

#

i know of and have downloaded yolo models but don't know how to test them on the mac's front camera

final kiln
vernal quartz
#

then feed the live data to your models

#

for supervised learning: you first prepare a large image data set from your camera, and label the images where the object is in the frame
for unsupervised learning: i have no idea

final kiln
#

I think dropout has done the trick, but I can only be sure after 3 or 4 more hours, this is gonna be one of my last runs before revamping all the infra

#

Batch size/LR combination give a lot of instability but I don't really see a lot of trouble in the eval so I'm just gonna let it be

vernal quartz
#

So what is the baseline expectation for a junior data scientist role, such as for machine learning and deep learning? What kind of project makes you stand out for this position? I know there are no junior roles for it, but there are less experienced data scientists who know the basic stuff

#

what do you need to do and know?

final kiln
#

Like I said, these roles are usually not entry level, I think people usually start with data analyst or junior software

#

But knowing what a regression is oughta be a good start

vernal quartz
#

I know there are no junior roles for it, but there are roles with limited experience requirements

final kiln
#

I've seen internships for ML Research scientist, but they're usually for PhDs and stuff like that

final kiln
#

Just pick something that you are passionate about and make it impressive

final kiln
#

There's also the possibility of training it on next token prediction and use the in between signals to train a classifier.

#

An advantage of that approach is that I can reuse the model for the other tasks I have to do to replicate the MetaFormer study

vernal quartz
final kiln
#

But it very much depends on each person's learning style

#

There's a ton of resources online, I like to recommend Khan academy for building up the math knowledge

#

3blue1brown has a great primer on ML too

vernal quartz
#

just for an understanding, i understand all machine learning concepts, from linear to K-nearest to VSM, to linear algebra to CNN, RNN and TTS model design, as well as spectrogram transforming (audio processing for TTS), is this deep ?

#

what level is this

#

i think the only thing i don't understand is what level is expected of a machine learning and deep learning engineer

#

what is required for the job

final kiln
#

How come you understand linear algebra and CNNs but you don't know how to code a linear regression ?

vernal quartz
#

because i skipped the machine learning project, i learned all in 2 months, ML by nature is worse than deep learning

#

but i understand now that many companies may want an ML project, even if it's worse

final kiln
#

People usually learn regression very early in college, which is why I thought it was weird

#

Ig you can bunch it with ML, but first time most people hear of it it's just an optimization in stats class or wtv

vernal quartz
#

i understand regression just like all other concepts, but i don't understand how to code them

#

....

final kiln
#

"What I can't create I don't understand"

vernal quartz
#

that's true

#

i have hands on experience from one-hot encoding to CNN stacks, very limited, but i don't understand how to code a linear regression

#

LOL

final kiln
#

Idk, you can just apply gradient descent to any linear regression problem
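
A minimal sketch of that, assuming synthetic one-feature data with a made-up slope and intercept, and a learning rate picked by hand. It minimizes the mean squared error with plain gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 4.0 * x - 1.0 + rng.normal(scale=0.1, size=200)  # synthetic: slope 4, intercept -1

w, b = 0.0, 0.0          # parameters to learn
learning_rate = 0.1      # assumption; tune for real data

for _ in range(2000):
    error = (w * x + b) - y               # prediction minus target
    grad_w = 2.0 * np.mean(error * x)     # d(MSE)/dw
    grad_b = 2.0 * np.mean(error)         # d(MSE)/db
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)
```

On a problem this small the result should agree with the closed-form least squares fit (for example `np.polyfit(x, y, 1)`), which makes a handy sanity check.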

#

Tho there's a ton other optimization techniques

#

My fav is the monte Carlo one + local refinement with gradient estimation

#

There's ones with Jacobian and whatnot

#

It's a whole thing

teal lance
vernal quartz
#

we need an understanding of this (we don't)

dry geyser
#

would pydantic questions fit here?

vernal quartz
vernal quartz
teal lance
# teal lance

Well this is my script I’m writing to create a strategy that’s built to trade strict criteria and focused on exiting the first signs of market reversals or the conditions not being true anymore

final kiln
teal lance
final kiln
# vernal quartz what degree?

Ah, that's a good question actually. I don't know. I come from Physics for example. I'd wait for other people to answer that question

#

But I reckon a lot of people do Computer Science, others do an actual ML degree

vernal quartz
final kiln
# vernal quartz that's a good question, how did you get the job and how did you learned?

Uhm, I got a lot of hands on experience during my masters thesis, which ended up being a very successful project that produced research and even got me to do a talk and a couple poster presentations and published abstracts.

I got a job in software fairly easily, it was during the peak hiring frenzy at the end of COVID. Did a bit of everything, frontend, backend, infra, etc etc. My last role in particular, started as a fullstack but quickly grew into an MLE role, I was training models, etc.

#

Right now I'm in between jobs more or less, I'm interviewing for companies in Switzerland, MLE roles only, I got all this experience behind me which makes me a safer hire, ML is very expensive.

vernal quartz
#

bro from full stack to MLE and thinking ML is expensive

final kiln
#

And I'm still learning quite a lot. This last one I thought was gonna be easy. Just simple text classification. But it turns out it ain't easy

vernal quartz
#

great

final kiln
vernal quartz
#

What is your master's degree in? i wanted to do one

final kiln
#

Master is super worth because you get to do a thesis

#

If you choose a hands-on thing, your time is being used very well rite

#

You're both getting XP in software, making all the mistakes you need to make to learn

#

But you're also getting a degree and possibly other cool stuff

hollow furnace
vernal quartz
#

I think you don't understand what you're saying, apparently you went from full stack to MLE with a different degree

final kiln
vernal quartz
#

where did you learn the monte Carlo one + local refinement, in your bachelor's?

final kiln
hollow furnace
vernal quartz
#

Uhm, I got a lot of hands on experience during my masters thesis, which ended up being a very successful project that produced research and even got me to do a talk and a couple poster presentations and published abstracts.

I got a job in software fairly easily, it was during the peak hiring frenzy at the end of COVID. Did a bit of everything, frontend, backend, infra, etc etc. My last role in particular, started as a fullstack but quickly grew into an MLE role, I was training models, etc.

?

final kiln
final kiln
#

Medical Physics, I specialized in monte Carlo simulations of particle transport for predicting energy deposition of radiation treatments

vernal quartz
#

and doing full stack

final kiln
#

What about it ?

vernal quartz
#

then you ask people with a degree

#

i don't much believe you but good luck

final kiln
#

lol

vernal quartz
#

i guess i will be a wonderful ML engineer

final kiln
#

You don't believe which part exactly

vernal quartz
#

full stack

#

=> MLE

final kiln
#

It's a pretty natural path I'd say

vernal quartz
#

and ask people for a degree

final kiln
#

MLE requires you to have understanding of software

vernal quartz
#

and saying MLE is expensive

final kiln
#

I'm confused, please make a point

#

Are you saying ML is not expensive ?

vernal quartz
#

you transfer from full stack to MLE in same company, means ML is not expensive

#

because it's cheap

final kiln
#

What

vernal quartz
#

the cost doesn't matter to them

final kiln
#

Have you ever worked in the industry ?

vernal quartz
#

nope

final kiln
#

Makes sense

#

There's a lot of role fluidity in startups

#

You're expected to do a bit of everything

vernal quartz
#

so startups afford expensive ML

final kiln
#

Well they try

#

Some have funding for it too

#

But less hands on deck usually

vernal quartz
#

your logic didn't make sense: from frontend to MLE as a physics grad
=> ML is expensive
=> do a degree

final kiln
#

Less people means generalists are preferred

#

That's not a logic, that's my employment and educational history

#

Which do not say much about ML being expensive or not

#

What determines the cost of ML is the hardware and the nature of the task

#

Not if a random person has transitioned from full stack to MLE within a startup

#

What kind of argument is that even >.>

vernal quartz
#

while you should study philosophy first, i thought smart people like i am doing physics

final kiln
#

Okay, you are now just saying random stuff, I'm gonna attend to my training loop

vernal quartz
#

i mean, i guess the MLE doesn't need much experience, as i guess from this conversation, so thanks

#

at least i know the water here

final kiln
#

You have no idea how far you can get in life by just striving to be a good person

vernal quartz
#

because i got all the easy interviews from data scientist or NLP or DL, i guess you really don't need experience on this one

#

i mean i will learn the optimizing tho, even for deep learning there aren't many small tweaks, for machine learning i guess there are a little bit?

#

like your fav, "My fav is the monte Carlo one + local refinement with gradient estimation", as the only project you did in your school

hollow furnace
vernal quartz
final kiln
vernal quartz
hollow furnace
hollow furnace
vernal quartz
#

for what purpose? just object detection? what objects

final kiln
#

Just mix both

#

Mediapipe is good, I've used it for a PoC recently

#

It fails here and there, if you want better performance you'll have to train your own CNN

vernal quartz
hollow furnace
final kiln
#

Anyway

vernal quartz
final kiln
#

Please open the link

#

@hollow furnace I just sent you the two links you need, the first one is the package that will do object detection for you, the second is the code that you can use to read from your camera

#

@vernal quartz Please stop being hostile towards me.

vernal quartz
vernal quartz
final kiln
vernal quartz
vernal quartz
# final kiln Sigh

i give you 1 million for using image detection techniques for this kind of CAPTCHA recognition

#

try and do it

#

this is called Lemin CAPTCHA

#

this is my own algorithm for image recognition, it doesn't even use ML or deep learning

final kiln
vernal quartz
#

i guess you can't even see how many anti-recognition layers are on there, this is not your everyday image, and i guess you don't even have the skill for natural world image processing and recognition

#

from skills good for someone without experience, then half sentence drop the skill talking about personality,

#

again, this is a multi-layer anti recognition CAPTCHA

#

this is not even a common real world image

#

i guess someone who transferred from physics to MLE doesn't understand it

final kiln
#

It's time for my coffee I suppose.

#

Yesterday I found out my model was overfitting, which I thought wouldn't be possible on such a large dataset >.>

vernal quartz
#

my hammer is coming

#

billyboby

left tartan
wicked osprey
#

Hiya, fella's. I'm kind of eager to figure out why data science is interesting and what benefits it provides.

#

Does anyone mind sharing their insight?

left tartan
#

And, if someone asks something that’s wrong, either take the time to explain… or just don’t respond. It’s all good.

vernal quartz
#

GOT IT

left tartan
#

It’s like asking ‘why math is interesting’, very hard to answer succinctly

wicked osprey
vernal quartz
# wicked osprey Hiya, fella's. I'm kind of eager to figure out why data science is interesting a...

Data Science is only two parts. The first is the more traditional data-analysis part, which is mostly being replaced by GPT; some great data scientists are doing their job there, i think, mostly 300K+, which is just my future, while the others report basic insights to capitalist companies that don't know how to, or are too lazy to, use GPT. The second part, the only fun part, is deep learning: no guaranteed income, companies don't understand it, but they want it

vernal quartz
wicked osprey
#

So the process of analyzing sets of data and extracting conclusions from them in a nutshell?

left tartan
left tartan
#

This def seems pretty good:

#

Data science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from potentially noisy, structured, or unstructured data.Data science also integrates domain knowledge from the underlying application domain (e.g.,...

wicked osprey
#

Could you summarize practical use-cases that are most common in the industry?

vernal quartz
#

the heavy or easy processing part may be given to a data engineer; data scientists come in two kinds as i mentioned, it really depends on their level: most of them = GPT and the others are great, and this excludes DL engineers

final kiln
left tartan
final kiln
#

I've heard someone say Data Science can be thought of as an evolution of statistics

vernal quartz
#

please document what i said and write it in your paper

#

that will be so true and funny as hell

left tartan
final kiln
vernal quartz
#

and where to learn?

left tartan
left tartan
#

In particular, the subfield of EDA is very useful.

wicked osprey
vernal quartz
final kiln
#

I wonder how close what I do is to what is supposed to be done tho >.>

vernal quartz
#

The only way you can get free labor is from GPT or

#

can i say it?

left tartan
vernal quartz
#

or from when America created

final kiln
wicked osprey
#

What could the results of ML contribute to? Like more efficient ways of doing generic tasks?

final kiln
vernal quartz
left tartan
wicked osprey
#

Hmm, how could it prevent a country from an agricultural disaster?

#

Comparing previous data with existing current data and forecasting some sort of scenario?

final kiln
left tartan
#

Or finding more efficient planting methods, etc

final kiln
#

My next project is gonna be an image classifier that tells me when my pizza is ready, cuz I need that in my life

wicked osprey
#

Are there any dynamic changes it could bring towards the development of an individual. Ex; would you be able to use your skillset as a way to enhance your lifestyle?

wicked osprey
final kiln
wicked osprey
#

Could you define a few examples?

wicked osprey
#

And is forecasting the only type of byproduct of ML/DS or are there more?

left tartan
vernal quartz
wicked osprey
#

LOL, that's quite an example.

vernal quartz
#

i guess that's what called death are coming

final kiln
wicked osprey
left tartan
wicked osprey
#

Zero clue.

final kiln
left tartan
# wicked osprey Zero clue.

I think you should probably start with those Wikipedia pages, and find a good explanation and read it thoroughly. These are broad topics that need a more thorough explanation

#

I’m not sure we can adequately explain the entire industry effectively here

wicked osprey
#

Ah, well.. I appreciate the insight(s) and guidance.

vernal quartz
wicked osprey
left tartan
wicked osprey
wicked osprey
#

This entire industry seems like a new digital world opening up for me.

vernal quartz
left tartan
serene scaffold
final kiln
#

Actually one of my colleagues was working on similar stuff, detecting cancer in PET scans or something of the sort

vernal quartz
#

no, deep learning developer is a different role, but most of the industry doesn't have enough courage or people to hire for a deep learning developer role

wicked osprey
#

Are DS'ers involved in building the algorithms, i.e. developing the software? Or are they limited in some sort of way? (Had a quick skim, couldn't get to the bottom of it) So far, I've seen most is data gathering.

serene scaffold
final kiln
left tartan
#

I laughed at how the HBS link above separated DS and DA (the distinction is very muddy in real world)

wicked osprey
serene scaffold
wicked osprey
#

This question is for the employed and or fascinating hobbyists, what responsibilities do you get to adhere to?

final kiln
#

I wish there was enough time in a human life to do it, there's a ton of cool stuff to do in this industry

wicked osprey
final kiln
#

I think at larger companies you'll be a lot more confined to your lane, but I've been in small places only so far

hollow furnace
final kiln
vernal quartz
wicked osprey
hollow furnace
#

i dont think so, didnt see any windows pop up or any green light flashing

left tartan
final kiln
wicked osprey
#

Quite rigorous, you could use some positive reinforcement or a different mental model to tackle your problem. Gets you through the boring stuff.

final kiln
vernal quartz
#

i just lost the interest when i read N

final kiln
# hollow furnace

Try this one

import cv2

# Define a video capture object (0 = default camera)
vid = cv2.VideoCapture(0)

while True:
    # Capture the video frame by frame
    ret, frame = vid.read()
    if not ret:
        # No frame could be read (camera unavailable or stream ended)
        break

    # Display the resulting frame
    cv2.imshow('frame', frame)

    # The 'q' key is set as the quit key;
    # you may use any other key of your choice
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# After the loop, release the capture object
vid.release()
# Destroy all the windows
cv2.destroyAllWindows()
hollow furnace
#

yeah. hmm maybe its a user password one at the bottom but 1) i dont know if my camera has credentials and 2) mac password has an @ symbol so dont know if it messes with the '@192..."

left tartan
final kiln
#

I just searched online

vernal quartz
#

i mean after reading it i know it's talking about a matrix, but what is EN?

#

what is this shit?

wicked osprey
hollow furnace
final kiln
#

This one is the official docs

left tartan
hollow furnace
#

n is an element in the set of natural numbers

left tartan
final kiln
#

Except people keep adding features to it

#

In programming language theory and proof theory, the Curry–Howard correspondence (also known as the Curry–Howard isomorphism or equivalence, or the proofs-as-programs and propositions- or formulae-as-types interpretation) is the direct relationship between computer programs and mathematical proofs.
It is a generalization of a syntactic analogy ...

hollow furnace
vernal quartz
#

so it means MN is in the natural numbers

#

Fuck

hollow furnace
#

@final kiln wow, ur a wizard. it really works.

final kiln
vernal quartz
#

why don't we just say MN is in the natural numbers?

final kiln
#

In mathematics, the surreal number system is a totally ordered proper class containing not only the real numbers but also infinite and infinitesimal numbers, respectively larger or smaller in absolute value than any positive real number. Research on the Go endgame by John Horton Conway led to the original definition and construction of surreal n...

hollow furnace
final kiln
tidal bough
hollow furnace
#

cuz math needs precise definitions and should leave no doubt for the reader. at least thats what my math professor said.

vernal quartz
#

this entire paper is bsting about rows and columns

#

and then i realized it's matrix addition

#

fk

#

help

#

what is this

hollow furnace
vernal quartz
#

i know its talking about a matrix, but i felt discouraged

#

????????????

#

????????????

wooden sail
#

do you have a particular question about this?

#

here they're just introducing notation

vernal quartz
#

what is this mean?

#

how to read this?

hollow furnace
# vernal quartz

is this saying ur multiplying elements diagonally and then summing them up?

vernal quartz
#

what is the reverse A?

wooden sail
#

it says in english: for all matrices A, B, C containing real numbers and with shapes m x n, n x p, and p x q, respectively, multiplication is associative

serene scaffold
vernal quartz
#

the angle?

serene scaffold
tidal bough
#

forall

serene scaffold
#

backwards E, I should say

hollow furnace
#

for all A in Real number field of size m times n

#

m times n are dimensions if i remember correctly

wooden sail
#

.latex $\forall$ means ``for all'' and $\exists$ means ``there exists''

strange elbowBOT
wooden sail
#

.latex $\in$ means ``is an element of''

strange elbowBOT
vernal quartz
#

so does "all" mean A, or does it mean A, B, C? and why are they formatted like "A, B, C:", are they connected?

tidal bough
#

i always get jumpscared when I remember this is how one writes quotes in latex

#

i'd much rather csquotes, thank you.

wooden sail
hollow furnace
#

they're connected tgt by the last part

wooden sail
#

ah it means all A, B, and C with the given properties

hollow furnace
#

(AB)C MUST be equal to A(BC)

tidal bough
#

forall ..., it's true that (A B)C = A (B C).

vernal quartz
#

so for A in R, B in R, C in R, (AB)C=A(BC)

wooden sail
#

with the given shapes

serene scaffold
wooden sail
#

since matrix multiplication is not defined for arbitrarily shaped matrices, the shapes are important

serene scaffold
#

R^(a x b) means "an array of real numbers of shape a-by-b"

tidal bough
#

well, this specific statement is true as long as it even makes sense. matrix multiplication is associative for all shapes for which it's even defined.
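
For anyone who wants to poke at this numerically, here is a minimal NumPy sketch of the statement above. The shapes `m, n, p, q` and the random matrices are arbitrary choices for illustration, not anything from the book; a numerical check on one example is of course not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p, q = 2, 3, 4, 5  # arbitrary illustrative shapes

A = rng.standard_normal((m, n))  # A in R^{m x n}
B = rng.standard_normal((n, p))  # B in R^{n x p}
C = rng.standard_normal((p, q))  # C in R^{p x q}

left = (A @ B) @ C
right = A @ (B @ C)

# Equal up to floating-point rounding
assoc_holds = np.allclose(left, right)
print(left.shape, assoc_holds)

# With incompatible shapes the product is simply not defined
try:
    B @ A  # (3, 4) @ (2, 3): inner dimensions 4 and 2 do not match
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```

The shapes in the quantifier are exactly what makes every product in the equation well defined.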

vernal quartz
#

∀A, B ∈ R^{m×n}: what do A, B mean in an R 2D array?

#

every element?

serene scaffold
strange elbowBOT
serene scaffold
wooden sail
#

.latex formally, the $\times$ symbol denotes the cartesian product. consider the tuple $(a, b)$ where $a$ and $b$ are both arbitrary real numbers. we can describe this as $\real \times \real$, or equivalently $\real^2$

serene scaffold
#

(that is, they are both elements of the set of all m-by-n arrays)

wooden sail
#

hmm do we have \real in the header

#

i guess not

serene scaffold
wooden sail
# vernal quartz every element?

.latex formally, the $\times$ symbol denotes the cartesian product. consider the tuple $(a, b)$ where $a$ and $b$ are both arbitrary real numbers. we can describe this as $\mathbb{R} \times \mathbb{R}$, or equivalently $\mathbb{R}^2$

strange elbowBOT
wooden sail
vernal quartz
#

but aren't A, B elements in m-by-n arrays? why did they become the m-by-n array?

wooden sail
#

.latex $A \in \mathbb{R}$ and $a \in A$ mean different things

strange elbowBOT
wooden sail
#

oops, forgot the m xn

#

anyway, one says A is a multidimensional array

#

the other says that lowercase a is an item inside of the matrix A

vernal quartz
wooden sail
#

same thing

#

for all A and B of size m x n, and for all C and D of size n x p

vernal quartz
#

A is array or element

wooden sail
#

it's telling you right there

#

that's exactly what that says

#

you might wanna invest some time in familiarizing yourself with the symbols

#

you already know "for all" and "is an element of", so you're able to read this

final kiln
#

I found my issue

#

the first layer is overfitting the data, fo' sure

vernal quartz
#

so A is an array and a is element?

wooden sail
#

in what you shared, yes, but symbols are not universal. different authors will use different symbols

#

the symbols for "is an element of", "for all" and "exists" are usually kept standard, but anything else will vary depending on where you read it from, so it's in your best interest to learn how to read it, not memorize the symbols

vernal quartz
#

i will use a shit symbol when i publish articles

wooden sail
#

you can use whatever you like as long as the notation is made clear

#

some of the more cursed stuff uses musical symbols

tidal bough
#

oooh, algebra flashbacks

#

i don't even remember what that means anymore

wooden sail
#

some cursed commutative diagrams among eldritch objects

vernal quartz
#

but, just want to confirm, what does A, B in R^{m×n} actually mean, because it should mean element, right??

wooden sail
#

all you have to do is scroll up to where i told you exactly

#

but really, you're not reading

#

what you're asking me is what the image is telling you

#

sit down calmly and translate the expressions into english or whatever language you like

final kiln
# final kiln

if I use like, a super aggressive dropout on the first layer, would that work to prevent overfitting ?

vernal quartz
#

but you said it can be element or an array

#

i don't know if it makes sense that in this notation it means array

#

∀A, B ∈ R^{m×n}, C, D ∈ R^{n×p}: (A + B)C = AC + BC (2.19a)
A(C + D) = AC + AD

wooden sail
#

start with ∀A, B ∈ R^{m×n}

#

what does that say?

vernal quartz
#

for all A, B in the R^{m×n} array?

#

but what is A,B?

wooden sail
#

let's do this another way

#

if i write for i in range(10), what is i?

vernal quartz
#

you mean code?

wooden sail
#

that's python code

vernal quartz
#

it will create a list, from 0 to 9, and read from it

wooden sail
#

and what is i?

vernal quartz
#

this is off topic

wooden sail
#

no, it's an example of the same notation

#

that line of code tells you i is a number in the range from 0 up to (but not including) 10

iron basalt
#

Let A = {apple, strawberry, grape, 7}, then a∈A is either an apple, a strawberry, a grape, or 7. @vernal quartz

wooden sail
#

the line you shared tells you to consider some object A, doesn't matter what it is, that is in the set R^mxn

vernal quartz
#

i is memory, bytes; when it's read it gets assigned bytes that represent each of 0 to 9, it's never been a number

wooden sail
#

the variable i will have different values in different iterations, but the value will always be in the range from 0 to 9

iron basalt
#

This notation is set notation, dealing with sets of things. E.g. the set of all real numbers (the fancy R).
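
A small Python sketch of the same idea, in case it helps: the `in` operator plays the role of ∈. The set `A` reuses the fruit example from above; `values` is just a hypothetical name for this illustration.

```python
# A set containing a mix of things, as in the example above
A = {"apple", "strawberry", "grape", 7}

print("apple" in A)   # "apple" is an element of A
print(8 in A)         # 8 is not an element of A

# `for i in range(10)` reads like "for all i in {0, 1, ..., 9}"
values = [i for i in range(10)]
every_i_is_member = all(i in range(10) for i in values)

print(values[0], values[-1])   # smallest and largest value i takes
print(10 in range(10))         # 10 itself is excluded
print(every_i_is_member)
```

What `i` is stored as in memory is a separate question from what set it ranges over; the notation only describes the latter.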

odd meteor
vernal quartz
final kiln
wooden sail
#

you're failing to see the connection

#

look also at the example squiggle gave you

final kiln
vernal quartz
wooden sail
#

i have bad news for you if you intend on working with AI, then

vernal quartz
#

a bytes is a physical stuff

wooden sail
#

it's also definitely not mechanical

vernal quartz
#

and number is not

wooden sail
#

well, good luck

final kiln
#

wasn't it penrose who published a book about all math being real and we just live in the chunk of it that allows for sentience to exist

hollow furnace
iron basalt
final kiln
vernal quartz
#

if i write for i in range(10), what is i?
jjjj — Today at 1:14 AM
you mean code?
Edd — Today at 1:14 AM
that's python code --- i'm just saying these are two totally different questions, as one is coding, with mechanical things behind it such as logic gates, and it can't be simplified to fictional numbers

#

you are comparing a designed, mechanical, physical situation within memory with a purely mathematical (fictional) situation

#

back again, so A,B is array in R or A,B is element in R?

wooden sail
#

you're gonna have a really bad time if you don't learn to abstract real world phenomena

wooden sail
#

scroll up to where i already answered the question

vernal quartz
#

because if A,B means array i don't understand the notation

final kiln
#

wait til you get to whatever the heck category theory is

wooden sail
#

that does not mean array and has nothing to do with that

vernal quartz
wooden sail
#

no

vernal quartz
#

what is it

left tartan
wooden sail
#

why don't you just read the line and replace the symbols we discussed with their meaning in english?

agile owl
#

i think the hardest part about math is the notation is not as descriptive as code

agile owl
#

if you don't know what you're reading it might as well be hieroglyphics

left tartan
wooden sail
#

and you mean math notation, because maths is independent of the notation

#

you could just as easily do it in english if you use precise language

agile owl
#

right, but everyone uses math notation

wooden sail
#

yes, but not the same one

final kiln
#

Category theory is a general theory of mathematical structures and their relations

I thought study of mathematical structures and their relations was maths, so like category theory is meta math ?

left tartan
wooden sail
#

every single book and paper starts with a description of what the symbols will mean throughout their work

iron basalt
#

(See formal methods)

agile owl
#

I'm not talking about math papers I'm talking about papers that use math

wooden sail
#

same thing

vernal quartz
agile owl
#

most of the time people aren't proving their results or anything in model papers

wooden sail
#

they will start with a definition of the symbols, and they will also have text and figures and pseudo-code

#

be it pure math papers or engineering ones

#

because a mixture of all those things is perfectly valid, and the symbols are not universal

wooden sail
left tartan
vernal quartz
#

That PRO

wooden sail
#

all you had to do was scroll up to where i had already told you that

vernal quartz
wooden sail
#

what?

vernal quartz
#

you said one says its an array, one says its an element

wooden sail
#

yes

vernal quartz
#

????

iron basalt
wooden sail
#

you shared an image with an element of a matrix, and with a matrix

vernal quartz
#

it doesn't make sense if it is an array

wooden sail
#

you're not reading and you're refusing to understand

#

sit down calmly for a second and read carefully

#

.latex the image you shared spoke of a scalar element $a \in A$ and a matrix element $A \in \mathbb{R}^{m \times n}$

strange elbowBOT
agile owl
wooden sail
#

and i told you to read the symbols carefully so you can distinguish them, because authors will use different symbols

#

nowhere in what i wrote was there any "i'm not sure", there was only "learn to read the notation, don't memorize"

#

.latex going back to the for loop, you can write it in the form $i \in [0, 10)$ which is of the same kind of what we're discussing here

strange elbowBOT
left tartan
#

I think I understand part of the confusion: the element type depends on what the set contains. Could be anything. Could be the set of all adults in Kansas. Could be the set of all Real numbers. Could be a set of all sets containing at least one prime number, etc

#

The element would thus be any single member of the set

wooden sail
#

i also explained the notation R^{m x n} above, but they skipped that, it seems

iron basalt
wooden sail
#

i am bereft of patience now so i'll be on my way

left tartan
agile owl
#

Trying to remember how to index into my numpy tensor

#

ndarray whatever

#

3d array

#

I ended up fixing my polars so I only collect once but now I have this big honking numpy array I have to index manually for different columns

#

I guess I could just use the eager frame

iron basalt
agile owl
#

does someone have a one liner to get the upper triangle only in a flattened numpy covariance matrix

#

im about to ask chatgpt

vernal quartz
agile owl
#

I didn't even know np.triu existed

#

I feel dumb now

left tartan
wooden sail
#

triu would've been my suggestion

vernal quartz
left tartan
#

The universe is the set of all (m,n) dimensional arrays. M and N could be whatever you want.

vernal quartz
#

but an array only contains elements

left tartan
agile owl
#
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(n_lags, 17 * self.no_symbols + 39 + self.no_symbols*(self.no_symbols+1)//2)
        )

I need to write a comment on the shape of this thing at this point it's gotten monstrous

left tartan
#

You’re using the term element in an ambiguous way, which is your confusion, I think

#

X element of Y means any member of the set Y

vernal quartz
#

let's say this array can only contain elements. ∀A, B ∈ R^{m×n}, C, D ∈ R^{n×p}: when we use this, A and B mean anything inside this array as a single element, so they will be elements

iron basalt
vernal quartz
#

yes, i know

vernal quartz
#

how come A became an array?

left tartan
#

(Someone correct me if I’m wrong)

#

A and B are both m,n dimensional arrays

vernal quartz
#

Associativity:
∀A ∈ R^{m×n}, B ∈ R^{n×p}, C ∈ R^{p×q}: (AB)C = A(BC) (2.18)
Distributivity:
∀A, B ∈ R^{m×n}, C, D ∈ R^{n×p}: (A + B)C = AC + BC (2.19a)
A(C + D) = AC + AD (2.19 ----- so you're saying A, B can be an array in an n×p-shaped R array?

iron basalt
final kiln
#

I'm not understanding the confusion, this is pretty simple to interpret I feel

vernal quartz
#

the question is: can A, B be arrays in this notation?

final kiln
#

They're matrices

vernal quartz
#

as an element in nxp shape R array?

left tartan
#

I don’t know how to explain differently

final kiln
#

A ∈ R^{2×2} means A is a 2x2 matrix

#

With real value elements
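
A tiny NumPy sketch of the distinction being discussed, with made-up numbers: `A` is one member of the set of all 2x2 real matrices, and `a` is one entry inside `A`.

```python
import numpy as np

# A is a single element of R^{2x2}: one whole 2x2 real matrix
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# a is a single entry of A: one real number
a = A[0, 1]

print(A.shape)                 # shape of the matrix A
print(a)                       # the scalar entry in row 0, column 1
print(np.ndim(A), np.ndim(a))  # A has 2 dimensions, a has 0
```

So "A ∈ R^{2×2}" says which set the whole matrix belongs to, while "a ∈ A" would be talking about something inside it.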

iron basalt
tidal bough
#

consider reading ∈ as "belongs to the set", for less confusion. "element" is an ambiguous term.

vernal quartz
#

∀A, B ∈ R^{m×n}, my question is: as R^{m×n} is an array, and an array only contains single numbers or whatever is in the array, can A, B be an array?

left tartan
#

It’s the set of all m,n arrays

#

(Real valued arrays)

vernal quartz
#

∀A, B ∈ R^{m×n}, C, D ∈ R^{n×p}: (A + B)C = AC + BC ===> because if you think A, B can be arrays, i don't understand the following: (A + B)C = AC + BC, so can this also be an equation about array addition?

vernal quartz
tidal bough
#

... uh, yes? this identity involves matrix multiplication and matrix addition.

final kiln
#

It's stating that the equation holds for all such matrices

#

Every time you write a math sentence you're implicitly stating it to be true
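
As a rough numerical illustration of the two distributivity identities quoted above (not a proof, and the shapes and random values are arbitrary assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 2, 3, 4  # arbitrary illustrative shapes

# A, B are both in R^{m x n}; C, D are both in R^{n x p}
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))
C = rng.standard_normal((n, p))
D = rng.standard_normal((n, p))

# (A + B)C = AC + BC
right_distributive = np.allclose((A + B) @ C, A @ C + B @ C)

# A(C + D) = AC + AD
left_distributive = np.allclose(A @ (C + D), A @ C + A @ D)

print(right_distributive, left_distributive)
```

A and B need the same shape so that A + B makes sense, and their column count has to match the row count of C so the products make sense.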

vernal quartz
#

(A + B)C = AC + BC, if A, B are matrices, is this notation right? or the equation?

left tartan
#

If a and b have the same shape

#

That’s why they’re both members of Rmxn, to say: they both have same dimensions.

agile owl
#

how do I flatten the upper triangle only though

#

that's still the issue

#

np.triu gives you nans elsewhere but if you call .ravel() or something it still gives you the nan values

tidal bough
#

you could make a boolean mask with ones in the upper triangle, and do arr[mask]. that'd be a 1d array.

left tartan
#

You could just mask it with … ^

#

I guess you could invert it and triu?

#

Oh, tril duh

tidal bough
agile owl
#
covariance = np.triu(np.eye(self.no_symbols))[np.triu_indices(self.no_symbols)]
final kiln
tidal bough
#

triu(eye)? isn't that always just an eye?

vernal quartz
#

i got it here, the author called it matrix

agile owl
#

that's just the case where it doesn't have enough observations

#

also I need to remove the triu call

#

I just need the indices

#

but yeah there's a triu_indices method too that's handy

tidal bough
#

purpose of whom

final kiln
agile owl
#

no

#

as long as it's consistent every time

#

I am using np.triu_indices to index it
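
A minimal sketch of that approach on a made-up 3x3 covariance matrix (the values and the names `cov` and `flat` are just for illustration). `np.triu_indices(n)` returns the row and column indices of the upper triangle including the diagonal, always in the same row-major order, so the mapping from (i, j) to a flat position is consistent between calls. Note that `np.triu` itself fills the lower triangle with zeros rather than dropping it.

```python
import numpy as np

n = 3
# Made-up symmetric matrix standing in for a covariance matrix
cov = np.array([[1.0, 0.2, 0.3],
                [0.2, 2.0, 0.5],
                [0.3, 0.5, 3.0]])

# Indices of the upper triangle (diagonal included), row by row
rows, cols = np.triu_indices(n)

# Fancy indexing gives a 1-D array with only those entries
flat = cov[rows, cols]
print(flat)

# Number of entries is n * (n + 1) // 2
print(flat.size == n * (n + 1) // 2)

# np.triu keeps the full shape and zero-fills below the diagonal
print(np.triu(cov)[1, 0])
```

Pass `k=1` to `np.triu_indices` to skip the diagonal if the variances are not needed.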

final kiln
#

Then just sort them and slice from the index of the first Nan

agile owl
#

nah that won't be consistent

final kiln
#

Wdym you'd be literally sorting them

vernal quartz
#

λ ∈ R. Let A ∈ R^{m×n}. any idea what the difference is between R and R^{m×n}? is this scalar λ in anything?

agile owl
#

the mapping from i, j of the covariance matrix to the flattened array can be in any order, but it has to be the same mapping every time

final kiln
#

Ah you can just kinda use index trickery for it I think

left tartan
#

Because the dimensions aren’t important to these properties

vernal quartz
final kiln
#

I used this not long ago

left tartan
#

You’re mixing multiple links here but that’s what that says yes

vernal quartz
#

i got it

iron basalt
#

This is all just describing the basic ways in which matrix algebra works.

#

You could list all this with real numbers, or whatever you are used to, and it will look similar (list of rules).

#

The first line here, I can show you something similar you are familiar with: a(1/a) = 1 = (1/a)a.

vernal quartz
#

is analytic geometry and matrix decomposition important?

iron basalt
vernal quartz
#

?

iron basalt
vernal quartz
#

this book feels off

#

i need a pre-education book

#

what s is this, its even write weird

#

why it needs to be twisted like that

left tartan
#

What’s twisted? Have you taken stats before?

#

Bayes theorem is in college stats, right?

final kiln
iron basalt
vernal quartz
left tartan
iron basalt
#

In a review of probability before really getting to the actual stats part.

#

At the start of the first semester.

vernal quartz
vernal quartz
final kiln
left tartan
#

We’re just trying to figure out what you need to review: as (mostly?) CS students, we take some of this stuff for granted.

final kiln
#

Business analytics (BA) is a set of disciplines and technologies for solving business problems using data analysis, statistical models and other quantitative methods.

agile owl
#

the applications of bayes theorem aren't always immediately obvious

#

everyone should remember the basic idea though

left tartan
#

A good stats background is definitely important for DS

vernal quartz
#

where i should start

#

i guess i passed high school level

left tartan
final kiln
vernal quartz
iron basalt
left tartan
#

And understand you’re just at the first step… it’s ok, just appreciate that there’s more ‘maturity’ to develop

vernal quartz
#

i remember i passed the exam with this

#

so where should i start

final kiln
#

With the basics

left tartan
#

Many colleges have two stats courses, one for engineers and one for non-engineers

lapis sequoia
#

I have a PPO implementation with Actor and Critic both having 2 Dense layers of 256 units each with ReLU activation, and an output Dense layer of size 2, I think with softmax. Unfortunately the average score for CartPole-v0 peaks at 100 and I don't know why. Can anybody advise if my layers are good enough?

vernal quartz
#

i remember i stopped here, bootstrap

final kiln
lapis sequoia
vernal quartz
final kiln
#

But if it plateaus it might be that the model is at capacity and you gotta make it bigger

agile owl
#

what's a good reward on cartpole

lapis sequoia
agile owl
#

I spend most of my time comparing rewards on my own custom environments not benchmarks

final kiln
agile owl
#

so what is a good reward on cartpole

#

what are you expecting

lapis sequoia
final kiln
agile owl
#

are you using sb3 for the PPO

lapis sequoia
#

there's very sparse documentation of gym environments to give concrete answers

iron basalt
# vernal quartz thoughts?

I think you seem to be struggling with the notation, so anything that gets you used to that. One such thing is a discrete mathematics book, often used to get people in CS used to it (and it's relevant to writing algorithms).

agile owl
#

I mean you can see benchmarks for cartpole on a lot of sites about RL

#

I just don't remember what a good score is

agile owl
#

There is going to obviously be a limit to how far you can get with a given environment

lapis sequoia
agile owl
#

stable baselines

lapis sequoia
agile owl
#

ah ok they have an implementation of PPO was just curious

vernal quartz
lapis sequoia
iron basalt
# vernal quartz stopped here, bootstrap

Being able to apply existing methods / tools is great, and what a business degree should focus on, but in this case you are trying to understand how those work fundamentally, at a mathematical level. To do so you need to be able to read something like the Bayes theorem notation. This can be learned by any number of mathematical books on probability. But also you want to be comfortable with the set notation earlier, and the linear algebra notation. In addition, ML makes heavy use of calculus.

agile owl
#

deeper networks don't help on cartpole

#

but wider ones do

lapis sequoia
agile owl
#

not afaik

lapis sequoia
# agile owl not afaik

could an LSTM layer help? since you need to balance a cartpole and previous information could be useful

agile owl
#

there's people researching different network architectures for policy networks

vernal quartz
agile owl
#

there is a version of PPO with an LSTM layer but empirical results aren't that great in most applications compared to simple framestacking

final kiln
agile owl
#

I definitely don't have linear algebra I need to get the linear algebra brought in

left tartan
final kiln
#

I'm probly biased, but I think calculus is the most important cuz of gradient descent

lapis sequoia
past meteor
#

I'm honestly less hung up about what method belongs in what box

agile owl
#

I don't really have an interpretation for you besides the actual thing

left tartan
agile owl
#

they lead to different results though

iron basalt
lapis sequoia
agile owl
#

do they teach people statistics without calculus?

past meteor
#

Yes

agile owl
#

yikes

left tartan
lapis sequoia
past meteor
#

My SO took 7, yes seven stats classes

agile owl
#

I took statistics for economics and it had calculus

past meteor
#

And they only gave her math after like 4 or 5?

left tartan
#

AP stats, for instance, has no calc

lapis sequoia
past meteor
#

I studied business engineering. My path was a lot more linear. Year 1 had lin alg in semester 1, with some programming course, semester 2 had calculus.

lapis sequoia
#

i argue real analysis is calculus on steroids

agile owl
#

I didn't take a real real analysis course I took a course called mathematics for economics that was about optimizing in convex spaces and hessians and all that jazz

past meteor
#

Year 2, semester 1 brought stats

#

Year 3 brought econometrics and so on

#

I think econometrics was actually the most valuable data course I took because it was the one that taught you the "finesse" of working with data

#

That's so missing in most

lapis sequoia
#

i miss the funny questions like can an infinite union of countably infinite sets be countably infinite

iron basalt
agile owl
#

I feel like my econometrics course was much ado about multiple linear regression

left tartan
#

That’s something I find myself repeating a lot: that ‘calculus’ is the beginning not end of the math journey. Students were built up to believe that calculus was the pinnacle, rather than the pit stop at the foot of the mountain.

past meteor
#

Yeah, it started slow but ramped up. By the end we were covering logit/probit and so on in detail

#

By my masters I was exclusively taking CS faculty ML courses and they felt half as rigorous. Many of my cohort were lacking that "finesse"

lapis sequoia
#

did anyone do Measure Theory?

iron basalt
left tartan
#

Like, half the class was stats majors

lapis sequoia
vernal quartz
past meteor
#

They were so complementary. I took a course called "data mining" and there we had to do logistic regression by hand

iron basalt
agile owl
#

I feel like all the courses I've taken have been for stupid people that's the only way I could have passed them

vernal quartz
#

Probability i remember i did this

past meteor
#

It was a nice eye opener to do SGD, chain rule etc. to really get it

iron basalt
past meteor
#

The previous semester, econometrics, was really all about residual analysis and "modelling" moreso than "Yeah SGD is scalable!!!" (which was more the CS faculty approach)

iron basalt
#

It's probably the most applicable branch of math other than calculus (maybe even tied with it).

lapis sequoia
left tartan
agile owl
#

I learned backprop really well once and I have completely forgotten the actual math behind it tbh

vernal quartz
#

anyone give me a road map for math

agile owl
#

I forget everything

left tartan
lapis sequoia
agile owl
#

I must have done too many drugs or something

past meteor
#

I'm never going to forget, data mining had this one exam question that was basically

What happens to the weights of a logistic regression if you have no regularization and the classes are perfectly linearly separable? Answer in 15 words or less

vernal quartz
#

i think i learned probability, and my stats, i don't know what level it is

lapis sequoia
past meteor
#

I remember they go to infinity and I used L'Hôpital's rule to prove it

#

I don't remember why tho
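A rough numerical illustration of the "why" (not the L'Hôpital proof, just a sketch with made-up data): when the classes are perfectly separable, scaling the weight up always lowers the log-loss a bit more, so there is no finite minimum and gradient descent keeps pushing the weight upward. The data, step counts and learning rate below are assumptions for the demo.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_no_regularization(xs, ys, steps, lr=1.0):
    """Full-batch gradient descent on mean log-loss, single weight, no penalty."""
    w = 0.0
    for _ in range(steps):
        grad = sum((sigmoid(w * x) - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


# Perfectly separable 1-D data: every negative x is class 0, every positive x is class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

step_counts = (10, 100, 1000, 10000)
weights = [fit_no_regularization(xs, ys, steps) for steps in step_counts]
for steps, w in zip(step_counts, weights):
    print(f"after {steps:>5} steps: w = {w:.3f}")
```

The weight keeps growing the longer you train (slowly, roughly like the log of the step count). An L2 penalty would give it a finite optimum.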

agile owl
#

ah L'Hopital's rule

lapis sequoia
agile owl
#

I remember using that

left tartan
vernal quartz
lapis sequoia
past meteor
agile owl
#

I'm gonna just admit it: it's a real problem that I don't do anything by hand, because I learn it all and then I forget how it works, and at the end I'm no different than someone using a library they don't understand

vernal quartz
past meteor
#

I'd not cope with doing math for the sake of it

lapis sequoia
iron basalt
agile owl
#

I did multivariate calculus before linear algebra believe it or not

lapis sequoia
agile owl
#

I just went calculus -> multivariate calculus -> linear algebra

past meteor
#

I don't think we should overstate the amount of math you need tho

iron basalt
agile owl
#

Actually I found multivariate calculus relatively easy compared to Linear

past meteor
#

What it mostly gives you is certainty that you're not doing stuff incorrectly

agile owl
#

I think Linear algebra requires a completely different part of your brain or something

past meteor
#

I think the part you should really get as a practice focused ML professional is how to evaluate models correctly

vernal quartz
agile owl
#

because I was fine at calculus but I suck at linear algebra

past meteor
#

But I always say this 😄

lapis sequoia
left tartan
vernal quartz
#

for math, should i calculate everything by myself after i've learned it, or just watch it

iron basalt
past meteor
#

If you know how to evaluate models correctly, you can technically do no harm even when you use models without knowing how they work

lapis sequoia
vernal quartz
lapis sequoia
vernal quartz
#

i only know Gaussian

lapis sequoia
past meteor
#

I couldn't code Gauss-Jordan elimination 🤷

#

I know what it is

#

But idt it's particularly relevant to be honest

iron basalt
past meteor
#

It's the same story with CS proper

#

Not every programmer needs to be a systems programmer

vernal quartz
#

i don't question my coding skills, i just don't understand math

lapis sequoia
vernal quartz
#

how to calculate them

iron basalt
#

Not that Gauss-Jordan in particular is important to implement, but being able to do stuff like that makes you very valuable labor.
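For reference, a minimal sketch of what "coding up Gauss-Jordan" might look like in plain Python, with partial pivoting to solve A x = b. The function name and the singularity tolerance are illustrative choices. In practice you'd reach for a linear algebra library.

```python
def gauss_jordan_solve(a, b, tol=1e-12):
    """Solve A x = b by reducing the augmented matrix [A | b] to reduced row echelon form."""
    n = len(a)
    # Build an augmented copy so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot becomes 1.
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        # Eliminate this column from every other row (above and below).
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [rv - factor * cv for rv, cv in zip(m[r], m[col])]
    # The last column now holds the solution.
    return [row[n] for row in m]


solution = gauss_jordan_solve([[2, 1], [1, 3]], [3, 5])
print(solution)
```

Eliminating above the pivot as well as below is what makes it Gauss-Jordan rather than plain Gaussian elimination with back substitution.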

past meteor
#

In that vein, not every ML professional needs a deep understanding of math

past meteor
#

A working knowledge is enough for the vast majority of roles

iron basalt
agile owl
#

in my experience people care more about your experience doing commercial things

vernal quartz
#

i will let every recruiter know i know Gauss-Jordan elimination

iron basalt
past meteor
#

I disagree tbh

#

This is only true for some roles

lapis sequoia
#

what area of maths is Kullback–Leibler divergence in?

past meteor
#

Information theory
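For discrete distributions it's D_KL(P || Q) = sum over i of p_i * log(p_i / q_i). A quick sketch in plain Python (the function name and example distributions are made up for illustration, natural log so the result is in nats):

```python
import math


def kl_divergence(p, q):
    """KL divergence D_KL(P || Q) in nats for two discrete distributions.

    Terms with p_i == 0 contribute 0 by convention. If q_i == 0 where
    p_i > 0, the divergence is infinite.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


p = [0.5, 0.5]
q = [0.9, 0.1]
kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
print(kl_pq, kl_qp)
```

It's zero only when the two distributions match, and it isn't symmetric, so swapping P and Q generally gives a different number.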

iron basalt
left tartan
past meteor
#

We have several CS/Physics PhDs at work

vernal quartz
past meteor
#

And I'm sure if I roll in tomorrow and ask everyone to code up Gauss-Jordan elimination