#data-science-and-ml

1 messages · Page 137 of 1

unkempt apex
#

use pastebin!

spring field
#

what's the initial image size?

unkempt apex
dim crane
#

if __name__ == "__main__":
    # Ensure Kafka topics are created
    create_kafka_topics()

    websocket.enableTrace(True)
    ws = websocket.WebSocketApp(
        f"wss://ws.finnhub.io?token={finnhub_api_key}",
        on_message=on_message,
        on_error=on_error,
        on_close=on_close,
    )
    ws.on_open = on_open

where:

def on_message(ws, message):
    """Callback function to handle incoming WebSocket messages."""
    data = json.loads(message)
    if data.get('type') == 'trade':
        for trade in data['data']:
            symbol = trade['s']
            record = {
                'symbol': symbol,
                'timestamp': datetime.fromtimestamp(trade['t'] / 1000.0).strftime('%Y-%m-%d %H:%M:%S'),
                'price': trade['p'],
                'volume': trade['v']
            }
            latest_trade_data[symbol] = record  # Update latest trade data
            try:
                future = producer.send(kafka_topic_data, key=symbol, value=record)
                future.add_callback(delivery_report)
                future.add_errback(lambda exc: logger.error(f"Failed to send record to Kafka: {exc}"))
            except Exception as e:
                logger.error(f"Failed to send record to Kafka: {e}")

spring field
#

and the kernel preserves the dimensions?

unkempt apex
#

so yeah!

spring field
#

what

unkempt apex
#

and for conv it is 3

spring field
#

with padding 1 I assume

unkempt apex
#

yeah

#

stride = 1 for conv
stride = 2 for pool

#

so should I start with image or directly with conv1?

spring field
#

I'm not entirely sure, but this makes sense to me

#

cuz obvs you start with an image
after convolution you have more layers (channels), but the size is unchanged; after maxpool you have the same number of layers, but the size is halved. then again: after convolution, more layers and size unchanged, and after maxpool, same number of layers but size halved. then it goes through the linear transforms
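
A quick way to check that shape reasoning is the usual output-size formula. This is only a sketch: the 28x28 input and the pool kernel of 2 are assumptions (the chat gives only kernel 3, padding 1, stride 1 for conv and stride 2 for pool).

```python
def out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a conv or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28  # hypothetical input width/height; the real image size is not given
for name, kernel, stride, padding in [
    ("conv1", 3, 1, 1),
    ("pool1", 2, 2, 0),  # pool kernel of 2 is an assumption
    ("conv2", 3, 1, 1),
    ("pool2", 2, 2, 0),
]:
    size = out_size(size, kernel, stride, padding)
    print(f"{name}: {size}x{size}")
```

With these assumed numbers the conv layers keep the size and each pool halves it, which matches the description above.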

unkempt apex
spring field
#

I didn't save it
though I had the forethought to screenshot all the parts, so you should be able to easily just recreate it off that screenshot

unkempt apex
#

okay!

finite rain
#

Has anyone been able to use an open REST API for calculating driving distances between coordinates? Ideally I'd like to have a table built out with the distance between each combination of various lat/lon pairs.

#

Has anyone used OSRM or a similar method to calculate distance?

deep sleet
#

I watched this video; he only explains how 1 cell works

spare forum
#

What's the need/ project tho, do you need that much

#

Sample it for learning purpose, for anything serious you don't use your personal computer

spring field
#

you just push all the tokens in a sequence through that one cell

spare forum
#

Tbh I'm dealing with a similar type of data rn, but I have the odometer at the end and start of the trip

finite rain
#

I am looking to build a matrix containing driving distance between all the possible coordinates.

#

Then based on a coordinate I am at, it can tell me the other closest coordinates near by.

#

So far I have found two solutions though unsure of the cost. Distance Matrix API from Google, and Travel Cost Matrix by ESRI. I was curious if there are others someone has already tried.
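
For the matrix-plus-nearest-lookup part, here is a minimal sketch. It deliberately swaps in great-circle (haversine) distance, which is NOT driving distance; the idea is that `haversine_km` could be replaced by a call to whichever routing service is chosen. The point names and coordinates are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


# Hypothetical sample points (name -> (lat, lon)).
points = {
    "A": (40.7128, -74.0060),
    "B": (40.7306, -73.9352),
    "C": (34.0522, -118.2437),
}

# Full pairwise matrix; swap haversine_km for a routing call to get road distances.
matrix = {p: {q: haversine_km(points[p], points[q]) for q in points} for p in points}


def nearest(origin, k=1):
    """Names of the k closest other points to `origin`."""
    others = sorted((d, q) for q, d in matrix[origin].items() if q != origin)
    return [q for _, q in others[:k]]


print(nearest("A"))
```

A full matrix grows quadratically with the number of points, so for paid APIs it may be worth pre-filtering candidates with the straight-line distance first.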

deep sleet
spring field
#

when looking in the horizontal direction, it's the same cell just being passed all the tokens sequentially, in the vertical direction it's different cells though
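
A tiny sketch of that picture, using scalars instead of vectors to keep it readable. The weights and the token values are made up; the point is only that each layer's cell is reused for every token, while different layers have different cells.

```python
import math


def rnn_cell(x, h, w_x, w_h, b):
    """One Elman-style step on scalars: new hidden state from input x and state h."""
    return math.tanh(w_x * x + w_h * h + b)


# Hypothetical weights: one (w_x, w_h, b) set per layer, reused at every time step.
layers = [(0.5, 0.1, 0.0), (0.3, -0.2, 0.1)]
tokens = [1.0, 0.0, -1.0]  # hypothetical, already-embedded sequence

hidden = [0.0 for _ in layers]  # one hidden state per layer
for x in tokens:  # horizontal direction: same cells, next token
    inp = x
    for i, (w_x, w_h, b) in enumerate(layers):  # vertical direction: different cells
        hidden[i] = rnn_cell(inp, hidden[i], w_x, w_h, b)
        inp = hidden[i]  # output of layer i feeds layer i + 1
print(hidden)
```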

#

though as far as I can tell, despite this containing several cells, it's still a single hidden layer
hmmmmmmmmmm

#

or you just can't have two hidden layers with three cells each and it's just gonna be 6 hidden layers

#

that is, unless your hidden layers contain sublayers, meaning, you have an RNN, some other thing, then another RNN, then some other thing, then another RNN and then this whole sequence repeats twice, I guess you could consider that group one hidden layer?

#

right, something like

[
  [
    RNN
    Linear
    RNN
    Linear
  ],
  [
    RNN
    Linear
    RNN
    Linear
  ]
]

this would be 2 hidden layers with 2 RNN cells each

#

cuz you could do say this

[
  [
    RNN
    RNN
    RNN
  ],
  [
    RNN
    RNN
    RNN
  ]
]

but that's equivalent to just

[
  RNN
  RNN
  RNN
  RNN
  RNN
  RNN
]
spring field
#

it also depends on what you consider a layer, is an activation function a layer as well?

#
[

  Embedding

  [

    RNN <Tanh>
    Linear <Sigmoid> |
    RNN <Tanh>       |
    |--- + ----------| (residual)
         |
    LayerNorm
    RNN <Tanh>
    Linear <Sigmoid>

  ] x 2

  Softmax

]

sth like this could be 2 hidden layers with 3 RNN cells each I guess?

#
[

  Embedding

  [

    RNN <Tanh> ----|
    RNN <Tanh>  <- | - (not included in the residual)
    RNN <Tanh>     |
    |-- + ---------| (residual)
        |
    Linear <Sigmoid>

  ] x 2

  Softmax

]

or sth like this, for instance, idk, just dreaming up some architectures, lol

(also need an embedding layer before all of these architectures, lol (added to the last two, but same goes for the previous stuff))

coral field
#

if i have numerical data and i need to preprocess it by addressing input data skew, running SMOTE, and standardizing the data, would this preprocessing pipeline be acceptable:
log transform input features -> split data into train/test -> SMOTE training data -> combine training + testing data -> standardize inputs -> split back into training + testing (using same split approach as step 2)

mild dirge
#

standardizing should be fitted on training only @coral field

coral field
#

why so?

mild dirge
#

Because you shouldn't use information from the test set

#

It is a minor thing, but formally it should be done like that

coral field
#

okay

#

would it cause any large issues in model performance?

mild dirge
#

If your test set has a different distribution, and you standardize based on those values, you already use info from the test set, so your results could be kinda biased in your favor, which is bad.

coral field
#

alright, but if the distribution is really similar, im assuming the performance wouldn't matter too much?

spring field
#

(is standardization the same as normalization?)

mild dirge
coral field
#

one last thing, if my training and test sets are very different, yet standardization was fit based on both sets combined, would the test set results generally be a lot weaker, or would it depend

mild dirge
#

if you fit and transform separately then probably yes. Just to make it clear, you should fit on the training data (so you get train_mean, train_SD) and then you do

test = (test - train_mean) / train_SD
train = (train - train_mean) / train_SD
#

If the distributions were very different and you did

test = (test - test_mean) / test_SD
train = (train - train_mean) / train_SD

Your model would likely perform worse.
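
The same idea as a runnable sketch with the standard library only. The numbers are made up, and the test values are shifted on purpose so the effect is visible.

```python
from statistics import mean, pstdev

train = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical feature values
test = [10.0, 12.0]                # deliberately from a shifted distribution

# Fit on the training split only.
train_mean, train_sd = mean(train), pstdev(train)


def standardize(values):
    """Apply the *training* statistics to any split."""
    return [(v - train_mean) / train_sd for v in values]


train_z = standardize(train)
test_z = standardize(test)
print(train_z)
print(test_z)
```

Because the test split is scaled with the training statistics, its shift stays visible to the model instead of being normalized away.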

coral field
#

👍

deep sleet
deep sleet
umbral blaze
deep sleet
umbral blaze
deep sleet
#

No worries

spring field
# deep sleet what is a residual 😅

in simple terms (in this case meaning that I have barely any ideas as to why, lol) you just add the input to a layer to that layer's output, it's a skip connection, you can also use concatenation instead methinks, but that requires more memory
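
A minimal sketch of both variants on plain lists. `layer` is a hypothetical stand-in for any layer whose output has the same shape as its input.

```python
def layer(x):
    """Hypothetical stand-in for any layer that preserves the input shape."""
    return [0.5 * v for v in x]


def residual(x):
    """Skip connection: add the layer's input to the layer's output."""
    return [xi + yi for xi, yi in zip(x, layer(x))]


def concat_skip(x):
    """Concatenation variant: the output is wider, so it needs more memory."""
    return x + layer(x)


print(residual([1.0, 2.0]))     # [1.5, 3.0]
print(concat_skip([1.0, 2.0]))  # [1.0, 2.0, 0.5, 1.0]
```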

spring field
spring field
lapis sequoia
#

idk which framework, in keras you'd do (pseudocode):

encoder = Encoder(...)(input)
output = Decoder(...)(encoder)
AE = Model(input, output) 
#

can't you just create a new AE class and use the encoder/decoder models within i.e set them in init, call them in forward?

#
class AE(torch.nn.Module):
    def __init__(self, encoder, decoder):
        super().__init__()
        
        self.encoder = encoder    
        self.decoder = decoder
 
    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
#

i can't really help more than that now though, that's just a sketch-idea

topaz mason
#

hey um, im about to start college and im taking AI and ML courses in my college, i have some knowledge in AI and ML like about activation functions, layers, how layer values are calculated and some other things but i feel like im nowhere near to the actual stuff, can someone provide me with ways to learn and where to start from? (i dont mind learning from the start), thanks in advance

unique spoke
#

Anyways im sure you can find some great courses on Coursera, Udemy, etc.

serene grail
#

There is an ML math book in the pinned messages on this channel

topaz mason
unique spoke
lapis sequoia
#

What textbooks do you suggest for neural networks? Or where do I read research papers. I have heard when getting into these it's important to read many papers

#

Aight

#

Thank you!

topaz mason
lapis sequoia
#

ig not an introductory book, judging by who's commented

ocean pawn
#

Does anyone know how to denormalize w and b? I can do it for w, just not b

lambda x: (x - x_train_mean) / x_train_std  # normalize
lambda n: n * x_train_std + x_train_mean  # denormalize
#

x_train_mean and x_train_std both have the shape of w, so it works

#

But for b, which is just a scalar, how do I denormalize it

#

I just want to denormalize it so I can plot the graph

#

Denormalizing w works, but I can't do it for b

#

The gradient looks right, which means w is denormalized properly, but b is still off

#

Thanks!

#

Is what I'm doing completely wrong? Can you even undo normalization for parameter?

lapis sequoia
#

but a parameter no, you can't; the mean is itself, and std is 0.

spare forum
#

You normalized the data x, not parameters or whatever

ocean pawn
#

I know

lapis sequoia
#

well that's batch normalisation

ocean pawn
#

I see

lapis sequoia
#

yeah

spare forum
#

Uh no what am I saying

ocean pawn
#

I think I have to denormalize the prediction parameters before plotting?

spare forum
#

Uh what am I saying

#

Sorry nvm you don't have to, but if you fitted with normalized x just plot normalized x and y

ocean pawn
#

Cause I am scattering the raw data

#

If I predict a line, then get the prediction, I invert the normalization on the prediction

#

Is that the correct way?

lapis sequoia
#

yes

ocean pawn
#

I'm just getting myself confused, trying to plot the function

lapis sequoia
#

so if you normalised x and y, then you undo it, or plot both normed.

#

i thought you meant w and b originally

ocean pawn
lapis sequoia
#

not stupid, it's confusing because NNs do normalise w

#

(it's called batch normalisation.)

ocean pawn
#

Oh, I've not gone that far yet, I'll keep that in mind

lapis wing
#

I'm a new Python learner, guys, help me

lapis sequoia
#

let's say that you fit a line to data, if you normalise x and y, then your solution parameters will correspond to that scenario

ocean pawn
#
plt.scatter(x_raw[:, 0], x_raw[:, 1], c=y_train)
x1 = jnp.arange(-5, 5)
x2 = ((-b) - x1 * w[0]) / w[1]
print(f"x1: {x1} x2: {x2}")
plt.plot(x1, x2)
#

So I am trying to draw a decision boundary

#

How do I plot this, if my x_train is normalized

spare forum
ocean pawn
spare forum
#

You literally fitted your model to predict y as a function of x_normalized

ocean pawn
#

I'm trying to plot the decision boundary of a logistic regression

ocean pawn
#

I don't have a computer next to me so I can't test my guess

#

Or should I give up and scatter the normalized data and call it a day

deep sleet
#

I got to be honest I am a bit confused

lapis sequoia
#

if x_1 is the input, and it was trained on normalised data, just do (x_1*std)+mean; where mean and std are the ones you got from the dataset.

ocean pawn
wooden sail
ocean pawn
#

When it's =0 sigmoid is 0.5

#

So that's the decision boundary

#

But I have no idea how to scale this output

#

Am I stupid, or do I invert normalized x1 and x2

#

Is that it?

#

Wait no

#

I am confused

lapis sequoia
#

if what you did is to fit x_1, x_2 (within z) and found w1, w2, and b to predict g(z)

#

the same op you did on training data, you do now on inputs (that's my take from your desc. i can be wrong.)

ocean pawn
#

If x1, x2 is fit into z, g(z) should always be 0.5 right?

ocean pawn
#

And sigmoid of 0 is 0.5

lapis sequoia
#

yeah i meant 0.5

ocean pawn
#

Even knowing that, my brain still gets confused when I want to plot a decision boundary that fits the non-normalized data

lapis sequoia
#

so you have to norm it, using (x_s - mean)/std imho

#

cuz that transform was done to the training data

ocean pawn
#

Just realized I pasted the code twice

lapis sequoia
#

well, those correspond to normed data, if your training data was

ocean pawn
#

Ok

#

We'll see, but I can't test it out right now, sadly

agile cobalt
ocean pawn
#

So if x axis is x1 and y axis is x2, I can just inverse the normalization for x and y axis?

agile cobalt
#

if your normalization function can also be described as x_normalized = a*x + b, then you can just do x = (x_normalized - b) / a

ocean pawn
#

Yup

#

I had that, for z score normalization

wooden sail
#

just make sure you do it in the correct order. if you subtracted first and divided second, then you need to multiply first and add second

wooden sail
#

the cute name for the reversal of order of operations during inversion is "shoes and socks theorem"

#

if you put your socks on first and then your shoes, then you first need to take off your shoes and then take your socks off later
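
Putting the advice above together for the decision-boundary case: a sketch, assuming a 2-feature logistic regression trained on z-scored inputs. All numbers are made up. It computes the boundary in normalized space and inverts per axis (multiply, then add), then checks that against folding the normalization into w and b directly.

```python
# Hypothetical values: parameters learned on z-scored features, plus training stats.
w = [2.0, -1.0]
b = 0.5
mean = [10.0, 100.0]
std = [2.0, 20.0]


def boundary_x2(x1_raw):
    """x2 (raw units) on the boundary w.z + b = 0 for a given raw x1."""
    z1 = (x1_raw - mean[0]) / std[0]  # normalize: subtract, then divide
    z2 = -(b + w[0] * z1) / w[1]      # solve the boundary in normalized space
    return z2 * std[1] + mean[1]      # invert: multiply, then add


# Equivalent: fold the normalization into the parameters themselves.
# w.((x - mean) / std) + b == (w / std).x + (b - sum(w * mean / std))
w_raw = [wi / si for wi, si in zip(w, std)]
b_raw = b - sum(wi * mi / si for wi, mi, si in zip(w, mean, std))

for x1 in (8.0, 10.0, 12.0):
    print(x1, boundary_x2(x1), -(b_raw + w_raw[0] * x1) / w_raw[1])
```

Both routes give the same line in raw coordinates, so either one can be drawn on top of a scatter of the raw data. The second route is also why b needs the extra `sum(w * mean / std)` term rather than a plain rescale.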

ocean pawn
#

Huh, cause you wear sock then shoes

#

Thanks

deep sleet
#

ohh

#

gotcha

worldly wagon
#

in respect to polars lazyframe are lazy queries similar to lazy computing lazy loading? or is it a coined term by polars?

#

I can't find any good technical discussions of how laziness/lazyframe in polars works
my real issue is the notable slowness of dataframe.unique() but I'm trying to find the true issue as I doubt i'll get solutions to that

past meteor
worldly wagon
worldly wagon
past meteor
worldly wagon
#

processing time is like 0.5-0.77 seconds

worldly wagon
past meteor
#

You're not using a lazyframe here though?

#

this is a regular dataframe

worldly wagon
past meteor
#

not necessarily

#

the unique method exists both on LazyFrame and DataFrame.

worldly wagon
#
        return (
            self.lazy()
            .unique(subset=subset, keep=keep, maintain_order=maintain_order)
            .collect(_eager=True)
        )
past meteor
#

on the former it's naturally lazily evaluated and the latter it's eagerly evaluated

worldly wagon
#
        return wrap_ldf(self._df.lazy())
past meteor
#

You mean, from the impl. pov?

#

How it works under the hood?

worldly wagon
#

don't get me wrong i may be misunderstanding something ofc hence why i asked if there was more technical writing on it

past meteor
#

I'm having a very very hard time to understand what you are asking

worldly wagon
#

methods on the polars DataFrame class have identical implementations to the Lazyframes from me reading the underlying code

misty shuttle
#

any good pytorch tutorials?

past meteor
#

The code you just removed could definitely benefit from using LazyFrame. Specifically, you need to swap out pl.read_parquet for pl.scan_parquet

misty shuttle
past meteor
worldly wagon
past meteor
#

But that's kind of irrelevant to the point of what LazyFrame is all about

misty shuttle
worldly wagon
worldly wagon
past meteor
#

it's that the query plan functionality of polars can see "oh you're doing a group by and 2 aggregations, those can be done concurrently" and so on

worldly wagon
#

to see if i can speed up my computation, i'm only doing a singular read tho not multiple reads

past meteor
#

Yeah, if I were you I'd start by not using the eager side of polars

#

Do everything in lazy

#

I'm looking for the right docs page because this is discussed at length

#

In the user guide

worldly wagon
past meteor
#

the lazy API allows Polars to apply automatic query optimization with the query optimizer

past meteor
#

Or rather, it can be

worldly wagon
#

seems like columns in lazyFrame has no setter

past meteor
#

Polars typically has two methods, read_X and scan_X

#

Use scan_parquet and you'll have a lazyframe

worldly wagon
#

pardon my ignorance btw like 2 weeks ago i moved off from pandas to polars

past meteor
#

Did you read the user guide?

worldly wagon
#

yes and the documentation

past meteor
#

hmmm

past meteor
worldly wagon
#

hence why i was able to transition everything easily

past meteor
worldly wagon
past meteor
#

You can compare the non-optimized query plan vs the optimized one if you want to see if you're getting any gains from using a lazyframe

past meteor
#

You can assign new columns with a lazyframe

#

.with_columns works there

#

Or how were you assigning columns?

worldly wagon
worldly wagon
past meteor
#

huh?

#

Anyway, I use polars extensively at work. I'd always tell people to use LazyFrame, especially Pandas users

#

Even if you get no performance benefits at the very least it disables all anti-patterns you can do (anti-patterns you can freely do in Pandas as well)

#

The only operations you can do are quite optimized

worldly wagon
worldly wagon
past meteor
#

There's certain things you cannot do like sorting iirc but you basically need to discipline yourself to collect once and sort at the end or sort at the front and then go lazy. Something like that. Not collecting multiple times and so on

#

So like, from a UX pov it's nudging you to more efficient code 🙂

worldly wagon
#

its not much to improve anyway better to write the proper pattern early

past meteor
#

In your work project?

worldly wagon
#

if something is too slow usually rewrite it in C++

past meteor
worldly wagon
past meteor
#

Anyway, if you have any experience with DBs, just think of it as SQL and query plans there

worldly wagon
#

yea I understand wym

past meteor
#

easy peasy

worldly wagon
#

actually kinda insane i'm ngl 😭 but yea appreciate the help alot

languid moss
past meteor
#

you still need to .collect()

#

It (obviously) takes 0s because it didn't do any work

worldly wagon
past meteor
#

That makes sense

#

because collect is doing all the previous deferred computations
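
To make the "0s until collect" point concrete, here is a toy model of deferred evaluation in plain Python. It is not how Polars is implemented (there is no optimizer here, and `ToyLazy` is a made-up class); it only shows why building the plan is nearly free and the time shows up in `collect()`.

```python
import time


class ToyLazy:
    """Toy stand-in for a lazy frame: records steps, runs them only on collect()."""

    def __init__(self, data, steps=()):
        self._data = data
        self._steps = tuple(steps)

    def map(self, fn):
        # No work happens here; the step is only appended to the plan.
        return ToyLazy(self._data, self._steps + (fn,))

    def collect(self):
        # All deferred steps run now, in order.
        out = list(self._data)
        for fn in self._steps:
            out = [fn(v) for v in out]
        return out


start = time.perf_counter()
plan = ToyLazy(range(200_000)).map(lambda v: v * 2).map(lambda v: v + 1)
build_time = time.perf_counter() - start

start = time.perf_counter()
result = plan.collect()
collect_time = time.perf_counter() - start
print(f"build: {build_time:.6f}s, collect: {collect_time:.6f}s")
```

When timing a lazy pipeline, the measurement needs to include the `collect()` call, otherwise only the plan construction is being timed.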

past meteor
#

So the second function should also return a DataFrame at the end

#

but inside it should only deal with LazyFrames and the very last line should be return df.collect()

past meteor
# worldly wagon

The screenshot has the second one's return type as a LazyFrame though?

worldly wagon
#

i'm fine with using lazyframe until i'm complete with all operations

#

albeit now i'm weighing whether we should write out the cleaned data as opposed to recleaning data sets on every read; that's something i'll prob pick up in a meeting

past meteor
#

It's a buzzword for basically writing raw data in its original schema (bronze), cleaning it a little and writing it (silver) and then maybe doing a couple of final transforms for your end application(s) (gold)

past meteor
#

perfect

worldly wagon
#

my project is historical analysis still in the earlier stages so we're still discussing long term stuff. i got moved from the real time analysis team lol
everyone is concerned with performance but not sure how to obtain it so i've been mostly fronting that load

past meteor
#

Polars is a fine choice then

worldly wagon
#

yea its a major improvement over pandas team is small as well so not a huge issue of adoption

#

pandas had us discerning if to scrap the original python/C++ architecture lol

jaunty helm
#

pretty sure I read somewhere that the "eager API" is just the lazy API but it collects after every operation or smthn
you'll prob see a nice difference when you queue up a large amount of ops then collect, allowing the optimizer to do its thing

worldly wagon
past meteor
#

That was new to me as well, nice that I learnt that

ocean pawn
#

Thanks everyone! I plotted the decision boundary (for data that isn't normalized, with parameters trained on normalized data)! Thanks!

mental rampart
#

how do i

#

store all the values in a column in pandas dataframe in a string

lapis sequoia
ocean pawn
#

Thanks everyone for your help

lapis sequoia
mental rampart
#

how do i unload datatypes from memory once i no longer need them 💀

serene scaffold
serene scaffold
mental rampart
serene scaffold
mental rampart
serene scaffold
#

Int is a data type.
You can delete every int. And you can delete every int in a given column. But you can't delete the whole concept of int from your code.

mental rampart
#

no longer need string in memory*

serene scaffold
#

The string will get deleted automatically.

mental rampart
#

hmmm

#

idk how python manages memory

serene scaffold
#

If you want to delete a column, do del df['col']

serene scaffold
#

If you want to delete something before that, you need to delete all the references to it, so that there are none left.

mental rampart
serene scaffold
mental rampart
mental rampart
#

really

serene scaffold
#

Yes. If you define a variable in a loop, the last value for that variable persists after the loop

#

Until the end of the scope

mental rampart
#

interesting

serene scaffold
#

Same for the "loop variable"
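
A short illustration of both points. The function and variable names are made up for the example.

```python
def demo():
    for i in range(3):
        last_square = i * i  # defined inside the loop body
    # Both names are still bound here: a `for` loop does not create a new scope.
    return i, last_square


print(demo())  # (2, 4)
```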

mental rampart
#

i think imma restudy global and local scopes

serene scaffold
#

Sounds good

#

Keep in mind that python makes it impossible to delete data directly. But it's guaranteed that a variable will never refer to deleted data.

#

You can only delete data indirectly by deleting all its references.
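
One way to watch this happen is with a weak reference, which observes an object without keeping it alive. This sketch relies on CPython's reference counting, where the object is freed as soon as the last reference goes away; other Python implementations may free it later. `Blob` is a made-up placeholder class.

```python
import weakref


class Blob:
    """Hypothetical placeholder for some large object."""


a = Blob()
b = a                   # second reference to the same object
probe = weakref.ref(a)  # weak reference: does not keep the object alive

del a
print(probe() is not None)  # still alive, because `b` refers to it
del b
print(probe() is None)      # last reference gone; CPython frees it right away
```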

mental rampart
#

like ones where the compiler already ran

#

i think if compiler sees it no-longer referenced in future, it decides it went out of scope

serene scaffold
#

You say compiler, but you mean interpreter.

The interpreter doesn't look ahead to see that a reference won't be used again. It only deals with that at the end of a scope.

mental rampart
#

oh ic

#

i forgot python is an interpreted language

serene scaffold
#

But you can manually delete references (not values) with the del keyword

#

!e
a = 5
print(a)
del a
print(a)

arctic wedgeBOT
# serene scaffold !e a = 5 print(a) del a print(a)

:x: Your 3.12 eval job has completed with return code 1.

001 | 5
002 | Traceback (most recent call last):
003 |   File "/home/main.py", line 4, in <module>
004 |     print(a)
005 |           ^
006 | NameError: name 'a' is not defined
mental rampart
#

ohh ic

#

now what u said, makes sense

serene scaffold
#

I'm glad

mental rampart
#

i love that programming communities help each other out

serene scaffold
#

A lot of programmers got a lot of help when they were starting (I did) and want to pay it forward

keen frigate
#

print("Hello World, Im New Here")

ocean pawn
#

Has anyone used lets-plot for python before?

#

The interactive nature of the plot looks interesting, but the syntax is quite different from plt

solemn verge
median grail
#

Do you think it would make sense to make a real jarvis using AI?

serene scaffold
median grail
serene scaffold
#

someone could ask "do you think it would make sense to make pencils?", but I don't understand why someone would ask that.

median grail
#

Dude, do you think it makes sense or not?

serene scaffold
serene scaffold
#

I know you know what you mean, but I sincerely do not.

serene scaffold
median grail
serene scaffold
#

If you're interested in AI and chat bots, there are other projects you can undertake that are attainable

#

This is not the one.

median grail
#

I have 3 chat ai api and 1 imagine ai api

serene scaffold
ocean pawn
serene scaffold
ocean pawn
#

Nor do I know how to generate 3d data

serene scaffold
serene scaffold
#

right

ocean pawn
#

I'll have a look

serene scaffold
#
  • make arbitrary points in 3d space
  • make an arbitrary decision boundary
  • assign each point to "yes" or "no" based on which side of that arbitrary boundary they're on
  • run your code for finding the boundary
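
The first three steps above can be sketched with the standard library. The plane coefficients, the value range, and the point count are all arbitrary choices for illustration.

```python
import random

random.seed(0)

# Arbitrary "true" plane w.x + b = 0 (hypothetical values).
true_w = (1.0, -2.0, 0.5)
true_b = 0.25


def label(point):
    """1 if the point is on the positive side of the plane, else 0."""
    score = sum(wi * xi for wi, xi in zip(true_w, point)) + true_b
    return 1 if score > 0 else 0


points = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(200)]
labels = [label(p) for p in points]
print(sum(labels), "positive of", len(labels))
```

The last step is then to fit the model on `points` and `labels` and compare the learned plane with `true_w` and `true_b` (up to a common scale factor, since scaling w and b together does not move the boundary).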
ocean pawn
#

Would the decision boundary be a plane in 3d?

#

Assuming I use x1 x2 and x3

serene scaffold
#

if it has to be "flat" then yes

#

a plane is the 3d analog of a line

ocean pawn
#

Should be possible

#

Definitely possible, but I mean it's possible for me to make it

serene scaffold
#

but there are problems that can't be separated by a straight line

#

like XOR

ocean pawn
#

I assume I can just add x1**2, x2**2 ,x3**2 to make it curve?

serene scaffold
#

something like that, I think

ocean pawn
#

... Great, I type annotated and type guarded all my functions, and now they only allow 2d arrays

#

nvm

#

I am stupid

#

3d logistic regression looks fine

ocean pawn
#

Something is missing

#

why do some plt functions, like plot_surface, not appear in autocomplete?

#

Oh no, you can't animate plot_surface

serene scaffold
ocean pawn
#

But I can't animate surface

#
def animation(frame):
    global w, b, plot
    if plot is not None:
        plot.remove()
    w, b, w_grad, b_grad = ml.grad_descend(
        w,
        b,
        x_train,
        y_train,
        0.1,
        ml.logistic_cost,
    )
    history.append(ml.logistic_cost(w, b, x_train, y_train))
    x1 = jnp.arange(-10, 10, dtype=jnp.float32).reshape(-1, 1)
    x2 = jnp.arange(-10, 10, dtype=jnp.float32).reshape(-1, 1)
    x3 = ((-b) - w[0] * x1 - w[1] * x2) / w[2]
    plot = ax.plot_surface(
        inverse_normalizer(x1, argnums=(0,)),
        inverse_normalizer(x2, argnums=(1,)),
        inverse_normalizer(x3, argnums=(2,)),
    )
    return (plot,)
#

I don't understand how do I plot a surface
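
A possible reason the surface does not show up, going only by the snippet above: `x1` and `x2` are both single columns over the same range, so `x3` is only evaluated along the line x1 == x2, whereas `plot_surface` wants X, Y and Z as 2D grids covering every (x1, x2) pair. The usual tool for that is a meshgrid (`np.meshgrid` or `jnp.meshgrid`). The sketch below builds the same kind of grids with plain lists so the layout is visible; the parameters and axis values are made up.

```python
def boundary_grid(w, b, axis_values):
    """2D grids X1, X2, X3 for the plane w[0]*x1 + w[1]*x2 + w[2]*x3 + b = 0."""
    X1 = [[x1 for x1 in axis_values] for _ in axis_values]  # varies along columns
    X2 = [[x2 for _ in axis_values] for x2 in axis_values]  # varies along rows
    X3 = [[(-b - w[0] * x1 - w[1] * x2) / w[2] for x1 in axis_values]
          for x2 in axis_values]
    return X1, X2, X3


# Hypothetical parameters and axis range.
X1, X2, X3 = boundary_grid(w=(1.0, 2.0, 4.0), b=-4.0, axis_values=[-1.0, 0.0, 1.0])
print(X3)
```

The nested lists would need converting to arrays (for example with `np.asarray`) before being passed to `ax.plot_surface(X1, X2, X3)`. If the data was normalized, each grid would be inverse-normalized per axis first, as in the original snippet.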

ocean pawn
# ocean pawn ```py def animation(frame): global w, b, plot if plot is not None: ...

Ignoring plot_surface vs plot

def animation(frame):
    global plot
    if plot is not None:
        plot = []
    gradient_descend.next_epoch()
    x1 = jnp.arange(-10, 10, dtype=jnp.float32).reshape(-1, 1)
    x2 = jnp.arange(-10, 10, dtype=jnp.float32).reshape(-1, 1)
    x3 = (
        (-gradient_descend.b) - gradient_descend.w[0] * x1 - gradient_descend.w[1] * x2
    ) / gradient_descend.w[2]
    plot = ax.plot(
        inverse_normalizer(x1, argnums=(0,)),
        inverse_normalizer(x2, argnums=(1,)),
        inverse_normalizer(x3, argnums=(2,)),
    )
    return (plot,)

Is the function version neater or the class version neater?

untold dove
#

i be happy to share it i just completely reworked how my reward system works

serene scaffold
#

The "Jarvis" in the Iron Man movies is more advanced than the most advanced AI that currently exists anywhere, so when someone asks "how do I make Jarvis?", it sounds like they have delusions of grandeur.

If someone wants to attach ChatGPT to a TTS system, that's a very far cry from Jarvis, but it is doable.

untold dove
#

i completely agree with that statement jarvis is not buildable with ai the way it is today

#

chat gpt isnt even close but im saying ive created a smaller manageable version

#

do u agree?

serene scaffold
#

I can't get into all my thoughts about that at the moment
I'm a language AI professional.

untold dove
#

honestly id be interested to see what you think of mine if you call it awful so be it yk just feedback is helpful

serene scaffold
#

ye

untold dove
#

shot u a dm

#

dont want to post the full program on here

serene scaffold
#

why not

#

you already said the name of it, so anyone can search "BJW333 argus"

untold dove
#

true lmao didnt even think of that haha

#

lmk what ur opinion is of it

plain totem
#

Does anyone know how to use neat

kindred isle
#

Does anyone know how to access ChatGPT 3.5??

#

I only get 4o and 4o-mini even when I'm not logged in, I needed 3.5 specifically for completing a project and now it's gone

agile cobalt
kindred isle
#

Oh man

#

Alright so I'll need to use the API, got it

#

thanks!

unique spoke
#

Hey guys. How would I be able to detect whether someone is looking at their phone using the phone camera itself

#

I made the code to detect the eyes

#

but what calculations or idk ml model can I use for checking whether the person is looking at the phone

#

Asking chatgpt just told me about the Eye aspect ratio but all that does it check whether you are blinking or not

#

Any other ideas because the only thing I can get from that is that people dont blink as much looking at the phone but thats not close to good enough

spare forum
#

Even LeCun said that AGI is an illusion but OK

serene grail
#

Do you want to elaborate on those ideas?

spare forum
#

That's kinda a problem that you don't know who that is

serene grail
#

Some people argue that human intelligence isn't general intelligence either, it's just a powerful narrow intelligence
And that intelligence can't be general in principle
I think that's kind of pedantic in this context but it could be a useful distinction sometimes

serene grail
spare forum
#

Current LLMs are fancy autocomplete

#

Hype will get crushed in some time

serene grail
#

You don't necessarily need anything more than fancy autocomplete

#

This I agree with

spare forum
#

Except, I hope, you can use logic

serene grail
#

I don't know what a convolution kernel is, I'll read about that

spare forum
#

we can still use it, there is almost none in LLM now except a sequence on trained data

livid pulsar
#

bro can u tell me how to implement chatgpt api in a discord bot

unique spoke
#

hey

#

been looking at this

spare forum
#

You can have context, there is self attention and all, but at the end, we can't say it's really intelligent, without training on terabytes costing millions of dollars it's not that clever

unique spoke
#

Only problem im running into is a module not found error

#

after git cloning it

#

idk why

#

Not exactly but a lot of useful functions

serene grail
#

I would agree that LLMs in their current form don't have "true reasoning" (or they are extremely bad at it), I can't really define what true reasoning is but it's a "you know it when you see it" sort of thing

unique spoke
#

I saw something nvidia was working on but its not released to the public

serene grail
#

I'm very excited to see the field in the next decade though, things could flop or go in a very interesting way

#

Yeah it's unfortunate

spare forum
#

But tbh the "AI that replaces swe, DS" etc... Is a fantasy at the moment for me, we can have the time to be retired before this happens

lapis sequoia
#

you've to take researchers opinions' with a grain of salt though

#

witten is a genius, hasn't really been correct in many ways

spare forum
#

I still prefer to take researchers opinions rather than Nvidia CEO or Elon musk

lapis sequoia
#

i do agree with lecun though, but ai's hype isn't really related to AGI

#

it's because there is a lot of money for large companies, and they dont need to get to agi for it..

#

scaling up models will make them better (read as: more useful and a replacement for more jobs.), even though it won't make them agi

livid pulsar
#

'''

clear raft
#

im trying to create this vehicle detection model and as of now it detects,counts and tracks using deep sort. im using pretrainied yolo weights. now I want to record which path the car is using (for example, in a junction). any suggestions?

scenic parcel
#

For time series is it necessary to transform 8:50 a.m. July 25, 2024, into year: 2024, month: 7, day: 25, hour: 8, minute: 50 ? (so 5 columns instead of 1)
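
Whether the split is necessary depends on the model, but for reference here are two common ways to turn that timestamp into features, as a sketch with the standard library. The column names (`tod_sin`, `tod_cos`) are made up.

```python
import math
from datetime import datetime

stamp = datetime.strptime("2024-07-25 08:50", "%Y-%m-%d %H:%M")

# Option 1: plain calendar components (the five columns from the question).
parts = {
    "year": stamp.year,
    "month": stamp.month,
    "day": stamp.day,
    "hour": stamp.hour,
    "minute": stamp.minute,
}

# Option 2: cyclical encoding of time of day, so 23:59 and 00:00 end up close.
minutes_of_day = stamp.hour * 60 + stamp.minute
angle = 2 * math.pi * minutes_of_day / (24 * 60)
cyclic = {"tod_sin": math.sin(angle), "tod_cos": math.cos(angle)}

print(parts)
print(cyclic)
```

Many time-series models only need the timestamp as an ordered index, in which case neither option is required; the split mostly helps models that should pick up on seasonality such as hour-of-day or month effects.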

agile cobalt
scenic parcel
agile cobalt
past meteor
#

Agreed. The human brain needs a fraction of the energy and training data for new tasks

#

We do multimodal, multitask few shot learning by default

serene grail
#

Ehhh, I don't know about the training data

past meteor
#

Most definitely

#

This is the thesis topic of my girlfriend (experimental psych)

serene grail
#

How much training data has a 3 year old has had to be able to detect edges of objects and identify them as separate objects from something else?

past meteor
#

I'm too lazy to type out the entire premise but we simply need way less data

#

To learn new tasks

#

Because we don't start from 0

serene grail
#

You aren't counting all the frames in this baby's lifespan before that

past meteor
#

If I show you 5 pictures of a dog breed you'll know what it is

#

maybe even 2

serene grail
#

I come pretrained

past meteor
#

She's pretraining as well

serene grail
#

You need a baby with 0 seconds out of the womb

serene grail
#

I think way less

past meteor
#

Ok but that's the uninteresting premise

#

It teaches us very little

#

Being able to adapt foundation models with a similar amount of data for new tasks (few shot or even zero shot learning), that's the interesting premise imo

#

In her thesis she "trains" humans to do a certain task with abstract shapes (so humans don't have a big bias)

#

And does the same with CNNs

serene grail
#

IMO that would have to be performance above human level when it comes to performance to data ratio

past meteor
#

There's always the asterisk you can put on ML/DL studies "did you use the right hyperparameters" etc.

#

But the results are quite clear that humans have no problems in finding the decision boundaries correctly and interpolating/extrapolating to other abstract shapes

#

Whereas NNs struggle to reason in that space

#

The same NNs that can outcompete humans on say the imagenet dataset

#

Right now we're at "NNs can beat us humans at many tasks if we're willing to spend €$£ to create (or finetune) a specialized model for each task"

spare forum
#

How is that contradictory? That's two different things

spare forum
#

Both are important

spare forum
#

Yet nobody doesn't know Alan Turing in CS lol

#

It's cool to have basic culture

topaz abyss
#

how the hell have i managed to fuck up like this

#

a bad dataset, and terrible training

spare forum
#

It's not a goal to know every scientist or whatever, but he is one of the top experts. That doesn't mean everything he says is pitch perfect, but it's always good to know what the most knowledgeable person in a field says about their field rather than speculating as simple students/practitioners

#

He's literally written articles, whatever

dry raft
#

Hey guys, I wanna get into medical field but I love AI and math as a whole, but I don't know if it will be worth it or not. Should I keep doing it?

serene scaffold
spare forum
#

Reflexion by quote yet doesn't want to know scientist of what you use look funnyducky_concerned

dry raft
#

i just wanna know if it's still good to learn

serene scaffold
dry raft
#

Thank you!

serene scaffold
#

unless you can get a summer research internship. I used to work in a lab that took high school interns, but those people wanted to do AI as their primary career focus.

#

to be clear, I think that you should read/watch more about AI and try implementing basic models, but for your own edification.

lapis sequoia
dry raft
lapis sequoia
serene scaffold
#

@dry raft to clarify, do you want to be a healthcare provider (doctor, surgeon, etc) or some kind of biomedical researcher?

lapis sequoia
dry raft
#

Like disease detection, protein folding, etc.

lapis sequoia
#

if you have pre-med programs in ur country u should take compsci

dry raft
#

btw thanks stelercus, I'm starting to understand stuff more!

lapis sequoia
#

you're wrong

#

it is absolutely possible

dry raft
#

btw i'm an incoming freshman

lapis sequoia
#

this person has 4 years before going to college, 4 years is largely enough if they start learning data science and algorithms first

#

so it is literally possible

dry raft
lapis sequoia
dry raft
lapis sequoia
#

well thats a good start

dry raft
#

I want to reimplement a paper sometime soon tho

#

But I also need to learn some basic pytorch

lapis sequoia
#

i get what you mean, but the field of ai they're interested in is related to their main job, which is doable since it's not 2 completely different things

dry raft
#

Ig that's true

#

but thank y'all

iron basalt
#

And then you don't forget it either as new data arrives (no catastrophic interference). Other important difference is that the order in which you see them matters, it's not like deep learning in which we randomize our order of inputs (and resample them randomly). (Some) neural networks also lack statistical consistency, you could give two of them the same infinite data and they may never arrive to the same conclusions (initialization matters a lot (and again the order which they are given)).

#

(Most ML methods really want statistical consistency, for obvious reasons)

iron basalt
#

Also note there are no epochs, since it's single shot, once through.

#

(This means you can't plot a nice graph for it learning, so you can't directly compare it to deep learning papers)

iron basalt
upbeat kernel
#

I need help with some simple code for ai. Please respond if you can help. I am willing to pay. Thanks!

serene scaffold
upbeat kernel
#

Am I allowed to say that lol

#

ohh okay

#

I have a question for an assignment for a summer educational class on ai, but i haven't even taken machine learning.

#

It looks relatively simple but the professor never taught us anything about how to code ai.

serene scaffold
#

what are students in this course expected to already know?

upbeat kernel
#

He said "It will be for everyone"

serene grail
#

Are all the questions like this? Maybe some questions are optional and you aren't expected to get all the points

serene scaffold
iron basalt
# upbeat kernel

This is way too open ended, allowing it to range from trivial to the hardest unsolved problems.

upbeat kernel
serene scaffold
upbeat kernel
#

he said he would check for people using chatgpt💀

serene scaffold
#

if you have no idea how to approach this problem, it would take several hours of someones time to walk you through all the concepts needed for you to solve it.

upbeat kernel
#

This is why I majored in math and not comp sci😭

serene scaffold
#

do you already "know python"?

upbeat kernel
#

Yes I took the same professor for introduction to programming and I made an A

#

It was all python

iron basalt
upbeat kernel
#

But ai is different

#

I asked the best programmers i know but they said they never learned ai

iron basalt
#

Random noise is a valid submission.

#

It did not say the AI needs to be good...

#

Data preprocessing: ignore all data. Model architecture: random uniform noise generator. Training and evaluation: No training, evaluation. Generated sample visualization: converted noise to an image using Pillow (or some other image library).

#

Alternative: it just stores the entire data set and spits out one of them as the generated image.

iron basalt
hearty depot
#

model and be like "transfer learning"

upbeat kernel
hearty depot
# upbeat kernel how do i do that

a lot of online models tend to have their weights in onnx format so just download weights and load into favorite framework and u can fine tune as needed

lapis sequoia
serene grail
lapis sequoia
serene grail
#

I think so?
I think it says that with large enough models, you can find the local minimum instead of the global minimum and it works fine
So not the place where the function's output is at its lowest but just one of the places where it's lower than everything around it
And there are many of these points and it's way easier to find them than the global minimum
But then there's some other stuff with the saddle points which I didn't get
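
To make the local-vs-global distinction concrete, here's a toy 1-D sketch. The function, learning rate and starting points are made up for illustration and have nothing to do with the paper being discussed. It only shows the definitions: plain gradient descent ends up in a different minimum depending on where it starts. In this toy the local minimum is clearly worse than the global one; the claim in the summary is that for large networks the local minima tend to be close in quality, which a 1-D example can't show.

```python
# Toy illustration: f has two minima, a global one near x = -1.30
# and a shallower local one near x = 1.13. The function itself is an
# arbitrary choice for this sketch.

def f(x):
    return x**4 - 3 * x**2 + x

def grad_f(x):
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain gradient descent from the starting point x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

x_left = gradient_descent(-2.0)   # ends in the global minimum
x_right = gradient_descent(2.0)   # ends in the local minimum

print(f"start -2.0 -> x = {x_left:.3f}, f(x) = {f(x_left):.3f}")
print(f"start +2.0 -> x = {x_right:.3f}, f(x) = {f(x_right):.3f}")
```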

lapis sequoia
#

yes, quite a striking property

#

nice one line summary

with large enough models, you can find the local minimum instead of the global minimum and it works fine

#

i'd say 'local minima' instead of 'local minimum'

serene grail
#

Oh okay

lapis sequoia
#

(just bc there are many normally)

serene grail
#

And global minima too?

lapis sequoia
#

i'm unsure what's the convention there !

#

another phrasing would be 'you can find any local minimum...' maybe

serene grail
#

I see, thank you

carmine tundra
#

Hi guys👋
Please correct me if I am not allowed to comment here.

I am a complete newbie.

Is there someone who can help me learn how to build with Autogen from Microsoft? They have a github page. You can google it. They have tutorials but I am a beginner and they are hard to follow. I would also like to adjust the code for my use case.

Also would be cool if you play LOL or have PlayStation console so we can combine coding with playing and having fun😂

https://github.com/microsoft/autogen

lapis sequoia
#

just remove the 'pay' part, it's not allowed iirc

#

(it's not that we don't need money.)

carmine tundra
#

I mean I can buy LOL skins instead of directly transferring

lapis sequoia
#

idk anything about it, you could post it in some freelancing platform

#

probably others here do know though, so wait jic

lapis sequoia
#

it's related to bengio's hypothesis

#

i.e: higher dim spaces (in other words nets with many neurons) have a proliferation of saddle points that optimisation algos struggle to get around.

#

pretty sure there is more recent research on this

tepid parcel
#

Heeey!!!
Good Morning!

Does somebody know the best ML (Machine Learning) engineer roadmaps to base my studies on?

I appreciate any help

serene grail
lapis sequoia
#

you are welcome

lapis sequoia
#

(...) both articles believe that in order to reach the global minimizer, one must overcome the optimization challenge of saddle points. The first article just believes that local minima are good enough.

#

One common folklore belief is that the optimization landscape is similar to that of an egg carton,
XD

serene grail
#

🥚 ❓

carmine tundra
lapis sequoia
#

maybe depends on how OP imagines the box...as some don't have saddle points

carmine tundra
unkempt apex
#

come on ping that , nvm I will check that

unkempt apex
#

Autogen!!

#

whoa , I lit. read this first time!!😂 ( sry about that)

carmine tundra
unkempt apex
#

at what point you are having problem?

carmine tundra
unkempt apex
#

I don't think so, but wait someone will def.

carmine tundra
unkempt apex
#

you are a beginner and directly hopping into autogen?

carmine tundra
carmine tundra
unkempt apex
#

yeah so wait , someone will def. respond to this!

unkempt apex
#

what inputs are you giving to it?

#

dataset?

#

link that!!!!!

#

okay got it

#

it was first one

#

now send the layer code!!

#

model code

hard shuttle
#

yes

unkempt apex
#

dataset is good, for inference which picture are u giving to it?

#

( to test the model)

hard shuttle
#

Like I have partitioned it from the same dataset. So technically from it

#

like for training, val and testing

unkempt apex
#

the dataset is not labelled I guess right?
like which photo has which characters

#

so what about loss?

hard shuttle
unkempt apex
#

wait wait, what about ur last fc layer??
I think there is a mistake there

hard shuttle
hard shuttle
river cape
#

Any idea as to why do we use -e .

hard shuttle
unkempt apex
#

why 4 conv2d layers??

and 3rd and 4th are same I guess
3rd converts -> 128 - 256
4th converts -> 256 -256
why>

hard shuttle
#

256 - 512?

unkempt apex
unkempt apex
# hard shuttle 256 - 512?

yeah!!, but still why 4 layers??, I mean you can but then you have 2more fc layers!!!!

because images are not that complex, ( to add more layers)
maybe for now work with 4 and add 2 fc layers

river cape
hard shuttle
unkempt apex
unkempt apex
unkempt apex
#

@hard shuttle make changes and send again the model code

hard shuttle
unkempt apex
#

don't worry about overfitting we have dropout layers if required!!,

first it needs to predict something

hard shuttle
#

okay

unkempt apex
#

suppose the captcha has this word -> zvQn

now you have inserted this into model , but the thing is
your model will analyze the picture
and "classify" into only one letter!!!!

#

here is the catch!!

#

that's why it was giving u only one letter!

#

shit happens!

hard shuttle
#

yes. I tried it with val dataset and test sets. Its only predicting one letter. Not all

unkempt apex
#

yeah, which means it is predicting one letter!! but we are giving it more than one!!
and the model is classifying the whole picture as that one letter!!

hard shuttle
unkempt apex
#

be clear for yourself first and provide correct info

hard shuttle
#

Sorry. There had been instances where it was only predicting one letter of the captcha. It does give 5 characters as an output, but it was just random occurrences of only one letter being correct. I just tested it again and the issue still remains. None of the letters of the captcha from the val or test dataset are correct.

unkempt apex
#

can you give screenshot of it?

hard shuttle
#

The current output is - Actual - 3Cj81 ; Predicted - 1l2M7

unkempt apex
unkempt apex
hard shuttle
unkempt apex
#

we still need to change that!

#

send code not ss!

hard shuttle
#

just the model code?

unkempt apex
# river cape Wait so if i change the code in any of my files , I dont need to reinstall the p...

lemme explain briefly

suppose you created a package (python package in a separate dir), and now you are using that in your python code

now if you make changes in the package, you have to reinstall it to use them in your python code,
but when you install it as editable (the -e in pip install -e .), it will do it for you automatically!

so any changes in the package will get automatically reflected in all your code where you are using that package!

unkempt apex
hard shuttle
#

is this fine to paste like this?

unkempt apex
#

use this [ ` ] 3 times and then write py

and at end again 3 times

#

what is input size of image?

unkempt apex
hard shuttle
#

dimensions?

unkempt apex
#

yeah

#

like 128 * 128

#

training on GPU??

hard shuttle
#

256*256

hard shuttle
unkempt apex
#

which?, just curious for time taken to train as per epochs

hard shuttle
#

1 epoch took 60-80 secs

unkempt apex
#
import torch
import torch.nn as nn

class NewCNN(nn.Module):
    def __init__(self):
        super(NewCNN, self).__init__()
        self.num_class = 62  # 10 digits + 26 lowercase + 26 uppercase
        self.num_char = 5    # 5 characters per CAPTCHA

        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1),
            nn.MaxPool2d(2, 2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Conv2d(32, 128, 3, padding=1),
            nn.MaxPool2d(2, 2),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.Conv2d(128, 256, 3, padding=1),
            nn.MaxPool2d(2, 2),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.Conv2d(256, 512, 3, padding=1),
            nn.MaxPool2d(2, 2),
            nn.BatchNorm2d(512),
            nn.ReLU(),
        )

        self.fc1 = nn.Linear(512 * 16 * 16, 512)  
        self.fc2 = nn.Linear(512, self.num_class * self.num_char)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.conv(x)
        x = x.view(x.size(0), -1)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x
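
A quick way to sanity-check the `512 * 16 * 16` in `fc1` is to walk the spatial size through the four conv + pool stages by hand. The sketch below assumes a square 256x256 input (the size mentioned in the chat) and the kernel, padding and stride values from the model above; the helper names are made up for this example.

```python
def conv2d_out(size, kernel=3, stride=1, padding=1):
    """Output length along one spatial axis for a conv layer."""
    return (size + 2 * padding - kernel) // stride + 1

def pool2d_out(size, kernel=2, stride=2):
    """Output length along one spatial axis for a max-pool layer."""
    return (size - kernel) // stride + 1

def flattened_features(input_size=256, channels=(32, 128, 256, 512)):
    """Walk the four conv+pool stages and return (fc1 input size, side)."""
    size = input_size
    for _ in channels:
        size = conv2d_out(size)   # 3x3 conv, padding 1: size unchanged
        size = pool2d_out(size)   # 2x2 pool, stride 2: size halved
    return channels[-1] * size * size, size

features, side = flattened_features(256)
print(side, features)
```

If the images were a different size (say 128x128), the same helper gives the value `fc1` would need instead, which is a common source of shape errors.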

unkempt apex
hard shuttle
#

gpu-4070

unkempt apex
#

and yeah try now by this code
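
Since the last layer outputs `num_class * num_char` = 310 scores, the prediction has to be read as 5 separate 62-way choices, not one. Here is a plain-Python sketch of that decoding step. It assumes the scores are laid out position by position (the layout `x.view(-1, 5, 62)` would give in PyTorch) and that labels are ordered digits, lowercase, uppercase; both are assumptions, and the real order must match however the training labels were encoded.

```python
import string

# Assumed label order: digits, then lowercase, then uppercase. The real
# order has to match whatever was used to encode the training labels.
CHARSET = string.digits + string.ascii_lowercase + string.ascii_uppercase
NUM_CHAR = 5

def decode(logits, charset=CHARSET, num_char=NUM_CHAR):
    """Split a flat list of num_char * len(charset) scores into one group
    per character position and take the argmax of each group."""
    num_class = len(charset)
    assert len(logits) == num_char * num_class
    chars = []
    for pos in range(num_char):
        group = logits[pos * num_class:(pos + 1) * num_class]
        best = max(range(num_class), key=group.__getitem__)
        chars.append(charset[best])
    return "".join(chars)

def char_accuracy(actual, predicted):
    """Fraction of positions where the predicted character matches."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)

# Fake scores that put the highest value on "3Cj81", position by position.
fake = [0.0] * (NUM_CHAR * len(CHARSET))
for pos, ch in enumerate("3Cj81"):
    fake[pos * len(CHARSET) + CHARSET.index(ch)] = 1.0

print(decode(fake))
print(char_accuracy("3Cj81", "1l2M7"))
```

The loss would follow the same split: one cross-entropy term per character position, summed or averaged over the 5 positions.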

hard shuttle
#

Are the changes i made correct?

#

especially the conv layers and fc

hard shuttle
#

okay

#

One more thing. The dataset

#

My test set contains 824 samples and the training contains 3k samples

#

is it fine?

unkempt apex
#

yeah, 80-20 split!

#

check that first

#

or just ignore for now!

hard shuttle
#

I know this is wrong but are we supposed to split the dataset just by copy/paste it manually?

unkempt apex
#

for smaller it is fine
but well created dataset comes with split!
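
Instead of copy/pasting files by hand, the split can be done with a few lines of Python. This is only a sketch: the file names are hypothetical stand-ins for the captcha images, and the 20% test fraction and the seed are arbitrary choices.

```python
import random

def split_dataset(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_fraction)
    return items[n_test:], items[:n_test]

# Hypothetical file names standing in for the captcha images.
files = [f"captcha_{i:04d}.png" for i in range(3824)]
train, test = split_dataset(files)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so the same images stay in the test set across runs.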

hard shuttle
unkempt apex
#

are you training or not?

hard shuttle
#

Can i start?

unkempt apex
#

yeah !!

#

how many epochs?

hard shuttle
#

100

#

Should i reduce ?

unkempt apex
#

no it's fine

hard shuttle
#

bs - 128

unkempt apex
#

you can use early stopper

hard shuttle
#

I havent implemented that
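
For reference, early stopping doesn't need much code. A minimal sketch, framework-independent: the class name, `patience` and `min_delta` are choices made for this example, and the validation losses are made-up numbers just to exercise the logic.

```python
class EarlyStopper:
    """Stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss      # improvement: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1      # no improvement this epoch
        return self.bad_epochs >= self.patience

# Made-up validation losses, just to exercise the logic.
val_losses = [2.0, 1.5, 1.2, 1.25, 1.3, 1.22, 1.4, 1.5]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

In a real training loop, `should_stop` would be called once per epoch with that epoch's validation loss, usually together with saving the weights whenever `best` improves.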

unkempt apex
#

then fine!

#

just train I need to see result fast!

hard shuttle
#

okay

#

It's slow, it took almost 3 mins for one epoch -

unkempt apex
#

it should for first time!

hard shuttle
#

I see

#

It is a problem if the test loss is not completely below the train loss right

unkempt apex
#

the goal of training model is to reduce the val loss, not to compare train loss and val loss!!

hard shuttle
#

Yeah but the final val loss should be below train loss?

#

Is it supposed to be like that?

unkempt apex
#

ignore that !!, also plot first both losses

hard shuttle
#

Okay

unkempt apex
#

lower epochs to 50!!

#

100 is too high for this dataset!

hard shuttle
#

Can i do that while its on training?

unkempt apex
#

what is current epoch?

hard shuttle
#

6

unkempt apex
#

stop that

hard shuttle
#

alright

unkempt apex
#

it is taking too much time I guess

#

start with 30 only

hard shuttle
#

okay

#

This was how it was so far

unkempt apex
#

ignore

hard shuttle
#

Started

unkempt apex
#

give me code for how are you calculating training and val loss

hard shuttle
#

Too long to send. Ill send you ?

unkempt apex
#

use pastebin

#

from here

hard shuttle
#

Its not working after clicking paste

unkempt apex
hard shuttle
#

and crossentropy and higher lr?

#

okay

unkempt apex
#

don't cancel training

#

what about current epoch?

hard shuttle
#

at 5

hard shuttle
unkempt apex
lapis sequoia
coral field
#

I haven't really been able to find good research papers on this topic, but what is the point of polynomial feature expansion in the context of nonlinear features, and if my features have nonlinear relationships, would there be any point in using polynomial feature expansion before using a nonlinear dimensionality reduction like KernelPCA?

past meteor
#

Can you confirm that's the question? If so, I like it 😄

wooden sail
#

i would say the strongest motivation is that polynomials have strong theoretical guarantees regarding their approximation capabilities for functions satisfying more or less mild regularity constraints

#

you can both interpret what you're doing and know how bad the error is going to be

#

you can also endow polynomial bases with nice properties pretty easily

#

ofc if you have a deep enough network you never need to do any of this type of pre-processing explicitly. but if you want to use fewer layers, you can treat poly feature expansion as "model-based machine learning" where you know that what you're doing is approximating a function with a low order poly, and then feed that into a network that does other stuff with those features. similar to how wavelet decomposition can shave off several layers in models dealing with data that lends itself well to approximate low dimension wavelet decomps

#

what comes to mind is the weierstrass approximation theorem, and you can probably find more about poly feature expansion in papers from like 1980~2000 because it's what i'd call a "classical signal processing technique"

#

to address your question more directly, i would say the point of poly feature expansion is that its behavior in classical settings is well known, but ideally one should have good reasons to believe that a poly expansion will work well for the problem. you're then left with the problem of which basis and what order of poly to use. the first isn't super important. for the latter, you can pick a modest number and then do feature selection/dim reduction to keep the terms that contribute the most

#

roughly the same as with any other "encoding" approach
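
For anyone unsure what the expansion actually produces, here is a small standard-library sketch (not any particular library's implementation; the function name is made up). It lists every monomial of the inputs up to a chosen degree, which also shows why the order should stay modest: the feature count grows quickly with degree and number of inputs.

```python
from itertools import combinations_with_replacement
from math import prod

def polynomial_features(x, degree=2):
    """Return all monomials of the entries of x up to `degree`,
    starting with the constant term 1."""
    features = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(len(x)), d):
            features.append(prod(x[i] for i in combo))
    return features

# For x = (x1, x2) and degree 2 the order is: 1, x1, x2, x1^2, x1*x2, x2^2
print(polynomial_features([2.0, 3.0], degree=2))

# Feature count for 3 inputs at degree 3
print(len(polynomial_features([1.0, 2.0, 3.0], degree=3)))
```

After expanding, a feature selection or dimensionality reduction step can keep only the terms that contribute the most.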

past meteor
#

I was going to give the basic example of decision trees. They can approximate polynomials

#

But they're greedy, if you don't form that polynomial explicitly ahead of time as a feature you have no guarantees it'll do the 4-5 consecutive splits you may need (depending on the case)

#

Explicitly doing it gives your model space to learn other, more relevant things, say interactions with your polynomial and other features

#

Neural nets are definitely more opaque but the same idea holds there, decision trees are an easy way to make it "visual"

lapis sequoia
ashen sable
#

umm...so i am starting out with pytorch and i have a small doubt: is there any rule for how many hidden layers we can use? i see many ppl adding 3 or 4...so why only 3 or 4, why not maybe 50, cause the more the merrier? and what are the cons of using too many hidden layers? could anyone pls explain

remote ridge
#

PS- it will also be helpful if the people who have some idea can tell me an alternative of handling large data for analysis and training models

remote ridge
# ashen sable umm...so i am like starting out with pytorch and i have a small doubt that is th...

if there was a "limit", there would be no LLM's
The people who said that maybe meant with respect to Model complexity vs amt of data
there are many other things to consider when training a neural net like
overfitting,Increased Computational Cost(money vs efficiency of prediction), time can be another factor
optimization is difficult as we add more layers cuz there are lots of local minima (you can use advanced optimizers which may help, but i think you would still have to iterate through random initializations to find the best minimum), vanishing gradient is another issue...

unkempt apex
remote ridge
ashen sable
ashen sable
ashen sable
jaunty helm
jaunty helm
jaunty helm
ashen sable
finite lodge
#

Hi all, I'm trying to move the legend of a seaborn plot to the bottom, after "bill_length_mm", but I don't understand how bbox_to_anchor works...
Thanks in advance

wooden sail
#

looks like (0,0) is the lower left corner and (1,1) is the upper right corner of the image, and the lower left corner of the legend is placed there

finite lodge
#

Alright, but what does the (.5, 1) mean?

#

And what do I have to do to place the legend at the bottom instead of the top

wooden sail
#

as i said, it's the relative location where you want to place the legend

finite lodge
#

Then I would use (0.5, 0) to place the legend at bottom?

#

Assuming the legend would "push" the plot to the top

#

If that makes sense

wooden sail
#

that's a good question, you'll have to try and see. it might be that you need to use a negative y coordinate instead of 0

finite lodge
#

bbox_to_anchor=(0.5, 0) puts the legend over the plot, and bbox_to_anchor=(0.5, -1) puts it way below the plot

wooden sail
#

yeah, remember 1 is the size of the image

#

so -1 is way below it

finite lodge
#

So I think something like bbox_to_anchor=(0.5, -0.1) would work
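
The arithmetic behind those numbers can be sketched without matplotlib. This toy helper is not a matplotlib function, and the 640x480 box is a hypothetical size. It maps a fractional anchor to pixel coordinates relative to the reference box, which shows why -1 lands a full box height below the plot while -0.1 lands just under it. Which point of the legend ends up on the anchor is controlled separately by the `loc` argument.

```python
def anchor_to_pixels(anchor, box_origin, box_size):
    """Map a fractional (x, y) anchor to pixel coordinates, where (0, 0) is
    the lower-left corner of the reference box and (1, 1) the upper-right."""
    ax, ay = anchor
    x0, y0 = box_origin
    width, height = box_size
    return (x0 + ax * width, y0 + ay * height)

# Hypothetical reference box: 640 x 480 pixels with its corner at (0, 0).
for anchor in [(0.5, 1), (0.5, 0), (0.5, -0.1), (0.5, -1)]:
    print(anchor, "->", anchor_to_pixels(anchor, (0, 0), (640, 480)))
```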

finite lodge
past meteor
#

I'd say the reason why people don't go very very deep is occam's razor tbh

#

But in a practical sense, it's better to train a large model and remove layers than do the inverse

past meteor
#

the largest issue you'll have is that training will take much longer and you'll overfit (but you can counteract this with dropout etc)

ashen sable
past meteor
#

From a practitioner's pov I'd rather have 5 normal layers than 50 layers with dropout, batchnorm, skip connections, L2 regularization, ...

ashen sable
#

ouh i'm still learning those terms 😅

past meteor
ashen sable
past meteor
#

the ELI5 version of my answer is this: big network => overfit and takes longer to train. You can make it not overfit with tricks but that's more effort than a small net

#

BUT when starting out on a new task going with a big network is smart, then you can check if you can at least fit the training set correctly. If you can't do that maybe you have a bug

past meteor
ashen sable
past meteor
#

You'll probably overfit after some epochs

ashen sable
#

lol idk it became saturated after some epochs

ocean pawn
#

matplotlib gone wrong

#

Apparently, deepcopying a plt item creates a new window

coral field
lapis sequoia
#
  1. and most random initialisations should get to different minima, but similar in quality..(same article.)
#
  1. vanishing gradient isn't that frequent of an issue using ReLU iirc. This depends on the arch. though..
coral field
#

oh im just running through a project where i'm trying to classify heatwave risk using demographic factors

past meteor
#

Sounds like good ol' gradient boosting does the trick

#

I'd run that without any feature engineering to set a baseline

coral field
#

yep, so with just the base MLP classification it got ~96 training ~92 validation

ocean pawn
#

My apologies for a stupid question.
But does 1/m vs 1/2m make any difference?
Can I mix them together? E.g. 1/m for cost, 1/2m for regularization?
thanks!

#

.latex

\[\frac{1}{m}\sum_{i=1}^{m} (f_{\vec{w},b}(\vec{x}^{(i)})-y^{(i)})^{2}+\frac{\lambda}{m}\sum_{j=1}^{n}w_{j}^{2}\]
\centerline{vs}
\[\frac{1}{2m}\sum_{i=1}^{m} (f_{\vec{w},b}(\vec{x}^{(i)})-y^{(i)})^{2}+\frac{\lambda}{2m}\sum_{j=1}^{n}w_{j}^{2}\]
strange elbowBOT
ocean pawn
#

Nice

#

I assume it makes no difference except it makes differentiation easier?

#

would this be a valid implementation for regularization?

(cost+jnp.mean(jnp.pow(w,2)))*lambda_
#

Nope

#
cost+(jnp.mean(jnp.pow(w,2)))*lambda_
wooden sail
#

no, those have the same minimizer

#

it's just so that the derivative has no dangling 2 in front
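
Spelling that out with the two costs from the question, writing $J_{1}$ for the $\frac{1}{m}$ version and $J_{2}$ for the $\frac{1}{2m}$ version (these two names are introduced here, the other symbols are the ones already used):

```latex
\[
J_{2}(\vec{w},b) = \tfrac{1}{2}\,J_{1}(\vec{w},b)
\quad\Longrightarrow\quad
\arg\min_{\vec{w},b} J_{2} = \arg\min_{\vec{w},b} J_{1}
\]
\[
\frac{\partial J_{1}}{\partial w_{j}}
 = \frac{2}{m}\sum_{i=1}^{m}\bigl(f_{\vec{w},b}(\vec{x}^{(i)})-y^{(i)}\bigr)\,
   \frac{\partial f_{\vec{w},b}(\vec{x}^{(i)})}{\partial w_{j}}
 + \frac{2\lambda}{m}\,w_{j}
\]
\[
\frac{\partial J_{2}}{\partial w_{j}}
 = \frac{1}{m}\sum_{i=1}^{m}\bigl(f_{\vec{w},b}(\vec{x}^{(i)})-y^{(i)}\bigr)\,
   \frac{\partial f_{\vec{w},b}(\vec{x}^{(i)})}{\partial w_{j}}
 + \frac{\lambda}{m}\,w_{j}
\]
```

Mixing the two (a $\frac{1}{m}$ data term with a $\frac{\lambda}{2m}$ penalty) is also fine: it is the same as $J_{1}$ with $\lambda$ replaced by $\lambda/2$, so it only changes what value of $\lambda$ you end up tuning to.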

ocean pawn
#

Assuming the cost is calculated already

wooden sail
#

that'd work regardless of whether you put the 1/2 or not as long as the lambda is chosen correctly, sure

ocean pawn
#

I was a bit concerned as I am using mean instead of doing the \frac{\lambda}{m}

wooden sail
#

it's the same thing

ocean pawn
#

I know it is, I was concerned that my math (might) be horrible

#

I think I'm ok

#

Thanks

wooden sail
#

i would have some numerical concerns cuz idk how numpy and jax compute the mean. hopefully the 1/m is multiplied first, otherwise you can have overflows

#

other than that, the expression itself is correct
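
One detail worth checking by hand: `mean` over the weights divides by the number of weights n, while the formula divides by the number of samples m. The two penalties still describe the same regularizer, just with a rescaled lambda, which fits the "as long as the lambda is chosen correctly" caveat. A plain-Python sketch with made-up numbers (no JAX needed; the function names are invented for the example):

```python
def l2_penalty_mean(w, lam):
    """lam * mean(w_j^2): divides by n, the number of weights."""
    return lam * sum(wj * wj for wj in w) / len(w)

def l2_penalty_over_m(w, lam, m):
    """(lam / m) * sum(w_j^2): divides by m, the number of samples."""
    return lam / m * sum(wj * wj for wj in w)

w = [0.5, -1.0, 2.0]   # made-up weights, n = 3
m = 100                # made-up number of training samples
lam = 0.1

a = l2_penalty_mean(w, lam)
b = l2_penalty_over_m(w, lam * m / len(w), m)   # rescaled lambda
print(a, b)
```

So a lambda tuned with the `mean` version corresponds to `lam * m / n` in the textbook formula, and the two values are not interchangeable without that conversion.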

ocean pawn
wooden sail
#

summing first is what you don't want to do

ocean pawn
#

Convention-wise

wooden sail
#

so the question is what jax does automatically with mean()

#

imagine all of the values are, say, a couple hundred million each

#

and you have several millions of them. the sum will overflow, but the mean can be computed by dividing first

#

these things make a difference in ML due to the size of the problems
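
A tiny plain-Python sketch of the effect. The values are deliberately extreme float64 numbers chosen to force the overflow; in real ML code the same thing shows up much earlier with float16/float32 or fixed-width integers.

```python
import math

def mean_sum_first(values):
    """Add everything up, then divide once."""
    return sum(values) / len(values)

def mean_divide_first(values):
    """Divide each value by the count before adding."""
    n = len(values)
    return sum(v / n for v in values)

# Two values close to the largest float64; chosen to force the overflow.
values = [1e308, 1e308]

print(mean_sum_first(values))      # the intermediate sum overflows
print(mean_divide_first(values))   # stays finite
```

The trade-off is that dividing first costs one division per element and can lose precision for very small values, which may be part of why libraries tend to sum first.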

arctic wedgeBOT
#

jax/_src/numpy/reductions.py line 743

def _mean(a: ArrayLike, axis: Axis = None, dtype: DTypeLike | None = None,
wooden sail
#

at any rate, the code and math are technically correct, it's just that the computation breaks down at different places depending on the implementation

ocean pawn
#
  return lax.div(
      sum(a, axis, dtype=computation_dtype, keepdims=keepdims, where=where),
      lax.convert_element_type(normalizer, computation_dtype)
  ).astype(result_dtype)

Sum, then division?

#

So should I do it manually

#

If so, by convention, I should also use 1/2m, right?

wooden sail
#

whatever you prefer is fine, really

ocean pawn
#

Thanks!

ocean pawn
wooden sail
#

yeah

ocean pawn
#

Thanks!

#

I wonder why this isn't the default implementation

spring field
#

!rule 5

arctic wedgeBOT
#

5. Do not provide or request help on projects that may violate terms of service, or that may be deemed inappropriate, malicious, or illegal.

unkempt apex
#

what was malicious?

remote ridge
spring field
unkempt apex
#

ohh!

spring field
orchid lintel
#

Want to make a test dataset to evaluate different approaches to STT - is there any reason why I couldn't generate the dataset with TTS? Then maybe chop & screw the output with something like this? https://github.com/iver56/audiomentations
Seems like a relatively quick way to get a labeled dataset to evaluate some simple cases?

past meteor
coral field
#

I tried testing it before, but I just couldn't find a good way to prevent validation overfitting

#

Hmm but ig I'll try again

past meteor
#

Do a hyperparameter search on the number of trees

#

Reducing that should prevent overfitting

#

For the record, I call all gradient boosting algorithms "xgboost". Which implementation are you using?

#

Actual xgboost or?

spare forum
coral field
#

got it, thanks so much!

lapis sequoia
orchid forge
#

I really want to talk with a person who is a legit data analyst because I have questions to ask for a data analysis project portfolio

past meteor
orchid forge
eager plume
#

Hellow, I wanna know roadmaps for ai/ml/ds/da.

#

Kindly, give me some sources for roadmaps and the tutorials

lapis sequoia
#

click in that icon at the top of the app (see image below) @eager plume

flint totem
#

Yo. Bout to build a pc for neural nets and ai. So i wanted to ask what's the best cpu for this specific sphere. Currently i don't really understand the diff between amd ryzen, intel core i and intel xeon. Could anybody explain which one would be the better option in this specific sphere?

Motherboard: A620M AM5
Gpu: gigabyte rtx 3070 (cuda support)
Ram: ddr5 16-32gb
Rom: ssd250, hdd1000
Ps: 650w

#

I was thinking about amd ryzen 7700, intel core i 7 9-11th gen
But some says xeon is quite better for those massive data so i dont give a thing

lapis sequoia
#

intel xeon has some sort of acceleration.
ppl go for just good GPU since it's more reliable (everyone has to support nvidia.)

#

with a cuda supported GPU, it'll only help some.

ocean pawn
jaunty helm
lapis sequoia
jaunty helm
#

cuda's just got a massive head start on everybody
cause no one except nvidia invested this much into gpu computing
then the AI boom and everyone's rushing to replicate what nvidia's got, obviously it's not that easy

lapis sequoia
#

Normally, since the overhead of having the code, gathering training data, the ML libraries failing etc is so high, people go for the easy path.

#

an interesting middle ground is macos metal (M1s and M2s w silicon chip); for example i got a 2nd hand M1 (actually my prev boss gave to me, but it's cheap.), and it's quite nice for simple stuff.

buoyant vine
lapis sequoia
#

i mean, if that were the case google would be supporting it

jaunty helm
buoyant vine
buoyant vine
buoyant vine
# flint totem Yo. Bout to build pc for neural and ai. So i wanted to ask whats the best cpu fo...

In general I'd say:

  • If you don't want to fuck around with AMD drivers and hardware support with ROCm, go with Nvidia GPUs, also I think the AMD gpus are still a little bit lacking, that being said the newer gen being released (soon? now?) seem promising with their increased graphics power.
  • AMD Ryzen Zen4 CPU, it will just serve you much better than Intel will. Especially in terms of upgradability and performance.
  • Get a bigger SSD, 250GB is nothing nowadays and you will feel it
#

also 650W for that GPU and CPU is probably going to be low, or at least very close to max capacity

lapis sequoia
#

oneAPI is an open standard, adopted by Intel, for a unified application programming interface (API) intended to be used across different computing accelerator (coprocessor) architectures, including GPUs, AI accelerators and field-programmable gate arrays. It is intended to eliminate the need for developers to maintain separate code bases, multip...

#

btw, idk if it's just SIMD, since the optimisations are unavailable for intel i-series

past meteor
# orchid forge What is it exactly to add in your resume to get a good data analysis job ? Also ...

Knowledge of some of the relevant tools: power bi or tableau, SQL, Python and maybe Excel. If you're interested in a specific industry (finance, healthcare, logistics, ...) domain knowledge is a big big benefit. On top of that, notions of dimensional modelling, ETL and the basics of data engineering go a long way. Being able to analyze a dataset is one thing but being able to set up everything necessary to do it is also important.

lapis sequoia
#

wikipedia landed dark mode

buoyant vine
#

The big reason it isn't supported for most I-series CPUs is because they do not ship most chips with the hardware

lapis sequoia
#

well, but then it's not supported in those cpus

buoyant vine
#

AMD realistically dominates the SIMD game rn

#

Not that it matters in the context of AI training because you'll end up using a GPU anyway

#

but if you're doing KNN or inference, the AMD chips have a significant performance difference over intel

#

😅 Provided it isn't Zen1 or Zen2, we don't talk about those...

past meteor
#

@buoyant vine can I get your opinion on something? One of the things I've been working on recently is a general purpose inference server. Basically, there's a Python API that wraps sklearn => onnx, torch => onnx, ... and writes them to $storage and gives you an id.

On the Rust side of things it's a basic axum webserver with dynamic routes that correspond to that id. It loads the model from $storage, deserializes the payload (json in my poc) and does the inference.

The idea is to make something really simple for data people that want to deploy models. Once the inference server is running all you need to do is call things Python side
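
A toy sketch of the register-then-call-by-id flow described above, using only the standard library. Everything here is an assumption for illustration: `_STORAGE`, `register_model` and `handle_request` are hypothetical names, a dict stands in for $storage, and a plain Python function stands in for a serialized ONNX model.

```python
import json
import uuid
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for $storage: model id -> callable.
# In the setup described above this would hold serialized ONNX models.
_STORAGE: Dict[str, Callable[[List[float]], float]] = {}


def register_model(model: Callable[[List[float]], float]) -> str:
    """Store a model and return the id the server would route on."""
    model_id = uuid.uuid4().hex
    _STORAGE[model_id] = model
    return model_id


def handle_request(model_id: str, body: str) -> str:
    """Mimic one dynamic route: look up the model, decode JSON, run inference."""
    model = _STORAGE.get(model_id)
    if model is None:
        return json.dumps({"error": "unknown model id"})
    payload = json.loads(body)
    return json.dumps({"prediction": model(payload["features"])})


# A plain function plays the role of a trained model here.
model_id = register_model(lambda features: sum(features) / len(features))
response = handle_request(model_id, json.dumps({"features": [1.0, 2.0, 3.0]}))
```

The point is only the shape of the contract: the Python side gets back an id, and the server needs nothing but that id plus a JSON body to answer a request.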

#

What's left is a small CLI tool and gui

#

MLflow already has serving but it feels so bloated imo. It pickles models and so on

lapis sequoia
#

k

buoyant vine
#

Sounds vaguely similar to what we did before we merged the services into a monolith

past meteor
#

What do you do now?

buoyant vine
#

Ours is a bit different since it is a realtime web classifier, but we originally had:

  • Scraper microservices
  • Job submit/front API
  • Job manager
  • Inference API & model
  • Translator service

w/ communication originally via SQS, which later moved to HTTP microservice calls.

And in the end we moved everything into one application and used channels to act as the internal queues.
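
A minimal sketch of that "channels as internal queues" idea, assuming a two-stage pipeline inside one process. The stage names and the fake page/label values are made up for illustration, and Python's `queue.Queue` stands in for whatever channel type the real application used.

```python
import queue
import threading

STOP = object()  # sentinel telling the downstream worker to shut down


def scraper(urls, out_q):
    """Stage 1: pretend to fetch pages and pass them downstream."""
    for url in urls:
        out_q.put(f"<html for {url}>")
    out_q.put(STOP)


def classifier(in_q, results):
    """Stage 2: pretend to classify each page pulled off the queue."""
    while True:
        page = in_q.get()
        if page is STOP:
            break
        results.append((page, "ok"))


def run_pipeline(urls):
    # Bounded queue: a slow consumer makes the producer block instead of piling up work.
    pages = queue.Queue(maxsize=100)
    results = []
    workers = [
        threading.Thread(target=scraper, args=(urls, pages)),
        threading.Thread(target=classifier, args=(pages, results)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

Compared with SQS or HTTP calls between services, the hand-off here is just an in-process `put`/`get`, which is the main thing the merge into one application buys.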

buoyant vine
# past meteor @buoyant vine can I get your opinion on something? One of the things I'v...

Overall it sounds fine. The only issue I've had with ort previously is some edge cases:

  • If you have two sessions trying to use the GPU it will deadlock both sessions (no idea of the cause, but I believe it was memory locking)
  • Some models, or CPU inference in general, must do some internal batching, because no matter what batch size you give it, it effectively uses all the CPU cores it has access to.
#

At some point I thought it must be spinning 1 thread constantly

past meteor
#

yikes for the first bullet point

past meteor
#

Yeah, I remember the load balancer being an issue

#

I think I'll start small (CPU only as well) and expand it gradually

buoyant vine
#

Yeah, I mean in general if you have ORT set up, then adding GPU support is super simple

#

If you want to dockerize it, then it'll be a bit of a pain, but anything GPU-wise is a pain

past meteor
#

Why? Isn't it a matter of adding a couple of things to the compose file?

buoyant vine
#

your docker engine needs to have all the GPU stuff configured especially for nvidia,

#

there is a toolkit you need to install and attach to the docker runtime
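
A rough preflight check for the host side of this, as a sketch only. It assumes the toolkit in question is the NVIDIA Container Toolkit and treats finding its `nvidia-ctk` executable on PATH as a proxy for "installed"; it does not verify that the runtime is actually attached to Docker. `REQUIRED_TOOLS` and `missing_gpu_tools` are hypothetical names.

```python
import shutil

# Executables expected on a host that can run GPU containers.
# nvidia-ctk ships with the NVIDIA Container Toolkit; using its presence
# on PATH as a stand-in for a working setup is an assumption of this sketch.
REQUIRED_TOOLS = ("docker", "nvidia-smi", "nvidia-ctk")


def missing_gpu_tools(which=shutil.which):
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]
```

Calling `missing_gpu_tools()` on a coworker's machine gives a quick list of what still needs installing before a GPU container has any chance of starting.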

lapis sequoia
#

there are dockerfiles already with that configured..

past meteor
#

fsr I never had an issue with this, it always just worked

#

But I could've been lucky

lapis sequoia
#

well, you need the cuda drivers in the OS

past meteor
#

WSL (at home) and debian (at work). I didn't install docker engine at work so the sysadmin might've gone through the pain for me

buoyant vine
#

do you guys normally do AI related stuff?

past meteor
#

Yes

buoyant vine
#

Possibly then, reason I brought it up was if you intended to share it with co workers via something like docker

#

it can be temperamental

#

but chances are most people's devices are already set up for it to 'just work'

#

😅 Well you'll certainly find out

past meteor
#

I'll look it up for sure

lapis sequoia
#

i normally just use conda either through GPU or HPC/cpus.

#

but the docs on tensorflow docker are very clear, that's why i mentioned it can't be that complex. yeah, others need a cuda config that's compatible with their os

buoyant vine
#

The most annoying thing you can run into is mismatching cuda versions

lapis sequoia
#

yeah but it's not that problematic imho

#

unless you don't know much (i.e. how to download the right version)

#

versions are backwards compatible

buoyant vine
#

I think in theory they are, but I remember us having some real pain points with them mismatching
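
The "in theory" rule can be written down as a tiny check, as a sketch under simplifying assumptions: a newer driver can run code built against an older CUDA toolkit, so the driver's maximum supported CUDA version has to be at least the toolkit version. This ignores minor-version compatibility and forward-compatibility packages, and both function names and the version strings used with them are made-up examples.

```python
def parse_version(text):
    """Turn '12.4' into (12, 4) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def driver_supports(driver_cuda, toolkit_cuda):
    """True if the driver's max supported CUDA version covers the toolkit's version."""
    return parse_version(driver_cuda) >= parse_version(toolkit_cuda)
```

For example, `driver_supports("12.4", "11.8")` is true, while `driver_supports("11.8", "12.1")` is false, which is the usual shape of the mismatch: an image built on a newer CUDA than the host driver supports.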

lapis sequoia
#

now they dropped support for centos 7...

#

they've got a great version-matrix, and now the version of each package is !=, anyways..

past meteor
#

So much wacky tacit knowledge is necessary to do this stuff correctly

lapis sequoia
#

i disagree, but it is a pain unless you like to read a lot of random crap

past meteor
#

At my new job they just use databricks and let the cloud bill go 📈

buoyant vine
#

We have some machine images and docker files that are just "You shall use these images with this docker base image if you enjoy your sanity"

lapis sequoia
#

i'd guess simd enabled cpus could be useful if you do data-preprocessing outside the network (edit: without the neural network library.)

buoyant vine
#

I mean idk what the cost is. I know we have this habit of being concerned about the cost of something if it is in its own isolated AWS account and they go "Wait this project costs us how much!?" but then if it is in the main account... You could burn through money and no one is going to notice it

#

Shout out to DynamoDB btw for its truly amazingly expensive on-demand cost

lapis sequoia
#

lol

#

(silently closes dynamo db tab..)

buoyant vine
#

Pretty sexy cost reduction graph tho

lapis sequoia
#

interesting