#data-science-and-ml | Python | Page 140

torn talon Aug 8, 2024, 5:30 PM

#

when i score the resulting model using trained data, great, perfect fit of 1.0

#

but when i try any X beyond 12, things get really weird.

#

am i doing something very wrong, that the most simple third degree polynomial is not being correctly found using regression?

raw tree Aug 8, 2024, 5:32 PM

#

may simply not have enough data points ?
wikipedia says that you may have issues if your dataset is too small

#

but the perfect fit is weird ngl

torn talon Aug 8, 2024, 5:34 PM

#

the testing im doing is this:

raw tree Aug 8, 2024, 5:34 PM

#

did you look at the extracted coeffs ?

torn talon Aug 8, 2024, 5:34 PM

#

    model = Model("data/perfect_fit_heat_capacity.csv", "data/perfect_fit_resistivity.csv")
    trained_x_test = PolynomialFeatures(degree=3, include_bias=False).fit_transform([[1],[2],[3],[4],[5]])
    trained_y_test = [1, 8, 27, 64, 125]
    assert model.heatCapacityModel.score(trained_x_test, trained_y_test) == 1

    untrained_x_test = PolynomialFeatures(degree=3, include_bias=False).fit_transform([[0], [13],[1000]])
    untrained_y_test = [0, 2197, 100000]
    assert model.heatCapacityModel.score(untrained_x_test, untrained_y_test) == 1

#

the first assert succeeds

#

the second assert fails comically, with the first x mapping to like -1million y
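One thing worth checking in the test above is the expected values themselves: 1000^3 is 1,000,000,000, not 100000, so the last entry of `untrained_y_test` cannot match a cubic. Comparing a float score with `== 1` is also fragile. Below is a minimal sketch of the same check, assuming the model was trained on the y = x^3 points for x = 1..12 from the CSV shown later in the thread. The `Model` class is not shown in the chat, so a plain `LinearRegression` stands in for it here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Assumed training data: the y = x^3 points for x = 1..12 from the CSV in this thread.
x_train = np.arange(1, 13).reshape(-1, 1)
y_train = (x_train.ravel() ** 3).astype(float)

poly = PolynomialFeatures(degree=3, include_bias=False)
model = LinearRegression().fit(poly.fit_transform(x_train), y_train)

# Out-of-range inputs, with expected values computed from x^3 rather than typed by hand.
# Note that 1000^3 is 1_000_000_000, not 100_000.
x_new = np.array([[0], [13], [1000]])
y_expected = (x_new.ravel() ** 3).astype(float)

predictions = model.predict(poly.transform(x_new))
r2 = model.score(poly.transform(x_new), y_expected)

# Compare with a tolerance instead of `== 1`, because the fit is only exact
# up to floating-point rounding.
assert np.isclose(r2, 1.0)
```

If the real model still predicts something like -1 million for x = 0, the problem is more likely in how the features are built at training time than in the regression itself.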

raw tree Aug 8, 2024, 5:35 PM

#

raw tree did you look at the extracted coeffs ?

havent done polynomial regression before, but might be worth a try

torn talon Aug 8, 2024, 5:35 PM

#

these are the coefs:
material_test.py .Coef: [ 5.26907445e-14 -4.39378140e-15 1.00000000e+00]

raw tree Aug 8, 2024, 5:36 PM

#

seems like overfitting again
try bumping up to 150 samples or so

raw tree Aug 8, 2024, 5:36 PM

#

torn talon these are the coefs: material_test.py .Coef: [ 5.26907445e-14 -4.39378140e-15 ...

overfitting it is

torn talon Aug 8, 2024, 5:36 PM

#

unfortunately, i only have like 20 samples

#

but i know that im mapping a third degree polynomial

raw tree Aug 8, 2024, 5:36 PM

#

hmmm

torn talon Aug 8, 2024, 5:36 PM

#

my samples are based on some published material science articles

#

and they only have like 20 data points

raw tree Aug 8, 2024, 5:38 PM

#

try going oldschool if you are absolutely sure that it is in fact a cubic

#

https://math.stackexchange.com/questions/2655178/finding-the-equation-of-a-cubic-when-given-4-points

Mathematics Stack Exchange

Finding the equation of a cubic when given $4$ points

I am asked to find the equation of a cubic function that passes through the origin. It also passes through the points $(1, 3), (2, 6),$ and $(-1, 10)$.

I have walked through many answers for simi...

torn talon Aug 8, 2024, 5:39 PM

#

ok im not like positive it will always be

#

the function is the specific heat of a metal wrt temperature

raw tree Aug 8, 2024, 5:39 PM

#

ah

torn talon Aug 8, 2024, 5:39 PM

#

and for one specific metal it is

raw tree Aug 8, 2024, 5:39 PM

#

lemme try plotting your coeffs

torn talon Aug 8, 2024, 5:39 PM

#

but i dont think it is for all metals

raw tree Aug 8, 2024, 5:42 PM

#

torn talon but i dont think it is for all metals

your code has gone awry somewhere

#

wait, shouldn't your cubic have one more coeff ?

torn talon Aug 8, 2024, 5:44 PM

#

material_test.py .coef: [ 5.26907445e-14 -4.39378140e-15 1.00000000e+00]

#

how tf is this giving a 1.0 fit for [[1],[2],[3],[4],[5]]

#

im just doing: print("coef: ", model.heatCapacityModel.coef_)

#

oh wait lol a score of 1.0 is the worst possible

#

a score of 0.0 is the best

#

sorry im pretty new to python data science. ok well i fucked up this very simple linear regression

raw tree Aug 8, 2024, 5:55 PM

#

lol, its fine
I'm banging my head against transformers myself

raw tree Aug 8, 2024, 5:56 PM

#

torn talon a score of 0.0 is the best

give it a few more tries !
none of us are born clever : D

torn talon Aug 8, 2024, 5:59 PM

#

hrm, python is doing something weird when converting my y values to floats

#

this is my csv:

temp,specifity
1,1
2,8
3,27
4,64
5,125
6,216
7,343
8,512
9,729
10,1000
11,1331
12,1728

#

im just printing out the parsed values from the results of genfromtxt:

temps:  [ 1  2  3  4  5  6  7  8  9 10 11 12]
specificity:  [1.000e+00 8.000e+00 2.700e+01 6.400e+01 1.250e+02 2.160e+02 3.430e+02
 5.120e+02 7.290e+02 1.000e+03 1.331e+03 1.728e+03]

raw tree Aug 8, 2024, 6:01 PM

#

torn talon im just printing out the parsed values from the results of genfromtxt: ``` temp...

looks right to me ?
also, why floats ?

torn talon Aug 8, 2024, 6:02 PM

#

because actual metal heat capacity is reported as floats in kelvin

raw tree Aug 8, 2024, 6:04 PM

#

consider just multiplying your initial data by 1k or so to get ints
(that's just rescaling the data, not the kernel trick)
you can just divide your predicted values by 1k too to get the right values

#

floats can be painful to debug with their inaccuracy

torn talon Aug 8, 2024, 6:05 PM

#

ok ill try it. but yeah the input vectors to my polynomial regression look right

#

ok wtf

#

this is the perfect fit

#

    #use model to make predictions on response variable
    y_predicted = poly_reg_model.predict(poly_features)

    #create scatterplot of x vs. y
    plt.scatter(temps, heatCapacity)

    #add line to show fitted polynomial regression model
    plt.plot(temps, y_predicted, color='purple')

    plt.show()

raw tree Aug 8, 2024, 6:13 PM

#

huh

#

add 13 and a few more out of dataset ig ?

torn talon Aug 8, 2024, 6:14 PM

#

wait is a fit score of 1.0 perfect or worst

#

im so confused now.

raw tree Aug 8, 2024, 6:15 PM

#

seems like best lol
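For reference, the `score` method of scikit-learn regressors returns the coefficient of determination, R^2 = 1 - SS_res / SS_tot. A value of 1.0 means the predictions match the targets, 0.0 is what you get by always predicting the mean, and the value can go negative for predictions worse than that. A small sketch using `r2_score` with the cubic values from this thread:

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 8.0, 27.0, 64.0, 125.0])

# Predictions identical to the targets: best possible score.
perfect = r2_score(y_true, y_true)

# Always predicting the mean of the targets: the zero point of the scale.
baseline = r2_score(y_true, np.full_like(y_true, y_true.mean()))

# Predictions far away from the targets: the score drops below zero.
worse = r2_score(y_true, -y_true)
```

So a score of 1.0 on the training points is the good end of the scale, not the bad one.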

torn talon Aug 8, 2024, 6:15 PM

#

but this doesnt look at all like the graph you plotted

raw tree Aug 8, 2024, 6:15 PM

#

got absolutely no clue ¯_(ツ)_/¯

torn talon Aug 8, 2024, 6:16 PM

#

    print(poly_reg_model.intercept_, poly_reg_model.coef_)

1.7053025658242404e-13 [-4.84333615e-14  1.28108132e-16  1.00000000e+00]

raw tree Aug 8, 2024, 6:16 PM

#

that wasnt what you sent before either

torn talon Aug 8, 2024, 6:17 PM

#

im bouta lose my mind lmao

#

alright well thanks for helping rubber duck, if i get this working ill come back

raw tree Aug 8, 2024, 6:17 PM

#

plot a few more out of distribution man

torn talon Aug 8, 2024, 6:17 PM

#

ah good idea

raw tree Aug 8, 2024, 6:18 PM

#

torn talon alright well thanks for helping rubber duck, if i get this working ill come back

ye !
stares into my unchanging f1 scores

#

like wtf
this is from a total of ~25 epochs or so (restarted from checkpoints a few times)
oversampling time ig

torn talon Aug 8, 2024, 6:23 PM

#

#

lmfao why

#

its a perfect x^3. why is my prediction so bad then

raw tree Aug 8, 2024, 6:25 PM

#

lol what

#

also, why include_bias=False ?

torn talon Aug 8, 2024, 6:29 PM

#

cuz y intercept wont always be 0

#

not every metal has a specific heat of 0 at 0K
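A note on the two questions above (the "missing" fourth coefficient and `include_bias=False`): `include_bias` only controls whether `PolynomialFeatures` adds a constant column of ones. `LinearRegression` fits an intercept by default and stores it in `intercept_`, separately from the three entries of `coef_`. The sketch below assumes a made-up target y = x^3 + 5 to show that a non-zero intercept is still recovered without the bias column.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([[2.0]])

no_bias = PolynomialFeatures(degree=3, include_bias=False)
with_bias = PolynomialFeatures(degree=3, include_bias=True)

features_no_bias = no_bias.fit_transform(x)      # columns: x, x^2, x^3
features_with_bias = with_bias.fit_transform(x)  # columns: 1, x, x^2, x^3

# Made-up data with a non-zero intercept: y = x^3 + 5.
x_train = np.arange(1, 13, dtype=float).reshape(-1, 1)
y_train = x_train.ravel() ** 3 + 5.0

# No bias column in the features, but fit_intercept=True is the default.
model = LinearRegression().fit(no_bias.fit_transform(x_train), y_train)
coefficients, intercept = model.coef_, model.intercept_
```

Here `coefficients` has three entries (for x, x^2, x^3) and `intercept` comes out close to 5, which is why the printed `coef_` in the thread only shows three numbers.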

tacit basin Aug 8, 2024, 6:42 PM

#

Looking at doc's for instructor. But can't find a way to use custom model or hosting API. Is it possible?
https://python.useinstructor.com/

Welcome To Instructor - Instructor

A lightweight library for structured outputs with LLMs.

agile cobalt Aug 8, 2024, 7:31 PM

#

tacit basin Looking at doc's for instructor. But can't find a way to use custom model or hos...

https://python.useinstructor.com/examples/ollama/#ollama

Ollama - Instructor

A lightweight library for structured outputs with LLMs.

#

idk if it supports passing a custom client, but some of the clients it supports allow for you to use a custom hosting API (like how that example uses OpenAI's library for interacting with a model hosted under localhost)

ocean pawn Aug 8, 2024, 7:51 PM

#

Sorry for the probably stupid question

#

Sorry (for the stupid question)

tacit basin Aug 8, 2024, 8:03 PM

#

agile cobalt idk if it supports passing a custom client, but some of the clients it supports ...

Thanks yeah. Where I work they use their own wrapper for openai API. Which complicates things. Can't use lang chain, instructor without customizing it ...

torn talon Aug 8, 2024, 9:06 PM

#

anyone here familiar with the RK45 api in scipy?

torn talon Aug 8, 2024, 11:03 PM

#

what is the shape of this curve called? and what type of sklearn regression model best fits it?

#

#

ive tried linear polynomials of various degrees and none fit it very well

agile cobalt Aug 8, 2024, 11:04 PM

#

looks kinda sigmoid to me? or maybe log

torn talon Aug 8, 2024, 11:04 PM

#

worth noting, this is unrelated to any coursework, im a 32 year old trying to relearn practical diffeq

worldly wagon Aug 8, 2024, 11:05 PM

#

Just curious has anyone seen documentation on forward/back buttons in plotly animations opposed to pause and play?
Or has anyone implemented it
Just curious I may post a thread on this later or if i succeed on implementation I'll share

agile cobalt Aug 8, 2024, 11:05 PM

#

you can try applying a non-linear transformation before trying to fit a linear model to it

torn talon Aug 8, 2024, 11:05 PM

#

isnt that what i did by performing a linear regression with varying polynomial degrees?

#

oh wait youre saying map the values in the scatter plot to logs of themselves, then regress on that?
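That reading of the suggestion can be sketched as follows. The data here is hypothetical (the curve in the thread is not available), generated from y = 3 ln(x) + 2 so the result is easy to check: take the log of the input, then fit an ordinary linear model on the transformed values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data following y = 3 * ln(x) + 2 (made up for illustration).
x = np.linspace(1.0, 100.0, 50)
y = 3.0 * np.log(x) + 2.0

# Transform the input first, then fit a plain linear model on the transformed values.
x_log = np.log(x).reshape(-1, 1)
model = LinearRegression().fit(x_log, y)

slope, intercept = model.coef_[0], model.intercept_
```

The difference from polynomial regression is only in which transformation is applied before the linear fit: powers of x in one case, log(x) in the other.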

torn talon Aug 9, 2024, 12:32 AM

#

im dumb, if the x axis is multiples of e^-7 the x axis is not logarithmic right. so this relationship is actually linear?

fallow plume Aug 9, 2024, 12:45 AM

#

torn talon im dumb, if the x axis is multiples of e^-7 the x axis is not logarithmic right....

Yes, that'd be linear... same as: y = (1e-7)ax+b

torn talon Aug 9, 2024, 12:46 AM

#

wtf. thats weird

#

assumed that metal heats up non-linearly when subject to current

fallow plume Aug 9, 2024, 12:46 AM

#

Perhaps its labeled wrong?

torn talon Aug 9, 2024, 12:46 AM

#

nah

#

i probs made a mistake with my runge kutta formulation or something

#

i hate math

#

@fallow plume are you familiar with matplotlib? in particular 3d plots?

fallow plume Aug 9, 2024, 12:48 AM

#

a little y?

torn talon Aug 9, 2024, 12:49 AM

#

    magnet = Magnet(material, validated_config)
    accumulator = []
    for current in validated_config.current_densities_to_plot_A_per_m2:
        (times, temps) = magnet.computeTemperatureEvolution(current)
        accumulator.append([[current]*len(times), times, temps])

#

this results in an accumulator with N entries

#

where the entries represent time vs temperature time series data of my model of a metal at different currents

#

they are definitely not the same length

#

i cant figure out what matplotlib plot to use

fallow plume Aug 9, 2024, 12:50 AM

#

https://matplotlib.org/stable/gallery/mplot3d/surface3d.html surface plot?

torn talon Aug 9, 2024, 12:51 AM

#

by just merging all the accumulated subvectors?

#

oh i guess that would work

torn talon Aug 9, 2024, 2:18 AM

#

nah doesnt work
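A surface plot expects values on a regular 2D grid, which series of different lengths do not form. One alternative, sketched here under assumptions, is to draw each time series as its own line in a 3D axes, since each line may have any number of points. The accumulator below is a made-up stand-in with the same `[currents, times, temps]` layout as in the thread; the temperature curve is invented.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the accumulator in the thread: one
# [currents, times, temps] entry per current, each with a different length.
accumulator = []
for current, n_points in [(1.0e7, 20), (2.0e7, 35), (3.0e7, 50)]:
    times = np.linspace(0.0, 1.0, n_points)
    temps = 300.0 + (current / 1.0e7) * 10.0 * times  # made-up temperature curve
    accumulator.append([[current] * n_points, times, temps])

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for currents, times, temps in accumulator:
    ax.plot(currents, times, temps)  # one 3D line per current density

ax.set_xlabel("current density (A/m^2)")
ax.set_ylabel("time")
ax.set_zlabel("temperature")
```

With an interactive backend, `plt.show()` displays the figure. If a filled surface is really wanted, `ax.plot_trisurf` on the concatenated points may be worth trying, since it does not need a regular grid.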

half lintel Aug 9, 2024, 3:23 AM

#

Man, I've been using pandas for a couple weeks, still cant figure out some basic stuff....

serene scaffold Aug 9, 2024, 3:30 AM

#

half lintel Man, I've been using pandas for a couple weeks, still cant figure out some basic...

Pandas works pretty differently from the rest of python. Avoid loops and the apply method, and keep a tab open for the docs. Eventually you'll get it.

half lintel Aug 9, 2024, 3:33 AM

#

Things like...
I've got a super-big dataset, and I want to just update "things that match whatever" with a lambda.

I can use
df.loc[whatever, 'colname'] = df[whatever].apply(mylambda, axis=1)
But can't see how to chain something like that...

serene scaffold Aug 9, 2024, 3:33 AM

#

half lintel Things like... I've got a super-big dataset, and I want to just update "things ...

If you use apply, you're not reading the docs to figure out how you're "supposed" to do it

#

Apply is only there as a fallback if there's no way to do it with existing methods.

split dune Aug 9, 2024, 3:34 AM

#

who need free recaptcha slover api key

half lintel Aug 9, 2024, 3:34 AM

#

So have a nice

result_df = (
   something
   .groupby(..).agg(...)
   .rename()
   .whatever()
)

# then
result_df= # like above

serene scaffold Aug 9, 2024, 3:34 AM

#

There are also cases where you just can't chain whatever you're trying to do

half lintel Aug 9, 2024, 3:37 AM

#

I couldn't figure how to chain "update some of the rows with a lambda"
because df is pretty large, and running the lambda for every one is super slow.

serene scaffold Aug 9, 2024, 3:38 AM

#

Right, because you're only supposed to use lambdas as a last resort.

#

Always assume that there's a solution that doesn't involve loops or apply (including lambdas), and only give in to using either when you're sure the docs don't have a solution.

half lintel Aug 9, 2024, 3:39 AM

#

How else to do:

if record.type == 'something' then record.id = f'xxxx{record.a} : yyyy{record.b} : zzzz{record.c}'

#

I have no loops. And try and use ... uh the arrayish things when I can
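The "if type == 'something' then build an id string" case above can usually be done without `apply`, by combining a boolean mask with vectorized string concatenation. A minimal sketch, with a made-up frame whose column names mirror the pseudocode:

```python
import pandas as pd

# Made-up frame with the columns used in the pseudocode above.
df = pd.DataFrame({
    "type": ["something", "other", "something"],
    "a": [1, 2, 3],
    "b": ["x", "y", "z"],
    "c": [0.5, 1.5, 2.5],
    "id": ["old-0", "old-1", "old-2"],
})

# Select the matching rows once, then build the new ids column-wise.
mask = df["type"] == "something"
df.loc[mask, "id"] = (
    "xxxx" + df.loc[mask, "a"].astype(str)
    + " : yyyy" + df.loc[mask, "b"].astype(str)
    + " : zzzz" + df.loc[mask, "c"].astype(str)
)
```

Only the masked rows are touched, and the string building runs over whole columns instead of calling a Python function per row.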

serene scaffold Aug 9, 2024, 3:40 AM

#

The replace method might help, but I can't do a deep dive right now

half lintel Aug 9, 2024, 3:40 AM

#

is this considered ok?

df.loc[df.SOMETHING.str.startswith('xxx-'), 'status'] = 'this is an xxx'

serene scaffold Aug 9, 2024, 3:41 AM

#

Yes, though I don't recommend ever looking up columns with the dot operator

half lintel Aug 9, 2024, 3:41 AM

#

it's the same as df['SOMETHING']

serene scaffold Aug 9, 2024, 3:41 AM

#

Right. I recommend you always do that and never use the dot operator.

half lintel Aug 9, 2024, 3:41 AM

#

hmmmm.

serene scaffold Aug 9, 2024, 3:42 AM

#

It's an unholy mixing of namespacing that pandas shouldn't support.

half lintel Aug 9, 2024, 3:42 AM

#

Is there any way to chain a "select and update" .loc like that?

serene scaffold Aug 9, 2024, 3:43 AM

#

Not that immediately comes to mind
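One pattern that may fit a method chain is `assign` combined with `Series.mask`, which replaces values where a condition is true and returns a new frame instead of modifying in place. A sketch with made-up data; the `rename` step is only there to show the chain continuing:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "col": ["xxx-1", "abc-2", "xxx-3"],
    "status": ["unknown", "unknown", "unknown"],
})

result_df = (
    df
    # Replace status where col starts with "xxx-"; other rows keep their value.
    .assign(status=lambda d: d["status"].mask(d["col"].str.startswith("xxx-"), "this is an xxx"))
    .rename(columns={"col": "code"})
)
```

The original `df` is left unchanged, which is the main difference from the `df.loc[...] = ...` form.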

half lintel Aug 9, 2024, 3:43 AM

#

ok thanks

serene scaffold Aug 9, 2024, 3:43 AM

#

Could you parameterize it with a loop?
Note that it wouldn't be a loop over the dataframe, which is what you want to avoid

half lintel Aug 9, 2024, 3:44 AM

#

Sorry, I don't understand at all. parameterize?

#

Still noob see 🙂

serene scaffold Aug 9, 2024, 3:46 AM

#

# not parameterized
thing['a'] = b + c
thing['d'] = e + f
thing['g'] = h + i

# parameterized
for letter, x, y in [
    ('a', b, c),
    ('d', e, f),
    ('g', h, i),
]:
    thing[letter] = x + y

#

so you only have df.loc[df['col'].str.startswith('xxx-'), 'status'] = 'this is an xxx' once, but with loop variables for all the parts that change.
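Applied to the `df.loc` update from earlier, the parameterized version might look like the sketch below. The prefixes and labels are made up; the point is that the loop runs over a short list of rules, not over the rows of the dataframe.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"col": ["xxx-1", "yyy-2", "zzz-3"], "status": ["", "", ""]})

# Each tuple is one rule: a prefix to match and the status to write.
for prefix, label in [
    ("xxx-", "this is an xxx"),
    ("yyy-", "this is a yyy"),
]:
    df.loc[df["col"].str.startswith(prefix), "status"] = label
```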

#

anyway, I'm going to log off now

#

good luck

half lintel Aug 9, 2024, 3:47 AM

#

thanks 🙂

#

Any recommendations on a good book for learning pandas? Or is the website the best?

serene scaffold Aug 9, 2024, 3:52 AM

#

The kaggle pandas tutorial

agile cobalt Aug 9, 2024, 4:02 AM

#

the official User Guide is very good imo

proper crag Aug 9, 2024, 5:07 AM

#

Geeksforgeeks or their official docs

whole nacelle Aug 9, 2024, 5:43 AM

#

This is wild

#

send 50 messages to be verified?

#

😦

worldly dawn Aug 9, 2024, 5:43 AM

#

whole nacelle send 50 messages to be verified?

Note that if you spam, you will get moderated

whole nacelle Aug 9, 2024, 5:44 AM

#

Yeah I am not spamming I want to talk about a problem im facing with azure functions

#

I'm sorry if that's spam

hardy shuttle Aug 9, 2024, 12:38 PM

#

Hi there, I have a question around date / time data for ML training. If this is something that might interest you / know the answer to, here is a link to my help post 🙂
https://discord.com/channels/267624335836053506/1271445530307727446

hardy shuttle Aug 9, 2024, 1:01 PM

#

Does anyone have any good resource to understand SVM time series? and how to use it

torn talon Aug 9, 2024, 2:21 PM

#

if youre using python to perform numerical analysis of very large decimal numbers, what operators do you use for things like exponentiation, and what primitives do you use to represent the numbers?

buoyant vine Aug 9, 2024, 2:22 PM

#

probably just the decimal lib

#

https://docs.python.org/3/library/decimal.html

Python documentation

decimal — Decimal fixed-point and floating-point arithmetic

Source code: Lib/decimal.py The decimal module provides support for fast correctly rounded decimal floating-point arithmetic. It offers several advantages over the float datatype: Decimal “is based...
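A short sketch of what that looks like in practice: `Decimal` values work with the ordinary arithmetic operators, including `**`, and the number of significant digits is set on the context. The precision of 50 and the sample number are arbitrary choices for illustration.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # significant digits used for subsequent Decimal arithmetic

big = Decimal("12345678901234567890.123456789")
squared = big ** 2            # the ordinary ** operator works on Decimal
root_two = Decimal(2).sqrt()  # square root at the configured precision

# Decimal literals built from strings are stored exactly, unlike binary floats.
exact_sum = Decimal("0.1") + Decimal("0.2")
float_sum = 0.1 + 0.2
```

Results are still rounded to the context precision, so this is configurable precision rather than unlimited accuracy.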

agile cobalt Aug 9, 2024, 2:24 PM

#

yeah, if you need exact decimal arithmetic, use decimal; if you need speed and don't mind trading off accuracy you could use numpy's int64 or float64

#

seems like polars supports Decimals, which might be faster without trading off precision, but it doesn't support pow (it does support * so you can do pl.col("X") * pl.col("X") for ** 2 and the like, but if you try to .sqrt() for example it just casts to float64)

edit; disclaimer: still considered "unstable" and there are some nasty sounding issues related to Decimals open in their github

wooden sail Aug 9, 2024, 3:23 PM

#

past a certain point you'd wanna consider using sage or mathematica

shut shoal Aug 9, 2024, 4:00 PM

#

This is an awesome website. Thanks man.

#

Oh dang it only has the gpt model

scarlet owl Aug 9, 2024, 4:23 PM

#

Hi, I want to do Machine learning to get started. Can anyone suggest me any?

iron basalt Aug 9, 2024, 5:24 PM

#

torn talon if youre using python to perform numerical analysis of very large decimal number...

You can use sympy: https://docs.sympy.org/latest/modules/evalf.html

iron basalt Aug 9, 2024, 5:42 PM

#

There is also gmpy2, which is faster.

unkempt apex Aug 9, 2024, 6:02 PM

#

https://nlpcloud.com/how-to-install-and-deploy-llama-3-into-production.html?utm_source=reddit&utm_campaign=fqwerty13-6816-81ed-a26450242ac140019

How to Install and Deploy LLaMA 3 Into Production?

Learn how to install and deploy LLaMA 3 into production with this step-by-step guide. From hardware requirements to deployment and scaling, we cover everything you need to know for a smooth implementation.

#

20GB of VRAM required ..

serene scaffold Aug 9, 2024, 6:12 PM

#

unkempt apex 20GB of VRAM required ..

that's relatively low for post-ChatGPT LLMs.
the highest-performing models require the highest-end hardware.

unkempt apex Aug 9, 2024, 6:13 PM

#

serene scaffold that's relatively low for post-ChatGPT LLMs. the highest-performing models requi...

what is 50 tokens generated in 1 second?

#

each token is similar to a word ( according to inference )

#

so 50 words in 1 sec?

agile cobalt Aug 9, 2024, 6:14 PM

#

Gemma 2 has a 2B parameters model

that article is from before Llama 3 405B and Gemma 2 models were released I think?

ocean pawn Aug 9, 2024, 6:15 PM

#

agile cobalt Gemma 2 has a 2B parameters model that article is from before Llama 3 405B and ...

Gemma has 7B _I think_

#

Huh 27B since when

agile cobalt Aug 9, 2024, 6:15 PM

#

ocean pawn Gemma has 7B _I think_

2B, 9B and 27B https://huggingface.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315

unkempt apex Aug 9, 2024, 6:16 PM

#

agile cobalt Gemma 2 has a 2B parameters model that article is from before Llama 3 405B and ...

yeah, it was suggested by reddit post! for 7B model

ocean pawn Aug 9, 2024, 6:16 PM

#

agile cobalt 2B, 9B and 27B <https://huggingface.co/collections/google/gemma-2-release-667d66...

Yup I saw it, interesting

#

Realistically, what's the difference between Gemini and Gemma

#

Except the size of the model itself of course

#

Is the architecture similar, I wonder

unkempt apex Aug 9, 2024, 6:17 PM

#

ocean pawn Realistically, what's the difference between Gemini and Gemma

people used to hate gemini so they release gemma to distract!😂

ocean pawn Aug 9, 2024, 6:17 PM

#

unkempt apex people used to hate gemini so they release gemma to distract!😂

I mean, Gemini is proprietary and Gemma is open source (ish?)

serene scaffold Aug 9, 2024, 6:17 PM

#

unkempt apex what is 50 tokens generated in 1 second?

"token" is basically MLspeak for "word". a token is the primary unit of language that models deal with.

agile cobalt Aug 9, 2024, 6:17 PM

#

ocean pawn Is the architecture similar, I wonder

should be extremely similar if not the same, perhaps excluding multi-modal changes though

ocean pawn Aug 9, 2024, 6:18 PM

#

serene scaffold "token" is basically MLspeak for "word". a token is the primary unit of language...

Technically, isn't a token part of a word?

#

Like wordpieces

unkempt apex Aug 9, 2024, 6:18 PM

#

ocean pawn I mean, Gemini is proprietary and Gemma is open source (ish?)

yeah sort of !

agile cobalt Aug 9, 2024, 6:19 PM

#

ocean pawn Isn't technically, a token is _part_ of a word?

a token size could be anywhere from a single letter to an entire sentence, and there are some special tokens that don't correspond to text but rather special instructions

most of the time it should be part of a word though

serene scaffold Aug 9, 2024, 6:19 PM

#

ocean pawn Isn't technically, a token is _part_ of a word?

"wordpiece" is another ML term that isn't from linguistics. but a wordpiece is smaller than a token.

#

the reason "token" and "word" are separate is that "words" are a linguistic concept, and token boundaries for ML purposes might not be what linguistics consider to be word boundaries.

unkempt apex Aug 9, 2024, 6:20 PM

#

agile cobalt a token size could be anywhere from a single letter to an entire sentence, and t...

from single letter to whole sentence (combinations of letters) then how can we define token size?

ocean pawn Aug 9, 2024, 6:20 PM

#

agile cobalt a token size could be anywhere from a single letter to an entire sentence, and t...

Oh, I see, is the definition of a token manually defined by human?

ocean pawn Aug 9, 2024, 6:20 PM

#

serene scaffold "wordpiece" is another ML term that isn't from linguistics. but a wordpiece is s...

No I mean Google's WordPiece tokenizer

#

It separates some word into parts from what I know

serene scaffold Aug 9, 2024, 6:20 PM

#

ocean pawn No I mean Google's WordPiece tokenizer

I know what wordpieces are. that's what I'm talking about.

ocean pawn Aug 9, 2024, 6:21 PM

#

serene scaffold I know what wordpieces are. that's what I'm talking about.

Sorry then, I misunderstood, but doesn't WordPiece turn sentences into tokens, where each token can be a word but can also be part of a word?

unkempt apex Aug 9, 2024, 6:22 PM

#

ocean pawn Sorry then, I misunderstood, but doesn't WordPiece turn sentences into tokens...

and this is how we invent recursion! lol

agile cobalt Aug 9, 2024, 6:22 PM

#

ocean pawn Oh, I see, is the definition of a token manually defined by human?

iirc most models are trained on a vocabulary created by another program - I forgot the details about how that's generated but Stel should know?

You can manually define/overwrite some tokens though, that's somewhat frequently used for fine tuning (e.g. fine tuning Stable Diffusion to recognize a person, or a LLM to perform a new task)

ocean pawn Aug 9, 2024, 6:22 PM

#

agile cobalt iirc most models are trained on a vocabulary created by another program - I forg...

Oh I see

serene scaffold Aug 9, 2024, 6:23 PM

#

ocean pawn Sorry then, I misunderstood, but doesn't WordPiece turn sentences into tokens...

no, the wordpiece tokenizer turns a string into a sequence of wordpieces.

ocean pawn Aug 9, 2024, 6:23 PM

#

serene scaffold no, the wordpiece tokenizer turns a string into a sequence of wordpieces.

Sorry I might've mixed it up with SentencePiece, is it also like WordPiece?

serene scaffold Aug 9, 2024, 6:24 PM

#

ocean pawn Sorry I might've mixed it up with SentencePiece, is it also like WordPiece?

wordpieces are basically "subtokens".
note that "wordpiece" and "wordpiece tokenizer" are separate things.

ocean pawn Aug 9, 2024, 6:25 PM

#

pithink

#

Thanks for the explanation, but I am still a bit confused

#

I'll have some more look into it

#

#

Doesn't WordPiece break sentences down into smaller parts

#

Then the words get tokenized or something?

#

Is this correct?

serene scaffold Aug 9, 2024, 6:27 PM

#

ocean pawn Doesn't WordPiece break sentences down into smaller parts

it sounds like you think "a wordpiece" is a thing that does something

#

a "wordpiece tokenizer" is a thing that does something.
"wordpiece tokenization" is the process of splitting a unit of text (such as a word or sentence) into wordpieces.

ocean pawn Aug 9, 2024, 6:28 PM

#

Oh, I see what you meant

serene scaffold Aug 9, 2024, 6:28 PM

#

wordpieces and tokens are both units of text; wordpieces are smaller than tokens.

ocean pawn Aug 9, 2024, 6:29 PM

#

serene scaffold wordpieces and tokens are both units of text; wordpieces are smaller than tokens...

May I ask why wordpieces are smaller than tokens?

serene scaffold Aug 9, 2024, 6:30 PM

#

ocean pawn May I ask why wordpieces are smaller than tokens?

that's just how wordpiece is defined. originally, "token" referred to the smallest units of text used in ML.

ocean pawn Aug 9, 2024, 6:31 PM

#

serene scaffold that's just how wordpiece is defined. originally, "token" referred to the smalle...

Oh

#

I thought number of token was the size of the input to the algorithm?

#

Is that an incorrect assumption?

#

If the tokenizer maps test->1, try->2, ing->3, then testing would be 1,3, so 2 tokens. this is my understanding, is it correct?

serene scaffold Aug 9, 2024, 6:33 PM

#

ocean pawn I thought number of token was the size of the input to the algorithm?

depends on the model and what you are doing. neural networks are inherently numeric, so the input to a network is often an array/tensor of integers, where each integer represents a token from the input

#

or it might be that the integer represents a subtoken from the input.
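The "array of integers" idea can be sketched with a toy word-level encoder. The vocabulary below is hypothetical (real models ship their own vocabulary files, usually with tens of thousands of entries), and `[UNK]` is a common convention for words that are not in it.

```python
# Hypothetical toy vocabulary; real models ship their own vocabulary files.
vocab = {"[UNK]": 0, "the": 1, "cat": 2, "sat": 3}


def encode(text: str) -> list[int]:
    """Map whitespace-separated words to integer ids, using [UNK] for unknown words."""
    return [vocab.get(word, vocab["[UNK]"]) for word in text.lower().split()]


ids = encode("The cat sat down")
```

The list of ids is what gets turned into a tensor and fed to the network. With a subword tokenizer the same idea applies, except that each id stands for a piece of a word.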

ocean pawn Aug 9, 2024, 6:35 PM

#

serene scaffold or it might be that the integer represents a subtoken from the input.

Oh, I automatically assume each float/integer in a array is a token, but they can also be a sub token?

#

Sorry for the stupid question

#

Thanks for having the patience to answer said question

serene scaffold Aug 9, 2024, 6:36 PM

#

ocean pawn Oh, I automatically assume each float/integer in a array is a token, but they ca...

They could be sub tokens.

#

It depends on the model architecture

ocean pawn Aug 9, 2024, 6:36 PM

#

serene scaffold It depends on the model architecture

I see, thanks!

serene scaffold Aug 9, 2024, 6:37 PM

#

ocean pawn Oh, I automatically assume each float/integer in a array is a token, but they ca...

Generally speaking, tokens correspond to "words" in pure linguistics, and sub tokens/word pieces correspond to morphemes. https://en.m.wikipedia.org/wiki/Morpheme

Morpheme

A morpheme is the smallest meaningful constituent of a linguistic expression. The field of linguistic study dedicated to morphemes is called morphology.
In English, morphemes are often but not necessarily words. Morphemes that stand alone are considered roots (such as the morpheme cat); other morphemes, called affixes, are found only in combinat...

ocean pawn Aug 9, 2024, 6:37 PM

#

serene scaffold Generally speaking, tokens correspond to "words" in pure linguistics, and sub to...

Oh I see

#

So to check my understanding: a word is a token, and a wordpiece, which is part of a word, is a subtoken

#

Is that correct?

#

So a wordpiece tokenizer breaks a word such as "unbreakable" into un, break, able

#

Each of them is a subtoken?

serene scaffold Aug 9, 2024, 6:39 PM

#

Yes!
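A toy version of that splitting, as a sketch: BERT-style WordPiece tokenizers match the longest vocabulary entry from the left and mark continuation pieces with a `##` prefix. The vocabulary here is made up so that "unbreakable" splits the way described above; a real tokenizer's learned vocabulary may split it differently. In BERT-style models each resulting piece gets its own integer id.

```python
def wordpiece_split(word, vocab):
    """Greedy longest-match-first split of one word into wordpieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return ["[UNK]"]  # no piece fits: the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical vocabulary, chosen so the examples from the chat work.
vocab = {"un": 0, "##break": 1, "##able": 2, "test": 3, "##ing": 4, "[UNK]": 5}

pieces = wordpiece_split("unbreakable", vocab)
piece_ids = [vocab[p] for p in wordpiece_split("testing", vocab)]
```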

ocean pawn Aug 9, 2024, 6:39 PM

#

Is that correct? Thanks!

ocean pawn Aug 9, 2024, 6:39 PM

#

serene scaffold Yes!

Thanks!

unkempt apex Aug 9, 2024, 6:40 PM

#

yeah nice

ocean pawn Aug 9, 2024, 6:40 PM

#

Thanks for explaining this to me! And thanks for tolerating the stupid question!

unkempt apex Aug 9, 2024, 7:01 PM

#

so you put that model onto your website right?

#

that chatbot which tells about you?

#

yeah but it needs whole GPU clusters

#

then what is your monthly cost?

#

how? only ec2 is free right?

#

which one are you using?

#

1 year for free tier right?

#

so it is not completely free>

#

okay so that means I can play with llama on free tier? for completely free no additional plugins stuff require?

#

uhh ohh

#

heh, why total is 0.00?

#

100 dollars? for credit

#

okay

regal bronze Aug 9, 2024, 9:33 PM

#

Hey guys I got a question:

Before choosing the model, you have to prepare the data. How do you select which columns are important for the model?

#

I have like 79 columns

spare forum Aug 9, 2024, 10:11 PM

#

regal bronze Hey guys I got a question: Before choosing the model, you have to prepare the d...

that's kinda the hardest task, you have to explore and there is no general answer. you can for example drop columns that have too many missing values or low variance (it's obvious if the column has the same value every time), and sometimes a column just has low data quality. then there are stats: you look at associations between your target and the columns with correlations, anova... it depends

#

sometimes there are more practical aspects like how easy it is to get this variable or "does it make sense to make a model with this variable"

#

some test: correlation,anova, chi2

#

also make some visuals

#

some associations might be non linear and you have to do log transform to a variable etc... the tools goes on

#

pca

jaunty helm Aug 10, 2024, 7:20 AM

#

regal bronze Hey guys I got a question: Before choosing the model, you have to prepare the d...

best way is domain knowledge, but this is also the hardest obv
take ideas from stats as mentioned by gabigabgob, e.g. mutual information, (k)pca
use other models to assist, e.g. lasso, trees
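A first screening pass along the lines suggested above could look like this sketch. The data is synthetic and the 50% missing-value threshold is an arbitrary assumption; with 79 real columns the thresholds and the choice of association measure would need judgment.

```python
import numpy as np
import pandas as pd

# Synthetic data: one informative column, one noise column, one constant
# column, and one column that is mostly missing.
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
df = pd.DataFrame({
    "useful": signal,
    "noise": rng.normal(size=n),
    "constant": np.ones(n),
    "mostly_missing": np.where(rng.random(n) < 0.9, np.nan, 1.0),
    "target": 2.0 * signal + rng.normal(scale=0.1, size=n),
})

features = df.drop(columns="target")

# 1. Drop columns where more than half of the values are missing.
features = features.loc[:, features.isna().mean() <= 0.5]

# 2. Drop columns that hold a single unique value (zero variance).
features = features.loc[:, features.nunique() > 1]

# 3. Rank what is left by absolute linear correlation with the target.
ranking = features.corrwith(df["target"]).abs().sort_values(ascending=False)
```

Correlation only captures linear association, so columns that rank low here may still matter. Mutual information or a Lasso or tree model, as mentioned above, are the usual next steps.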

regal bronze Aug 10, 2024, 7:26 AM

#

jaunty helm best way is domain knowledge, but this is also the hardest obv take ideas from s...

Pca is a tool?

regal bronze Aug 10, 2024, 7:45 AM

#

think it will just be guessing one by one each column

#

Thanks guys 😁

verbal oar Aug 10, 2024, 8:05 AM

#

is RNN foundation for LLM?

#

I mean do I need to care about gru, lstm before transformers etc?

#

my focus is on nlp and text

spring field Aug 10, 2024, 8:22 AM

#

verbal oar is RNN foundation for LLM?

no, but it was a predecessor to transformers in the field of nlp

spring field Aug 10, 2024, 8:23 AM

#

verbal oar I mean do I need to care about gru, lstm before transformers etc?

I mean, wouldn't be a bad thing to know about them anyway

remote stream Aug 10, 2024, 8:52 AM

#

Guys i am getting an error while using transfer learning

#

Only instances of `keras.Layer` can be added to a Sequential model. Received: <tensorflow_hub.keras_layer.KerasLayer object at 0x00000280F44756D0> (of type <class 'tensorflow_hub.keras_layer.KerasLayer'>)

proper crag Aug 10, 2024, 9:08 AM

#

encoder = OneHotEncoder(sparse_output=False)
one_hot_encoding = encoder.fit_transform(data[['Old/New']])
encoded = pd.DataFrame(one_hot_encoding, columns=encoder.get_feature_names_out(['Old/New']))


combined_encoded = pd.concat([data.drop(['Property', 'Type', 'Old/New'], axis=1), encoded_df, encoded_df_two, encoded], axis=1) # Combine with the original data, dropping the original 'Property' & 'Type' column

# Table visualization
#pd.set_option('display.max_rows', 40)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

#sorted_data = data.sort_values(by=['Revenue'], ascending=True)

print(combined_encoded)

#

#

why is it returned as float instead of integer?

lapis sequoia Aug 10, 2024, 9:09 AM

#

remote stream ```Only instances of `keras.Layer` can be added to a Sequential model. Received:...

https://github.com/tensorflow/tensorflow/issues/63085#issuecomment-1972483237

GitHub

ValueError: Only instances of `keras.Layer` can be added to a Seque...

Issue type Bug Have you reproduced the bug with TensorFlow Nightly? No Source source TensorFlow version v2.15.0-rc1-8-g6887368d6d4 2.15.0 Custom code No OS platform and distribution Kaggle Mobile d...

remote stream Aug 10, 2024, 9:09 AM

#

ho

proper crag Aug 10, 2024, 9:10 AM

#

also, does the fact that it's returned as float instead of integer affect the model's prediction performance?

verbal oar Aug 10, 2024, 9:11 AM

#

ok I read it I need lstm and/or transformers for llm

#

so its base

#

ok I now read your response, ok so good to know

lyric furnace Aug 10, 2024, 10:23 AM

#

guys, can anyone suggest me a tutorial or book or a doc related to machine learning. it should cover the basic ML, THE problem is that IDK high school maths cause i am only 13. idk calculus and other maths stuff, I am just learning. so please suggest.

lapis sequoia Aug 10, 2024, 10:24 AM

#

start by coding, then move to simple frameworks like fast ai @lyric furnace

#

then you'll already have an idea if you wanna do the math part.

proper crag Aug 10, 2024, 10:25 AM

#

lyric furnace guys, can anyone suggest me a tutorial or book or a doc related to machine learn...

learn the language you want to use first, while learning linear algebra

lyric furnace Aug 10, 2024, 10:25 AM

#

linear algebra, i have basics of python

#

wth is linear algebra

proper crag Aug 10, 2024, 10:26 AM

#

to do ML you're dealing with data, so you have to do data cleansing, which involves data manipulation

lyric furnace Aug 10, 2024, 10:27 AM

#

pandas ?

proper crag Aug 10, 2024, 10:27 AM

#

understand how the system executes the code

lyric furnace Aug 10, 2024, 10:27 AM

#

i've learned that a bit

#

@proper crag sorry but I am a beginner IDK much about the programming world

proper crag Aug 10, 2024, 10:28 AM

#

tbh, at first i was like you, didn't know where to go, but just learn python as much as you can

#

while learning the math

lyric furnace Aug 10, 2024, 10:28 AM

#

proper crag tbh, at first i was like you, didn't know where to go, but just learn python as much as ...

thx bro

#

i've learned a bit on Khan Academy

#

[[1,1,0]
[0,1,0]]

I remember something like the above

serene grail Aug 10, 2024, 10:29 AM

#

Keep learning Python, keep learning math, both of those will serve you well

proper crag Aug 10, 2024, 10:29 AM

#

since you're here... i want to ask something

lyric furnace Aug 10, 2024, 10:30 AM

#

ahh'

proper crag Aug 10, 2024, 10:30 AM

#

proper crag ```py encoder = OneHotEncoder(sparse_output=False) one_hot_encoding = encoder.fi...

@final kiln

lyric furnace Aug 10, 2024, 10:30 AM

#

lol

#

i havent learn a degree so

#

hmm

proper crag Aug 10, 2024, 10:30 AM

#

#

i ve encoded them

#

but why is it in float? ...does whether it's in float or integer affect the model performance? @final kiln

lapis sequoia Aug 10, 2024, 10:31 AM

#

it can do both

lyric furnace Aug 10, 2024, 10:31 AM

#

@final kiln Thx alot bro, you helped me alot, thx !!!

lapis sequoia Aug 10, 2024, 10:31 AM

#

depending which side the vector is

#

i don't think that's got much to do

proper crag Aug 10, 2024, 10:33 AM

#

```py
encoder = OneHotEncoder(sparse_output=False)
one_hot_encoding = encoder.fit_transform(data[['Old/New']])
encoded = pd.DataFrame(one_hot_encoding, columns=encoder.get_feature_names_out(['Old/New']))

# Combine with the original data, dropping the original 'Property', 'Type' and 'Old/New' columns
combined_encoded = pd.concat([data.drop(['Property', 'Type', 'Old/New'], axis=1), encoded_df, encoded_df_two, encoded], axis=1)

# Table visualization
#pd.set_option('display.max_rows', 40)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

#sorted_data = data.sort_values(by=['Revenue'], ascending=True)

print(combined_encoded)
```

this is the code

lapis sequoia Aug 10, 2024, 10:33 AM

#

[1,2][2,3] => [1,3]
[2,3][3,1] => [2,1]

proper crag Aug 10, 2024, 10:34 AM

#

im using SK Learn

wooden sail Aug 10, 2024, 10:34 AM

#

in finite dimensions at least

proper crag Aug 10, 2024, 10:35 AM

#

Old/New... i've encoded them and it gives the result in float

wooden sail Aug 10, 2024, 10:35 AM

#

in infinite dims, the operation of taking the dual is not an involution

#

so it's not enough to just say something like "i'll just treat the dual space as a vector space and let the original vector space be its dual"

spare forum Aug 10, 2024, 10:38 AM

#

regal bronze think it will just be guessing one by one each column

Also it's cool to use a tree model with relatively low depth and visualize the decisions and check the used variables

proper crag Aug 10, 2024, 10:40 AM

#

in linear algebra

lapis sequoia Aug 10, 2024, 10:40 AM

#

wow ray tune seems powerful

proper crag Aug 10, 2024, 10:41 AM

#

what makes me struggle the most is scalars

#

i mean like how to satisfy vector x from the given 2 vectors

#

and differential calculus

lapis sequoia Aug 10, 2024, 10:53 AM

#

it's quite neat that kaggle gives you 2 gpus

pine heron Aug 10, 2024, 12:05 PM

#

Hello, everyone, this is my project, it allows you to easily train agents.

https://github.com/NoteDance/Note

GitHub

GitHub - NoteDance/Note: Machine learning library, Distributed trai...

Machine learning library, Distributed training, Deep learning, Reinforcement learning, Models, TensorFlow, PyTorch - NoteDance/Note

ocean pawn Aug 10, 2024, 12:59 PM

#

Is there any reason why pytorch is more popular than tensorflow/keras?

#

It looks like keras is easier to use

#

||And why is JAX not used that much? It's supposed to perform better than both of those, right?||

#

I see, but it looks like keras abstracts away the training loop with fit too, so even less code needed?

#

It might be, that makes sense

#

I know

#

Fair enough

#

Interesting

#

pred = model(X)
loss = loss_fn(pred, y)
# Backpropagation
loss.backward()
optimizer.step()
optimizer.zero_grad()

I don't understand: loss.backward() calculates the gradients, but how does optimizer.step() update the parameters of the model?

#

I've only tried jax before, and it's functional, so this is weird

#

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

#

Right, python is pass by reference, forgot that for a second

lapis sequoia Aug 10, 2024, 1:13 PM

#

keras still allows custom loops

ocean pawn Aug 10, 2024, 1:13 PM

#

Oh I see

#

I see, I was expecting something like w = w - learning_rate * w_grad, but it seems like pytorch doesn't do reassignment(?) often

lapis sequoia Aug 10, 2024, 1:15 PM

#

i think the main reason why torch is more popular is that it's easier to run

#

i.e has more hardware support, and less bugs of that sort

#

not bc of the api

ocean pawn Aug 10, 2024, 1:16 PM

#

lapis sequoia i.e has more hardware support, and less bugs of that sort

That makes more sense, the API is really similar

lapis sequoia Aug 10, 2024, 1:17 PM

#

you can see people stuck installing tf for months

ocean pawn Aug 10, 2024, 1:17 PM

#

lapis sequoia you can see people stuck installing tf for months

Yeah I've seen them in help channel

#

I understand, but I was surprised by the fact that in pytorch, when you call optimizer.step(), it automatically updates the parameters

#

I thought it would return new parameters, and then I'd update my model with them

#

Either way, if it works it works
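
For anyone puzzled by the same thing: the optimizer was handed `model.parameters()` when it was created, so it holds references to the very same parameter objects the model uses and can change them in place. The toy below is only a sketch of that idea in plain Python; `Param` and `TinySGD` are hypothetical names, not PyTorch classes, and the real optimizer works on tensors.

```python
class Param:
    """Hypothetical stand-in for a tensor that carries its own gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


class TinySGD:
    """Toy optimizer that keeps references to the model's Param objects."""

    def __init__(self, params, lr):
        self.params = list(params)  # references to the same objects, not copies
        self.lr = lr

    def step(self):
        for p in self.params:
            # In-place version of: w = w - learning_rate * w_grad
            p.value -= self.lr * p.grad

    def zero_grad(self):
        for p in self.params:
            p.grad = 0.0


w = Param(2.0)
optimizer = TinySGD([w], lr=0.25)

# Pretend the loss is w**2, so d(loss)/dw = 2 * w.
# A real backward() call would fill in .grad for us.
w.grad = 2 * w.value

optimizer.step()       # mutates w through the shared reference
optimizer.zero_grad()  # reset the gradient for the next iteration
print(w.value)  # 1.0
```

Nothing is returned from `step()`; the model sees the new value because both it and the optimizer point at the same object.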

lapis sequoia Aug 10, 2024, 1:20 PM

#

in tf weights are variables, and they can be mutated

ocean pawn Aug 10, 2024, 1:20 PM

#

Understandable, there are many layers with lots of neurons

#

That's the formula for regression

#

I was expecting something similar

lapis sequoia Aug 10, 2024, 1:20 PM

#

weight.update(-2) for example

#

they may be a class with some implemented behaviour

ocean pawn Aug 10, 2024, 1:21 PM

#

lapis sequoia they may be a class with some implemented behaviour

I suppose you can create your own class for optimizer? (Not that I can do that)

lapis sequoia Aug 10, 2024, 1:22 PM

#

i meant the weights themselves

ocean pawn Aug 10, 2024, 1:22 PM

#

Oh I know what you mean

#

Yeah python is pass by reference

lapis sequoia Aug 10, 2024, 1:22 PM

#

but yeah you can create or just extend it rather, since it's a large pile of inherited stuff

ocean pawn Aug 10, 2024, 1:22 PM

#

Thanks everyone @final kiln @lapis sequoia (apologies for pinging)

#

I guess I'll play with pytorch (and Jax) and see

lapis sequoia Aug 10, 2024, 2:02 PM

#

some of the current rationale is to map after batch i think this makes it faster, im unsure whether prefetch/cache order matters here.

#

one tricky thing with training using iterators is that you may run out of data, so sometimes it requires .repeat(n)

proper crag Aug 10, 2024, 2:06 PM

#

how do you handle outliers?

#

```py
data_quantile_1 = data['Revenue'].quantile(0.75)  # 31050500.0
data_quantile2 = data['Revenue'].quantile(0.25)   # 9021375
data_min = data['Revenue'].min()                  # 2336000
data_max = data['Revenue'].max()                  # 100083000
```

#

the comment is the output of the method

#

as you can all see, it has outliers
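
One common way to put a number on "has outliers" is the 1.5 × IQR rule that boxplots use. The sketch below just plugs in the four values quoted in the comments above; the 1.5 multiplier is a convention rather than anything from the chat, and `is_outlier` is a hypothetical helper.

```python
# Numbers copied from the comments on the 'Revenue' column above.
q1 = 9_021_375.0        # 25th percentile
q3 = 31_050_500.0       # 75th percentile
revenue_min = 2_336_000.0
revenue_max = 100_083_000.0

iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr    # anything below this is flagged
upper_fence = q3 + 1.5 * iqr    # anything above this is flagged


def is_outlier(x):
    """Flag a value that falls outside the 1.5 * IQR fences."""
    return x < lower_fence or x > upper_fence


print(iqr, lower_fence, upper_fence)  # 22029125.0 -24022312.5 64094187.5
print(is_outlier(revenue_min), is_outlier(revenue_max))  # False True
```

By this rule the maximum is flagged and the minimum is not. Whether to drop, cap, or log-transform flagged rows depends on the model and on whether the extreme values are real.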

fallow coyote Aug 10, 2024, 2:19 PM

#

would it be better to just use matplotlib or should i use it in combination with seaborn for ease of mind? Ive used matplotlib by itself and it pissed me off

left tartan Aug 10, 2024, 2:21 PM

#

fallow coyote would it be better to just use matplotlib or should i use it in combination with...

I prefer plotly. I think it's a matter of preference

fallow coyote Aug 10, 2024, 2:25 PM

#

isnt matplotlib, from what ive read up, more extensive compared to other modules of its type?

#

Ill probably move onto other sorts of graphing modules but for now, I want to stick with whats most commonly used and has the most features

spare forum Aug 10, 2024, 2:46 PM

#

They are both widely used

proper crag Aug 10, 2024, 2:47 PM

#

proper crag how do you handle outliers?

what am i supposed to do?

#

i want to use the data for Logistic Regression

spare forum Aug 10, 2024, 2:48 PM

#

Matplotlib is mainly better for scientific graphic and simple/sober graphics, with plotly its easier to do sexier graphics

spare forum Aug 10, 2024, 2:49 PM

#

proper crag how do you handle outliers?

Is that the target?

proper crag Aug 10, 2024, 2:50 PM

#

spare forum Is that the target?

i have encoded all the object-type columns... okay, i'm an actual beginner... i don't know anything regarding feature engineering

spare forum Aug 10, 2024, 2:50 PM

#

It's easier to see what's going on with a boxplot tbh

proper crag Aug 10, 2024, 2:51 PM

#

spare forum Is that the target?

okay what is this target you're talking about?

#

pls bear with me

#

i'm a complete beginner in ML

#

however, i have all the object-type/string data encoded

spare forum Aug 10, 2024, 2:53 PM

#

Em, if you want to build an ML model, that means you want to determine a target variable from other variables. Is revenue what you want to determine?

#

(might be a good idea to read about basics)

proper crag Aug 10, 2024, 2:55 PM

#

spare forum Em, if you want to build an ML model, that means you want to determine a target...

so i heard about feature engineering and that outliers could affect the model performance

#

so, what do i need to do for that step?

#

then i'll read up on whatever i need; it's just this step that i need to get through before i start coding my model

spare forum Aug 10, 2024, 2:59 PM

#

I think you should really start with the ML 101 on really easy data; you should know what the target is and what the "explanatory columns" (features) mean. I think that would be a slightly better start

proper crag Aug 10, 2024, 2:59 PM

#

proper crag so i heard about feature engineering and that outliers could affect the model perfo...

because the data has outliers

#

i'm using a logistic regression model because i want to predict the pattern, so that i can then classify what the factors influencing market competitiveness and consumer interest might be

proper crag Aug 10, 2024, 3:04 PM

#

spare forum Is that the target?

yeah, it's the target... the revenue has outliers and is the target, since i understand the revenue column as the output for the inputs... that's what target means, right?

ocean pawn Aug 10, 2024, 8:21 PM

#

class CNN(eqx.Module):
    layers: list

    def __init__(self, key):
        key1, key2, key3, key4 = jax.random.split(key, 4)
        # Standard CNN setup: convolutional layer, followed by flattening,
        # with a small MLP on top.
        self.layers = [
            eqx.nn.Conv2d(1, 3, kernel_size=4, key=key1),
            eqx.nn.MaxPool2d(kernel_size=2),
            jax.nn.relu,
            jnp.ravel,
            eqx.nn.Linear(1728, 512, key=key2),
            jax.nn.sigmoid,
            eqx.nn.Linear(512, 64, key=key3),
            jax.nn.relu,
            eqx.nn.Linear(64, 10, key=key4),
            jax.nn.log_softmax,
        ]

    def __call__(self, x: Float[Array, "1 28 28"]) -> Float[Array, "10"]:
        for layer in self.layers:
            x = layer(x)
        return x

Does anyone know why it's eqx.nn.Linear(1728, 512, key=key2)? 512 is more or less arbitrary, but does anyone know how they figure out the size of the activation when it's ravel(ed)? (Where does 1728 come from?)

whole pendant Aug 10, 2024, 8:58 PM

#

hey whats a good book for math for data science

#

quick i need to order

#

with explanations and stuff

#

and solved examples

unkempt wigeon Aug 10, 2024, 9:18 PM

#

may i ask a question?

odd meteor Aug 10, 2024, 9:56 PM

#

whole pendant hey whats a good book for math for data science

1. The book by Ian Goodfellow
https://www.deeplearningbook.org/
2. Statistical Learning
https://www.statlearning.com/
3. Mathematics for ML book
https://mml-book.github.io

Check pinned post for more

An Introduction to Statistical Learning

Mathematics for Machine Learning

whole pendant Aug 10, 2024, 10:01 PM

#

odd meteor 1. The book by Ian Goodfellow https://www.deeplearningbook.org/ 2. Statistical ...

thankyou

odd meteor Aug 10, 2024, 10:16 PM

#

ocean pawn ```py class CNN(eqx.Module): layers: list def __init__(self, key): ...

It's gotten from the image you're working with.

To build intuition, imagine you're working with a grayscale image of 14 x 14 pixels (that is, a matrix of pixels with 14 rows and 14 columns).

Now, when we shrink (flatten) this image (matrix of pixels) to a row vector, you'll get a 196-dimensional row vector (14 x 14 pixels = 196 pixels)

This is what they calculated on the image you're working with to arrive at 1728, which was then passed to the 1st hidden layer.

odd meteor Aug 10, 2024, 10:27 PM

#

ocean pawn ```py class CNN(eqx.Module): layers: list def __init__(self, key): ...

The activation function doesn't have a size, I think you mean, the configuration of the hidden layers (number_of_input_features, num_of_output_features)

ocean pawn Aug 10, 2024, 10:30 PM

#

odd meteor It's gotten from the image you're working with. To build intuition, imagine yo...

But why is it 1728?

ocean pawn Aug 10, 2024, 10:31 PM

#

odd meteor The activation function doesn't have a size, I think you mean, the configuration...

Oh, my apologies

ocean pawn Aug 10, 2024, 10:32 PM

#

odd meteor It's gotten from the image you're working with. To build intuition, imagine yo...

11818 is the image, the "channel" (?) is 1 because it's grayscale

#

But I don't see where 1728 is derived from

#

What am I missing?

#

Thanks!

unkempt wigeon Aug 10, 2024, 10:32 PM

#

may i ask a question?

ocean pawn Aug 10, 2024, 10:33 PM

#

unkempt wigeon may i ask a question?

Don't ask to ask, just ask pithink

unkempt wigeon Aug 10, 2024, 10:34 PM

#

how could i create a neural network?

lapis sequoia Aug 10, 2024, 10:34 PM

#

can anyone help me with python?

severe hare Aug 10, 2024, 10:35 PM

#

unkempt wigeon how could i create a neural network?

Gotta pick what kind first (Convolutional Neural Network, Generative, Recurrent NN)

#

https://towardsdatascience.com/first-neural-network-for-beginners-explained-with-code-4cfd37e06eaf

Medium

First neural network for beginners explained (with code)

Understand and create a Perceptron

odd meteor Aug 10, 2024, 10:35 PM

#

ocean pawn But why is it 1728?

Once you've gotten the dimension of the image you're working with, you can compute that value.

number of channel x image height x image width.

In my cooked up explanation, we assumed we're working with a grayscale image with 14 x 14 dimension.

1 (channel) x 14 (height) x 14 (width) = 196

trim saddle Aug 10, 2024, 10:35 PM

#

unkempt wigeon how could i create a neural network?

https://www.youtube.com/watch?v=VMj-3S1tku0&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
this is one of the best tutorials to learn

YouTube

Andrej Karpathy

The spelled-out intro to neural networks and backpropagation: build...

This is the most step-by-step spelled-out explanation of backpropagation and training of neural networks. It only assumes basic knowledge of Python and a vague recollection of calculus from high school.

Links:

micrograd on github: https://github.com/karpathy/micrograd
jupyter notebooks I built in this video: https://github.com/karpathy/nn-z...

ocean pawn Aug 10, 2024, 10:36 PM

#

odd meteor Once you've gotten the dimension of the image you're working with, you can compu...

I understand 196 but why 1728

odd meteor Aug 10, 2024, 10:36 PM

#

ocean pawn I understand 196 but why 1728

What is the dimension of your image data?

#

Check the shape

ocean pawn Aug 10, 2024, 10:37 PM

#

odd meteor What is the dimension of your image data?

It's a MINST data set

#

Let me check

#

*MNIST

#

28*28*1

#

784

sturdy field Aug 10, 2024, 10:39 PM

#

https://learn.deeplearning.ai/courses/ai-python-for-beginners/lesson/1/introduction

DeepLearning.AI - Learning Platform

AI Python for Beginners: Basics of AI Python Coding - DeepLearning.AI

Enroll in 'AI Python for Beginners' by DeepLearning.AI and learn Python programming with AI assistance. Gain skills writing, testing, and debugging code efficiently, and create real-world AI applications.

odd meteor Aug 10, 2024, 10:39 PM

#

ocean pawn *MNIST

Yeah MNIST is 28x28, it should be 784 not 1728. Or am I missing something 👀

ocean pawn Aug 10, 2024, 10:40 PM

#

odd meteor Yeah MNIST is 28x28, it should be 784 not 1728. Or am I missing something 👀

It's convolutional, and there's down sampling (?) with MaxPool2d maybe?

unkempt wigeon Aug 10, 2024, 10:40 PM

#

severe hare Gotta pick what kind first (Convolutional Neural Network, Generative, Recurrent...

which kind can learn to find people in a photo or from a live feed??

#

my apologies

severe hare Aug 10, 2024, 10:45 PM

#

unkempt wigeon which kind can learn to find people in a photo or from a live feed??

Called Facial Recognition. Just a piece of Computer Vision. Learn Computer Vision and you will get to it after a couple of chapters of applied study

severe hare Aug 10, 2024, 10:48 PM

#

ocean pawn It's convolutional, and there's down sampling (?) with MaxPool2d maybe?

My intuition is that it's your MLP outputs, but I'm kind of doing a few things at once

odd meteor Aug 10, 2024, 11:02 PM

#

ocean pawn It's convolutional, and there's down sampling (?) with MaxPool2d maybe?

Oh I see... 1728 is the result of flattening the output of the final feature map after the convolution and max pooling layers, leading to a total of 1728 elements which was then passed to the 1st hidden layer in the MLP.
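
For reference, the 1728 can be reproduced with plain arithmetic. The sketch below assumes no padding, a convolution stride of 1, and a pooling stride of 1 (which, to my knowledge, is the Equinox default for `MaxPool2d` when no stride is passed); `out_size` is a hypothetical helper, not an Equinox function.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output length along one axis for a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1


height = width = 28   # MNIST image with 1 channel
conv_channels = 3     # eqx.nn.Conv2d(1, 3, kernel_size=4)

after_conv = out_size(height, kernel=4)      # 28 -> 25
after_pool = out_size(after_conv, kernel=2)  # 25 -> 24 with stride 1

# jnp.ravel flattens (channels, height, width) into one vector.
flattened = conv_channels * after_pool * after_pool
print(after_conv, after_pool, flattened)  # 25 24 1728
```

So the feature map entering `jnp.ravel` would be 3 x 24 x 24, which matches the 1728 inputs of the first `Linear` layer. With a pooling stride of 2 the number would be different, so the stride assumption matters.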

shut shoal Aug 11, 2024, 12:14 AM

#

I want to make sure I'm getting RL down correctly so I'm going to give a general description of RL and if I'm missing something or it's incorrect would you guys correct me. Thank you in advance.

RL uses an agent to take some action in an environment based upon that state that it's in. The agent will either receive a positive reward or negative reward and the agent wants to obtain the highest reward possible. 

The foundational model of RL is called the Markov Decision Process which contains states, actions, rewards, the transition probabilities between states, and the policy. 

RL can be broken up into two categories, model-based and model-free. Model-based uses the environment to make predictions (policy iteration and value iteration or non-linear dynamics) based upon the environment. Model-free uses 'trial-and-error' to compute a gradient and if you know the gradient you can use some mathematical formula, otherwise, you'll use gradient-free methods mostly and they're broken up into either value-based or policy-based methods.

Value-based methods take a value function and iterate on it (value iteration), using the Bellman equation to help determine the optimal policy (policy iteration). Policy-based methods just take the next best action with just one step.

#

I've been getting confused on the value-based and policy-based methods the most. I'm not sure at all if my definition is correct on those.

past meteor Aug 11, 2024, 12:18 AM

#

shut shoal I want to make sure I'm getting RL down correctly so I'm going to give a general...

There's some slight mistakes but you're largely on the right track

#

For instance, model based vs model free isn't the only way you can categorize RL algorithms. There are many.

#

For model based algorithms you can be more specific, they specifically try to make a world model and use that model, by means of unrolling to find the best actions

severe hare Aug 11, 2024, 12:22 AM

#

Supervised RL models actively influence their own data distribution.

past meteor Aug 11, 2024, 12:22 AM

#

The model-free part can be more specific too. I'd definitely talk about the distinction between monte Carlo and temporal difference learning.

If I remember correctly value and policy based was basically if you're learning Q values or V.

Finally, I'd definitely spend some time talking about on policy and off policy.

severe hare Aug 11, 2024, 12:23 AM

#

^ I'm just saying it matters for an RL model if it's supervised or unsupervised.

past meteor Aug 11, 2024, 12:24 AM

#

Unsupervised ones were the algorithms that mostly seek novelty yeah?

#

That use some kind of novelty signal as the reward in lieu of actual rewards

#

Or wdym exactly @severe hare

shut shoal Aug 11, 2024, 12:25 AM

#

past meteor There's some slight mistakes but you're largely on the right track

Gotcha. Is there anything wrong with my value-based and policy-based def at the bottom?

past meteor Aug 11, 2024, 12:26 AM

#

shut shoal Gotcha. Is there anything wrong with my value-based and policy-based def at the ...

Yes, it's not as complicated as you present it

shut shoal Aug 11, 2024, 12:26 AM

#

past meteor The model-free part can be more specific too. I'd definitely talk about the dist...

I should learn MC and TD before learning about value-based and policy-based methods?

past meteor Aug 11, 2024, 12:28 AM

#

Value based simply learns the value of each state, V(S). You can easily derive the behaviour policy from that. Take the action that leads to the highest V(S+1). Policy based learns Q(S, A), it learns the value of a state action pair. It's also trivial to find the behaviour policy from this

severe hare Aug 11, 2024, 12:29 AM

#

past meteor Or wdym exactly @severe hare

#

from here: https://bair.berkeley.edu/blog/2021/12/15/unsupervised-rl/#:~:text=The%20main%20difference%20is%20that,through%20a%20self-supervised%20task.

The Berkeley Artificial Intelligence Research Blog

The Unsupervised Reinforcement Learning Benchmark

The BAIR Blog

past meteor Aug 11, 2024, 12:29 AM

#

severe hare

Yup, I read a lot about this in the context of offline RL

severe hare Aug 11, 2024, 12:30 AM

#

past meteor That use some kind of novelty signal as the reward in lieu of actual rewards

Yes, it won't use extrinsic rewards; that's correct

past meteor Aug 11, 2024, 12:30 AM

#

But it's been a while and I'm rusty

severe hare Aug 11, 2024, 12:30 AM

#

past meteor But it's been a while and I'm rusty

Your knowledge is good.

severe hare Aug 11, 2024, 12:32 AM

#

shut shoal I should learn MC and TD before learning about value-based and policy-based meth...

I've really never done Temporal Difference; maybe I should try that. MCs and Bayesians yeah

#

You really want experience with both MCs, because there are two

past meteor Aug 11, 2024, 12:32 AM

#

Aren't most popular algorithms derivatives of TD methods?

#

Q learning etc

shut shoal Aug 11, 2024, 12:33 AM

#

past meteor Value based simply learns the value of each state, V(S). You can easily derive t...

So the only real difference between those two is that one calculates it alongside the action while the other one doesnt. Otherwise it'll still get the max reward.

past meteor Aug 11, 2024, 12:34 AM

#

shut shoal So the only real difference between those two is that one calculates it alongsid...

I actually made a mistake here sorry, it's been a while

shut shoal Aug 11, 2024, 12:35 AM

#

severe hare I've really never done Temporal Difference; maybe I should try that. MCs and Bay...

Both MCs?

past meteor Aug 11, 2024, 12:35 AM

#

Policy based is stuff like policy gradient, it doesn't learn Q or V at all.

shut shoal Aug 11, 2024, 12:35 AM

#

Oh so it basically looks at the gradient and determines its next move like that?

past meteor Aug 11, 2024, 12:35 AM

#

https://stats.stackexchange.com/questions/407230/what-is-the-difference-between-policy-based-on-policy-value-based-off-policy

Policy-based vs. Value-based In Policy-based methods we explicitly build a representation of a policy (mapping π:s→a) and keep it in memory during learning.

In Value-based we don't store any explicit policy, only a value function. The policy is here implicit and can be derived directly from the value function (pick the action with the best value).

past meteor Aug 11, 2024, 12:37 AM

#

shut shoal Oh so it basically looks at the gradient and determines its next move like that?

Think about it more abstractly, forget about gradients for a second

severe hare Aug 11, 2024, 12:37 AM

#

shut shoal Both MCs?

Monte Carlo methods and Markov chains, which can also be paired together

past meteor Aug 11, 2024, 12:37 AM

#

You have a function π: s → a, basically something that maps a state to an action

#

Policy based methods are able to update this function directly

severe hare Aug 11, 2024, 12:38 AM

#

Just to keep it confusing; there is such a thing as the Markov Chain Monte Carlo: that combines them
https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo

Markov chain Monte Carlo

In statistics, Markov chain Monte Carlo (MCMC) is a class of algorithms used to draw samples from a probability distribution. Given a probability distribution, one can construct a Markov chain whose elements' distribution approximates it – that is, the Markov chain's equilibrium distribution matches the target distribution. The more steps that a...

past meteor Aug 11, 2024, 12:38 AM

#

The others update Q or V and simply derive π: s → a from that

#

Make sense? @shut shoal

shut shoal Aug 11, 2024, 12:39 AM

#

OHHH

#

Basically policy-based only looks at the current state to compute an action while value-based looks at values to get some action based on a state.

past meteor Aug 11, 2024, 12:40 AM

#

Well, maybe π: s → a takes into account future states, we don't know that

#

All we know is that it doesn't need to estimate the value of states or state action pairs

shut shoal Aug 11, 2024, 12:41 AM

#

Oh gotcha

#

Also from what I'm understanding, the value-based functions determine the values (Q or V) by using the Bellman equation.

past meteor Aug 11, 2024, 12:45 AM

#

Yup, and to be sure just look at it this way: the value of a state is the immediate reward plus the discounted value of the future states

#

And you can expand the latter term etc.
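
The "immediate reward plus discounted value of what comes next" idea can be sketched as value iteration on a toy problem. Everything here is made up for illustration: a 4-state chain, a reward of 1 for reaching the right end, and a discount of 0.5. The last lines show how a value-based method derives the policy from the values rather than storing it.

```python
GAMMA = 0.5                          # discount factor (hypothetical choice)
STATES = [0, 1, 2, 3]                # state 3 is terminal
ACTIONS = {"left": -1, "right": +1}


def step(state, action):
    """Deterministic toy dynamics: reward 1 for reaching state 3, else 0."""
    nxt = min(max(state + ACTIONS[action], 0), 3)
    reward = 1.0 if nxt == 3 else 0.0
    return nxt, reward


def q(values, state, action):
    """Immediate reward plus discounted value of the next state."""
    nxt, reward = step(state, action)
    return reward + GAMMA * values[nxt]


# Value iteration: repeatedly apply the Bellman optimality backup.
V = {s: 0.0 for s in STATES}
for _ in range(50):
    for s in STATES[:-1]:            # the terminal state keeps value 0
        V[s] = max(q(V, s, a) for a in ACTIONS)

# The policy is implicit: pick the action with the best backed-up value.
policy = {s: max(ACTIONS, key=lambda a: q(V, s, a)) for s in STATES[:-1]}

print(V)       # {0: 0.25, 1: 0.5, 2: 1.0, 3: 0.0}
print(policy)  # {0: 'right', 1: 'right', 2: 'right'}
```

A policy-based method would instead keep and adjust the `policy` mapping directly, without needing `V` at all.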

shut shoal Aug 11, 2024, 12:46 AM

#

This makes much more sense. Thank you @past meteor @severe hare

severe hare Aug 11, 2024, 12:47 AM

#

I think of it that, the computer wants to be 'led to' a solution for the value of Q, even if your model never reaches it

severe hare Aug 11, 2024, 12:50 AM

#

shut shoal This makes much more sense. Thank you @past meteor @severe hare

Glad to help

past meteor Aug 11, 2024, 12:52 AM

#

Same

#

Try implementing as many of these algorithms as possible. The basic ones rarely exceed 25 loc and in my experience those small experiments teach you a lot

faint quail Aug 11, 2024, 2:33 AM

#

Thoughts?
https://github.com/TheonlyIcebear/Neural-Net-Framework

GitHub

GitHub - TheonlyIcebear/Neural-Net-Framework: I custom library I ma...

I custom library I made for training neural networks from scratch, using numpy and scipy - TheonlyIcebear/Neural-Net-Framework

serene scaffold Aug 11, 2024, 2:36 AM

#

faint quail Thoughts? https://github.com/TheonlyIcebear/Neural-Net-Framework

Can you tell us more? Why would one use this instead of pytorch?

#

I guess the answer is "no one, this is just a demonstration of concept"

faint quail Aug 11, 2024, 2:41 AM

#

serene scaffold Can you tell us more? Why would one use this instead of pytorch?

simple answer dont

#

#

its probably way slower and doesnt have many features like pytorch

#

from utils.layers import *
from utils.schedulers import *
from utils.network import Network
from utils.optimizers import Adam
from utils.functions import Activations, Loss
import matplotlib.pyplot as plt
import numpy as np, pickle, time

if __name__ == "__main__":
    model = [
        Input(2),
        Dense(3),
        Activation("lrelu"),
        Dense(2),
        Activation("softmax"),
    ]

    print(model)

    network = Network(model, loss_function="cross_entropy", optimizer=Adam(momentum = 0.9, beta_constant = 0.99))
    network.compile()
    
    training_percent = 1
    batch_size = 4

    save_file = 'model-training-data.json'

    xdata = [[i % 2, i // 2] for i in range(4)]
    ydata = [[(i % 2) ^ (i // 2), 1 - ((i % 2) ^ (i // 2))] for i in range(4)]

    costs = []
    plt.ion()

    start_time = time.perf_counter()

    for idx, cost in enumerate(network.fit(xdata, ydata, learning_rate=0.01, batch_size = batch_size, epochs = 1000, threads=4)):
        if idx % 10 == 0:  # save a checkpoint every 10th iteration
            save_data = network.save()

            # network = Network()
            # network.load(save_data)

        end_time = time.perf_counter()

        print(end_time - start_time, "time")

        costs.append(cost)

        print(cost)

        plt.plot(np.arange(len(costs)) * (batch_size / (len(xdata) * training_percent)), costs, label='training')

        plt.legend()
        plt.draw()
        plt.pause(0.1)
        plt.clf()

        start_time = time.perf_counter()

heres an xor solver using it

proper crag Aug 11, 2024, 3:35 AM

#

do i need to identify outliers by plotting the data points from the minimum to the maximum value, and should i use the actual data points themselves or the frequency (count) of those points for the y-axis in my visualization or analysis?
sorry if i'm overthinking this and making it more complex than it is

#

i want to use SVR for my dataset

#

and this is how my dataset looks like

Screenshot_2024-08-11_at_10.48.46_AM.png

unkempt wigeon Aug 11, 2024, 4:17 AM

#

is it possible to make an iterative language model ?

half lintel Aug 11, 2024, 4:38 AM

#

If I have a dataframe with columns (say): A,B,C,status
And I want to group-by A,B,C with a new column saying "number of times status=X" and "total number of items"

Feels like something .groupby().agg() <=== not sure what to put here
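
One way to fill in that `.agg()` is pandas named aggregation. This is a minimal sketch on a made-up frame; the status value "X" and the output column names `n_status_x` and `total` are placeholders.

```python
import pandas as pd

# Hypothetical data with the column layout from the question.
df = pd.DataFrame({
    "A": [1, 1, 1, 2],
    "B": ["u", "u", "u", "v"],
    "C": [0, 0, 0, 1],
    "status": ["X", "Y", "X", "Y"],
})

summary = (
    df.assign(is_x=df["status"].eq("X"))       # boolean helper column
      .groupby(["A", "B", "C"], as_index=False)
      .agg(
          n_status_x=("is_x", "sum"),          # True counts as 1 when summed
          total=("status", "size"),            # number of rows in the group
      )
)
print(summary)
```

Summing a boolean column counts the matches, and `size` counts every row in the group, including rows where `status` is missing.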

serene scaffold Aug 11, 2024, 5:41 AM

#

unkempt wigeon is it possible to make an iterative language model ?

What would it mean for it to be iterative

faint quail Aug 11, 2024, 5:52 AM

#

do you mean like recurrent? like it plugs the output of the model back into the model

unkempt wigeon Aug 11, 2024, 5:59 AM

#

i want it to teach itself the dictionary after i nudge it forward a bit

#

im sorry

craggy patio Aug 11, 2024, 6:36 AM

#

Hey so I am currently trying to use a transformer to predict the next human generated "random" number from 1-100 inclusive. I'll drop some info about it and see if anyone has any suggestions on how to potentially improve it.

Transformer:

lr: 1e-3
L2: 1e-6
dropout: 0.2
feed forward size: 32
Embed size: 32
Attention heads: 7
num classes: 100
num features 43

features:
number itself
number mod 2
number mod 3
number mod 5
number mod 10
number of divisors
Digit sum
ranking in occurrence
each digit as its own feature (2 digits. We treat 100 as 99 which ik isnt the best but is better than having an optional feature)
x3 (We do encode the previous number and the number even before that with all the same features into this number as well)
Plus an additional difference feature for the numbers preceding it (2 features)
And also an additional quotient feature for the numbers preceding (2 more features)

All of these values are normalized correctly I ensured as well

Lmk if there is anything I can do better

hollow night Aug 11, 2024, 7:05 AM

#

Hi everyone! Hope you all are doing great today.
I am a beginner in Python and graduated high school this year. I am having a lot of difficulties in my Python learning journey. My elder brother, who is in his last year of college and is a web developer, guided me to explore the field of machine learning and recommended the Machine Learning Specialization course by Andrew Ng (Stanford University) on Coursera. Since it consists of many difficult concepts like linear regression, gradient descent, supervised learning, etc., I found the course quite tough and challenging and couldn't understand much. I have asked this server for guidance many times, but they usually respond as if I were an advanced programmer.
Can you please guide me step by step? What should I learn, and from where?

I have also started exploring Python libraries like Numpy and pandas.

warped shale Aug 11, 2024, 7:26 AM

#

hollow night Hi everyone! Hope you all are doing great today. I am a beginner in Python and ...

If you are already good with python, like as in understanding simple concepts well enough, you should revise some of the mathematical concepts u have learnt in highschool like linear algebra, probability and statistics and familiarise urself with concepts like differentiation and integration. You should also get comfortable with data handling, for example understanding numpy and pandas. Datasets are a core part so i would say you should experiment with them (kaggle is one of my fav platforms for datasets), practice loading and cleaning them, experiment with augmentation. After all these you should start with the basic concepts of ml, use scikit-learn at first then move on to pytorch or tensorflow, start experimenting with the mnist dataset. And like your brother said, Andrew Ng's course is a great source for understanding all the fundamental concepts.
Start with simple networks like cnns and rnns then move on to more advanced topics like rl, GANs, hybrids or NLP. Start a github repo (for documenting ur progress), join communities like stackoverflow and r/MachineLearning, and most important of all participate in competitions (competitions are held on sites like kaggle, they can range from small to large)

#

////

Btw someone help me with this: My training of a multiclass img model is done, validation is done too got an f1 of 85(20- test, 80 - train). However I want to test the model on another dataset, I was thinking of using the 2017, 2018 or the ham dataset but later I found out that the isic2019 is an extension that also contains all the imgs of all those sets. So which dataset should I use for another test?

I cant find one on Google and I have been searching for too long

For context: the isic2019 is an imbalanced dataset with 25k images for dermatology (also known as skin related issues). The model is an ensemble hybrid rf

warped shale Aug 11, 2024, 7:38 AM

#

hollow night Hi everyone! Hope you all are doing great today. I am a beginner in Python and ...

Btw forgot to mention, please learn how to handle imbalance, its really important. I regret not learning it before: accuracy can be significantly impacted by imbalance
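
A minimal sketch of why imbalance matters for accuracy, using made-up labels (95 negatives, 5 positives) and a hypothetical "model" that always predicts the majority class. The numbers are purely illustrative and are not from anyone's dataset in this chat.

```python
# Hypothetical, heavily imbalanced labels: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A "model" that ignores its input and always predicts the majority class.
y_pred = [0] * 100

# Accuracy: fraction of predictions that match the label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the minority class: fraction of real positives that were found.
true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_pos / sum(t == 1 for t in y_true)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```

Accuracy looks high here even though no positive is ever found, which is why per-class metrics such as recall or F1 are usually checked on imbalanced data.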

hollow night Aug 11, 2024, 7:40 AM

#

So should I quit Andrew Ng course for now or should continue with these other stuffs

warped shale Aug 11, 2024, 7:40 AM

#

Well, if you believe your still not ready for it, yeah sure

#

But that course is really good, so I recommend u follow it

#

After you are more confident ofc

ocean pawn Aug 11, 2024, 8:59 AM

#

odd meteor Oh I see... 1728 is the result of flattening the output of the final feature map...

But I want to know how it's calculated, I've tried different calculations but I can never get that specific value

toxic mortar Aug 11, 2024, 10:41 AM

#

lapis sequoia Aug 11, 2024, 10:48 AM

#

ocean pawn ```py class CNN(eqx.Module): layers: list def __init__(self, key): ...

24*24*3

lapis sequoia Aug 11, 2024, 11:16 AM

#

In this case (W-K) + 1 gives you the result; (width and kernel size.)
first it's 28-4 + 1 = 25
then its 25 -2 +1=24
then goes 24*24*3
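
The arithmetic above can be sketched as a small helper. The layer sizes (28 wide input, kernel 4, then a window of 2, 3 output channels, stride 1, no padding) are taken from the messages here; the function name `conv_output_size` is made up for illustration.

```python
def conv_output_size(width: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width of a convolution or pooling window along one dimension."""
    return (width + 2 * padding - kernel) // stride + 1


after_conv = conv_output_size(28, 4)          # 28 - 4 + 1 = 25
after_pool = conv_output_size(after_conv, 2)  # 25 - 2 + 1 = 24
flattened = after_pool * after_pool * 3       # height * width * channels

print(after_conv, after_pool, flattened)
```

With stride 1 and no padding the general formula reduces to (W - K) + 1, which is the shortcut used above.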

lapis sequoia Aug 11, 2024, 11:34 AM

#

may help: https://cs231n.github.io/convolutional-networks/
feel free to ask

CS231n Convolutional Neural Networks for Visual Recognition

Course materials and notes for Stanford class CS231n: Convolutional Neural Networks for Visual Recognition.

toxic mortar Aug 11, 2024, 12:24 PM

#

What do you mean?

proper crag Aug 11, 2024, 12:32 PM

#

how to find outlier in a dataset ?

#

im asking this bcuz i wan to find outlier

#

in my dataset which i got from kaggle

#

do i need to identify outliers by distributing data points from the minimum to the maximum value, and whether to use the actual data points themselves or to use the frequency (count) of those points for the y-axis in my visualization or analysis ?
im srry if i'm perhaps ovethink that its bcomes so complex when it isnt

toxic mortar Aug 11, 2024, 12:35 PM

#

does this also applies when you train a model across epochs?

proper crag Aug 11, 2024, 12:35 PM

#

im not asking in technical perspective, rather in perspective of analysis of how it might efect the model performance...i alr have target column and 2 features

toxic mortar Aug 11, 2024, 12:36 PM

#

I mean the memory accumulation, cause I see that you mean the jupyter saves the variables in the memory

#

Until you explicitly free them

toxic mortar Aug 11, 2024, 12:37 PM

#

proper crag do i need to identify outliers by distributing data points from the minimum to t...

Depending on the concrete dataset

#

One common approach is to do dimensionality reduction and try to plot it within 2d/3d

#

https://umap-learn.readthedocs.io/en/latest/

#

Yeah, I get what you mean. But would you call that pitfall or just a feature 😄

proper crag Aug 11, 2024, 12:44 PM

#

proper crag im not asking in technical perspective, rather in perspective of analysis of how...

the features is originally a categorical column but i already encoded them

toxic mortar Aug 11, 2024, 12:45 PM

#

But this is school example for gc

#

You lose reference to it - > gc should be activated

#

ok that means it is unfixable

#

well knowing that why would you use car for floating on water, when by default it is not intended for that use-case?

#

okay you can use for EDA

#

but create a script for training, no

#

right

#

people are spoiled

proper crag Aug 11, 2024, 1:18 PM

#

118 rows x 7 columns is considered small?

spare forum Aug 11, 2024, 1:23 PM

#

Yes

#

dice_question this is a bold quote

#

Y notebook are just a sandbox

rigid timber Aug 11, 2024, 2:02 PM

#

I trained an ML model of my own that classifies brain tumor with about 92% accuracy. I unfortunately am not sure how I can integrate the .h5 file of the trained model in a simple web application or desktop application where I can upload an image and the model classifies the tumor based on the scan image. Please give me a step by step guide on how I can do that

spare forum Aug 11, 2024, 2:05 PM

#

Streamlit is kinda easy

#

(Web)

#

Didn't read desktop

proper crag Aug 11, 2024, 2:15 PM

#

what is the best solution to handle with outlier

#

when the target column which hv outlier?

spare forum Aug 11, 2024, 2:24 PM

#

How extreme are the outliers, how many outliers are there, are you using a robust model, does it make sense to have such outliers in the context?
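
One rough way to answer the "how many, how extreme" part is the common 1.5 * IQR rule. This is a generic rule of thumb and not something suggested in this chat; the `target` values and the helper name `iqr_outliers` are invented for the sketch.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical target column with one obviously extreme value.
target = [10, 12, 11, 13, 12, 11, 10, 95]
outliers = iqr_outliers(target)

print(outliers, len(outliers) / len(target))
```

Whether flagged points should be dropped, clipped, or kept still depends on whether they make sense in context.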

rigid timber Aug 11, 2024, 2:35 PM

#

spare forum Streamlit is kinda easy

how so

spare forum Aug 11, 2024, 2:42 PM

#

Is a library to do basic python data app

left tartan Aug 11, 2024, 2:49 PM

#

There's some best practices around managing notebooks, and ensuring reproducibility. Such as only committing stripped notebooks, using something like papermill to populate notebooks for 'production' use, etc. Some devs local notebook is just a sandbox.

lapis sequoia Aug 11, 2024, 2:56 PM

#

rigid timber I trained an ML model of my own that classifies brain tumor with about 92% accur...

you can convert models with onnx and use them on the web, but i'd recommend models < 50Mb, if they run on the client.

proper crag Aug 11, 2024, 3:02 PM

#

spare forum How extreme are the outliers, how much outliers, are you using a robust model, d...

i decide to use SVR

rigid timber Aug 11, 2024, 4:13 PM

#

lapis sequoia you can convert models with onnx and use them on the web, but i'd recommend mode...

The model is 100mb, is there any workaround for that

hollow dust Aug 11, 2024, 4:15 PM

#

Yeah pretty terrible

rigid timber Aug 11, 2024, 4:15 PM

#

oof

#

I'll try a pretrained model

lapis sequoia Aug 11, 2024, 4:46 PM

#

rigid timber The model is 100mb, is there any workaround for that

yes, quantisation

#

you may be able to do it when exporting (onnx export) or within the library you use for training/saving.
depends on many things, but it's possible.

#

for example, if you have float32 weights, you'll get to 50mb with float16 (a bit less precision, normally you don't notice) and 25mb with uint8s (that sometimes isn't as good, since the error increases substantially). if you already have done that, then idk..XD
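
A back-of-the-envelope version of those numbers, assuming the file size is dominated by the weights and that the 100mb model holds about 25 million float32 weights (a hypothetical count chosen to match the sizes quoted here).

```python
# Storage cost per weight for common dtypes, in bytes.
BYTES_PER_WEIGHT = {"float64": 8, "float32": 4, "float16": 2, "uint8": 1}

# Hypothetical parameter count: 25M weights * 4 bytes = 100 MB as float32.
num_weights = 25_000_000

size_mb = {
    dtype: num_weights * nbytes / 1e6
    for dtype, nbytes in BYTES_PER_WEIGHT.items()
}

for dtype, mb in size_mb.items():
    print(f"{dtype}: {mb:.0f} MB")
```

If the weights were really stored as float64, the same count would take twice the float32 size, so casting down would help even before any integer quantisation.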

rigid timber Aug 11, 2024, 4:53 PM

#

lapis sequoia for example, if you have float32 weights, you'll get to 50mb a bit less precisio...

I think I used float64

#

Im relatively new to this so if you can tell me some resources where I can learn from that'd be pretty great

craggy patio Aug 11, 2024, 5:44 PM

#

How can I predict sequences of numbers?

#

humanly generated "random" numbers

lapis sequoia Aug 11, 2024, 6:28 PM

#

rigid timber I think I used float64

search deep learning model quantisation

#

i think chatgpt can give you a decent introduction

#

you can check this but it's more complex https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization

A Visual Guide to Quantization

Exploring memory-efficient techniques for LLMs

rigid timber Aug 11, 2024, 6:36 PM

#

lapis sequoia you can check this but it's more complex https://newsletter.maartengrootendorst....

Thanks

faint quail Aug 11, 2024, 7:18 PM

#

lapis sequoia for example, if you have float32 weights, you'll get to 50mb a bit less precisio...

yeah but numpy seems to shit itself when you have mixed float precision

#

and certain functions just fail like np.var with float32

ocean pawn Aug 11, 2024, 8:45 PM

#

lapis sequoia 24\*24\*3

...

#

That's a simple solution I somehow haven't came up with

#

Thanks!

#

I thought I had to do some trick since MaxPooling downsamples the data

#

Oh and

#

When should sigmoid be used over relu in hidden layer?

ocean pawn Aug 11, 2024, 8:50 PM

#

ocean pawn That's a simple solution I somehow haven't came up with

24 (height) * 24 (width) * 3 (out channel of relU)

#

I thought it would be 23 * 23 as conv2d has a stride of 1

ocean pawn Aug 11, 2024, 9:57 PM

#

Does the MNIST handwritten digit generalize badly?

#

(my own handwritten image)

#

Epoch 449 loss: 0.10526546835899353 accuracy: 0.9675506353378296

#

How can I improve this

spring field Aug 11, 2024, 10:02 PM

#

ocean pawn When should sigmoid be used over relu in hidden layer?

when you feel like it 🥴
generally you just look at what various papers implement and when or just experiment

spring field Aug 11, 2024, 10:07 PM

#

ocean pawn Does the MNIST handwritten digit generalize badly?

MNIST is a dataset... it doesn't do anything except exists as a dataset
your model OTOH can either generalize or overfit/(specialize?) (if those are opposites)
your model accuracy is about where one would expect it to be though

are the MNIST digits anti-aliased as well?

but yeah, I don't think your model is really capable of generalizing well because it's literally just a single convolutional layer with an MLP at the end (technically it's not an MLP, but oh well, I hate that term, no one ever uses MLPs anymore, but the term stuck, smh, anyway...)

half lintel Aug 11, 2024, 11:05 PM

#

If I want to count the number of a particular value in a series, is there a better way than

def somefunc(blah: pd.Series):
    return (blah == 'running').sum()

serene scaffold Aug 11, 2024, 11:12 PM

#

half lintel If I want to count the number of a particular value in a series, is there a bett...

Yes. Use value counts

half lintel Aug 11, 2024, 11:13 PM

#

unless that's precalculated and cached, feels like that's doing a bunch of work I wouldn't care about? (for the other values?)

serene scaffold Aug 11, 2024, 11:14 PM

#

half lintel unless that's precalculated and cached, feels like that's doing a bunch of work ...

It's not precalculated, but both involve essentially the same amount of work.

half lintel Aug 11, 2024, 11:14 PM

#

and are you suggesting: blah.value_counts().get('running', 0)

serene scaffold Aug 11, 2024, 11:15 PM

#

I would probably keep the value count series as a variable

half lintel Aug 11, 2024, 11:16 PM

#

As it happens, i'm using this in an agg() from a groupby. Would that suggest a better way?

    df2 = df.groupby(['ACCOUNT_NAME', 'REGION', 'TYPE', 'ID', 'Name']).agg(
        RunningHr=('STATE', lambda x: (x == 'running').sum()),
        NotRunningHr=('STATE', lambda x: (x != 'running').sum())
    )

#

Think I'm going to replace NotRunningHr with a total (count).

serene scaffold Aug 11, 2024, 11:16 PM

#

That's fine. I'd use the eq and ne methods, though.

half lintel Aug 11, 2024, 11:17 PM

#

OK, eq/ne is for style?

serene scaffold Aug 11, 2024, 11:17 PM

#

Right, so you don't need parens for == and !=

half lintel Aug 11, 2024, 11:18 PM

#

Yup I just removed those 🙂

#

Optimal?

    df2 = df.groupby(['ACCOUNT_NAME', 'REGION', 'TYPE', 'ID', 'Name']).agg(
        RunningHr=('STATE', lambda x: x.eq('running').sum()),
        TotalHr=('STATE', 'count')
    )
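
A small, lambda-free sketch of the same idea on toy data. The column names `A`, `B`, `status` and the helper column `is_running` are placeholders, not the real dataframe; the approach (precompute a boolean column, then use named aggregation) is one option among several.

```python
import pandas as pd

# Toy data standing in for the real dataframe.
df = pd.DataFrame({
    "A": ["a1", "a1", "a1", "a2"],
    "B": ["b1", "b1", "b1", "b2"],
    "status": ["running", "stopped", "running", "stopped"],
})

summary = (
    df.assign(is_running=df["status"].eq("running"))  # boolean helper column
      .groupby(["A", "B"])
      .agg(
          RunningHr=("is_running", "sum"),  # number of rows with status == running
          TotalHr=("status", "count"),      # total number of rows in the group
      )
)

print(summary)
```

Summing a boolean column counts the `True` values, so the aggregation can use the built-in `"sum"` instead of a Python lambda.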

#

I'm probably over-obsessed with chaining stuff, rather than keeping lots of temporary variables.

serene scaffold Aug 11, 2024, 11:20 PM

#

Sure

half lintel Aug 11, 2024, 11:20 PM

#

Thank-you

serene scaffold Aug 11, 2024, 11:21 PM

#

half lintel I'm probably over-obsessed with chaining stuff, rather than keeping lots of temp...

Not creating references means fewer opportunities for memory to be waiting for garbage collection.

half lintel Aug 11, 2024, 11:22 PM

#

I had a related question, and I think I've seen it somewhere, but don't know the words to search.

How can I add a "filter" to a series of chained calls?

Like if I have the code above, in a function called "summarise" how can I do:

result = pd.read_csv()...
.rename(this,that)
._CALL SUMMARISE
.other_thing()

#

Is there a "chain" or "call" thingy?

#

Ahh... "pipe" ?

serene scaffold Aug 11, 2024, 11:25 PM

#

half lintel I had a related question, and I think I've seen it somewhere, but don't know the...

Yeah, I think you have to use pipe so you can pretend that you have a variable for the dataframe at that stage
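
A minimal sketch of `pipe` in a chain, assuming a `summarise` function that takes a dataframe and returns one. The toy columns and the body of `summarise` are made up for illustration.

```python
import pandas as pd


def summarise(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical summary step: count rows per region."""
    return df.groupby("REGION").agg(TotalHr=("STATE", "count"))


# Toy data standing in for the CSV that would normally be loaded.
raw = pd.DataFrame({
    "region": ["eu", "eu", "us"],
    "STATE": ["running", "stopped", "running"],
})

result = (
    raw.rename(columns={"region": "REGION"})
       .pipe(summarise)   # calls summarise(df) on the frame at this stage
       .reset_index()     # any further chained call works on the summary
)

print(result)
```

`pipe` also forwards extra positional and keyword arguments to the function, so a parameterised step can stay in the chain.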

#

Pandas is weird

half lintel Aug 11, 2024, 11:43 PM

#

What's a good way to update all values in a column (in a dataframe) with a lambda? I want to remove a substring (which is in another column)

The IDE suggested

all_resources['ACCOUNT_NAME'] = all_resources.agg(lambda x: x['ACCOUNT_NAME'].replace('-' + x['REGION'], ''), axis=1)

But I dont understand why it used agg() and not .... .assign() or .apply() ?

serene scaffold Aug 11, 2024, 11:50 PM

#

#

Here's a pic I took at the zoo today

serene scaffold Aug 11, 2024, 11:50 PM

#

half lintel What's a good way to update all values in a column (in a dataframe) with a lambd...

"what's a good way to do x in pandas with a lambda" is self contradictory

#

A lambda is almost always the wrong way to do anything in pandas

#

Looks like you should make a new string column with str.replace that applies the desired string transformation.

half lintel Aug 12, 2024, 12:18 AM

#

yessir, how can I do that with not-lambda 🙂

#

can I use str.replace on a vectory thing? when the text to be replaced is actually another column?
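
As far as I know, `Series.str.replace` applies one pattern to the whole column rather than a different pattern per row, so when the text to remove lives in another column a plain list comprehension over both columns is a common workaround. A sketch on toy data, with made-up account names:

```python
import pandas as pd

# Toy data: the region is embedded as a suffix of the account name.
all_resources = pd.DataFrame({
    "ACCOUNT_NAME": ["prod-eu-west-1", "dev-us-east-1"],
    "REGION": ["eu-west-1", "us-east-1"],
})

# Row-wise string replace without DataFrame.apply/agg(axis=1).
all_resources["ACCOUNT_NAME"] = [
    name.replace("-" + region, "")
    for name, region in zip(all_resources["ACCOUNT_NAME"], all_resources["REGION"])
]

print(all_resources)
```

This is still a Python-level loop, but it avoids building a Series object for every row, which is what the `axis=1` variants do.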

half lintel Aug 12, 2024, 3:00 AM

#

Thoughts on how to create "pretty reports" using pandas? Presumably need some kind of templating engine....

small wedge Aug 12, 2024, 3:23 AM

#

I used matplotlib and just wrote out graphs to a pdf for presenting pandas data at my job

#

although in hindsight making a blank graph and a custom method to position text as I wrote it was a waste of time

shut shoal Aug 12, 2024, 4:49 AM

#


This is then passed to the policy function to calculate the next possible action. 
^^
||
This last sentence is a guess. Is this a correct guess?

Does this sound right?

wooden sail Aug 12, 2024, 4:54 AM

#

spring field MNIST is a dataset... it doesn't do anything except exists as a dataset your mod...

just for completeness, this isn't quite right. if you have a fixed architecture and use different data sets to train it, the data sets will determine how well the model learns and whether it generalizes well too

#

it's fairly difficult to tell whether the data or the model is responsible without extensive testing, which is why one would do a lot of cross validation and play with removing some layers and reevaluating

hollow night Aug 12, 2024, 5:33 AM

#

warped shale But that course is really good, so I recommend u follow it

I forgot to say thanks.
By the way, I took some classes from Andrew Ng's course yesterday and decided to pause the course for a while to focus on more practical work

#

I will consider taking your help on my journey with Python. I hope you don't mind.

#

😊

lapis sequoia Aug 12, 2024, 6:45 AM

#

ocean pawn When should sigmoid be used over relu in hidden layer?

ReLU tends to perform better for various reasons, including that it's got no vanishing gradient problem. (probably especially true for leaky relu though.)

it's also simpler to compute the gradient.

#

here is a standard argument https://stats.stackexchange.com/questions/126238/what-are-the-advantages-of-relu-over-sigmoid-function-in-deep-neural-networks

Cross Validated

What are the advantages of ReLU over sigmoid function in deep neura...

The state of the art of non-linearity is to use rectified linear units (ReLU) instead of sigmoid function in deep neural network. What are the advantages?

I know that training a network when ReLU is

#

nice, yes, also silu (idk what it is.) and leaky relu, which has got a small (adjustable) negative slope

#

im experimenting with hyperparameter tuning libraries and it's a neat way to test those,

blissful locust Aug 12, 2024, 6:51 AM

#

I really need some help in this kaggle competition I am taking part in so please hop on the voice chat 0 if you know a thing or two about about kaggle or ai in general. (it is my first competition)

lapis sequoia Aug 12, 2024, 6:53 AM

#

random blogpost conclusion...

So which one should you use?

It depends on your application and what works best for your network. In general, ELU or GELU may be better choices than ReLU if you’re worried about dead neurons, while SILU may be a good choice if you’re using batch normalization.

Also GELU seems to be the SOTA for transformer models and SiLU is use mostly in computer vision models.

#

(E: exponential, G: gaussian, S: sigmoid)

#

https://armandolivares.tech/2022/09/04/elu-gelu-and-silu-activation-functions/

Armand's Blog

admin

SiLU, GELU and ELU activation functions

#

yeah

serene grail Aug 12, 2024, 6:57 AM

#

The main reason these functions are used is that they're easy (fast) to compute right? The derivative is very straightforward

lapis sequoia Aug 12, 2024, 6:59 AM

#

they need a couple of attributes: must be non linear, simpler is better indeed (but not too simple), have (or 'produce') non exploding or vanishing gradient...

#

its fun

#

idk tbh

#

interesting!

serene grail Aug 12, 2024, 7:02 AM

#

lol yeah I guess

lapis sequoia Aug 12, 2024, 7:03 AM

#

i may compare on mnist those for fun

serene grail Aug 12, 2024, 7:04 AM

#

the point of these functions is to introduce non-linearity, so less linear -> easier to fit to real world (non-linear) data?

lapis sequoia Aug 12, 2024, 7:08 AM

#

oh that's dan, he seems smart, one of the authors

#

dan hendrycks is one of the ones passing (or helping to write or smth) the bill to regulate ai i think

#

is on the 'imminent extinction' side iirc

#

fn is that one right?

wooden sail Aug 12, 2024, 7:17 AM

#

the siren part is pretty cool

lapis sequoia Aug 12, 2024, 7:18 AM

#

erf is a gaussian like fn i think

wooden sail Aug 12, 2024, 7:18 AM

#

erf is the integral of a gaussian

lapis sequoia Aug 12, 2024, 7:19 AM

#

oh yeah

#

that's the cumulative part

wooden sail Aug 12, 2024, 7:19 AM

#

you usually get it as the CDF of the gaussian, i.e. the probability of a gaussian distributed event happening. it's called error function because it describes the probability of making errors when transmitting signals under gaussian noise

lapis sequoia Aug 12, 2024, 7:20 AM

#

looks like a sigmoid :-(

wooden sail Aug 12, 2024, 7:21 AM

#

a lot of stuff looks like a sigmoid

lapis sequoia Aug 12, 2024, 7:26 AM

#

no wait, but in the paper that's \Phi(x)*x I think

#

so it's like a probability times a weight in a way
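
A minimal sketch of that reading, with Φ taken as the standard normal CDF (µ = 0, σ = 1) written via the error function. The function names are just for this example.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Phi(x): CDF of a Gaussian with mean 0 and standard deviation 1."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(x: float) -> float:
    """GELU as x * Phi(x): the input scaled by a probability-like gate."""
    return x * std_normal_cdf(x)


for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(x, round(gelu(x), 4))
```

For large positive inputs the gate approaches 1 and the output approaches x; for large negative inputs the gate approaches 0, which is the ReLU-like behaviour.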

#

We perform an empirical evaluation of the GELU nonlinearity against the ReLU and ELU activations and find performance improvements across all considered computer vision, natural language processing, and speech tasks.

same paper, ig that's not a proof, but interesting.

blissful locust Aug 12, 2024, 7:35 AM

#

I am facing an issue in the ML models I have created. Please dm me if you know a thing or two about AI and ML

lapis sequoia Aug 12, 2024, 7:37 AM

#

i only partially agree with that; the paper you included says:

But as networks became deeper, training with sigmoid activations proved less effective than the non-smooth, less-probabilistic ReLU (Nair & Hinton, 2010) which makes hard gating decisions based upon an input’s sign.

sigmoids are non linear

#

the introduction is very neat

wooden sail Aug 12, 2024, 7:39 AM

#

there's this one talk from ICASSP 2020 or 2021 that i never found again, but discussed that if you learn the activation function, piecewise polynomials (like relu) are in some sense the optimal choice

lapis sequoia Aug 12, 2024, 7:44 AM

#

but there is leaky relu as well

#

the discussion also is very nice

#

Across several experiments, the GELU outperformed previous nonlinearities, but it bears semblance to the ReLU and ELU in other respects. For example, as σ → 0 and if µ = 0, the GELU becomes a ReLU. More, the ReLU and GELU are equal asymptotically. In fact, the GELU can be viewed as a way to smooth a ReLU.

wooden sail Aug 12, 2024, 7:51 AM

#

yeah, there's a handful of smoothing approximations to the relu. the problem is that it isn't differentiable at 0, only subdifferentiable. as a result, different ML libraries and implementations of autodif make different, arbitrary choices of what to do for the derivative at 0

lapis sequoia Aug 12, 2024, 7:51 AM

#

i see

#

the x in their formula turns to relu ig x*Phi(x)

#

idk what u and sigma are here (i mean the role in the network); the weights?

wooden sail Aug 12, 2024, 7:54 AM

#

mean and standard deviation (σ² is the variance) of a gaussian distribution. it's not really a pdf here though, so it's better to say they're the "shape parameters" of a "bell curve"

lapis sequoia Aug 12, 2024, 7:54 AM

#

they are tunable though, i guess?

wooden sail Aug 12, 2024, 7:55 AM

#

sure

proper crag Aug 12, 2024, 8:06 AM

#

How can log(n) centre a skewed graph?
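
A small sketch of the effect on made-up data: values that grow multiplicatively (here powers of two) have a long right tail, and taking the log turns them into evenly spaced values. The `skewness` helper is a plain moment-based estimate written for this example.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed data: 1, 2, 4, ..., 1024.
raw = [2 ** k for k in range(11)]
logged = [math.log(v) for v in raw]

print(skewness(raw), skewness(logged))
```

The log compresses large values much more than small ones, so the long right tail is pulled in; on real data the result is rarely perfectly symmetric, just less skewed.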

lapis sequoia Aug 12, 2024, 8:13 AM

#

wooden sail sure

it could also be the u and sigma of a batchnorm on x i think

#

so not tunable

#

actually (this may be incorrect) but i think x is just the output of a linear transformation; but that's assumed to be normally distributed

#

(formula just for discussion.)

wooden sail Aug 12, 2024, 8:19 AM

#

there are good arguments to be made for x being normal distributed if you got it from a large enough matrix, sure

#

the original paper discusses it only very loosely though

lapis sequoia Aug 12, 2024, 8:22 AM

#

i was wondering how they use that function since it does not have an actual expression

#

but they use tanh in replacement apparently
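
For comparison, a sketch of the exact erf-based GELU next to the tanh approximation given in the GELU paper. Python's `math.erf` makes the exact form directly computable; the sampling grid from -5 to 5 is an arbitrary choice for this check.

```python
import math


def gelu_exact(x: float) -> float:
    """x * Phi(x), with Phi written via the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh-based approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Largest absolute gap between the two on a grid over [-5, 5].
xs = [i / 100 for i in range(-500, 501)]
max_gap = max(abs(gelu_exact(x) - gelu_tanh(x)) for x in xs)

print(max_gap)
```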

#

oops, they do say this though:

We could use the CDF of N(µ, σ²) and have µ and σ be learnable hyperparameters, but throughout this work we simply let µ = 0 and σ = 1.
which im assuming means the data comes from batchnorm

wooden sail Aug 12, 2024, 8:34 AM

#

it really depends on what interpretation you want to give to the activation function

#

even though they called it a cdf, by leaving it fixed it pretty much detaches the function from the data, so it's not really a cdf

#

just a function that looks like a relu but is everywhere differentiable

lapis sequoia Aug 12, 2024, 8:45 AM

#

it's a bit like x*sigmoid conceptually (in my mind at least.)

wooden sail Aug 12, 2024, 8:46 AM

#

it's exactly that, because sigmoid is an umbrella word describing any roughly s-shaped function

#

like the logistic function, which is what people usually mean, or the hyperbolic and inverse tangents, or the error function

#

those are all sigmoids

#

you can

lapis sequoia Aug 12, 2024, 8:47 AM

#

i wonder whether x*logistic would work well

wooden sail Aug 12, 2024, 8:47 AM

#

yes

#

in my work we do this all the time, since the activation functions should mimic some other algorithm

#

so you chuck the hyperparams into the training

lapis sequoia Aug 12, 2024, 8:49 AM

#

lapis sequoia i wonder whether `x*logistic` would work well

just realised that's exactly silu
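
A minimal sketch of that identity: SiLU written as the input times the logistic function. Function names are chosen for this example only.

```python
import math


def logistic(x: float) -> float:
    """The logistic sigmoid 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def silu(x: float) -> float:
    """SiLU: the input gated by its own logistic value."""
    return x * logistic(x)


for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(x, round(silu(x), 4))
```

Swapping `logistic` for the Gaussian CDF gives the GELU form discussed above, which is why both fit the "x * sigmoid" family.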

#

wooden sail Aug 12, 2024, 8:50 AM

#

yep

past meteor Aug 12, 2024, 8:51 AM

#

How does it differ from a skip connection

lapis sequoia Aug 12, 2024, 8:51 AM

#

so they propose a whole family of activation functions, that's quite beautiful

past meteor Aug 12, 2024, 8:51 AM

#

Seems pretty similar

wooden sail Aug 12, 2024, 8:51 AM

#

#

just for visualization

lapis sequoia Aug 12, 2024, 8:52 AM

#

yeah all sigmoid-type actually

lapis sequoia Aug 12, 2024, 8:52 AM

#

past meteor How does it differ from a skip connection

is that like or same as dropout?

past meteor Aug 12, 2024, 8:53 AM

#

Different. You multiply theta(w, x) * x

wooden sail Aug 12, 2024, 8:55 AM

#

i'm not sure what exactly you mean by "having it depend on previous outputs"

#

through composition and the usage of iterative optimization methods, all of the parameters depend on the initial guess, all of the previous parameters in the network, and all of the previous guesses of the optimal parameters

#

all gradient based optimization methods are recurrent

lapis sequoia Aug 12, 2024, 8:56 AM

#

you can do arbitrary graphs, does that relate?

#

i.e 1 activation takes 2 prev layers as input

#

oh, skip connection as in resnets

#

i didn't know the name

#

i think i more or less suggested the same as @past meteor if i understood correctly

#

skip connections are one way to make that prev-output dependency using graphs, but you may have asked smth else

#

so this may be wrong but in my mind all the paths forward have a gradient backwards

#

then i thought that'd give a state in the sense you wrote (using just +complex graphs), but ig it does not

#

is the last representing y what you want, or what i meant?

#

does this match the description?

#

XD sorry

#

my current activation default stack sigmoid < ReLU < ELU < x * sigmoid (includes GeLU, SiLU,..)

#

according to some papers it's not for large networks

#

https://arxiv.org/abs/1412.0233

arXiv.org

The Loss Surfaces of Multilayer Networks

We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain t...

#

there is also the regularisation parts, chatgpt returns a neat summary with what is a regularizer in deep learning networks?

#

idk whether regs matter that much either. actually dropout and batchnorm are regularisers

#

they say somewhere that large neural networks get to similar local minima regardless of initialisation (since it's random and all get to the min.)

#

not to the same parameters, but to minima of similar quality (error.)

#

btw one of the authors is LeCun, worth reading him

#

lol, i take it

#

ill take a look

#

that's my read of this part at least:

However, several researchers experimenting with larger networks and SGD had noticed that, while multilayer nets do have many local minima, the result of multiple experiments consistently give very similar performance. This suggests that, while local minima are numerous, they are relatively easy to find, and they are all more or less equivalent in terms of performance on the test set

#

(it could be they all tuned the init, im assuming they didn't to some extent)

proper crag Aug 12, 2024, 9:42 AM

#

is the kernel in kernel methods the same as the OS kernel?

lapis sequoia Aug 12, 2024, 9:44 AM

#

yeah but their "experiments" for example say...

#

We performed an analogous experiment on a scaled-down version of MNIST, where each image was downsampled to size 10 × 10. Specifically, we trained 1000 networks with one hidden layer and n1 ∈ {25, 50, 100, 250, 500} hidden units (in the paper we also refer to the number of hidden units as nhidden), each one starting from a random set of parameters sampled uniformly within the unit cube. All networks were trained for 200 epochs using SGD with learning rate decay.

#

(it certainly does matter for small networks though.)

#

maybe i should add the remaining bit:

(...)
We obtained less than 2.5% drop in accuracy, which demonstrates the heavy over-parametrization of neural networks as discussed in Section 3.

#

yeah seems possible, this says smth similar https://ai.stackexchange.com/questions/40495/what-is-the-impact-of-the-initialization-of-weights-in-the-performance-of-a-neur

Artificial Intelligence Stack Exchange

What is the impact of the initialization of weights in the performa...

In my own experience, weight initialization matters for model convergence.

Theoretically, can different weight initialization methods eventually converge to the same optimal solution? Are their we...

#

i was likely stretching that original model too far, it's not even meant to explain inits.

bronze trout Aug 12, 2024, 10:14 AM

#

Hello, I hope this is the correct channel for the question I have. I would like to to create an animated bubble chart, similar to https://cryptobubbles.net/ . Is there a package or framework in python I can use or is this only possible with D3.js ?

Crypto Bubbles

Explore the dynamic world of cryptocurrencies with Crypto Bubbles, an interactive visualization tool presenting the cryptocurrency market in a customizable bubble chart. Dive into the latest market trends and gain valuable insights effortlessly. Crypto Bubbles serves as an independent data aggregator, offering a comprehensive view of the crypto ...

lapis sequoia Aug 12, 2024, 10:27 AM

#

bc it reduces to a gaussian process

#

they want to prove that neural nets reduce to a gaussian process; using a recent advance that the spin-model is equal to a gaussian process

#

i may be wrong, it's in the limit of my understanding

#

uhmm i think they assume a random input vector, but would need to re-read

#

also it's about the training, and the finding of the weights; actually the gaussian process is the loss there.

#

yes, that's quite interesting

#

it's structured data (not just noise), but can be modelled statistically and hence has randomness (i assume.)

#

i spent quite some time understanding that equation, i can't do it algebraically though

#

both are a feed forward, second one is a different way of writing it

wooden sail Aug 12, 2024, 10:41 AM

#

since the sigmas are relus here, it's the same as multiplying specific paths by 1 or 0

lapis sequoia Aug 12, 2024, 10:42 AM

#

you can think of a single weight and how it moves in the network

#

it ends up multiplied by every other weight, in the following layers

#

so write that down for all weights added up, and it's just another description.
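The path view from the last few messages can be checked numerically. A minimal sketch, assuming a tiny two-layer ReLU network without biases; the input and weights are made-up illustration values, not anything from the paper:

```python
# Hypothetical toy network: 2 inputs -> 3 ReLU hidden units -> 1 output, no biases.
x = [1.0, -2.0]
W1 = [[0.5, -1.0, 2.0],   # W1[i][j]: weight from input i to hidden unit j
      [1.5, 0.5, -0.5]]
W2 = [1.0, -2.0, 0.5]     # W2[j]: weight from hidden unit j to the output

# Ordinary feed-forward pass.
pre = [sum(x[i] * W1[i][j] for i in range(2)) for j in range(3)]
hidden = [max(0.0, p) for p in pre]
y_forward = sum(hidden[j] * W2[j] for j in range(3))

# Path view: every input-to-output path contributes the product of its weights,
# multiplied by 1 if the ReLU on that path is active and by 0 otherwise.
gates = [1.0 if p > 0 else 0.0 for p in pre]
y_paths = sum(x[i] * W1[i][j] * gates[j] * W2[j]
              for i in range(2) for j in range(3))

print(y_forward, y_paths)  # both are 1.5 for these values
```

Both sums give the same number, which is the sense in which the second form is just another way of writing the feed forward pass.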

#

XD

#

well, a large enough network is a universal approximator

wooden sail Aug 12, 2024, 10:44 AM

#

the only thing they require for those first 2 equations is that the activation is a relu, as far as i can see

#

no

#

you'd have to analyze other architectures separately to show whether the result applies to them

#

general results of that kind are in general not tight and don't provide as much insight though

#

e.g. there are papers explaining the conditions under which special architectures will always reduce the training loss to 0, but you can't really make that conclusion for general networks since it would be a general statement about nonconvex opt that has eluded researchers for several years

#

the universal approx theorem is pretty much the starting point

#

which means you don't have even that for general architectures

#

that's a steep slope to fight against if you want to show any general results

#

that was kinda my point

#

you start with a general nonconvex, possibly nondifferentiable function you want to optimize, and have almost nothing to go off of

#

you'll find that theoretical guarantees of any kind are made only for special families of functions

#

and those are the special families of functions

#

if you go broader, you have pretty much nothing

#

nonconvex opt is a PITA

indigo wing Aug 12, 2024, 11:06 AM

#

https://github.com/samratsb/QueryDB

Hey I want to further my project to be able to talk with documents and dockerize it and upload it to cloud, what do I do?

GitHub

GitHub - samratsb/QueryDB: RAG with chromadb embedding and huggingf...

RAG with chromadb embedding and huggingface model. Contribute to samratsb/QueryDB development by creating an account on GitHub.

#

Anyone interested, please contribute

lapis sequoia Aug 12, 2024, 11:18 AM

#

does seem like nice project to experiment with activations, since they are easy to implement;
that paper by hendrycks is not too hard imho

#

has anyone used duckdb for logging metrics where there are a lot of metrics and they get logged ultra quickly

#

or has anyone used duckdb at all just wanted to know what people use it for and how fast it is in real tasks

buoyant vine Aug 12, 2024, 11:19 AM

#

Why duckdb for that application?

#

And define 'ultra quickly'

lapis sequoia Aug 12, 2024, 11:22 AM

#

buoyant vine And define 'ultra quickly'

so quickly that after I debugged the logging takes 40 seconds

#

not that quickly actually but there are so many metrics that each logging event comes a very small amount of time after the previous one

indigo wing Aug 12, 2024, 11:23 AM

#

can someone contribute to my project, I have created a semantic search. it adds queries, and does a semantic search

#

I want to use it to do more, I feel like this is not enough

lapis sequoia Aug 12, 2024, 11:23 AM

#

buoyant vine Why duckdb for that application?

I just read that it is fast but I can't find any benchmarks and dont know if it is worth trying to implement

#

but also I want to learn it anyway

left tartan Aug 12, 2024, 11:25 AM

#

lapis sequoia or has anyone used duckdb at all just wanted to know what people use it for and ...

I use duckdb all the time. It's a great analytical engine. Not what you'd use for logging tho: you might use it for log analysis tho.

lapis sequoia Aug 12, 2024, 11:25 AM

#

this is my current logging, just dicts

{
"metric1":{
  1: 100,
  4: 200,
  7: 300,
  },
"metric2":{
  2: 15,
  3: 25,
  },
}

and I thought it could be faster if I have a table for iterations, and table of metrics, and table of metric per iteration

left tartan Aug 12, 2024, 11:27 AM

#

Why not one table?

lapis sequoia Aug 12, 2024, 11:28 AM

#

left tartan Why not one table?

idk, I'm not sure what would be faster, I would need to benchmark all of it

left tartan Aug 12, 2024, 11:28 AM

#

lapis sequoia idk, I'm not sure what would be faster, I would need to benchmark all of it

Faster at what?

lapis sequoia Aug 12, 2024, 11:29 AM

#

left tartan Faster at what?

faster at saving the metrics per some iterations

left tartan Aug 12, 2024, 11:29 AM

#

Fastest is to stream a csv to a text file.

#

Appending to a single table will also be fast.

#

So will writing to a log store or time series db

#

The 'trick' to speed is to aggregate (buffer) writes... one big write is faster than many small
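A rough sketch of the "one table, buffered writes" idea, assuming a long-format CSV with one row per logged value. The class name `BufferedMetricLogger` and the column names are made up for illustration, and an in-memory stream stands in for a real file:

```python
import csv
import io


class BufferedMetricLogger:
    """Collect (iteration, metric, value) rows and write them in batches."""

    def __init__(self, stream, buffer_size=1000):
        self._writer = csv.writer(stream)
        self._writer.writerow(["iteration", "metric", "value"])
        self._buffer = []
        self._buffer_size = buffer_size

    def log(self, iteration, metric, value):
        # Appending to a list is cheap; the actual write happens in batches.
        self._buffer.append((iteration, metric, value))
        if len(self._buffer) >= self._buffer_size:
            self.flush()

    def flush(self):
        # One writerows call for the whole batch instead of one write per value.
        self._writer.writerows(self._buffer)
        self._buffer.clear()


stream = io.StringIO()  # stand-in for open("metrics.csv", "a", newline="")
logger = BufferedMetricLogger(stream, buffer_size=2)
logger.log(1, "metric1", 100)
logger.log(2, "metric2", 15)   # buffer reaches 2 rows, so they are written
logger.log(3, "metric2", 25)   # stays in memory until flush()
logger.flush()
rows = stream.getvalue().splitlines()
```

Metrics that are logged every second iteration and metrics logged every 128th iteration both fit the same three columns, so the logging frequency on its own does not force separate tables. Whether this beats the nested dicts would still need benchmarking.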

lapis sequoia Aug 12, 2024, 11:32 AM

#

I'll do more benchmarking

#

some metrics are logged every second iteration some like every 128 iteration which is why maybe multiple tables will be faster

left tartan Aug 12, 2024, 11:33 AM

#

lapis sequoia some metrics are logged every second iteration some like every 128 iteration whi...

What difference would that make to the tables?

#

How many metrics and how many iterations per second are we talking?

lapis sequoia Aug 12, 2024, 11:35 AM

#

indigo wing can someone contribute to my project, I have created a semantic search. it adds ...

that's a cute request, i've no knowledge about that, but keep trying :-)

#

Dan Hendrycks on Reddit, interesting story about SiLU

indigo wing Aug 12, 2024, 11:52 AM

#

lapis sequoia that's a cute request, i've no knowledge about that, but keep trying :-)

lemon_angrysad

lapis sequoia Aug 12, 2024, 11:55 AM

#

this video seems promising, terence tao about the potential of ai in mathematics (for automatic proofs etc.) https://www.youtube.com/watch?v=_sTDSO74D8Q

YouTube

Oxford Mathematics

The Potential for AI in Science and Mathematics - Terence Tao

Terry Tao is one of the world's leading mathematicians and winner of many awards including the Fields Medal. He is Professor of Mathematics at the University of California, Los Angeles (UCLA). Following his talk, Terry is in conversation with fellow mathematician Po-Shen Loh.

The Oxford Mathematics Public Lectures are generously supported by XT...

▶ Play video

indigo wing Aug 12, 2024, 11:57 AM

#

I need to do so much on this project, can someone help me please

#

add KNN, similarity check, eval downstream tasks, model fine tuning, making it work on pdfs (I am currently creating chunks from .md files), and finally I want to use this project to further it and make it into a complete RAG, that is, make it answer from what it is not trained on, using what it is trained on

#

And I have no idea how to implement this all, including dockerizing, what tech to use etc. I am suffering.

lapis sequoia Aug 12, 2024, 12:00 PM

#

maybe up to 100 per second, and about 100 metrics, but only two are logged per iteration, other ones are once per few iterations

buoyant vine Aug 12, 2024, 12:02 PM

#

I mean for log analytics, if this is for a service, ClickHouse or Quickwit are better solutions

#

But I guess your scale is probably a bit small for those systems to really be super useful

unkempt wigeon Aug 12, 2024, 12:41 PM

#

may i ask a question

arctic wedgeBOT Aug 12, 2024, 12:43 PM

#

llama%2Fmodel.py lines 218 to 219

def forward(self, x):
    return self.w2(F.silu(self.w1(x)) * self.w3(x))

unkempt wigeon Aug 12, 2024, 12:46 PM

#

can a recurrent network learn if you give it data?

fiery bane Aug 12, 2024, 12:47 PM

#

yes

unkempt wigeon Aug 12, 2024, 12:47 PM

#

no teaching required?

fiery bane Aug 12, 2024, 12:48 PM

#

well, you need backprop I guess

unkempt wigeon Aug 12, 2024, 12:49 PM

#

backprop?

#

@fiery bane

lapis sequoia Aug 12, 2024, 12:54 PM

#

interesting !

fiery bane Aug 12, 2024, 12:54 PM

#

unkempt wigeon backprop?

https://brilliant.org/wiki/backpropagation/

Backpropagation | Brilliant Math & Science Wiki

Backpropagation, short for "backward propagation of errors," is an algorithm for supervised learning of artificial neural networks using gradient descent. Given an artificial neural network and an error function, the method calculates the gradient of the error function with respect to the neural network's weights. It is a generalization of the d...

unkempt wigeon Aug 12, 2024, 12:56 PM

#

can you put a shut down on a neural network?

#

my apologies

fiery bane Aug 12, 2024, 1:01 PM

#

you can shut down the machine that is running the neural network.

unkempt wigeon Aug 12, 2024, 1:07 PM

#

well i want to make a code just in case someone takes a copy of the network and trains it to be dangerous, or if it becomes dangerous on its own, so i can reset it or shut it off, in essence a time out

#

my apologies

fiery bane Aug 12, 2024, 1:14 PM

#

unkempt wigeon well i want to make a code just in case someone takes a copy of the network and tra...

great idea. good luck!

unkempt wigeon Aug 12, 2024, 1:15 PM

#

but could the network break the shut down code?

fiery bane Aug 12, 2024, 1:18 PM

#

unkempt wigeon but could the network break the shut down code?

that would depend on the network, and the shut down code

unkempt wigeon Aug 12, 2024, 1:22 PM

#

a code decrypting neural network and image identifier

#

so

#

what do you think

toxic mortar Aug 12, 2024, 1:40 PM

#

I want to integrate a model into some existing system. What should I pay attention to, other than the current infrastructure, to ensure that my model wrapper complements their style?

raw tree Aug 12, 2024, 2:17 PM

#

Hey guys, quick question
The last time I finetuned a ml model (xlm-roberta), my model basically learned to always predict the majority class - like ALWAYS, regardless of the input
Even if I used oversampling, the same issue occurred (it predicted the majority class in that epoch) : /
Do you guys have any ideas on what went wrong and how to solve it ?

serene scaffold Aug 12, 2024, 2:22 PM

#

raw tree Hey guys, quick question The last time I finetuned a ml model (xlm-roberta), my ...

these questions are hard to answer. you probably need to adjust the hyperparameters

limpid zenith Aug 12, 2024, 2:23 PM

#

raw tree Hey guys, quick question The last time I finetuned a ml model (xlm-roberta), my ...

did you use dynamic loss weighting like focal loss?

verbal oar Aug 12, 2024, 2:31 PM

#

hmm I wonder should I choose NLP with LLM or 3d deep learning path?

#

goal is to help people rather than do projects

#

or 3d deep learning is rather research?

brave yew Aug 12, 2024, 2:33 PM

#

hey guys has anyone here worked with the zero-shot-classification pipeline from transformers library? is it supposed to take so long to process a small string even when it is accelerated by a gpu? and yes i asked help on #1035199133436354600 but it was locked before anyone could answer

raw tree Aug 12, 2024, 2:38 PM

#

limpid zenith did you use dynamic loss weighting like focal loss?

Dunno what that means, but what I did was simple cross entropy
Would love to know what you are talking about

raw tree Aug 12, 2024, 2:39 PM

#

brave yew hey guys has anyone here worked with the zero-shot-classification pipeline from ...

It should warn you if you have a GPU that it detects and is not using

verbal oar Aug 12, 2024, 2:39 PM

#

focal loss is related to focal length?

raw tree Aug 12, 2024, 2:39 PM

#

Prob not lol

limpid zenith Aug 12, 2024, 2:40 PM

#

raw tree Dunno what that means, but what I did was simple cross entropy Would love to kno...

yes you can do focal loss for cross entropy ..it's a modified cross entropy that dynamically down-weights the examples the model already classifies confidently, so the hard (often minority class) examples count more in each batch
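For reference, a minimal sketch of the usual focal loss formula from the Lin et al. paper "Focal Loss for Dense Object Detection", FL = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the probability the model gives to the true class. The probabilities 0.95 and 0.10 are made-up examples, and gamma=2 is only a commonly used setting, not something from this conversation:

```python
import math


def cross_entropy(p_true):
    """Plain cross entropy for one example, given the true-class probability."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Cross entropy scaled by (1 - p_true) ** gamma, so easy examples shrink."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


easy, hard = 0.95, 0.10  # hypothetical true-class probabilities

# How much more a hard example weighs than an easy one under each loss.
ce_ratio = cross_entropy(hard) / cross_entropy(easy)
fl_ratio = focal_loss(hard) / focal_loss(easy)
print(f"cross entropy ratio: {ce_ratio:.1f}, focal loss ratio: {fl_ratio:.1f}")
```

With gamma=0 the function reduces to plain cross entropy. The focal ratio is much larger than the cross entropy ratio, which is the mechanism that keeps a flood of easy majority-class examples from dominating the gradient.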

raw tree Aug 12, 2024, 2:40 PM

#

brave yew hey guys has anyone here worked with the zero-shot-classification pipeline from ...

How many parameters ?

brave yew Aug 12, 2024, 2:40 PM

#

raw tree How many parameters ?

from transformers import pipeline

# Use a model specifically fine-tuned for zero-shot classification
classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli', device=0)

res = classifier(
    "I am kinda sad today",
    candidate_labels=["happy", "sad"],
)

print(res)

this is all, and it doesn't show a output on pycharm

limpid zenith Aug 12, 2024, 2:40 PM

#

https://towardsdatascience.com/focal-loss-a-better-alternative-for-cross-entropy-1d073d92d075

Medium

Focal Loss : A better alternative for Cross-Entropy

Focal loss is said to perform better than Cross-Entropy loss in many cases. But why Cross-Entropy loss fails, and how Focal loss addresses…

raw tree Aug 12, 2024, 2:41 PM

#

limpid zenith https://towardsdatascience.com/focal-loss-a-better-alternative-for-cross-entropy...

ill give it another try with this, thanks a lot

limpid zenith Aug 12, 2024, 2:41 PM

#

np 🙂

raw tree Aug 12, 2024, 2:41 PM

#

I'll send the colab link next time lol

raw tree Aug 12, 2024, 2:42 PM

#

brave yew ```py from transformers import pipeline # Use a model specifically fine-tuned f...

Like no output whatsoever ?

brave yew Aug 12, 2024, 2:42 PM

#

wait could my internet speed be the bottleneck? because now that i think about it, how would it access the model without locally caching it

brave yew Aug 12, 2024, 2:42 PM

#

raw tree Like no output whatsoever ?

nope it just runs and runs and i get no output

raw tree Aug 12, 2024, 2:42 PM

#

brave yew wait could my internet speed be the bottleneck? because now that i think about i...

Should show you a progress bar lol

verbal oar Aug 12, 2024, 2:42 PM

#

so like loop forever?

raw tree Aug 12, 2024, 2:42 PM

#

brave yew nope it just runs and runs and i get no output

Add prints to know where it is stuck lol

brave yew Aug 12, 2024, 2:43 PM

#

raw tree Add prints to know where it is stuck lol

where do i add prints in a 10 line code 😭

raw tree Aug 12, 2024, 2:43 PM

#

Models are dl'ed when inited

#

Not lazily

raw tree Aug 12, 2024, 2:44 PM

#

brave yew where do i add prints in a 10 line code 😭

The nine in between : P
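One way to "add prints" to a short script is to wrap each step in a timed stage. A sketch using only the standard library; the `stage` helper is made up for illustration, and the `time.sleep` calls stand in for the real `pipeline(...)` and `classifier(...)` calls:

```python
import time
from contextlib import contextmanager


def say(message):
    # flush=True so the line shows up immediately, even in a buffered console.
    print(message, flush=True)


@contextmanager
def stage(name, log=say):
    """Report when a step starts and how long it took once it finishes."""
    log(f"[start] {name}")
    started = time.perf_counter()
    try:
        yield
    finally:
        log(f"[done]  {name} ({time.perf_counter() - started:.2f}s)")


messages = []
with stage("load model", log=messages.append):
    time.sleep(0.01)  # stand-in for: classifier = pipeline(...)
with stage("classify", log=messages.append):
    time.sleep(0.01)  # stand-in for: res = classifier(...)

for line in messages:
    say(line)
```

If the last line printed is `[start] load model`, the time is going into building the pipeline (which includes the download) and not into the classification call.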

limpid zenith Aug 12, 2024, 2:44 PM

#

brave yew hey guys has anyone here worked with the zero-shot-classification pipeline from ...

did you move the tensors to gpu?

raw tree Aug 12, 2024, 2:44 PM

#

limpid zenith did you move the tensors to gpu?

Huggingface takes care of that for you

brave yew Aug 12, 2024, 2:44 PM

#

limpid zenith did you move the tensors to gpu?

if you mean adding the device = 0 parameter then yes i did

brave yew Aug 12, 2024, 2:44 PM

#

brave yew ```py from transformers import pipeline # Use a model specifically fine-tuned f...

this the script

raw tree Aug 12, 2024, 2:45 PM

#

brave yew if you mean adding the device = 0 parameter then yes i did

Try removing that parameter
And actually try the prints lol

limpid zenith Aug 12, 2024, 2:45 PM

#

ahhh didn't see that

raw tree Aug 12, 2024, 2:46 PM

#

Again, how many parameters does that model have

#

Like if it's in the high billions, it'll take time

#

And what gpu

brave yew Aug 12, 2024, 2:46 PM

#

raw tree Try removing that parameter And actually try the prints lol

raw tree Aug 12, 2024, 2:47 PM

#

brave yew

Ah, I think I get it

#

The pycharm term seems to not support progress bars lol

#

The model is dl'ing

#

Use Windows terminal or kitty depending on your os

brave yew Aug 12, 2024, 2:48 PM

#

should i try running in the terminal?

brave yew Aug 12, 2024, 2:48 PM

#

raw tree Use Windows terminal or kitty depending on your os

alright

limpid zenith Aug 12, 2024, 2:48 PM

#

brave yew should i try running in the terminal?

in pycharm you can enable "Emulate terminal in output console" in the run configuration btw

brave yew Aug 12, 2024, 2:49 PM

#

limpid zenith in pycharm u can enable emulate the terminal btw

yeah i think it works now

#

lmao it was at 99% all this time

raw tree Aug 12, 2024, 2:49 PM

#

That too works lol

raw tree Aug 12, 2024, 2:49 PM

#

brave yew lmao it was at 99% all this time

It'll time out
Read the error then

limpid zenith Aug 12, 2024, 2:49 PM

#

you can make it run with terminal like this instead of going to terminal

brave yew Aug 12, 2024, 2:50 PM

#

limpid zenith you can make it run with terminal like this instead of going to terminal

okay i will set it up, thanks!

lapis sequoia Aug 12, 2024, 3:32 PM

#

they've got a neat site https://www.vincentsitzmann.com/siren/

Implicit Neural Representations with Periodic Activation Functions

brave yew Aug 12, 2024, 3:38 PM

#

bro wtf! my device crashed while processing the stuff so i had to do a force restart and now it doesn't show that i have a gpu

serene scaffold Aug 12, 2024, 3:42 PM

#

brave yew bro wtf! my device crashed while processing the stuff so i had to do a force res...

in what environment?

brave yew Aug 12, 2024, 3:44 PM

#

serene scaffold in what environment?

i was working in a conda env, but not even my task manager recognizes my gpu

serene scaffold Aug 12, 2024, 3:45 PM

#

brave yew i was working in a conda env, but not even my task manager recognizes my gpu

so it's on your computer, and not something like google colab?
you can try rebooting, I guess.

lapis sequoia Aug 12, 2024, 3:58 PM

#

there are some interesting criticisms of siren here as well https://www.reddit.com/r/MachineLearning/comments/hd6tu1/d_paper_explained_siren_implicit_neural/

From the MachineLearning community on Reddit: [D] Paper Explained -...

Explore this post and more from the MachineLearning community

proper crag Aug 12, 2024, 4:03 PM

#

How to know if feature is linear?

bronze trout Aug 12, 2024, 4:08 PM

#

Thank you. So there is nothing on Python side that can be used instead?

#

Great thank you! 🙏

brave yew Aug 12, 2024, 4:25 PM

#

serene scaffold so it's on your computer, and not something like google colab? you can try reboo...

my gpu shows up again after restarting but the code has stopped working for some reason, even though it had worked earlier

serene scaffold Aug 12, 2024, 4:26 PM

#

brave yew my gpu shows up again after restarting but the code has stopped working for some...

how do you know that it "stops working"? what does that mean in this context?

brave yew Aug 12, 2024, 4:28 PM

#

Oh right, let me share it, one sec

lapis sequoia Aug 12, 2024, 4:29 PM

#

https://arxiv.org/pdf/1710.05941 "Searching for activation functions"

brave yew Aug 12, 2024, 4:30 PM

#

serene scaffold how do you know that it "stops working"? what does that mean in this context?

can i just share the repo?

serene scaffold Aug 12, 2024, 4:30 PM

#

brave yew can i just share the repo?

you can, but that won't tell us how you know that it "doesn't work".

#

was there an error message?

brave yew Aug 12, 2024, 4:32 PM

#

serene scaffold you can, but that won't tell us how you know that it "doesn't work".

right so essentially, it's a review summarizer which scrapes the reviews of a product, parses the page and extracts the reviews, and then sends them through a zero shot classification pipeline to classify the reviews by their degree of positivity or negativity. the code returns an error while it creates an object of the class which does the classification work

serene scaffold Aug 12, 2024, 4:33 PM

#

the code returns an error
If you need help in relation to an error message, always show the whole error message

brave yew Aug 12, 2024, 4:34 PM

#

from driver_init import SeleniumDriver
from review_scraper import ReviewScraper, parse_reviews
from review_classifier import ReviewClassifier


def main():
    url = "https://www.amazon.in/Number-Backpack-Compartment-Charging-Organizer/dp/B09VTDMRY7?pd_rd_w=giCzt&content-id=amzn1.sym.ec5c60c1-ae3d-4950-9707-1e49240719bc&pf_rd_p=ec5c60c1-ae3d-4950-9707-1e49240719bc&pf_rd_r=Y3MSH92QWBEKYCN9ATGK&pd_rd_wg=ZzwV4&pd_rd_r=8e0c7a40-a11e-4573-9b38-15ab13f59a8c&pd_rd_i=B09VTDMRY7&ref_=pd_hp_d_btf_unk_B09VTDMRY7"

    # Creating object of the class SeleniumDriver
    selenium_driver = SeleniumDriver()
    # Setting up the webdriver
    driver = selenium_driver.get_driver()

    try:
        # Creating object of the class ReviewScraper
        review_scraper = ReviewScraper(driver)
        page_sources = review_scraper.navigate_to_reviews(url)

        # Check if we have the required page sources
        if len(page_sources) >= 2:
            # Parse reviews
            positive_reviews = parse_reviews(page_sources[0])
            negative_reviews = parse_reviews(page_sources[1])

            # Combine positive and negative reviews
            reviews = ([review for review in positive_reviews] +
                       [review for review in negative_reviews])

            print("Reviews:")
            for i, review in enumerate(reviews, start=1):
                print(f"Review {i}:")
                print(f"Review: {review['review']}")
                print(f"Date: {review['date']}")
                print("-" * 40)  # Separator line

            # Create and use the ReviewClassifier
            review_classifier = ReviewClassifier(reviews)
            review_classifier.classify_reviews()

        else:
            print("Not enough page sources available.")

    except Exception as e:
        print(f"An error occurred: {e}")

    finally:
        # Close the driver
        selenium_driver.close_driver()


if __name__ == "__main__":
    main()

the main file

serene scaffold Aug 12, 2024, 4:34 PM

#

if your code "doesn't work", but you got an error message, you don't need to say that the code doesn't work. you only need to show the error message.

brave yew Aug 12, 2024, 4:34 PM

#

print(f"An error occurred: {e}")

as far as i understand e should send error message, but it gives only '0'

serene scaffold Aug 12, 2024, 4:35 PM

#

brave yew print(f"An error occurred: {e}") as far as i understand e should send error mes...

show your whole terminal output from when you start the program to the end of the error message

#

!paste

arctic wedgeBOT Aug 12, 2024, 4:35 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

brave yew Aug 12, 2024, 4:35 PM

#

alright

brave yew Aug 12, 2024, 4:37 PM

#

serene scaffold show your whole terminal output from when you start the program to the end of th...

this is it

serene scaffold Aug 12, 2024, 4:37 PM

#

brave yew this is it

okay. make it so that none of the code is in try-except, so that when an exception is raised, you get the exception.
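A small sketch of why `print(f"An error occurred: {e}")` showed only `0` here: the string form of a `KeyError` is just the repr of the missing key, with no exception type and no location. The empty dict is a stand-in for whatever object raised inside the pipeline:

```python
import traceback

try:
    {}[0]  # stand-in for the failing lookup; raises KeyError(0)
except Exception as e:
    short = f"An error occurred: {e}"       # what the original script printed
    detailed = f"An error occurred: {e!r}"  # repr keeps the exception type
    full = traceback.format_exc()           # full traceback as a string

print(short)     # An error occurred: 0
print(detailed)  # An error occurred: KeyError(0)
print(full)
```

Removing the try-except, as suggested above, gives the same full traceback without any extra code. If the `finally` cleanup has to stay, `traceback.print_exc()` inside the `except` block is one way to keep both.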

brave yew Aug 12, 2024, 4:38 PM

#

serene scaffold okay. make it so that none of the code is in try-except, so that when an excepti...

okay give me a sec

brave yew Aug 12, 2024, 4:43 PM

#

serene scaffold okay. make it so that none of the code is in try-except, so that when an excepti...

https://paste.pythondiscord.com/SPEA

#

i hope that i used that right

serene scaffold Aug 12, 2024, 4:43 PM

#

brave yew https://paste.pythondiscord.com/SPEA

"sequence": sequences[0],
~~~~~~~~~^^^
KeyError: 0

do you see what this error message is telling you?

brave yew Aug 12, 2024, 4:44 PM

#

serene scaffold "sequence": sequences[0], ~~~~~~~~~^^^ KeyError: 0 do you see w...

not quite... i don't understand what is under the pipelines that well

serene scaffold Aug 12, 2024, 4:45 PM

#

brave yew not quite... i don't understand what is under the pipelines that well

do you know what keys are in Python?

brave yew Aug 12, 2024, 4:45 PM

#

yes

serene scaffold Aug 12, 2024, 4:45 PM

#

so if you got a key error from doing sequences[0], then what type of object is sequences?

brave yew Aug 12, 2024, 4:46 PM

#

a list?

serene scaffold Aug 12, 2024, 4:46 PM

#

you are mixing up indices and keys.

brave yew Aug 12, 2024, 4:46 PM

#

am i sending a list of dictionaries instead of a list of string to the classifier?

serene scaffold Aug 12, 2024, 4:47 PM

#

sequences is apparently a dict.

#

for which 0 is not one of its keys

#

what types are the keys and values of sequences? I do not know.
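The indices-versus-keys point can be seen directly: the same `[0]` lookup fails with a different exception depending on the container. The helper name and the sample review dict are hypothetical:

```python
def failure_type(container, key):
    """Return the name of the exception raised by container[key], or None."""
    try:
        container[key]
    except (IndexError, KeyError) as exc:
        return type(exc).__name__
    return None


list_result = failure_type([], 0)                                  # empty list
dict_result = failure_type({"review": "ok", "date": "1 Aug"}, 0)   # dict without key 0
ok_result = failure_type(["ok"], 0)                                # valid list index

print(list_result, dict_result, ok_result)  # IndexError KeyError None
```

So a `KeyError: 0` on `sequences[0]` means `sequences` behaved like a mapping at that point, not like a list.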

brave yew Aug 12, 2024, 4:48 PM

#

serene scaffold what types are the keys and values of `sequences`? I do not know.

i do not have a dictionary named sequences, in my code

serene scaffold Aug 12, 2024, 4:49 PM

#

the code where the error occurs is C:\Users\Rikhil Nellimarla\.conda\envs\NLP_env\Lib\site-packages\transformers\pipelines\zero_shot_classification.py

brave yew Aug 12, 2024, 4:49 PM

#

serene scaffold the code where the error occurs is `C:\Users\Rikhil Nellimarla\.conda\envs\NLP_e...

https://paste.pythondiscord.com/3HEQ

brave yew Aug 12, 2024, 4:50 PM

#

serene scaffold the code where the error occurs is `C:\Users\Rikhil Nellimarla\.conda\envs\NLP_e...

yes... but that is inside a package tho

#

i wouldn't edit the python files inside an imported package

lapis sequoia Aug 12, 2024, 4:54 PM

#

just went on a rabbit hole reading about sirens, if i understand correctly those can't model the probability distribution of a dataset of signals, just overfit a single signal sample.
still, it is extremely cool!

brave yew Aug 12, 2024, 4:56 PM

#

i think i got it!

        positive_reviews = parse_reviews(page_sources[0])
        negative_reviews = parse_reviews(page_sources[1])

        # Combine positive and negative reviews
        reviews = ([review for review in positive_reviews] +
                   [review for review in negative_reviews])

here both positive and negative reviews are a list of dictionaries with the review of the product and the date, so when only isolating reviews i need to write

            reviews = ([review['review'] for review in positive_reviews] +
                       [review['review'] for review in negative_reviews])

instead so that i take only the string stored under the 'review' key

#

@serene scaffold see

serene scaffold Aug 12, 2024, 4:59 PM

#

@brave yew sorry, I have like four coworkers asking me to do stuff

brave yew Aug 12, 2024, 4:59 PM

#

serene scaffold @brave yew sorry, I have like four coworkers asking me to do stuff

no its okay, i just wanted to say my issue got solved, thanks!

serene scaffold Aug 12, 2024, 4:59 PM

#

brave yew i think i got it! i ```py positive_reviews = parse_reviews(page_source...

you can just do reviews = positive_reviews + negative_reviews and it's the same thing

brave yew Aug 12, 2024, 5:00 PM

#

serene scaffold you can just do `reviews = positive_reviews + negative_reviews` and it's the sam...

oh? but the positive reviews and negative reviews are lists of dictionaries

serene scaffold Aug 12, 2024, 5:00 PM

#

sure, but if you're just concatenating two lists, you just + them. you don't need the list comp part.

#

[x for x in y] is pointless if y is already a list.

#

(note that for arrays, this would do elementwise addition, so don't use "list" and "array" interchangeably)
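A short sketch of the concatenation point, using made-up review dicts shaped like the ones described above (a `'review'` string plus a `'date'`):

```python
# Hypothetical data in the shape parse_reviews is described as returning.
positive_reviews = [{"review": "Sturdy zips", "date": "2 Aug 2024"}]
negative_reviews = [{"review": "Strap broke", "date": "5 Aug 2024"}]

# Plain + concatenates two lists; the [x for x in y] copies add nothing.
combined = positive_reviews + negative_reviews
with_comprehensions = ([r for r in positive_reviews] +
                       [r for r in negative_reviews])

# A comprehension is still needed when each item is transformed,
# e.g. pulling out only the text that goes to the classifier.
texts = [r["review"] for r in positive_reviews + negative_reviews]

print(combined == with_comprehensions)  # True
print(texts)                            # ['Sturdy zips', 'Strap broke']
```

So `positive_reviews + negative_reviews` replaces the original comprehension pair, while the fixed version that extracts `review['review']` still needs one comprehension over the combined list.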

brave yew Aug 12, 2024, 5:02 PM

#

serene scaffold `[x for x in y]` is pointless if `y` is already a list.

oh okay, i will do it like that then, thanks

runic parcel Aug 12, 2024, 5:29 PM

#

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import RetrievalQA
import os

openai_api_key = "sk-"

K_RESULTS = 3  
SIMILARITY_THRESHOLD = 0.5  
SYSTEM_PROMPT = "I have an AI informational website. " \
                "You should check the user's prompt and recommend the best tool as per it. " \
                "Reply with the tool names in a Python list."


def ask_question(query):
    embeddings = OpenAIEmbeddings(api_key=openai_api_key)
    vector_store = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)

    llm = ChatOpenAI(api_key=openai_api_key, model_name="gpt-3.5-turbo",
                     messages=[
                         {"role": "system",
                          "content": "I have an AI informational website. You should check the user's prompt and recommend the best tool as per it. Reply with the tool names in a python list."},
                     ])

    retriever = vector_store.as_retriever(
        search_type="similarity_score_threshold",
        search_kwargs={"k": K_RESULTS, "score_threshold": SIMILARITY_THRESHOLD}

    )

    chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
    response = chain.invoke({"query": query})

    if 'source_documents' in response:
        for doc in response['source_documents']:
            print(f"Source Document: {doc.metadata['source']}, Section: {doc.metadata.get('section', 'N/A')}\nContent: {doc.page_content}\n")

    return response.get("result", "No result found.")


if __name__ == "__main__":
    query = "i need to build a website and i need tts features"
    print(ask_question(query))

when i added the system message in the ChatOpenAI, my code wont run properly and gives me
Sure! Please provide me with the user's prompt so I can recommend the best tool accordingly.

instead i wanted to print the tool name

unkempt wigeon Aug 12, 2024, 5:59 PM

#

how do i make a network library?

lapis sequoia Aug 12, 2024, 6:00 PM

#

unkempt wigeon how do i make a network library?

what type of network

lapis sequoia Aug 12, 2024, 6:01 PM

#

unkempt wigeon how do i make a network library?

what type of library?

unkempt wigeon Aug 12, 2024, 6:09 PM

#

recurrent image detection and tracking

#

my apologies

#

@lapis sequoia

lapis sequoia Aug 12, 2024, 6:15 PM

#

unkempt wigeon recurrent image detection and tracking

so what do you want to make

#

a library that does the image detection? Or library to train new models for image detection?

#

do you just need a model or more than that

unkempt wigeon Aug 12, 2024, 6:18 PM

#

a library to train new models

#

my apologies

#

im sorry

#

@lapis sequoia

unkempt wigeon Aug 12, 2024, 6:49 PM

#

unkempt wigeon a library to train new models

@lapis sequoia

lapis sequoia Aug 12, 2024, 6:51 PM

#

maybe train a model and put the training code into a library

#

@unkempt wigeon

toxic mortar Aug 12, 2024, 8:03 PM

#

Is this the correct formula/terminology to calculate accuracy in a scenario where I have 10 categories and want to classify my input into one of them?

For example, referring to the 10k instances that were classified as abusive, where 95% are truly abusive and 5% were classified as abusive but are actually good instances, should I use the formula TP / (TP + FP) to calculate accuracy? Or is that only for the binary case?

spare forum Aug 12, 2024, 8:45 PM

#

toxic mortar Is this the correct formula/terminology to calculate accuracy in a scenario wher...

Accuracy works the same

#

You can do things like precision/recall per class or averaging them over all the class to get one metric
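For what it's worth, TP / (TP + FP) is the precision of one class, not accuracy; in the question's example the "abusive" class would have a precision of 0.95. A minimal sketch of accuracy, per-class precision/recall and a macro average, using made-up labels (three classes here for brevity, the same code works for ten):

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Precision and recall for every class, treating it as one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but it was something else
            fn[t] += 1  # true class t was missed
    metrics = {}
    for c in sorted(set(y_true) | set(y_pred)):
        predicted, actual = tp[c] + fp[c], tp[c] + fn[c]
        metrics[c] = {
            "precision": tp[c] / predicted if predicted else 0.0,
            "recall": tp[c] / actual if actual else 0.0,
        }
    return metrics


# Hypothetical labels for illustration only.
y_true = ["abusive", "abusive", "good", "spam", "good", "spam"]
y_pred = ["abusive", "good", "good", "spam", "abusive", "spam"]

acc = accuracy(y_true, y_pred)
metrics = per_class_metrics(y_true, y_pred)
macro_precision = sum(m["precision"] for m in metrics.values()) / len(metrics)
```

Accuracy looks at all predictions at once, while precision and recall are computed per class and then optionally averaged (the unweighted mean above is the "macro" average).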

unkempt wigeon Aug 13, 2024, 12:39 AM

#

#===[imports]===#
import sys
import numpy as np
import matplotlib
#===============#

#===[neuron network]===#
np.random.seed(0)

X = [[1, 2 ,3,2.5],
    [2.0,5.0,-1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8]]

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
        def forward(self, inputs):
            self.output = np.dot(inputs, self.weights) + self.biases

layer0 = Layer_Dense(4,5)              
layer1 = Layer_Dense(5,2)

layer0.forward(X)
print(layer1.output)

is this ok?

violet gull Aug 13, 2024, 2:11 AM

#

unkempt wigeon ```py #===[imports]===# import sys import numpy as np import matplotlib #=======...

It should be matmul not dot in forward

#

Dot product returns a scalar

#

Also the end bit doesn’t make sense. layer1.output is going to be undefined

serene scaffold Aug 13, 2024, 2:42 AM

#

violet gull Dot product returns a scalar

this is only true for 1d arrays.

#

https://numpy.org/doc/stable/reference/generated/numpy.dot.html
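
A small sketch of the point above, using made-up arrays: `np.dot` gives a scalar only when both arguments are 1-D; with 2-D arrays it performs matrix multiplication and matches the `@` operator.

```python
import numpy as np

# 1-D with 1-D: inner product, a single number.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
scalar = np.dot(a, b)
print(scalar)  # 32.0

# 2-D with 2-D: matrix multiplication, same result as matmul / @.
inputs = np.ones((3, 4))
weights = np.ones((4, 5))
out_dot = np.dot(inputs, weights)
out_matmul = inputs @ weights
print(out_dot.shape)                       # (3, 5)
print(np.array_equal(out_dot, out_matmul)) # True
```

So for a batch of inputs times a weight matrix, `np.dot` and `np.matmul` are interchangeable; they only differ for arrays with more than two dimensions.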

#

@unkempt wigeon

There is no reason for importing sys here. It's not clear if you plan to use matplotlib later.
X should probably be an array.
You should name the class LayerDense or DenseLayer. Don't use Upper_Snake_Case in python.
you made the def forward block part of the def __init__ block.
The forward method should return the output, not assign it to self.
You don't do anything to make the output of layer0 go into layer1.
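
One possible way to apply the notes above, as a sketch rather than the tutorial's own code: `DenseLayer`, `rng`, `hidden` and `output` are names chosen here for illustration, and using a local random generator is an assumption.

```python
import numpy as np

# Assumption: a local generator instead of the global np.random.seed(0).
rng = np.random.default_rng(0)

# X as an array rather than a nested list.
X = np.array([[1, 2, 3, 2.5],
              [2.0, 5.0, -1.0, 2.0],
              [-1.5, 2.7, 3.3, -0.8]])


class DenseLayer:
    def __init__(self, n_inputs, n_neurons):
        self.weights = rng.standard_normal((n_inputs, n_neurons))
        self.biases = np.zeros((1, n_neurons))

    # forward is a method of the class, not nested inside __init__.
    def forward(self, inputs):
        # Return the result instead of storing it on self.
        return inputs @ self.weights + self.biases


layer0 = DenseLayer(4, 5)
layer1 = DenseLayer(5, 2)

# Feed the output of layer0 into layer1.
hidden = layer0.forward(X)
output = layer1.forward(hidden)
print(output.shape)  # (3, 2)
```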

unkempt wigeon Aug 13, 2024, 3:02 AM

#

https://youtu.be/TEWy9vZcxW4?feature=shared&t=1236

YouTube

sentdex

Neural Networks from Scratch - P.4 Batches, Layers, and Objects

Neural Networks from Scratch book: https://nnfs.io

NNFSiX Github: https://github.com/Sentdex/NNfSiX

Playlist for this series: https://www.youtube.com/playlist?list=PLQVvvaa0QuDcjD5BAw2DxE6OF2tius3V3

Neural Networks IN Scratch (the programming language): https://youtu.be/eJ1HdTZAcn4

Python 3 basics: https://pythonprogramming.net/introduction-...

▶ Play video

lapis sequoia Aug 13, 2024, 3:04 AM

#

unkempt wigeon https://youtu.be/TEWy9vZcxW4?feature=shared&t=1236

His forward function is not inside of __init__. Also, making X (the data) into a numpy array will make operations faster and easier, because you can use the functions that come with the np.ndarray type

unkempt wigeon Aug 13, 2024, 3:06 AM

#

how could i make X into an array?

#

im new to this, my apologies

#

@lapis sequoia

lapis sequoia Aug 13, 2024, 3:29 AM

#

unkempt wigeon how could i make X into an array?

yes

unkempt wigeon Aug 13, 2024, 3:29 AM

#

asarray()?

lapis sequoia Aug 13, 2024, 3:35 AM

#

import numpy as np

np.random.seed(0)

X = np.array([[1, 2, 3, 2.5],
              [2.0, 5.0, -1.0, 2.0],
              [-1.5, 2.7, 3.3, -0.8]], dtype=np.float32)


class LayerDense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases


layer0 = LayerDense(4, 5)
layer1 = LayerDense(5, 2)

layer0.forward(X)
# print(layer0.output)
layer1.forward(layer0.output)
print(layer1.output)

#

@unkempt wigeon

#

the exact code shown in the video

#

idk if I would structure my nn like this but that's just me

unkempt wigeon Aug 13, 2024, 3:38 AM

#

does array have indexing or something else

lapis sequoia Aug 13, 2024, 3:40 AM

#

unkempt wigeon does array have indexing or something else

yes

#

a np.array works

#

like a list just has certain operations that are faster
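
A short sketch of what indexing looks like, reusing the X values from the code above (the variable names are illustrative):

```python
import numpy as np

X = np.array([[1, 2, 3, 2.5],
              [2.0, 5.0, -1.0, 2.0],
              [-1.5, 2.7, 3.3, -0.8]])

first_row = X[0]         # same as indexing a list of lists
element = X[1, 2]        # row 1, column 2 in one step
second_column = X[:, 1]  # every row, column 1 (awkward with plain lists)
doubled = X * 2          # arithmetic applies to every element at once

print(first_row)      # [1.  2.  3.  2.5]
print(element)        # -1.0
print(second_column)  # [2.  5.  2.7]
```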

unkempt wigeon Aug 13, 2024, 3:40 AM

#

asarray()

lapis sequoia Aug 13, 2024, 3:42 AM

#

unkempt wigeon asarray()

idk what that is

#

oh

unkempt wigeon Aug 13, 2024, 3:42 AM

#

https://www.geeksforgeeks.org/convert-python-list-to-numpy-arrays/

GeeksforGeeks

Convert Python List to numpy Arrays - GeeksforGeeks

A Computer Science portal for geeks. It contains well written, well thought and well explained computer science and programming articles, quizzes and practice/competitive programming/company interview Questions.

lapis sequoia Aug 13, 2024, 3:42 AM

#

!d numpy.asarray

arctic wedgeBOT Aug 13, 2024, 3:42 AM

#

numpy.asarray

numpy.asarray(a, dtype=None, order=None, *, device=None, copy=None, like=None)
Convert the input to an array.

lapis sequoia Aug 13, 2024, 3:42 AM

#

I see

unkempt wigeon Aug 13, 2024, 3:43 AM

#

thank you

#

im sorry

lapis sequoia Aug 13, 2024, 3:43 AM

#

unkempt wigeon thank you

asarray

#

looks like a wrapper

#

of the numpy.array

unkempt wigeon Aug 13, 2024, 3:44 AM

#

[0.1 0.2 0.3 0.4]

lapis sequoia Aug 13, 2024, 3:44 AM

#

https://stackoverflow.com/questions/14415741/what-is-the-difference-between-np-array-and-np-asarray

Stack Overflow

What is the difference between np.array() and np.asarray()?

What is the difference between NumPy's np.array and np.asarray? When should I use one rather than the other? They seem to generate identical output.
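
A minimal sketch of the main practical difference, using made-up variable names: for a plain list both functions build a new array, but when the input is already an array, `np.array` copies by default while `np.asarray` hands back the same object.

```python
import numpy as np

data = [0.1, 0.2, 0.3, 0.4]

# From a list, both create a new array with the same contents.
from_list = np.asarray(data)

existing = np.array(data)
same = np.asarray(existing)  # no copy: this is the same object
copied = np.array(existing)  # copy by default

print(same is existing)    # True
print(copied is existing)  # False

# A change made through `same` is visible in `existing`, but not in `copied`.
same[0] = 9.9
print(existing[0])  # 9.9
print(copied[0])    # 0.1
```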

lapis sequoia Aug 13, 2024, 3:44 AM

#

unkempt wigeon [0.1 0.2 0.3 0.4]

?

unkempt wigeon Aug 13, 2024, 3:48 AM

#

#===[imports]===#
import numpy as np
#===============#

X = [0.1, 0.2, 0.3, 0.4]

converted_data0=np.asarray(X)

print(converted_data0)

#

@lapis sequoia

serene scaffold Aug 13, 2024, 3:52 AM

#

unkempt wigeon https://www.geeksforgeeks.org/convert-python-list-to-numpy-arrays/

I would avoid geeksforgeeks entirely. They have bad quality control.

unkempt wigeon Aug 13, 2024, 3:54 AM

#

that was an article i found, im sorry

#

how could i get the data from the array?

#

@serene scaffold

lapis sequoia Aug 13, 2024, 4:34 AM

#

unkempt wigeon ```py #===[imports]===# import numpy as np #===============# X = [0.1, 0.2, 0.3...

use

#

import numpy as np
lis = np.array([1,2,3,4])

unkempt wigeon Aug 13, 2024, 4:39 AM

#

how could i extract the data now from the array?

#

my apologies

lapis sequoia Aug 13, 2024, 4:44 AM

#

unkempt wigeon how could i extract the data now from the array?

wdym?

#

lis is the same as converted_data0

#

in your example

unkempt wigeon Aug 13, 2024, 4:48 AM

#

take what is in the array and make it where I can add all of the numbers in the array

lapis sequoia Aug 13, 2024, 4:53 AM

#

unkempt wigeon take what is in the array and make it where I can add all of the numbers in the...

!e
import numpy as np
print(sum(np.array([1,2,3,4])))

arctic wedgeBOT Aug 13, 2024, 4:53 AM

#

lapis sequoia !e import numpy as np print(sum(np.array([1,2,3,4])))

:white_check_mark: Your 3.12 eval job has completed with return code 0.
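
A small sketch of adding up the numbers in an array (the `grid` example is an illustrative addition): the built-in `sum` works on a 1-D array, and the array's own `.sum()` method also handles 2-D data and can sum along one axis.

```python
import numpy as np

values = np.array([1, 2, 3, 4])

total_builtin = sum(values)  # Python's built-in sum, element by element
total_numpy = values.sum()   # NumPy's own method, usually faster on large arrays
print(total_builtin, total_numpy)  # 10 10

# For 2-D data, .sum() adds everything unless an axis is given.
grid = np.array([[1, 2],
                 [3, 4]])
print(grid.sum())        # 10
print(grid.sum(axis=0))  # [4 6]  column totals
print(grid.sum(axis=1))  # [3 7]  row totals
```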

smoky basalt Aug 13, 2024, 6:09 AM

#

yo where do i learn pandas and data science and ai

#

and the whole lot

violet gull Aug 13, 2024, 9:03 AM

#

unkempt wigeon https://youtu.be/TEWy9vZcxW4?feature=shared&t=1236

just a warning: this creator stopped making tutorials right before he got to the actually complicated stuff that needs tutorials