#data-science-and-ml


tribal meteor
#

Thank y’all for the help

river cape
#

Is JSON very important?

past meteor
#

It used to be XML, but it's been replaced by JSON

#

It's also used for configuration, but there I'd say YAML, TOML etc are a bit more popular

gleaming gyro
#

I have this following data

#

which I am trying to plot by

plt.plot(options['wavelength']*1e9, results_RCWA_Matrix[0].R[0], 'b-', label='R')
plt.plot(options['wavelength']*1e9, results_RCWA_Matrix[0].A_prof[0]+results_RCWA_Matrix[0].A_bulk[0]+results_RCWA_Matrix[0].A_interface[0], 'r-', label='A')
plt.plot(options['wavelength']*1e9, results_RCWA_Matrix[0].T[0], 'g-', label='T')
plt.plot(options['wavelength']*1e9, results_RCWA_Matrix[0].A[0], 'r-', label='A')
Though I can see A_prof and A in the dataset, I think I am not indexing it correctly
#

which is why I get the following error
AttributeError: 'Dataset' object has no attribute 'A_prof'
Can someone help me index it properly

long canopy
#

is there a humaneval leaderboard somewhere?

lapis sequoia
#

guys I don't understand the concept of a loss function. Imagine I train an actor-critic model, what exactly is being lost?

#

I know there is a reward function, but what does loss function mean?

#

Imagine this is the reward function, what would the loss function mean in terms of concept?

red bane
#

I tried installing tensorflow, but when I checked for the available gpu lists, it keeps coming out as having no available gpus. Running the command "nvidia-smi" in terminal shows the following image. Can anyone tell me what I've done wrong and what I should do to enable tensorflow to use gpu?

long canopy
#

less loss: good

#

lossless: audio format

lapis sequoia
#

How does it know what the real outcomes are if they are unknown? If the model knew real outcomes, it would immediately set its state to optimal policy at the next step

#

loss = current reward + (discount rate) * (estimated value of new state) - value of current state
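The expression above is usually called the one-step temporal-difference (TD) error. The critic never sees the "real" outcome: the target is built from the observed reward plus the critic's own estimate of the next state. A minimal sketch follows, assuming the common (but not universal) choice of squaring the TD error to get a critic loss; all numbers are made up for illustration.

```python
def td_error(reward, gamma, value_next, value_current, done=False):
    """One-step TD error: r + gamma * V(s') - V(s).

    When the episode has ended there is no next state, so V(s') counts as 0.
    """
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_current


# Illustrative numbers only (not taken from any real training run).
delta = td_error(reward=1.0, gamma=0.99, value_next=0.5, value_current=0.25)
critic_loss = delta ** 2  # squared TD error, one common critic loss

print(delta, critic_loss)
```

A TD error of zero means the critic's estimate for the current state already agrees with what it just observed plus its estimate for the next state.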

long canopy
#

loss = bad

#

more loss = more bad

#

no more thinking than that

lapis sequoia
long canopy
#

absolute badness, pure evil

#

pure not-want

#

this is as much as you need to think about it because there are arbitrarily many loss implementations

#

there is no "The Loss Function"

long canopy
# lapis sequoia bad comparing to what?

you make a forward pass, you check what your loss function tells you, you modify your weights, you pass forward again, then you check your loss function again. if it increased, your weight mod is not a good one
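The loop described above can be sketched literally as "change the weight, re-check the loss, keep the change only if the loss went down". This is plain random search on a hypothetical one-parameter model, chosen to mirror the description; real training normally uses gradients to pick the weight change instead of guessing.

```python
import random


def loss(weight, xs, ys):
    """Mean squared error of the one-parameter model y = weight * x."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


random.seed(0)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # toy data generated with weight = 2

weight = 0.0
initial_loss = current_loss = loss(weight, xs, ys)

for _ in range(500):
    candidate = weight + random.uniform(-0.1, 0.1)  # modify the weight
    candidate_loss = loss(candidate, xs, ys)        # forward pass + loss check
    if candidate_loss < current_loss:               # keep it only if loss dropped
        weight, current_loss = candidate, candidate_loss

print(weight, current_loss)
```

Swapping `loss` for any other function where "bigger is worse" leaves the loop unchanged, which is the point being made in the chat.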

lapis sequoia
#

i know there are different implementations

long canopy
#

you create ANY loss function that does this

#

YOU decide what your loss function is

lapis sequoia
#

so loss function is the difference between the desired outcomes from the already sampled ones and the random guess, like in linear regression?

long canopy
#

yes but only insofar as an increase in the loss function is bad, otherwise it's not called a loss function but a gain/value/reward function

#

and it's not linear regression

#

i repeat: you decide what your loss function is

lapis sequoia
long canopy
#

can you modify your model? if you created the model, whatever you want. otherwise, you need to check how the model was programmed

final kiln
#

I am now meeting rusts borrow checker

Unsure if happy about it

remote stream
#

someone can help me with data analysis of a dataset

#

i am having headache

void crescent
#

im kinda new to AI, but why isnt glob returning anything?

#

absolutely nothing is happening

#

when i run that second cell

solar oriole
#

how deep should i learn mathematics to learn machine learning?

#

should i be like perfect in the concepts or is it enough to know basic definitions and formulas?

past meteor
#

And how you want to do it

solar oriole
#

ohh

#

i still dont have any idea how far i might go

#

i am just learning cuz i am curious, interested and for my career

past meteor
#

Then I'd say just start applying it

#

Look for hands-on tutorials

solar oriole
#

i got the course material and videos

past meteor
#

If ML is "for you" you'll fall in a rabbit hole and you'll voluntarily learn the math

#

I assume you're not yet in university?

solar oriole
#

some of the math is being taught in the college soo

solar oriole
past meteor
#

With a standard linear algebra and calculus course you can get far

#

I'm not from the US so I don't know what all those calc 1, 2, 3 things are but

past meteor
#

Basically, if you have a solid notion of linear algebra and multivariate calculus you're fine

#

with fine I mean, fine for applied ML. If you want to work on designing new paradigms you need more math

#

But I wouldn't worry about that

solar oriole
#

they taught us linear algebra in the first semester, i am good at it, i scored perfect in exams, but idk if those topics are enough

#

ik eigen values, eigen vectors, curve fitting, etc

#

is that enough for linear algebra should i move on to calculus??

#

i also know partial differentiation and some integration

past meteor
#

Then you know enough math to get started

solar oriole
past meteor
#

yes

solar oriole
#

i know some basic python

#

like loops, functions and such

#

should i learn more?

past meteor
#

Yes

#

My philosophy is that you should only learn 1 really new thing at a time

#

So do Python and "just" Python until you're comfortable with it and then move on to descriptive statistics / data visualisation and then ML

solar oriole
#

now since u said my current math knowledge is enough

#

i will learn python then

long canopy
#

what binary classifier for text architecture should i look into if i'm looking to optimize either for performance or for resource efficiency? (it's probably a different one for each)

final kiln
# long canopy what binary classifier for text architecture should i look into if i'm looking t...

Text Classification is the task of assigning a sentence or document an appropriate category. The categories depend on the chosen dataset and can range from topics.

Text Classification problems include emotion classification, news classification, citation intent classification, among others. Benchmark datasets for evaluating text classifica...

long canopy
final kiln
long canopy
dusty forge
#

Hi, need some clarification. In this course we replaced the 'State' column with two additional columns to get rid of the categorical values. He calls them Dummy Vars. Then he mentions to always use only one Dummy (in this example of two values), because if it's not 1, it must be Cali.

The part I'm confused about: in another course we replaced a column 'Countries' with additional columns, but we called them Vectors. To be fair, in that example there were three countries, but we used all the columns.

How is this situation different than the one where we call them Vectors and use all the columns?

wooden sail
#

since there are only two classes, saying class a or class b is the same as saying class a or not class a
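One way to see why the second dummy column is redundant is to look at the design matrix of a linear model with an intercept. The sketch below uses made-up rows and assumes the second state is "New York" (the chat only names California); with both dummies present, the two columns always add up to the intercept column, so the matrix gains a column but no rank.

```python
import numpy as np

# Hypothetical 'State' column with two categories.
states = np.array(["New York", "California", "California", "New York", "California"])

is_new_york = (states == "New York").astype(float)
is_california = (states == "California").astype(float)
intercept = np.ones(len(states))

# Intercept plus both dummies: the two dummies always sum to the intercept column.
X_both = np.column_stack([intercept, is_new_york, is_california])
# Intercept plus one dummy: "not New York" already means California.
X_one = np.column_stack([intercept, is_new_york])

print(np.linalg.matrix_rank(X_both))  # 2, although there are 3 columns
print(np.linalg.matrix_rank(X_one))   # 2, full column rank
```

Whether the extra column actually causes trouble depends on the model: it matters for an unregularized linear regression with an intercept, while many other models accept the full set of one-hot columns.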

void crescent
#

guys can someone help me in building a dataset

#

for training a sequential model

#

i have 2 LISTS of images with labels of 2 different classes

#

but idk how to make a dataset out of these

void crescent
#

guys

#

which extension should i use to save the model

#

.keras or .h5

final kiln
#

My deploy is gonna be a rust binary and a single python file that deploys the pipeline.

dusty forge
wooden sail
#

you could also say it's a vector of dummy vars, or that the dummy vars are vector-valued

#

and even that each entry of the dummy vars, which are vectors, is a scalar dummy var

long canopy
#

where would you guys go for code datasets?

wooden sail
#

it's all true, you can pick the nomenclature that helps you most

long canopy
dusty forge
void crescent
#

guys

#

can someone tell me

#
test = cv2.imread("breast-hispathology-images/IDC_regular_ps50_idx5/8863/0/8863_idx5_x51_y1251_class0.png")

print(test.shape)
model.predict(test)

why this return error?

#
Invalid input shape for input Tensor("sequential_2_1/Cast:0", shape=(32, 50, 3), dtype=float32, device=/job:localhost/replica:0/task:0/device:CPU:0). Expected shape (None, 50, 50, 3), but input has incompatible shape (32, 50, 3)
#

test.shape gives

#

(50, 50, 3)
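The error message suggests that `predict` treated the first axis of the single image as the batch axis, so it saw slices of shape (50, 3) instead of one (50, 50, 3) image. A minimal sketch of adding the missing batch axis, using a zero array as a stand-in for the real image:

```python
import numpy as np

# Stand-in for the 50x50 three-channel image that cv2.imread returned in the chat.
test = np.zeros((50, 50, 3), dtype=np.uint8)

batch = np.expand_dims(test, axis=0)  # add a leading batch axis
print(batch.shape)  # (1, 50, 50, 3)

# model.predict(batch) would then receive the (None, 50, 50, 3) shape it expects;
# `model` is the Keras model from the chat and is not defined in this sketch.
```

Any preprocessing used during training (for example rescaling pixel values) would also need to be applied to `batch` before predicting.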

runic lantern
#

Hi I have something I want to implement using PySpark's pandas UDF but I cannot figure out how, can someone help me out please

serene scaffold
final kiln
void crescent
final kiln
#

Also be careful with cv2.imread, it reads BGR instead of RGB, so you might need to permute the dimensions
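For reference, swapping BGR to RGB only reverses the last (channel) axis; the height and width axes stay where they are. A small sketch with a hypothetical two-pixel image:

```python
import numpy as np

# Hypothetical 1x2 image in BGR order: a pure-blue pixel and a pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

rgb = bgr[..., ::-1]  # reverse only the last (channel) axis

print(rgb[0, 0].tolist())  # [0, 0, 255] -> blue in RGB order
print(rgb[0, 1].tolist())  # [255, 0, 0] -> red in RGB order
```

With OpenCV available, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` does the same conversion. Whether it is needed depends on how the training images were loaded: if the model was trained on images read with `cv2.imread`, the BGR order should be kept at prediction time too.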

void crescent
#

oh

final kiln
#

torch.Tensor(test) should work

#

torch.tensor(np.array([[1, 2, 3], [4, 5, 6]]))

From the docs

void crescent
#

array([[9.999566e-01, 4.341068e-05]] my classes are 0 and 1

#

so which one is the actual prediction?

final kiln
#

What am I looking at ? Is it the output of the model ?

void crescent
#

yes

#

thats what its giving when i followed ur instructions

final kiln
#

Do sum(output) and see what you get

void crescent
#

??

#

its returning a tensor

#

is that supposed to happen

final kiln
#

No, sum the output

#

Not the input

#

sum(x)

void crescent
#

which one is the output

final kiln
#

sum(x[0])

#

This one then

void crescent
#

how is that possible?

#

the classes are 0 and 1

final kiln
#

Awesome, it's a bit above one

void crescent
#

wait wtf

final kiln
#

But that's due to floating point awkwardness

#

They are probabilities

#

For each of your classes

void crescent
#

its CLASS 0

#

WHY IS IT GIVING 1

#

wait let me try with class1

#

its giving 0.9999

#

for class 1

agile cobalt
void crescent
#

ohh

#

what do i do argmax on?

agile cobalt
# void crescent ohh

Look up how Logistic Regression works, it sounds like you did not read up on how the model you are using works at all before trying to use it?

void crescent
#

so i assumed the same

#

and got stuck

#

i used CNN on Malaria

#

and im pretty sure this is also a CNN

agile cobalt
void crescent
#

i never had to do that in my previous projects

agile cobalt
#

If you were always doing binary classification before, it might have only had one output, but if you need to work with three or more classes that approach won't work

void crescent
#

o

#

ok what do i do argmax on

#

?
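The argmax goes on the model output, along the class axis. A minimal sketch using the numbers posted earlier in the chat, assuming column 0 corresponds to class 0 and column 1 to class 1 (this depends on how the labels were encoded during training):

```python
import numpy as np

# Output copied from the chat: one row per input image, one column per class.
pred = np.array([[9.999566e-01, 4.341068e-05]])

predicted_class = np.argmax(pred, axis=1)  # index of the largest probability per row
confidence = pred.max(axis=1)              # the probability of that class

print(predicted_class)  # [0]
print(confidence)
print(pred.sum(axis=1))  # close to 1, as expected for probabilities
```

For two classes, reading `pred[0][1]` and comparing it with 0.5 gives the same decision as the argmax.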

runic lantern
#

I have this dataframe that I have converted into a spark dataframe

void crescent
#

thats what happens when i do argmax of the result

runic lantern
# runic lantern I have this dataframe that I have converted into a spark dataframe
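# Assumed imports: the calls below look like newspaper3k and fake-useragent
from fake_useragent import UserAgent
from newspaper import Article, ArticleException

def get_article_details(article_id: int, article_url: str):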
    
    agent = UserAgent()

    try:
        article = Article(article_url, headers={"User-Agent": agent.random})
        article.download()
        article.parse()
        article.nlp()
        
        if len(article.text) > 0 and len(article.summary) > 0:
            return (article_id, article.text, article.summary)
        
        else:
            print(f"article with url -> {article_url} extracted with 0 length text or 0 length summary\n")

    except(ArticleException, OSError) as e:
        print('***FAILED TO DOWNLOAD***', article_url, "\n")
        print(e, "\n")
        
    return (None, None, None)
void crescent
#

i just did x[0][1]

runic lantern
#

I have trouble converting this function into a pandas UDF, can I get some pointers on how to do so?

void crescent
#

and that seemed to give accurate results

remote stream
#

Anyone knows data analysis, here

river cape
#

When do we use linear classifiers and non-linear classifiers?

runic lantern
# runic lantern ```Python def get_article_details(article_id: int, article_url: str): a...

the main issue is that I have found out (I may be wrong here), the output of a UDF is a single column or a pandas series. but I want to output two columns or two series. I can't really figure out how to do it. read some stack overflow questions on this and the workaround is using nested types and then selecting the individual columns from the nested types but all those answers were using normal python UDFs in spark and not the pandas UDF. so yeah links to these stack overflow posts:

https://stackoverflow.com/questions/35322764/apache-spark-assign-the-result-of-udf-to-multiple-dataframe-columns
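The nested-type workaround also applies to pandas UDFs: when the declared return type is a struct, a Series-to-Series pandas UDF returns a `pandas.DataFrame` whose columns become the struct fields. Below is a sketch of the batch function in plain pandas, with the Spark wrapping shown only in comments. `fetch_one` is a hypothetical stand-in for `get_article_details`, and the Spark DataFrame name `sdf` and its columns `id` and `url` are assumptions.

```python
import pandas as pd


def fetch_one(article_id, article_url):
    """Hypothetical stand-in for get_article_details; returns (id, text, summary)."""
    return (article_id, f"text of {article_url}", f"summary of {article_url}")


def article_details(ids: pd.Series, urls: pd.Series) -> pd.DataFrame:
    """Batch version: takes two Series and returns one DataFrame with three columns."""
    rows = [fetch_one(i, u) for i, u in zip(ids, urls)]
    return pd.DataFrame(rows, columns=["id", "text", "summary"])


# Plain-pandas check of the batch function.
out = article_details(pd.Series([1, 2]), pd.Series(["a.com", "b.com"]))
print(out)

# In Spark, the same function could be wrapped like this (not run here):
#   from pyspark.sql.functions import pandas_udf
#   details_udf = pandas_udf(article_details, "id long, text string, summary string")
#   result = sdf.select(details_udf("id", "url").alias("d")).select("d.*")
# `sdf` and its column names "id" and "url" are assumptions for illustration.
```

The final `select("d.*")` expands the struct column into separate top-level columns.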

void crescent
#

@agile cobalt when i try to mass predict its making mistakes

#

can you tell me why pls

#

oh its coz the accuracy is 78%

#

nvm i gotta do more epochs

lapis sequoia
#

I'm studying Actor-Critic models using tensorflow, in particular CartPole-v1 from gym.

#

I kind of understand what a policy is = it's a Categorical distribution of the most likely beneficial outcomes. However I do not understand what a state value means

#

state value is tf.Tensor([[-0.11380462]], shape=(1, 1), dtype=float32)

#

policy is tf.Tensor([[0.42239672 0.5776033 ]], shape=(1, 2), dtype=float32)

#

Does anybody understand what the state value of -0.11380462 means in layman terms for Cartpole-v1 environment?

#

I know policy shows the best estimates for how likely pushing cartpole left or right is, hence it's a categorical distribution with 2 values. Model thinks 0.42239672 is chance of having reward +1 by pushing left, and 0.5776033 is chance of having reward +1 when pushing right

#

However I have no idea how to explain the state value of -0.11380462. I know this is the approximation of value function in current state from a critic

final kiln
#

Borrow checker was easy to avoid, kept it at bay for now

#

Otherwise, rust is really good

#

Auto docs, good type system, builtin test features

lapis sequoia
#

ok I spoke to chatgpt4 and apparently this -0.11380462 number is only an estimate of all expected future rewards starting from the current state, which can be improved since the critic is itself a neural network. Does that mean that an Actor-Critic neural network is simply a Markov chain with an estimated value assigned to the current state and estimated probabilities of going to all possible adjacent states according to the list of possible actions?
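The quantity the critic tries to estimate can be written down directly as a discounted sum of future rewards. A minimal sketch, assuming a discount rate of 0.99 and a hypothetical episode in which the pole stays up for three more steps (CartPole gives +1 per step):

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the k-th future reward is weighted by gamma**k."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: three more steps survived, +1 reward each.
rewards = [1.0, 1.0, 1.0]
print(discounted_return(rewards, gamma=0.99))  # 1 + 0.99 + 0.99**2
```

Since every CartPole reward is positive, a well-trained critic should output positive values; a small negative number such as -0.11 is plausible for a critic that has barely been trained yet.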

odd meteor
lapis sequoia
lapis sequoia
odd meteor
lapis sequoia
odd meteor
odd meteor
dusty forge
#

Question about One Hot Encoding. Before the encoding, the column it will be applied to is at [3], after it was moved to [0][1][2], why is this?

dusty forge
odd meteor
# river cape When do we use linear classifiers and non-linear classifiers?

There's really no rule of thumb to this. Usually, most people use a linear-based model as baseline model and benchmark its performance against the non-linear based models (Tree-based classifiers (Decision Tree), Distance-based (KNN), Ensemble, Probability-based (Naive Bayes), Kernel-based (SVM), Neural Network etc)

full ore
#

Hi I have a question about Jupyter Notebook: Do I need all of the code for a given thing in one cell or can I spread code across multiple cells and reference them like I would normal .py files in a directory? Say if I had a cell specifically for defining dataclasses and other classes and then a different cell for reading and formatting my csv data and then yet another cell for tying the other two together.

lapis sequoia
long canopy
odd meteor
long canopy
#

i.e., none of these are strictly superseded, right?

dusty forge
odd meteor
long canopy
#

hm i see, so it's really a case-by-case, hands-on, let's try it out for this particular dataset kind of thing

#

thanks a lot for the answer, am still working up to the RAG articles you shared btw

grizzled furnace
#

does anyone know of any alternatives to Plotly?
I want to have interactive 3d plots, but plotly is sooo slow 😦

grizzled furnace
grizzled furnace
# final kiln pyvista

by any chance, do you know how to assign a specific color to each point according to some 'rule'?
Trying to plot a complex function and can't find anything in the docs

final kiln
grizzled furnace
final kiln
grizzled furnace
final kiln
#

It's certainly possible

grizzled furnace
final kiln
#

Trying to figure out

#

The color here seems to be just the height right

odd meteor
final kiln
#

Seems to be this

#

Instead of the mappings > constant, you'd be using phases > constant

grizzled furnace
final kiln
#
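import numpy as np
from matplotlib.colors import ListedColormap

# `mesh` is assumed to be an existing PyVista mesh with a 'values' array
# Define the colors we want to use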
blue = np.array([12 / 256, 238 / 256, 246 / 256, 1.0])
black = np.array([11 / 256, 11 / 256, 11 / 256, 1.0])
grey = np.array([189 / 256, 189 / 256, 189 / 256, 1.0])
yellow = np.array([255 / 256, 247 / 256, 0 / 256, 1.0])
red = np.array([1.0, 0.0, 0.0, 1.0])

mapping = np.linspace(mesh['values'].min(), mesh['values'].max(), 256)
newcolors = np.empty((256, 4))
newcolors[mapping >= 80] = red
newcolors[mapping < 80] = grey
newcolors[mapping < 55] = yellow
newcolors[mapping < 30] = blue
newcolors[mapping < 1] = black

# Make the colormap from the listed colors
my_colormap = ListedColormap(newcolors)
#

You can edit the newcolors array as you wish, doesn't need to be based on the mapping height

#

Oh wait it's doing custom ranges, oops

#
import numpy as np
import pyvista as pv
from matplotlib.pyplot import get_cmap

# create an image using numpy,
xx, yy = np.meshgrid(np.linspace(-200, 200, 20), np.linspace(-200, 200, 20))
A, b = 500, 100
zz = A * np.exp(-0.5 * ((xx / b) ** 2.0 + (yy / b) ** 2.0))

# Creating a custom RGB image
cmap = get_cmap("nipy_spectral")
norm = lambda x: (x - np.nanmin(x)) / (np.nanmax(x) - np.nanmin(x))
hue = norm(zz.ravel())
colors = (cmap(hue)[:, 0:3] * 255.0).astype(np.uint8)
image = colors.reshape((xx.shape[0], xx.shape[1], 3), order="F")

# Convert 3D numpy array to texture
tex = pv.numpy_to_texture(image)

# Render it (curvsurf is assumed to be an existing PyVista surface with texture coordinates)
curvsurf.plot(texture=tex)
#

Seems to be this, create a texture based on the phase and apply it
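An alternative to a texture is to compute one RGB color per point directly from the rule. The sketch below maps the phase of a complex function to a hue; `f(z) = z**2` and the 5x5 grid are placeholders, and only NumPy and the standard library are used.

```python
import colorsys

import numpy as np

# Sample a complex function on a small grid; f(z) = z**2 is just an example.
x, y = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
z = x + 1j * y
w = z ** 2

# Rule: map the phase of w, which lies in [-pi, pi], to a hue in [0, 1).
hue = (np.angle(w) + np.pi) / (2 * np.pi) % 1.0

# One RGB triple per point, as uint8 values in 0..255.
rgb = np.array([colorsys.hsv_to_rgb(h, 1.0, 1.0) for h in hue.ravel()])
colors = (rgb * 255).astype(np.uint8)

print(colors.shape)  # (25, 3)
```

In PyVista, an array like `colors` can, as far as I know, be attached to the mesh as point data and rendered with `rgb=True` in `add_mesh`; the order of the rows has to match the order of the mesh points, so the flattening order may need adjusting.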

long canopy
final kiln
#

I forget how amazing this lib is

long canopy
#

some neat diagrams for video architectures

final kiln
long canopy
#

yeah was just pointing out the neat diagrams

final kiln
#

Oh they do have a point

#

First time I was reading the 2017 paper I was perplexed

final kiln
long canopy
#

yeah it's literally a full visual specification

#

i.e. you can fully replicate the model component from this picture

final kiln
#

It's too heavy for what it is, scaled dot product is fairly simple

#

Honestly just having the equations laid out would've helped a lot streamline that paper

long canopy
#

heheh

final kiln
#

Look at the space the equation occupies, and at the space the diagram occupies

long canopy
#

beautiful

final kiln
#

It looks cool, but idk if it makes it easier for me to understand

long canopy
# final kiln

problem is, the equation doesn't tell you how to implement it

#

picture does

#

but yeah it's a bit heavy

#

the first parts of the paper go into the symbology

final kiln
final kiln
long canopy
#

hmmmm

final kiln
#

I've been using index notation in my writings and code

long canopy
#

nice yeah that works

final kiln
#

Like so

#

I wonder if it will pick up, it really helps me at least

long canopy
#

best stuff is whatever works

final kiln
#

I wonder if GitHub will be mad at me if I use the attachments feature of the releases as a dataset store

#

Likely not right, Im sure their traffic completely overshadows anything I might do with duckdb and parquet

#

And they probably rate limit this stuff automatically

agile cobalt
#

you probably should use HuggingFace Datasets instead?

final kiln
#

Why ?

past meteor
#

I have my interns working on goose.ai / HF inference endpoints / openAI APIs

final kiln
#

I like having stuff in one place

past meteor
#

Anything else interesting they should look at?

long canopy
agile cobalt
#

actually meant to be used for large files instead of effectively abusing another service in a way it's not meant to be used as

long canopy
final kiln
#

They do set a limit for the size of the files

past meteor
#

We have 80GB vRAM but it's more cost effective to have them use a LLM service than getting them access to our compute

agile cobalt
final kiln
#

And I'm not sure I'll be doing worse than what the most popular libs get naturally

past meteor
#

Well, I think our wallet is decently deep and their experiments are small

#

But it's more expensive to go through IT to get them SSH access to our servers than paying for credits if you feel me?

long canopy
#

@past meteor are you guys using goose.ai's GPT-Neo or Fairseq?

#

never heard of these

past meteor
final kiln
#

Ig I'll just push the parquet files to S3 then, still keep a copy in the releases

long canopy
agile cobalt
#

If quality doesn't matter a lot, you can run something like Phi2, Gemma, Llama2 or Mistral using relatively little compute (a single good GPU)

final kiln
#

It's all gonna happen inside AWS so it shouldn't cost that much extra for the movement of data

past meteor
#

With LLMs I don't have a good feeling on size vs quality (unlike with say CV)

long canopy
past meteor
#

Hence why I need this intern to find it for me 😂

long canopy
#

has anyone tried anything with parallelizing small models? because i'm thinking of some pipelines with multiple small LLMs that could get work done better than throwing a single request at a huge model

past meteor
#

It's also relative to the task we're solving

long canopy
agile cobalt
#

iirc the biggest difference is in reasoning/logic, so if it's for something trivial like re-formatting text a small model works fine, but for solving logical problems it has to be fairly big

past meteor
#

All I can say rn is that it's related to education

#

Reasoning is required

#

It'll have to be a big model then yeh

agile cobalt
#

you can also test Gemini and Mistral's non-open source models they offer via API

and not even sure if it was worth looking into, but Stability.ai also has a model (Stable Text)

hollow mortar
#

oh nice their lib can draw surfaces now

iron basalt
long canopy
#

heheh

final kiln
long canopy
final kiln
#

They like Gemini lite or something

past meteor
#

Gemma is English only

final kiln
past meteor
#

Maybe I need to just scope them better

#

But there's the issue of ideally having EN + FR + DE + NL

agile cobalt
#

I think that Gemini was supposed to be pretty good at multi-lingual tasks

final kiln
#

Uhm, can't you stick a model that does translation for you on top of Gemma

past meteor
#

Not for this use case

#

Unless I scope it, but it gives me stuff to think about ofc

#

They're more or less prepping next ~ sept's grant submission

long canopy
#

gotta get that grant money

past meteor
#

not everything needs to be done so I could tell them to focus on just English to have a working PoC sooner

final kiln
#

I think even chat gpt decreases in quality if you speak in a language other than English

agile cobalt
#

a bit in quality, a lot in safety

past meteor
#

Our scope is education which means it's bound to be at least 4 langs

#

6 if you add latin/ancient greek

agile cobalt
past meteor
#

I'll keep the latter 2 out of scope for sure

long canopy
#

manim

desert oar
past meteor
#

But fundamentally, some tasks are literally translation so if I translate the text to EN it will not work properly

desert oar
#

i assume at some point you hit the p >> n problem

past meteor
#

Latin will fail but I'm very curious to see how well it'll do, but that's post grant money research

final kiln
#

That's actually a curious use case for LLMs right, preservation of language

lapis sequoia
#

Guys can you suggest fully local ai models - that can be run on cpu

final kiln
#

There's many dying rn

past meteor
#

Ancient Greek is a no-go

supple inlet
#

hello everyone, any resources you recommend for supply chain and logistics data for building forecast and optimisation models and just visualising in general?

supple inlet
long canopy
desert oar
supple inlet
#

Ive got a data science bsc and comfortable with python.

agile cobalt
# past meteor not everything needs to be done so I could tell them to focus on just English to...

Prompt engineering can change a lot depending on which model you are interacting with, not to mention translating stuff being full of complexities even when you are not talking about AI at all

Worst case scenario, making something for an English only model then trying to swap to a completely different model + trying to adjust the prompt for new languages could take as much work as building it without having made the PoC in first place

past meteor
#

Tbh, keeping it out of scope for funding round #1 is nice because I can ask for more money to research the classics 😂

final kiln
past meteor
lapis sequoia
past meteor
long canopy
#

heheh

supple inlet
#

for supply chain specific data, im not sure best approach

lapis sequoia
#

What can it do?

lapis sequoia
past meteor
#

I'm not the resident LLM guy at work either, I'll summon them when the time is right

lapis sequoia
#

Leaked Llama can take in and output docs, images, audio?

#

I have played with tloen ai

#

For example

#

The llama 2 - on its own does it comes with guard rails of any kind?

final kiln
lapis sequoia
#

Leaked one had none

past meteor
#

hmm yeah I should check it in my spare time

#

The cursed thing is that grant writing + "pre"research isn't budgeted

#

So I effectively have 0 work hours to read papers for this project before it gets money 🥴

final kiln
#

The whole paper publishing mechanism is wrong and unethical, researchers are not fairly compensated for their work

past meteor
#

the easiest way to explain it is that I'm kind of on the R&D side of academia

#

No pressure to publish and a decent salary

final kiln
#

Sounds like a nice compromise

past meteor
#

Drawback: less prestige

#

I'll be honest about that

long canopy
#

do a phd and convince people you're not wasting your time

lapis sequoia
#

I just tried llama2 - its full of restrictions

#

sticking with original

final kiln
#

Theres industry sponsored PhDs, if I ever go back that's the avenue I'm taking.

lapis sequoia
#

fine tuning does remove some of built in bias (restrictions) however its easier if main model is free

long canopy
#

it's there

#

it's always there, it's just in tiny bits in 50 different papers

lapis sequoia
#

Ai can make discoveries

final kiln
#

That's not the scary part

#

If you go for PhD, better make it be something you really love, cuz it's 3 to 4 years of being overworked for little to no money

lapis sequoia
#

Why do PHD officially?

#

Just discover and publish anon

final kiln
#

Like, if you're not going into research

desert oar
# supple inlet for supply chain specific data, im not sure best approach

are you looking for advice about datasets? other than looking around on kaggle, there might be some data published by industry groups or various government organizations. otherwise it's not exactly a hot AI field so you aren't likely to find anything resembling a benchmark dataset that's ready to use for machine learning.

final kiln
#

Is it worth it to spend almost half a decade ? I'd still do one because I like spending time acquiring knowledge, but idk if it would further my career

#

There might be a lot of nuance in there, what would happen if you filter for people in the software industry ?

lapis sequoia
#

AI is obsoleting phd

#

Ai is replacing most research job

#

If you need money - there is zero need for a PHD

final kiln
#

experience is also highly valued, a PhD might be paid less than a BSc cuz he junior level

lapis sequoia
#

Ai even now does discover and fast

final kiln
#

Imagine, you enter the industry right after finishing your BSc, you'll have 6 more years of experience than the person who went for PhD

lapis sequoia
#

Ai is cheaper, more efficient, replacing researchers

#

Not just google ai

#

Ai s in general

lapis sequoia
#

Its an old paradigm

#

Ai is replacing researchers

final kiln
#

The problem is the journals, the whole thing should be more like GitHub you know, maybe not the same thing but in the same spirit

lapis sequoia
#

Well, leaked Llama ensured zero-barrier knowledge

#

School, uni - obsoleting

final kiln
lapis sequoia
#

Direct communication with software capable of high level novel connections discovery, induction, deduction and coming self learning

#

Those who refuse to think - their choice

final kiln
#

At this point in time, I wouldn't use GPT4 to learn the hardest subjects like physics and math

#

But I strongly believe in their potential for education

#

Mass education like that will be a revolution for sure

#

Like up to some level, it just doesn't work well yet

#

But it's a matter of time ig

lapis sequoia
#

Wont be - most are lazy

#

Some are using

desert oar
lapis sequoia
#

Ai is a teacher

#

So unis are evaporating

#

Furthermore many AI models have zero guard rails - pure logic

final kiln
#

That's from 2012

lapis sequoia
#

In uni some professors disliked direction of my research. AI simply researches

#

Removing asking biased humans

long canopy
#

why not both lol

#

it would be dumb not to recognize the importance of uni

odd meteor
# final kiln Is it worth it to spend almost half a decade ? I'd still do one because I like s...

It's thrilling that ML Research allows one to explore the unknown, however I wouldn't wanna do that in academia.

I'll always prioritize places like Cohere, StabilityAI, Brain, or DeepMind over Academia.

Idk for sure if PhD is for me either. Maybe I'll know once I get my Msc (if I don't get offer for PhD before Msc.)

Also, I think some schools allow people to drop out of PhD after 2 years and settle for an Msc. Certificate if they discover midway they don't wanna continue PHDing

final kiln
long canopy
#

dunno, startups go with whatever looks promising

final kiln
long canopy
odd meteor
final kiln
#

Still not the uni tho

final kiln
iron basalt
#

(The most important / obvious thing being experience (in a job))

odd meteor
final kiln
#

It depends on the learning style, Im best suited for project settings for ex, and that was always where I got my best grades, or when I didn't show up for classes and appeared on the exams

lapis sequoia
#

Uni is obsoleting

lapis sequoia
#

Also most kids are using unguarded ai

#

Cycle shifting

iron basalt
# final kiln What would be a substitute ?

Papers published, projects made (note that if you show your Github, it should probably have all green squares in the commits over time part), work experience, connections.

lapis sequoia
#

Ai is better at research

iron basalt
#

Contributing to an existing project that is not yours is also a pretty good one, shows that you can work with other people and their stuff / ways of doing things.

lapis sequoia
#

Ai is allowing China to close semi conductors tech gap

iron basalt
#

(Can you adapt to the company)

final kiln
#

Actually kinda similar to the rest of the industry right

#

Except for the papers part

iron basalt
# final kiln Not that bad actually.

Should note that startups are very different, it's all skill there. In a larger company things like getting along with everyone else / fitting in starts to outweigh the skill the larger the company is.

#

As do degrees, and certifications, since they have a hiring at scale problem.

lapis sequoia
#

Companies using AI cutting workforce nrs

final kiln
lapis sequoia
#

No

iron basalt
#

To enforce this the hiring will often become more convoluted, take longer, and have somewhat arbitrary steps / requirements.

lapis sequoia
#

Which ai models are you using?

final kiln
iron basalt
#

Also there will just be a lot of competition for those companies, so you will probably not get in just by chance.

#

And since the hiring process takes so long / is convoluted, it just wasted a lot of your time (low probability of getting in).

#

There are lots of companies to work at, don't have to go to the most popular ones just because everyone wants to.

final kiln
#

I do really like open ai tho, they're doing the coolest stuff rn

lapis sequoia
#

Define coolest

#

Strange

#

There are myriad AIs reaching self-learning

final kiln
lapis sequoia
#

And OpenAI's models, well, you can prompt-hack them, but by virtue of being public-facing they're restricted

#

Knowledge is knowledge

hollow mortar
#

yh they have a python lib for all the visuals in the videos

lapis sequoia
#

Ai demands human rights Xd

versed pilot
lapis sequoia
#

Do we need Phds? What for?

odd meteor
iron basalt
lapis sequoia
#

Yep

iron basalt
#

You don't need to be a genius either; there are so many threads to tug on in ML that there are many simple but unexplored areas waiting for someone to try them.

versed pilot
# odd meteor What's even more crazy and funny at the same time is trying to get into PhD prog...

Universities give some undergraduate research opportunities e.g, some summer projects. And every project report or lab report (depending on what you are studying) is sort of structured in a similar way to a paper or a thesis, you know, introduction, method, results, discussion etc. Most degrees will involve a hefty final year project, at that point you will be reading some research papers and not just textbooks

versed pilot
iron basalt
#

I recommend taking inspiration from the Wright brothers, start making things, explore, just "wing it." Don't just do what is currently popular. If you do similar things you can expect similar results, if you do different things you can expect different results.

odd meteor
iron basalt
#

(Not even a high school diploma btw)

iron basalt
#

This goes for non-ML stuff too, for example, if you hang around on robotics related discords you will find some very interesting people making their own unique things, and some of those might even land you a job, or want to work with you on a project / paper.

#

But please do not go into it trying to explicitly get a job, just only if you are actually interested in making things, be genuine.

odd meteor
#

Those are usually the best in my opinion although it comes with its own special stress (which is worth it most times)

versed pilot
#

Sure, there are people doing interesting things that are up there with university work, outside typical ML. I'm thinking of e.g. https://gpsjam.org/ which is one guy's pet project, but gets quoted by Stanford papers.

Maps showing daily possible GPS interference.

iron basalt
odd meteor
final kiln
#

I'm tryna get published with the stuff I'm doing rn

#

It's not fancy stuff, but it's worth the publish if I get good results

versed pilot
#

Good luck!

final kiln
#

Ty

versed pilot
iron basalt
final kiln
iron basalt
#

Also in other fields, often the best thing is to buy old hardware that would otherwise be thrown out. Ebay.

final kiln
versed pilot
#

When I was at university I had a friend who would look through the skips to rescue random bits of dead kit

odd meteor
iron basalt
versed pilot
#

The days before they became serious about recycling

long canopy
#

i mean, do you put up links when you publish?

iron basalt
final kiln
#

Like, it's that known issue that journals refuse to publish falsification of hypothesis, thus corrupting the whole process

iron basalt
final kiln
iron basalt
odd meteor
# long canopy i mean, do you put up links when you publish?

Most are on Arxiv. I intend to stay incognito for now. But this is one of my works https://arxiv.org/abs/2304.09972

final kiln
iron basalt
#

Can try some conferences too, not the super popular ones everyone tries to get into, but any really, your local ones.

final kiln
#

Yeah I have a couple already

#

Also a talk at a workshop

#

But not ML stuff yet

#

Stuff from my previous field

iron basalt
#

Oh, a note about conferences, some are more business oriented, you want to find one that is about what you actually care about or you will be bored.

odd meteor
# final kiln I'm not sure if I can get published w/ bad results, at least that's not what we ...

Tbh I don't think there's a "bad result" in research. When exploring the unknown, you either unravel something spectacular or you learn your approach didn't yield the expected outcome.

Either way, you gain a new insight.

Publishing failed attempt is also laudable because it provides clear perspective as regards what was explored, how it was explored, and how to possibly approach the same work differently.

final kiln
#

I'd swallow my words if you showed me a nature article about a failed hypothesis

odd meteor
final kiln
#

I have way more crazier stuff planned for later.

#

I just need this as a stepping stone to gain experience.

odd meteor
#

ICLR 2024 decisions are now public, and it's confirmed that the recent (pretty high-profile) Mamba paper was rejected.

Sometimes I do wonder what those reviewers are smoking.

void crescent
#

guys, how is it possible that after 100+ epochs and with an accuracy of 0.99, this is happening?

#

only 74.5 accuracy

#

did i overfit?

#

how come train data the accuracy is so high

raw mortar
#

This is overfitted

#

The model is not complex enough to capture all the details

void crescent
#

when i did the same with 20 epochs

#

the same thing happened

#

how many do i need to do

raw mortar
#

What is it trying to do?

void crescent
#

detect cancer from 50x50x3

#

images

#
def create_model():
    model = Sequential()
    model.add(Conv2D(64, kernel_size=3, activation="relu", input_shape=(50,50,3)))
    model.add(Conv2D(32, kernel_size=3, activation="relu"))
    model.add(Flatten())
    model.add(Dense(2, activation="softmax"))
    adam = Adam(learning_rate=0.0001)
    model.compile(loss="binary_crossentropy", optimizer=adam, metrics=["accuracy"])

    return model
raw mortar
#

This is like experimenting, keep adding more layers, dropouts etc till something works

#

The norm nowadays would be retraining an existing model like resnet, because they're unbeatable

void crescent
#

they somehow got a different accuracy

#

and thats what im trying to figure out

raw mortar
#

I can't read this without signing in, why do blogs have to be so scummy 😐

void crescent
#

ill explain

#

my code is mostly the same as theirs

#

the part where im confused with

#

is that strategy.scope part

#

it was never defined

#

so i assumed it would be MirroredStrategy

#

but since they are getting different accuracy

#

while everything else is the exact same

#

i assume the problem is with strategy.scope

#

for this model what strategy would you recommend

raw mortar
#

Are you running it on a multi GPU setup btw?

void crescent
#

nah

raw mortar
#

Then this is irrelevant, you can just remove it

void crescent
#

will it cause a difference

#

is my question

#

my pc is kinda bad

raw mortar
#

Not that I'm aware of, the default parameter could have changed from the time the blog was published and the one you're using now

#

That could be why the results are different

void crescent
#

which default parameter

raw mortar
#

For any of the classes or function in the code

#

I'm just speculating really

#

But are they training it in GPU?

#

That could make a difference, like a GPU could look at more batches of samples at the same time

void crescent
#

so do i activate gpu accel then

#

wait lemme check if its already active

#

yes its on

raw mortar
#

Moreover, try looking into increasing the complexity of the model

#

From the graph it seems like it's overfit and not generalizing

void crescent
#

so add more layers

raw mortar
#

You could try the dropout layer between the conv ones

void crescent
#

k

#

but rn it seems to be different

#

its starting with 0.80 accuracy

#

on the validation

raw mortar
#

Ya each training behaves differently

#

Try reading about methods to reduce overfitting

void crescent
#

after a bit of research

#

is this fine?



def create_model():
    model = Sequential()
    model.add(Conv2D(64, kernel_size=3, input_shape=(50, 50, 3)))
    model.add(BatchNormalization())
    model.add(Activation("relu"))
    model.add(Conv2D(32, kernel_size=3))
    model.add(BatchNormalization())
    model.add(Activation("relu"))
    model.add(Flatten())
    model.add(Dense(128, kernel_regularizer=l2(0.01)))  
    model.add(BatchNormalization())
    model.add(Activation("relu"))
    model.add(Dropout(0.5))  
    model.add(Dense(2, activation="softmax"))
    
    adam = Adam(learning_rate=0.0001)
    model.compile(loss="binary_crossentropy", optimizer=adam, metrics=["accuracy"])

    return model

model = create_model()
model.summary()

odd meteor
past meteor
#

Then when you roll your own architecture you can compare your performance vis a vis what you got out of those two

final kiln
# void crescent

You should also consult papers with code to see what results SOTA is getting with this dataset. You might be getting close to the best possible and not realize it.

past meteor
void crescent
#

i changed the model a bit, but i am facing a problem

#

def create_simple_model():
    model = Sequential()
    
    # Convolutional layers
    model.add(Conv2D(64, kernel_size=3, input_shape=(50, 50, 3), activation="relu"))
    model.add(BatchNormalization())
    
    model.add(Conv2D(32, kernel_size=3, activation="relu"))
    model.add(BatchNormalization())


    model.add(Flatten())
    

    model.add(Dense(128, activation="relu", kernel_regularizer=l2(0.01)))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))


    model.add(Dense(1, activation="sigmoid"))
    

    adam = Adam(learning_rate=0.0001)
    model.compile(loss="binary_crossentropy", optimizer=adam, metrics=["accuracy"])

    return model

model = create_simple_model()
model.summary()

early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
mixed_precision.set_global_policy('mixed_float16')
tf.config.optimizer.set_jit(True)

model.fit(x_train, y_train, validation_data=(x_val, y_val), batch_size=64, epochs=20, verbose=1, callbacks=[early_stop])  # early_stop only takes effect when passed here
model.save("cancer.keras")
#

for some reason

#

it is showing 1 epoch as 8 hours???

past meteor
void crescent
#

no i have gpu accel activated

past meteor
#

After that I really think you need to look at my and propagation's advice

void crescent
#

i interrupted training to make some changes and now its 8 hours

#

i dont know how to use resnet

#

but right now the problem is why it is taking so long

hollow pumice
#

Uh well see, our teacher took all the examples once in one question to calculate Z but then in the next question he took only 1 example at a time. Why is that

#

All examples at once in the first question

#

But in this ques he asked us to take one example at a time

final kiln
past meteor
#

I've been there, even recently

#

It's good advice to pass on imho

void crescent
#

WHY

#

how is it showing that much

dusty forge
#

Question about Linear Regression and encoding categorical columns. Suppose I have 5 distinct categorical values: one-hot encoding would replace this column with 5 columns, one for each distinct value. No biggie. But what if I have 30 distinct values, or 100, or 10,000? Is one-hot encoding still the solution? For example, I'm doing a revenue forecast on 1000 stores across the country and I have a column with the city. Am I still supposed to use one-hot encoding?

past meteor
#

This encoding scheme is useful with categorical features with high cardinality, where one-hot encoding would inflate the feature space making it more expensive for a downstream model to process. A classical example of high cardinality categories are location based such as zip code or region.

#

Does that answer your question?
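
To make the idea concrete, here is a minimal sketch of target (mean) encoding in plain Python. It is not scikit-learn's `TargetEncoder` implementation, which as far as I know also uses cross-fitting and a different shrinkage rule; the `fit_target_encoding` helper, the `smoothing` parameter, and the city/revenue numbers are all made up for illustration. Each category is replaced by one number, the smoothed mean of the target for that category, so 1000 cities stay one column instead of becoming 1000.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=1.0):
    """Map each category to a smoothed mean of its targets (hypothetical helper)."""
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    # Rare categories are pulled toward the global mean by the smoothing term.
    encoding = {
        category: (sums[category] + smoothing * global_mean) / (counts[category] + smoothing)
        for category in counts
    }
    return encoding, global_mean


# Made-up training data: one city per store, one revenue figure per store.
cities = ["Austin", "Austin", "Boston", "Boston", "Boston", "Chicago"]
revenue = [100.0, 120.0, 200.0, 220.0, 210.0, 50.0]

encoding, fallback = fit_target_encoding(cities, revenue, smoothing=1.0)

# Unseen categories (here "Denver") fall back to the global mean.
encoded_column = [encoding.get(city, fallback) for city in cities + ["Denver"]]
print(encoded_column)
```

The encoding should be fitted on the training split only, otherwise the target leaks into the features.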

dusty forge
void crescent
#

how is it possible that even with such a simple model.. i am getting 1 epoch as 10+ hours

dusty forge
#

I used the TargetEncoder

#

and this is the result

#

Now I'm a full beginner, but these results are not bad right? 😅

final kiln
#

for a parquet file, is it better to have one table with everything or to split it into various tables and use the relational database thing, for example, is it better to do


CREATE TABLE test (
    id INTEGER,
    sentiment_id INTEGER,
    sentiment VARCHAR CHECK (sentiment = 'pos' OR sentiment = 'neg'),
    rating INTEGER CHECK (0 <= rating AND rating <= 10),
);


CREATE TABLE train (
    id INTEGER,
    sentiment_id INTEGER,
    sentiment VARCHAR CHECK (sentiment = 'pos' OR sentiment = 'neg'),
    rating INTEGER CHECK (0 <= rating AND rating <= 10),
);

or


CREATE TABLE dataset (
    id INTEGER,
    sentiment_id INTEGER,
    sentiment VARCHAR CHECK (sentiment = 'pos' OR sentiment = 'neg'),
    rating INTEGER CHECK (0 <= rating AND rating <= 10),
    split VARCHAR,
);
versed pilot
#

@final kiln doesn't it depend? if storage is expensive and compute (joins) is cheap, you split into tables and do a relational database. If storage is cheap and compute is expensive (think terabytes/petabytes on S3 etc.) then you keep one table

final kiln
raw mortar
#

test and train in different tables? why?

versed pilot
#

Thinking about it, you are not really joining on a unique key here, so not sure my answer is relevant to your particular scenario

#

the columnar compression should work better in the single table scenario than the two tables scenario though

final kiln
#

this is very close to how the files and folders are laid out



--- a positive review has a score >= 7 out of 10
CREATE TABLE template_pos (
    id INTEGER,
    review VARCHAR,
    score INTEGER CHECK (score >= 7),
);

--- negative review has a score <= 4 out of 10
CREATE TABLE template_neg (
    id INTEGER,
    review VARCHAR,
    score INTEGER CHECK (score <= 4),
);

CREATE TABLE test_pos AS FROM template_pos LIMIT 0;
CREATE TABLE test_neg AS FROM template_neg LIMIT 0;
CREATE TABLE train_pos AS FROM template_pos LIMIT 0;
CREATE TABLE train_neg AS FROM template_neg LIMIT 0;

DROP TABLE template_neg;
DROP TABLE template_pos;
#

I think during training I'll be sampling randomly from train_pos and train_neg

#

is there a query that selects randomly, I havent done a lot of sql
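
One common way to pick random rows is `ORDER BY random() LIMIT n`. The sketch below uses Python's built-in `sqlite3` as a stand-in so it runs without extra installs; the table contents are made up. As far as I know the same query shape works in DuckDB, which also has a dedicated `USING SAMPLE` clause, but check the DuckDB docs before relying on that.

```python
import sqlite3

# In-memory stand-in for the train_pos table from the schema above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train_pos (id INTEGER, review TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO train_pos VALUES (?, ?, ?)",
    [(i, f"review {i}", 7 + i % 4) for i in range(20)],  # made-up rows, scores 7..10
)

# Shuffle all rows and keep the first five: a random batch without replacement.
batch = conn.execute(
    "SELECT id, review, score FROM train_pos ORDER BY random() LIMIT 5"
).fetchall()

for row in batch:
    print(row)
```

Sorting the whole table is fine for small data; on very large tables a sampling clause or sampling ids in Python is usually cheaper.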

versed pilot
raw mortar
final kiln
#

that's what I was thinking yeah

raw mortar
#

still wondering why the data split into different tables 👀

final kiln
#
final kiln
#

so I'm hoping that getting them into different tables also means less bandwidth usage

#

like it uses some headers called range headers or something of the sort

raw mortar
#

haven't really used duckdb, but these are tables in memory and don't persist?

final kiln
final kiln
#

For Parquet files, DuckDB can use a combination of the Parquet metadata and HTTP range requests to only download the parts of the file that are actually required by the query.

raw mortar
#

ok so its like sqlite but for analytical purposes, the tables are just in memory for the duration of the runtime

final kiln
raw mortar
#

ya but the tables are not persisted to disk if i understand it correctly

#

you can output the result back, thats another thing

void crescent
#

for some reason, everytime i attempt to train with GPU accel, its 7+ hours per epoch

final kiln
#

the way im gonna set it up the files are gonna be in some remote, so it won't need to use up disk

void crescent
#

no

#

local gpu

#

and its a simple model

raw mortar
#

ya which gpu though

void crescent
#

only like 30k-40k params

void crescent
#

my gpu on my computer is intel (r) uhd (it sucks i know but its only 30k params)

raw mortar
#

check the gpu utilization during training, if its pinned at 100%, then can't do much

#

which os btw?

void crescent
#

windows

#

currently im re running the pre processing to see if there is an issue there

raw mortar
#

windows 10 and 11 have a gpu section in the task manager

#

ctrl+shift+esc to launch task manager

void crescent
#

where

raw mortar
#

performance section in the left

void crescent
#

wait wtf

#

i think it was a pre processing error

#

because now it shows 3 mins

#

in that case

#
def create_simplified_model():
    model = Sequential()

    model.add(Conv2D(64, kernel_size=3, input_shape=(50, 50, 3), activation="relu"))
    model.add(Conv2D(32, kernel_size=3, activation="relu"))

    

    model.add(Flatten())
    
 
    model.add(Dense(1, activation="sigmoid"))  # single output unit: sigmoid, not softmax
    
    # Compile the model
    adam = Adam(learning_rate=0.001)
    model.compile(loss="binary_crossentropy", optimizer=adam, metrics=["accuracy"])

    return model

is there a better way to improve this model because last time i did this it gave incorrect results
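
One thing worth checking in models like this is the output layer. With a single output unit, softmax normalises over just that one value, so it returns 1.0 whatever the input is, and the model cannot express "negative". Sigmoid is the usual choice for one unit with `binary_crossentropy`. A small plain-Python sketch (the `softmax` and `sigmoid` helpers are written out here for illustration, not taken from Keras):

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logit):
    """Squash one logit into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


logits = (-3.0, 0.0, 3.0)

# Softmax over a single unit: the only entry is divided by itself.
single_unit_softmax = [softmax([z])[0] for z in logits]

# Sigmoid on the same logits actually varies with the input.
single_unit_sigmoid = [sigmoid(z) for z in logits]

print(single_unit_softmax)
print(single_unit_sigmoid)
```

The two-unit softmax head used earlier in the conversation is a different setup and works when the labels are one-hot encoded.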

raw mortar
#

you have to add and remove layers if you're experimenting, and form an understanding of what you intend to do

#

the norm now is to use an existing arch and upgrade and modify it

raw mortar
lapis sequoia
#

I made reward function to be (1/100)*x*(100-x) which has a maximum at (50,25)

raw mortar
#

@final kiln this is how you work with duckdb?
having mixed feelings about mixing python and sql together :\

void crescent
#

facing the same issue again

#

7 hour for 1 epoch

#

gpu is at 9%

#

its fluctuating between 8 and 15

raw mortar
#

can you take a screencap and put it

void crescent
#

vs code taking 95% of memory

#
def create_improved_model():
    model = Sequential()

    model.add(Conv2D(64, kernel_size=3, input_shape=(50, 50, 3), activation="relu"))
    model.add(Conv2D(32, kernel_size=3, activation="relu"))

    model.add(Flatten())
    model.add(Dropout(0.5))  # Adjust the dropout rate

    model.add(Dense(1, activation="sigmoid"))

    # Compile the model
    adam = Adam(learning_rate=0.001)
    model.compile(loss="binary_crossentropy", optimizer=adam, metrics=["accuracy"])

    return model

model = create_improved_model()
model.summary()

mixed_precision.set_global_policy("mixed_float16")
tf.config.optimizer.set_jit(True)

early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
model.fit(x_train, y_train, batch_size=64, epochs=20, validation_data=(x_val, y_val), verbose=1, callbacks=[early_stop])  # validation inputs are x_val, not y_val
model.save("cancer.keras")
#

current code

#

and gpu accel is on

#

so how is it possible that 1 epoch is 5+ hours

river cape
#

What is a train test folds?

raw mortar
river cape
raw mortar
void crescent
#

bruhhh

raw mortar
#

does tf even support intel gpu btw? not sure
i have amd cpu/gpu at the moment

void crescent
#

how do i actually activate gpu accel

void crescent
#

i just did

#

it returned CPU

raw mortar
#

then its not able to see a supported gpu unfortunately
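
For anyone wanting to run the same check, a minimal sketch: `tf.config.list_physical_devices("GPU")` lists the GPUs TensorFlow can actually use. The `visible_gpus` wrapper is a made-up helper, and the import guard is only there so the snippet also runs where TensorFlow is not installed. My understanding is that the stock TensorFlow builds target NVIDIA CUDA GPUs, so an Intel UHD chip would normally give an empty list.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif not gpus:
    print("No supported GPU found; training will run on the CPU")
else:
    print("GPUs visible to TensorFlow:", gpus)
```

An empty list means every op falls back to the CPU, regardless of what the task manager shows for the GPU.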

void crescent
#

....

#

im sure it would work fine on google colab

#

but for some reason i cant even get the files there

raw mortar
#

in colab the usual pattern is to put files in gdrive or some bucket store and get it from there

potent sky
raw mortar
#

iirc for gdrive it gives you the required code

void crescent
potent sky
#

mhm and if you don't want to/can't attach the gdrive to the colab instance (e.g. not your gdrive acc)
You can just wget it

void crescent
#

it only worked properly on my local computer

#

and my google drive just shuts down

potent sky
void crescent
#

if i try to transfer from local to drive

void crescent
final kiln
potent sky
void crescent
#

wait let me show

void crescent
#

and its missing tons of files

potent sky
#

Using the same method to unzip?
Check the paths and the compatibility of unzip tool you're using with the archive(zip) format

void crescent
#

yeah using 7Zip

#

in google colab

#

i just do !unzip

red bane
#

Quick question: how much of a difference should there be between a TensorFlow training run on an AMD Ryzen 7 4800H with Radeon Graphics and the same run on an NVIDIA GeForce GTX 1660 Ti? When I tried both, the CPU run took 11 hours while the GPU run took 10 hours

odd meteor
# void crescent

You need Nvidia gpu in order to leverage CUDA. From this picture, you don't have the kind of GPU that'll allow you accelerate model training on your local machine.

I know this is sad news. I've been there. Blame Intel & AMD for allowing Nvidia monopolize the market unopposed.

You can't even train a deep neural nets with CIFAR-10 using a small model like ResNet-18 on a PC with Iris Xe GPU in < 4 hours. Now, imagine how long it'll take a pc with UHD GPU.

You should move your work to Colab or Kaggle if you'd want to use a free-tier GPU.

Or better still, if your pc has thunderbolt port and supports eGPU, you might wanna buy your own external GPU and just connect it to your pc.

raw mortar
red bane
#

would this be more of a model issue?

raw mortar
#

that lists the gpu; to see usage you have to use nvidia-smi, or task manager if you're on windows

red bane
#

oh yeah it shows a gpu

#

if needed i will post a screenshot

red bane
red bane
raw mortar
#

usually a tf process should come in this

red bane
#

wait a bit while please

#

would a model with about 500,000 params take this long for one epoch when done with 64 batches with about 1490 datapoints?
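
A quick sanity check on those numbers, assuming "64 batches" means a batch size of 64 and that the 10 hours were for a single epoch (both are guesses from the messages above). The `steps_per_epoch` helper is just illustrative arithmetic:

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of gradient steps needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


steps = steps_per_epoch(1490, 64)

# If one epoch really took 10 hours, this is how long each step would have to take.
seconds_per_step_for_10h = 10 * 3600 / steps

print(steps, "steps per epoch")
print(seconds_per_step_for_10h, "seconds per step")
```

Under those assumptions a single batch would take about 25 minutes, which for a model of roughly 500,000 params points at something outside the model itself, for example data loading or preprocessing.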
past meteor
raw mortar
red bane
#

i mean, training doesn't seem to have started but gpu is being used up

raw mortar
#

iirc tf allocates all memory during initialization
someone correct me if i'm wrong

wooden sail
#

that's the default behavior, you can set limits and/or have it allocate vram as needed too. that's usually slower

untold cliff
dusty forge
#

I'm really struggling to learn the high-level picture that ML is part of. Made a Linear Regression model, happy with the scores and results, but now what? How do I save this model, where do I save it, how do I make it part of a pipeline for future use?

#

Every tutorial, course, and blog so far focuses on the experiment part: training, testing, hooray you made your first model ... that stuff. But once I have one, not a single source I've found so far talks about what I'm supposed to do with this model.

raw mortar
#

90% of ml don't go into prod

#

its in the name, Data [Science]

dusty forge
#

If anyone can point me to the proper direction, that would be great. Just to clarify, I have this model on colab, yes I can download both the notebook and the py files ... but what the heck do I need to do with it? 😅

wooden sail
#

that's really up to you. do you wanna make it part of a bigger product? make it so that it is automatically applied to a specified dataset? have the trained parameters be frozen and used for inference in a less powerful device like a phone?

#

there's no one direction, the question is more "what do you want to do with it?" which depends entirely on the problem you wanted to solve and why

raw mortar
#

ideally you want to look for problems which can be solved with data and compute

#

then decide what to do next

dusty forge
# wooden sail that's really up to you. do you wanna make it part of a bigger product? make it ...

I want to make it into a permanent solution. For example, if I make a model to predict procurement prices, I want to 'run' the model every quarter so our managers have a fresh prediction every quarter to use for their decisions. This means that every quarter, the model receives new data from the previous quarter, retrains and predicts. Of course, I will also review the model every quarter to check for data drift and other errors/issues. Does this make sense?

raw mortar
wooden sail
#

yeah, then you want a way to keep training the model with new data in the future, and also a way of "deploying" it. will it always be you that runs it, or do you want other people to also be using it, possibly without needing to know how it works behind the scenes?

raw mortar
wooden sail
#

you could achieve this as easily as having a jupyter notebook and storing a history of the trained parameters somehow, or e.g. make it into an easily deployable container

wooden sail
#

then that requires you to automate the process of training and inference. that means automatically preparing and batching data, for example, and storing the results in a desired way

dusty forge
wooden sail
#

how to do that depends on what infrastructure and software you have at hand

#

there's no one size fits all 😛

#

i guess this falls under MLOps, as someone pointed out above

#

you can read about that to get a feel for it, but the specific implementation really depends on the software you want to use for it

dusty forge
#

I think this is what people mean when being asked what the most valuable skills are ... putting it in production 😄 I can see this in job descriptions and finally know what it means, now that I'm struggling with it myself even if it's a practice model (small scale, nothing fancy, nothing complex)

dusty forge
raw mortar
dusty forge
# raw mortar mle/mlops engineer here, there is no generic strategy for this each one depends ...

Ok so right now I'm doing this in my free time at home, to develop my skills for future jobs (currently in data analysis/management). Let's say I want to simulate the entire pipeline on my local computer. Basically, data is stored in a csv, the csv goes into the py file, then model training and testing. How do I get the data that is loaded in to be refreshed with new data? My gut says something called orchestration, plus Edd mentioned automation. And then after the model has done its thing, I somehow need to store the output in a way that my Power BI can access, so I can make a dashboard.

raw mortar
#

there has to be something which tells you the data is new

dusty forge
#

so new data is coming in regularly

raw mortar
#

i guess powerbi would connect to data sources like a database, so it makes sense to dump the results into a database

dusty forge
#

correct, so that is probably the least challenging part

raw mortar
dusty forge
#

I think what I'm struggling with is how to 'make' the pipeline itself

raw mortar
#

pipeline could be anything really, its a sequence of steps, could be functions, modules etc

dusty forge
raw mortar
#

if you want something fancy, you can look at some pipelining tools like prefect, airflow, dagster etc

raw mortar
final kiln
#

I'm using prefect for pipelines

raw mortar
#

schedule it via cron, and we're done

dusty forge
#

Ahhh someone mentioned Prefect the other day, is that the type of tool that I need? I can make the steps you mentioned and in each step it's a piece of code, ranging from loading data, cleaning it up, and my actual ML model, all the way to storing it in a database so Power BI can access it?

final kiln
#

Yeah

#

It boots up a fancy UI and everything if you want

#

With a flow chart showing all the tasks being executed real time

#

And you can also do it in deploy mode, so you can manually trigger stuff

#

Like in the UI

dusty forge
#

Flow chart, oke now you speak my language haha, I love flow chart as it makes things easier to understand.

raw mortar
#

basically all of the pipeline tools help us create something called a DAG (directed acyclic graph), you should really not be limited by what tool you pick
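
To show what a DAG means in practice, here is a small sketch using only the standard library (`graphlib`, Python 3.9+). The step names are made up to match the csv-to-dashboard pipeline discussed above, and `run_step` is a placeholder; Prefect, Airflow and Dagster each have their own APIs for declaring the same kind of graph.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it can start.
dependencies = {
    "clean_data": {"load_csv"},
    "train_model": {"clean_data"},
    "predict": {"train_model"},
    "store_results": {"predict"},
}

log = []


def run_step(name):
    # Placeholder: a real step would call your own loading/training code here.
    log.append(name)


# A topological order guarantees every step runs after its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
for step in run_order:
    run_step(step)

print(" -> ".join(log))
```

A plain script like this, scheduled with cron, already counts as a pipeline; the dedicated tools mostly add retries, scheduling and a UI on top.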

final kiln
#

I'm coding one rn let me see if I can show you

raw mortar
dusty forge
raw mortar
#

just try out some, see which you like

#

dagster is the new shiny toy, its more geared towards DE i'd say

lofty ermine
#

Hey everyone, I have a question which kind of laptop should i buy for machine learning and data science

dusty forge
final kiln
#

actually dont have a good one yet cuz im stil coding it, but it looks like this

dusty forge
final kiln
raw mortar
lofty ermine
#

I am a beginner

final kiln
#
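import duckdb
from prefect import flow

@flow
def imbd():
    conn = duckdb.connect()
    encoded = []  # renamed from `slice` so the builtin is not shadowed
    for split in [TRAIN_POS, TRAIN_NEG]:
        # each split is a source DuckDB can query directly by name
        rows = conn.sql(f"SELECT review FROM '{split}';").fetchall()
        for (review,) in rows:
            encoded.append(encode_text(review))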
lofty ermine
#

and i have very slow laptop running right now

final kiln
#

and it's literally just a wrapper, "flow"

lofty ermine
#

currently replacing my laptop so going for future proof

raw mortar
lofty ermine
raw mortar
#

colab, kaggle provides some free gpu usage per month

dusty forge
#

oh nevermind, didn't see the sql method being called haha

river cape
#

print("Accuracy: {:.2f}%".format(accuracies.mean()))
could someone explain this line of code

serene scaffold
#

looks like you're following an old tutorial. most people would write

print(f"Accuracy: {accuracies.mean():.2f}%")
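
To unpack the line: `{:.2f}` is a format spec meaning "fixed-point number with two digits after the decimal point", and the trailing `%` is just a literal character. A small sketch with made-up scores (the `accuracies` list here is hypothetical, and it uses `statistics.mean` instead of a NumPy array's `.mean()`):

```python
import statistics

# Hypothetical cross-validation scores, stored as fractions between 0 and 1.
accuracies = [0.91, 0.88, 0.93]
mean_accuracy = statistics.mean(accuracies)

# ":.2f" = fixed-point notation, two digits after the decimal point.
old_style = "Accuracy: {:.2f}%".format(mean_accuracy * 100)
new_style = f"Accuracy: {mean_accuracy * 100:.2f}%"

print(old_style)
print(new_style)
```

Both styles produce the same string. If the scores are fractions like these, multiply by 100 before formatting; otherwise the output would read 0.91%.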
obtuse turret
#

I am thinking of learning PyTorch and LLMs alongside Python. How should I proceed?

autumn acorn
#

Yo if anyone uses or plans to use the GPT-4 API (Turbo or Vision or whatever) please dm me, I have an opportunity for you

serene scaffold
#

@autumn acorn messages such as these are not allowed.

autumn acorn
#

?

#

oh my bad

serene scaffold
#

!rule 9 6

arctic wedgeBOT
#

6. Do not post unapproved advertising.

9. Do not offer or ask for paid work of any kind.

autumn acorn
#

just was wondering something abt openai pricing

serene scaffold
#

"I have an opportunity for you"

autumn acorn
#

sry sir

long canopy
#

anyone hear of anything involving pure function classification? classifying code into pure vs. non-pure functions?

#

easy enough in haskell lol, but python?

full ore
lapis sequoia
#

I assume you run them in this order Cell A -> Cell B -> Cell C -> Cell A. Do you mean this?

odd meteor
# dusty forge If anyone can point me to the proper direction, that would be great. Just to cla...

The MLOps engineer role is different from an ML engineer role. Even though the role varies from company to company, in general, ML engineers focus more on bringing individual projects to production, while MLOps engineers work more on building a platform that is used by machine learning engineers and data scientists.

odd meteor
odd meteor
final kiln
#

Maybe it's because I'm more used to py, but now that I'm back to py I'm noticing how easy it is for me to produce code that is hard to test

dusty forge
red bane
#

Im trying to get the profile for tensorboard to work, but it says that it failed to load libcupti, although I have installed nvidia toolkit. How can i fix this error?

fervent dew
#

hey guys what if i combine many machine learning models together, like llama2 with falcon, etc. . what is the problem i am gonna face while doing this?

final kiln
#

Define combine

remote stream
#

how do i paste csv or ppt here

#

i got a data set and i need immediate help 😢

obtuse turret
final kiln
#

I think you can't do attachments, only pictures and links

remote stream
#

Anyone can help me?

final kiln
full ore
fervent dew
final kiln
final kiln
obtuse turret
fervent dew
obtuse turret
final kiln
odd meteor
final kiln
#

Learning this stuff is fairly easy if you get your foundations right

final kiln
# obtuse turret Ohh

For math I recommend Khan academy, you can start at whatever level you are and build up

obtuse turret
final kiln
odd meteor
# fervent dew hey guys what if i combine many machine learning models together, like llama2 wi...

I guess the only way to know the possible problem(s) you'll encounter is to try it first or read the research paper.

Alternatively, I found this blog/ tutorial on implementation of model soups.

https://lightning.ai/lightning-ai/studios/efficient-linear-model-merging-for-llms

Model merging is a technique for combining multiple pretrained or finetuned LLMs into a single, more powerful model. This approach is particularly useful when individual models excel in different domains or tasks, and merging them can create a model with a broader range of capabilities and improv…

obtuse turret
final kiln
#

You can also totally parallelize learning here

#

Like implementing math algorithms using python and numpy, helps you learn the math, py and preps you for pytorch cuz the indexing magic is similar to numpy

dusty forge
#

Ok so I learned that saving and loading a model, either in the same file/notebook or across files/notebooks, can be done easily with joblib. Has the community accepted a naming convention that I can use? Pretty sure something like 'final_model-25-2-204_version2_realfinal_withupdatedparam-002' will not be liked? 🤣
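
I'm not aware of a single accepted naming convention, but a common pattern is model name + version + training date, built by a helper so nobody types names by hand. A minimal sketch; `model_filename` and the example values are made up, and it uses the standard library's `pickle` so it runs anywhere. With joblib the save and load lines would be `joblib.dump(model, path)` and `joblib.load(path)`.

```python
import pickle
import tempfile
from datetime import date
from pathlib import Path


def model_filename(name, version, trained_on):
    """Build a sortable file name such as 'revenue_linreg_v2_2024-01-31.pkl'."""
    return f"{name}_v{version}_{trained_on.isoformat()}.pkl"


# Stand-in for a fitted model object; any picklable object works the same way.
model = {"coef": [1.5, -0.2], "intercept": 3.0}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / model_filename("revenue_linreg", 2, date(2024, 1, 31))
    with path.open("wb") as f:
        pickle.dump(model, f)
    with path.open("rb") as f:
        restored = pickle.load(f)
    saved_name = path.name

print(saved_name)
```

ISO dates sort correctly as text, so listing the folder shows the models in training order.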

obtuse turret
final kiln
void crescent
#

nvm

final kiln
#

Pytorch hides a lot of details of what's happening

#

When I say implement it's you writing very barebones algorithms

halcyon verge
#

support ar ?

final kiln
#

Wrong channel I believe

fickle oxide
#

Sorry

final kiln
#

Idk if there is a specific one for django

void crescent
#

bro like half of my files are just not there when i unzipped my dataset

#

in google drive

#

i did it from google colab

obtuse turret
void crescent
#

is there an easy way to unzip a huge file (like my dataset) to google drive

final kiln
#

Like, if you're not proficient with any of them, that's the path: py, pytorch, and then llm

obtuse turret
# final kiln Not sure if I get your point tho

Actually I'm in my 2nd semester at uni and thinking of applying for an internship during fall break. To be prepared ahead of time, I read the requirements for the same internship during summer, and they were 3 months of experience in Python, pytorch, llms....
So that's why I was asking if I could go for parallel learning

final kiln
#

3 months of each is really asking for just surface level knowledge

obtuse turret
obtuse turret
final kiln
#

I'm having a hard time with SQL because I can't get variable queries unless I do templating

#

I wanted to avoid templating since I have every info I need in my shells env

long canopy
#

anyone work with einsum? if so, any good tutorials?

#

looks like it really is something you should get used to

void crescent
#

guys

#

i downloaded a dataset w kaggle in colab

#

colab

#

but i cant find a way to unzip all the files to the cloud

#

i can do it locally

#

but it wont let me transfer it

final kiln
#

Like I really need a for loop ._.

raw mortar
final kiln
#

for name in ["epoch1", "epoch2"]

raw mortar
#

SQL is meant to be declarative vectorized operations at column level

final kiln
#

that makes it fairly useless tho, I can't do anything ; /

raw mortar
#

When you want to do element level operations you should stick with python

final kiln
raw mortar
#

Something like

# conn: an open database connection (assumed), epochs: a list of column names
for epoch in epochs:
    conn.execute(f"""ALTER TABLE dataset ADD COLUMN {epoch} FLOAT DEFAULT random();""")
final kiln
raw mortar
#

Can't have everything lol

final kiln
#

perhaps I can find a middle ground using jinja

#

I think im gonna go with something like this yeah

raw mortar
#

This is an unnecessary dependency, use fstrings or vanilla jinja

final kiln
#

yeah im going for vanilla jinja

#

{% for epoch_name in epochs %}
    ALTER TABLE dataset ADD COLUMN {{ epoch_name }} INTEGER DEFAULT trunc( {{number_of_partions}}*random() );
{% endfor %}

dont know why sql doesnt support this

#

trunc is needed because casting uses weird rounding rules
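A rough sketch of how that template could be rendered with plain jinja2. The epoch names and the partition count are made-up inputs, and the rendered statements would still have to be passed to whatever database connection is in use:

```python
from jinja2 import Template

# Same loop as the template above, kept as a Python string.
SQL_TEMPLATE = Template("""
{% for epoch_name in epochs %}
ALTER TABLE dataset ADD COLUMN {{ epoch_name }} INTEGER DEFAULT trunc({{ number_of_partions }} * random());
{% endfor %}
""")

# Hypothetical inputs: two epoch columns, five partitions.
sql = SQL_TEMPLATE.render(epochs=["epoch_0", "epoch_1"], number_of_partions=5)

# Drop the blank lines left behind by the {% for %} tags.
statements = [line.strip() for line in sql.splitlines() if line.strip()]
for statement in statements:
    print(statement)
```

Since the column names are pasted straight into the SQL text, this is only safe when `epochs` comes from trusted code and not from user input.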

raw mortar
final kiln
#

no I mean just basic control flow, i don't really see much of a reason for it to not be supported by sql

raw mortar
#

plsql might be what you're looking for

#

udfs or procedures are used for custom functions, but i guess at this point we should stick with a real programming language

left tartan
final kiln
#

It has already automated a lot of stuff for me tho. This dataset is also pretty small so I'm literally just doing a select from a link to the repository, didn't need to move any files around, just committed them to the repo and that was it

#

I completely regret splitting the data like this tho, just gonna do a single table >.>

final kiln
potent sky
#

I meant plsql is a possible solution in terms of what you're looking for

final kiln
#

Oh interesting

dusty forge
#

Instructor said the model is overfitted a bit, but it's oke for this use. I haven't had the theory about over/under fitted yet, but am I close when I guess that the line following the dots so tight is what he means with overfitting? And that it's ok for this use since the predictions will be nicely 'on par' with the rest?

final kiln
#

You can imagine that your data has been generated by an underlying fundamental law, but due to the nature of measurement, it will always contain some level of noise. Overfitting means that the model has memorized the details of the noise instead of the underlying law
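A small numpy sketch of that idea, under made-up assumptions: the "law" is a straight line, the noise is Gaussian, and the two models are polynomials of degree 1 and degree 9 fitted to only 10 points:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)


def law(x):
    # The assumed underlying law: a straight line.
    return 2.0 * x + 1.0


# Ten noisy measurements of the law.
x_train = np.linspace(0.0, 1.0, 10)
y_train = law(x_train) + rng.normal(scale=0.2, size=x_train.shape)

# Dense grid used to compare each model against the noise-free law.
x_test = np.linspace(0.0, 1.0, 200)


def mse(model, x, y):
    return float(np.mean((model(x) - y) ** 2))


simple = Polynomial.fit(x_train, y_train, deg=1)    # matches the law's shape
flexible = Polynomial.fit(x_train, y_train, deg=9)  # enough freedom to hit every point

train_simple = mse(simple, x_train, y_train)
train_flexible = mse(flexible, x_train, y_train)
test_simple = mse(simple, x_test, law(x_test))
test_flexible = mse(flexible, x_test, law(x_test))

print("train error:", train_simple, train_flexible)
print("error vs. law:", test_simple, test_flexible)
```

The degree-9 fit passes through every training point, so its training error is essentially zero: it has memorized the noise. Compared against the noise-free law between the points it will typically do worse than the straight line, which is the symptom of overfitting.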

#

Tomorrow I'm gonna finish this pipeline. One additional thing I'm adding to the mix is that I'm computing the random splits a priori and saving them to S3 with MLFlow. This basically gives me fault tolerance: if AWS takes away my spot instance, I know where to start from without losing the splits.

Altho, I'm guessing that just saving the seed oughta be enough.

Yeah nevermind I'm just gonna keep saving the seed.

lapis sequoia
#

AI is becoming a professor

final kiln
#

GPT4 is becoming dumber tho

#

Bet you they're doing some form of rate limiting

lapis sequoia
#

A bit off topic - Is universal income coming

final kiln
#

Like using gpt3.5 for some of the prompts to 4

lapis sequoia
#

There are tons of open local ais

final kiln
#

Good thing about rust is that I can't do a @contextmanager that yields a closure that returns a generator over the data

#

Oops I pinged someone

#

Like I know all the abstraction is bad for me, but I do it anyway cuz I'm tired and I need to get this done

#

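from contextlib import contextmanager

import duckdb

# DATASET_LINK (location of the remote dataset) is assumed to be defined elsewhere.

SENTIMENT_TO_INTEGER = {
    'pos': 1,
    'neg': 0
}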


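def set_seed(conn, seed):
    # Assumption: DuckDB's setseed() seeds random(); execute() runs the statement eagerly.
    conn.execute(f"SELECT setseed({seed});")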

def create_table(conn):
    conn.sql(f"""
        CREATE TABLE dataset AS
        SELECT id
        FROM '{DATASET_LINK}';
    """)

def add_epoch_column(conn, epoch_idx, number_of_partitions):
    # Each row gets a random partition index in [0, number_of_partitions).
    conn.sql(f"""
        ALTER TABLE dataset
        ADD COLUMN epoch_{epoch_idx}
        INTEGER DEFAULT trunc( {number_of_partitions}*random());
    """)

def select_partition(conn, epoch_idx, slice_idx):
    return conn.sql(f"""
        SELECT sentiment, review
        FROM '{DATASET_LINK}' as remote
        JOIN dataset ON (dataset.id = remote.id)
        WHERE dataset.epoch_{epoch_idx}={slice_idx};
    """)
    

@contextmanager
def dataset_partitioning(number_of_epochs, number_of_partions, seed = 0.5):
    with duckdb.connect() as conn:
        set_seed(conn, seed)
        create_table(conn)

        # note: sampling is pre-computed here, data is fetched on demand with "fetch_data"
        for epoch_idx in range(number_of_epochs):
            add_epoch_column(conn, epoch_idx, number_of_partions)
        
        def fetch_data(epoch_idx: int, slice_idx: int):
            sentiments, reviews = [], []
            for sentiment, text in select_partition(conn, epoch_idx, slice_idx).fetchall():
                sentiments.append(SENTIMENT_TO_INTEGER[sentiment])
                reviews.append(text)
            return sentiments, reviews
        yield fetch_data


if __name__ == "__main__":
    with dataset_partitioning(number_of_epochs=2, number_of_partions=5) as fetch_data:
        sentiments, reviews = fetch_data(epoch_idx=0, slice_idx=3)
#

it's not so bad im guessing

final kiln
#

I think I'm gonna start placing my test cases in the same file, like how it's done in rust

long canopy
#

simple implementations of the self-attention algorithm consider the K, Q, V objects to be matrices, i.e., 2-dimensional tensors. when does self-attention operate on higher dimensional tensors?

final kiln
#

b = batch dimension, c = context window, d = vector embedding dimension

#

input_bcd * Wq_dk = output_bck

#

forgot k, k = projection dimension
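That index notation maps almost one to one onto einsum. A minimal numpy sketch with made-up sizes and random weights (the key projection `Wk` and the score step are added here for illustration and are not part of the message above):

```python
import numpy as np

rng = np.random.default_rng(0)

# b = batch, c = context window, d = embedding dim, k = projection dim
b, c, d, k = 2, 4, 8, 3

x = rng.normal(size=(b, c, d))   # input_bcd
Wq = rng.normal(size=(d, k))     # Wq_dk
Wk = rng.normal(size=(d, k))     # hypothetical key projection

# input_bcd * Wq_dk = output_bck: contract over d, keep b, c, k.
q = np.einsum("bcd,dk->bck", x, Wq)
keys = np.einsum("bcd,dk->bck", x, Wk)

# Scaled attention scores: for each batch element, every query position i
# is compared with every key position j by contracting over k.
scores = np.einsum("bik,bjk->bij", q, keys) / np.sqrt(k)

print(q.shape, scores.shape)
```

The batch axis just rides along: the same `Wq` is applied to every sequence in the batch, which is why the 2-D matrix description still holds per example.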

long canopy
#

batch dimension, will look into that thanks

final kiln
long canopy
#

will do ty

#

are features a type of token?

final kiln
#

ive seen people call the output of the layers features too

long canopy
#

hm, noted thank you!

lapis sequoia
#

guys I'm stuck, can anybody help with this?

            log_prob = action_probs.log_prob(self.action) # gives error
            # action_probs = tfp.distributions.Normal("Normal", batch_shape=[1, 1], event_shape=[], dtype=float32) <class 'tensorflow_probability.python.distributions.normal.Normal'>
            # self.action = tf.Tensor([[-0.00545995]], shape=(1, 1), dtype=float32) <class 'tensorflow.python.framework.ops.EagerTensor'>
            # log_prob = tf.Tensor([[nan]], shape=(1, 1), dtype=float32)

Why does this line output [[nan]] instead of a proper number?

final kiln
lapis sequoia
lapis sequoia
lapis sequoia
lapis sequoia
final kiln
lapis sequoia
final kiln
lapis sequoia
final kiln
#

That will print the docstring of the function, but your IDE should probly show it too

final kiln
#

Second line of help is the library's documentation

#

Which is sometimes even compiled from the docstrings anyway
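A tiny sketch of reading a docstring from Python itself, assuming "that" above refers to the built-in `help()`. The `scale` function is a made-up example:

```python
import inspect


def scale(values, factor=2.0):
    """Multiply every element of `values` by `factor`."""
    return [v * factor for v in values]


# help() prints the signature plus the docstring.
help(scale)

# inspect.getdoc() returns the cleaned-up docstring as a string instead.
doc = inspect.getdoc(scale)
print(doc)
```

The same calls work on library objects, for example `help(some_distribution.log_prob)`, which is usually the quickest way to check which arguments and shapes a method expects.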

lapis sequoia