#data-science-and-ml


tidal bough
#

seems right

#

in 3.12 you can simplify it a bit by using math.sumprod

scarlet siren
# tidal bough seems right

For
[[-1.0, 0.0, 1.0], [1.0, 0.0, -2.0], [-1.0, -1.0, 2.0]]
dot [[26], [20], [970]]
I'm getting
[[944.0], [-1914.0], [1894.0]]
but it should be
944
1966
1894

#

It's like a 3x3 matrix dot 3x1 matrix

tidal bough
arctic wedgeBOT
#

@tidal bough :white_check_mark: Your 3.12 eval job has completed with return code 0.

[[  944.]
 [-1914.]
 [ 1894.]]
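The product above can also be checked by hand. This is a minimal sketch in plain Python, assuming the column vector is flattened to a list; `matvec` is a hypothetical helper name, not something from the chat.

```python
# Multiply a 3x3 matrix by a 3x1 column vector using only the standard library.
# `matvec` is a hypothetical helper name used for illustration.
A = [[-1.0, 0.0, 1.0],
     [1.0, 0.0, -2.0],
     [-1.0, -1.0, 2.0]]
v = [26, 20, 970]  # the column vector [[26], [20], [970]], flattened


def matvec(matrix, vector):
    # Each output entry is the dot product of one row with the vector.
    # On Python 3.12+, math.sumprod(row, vector) computes the same sum.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


result = matvec(A, v)
print(result)  # [944.0, -1914.0, 1894.0]
```

The second entry is 1*26 + 0*20 - 2*970 = -1914. A value of 1966 would only come out if the -2.0 in the second row were +2.0.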
scenic token
#

I am generating a plot of a graph with the networkx library

How can I make the spaces between my nodes larger in a circular draw

scarlet siren
#

r1 + r2 = d1
r2 + r3 = d2
r3 + r4 = d3
r4 + r5 = d4

knowing that r1 = 2r5 how is the linear equation matrix constructed? (assuming r1 = 2R and r5 = R)

#

2R + r2 = d1
r2 + r3 = d2
r3 + r4 = d3
r4 + R = d4

would it be

[2 1 0 0]   [R ]   [d1]
[0 1 1 0] x [r2] = [d2]
[0 0 1 1]   [r3]   [d3]
[1 0 0 1]   [r4]   [d4]

or should I ignore r1 = 2r5

tidal bough
#

either you solve for r1 manually like that and get a 4x4 matrix, yeah, or you rewrite r1 = 2r5 as r1 - 2r5 = 0 and then you have a 5x5 matrix.
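Both formulations can be written down and checked against a known solution. This is a sketch only: the numeric values for r1..r5 are made up for illustration, and `matvec` is a hypothetical helper.

```python
def matvec(matrix, vector):
    # Dot product of every row with the vector.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Hypothetical values chosen only for illustration: R = 3, so r1 = 6 and r5 = 3.
r1, r2, r3, r4, r5 = 6, 1, 4, 2, 3
d = [r1 + r2, r2 + r3, r3 + r4, r4 + r5]  # right-hand sides d1..d4

# Option 1: substitute r1 = 2R and r5 = R; the unknowns are (R, r2, r3, r4).
A4 = [[2, 1, 0, 0],
      [0, 1, 1, 0],
      [0, 0, 1, 1],
      [1, 0, 0, 1]]

# Option 2: keep all five unknowns (r1..r5) and add r1 - 2*r5 = 0 as a fifth row.
A5 = [[1, 1, 0, 0, 0],
      [0, 1, 1, 0, 0],
      [0, 0, 1, 1, 0],
      [0, 0, 0, 1, 1],
      [1, 0, 0, 0, -2]]
b5 = d + [0]

print(matvec(A4, [r5, r2, r3, r4]) == d)           # True
print(matvec(A5, [r1, r2, r3, r4, r5]) == b5)      # True
```

Either matrix can then be handed to a linear solver; the 4x4 version has fewer unknowns, while the 5x5 version keeps the constraint explicit.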

scarlet siren
#

And for the equation to have an answer, A has to have an inverse, right?

tidal bough
#

If A has an inverse, then a solution exists, but I don't think the opposite has to be true - it's generally https://en.wikipedia.org/wiki/Rouché–Capelli_theorem


scarlet siren
#

A not having an inverse doesn't mean AX = B doesn't have an answer?

tidal bough
#

No. Consider A = [[1,1],[0,0]], b = [[1],[0]]. A has determinant 0 and hence is noninvertible, yet A x = b has infinitely many solutions.
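That counterexample is easy to verify numerically. A minimal sketch in plain Python, with `det2` and `matvec` as hypothetical helper names:

```python
A = [[1, 1], [0, 0]]
b = [1, 0]


def det2(m):
    # Determinant of a 2x2 matrix.
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


print(det2(A))  # 0, so A has no inverse

# Every vector of the form (t, 1 - t) satisfies A x = b.
solutions = [(t, 1 - t) for t in range(-2, 3)]
all_ok = all(matvec(A, x) == b for x in solutions)
print(all_ok)  # True
```

In Rouché–Capelli terms, the coefficient matrix and the augmented matrix both have rank 1, which is less than the 2 unknowns, so the solutions form a line. With b = [[1],[1]] instead, the second equation would read 0 = 1 and no solution would exist.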

desert oar
#

i think it's pretty common to do things like sentiment analysis etc. using simple models on top of pre-trained word vectors

#

i've certainly done it for text classification. word vectors basically just acting as dimension reduction at that point.

final kiln
#

Tomorrow I'm gonna see if there's a threshold where the feed forward doesn't work

#

Presumably it won't work in cases where the text is more complex

#

And the context window is larger

final kiln
#

What I did next was to just delete the transformer blocks altogether. The embedder + feed forward converged crazy quick

#

Haven't checked these things but

#

I think the embeddings themselves will come grouped into regions, negative words to one side and positive words to the other

#

And the only thing the feed forward does is count them in the input

scarlet siren
#

Inverse on
[2, 1, 0, 0],
[0, 1, 1, 0],
[0, 0, 1, 1],
[1, 0, 0, 1]

gives back
[[-0.0, -0.0, -0.0, -0.0], [1.0, -0.0, -0.0, -0.0], [-1.0, 1.0, -0.0, -0.0], [1.0, -1.0, 1.0, -0.0]]

when tested with numpy I got
[[ 1 -1 1 -1]
[-1 2 -2 2]
[ 1 -1 2 -2]
[-1 1 -1 2]]

#

numpy version:

import numpy as np
import numpy.linalg as alg


def main():
    matrix = np.array([
        [2, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 1],
        [1, 0, 0, 1]
    ])
    print(f'A = \n{matrix}')
    # Round before casting: astype(int) truncates, so 0.9999999 would become 0.
    inv = np.rint(alg.inv(matrix)).astype(int)
    print(f'det(A) = {alg.det(matrix)}')
    print(f'A-1 = \n{inv}')
    print(f'det(A-1) = {alg.det(inv)}')


if __name__ == '__main__':
    main()
desert oar
desert oar
final kiln
desert oar
#

however I don't think it's necessarily invalid even on longer documents as long as they are well separated by their vocabulary. Consider book reviews for example

final kiln
#

This is one of the tasks that I'll use to make the ablation study on the transformer

desert oar
#

I bet I could build a book review sentiment classifier with > 50% accuracy by just looking for words in some fixed-size neighborhood of "bad" and "good" in a pretrained fasttext embedding space
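The neighborhood idea can be sketched in a few lines. Everything below is a toy: the 2-D vectors are invented for illustration and stand in for real pretrained embeddings (such as fastText), and `classify` with its `radius` parameter is a hypothetical design, not a tested classifier.

```python
import math

# Toy 2-D "embeddings", made up purely for illustration; a real setup would
# load pretrained vectors instead.
TOY_VECTORS = {
    "good": (1.0, 0.1),
    "great": (0.9, 0.2),
    "enjoyable": (0.8, 0.3),
    "bad": (-1.0, 0.1),
    "boring": (-0.9, 0.3),
    "awful": (-0.8, 0.2),
    "book": (0.0, 1.0),
    "plot": (0.1, 0.9),
}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def classify(text, radius=0.9):
    """Count words whose cosine similarity to 'good' or 'bad' is at least `radius`."""
    good, bad = TOY_VECTORS["good"], TOY_VECTORS["bad"]
    score = 0
    for word in text.lower().split():
        vec = TOY_VECTORS.get(word)
        if vec is None:
            continue  # out-of-vocabulary words are ignored
        if cosine(vec, good) >= radius:
            score += 1
        elif cosine(vec, bad) >= radius:
            score -= 1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "unknown"


print(classify("a great and enjoyable book"))  # positive
print(classify("boring plot awful book"))      # negative
```

This only counts vocabulary, so it ignores word order and negation, which is exactly the kind of case where context-aware models are expected to help.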

final kiln
#

Perhaps, depends on the complexity of the text. If it's thesis type of text for example, with an opinion on some geopolitical matter, a transformer is likely needed since it requires actual conceptual understanding

desert oar
#

remember what transformers do: they construct a new sequence of vectors such that each individual vector in the new sequence represents its own context in the original sequence

#

so transformers only improve your model if context is important

desert oar
#

but I wouldn't say it's just a matter of length, more about the subtlety of ideas involved in the text

final kiln
#

Yeah I'd say so too

#

Gonna need to scrape the web for datasets

#

This is actually a good exercise to get to speed with all the NLP tasks

#

Sentiment analysis, machine translation, topic classification, and a couple others I don't recall

#

I'm gonna use them to compare my variant, and then replicate the MetaFormer study but for NLP

#

I haven't found anyone doing it yet, don't know why

cinder jay
#

Hi, i have the following code to segment the blood vessels of the eye:

import cv2
import numpy as np
import skimage

def vessel_segmentation(image):
    im_rgb = cv2.imread(image)
    
    # Extract green channel
    im_green = im_rgb[:, :, 1]
    
    # CLAHE enhancement
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    im_enh = clahe.apply(im_green)

    # Negative
    im_gray = cv2.bitwise_not(im_enh)
    
    # Use Top-Hat transform
    se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (21, 21))
    im_top = cv2.morphologyEx(im_gray, cv2.MORPH_TOPHAT, se)

    # Otsu thresholding: the threshold is picked automatically, so the value passed in is ignored
    _, im_thre = cv2.threshold(im_top, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)


    return im_thre

i have the following result:

#

but should be something like this:

manic linden
#

Hey if I transpose a dataframe, shouldn't the size of the df always be the same? In my program I lose 3 columns when I do it?

serene scaffold
#

if you appear to be losing data, there must be more going on.

ashen galleon
#

Q:
I have a discord bot, and, for every guild,
I have an image banning method using embeddings of the image, comparing to stored banned embeddings.
Currently,
the structure is just every embedding in one tensor
and it works, but I can't have the same embeddings for every guild.

I could simply store an element of the tensor that's the guild_id,
but that feels like an antipattern,
and that I should use the Pony ORM database I have, and then have the guild_id-tensor pairs.

Is there a preexisting standard for this sort of application?

viral field
#

I have seen lots of videos about creating object detectors using a camera or webcam
But can't we do the same thing on our PC or mobile screen?

final kiln
#

anyone knows a good literature review on transformers ? I'm looking for something done in 2023

fallen dagger
#

Demo of a CLI tool I built over the weekend that connects with Google's Gemini LLM and lets you use it with your files.

It lets you add your own custom commands as well, so you can further enhance the CLI or use it to interact with the LLM and your files however you want.

final kiln
#

don't know if this is a limitation of the dataset or of the feedforward

orchid cargo
#

Does anyone know how to make a weather forecast? Maybe it's impossible.

desert oar
#

if you're trying to forecast the temperature at a single location, you can do OK with traditional timeseries methods, but your predictions will at best only be interpretable as an average

#

beyond that, you're getting into meteorology, not just statistics or machine learning

feral sand
river cape
#

Hey guys I have these 3 problem statements, how do I move forward with these problems?
PS1 - Anonymise user identities in large databases to ethically employ machine learning in understanding customer trends and behaviour without violating their rights.
PS2 - Create a blockchain-powered platform allowing users to lease their information to social media services, with the assurance that the data will not be retained upon exiting the service.
PS3 - Develop an AI based solution to offer timely insights into current global hacking trends, prioritising potential threats based on their likelihood of targeting specific enterprises.

final kiln
#

For sentiment analysis, I can't have the embedder module be learnable, it's gonna overfit every single time. I looked around and people seem to start with a pre trained one and stick a feed forward on top.

So I decided to just train one by getting a transformer to first do next token prediction.

But at that point might as well just let the next token prediction be the sentiment. I have a bit of padding on the sequences and the last token is the classification

#

The expectation is that the transformer will be obliged to learn syntax as it does on the normal next token pred. It's training rn will see if it works out. At least it's not overfitting

left tartan
final kiln
# final kiln

50 epochs in with the next token prediction way, no overfitting and passed a similar test as this one

molten elk
#

Does anyone know if in this code the binomial function only counts whether something is a 0 or 1? These values refer to heads and tails respectively

desert oar
desert oar
final kiln
#

I think it is working, the limitation will be the dataset, which won't include stuff I can come up with, like sarcasm or the entire phrase being positive and then end with "jk, it was the opposite"

molten elk
molten elk
final kiln
#

Well, it seems like it worked, gonna run over some stats after a well deserved break

But this will do, I can train both the transformer and the metric tensor net, it gives me clear performance metrics, etc

#

Next task will be summarization

desert oar
wooden sail
#

as for the dice, you have to do some prep work yourself in defining what counts as a "success"

#

for a standard 6-sided fair die, you'd think of which sides represent a success. then for a single roll of the die, this determines the value of p, the probability of success

desert oar
#

but yes if you can interpret some outcome of the die roll as "success" and other outcomes as "failure" (eg a saving throw in D&D) then yes you can use the binomial distribution

#

and by the way, binomial with exactly 1 trial has a special name: the Bernoulli distribution. a binomial distribution is the sum of independent draws from a Bernoulli distribution
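The die example can be simulated directly. This is a sketch under an assumed success rule (a roll of 5 or 6 counts as a success); the function names are hypothetical.

```python
import random

SUCCESS_SIDES = {5, 6}       # hypothetical rule: a roll of 5 or 6 is a "success"
p = len(SUCCESS_SIDES) / 6   # probability of success on a single roll


def bernoulli_draw(rng):
    """One roll of a fair die: 1 on success, 0 on failure."""
    return 1 if rng.randint(1, 6) in SUCCESS_SIDES else 0


def binomial_draw(n, rng):
    """Number of successes in n independent rolls (a sum of Bernoulli draws)."""
    return sum(bernoulli_draw(rng) for _ in range(n))


rng = random.Random(0)
samples = [binomial_draw(10, rng) for _ in range(5000)]
mean = sum(samples) / len(samples)
print(p, mean)  # the sample mean should land near n * p = 10/3
```

Changing `SUCCESS_SIDES` changes p and nothing else, which is the "prep work" of deciding what counts as a success.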

#

and likewise for multinomial. a multinomial distribution is the sum of independent draws from a categorical distribution

#

Wikipedia articles for probability distributions are usually interesting reference points, even though most other stats articles on Wikipedia are not great

final kiln
#

What does it mean when the validation loss is lower than the training loss

#

I don't think I have data leak, it only happens on certain hyper parameters

desert oar
final kiln
#

I'm shuffling the training data on every epoch

#

Have a running average for both sets

#

Ah they just flipped

desert oar
final kiln
#

Now it's overfitting ah

desert oar
final kiln
desert oar
final kiln
desert oar
#

is that what they say to do? normally "test" is reserved for checking at the very end

final kiln
#

I decreased the size of the model and now the val is larger

desert oar
#

the names are confusing and disagree with common english usage

final kiln
cinder jay
final kiln
#

Thanks, I'll check it out

#

But I reckon it might be picking up on some pattern that is more pronounced in the validation set

final kiln
#

But shouldn't matter right

desert oar
#

a bit? very confusing

final kiln
#

As long as there's no leakage

desert oar
#

yes, as long as the set you use to "follow along" during training is not the one you use for final score
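One common way to keep those roles separate is a three-way split. A minimal sketch, assuming a simple random split is appropriate for the data; the function name and fractions are hypothetical.

```python
import random


def three_way_split(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle once, then carve out disjoint test, validation and training sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]                  # touched only for the final score
    val = shuffled[n_test:n_test + n_val]     # used to follow along during training
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```

The split is done once, before training; reshuffling the training portion every epoch does not mix it with the other two sets.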

final kiln
#

I'm now just reducing model size till it stops overfitting

#

I know I got this before

river cape
left tartan
river cape
left tartan
#

Ok, now that you’ve chosen one, what are you expected to do?

river cape
#

Anonymise user identities in large databases to ethically employ machine learning in understanding customer trends and behaviour without violating their rights.

left tartan
#

Is this just an essay question? A coding assignment? Etc

river cape
#

I need to implement a machine learning model for a given database which understands the customer trends and behaviour without disclosing their details

#

It's a hackathon

ashen galleon
merry briar
#

@lapis sequoia the problem was in the learn function: I was doing self.weights[j][i] instead of self.weights[layer_i][i][j], so changing one weight would change another one instead, and it couldn't calculate the cost for the weight it was actually looking at

final kiln
#

I'm on one block with 2 heads

#

Kind of insane that this minuscule model is threatening to memorize the data ._.

cinder jay
#

hey, how does opencv subtract work?
i don't get it
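For 8-bit images, `cv2.subtract` performs a per-element saturating subtraction: results below 0 are clipped to 0 rather than wrapping around, whereas subtracting two NumPy uint8 arrays with `-` wraps modulo 256. The sketch below reproduces just that arithmetic in plain Python; it does not call OpenCV, and the helper names are hypothetical.

```python
def saturating_sub_u8(a, b):
    # What a saturating 8-bit subtraction does: clip the result into [0, 255].
    return max(0, min(255, a - b))


def wrapping_sub_u8(a, b):
    # What modular 8-bit arithmetic does: wrap around modulo 256.
    return (a - b) % 256


pixels_a = [200, 50, 10]
pixels_b = [100, 60, 10]

saturated = [saturating_sub_u8(a, b) for a, b in zip(pixels_a, pixels_b)]
wrapped = [wrapping_sub_u8(a, b) for a, b in zip(pixels_a, pixels_b)]
print(saturated)  # [100, 0, 0]
print(wrapped)    # [100, 246, 0]
```

The middle pixel shows the difference: 50 - 60 becomes 0 with saturation but 246 with wraparound, which would show up as a bright spot in an image.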

final kiln
#

Aaaah back to regularization

left tartan
final kiln
#

Increased model size, included L1 and L2, model size affects LR schedule which might've actually been messing up the other loops

#

But at some point I'm gonna have to do data augmentation or find myself a larger set

frigid owl
#

hey guys i need help saving architecture from autokeras

#

i trained a text classifier and i just want to save architecture not the already trained model

#

is there a way to do this?

left tartan
#

@agile owl it’s not that it gets used the most times… usually you’re sliding the model forward and doing N tests, not a single test

#

Ie; train on 2019-2022, and test against 2023

#

Or start at 2010-2015, then walk forward
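The walk-forward idea can be sketched as a generator of train/test windows. The function name and the choice of a one-year test period are assumptions for illustration.

```python
def walk_forward_splits(first_year, last_year, train_years):
    """Yield (train_years_list, test_year): train on a window, test on the next year."""
    start = first_year
    while start + train_years <= last_year:
        train = list(range(start, start + train_years))
        test = start + train_years
        yield train, test
        start += 1  # slide the window forward by one year


splits = list(walk_forward_splits(2010, 2023, train_years=6))
print(splits[0])    # ([2010, 2011, 2012, 2013, 2014, 2015], 2016)
print(splits[-1])   # ([2017, 2018, 2019, 2020, 2021, 2022], 2023)
print(len(splits))  # 8
```

Each split yields one out-of-sample test, so the model is evaluated N times on data that comes strictly after its training window.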

agile owl
#

how do you get any sort of signal against a recent trend then

#

if you want behavior from a high rate environment

#

etc.

left tartan
agile owl
#

let's say you're trading equity indices right

#

now equity indices have different correlations to rates/inflation in different historical periods

left tartan
#

(We haven’t gotten there yet, but Monte Carlo is also a topic to discuss)

agile owl
#

let's say you went from a low inflation environment where inflation was associated with higher returns on equities

#

but now you're in a high inflation environment and the opposite is true

#

and the last time you had something like this was 10 years ago

#

what is your sliding window gonna do

left tartan
agile owl
#

right

#

if you use too short a window, for instance

#

and any window is probably too short given how far back our datasets go in finance

#

the bias of your dataset depends on the timeframe right

#

if you made a trading bot in 2008-2009 to trade treasuries and it just always bought, that would be the right thing

#

if you made a trading bot including 2012-2021 and tried to use it in 2022 and it just always bought bonds

#

you'd just get blown up by the Fed

left tartan
#

You could certainly test a model against some historical era, it’s just somewhat inevitable that you’re overfitting tho. This is where, perhaps, I’d Monte Carlo it rather than test against market actual

agile owl
#

sure but implicit in the montecarlo design is that you understand the market dynamics well enough to produce a better sample than historical conditioned on the current environment

#

which is a big claim

left tartan
#

Yah, absent a multiverse, what’s the alternative?

agile owl
#

pretending that history repeats itself

#

that's the necessary axiom to any of this anyway right

left tartan
#

Not precisely, the necessary axiom is that there’s patterns, but not that the patterns exhibit the same order/etc

left tartan
agile owl
#

sure me too

#

so it's hard to think about the assumptions we are implicitly making about what historical behavior means about future behavior sometimes

left tartan
agile owl
#

I think it often goes unsaid what people are actually assuming with respect to that

#

my favorite thing that has no real theoretical basis but gets used by everyone is implied to realized volatility ratios

left tartan
agile owl
#

one is future looking and the other is past looking

#

but everyone uses them in every asset class

#

what people should look at is the implied volatility from X days ago vs the realized volatility

#

but that doesn't even answer the same question

left tartan
#

lol, that’s interesting to people like us. But to the market players, they have zero hindsight

agile owl
#

also the meaning of realized volatility is completely different if you're delta hedging vs not

#

but yeah who cares

#

low ratio good high ratio bad

left tartan
#

Whelp, off to dinner, nice chat!

agile owl
#

yup u 2

stiff axle
#

I installed miniconda but when I type python I get the non conda version. My terminal also doesn't detect the conda command. Do I need to add this manually to my environment variables? I'm asking because during the installation process it said doing so was not recommended.

serene scaffold
stiff axle
left tartan
#

That said; In some data science circles, Conda is well entrenched.

sterile flare
#

HELP idk whats wrong

sterile flare
serene scaffold
serene scaffold
serene scaffold
sterile flare
serene scaffold
#

helping people often involves googling error messages or running segments of their code. and it's rude to expect people to retype stuff by hand.

sterile flare
#

i apologize again

serene scaffold
#

Don't worry about it for now. Just run print(df.columns.tolist()) and put the resultant text in the chat.

sterile flare
serene scaffold
serene scaffold
#

But it appears that 'England' is not the name of a column in your dataframe

sterile flare
#

exactly it is thats the problem

#

thats why i was confused

#

wanna show u my dataframe ?

#

can i do a ss ?

serene scaffold
#

I wanted you to run print(df.columns.tolist()) and put the text in the chat.

sterile flare
#

['A', 'B', 'c', 'D'] this is what appeared, i guess it is the D column named england

#

this is weird tho

serene scaffold
#

Okay, so there's no column named England. But you expected there to be one with that name. Should the England column have been there when the dataframe was initially created?

sterile flare
serene scaffold
#

It might help if you show the code that creates the dataframe, and everything that comes after it

#

!paste

arctic wedgeBOT
#
Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

sterile flare
serene scaffold
#

I want to see the code, in this instance. Not the dataframe.

sterile flare
#

whats a paste bin

serene scaffold
sterile flare
#

i guess that's what i was asked to do ?

serene scaffold
#

Yes

#

Thank you

sterile flare
#

i think i have 2 df

serene scaffold
#

It looks like you have two dataframes: data_frame and df.

#

and data_frame is based on whatever defra_consumption.csv is.

sterile flare
#

yes, sorry i'm not thinking straight

serene scaffold
#

That's okay

sterile flare
serene scaffold
#

so whatever columns are in defra_consumption.csv will be the columns for data_frame (but not df). It might be that 'England' is one of them.

#

make sense, @sterile flare?

serene scaffold
#

great 🙏

sterile flare
#

this is what i realized after remembering i have 2 dfs

sterile flare
#

and sorry if i was rude

serene scaffold
#

No worries, now you know for the future 🙏

stiff axle
#

I appreciate y’alls input on the conda advice. Ima just run with it and see how it goes.

cinder jay
#

Hi guys, im programming a python script to segment retinal blood vessel, the current result is this:

#

I've applied CLAHE and another pre processing

#

how do i remove the circle and the small particles???

agile owl
#

you could use object detection and mask over the pixels where they are detected

past meteor
final kiln
#

this model is gonna get trained, whether it likes it or not

ashen axle
#

I am attempting to optimize a baseline fitting algorithm for signal preprocessing by instantiating it as a SKLearn Custom Estimator and using GridSearchCV. How can I define a custom score to optimize the baseline fit, i.e. maximise fit function smoothness and intersections with the signal function? Also, is that a rational way of defining an optimal fit?

The end goal is a SKLearn Pipeline with several preprocessing steps prior to deconvolution through model fit optimization.

past meteor
past meteor
final kiln
#

loss/train is large because I'm adding l1

past meteor
#

I'm too used to using drop out to ever look at train loss

final kiln
#

when I started this thing I didn't know about dropout, it's not coded into the transformer yet

past meteor
#

I was mostly intrigued by how large the learning rate is and how the val loss isn't dropping that much at the end

#

But honestly, it's a meaningless observation on my part 😄 they could be reasonable numbers for your domain

final kiln
bronze robin
#

Any method to calculate standard error of c, for a best fit line given by the equation log(y) = mlog(x) + log(c) ?
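One standard route is to fit the straight line in log space by ordinary least squares, take the usual standard error of the intercept b = ln(c), and propagate it to c with a first-order (delta-method) approximation, SE(c) ≈ c · SE(b). The sketch below assumes natural logs and independent, equal-variance errors in log space; `fit_power_law` is a hypothetical name and the data points are made up.

```python
import math


def fit_power_law(xs, ys):
    """Fit ln(y) = m*ln(x) + ln(c) by ordinary least squares.

    Returns (m, c, se_c), where se_c is the delta-method approximation
    c * SE(ln c).
    """
    u = [math.log(x) for x in xs]
    v = [math.log(y) for y in ys]
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    s_uu = sum((ui - u_mean) ** 2 for ui in u)
    s_uv = sum((ui - u_mean) * (vi - v_mean) for ui, vi in zip(u, v))
    m = s_uv / s_uu
    b = v_mean - m * u_mean                        # intercept, b = ln(c)
    residuals = [vi - (m * ui + b) for ui, vi in zip(u, v)]
    s2 = sum(r * r for r in residuals) / (n - 2)   # residual variance
    se_b = math.sqrt(s2 * (1 / n + u_mean ** 2 / s_uu))
    c = math.exp(b)
    return m, c, c * se_b


# Made-up noisy data, for illustration only.
xs = [1, 2, 4, 8, 16]
ys = [2.1, 5.4, 16.5, 44.0, 130.0]
m, c, se_c = fit_power_law(xs, ys)
print(m, c, se_c)
```

If the fit uses base-10 logs instead, the same propagation gives SE(c) ≈ c · ln(10) · SE(log10 c).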

final kiln
#

at some point val will plateau and start to grow

past meteor
#

I always start with a learning rate of exactly 3e-4

#

And train with early stopping

final kiln
#

I'm using the LR schedule originally proposed in the 2017 paper, where the transformer was introduced

#

tho the formula doesn't seem to work as intended for small dimensions like these

#

I adapted it so that it defaults to 1e-3 if it tries to output larger values

#

but the intended behaviour is that it starts super small, grows up to 1e-3 or something, and then decays with the inverse square root of the step
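For reference, the schedule in the 2017 paper is lrate = d_model^-0.5 · min(step^-0.5, step · warmup_steps^-1.5): a linear warmup followed by inverse-square-root decay. The sketch below implements that formula with an optional cap like the one described above; the function name and the `max_lr` parameter are assumptions.

```python
def transformer_lr(step, d_model, warmup_steps=4000, max_lr=None):
    """Learning rate from the 2017 Transformer schedule, optionally capped at max_lr."""
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first step
    lr = d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
    return lr if max_lr is None else min(lr, max_lr)


# The peak is reached at step == warmup_steps and scales with d_model ** -0.5.
print(transformer_lr(4000, d_model=512))              # roughly 7e-4
print(transformer_lr(4000, d_model=16))               # roughly 4e-3
print(transformer_lr(4000, d_model=16, max_lr=1e-3))  # 0.001
```

Because the peak equals d_model^-0.5 · warmup_steps^-0.5, a small model dimension pushes the peak learning rate up, which may explain why the uncapped formula misbehaves for tiny models.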

past meteor
final kiln
#

I might need to modify it tho, I don't think they tested it for these values

wooden sail
# ashen axle I am attempting to optimize a baseline fitting algorithm for signal preprocessin...

this is a pretty broad topic and the score depends a lot on your application. kinda rough without further context. regarding smoothness and intersections with the data, you can achieve this by having an L2 error term, which measures how good the fit is at points you specify, and then an additional term measuring how large the derivative is, trying to minimize both. this is also roughly the idea behind classical methods like splitting the signal with low and high pass filters first

ashen axle
ashen axle
ashen axle
# wooden sail this is a pretty broad topic and the score depends a lot on your application. kin...

so as I understand it, i would mask the signal to find the “zero” points, i.e. local minima, and see how well the estimation fits them? Is the size of the derivative the smoothness metric? I thought minimizing the second derivative would be the way to go.

I was thinking of scoring by measuring how many zero points existed in the signal minus the baseline, avoiding the need to find local minima

wooden sail
#

the size of the derivative is a measure of a type of smoothness, yes. there are several kinds of smoothness, not just the basic "is the function even differentiable". many of the definitions involve finding a bound for how much the function changes if you change the input

#

and indeed, for spectral data like yours, one tends to look for the local minima and try to make those points zero, meaning the baseline function doesn't need to pass through all data points exactly

#

there are many papers discussing this topic if you just look up "baseline fitting algorithm" on google scholar, most of them on spectrographic data

#

sadly you'll probably find that in your data, even after optimization, probably 0 points will be exactly 0 😛 so you'll have 0 or only very few zero crossings

#

1d peak-finding is not very complicated though, you could almost consider this an input to your procedure and just calculate them ahead of time. this would tell you which points in the domain to include into your cost function
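The cost described in this exchange can be sketched as an L2 misfit at precomputed local minima plus a roughness penalty. This is an illustration only: the function names, the use of squared second differences as the smoothness term, and the sample signal are all assumptions.

```python
def local_minima(signal):
    """Indices of interior points that are lower than both neighbours."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]]


def baseline_cost(signal, baseline, anchors, smoothness_weight=1.0):
    """L2 misfit at the anchor points plus a penalty on baseline roughness."""
    fit = sum((signal[i] - baseline[i]) ** 2 for i in anchors)
    # Squared second differences approximate the size of the second derivative.
    rough = sum((baseline[i - 1] - 2 * baseline[i] + baseline[i + 1]) ** 2
                for i in range(1, len(baseline) - 1))
    return fit + smoothness_weight * rough


# Made-up signal: three peaks sitting on a slowly rising floor.
signal = [1.0, 5.0, 1.2, 6.0, 1.4, 4.0, 1.6]
anchors = local_minima(signal)
print(anchors)  # [2, 4]

flat = [1.3] * len(signal)
print(baseline_cost(signal, flat, anchors))
print(baseline_cost(signal, signal, anchors))  # perfect fit, but very rough
```

Lower cost is better here. Since scikit-learn's grid search picks the highest score, a custom score built on this would typically return the negative cost.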

ashen axle
ashen axle
wooden sail
#

sadly the specifics of sklearn are beyond me, i don't use it. but that sounds about right

#

sklearn's grid search says something about defining the estimator with a score function, which i presume entails defining a function for the baseline, a score, and having sklearn fiddle with the baseline function's parameters through grid search

final kiln
#

made a small change in the output data, removed all regularization, it's looking promising

#

too soon tho

#

but gonna let it cook

ashen axle
#

is there a particular framework you recommend working in for problems such as this? Thus far everything had been written as python classes from scratch, based around pandas, with scipy.optimize for curve fitting

final kiln
#

reduced the model size by half and adapted the LR scheduler to have its intended behavior, I'm getting there for sure

#

like, it's not baaad

#

it's very sensitive to punctuation changes, so there's quite a bit of data augmentation that can be done here

river cape
#

can anyone provide some references to federated learning and differential privacy

final kiln
# final kiln like, it's not baaad

I think the lesson I'm gonna take from this one is that I should define my end goal more clearly. I'm trying to make the model not overfit, but I don't even know if the value b4 overfit is good or not.

cinder jay
#

Hi, im implementing a python script using opencv that segment the retinal blood vessel, the current result are this one:

#

how do i remove the small points????

final kiln
#

But also, try to not produce them in the first place

cinder jay
#

which kernel size???

final kiln
#

No idea, you have to experiment with it
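Besides morphological filtering, small specks can be removed after thresholding by dropping connected components below a size limit. The sketch below does this in plain Python on a list-of-lists mask so the logic is visible; the function name and `min_size` are hypothetical. In OpenCV, `cv2.connectedComponentsWithStats` reports the area of each component and could be used for the same filtering.

```python
from collections import deque


def remove_small_components(mask, min_size):
    """Zero out 8-connected foreground blobs with fewer than min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [row[:] for row in mask]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) < min_size:
                for y, x in component:
                    cleaned[y][x] = 0
    return cleaned


mask = [
    [255, 255, 255, 255, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 255],  # an isolated speck
]
cleaned = remove_small_components(mask, min_size=3)
print(cleaned[2])  # [0, 0, 0, 0, 0, 0]
```

As with the kernel size, `min_size` has to be tuned by experiment: too small leaves noise behind, too large removes thin vessel fragments.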

outer jasper
#

can anyone help me with pytorch? torch.cuda.is_available() returns false even though i installed cuda

final kiln
cinder jay
#

ty

outer jasper
final kiln
#

Which OS r u using

outer jasper
#

windows

final kiln
#

Uhm

outer jasper
#

it opened a console then closed

final kiln
#

Literally never debugged this on windows, but I expect funky non sensical behavior like this

#

Maybe try to go via the Linux subsystem thing

tidal bough
outer jasper
#

i really should switch to linux

tidal bough
#

If you just did pip install torch, that'd get you the CPU-only version.

final kiln
#

Yeah Linux is a good choice since almost all production servers are Ubuntu

#

I think that's actually how I've been installing it tho

#

Gonna check

outer jasper
final kiln
#

I just do pip install torch

outer jasper
#

i am following a guide to install a Text to speech thingy

odd meteor
outer jasper
#

and every thing was going well unil i reached the pytorch steb

#

step

tidal bough
outer jasper
#

for some reason they say (the guide) you need an older version of pytorch

past meteor
tidal bough
past meteor
#

I did this as a student as well, fiddle with tons of hyperparameters for ages. It's not time efficient, best to set this all up, run it, sleep and check the results in the morning

tidal bough
#

i used this command .\Scripts\pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 torchaudio===0.8.0 -f https://download.pytorch.org/whl/torch_stable.html then it gave me error saying it does not exist
torch 1.8.0 only supports up to python 3.9, probably that's why these versions didn't work

outer jasper
#

oh

#

okay

#

why is every thing conflicting

#

very annoying for a beginner

final kiln
final kiln
#

I got a bit of infra setup, I can build on top of it

#

I'm thinking of using the GitHub actions API to programmatically start several training loops

#

Instead of manually triggering them

past meteor
#

Yeah, that's the way to go

#

Well, at least some variant of it

#

Before I do experiments at work nowadays I think about what I want to evaluate etc. and then build something ad hoc to automate training / hyperparams etc.

#

but it's probably better to use mlflow, tensorboard, optuna, ... for this

final kiln
#

The infra I have saves me on GPU compute. I could setup on kaggle/colab, but the free tiers will just run out over night

#

I think kaggle connects to Google cloud but all our credit is on AWS

#

But yeah, lesson about knowing the goal is well learned here

#

Gonna run over the equations for the cross entropy in the context of my output data (which is kinda funky) and make a Fermi estimate of what I would consider a good result

#

Or just an actual estimate

final kiln
past meteor
#

Hmmm

#

I can only speak of my personal experiences but I usually think "okay this is my task" and then I draw out a schematic about how I'll try and evaluate what I want to do, what types of models, what types of metrics and I code this all up.

#

Then I might run experiment A manually a few times to see if it runs and try and make sense of the initial results, afterwards I run the experiment pipeline.

As it's running I prepare experiment B and repeat.

final kiln
#

Yeah that sounds reasonable

#

There's quite a lot of stuff tho. So there's dropout, L1 and L2 regularization, the various LR schedules that themselves may have several parameters, and then there's potentially 3 models to compare across various sizes with quite a bit of parameters themselves

#

And this is just the first task, I'd wanna do this for at least 3 or 4, which I think is what they did in the MetaFormer study

#

And I haven't even gotten to the data

past meteor
#

Yeah, hence why you should parameterize it and use some optimizer.

It's a nasty problem. One of the first things they teach you in intermediate ML courses is that you train models to solve a problem that is usually convex. On top of this you have an argmin of the loss with respect to the hyperparameters, Loss = F(hyperparameters), but this isn't a convex problem whatsoever.

crisp raptor
#

I feel like this image seems so unprofessional in a paper on NLG

final kiln
#

I do have some ideas on how to make this hyper parameter stuff searchable with gradient descent. But I'm sure a lot of people have tried it b4 me

past meteor
agile owl
#

and you aren't predicting the future per se you are predicting the best action given the current state

molten elk
#

Can anyone tell me how this is different np.random.binomial(1000, 0.7, 500) from np.random.binomial(1000, 0.7)? Is the size parameter different from the number of times?

agile owl
#

I agree that could be clearer

#

the question with reinforcement learning is if the environment admits information about the reward

molten elk
# agile owl

I get it now: it runs the binomial draw the number of times specified by size

agile owl
#

the number of trials is a distribution parameter but the number of samples is not
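The difference shows up in the shape of what comes back. A short sketch using the same `np.random.binomial` calls as in the question; the seed is only there to make the run repeatable.

```python
import numpy as np

np.random.seed(0)  # only to make the run repeatable

# One sample: the number of successes in 1000 trials with p = 0.7.
single = np.random.binomial(1000, 0.7)

# 500 independent samples, each one again a count out of 1000 trials.
many = np.random.binomial(1000, 0.7, 500)

print(np.ndim(single))  # 0, a single number
print(many.shape)       # (500,)
print(many.mean())      # close to 1000 * 0.7 = 700
```

So `n` and `p` describe the distribution, while `size` only says how many draws from that distribution to return.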

mint palm
#

I know what conditional probability is, but man it will be great if anyone of you could please help me interpret the equations

final kiln
#

looking at cross entropy loss is not a good way to do it, I care more about the percentage of correct guesses

#

random chance is on the order of 1e-5

desert oar
final kiln
#

All this time I thought I needed to reduce model size, now I see that I get better results by increasing it

#

see how val actually increased, but the percentage of correct guesses got better
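Loss going up while accuracy improves is possible because cross entropy punishes confident mistakes heavily. A toy sketch with made-up probabilities, simplified to the binary case where a prediction counts as correct when the true class gets more than half the probability mass:

```python
import math


def cross_entropy(p_true):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p) for p in p_true) / len(p_true)


def accuracy(p_true):
    """Binary case: correct when the true class gets more than half the mass."""
    return sum(p > 0.5 for p in p_true) / len(p_true)


# Probability each model assigns to the true class on four examples (made up).
hesitant = [0.6, 0.6, 0.45, 0.45]       # 2 of 4 right, never very wrong
confident = [0.99, 0.99, 0.99, 0.001]   # 3 of 4 right, one confident mistake

print(accuracy(hesitant), cross_entropy(hesitant))
print(accuracy(confident), cross_entropy(confident))
```

The second model gets more examples right yet has the larger loss, driven almost entirely by the single confident miss.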

river cape
odd meteor
# river cape Anonymise user identities in large databases to ethically employ machine learnin...

I suggested using Federated Learning + Differential Privacy earlier. Have you looked into them?

I also shared a github repo on research paper implementation of Federated Learning.

Try checking that as well.

If you however don't fancy the idea of reading a research paper and using the code implementation of that paper to learn a new topic, then I'll suggest taking your time to go through this nice detailed blog and tutorial from Flower.

https://flower.dev/docs/framework/tutorial-series-what-is-federated-learning.html

odd meteor
dark lichen
#

anyone here good in advanced math?

final kiln
#

Honestly at this point I'll be happy with an overfit 💀

#

Ok training set is getting to 50% correct guesses

#

So I think I'm getting somewhere

odd meteor
final kiln
#

Omg validation loss just started converging out of nowhere around 75% correct guesses on the train set

#

It was at like 12 randomly jumping around, now it's under 1 and dropping

#

Honestly I gotta just let it cook

river cape
# odd meteor I suggested using Federated Learning + Differential Privacy earlier. Have you lo...

I did look into them, especially for recommendation systems, but this is for a hackathon, so we need to implement a technique that suits the above problem statement.

What I initially thought is: we have a large dataset, so I would search Google for the kinds of entities that are personally identifiable, then check the similarity of those against the columns of the dataset. And I am stuck here as to what to do next

umbral charm
#

when would one use matplotlib compared to plotly

#

i dont know if i should stick to matplotlib or learn plotly

odd meteor
final kiln
#

my notebook tab just crashed, : D........

odd meteor
odd meteor
umbral charm
#

I was just wondering, since I feel fluent enough with matplotlib to learn something more complicated

#

But there would be no point learning it if its worse / inefficient

final kiln
#

I'm gonna give it a rest, tomorrow I'll implement the cloud infra stuff so I don't have to babysit notebooks

odd meteor
river cape
odd meteor
# final kiln It was kaggle

Oh I see. Having to randomly move your mouse every 30 mins in order to keep the notebook active 😂😂😂

Well, I still prefer Kaggle to free tier of Colab

final kiln
#

But damn, this really shouldn't be so hard, it's just a classification problem

odd meteor
# river cape Its mostly the concept and accuracy plus I just need a roadmap of how I can do i...

I'm afraid I might not be of much help at this time; since I've not worked on any project where I had to implement differential privacy yet.

However, I'm sure there are more knowledgeable people here with much experience in Differential Privacy who can be of help.

If I were you, I'd go down the rabbit hole of checking research papers with code implementations on this same topic, or even learning from YouTube or something. (Devote 1 or 2 days and you'll have a lot to write in the abstract you're expected to submit.)

Hopefully, your team wins this Hackathon. All the best 👍

odd meteor
fleet hemlock
#

Hi, can you tell me what I can do to improve in Python and what projects I can do?

still coyote
arctic wedgeBOT
#
Kindling Projects

The Kindling projects page on Ned Batchelder's website contains a list of projects and ideas programmers can tackle to build their skills and knowledge.

fleet hemlock
#

Thank you

mint palm
final kiln
#

That's the notation for saying that a given random var follows a given dist

mint palm
#

no the conditional prob equation

final kiln
#

What about it ? The || ?

mint palm
#

i know one |

#

not ||

final kiln
#

Yeah that's a good question

mint palm
#

also the double arrow?

final kiln
#

Seems to only happen within KL

#

So it's probably defined somewhere back
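
For reference, assuming the paper follows the usual convention (the paper itself is not quoted here, so this is a guess at its intent), the double bar inside KL is only a separator between the two distributions and is not a conditioning bar. For discrete distributions $P$ and $Q$:

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
```

The double bar is commonly read as a reminder that the divergence is not symmetric, so swapping $P$ and $Q$ generally gives a different value.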

mint palm
#

This paper is nutz.
5% contribution, 95% flashy equations

final kiln
#

Double arrow is not any standard notation I am aware of

mint palm
#

this is start of methodology

#

let me share paper

final kiln
#

Anything that's not standard they have to define it

final kiln
#

And its preferable not to use anything that's not standard

mint palm
#

no supplementary, nothing

final kiln
#

Yeah that's kinda wild, I wonder if it becomes standard notation around some specialized niche

#

Maybe follow the closest related citation

#

See if they define it there

mint palm
#

Its like those fancy restaurants

final kiln
#

Check 37

#

Bet it's gonna be there

mint palm
#

yeah i was checking that one only

#

found it

#

but it's a bigger nightmare

#

thanks for the suggestion

final kiln
#

It's defined back there, I'm reading it rn

mint palm
#

yup integration of kl loss from one of the term

final kiln
#

The double arrow thing, maybe it relates to the concept of clustering somehow idk

mint palm
#

i will try to find.

final kiln
#

Is the pytorch documentation an open source thing ? I really wanna contribute to it if it is

#

I see, they are generated from the docstrings

#

I wonder if they're open to mods on this stuff, I can make them easier to understand

desert oar
#

"Cat" is the categorical distribution, Bernoulli with >2 categories or multinomial with n=1 trial

#

that double arrow notation is new to me, it might be defined in reference 37

#

i agree that it appears to denote some kind of clustering structure, but i can only guess as to what it means
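
On the "Cat" point above, a minimal sketch of the equivalence between a categorical draw and a multinomial draw with one trial (the probabilities `p` are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
p = [0.2, 0.5, 0.3]               # hypothetical class probabilities

one_hot = rng.multinomial(1, p)   # multinomial with n=1: a one-hot vector
label = rng.choice(len(p), p=p)   # same distribution, returned as an index

print(one_hot.sum())              # 1, exactly one category is selected
```

Both calls sample from the same distribution; they differ only in whether the result is encoded as a one-hot vector or as a class index.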

abstract wasp
#

Hi, I’m trying to make a model that converts bullet points into full sentences. I’m not sure how to structure my dataset. I currently just have a .txt file with something like:
Input:

  • finished data collection
  • started cleaning data

Output:
I finished the data collection and I started cleaning the data.

Is this good enough? I’ve seen some people use JSON format for this. What’s best?
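
One possible layout, sketched under the assumption that each example is a list of bullet strings plus one target sentence (the field names `input` and `output` are hypothetical), is JSON Lines: one JSON object per line, which avoids hand-parsing "Input:" / "Output:" markers in a .txt file.

```python
import json

# Hypothetical training pairs: bullet notes in, one full sentence out.
examples = [
    {
        "input": ["finished data collection", "started cleaning data"],
        "output": "I finished the data collection and started cleaning the data.",
    },
]

def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

def from_jsonl(text):
    """Parse JSON Lines back into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

text = to_jsonl(examples)
print(from_jsonl(text) == examples)  # True, the round trip is lossless
```

Keeping the bullets as a list leaves the choice of how to join them into a model input (newlines, separators, special tokens) to the training code.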
final kiln
#

Uhm, I'd just pickle a python object, or put it into a parquet file or an SQLite file

serene scaffold
serene scaffold
#

Presumably there are examples that are more intricate than "I ... And I ... And I ..."

abstract wasp
# serene scaffold What kind of model is it btw?

I’m thinking of a sequence to sequence model.
So I’m in this research hub and we have to send weekly emails about our progress through that week. I forget to send the emails most of the time 🤡😭 I usually type down my hours and what I completed—I’m making this so that it can take the notes I have, convert it into full sentences and have the email be sent out automatically 🤡😂

abstract wasp
desert oar
agile owl
#

~ means distributed as

#

like x ~ N means x is normally distributed

#

you know the big fancy N

dense yarrow
#

does anyone have a pdf of this book?Pandas for Everyone: Python Data Analysis, 2nd Edition by daniel chen

#

i'd really appreciate it

#

i made the wrong O'Reilly account and cannot access it but i have hw due

#

in a few hours

#

nvm i got it now

magic dune
#

@serene scaffold can I ask you a question about Markov chains? (I saw you were a Computational Linguist.) I know they can create pseudo-realistic sentences, but sometimes the sentences don't flow very well. Are there any solutions for this type of Markov chain problem, or is that just a known limitation? (Sorry to bother.)

serene scaffold
#

which obviously doesn't have that particular limitation.

magic dune
#

or am I wrong?

serene scaffold
#

you might enjoy this reddit post I wrote when I created a markov chain language model for my homework

#

I therefore submit to all of you one of those most statistically probable passages to appear in the Book of Mormon.
Ha, I'm even funnier than I remember.

magic dune
#

lol

#

I have been having a lot of fun with markov chain

#

And wondered if there is any fix. thanks for answering all my questions super informatively!

serene scaffold
#

or decrease the temperature. I guess.

#

but both of these will just make the text more similar to passages from the training data.

magic dune
#

ya

serene scaffold
#

markov chains with ngrams aren't sophisticated enough to produce things that are "new"
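
A minimal sketch of an n-gram Markov chain text generator, to make the trade-off above concrete (the toy corpus and the function names are illustrative, not anyone's actual homework code). A larger `order` makes the output flow better, but only by copying longer stretches of the training text.

```python
import random
from collections import defaultdict

def build_chain(words, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, length=20, seed=0):
    """Random-walk the chain, stopping early at a state with no continuation."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    out = list(state)
    while len(out) < length:
        followers = chain.get(tuple(out[-len(state):]))
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug".split()
print(generate(build_chain(corpus, order=1)))
print(generate(build_chain(corpus, order=2)))
```

Every window of `order + 1` words in the output occurs verbatim in the corpus, which is why such a model cannot produce anything genuinely new.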

magic dune
serene scaffold
#

I dunno

magic dune
ionic umbra
#

I'm trying to run the example code for d3graph from here: https://erdogant.github.io/d3graph/pages/html/Edge properties.html

... but whenever I execute, I just get a "File not found" error from Firefox, saying the temporary file it's trying to create doesn't exist.

"Firefox can’t find the file at /tmp/tmpog5luy4x/d3graph.html"

...any ideas what to do here?

final kiln
#

This network

#

Will get trained whether it likes it or not 😈

final kiln
#

I'm gonna try to design a generalized pipeline

#

Only one input, which will be in json form

#

I gotta brainstorm

final kiln
#

here's the initial version of the game plan

#

this is similar to what I already have, the only difference is that I'm extending it so that train.py can be selected

#

there's possibly a much better way though

#

if I get an AWS AMI with self hosted runners pre-installed

#

I need to read up on it

final kiln
#

looks almost done, but it's not stopping the instance

final kiln
#

this gonna be epic

final kiln
#

this would be amazing if it works out

#

and it worked out, this is great, im very happy rn

spiral whale
#

hello, ive discovered LM studio, where u can download free open source models and run them locally. Is there a way to download them and import them on my own python script? with keras or tensorflow?

livid goblet
#

Hi
Guys, which topic would you think is more interesting to work on for a Master Thesis ? I'm kinda on the fence about that
DeepStereoBrush – Depth Map Interpolation Using Deep Learning
Neural Networks Optimization for Edge/Mobile Computing (such as CLIP network, etc..)

barren fable
#

Hi, I have two questions about dummy variables and feature selection in machine learning.

First, so I know that to avoid the dummy variable trap, I should drop a column from the dummy variables. So now I see some people who say that you don't need to drop a column from your dummy variable because Sklearn will do it automatically.
I read some articles that said, "You don't have to do this because the sci-kit learn library automatically removes one of the variables for you in the following code."

# X is the training dataset and we are using scikit-learn
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# One-hot encode column 3; pass the remaining columns through unchanged
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [3])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
print(X)

Other people said that you need to do it manually, and when I searched and looked at the OneHotEncoder parameters on the Sklearn website, I found that
drop{‘first’, ‘if_binary’} or an array-like of shape (n_features,), default=None
so the default of the drop is None, not first.
So, does the SK Learn actually take care of the dummy variable trap and remove a column, or should I do it manually? im confused?

Second, I read an article that says, "Backward Elimination is irrelevant in Python, because the Scikit-Learn library automatically takes care of selecting the statistically significant features when training the model to make accurate predictions."
So again, should I do feature selection manually? or Sklearn takes care of it?

Thanks.
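
A quick way to check the first question empirically, sketched with a made-up single-column dataset: with the default `drop=None` the encoder keeps one column per category, and a column is removed only when `drop="first"` is passed explicitly.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X = np.array([["red"], ["green"], ["blue"], ["green"]])

# Default (drop=None): one column per category, so each row sums to 1.
full = OneHotEncoder().fit_transform(X).toarray()

# drop="first": the first category (alphabetically "blue") becomes the baseline.
reduced = OneHotEncoder(drop="first").fit_transform(X).toarray()

print(full.shape)     # (4, 3)
print(reduced.shape)  # (4, 2)
```

Whether the extra column matters depends on the model: it causes perfect collinearity in an unregularized linear regression with an intercept, while tree-based models are generally unaffected.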

odd meteor
fading compass
#

Hello, I have to generate a 'random on a grid' grid, every mesh has to be occupied by a point, can someone help ?
Here is my code in the case of a regular grid :

import numpy as np

def generate_gridded(dim, nb_pts):
    # Regular grid with about sqrt(nb_pts) points along each axis
    dx, dy = dim
    n = int(np.sqrt(nb_pts))
    x = np.linspace(0, dx, n)
    y = np.linspace(0, dy, n)
    X, Y = np.meshgrid(x, y)
    return X.flatten(), Y.flatten()

Thank you
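
One way to read "random on a grid" is jittered (stratified) sampling: split the domain into cells and draw one uniform point inside each cell. A sketch under that assumption, reusing the `dim` and `nb_pts` arguments from the question (the function name and the `seed` parameter are additions):

```python
import numpy as np

def generate_jittered(dim, nb_pts, seed=None):
    """Place one uniformly random point inside each cell of a regular grid."""
    dx, dy = dim
    n = int(np.sqrt(nb_pts))          # n x n cells, as in the regular version
    rng = np.random.default_rng(seed)
    # Integer cell indices (0..n-1) along each axis.
    ix, iy = np.meshgrid(np.arange(n), np.arange(n))
    # Random offset in [0, 1) inside each cell, then scale to the domain size.
    x = (ix + rng.random((n, n))) * dx / n
    y = (iy + rng.random((n, n))) * dy / n
    return x.flatten(), y.flatten()

x, y = generate_jittered((10.0, 5.0), 16, seed=0)
print(x.shape, y.shape)  # (16,) (16,)
```

Every cell is occupied by exactly one point, so the coverage is as even as the regular grid while the positions inside each cell remain random.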

mint palm
#

why do some ROC curves look like the left one, and some like the right one?

#

is right one having high discretisation of thresholds?

#

and due to rectangular interpolation, it looks choppy?

final kiln
#

one job per hyperparameter set

final kiln
# final kiln one job per hyperparameter set

@past meteor what do you think ? I literally just setup my set of hyper parameters as a matrix in the GitHub workflow and it runs them sequentially (or concurrently if possible)

#

It doesn't really handle fault tolerance though, if AWS takes away the spot instance it just kinda fails

#

Still working on how to handle it

#

I think I can possibly check for unfinished jobs and try to schedule them, but idk yet

past meteor
final kiln
#

thanks!

#

here's how the workflow is looking if you're curious

#

ideally I think I'm gonna do 1 workflow per experiment instead of having a bunch of input parameters

past meteor
#

Do all of them fail as soon as the first fails or what?

final kiln
#

no, I'm disabling that with fail-fast: false

#

but I worry about the case where the spot instance is removed by aws

#

I can make this sort of "idempotent"

#

all jobs run when I run the workflow, but they will first check if that set of hyper parameters were already trained

#

yeah that's how it's gonna go

#

and I can set it to dispatch a totally separate workflow

#

that checks for failed jobs and restarts this one in case of it
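
A minimal sketch of the "check first, then train" idea, assuming results are written to some directory that outlives the instance (for example a mounted or synced bucket; that part is not shown, and all names here are hypothetical):

```python
import hashlib
import json
from pathlib import Path

def run_id(params):
    """Stable identifier for a hyperparameter set (key order does not matter)."""
    blob = json.dumps(params, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:16]

def train_once(params, results_dir, train_fn):
    """Run train_fn(params) unless a result for these params already exists."""
    marker = Path(results_dir) / f"{run_id(params)}.json"
    if marker.exists():
        return json.loads(marker.read_text())  # already trained: skip
    result = train_fn(params)
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(json.dumps(result))       # written only after success
    return result
```

Because the marker file is written only after training finishes, a job killed by a spot interruption leaves no marker and is picked up again on the next run of the workflow.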

river cape
#

After making a model , what do we do?

river cape
#

Doesn't every model need to be deployed on the cloud?

serene scaffold
#

no. but even if they did, the way that you deploy it depends on the model

river cape
serene scaffold
serene scaffold
#

this is like asking "how do people deploy software" without being any more specific.

river cape
#

Hmmm My bad

serene scaffold
#

if you want to make the question more specific, I or someone else might be able to help

serene scaffold
serene scaffold
river cape
#

Ahhhhh sorry mate if I didn't phrase my question properly, such a newbie trying to figure things out

serene scaffold
desert oar
#

however I think there is a library that makes doing JSON CRUD stuff easier with django

#

"django rest framework" i think it's called

past meteor
#

or django ninja

desert oar
#

that one is new to me. i haven't used django since 2017

past meteor
#

I used django rest framework (DRF) for the backend of an app I never finished and it's soul crushing 🤣

primal agate
#

Guys, I have a question: I don't know how to start with NumPy and pandas

#

Any ideas

left tartan
primal agate
#

What I should learn at first

left tartan
primal agate
#

Thank you so much

#

Thats what I needed

final kiln
primal agate
#

You have been learning by this?

left tartan
#

It’s a nice intro because it doesn’t try to teach too much at once, I recommend it a lot

valid tree
#

Hi folks, in pandas, why is this the case:

import pandas as pd

df = pd.read_csv("...")
df.groupby('country').size() # calling the size() function on the group

def size_filter(grp):
    return grp.size() > 2

df.groupby('country').filter(size_filter) # error on size is not callable

Also notably, when I do type(grp) from within that filter function, it's a Series. I'm just not sure what I'm operating on fundamentally in my filter function - why is it a Series? Which Series is it?

serene scaffold
#

not a DataFrameGroupBy

valid tree
#

Thanks for getting back to me

serene scaffold
#

one way to do this is to do print(df.head().to_dict('list')) with the actual dataframe that you have.

#

but it needs to have enough unique values for the country column to be interesting.

valid tree
#

yeah just double checked, I was calling type on the wrong thing - pitfall of notebooks. I see it's a DataFrame now
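
For the original error, a small sketch with made-up data: inside `filter`, each group is a plain DataFrame, and `DataFrame.size` is an integer attribute (rows times columns) rather than a method, so the row count comes from `len(grp)`.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["FR", "FR", "FR", "DE", "DE", "JP"],
    "value": [1, 2, 3, 4, 5, 6],
})

def size_filter(grp):
    # grp is a DataFrame holding one country's rows.
    # DataFrame.size is an int attribute, so use len() for the row count.
    return len(grp) > 2

kept = df.groupby("country").filter(size_filter)
print(kept["country"].unique())  # ['FR']
```

`df.groupby('country').size()` works in the first snippet because there `size` is a method of the GroupBy object, not of the per-group DataFrame.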

valid tree
serene scaffold
#

(I use notebooks for some things, for as much as I shit on them.)

valid tree
lapis sequoia
#

are there any better models than Black-Scholes for pricing call options?

serene scaffold
lapis sequoia
#

Ah ok mb

serene scaffold
#

That model you mentioned. What does it do?

lapis sequoia
#

So, the model is one I recently came across while researching on YouTube. The Black-Scholes model is a pricing model used for the valuation of stock options. It looks good to me since it accounts for the volatility of the price, the risk-free rate, etc.

#

However, I would like to know if there's any better model I could use or should I stick with learning this new model

left tartan
#

It’s one of those; everyone uses it to some extent, even if there’s some flaws and weaknesses, it’s still pretty good.

#

The interesting thing about it is that implied volatility is derived from it (since you can observe the current price, you can solve for ivol)
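
A sketch of what "solve for ivol" means, assuming a European call on a non-dividend stock and using plain bisection (the bracket `lo`, `hi` and the tolerance are arbitrary illustrative choices):

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Bisection on sigma; works because the call price increases with sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

price = bs_call(100, 100, 1.0, 0.05, 0.2)
print(round(implied_vol(price, 100, 100, 1.0, 0.05), 4))  # 0.2
```

In practice the `price` argument would be the observed market price, and the returned sigma is the volatility the market is implying under the model's assumptions.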

lapis sequoia
#

interesting, anyway I'll learn this new model and hopefully figure out how I can implement or apply it in my future projects

#

It sounded interesting to me, but I wanted to know if it's worth it. Now I know it's worth it, so thank you so much 🙏

barren iris
#

does anyone know a good implementation of Kendall's W concordance coefficient?
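
In case a short hand-rolled version is enough, here is a sketch of Kendall's W from its textbook formula, W = 12 S / (m^2 (n^3 - n)), under the stated assumption that no rater gives tied scores (ties need a correction term that is not included here):

```python
import numpy as np

def kendalls_w(ratings):
    """Kendall's W for an (m raters x n items) array, assuming no tied scores."""
    ratings = np.asarray(ratings, dtype=float)
    m, n = ratings.shape
    # Rank each rater's scores from 1..n (double argsort; valid without ties).
    ranks = ratings.argsort(axis=1).argsort(axis=1) + 1
    totals = ranks.sum(axis=0)                  # rank sum per item
    s = ((totals - totals.mean()) ** 2).sum()   # spread of the rank sums
    return 12.0 * s / (m**2 * (n**3 - n))

print(kendalls_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]]))  # 1.0
print(kendalls_w([[1, 2, 3], [3, 2, 1]]))             # 0.0
```

W ranges from 0 (no agreement among raters) to 1 (identical rankings).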

eternal bridge
#

anyone here used Quarto on VS? I need some assistance with running Quarto

teal lance
#

So I was able to compare dxy , spy , vix but I want to use python to build a model using this because well the dxy acted as the open interest while vix declined sending spy up

#

Anybody want to help me also finish a good project it’s turning the cot report into the dmi indicator ? Basically taking the values of the dmi to spit a bias and trend confirmation by acting as a live cot report

teal lance
#

Do you guys think a dmi is able to be used in python to spit a bias out based on the values since they are numerical ? Using the values makes it easier to configure true or false faster and more reliable with out too many outliers tbh

#

hard to take a thinkscript and turn it into python

#

Starting to enjoy it I’m taking my 3 1/2 market experience and just trying to replicate my manual trading

teal lance
#

Loving this fetching 🥹🐍🐍🐍

teal lance
final kiln
#

almost there

midnight harbor
#

can someone recommend a Colab alternative that can run in the background? (Colab Pro seems kind of expensive to me; people on reddit say its compute units run out quickly)

so if anyone here knows a good alternative that is cheap and fast, please recommend it by tagging me

tacit basin
final kiln
#

I'm running ec2 spot instances using GitHub actions

#

I'm getting 10-30% of the original price

midnight harbor
# tacit basin how long do you need a session to last?

on Google Colab's free plan with a T4 GPU it takes 11 hours to complete, but since it's the free plan the session expires

I'm also still experimenting, but I think 11 hours would be about right (on 1x Google T4 GPU)
it can run multi-GPU if available

tacit basin
midnight harbor
#

also I'm very confused about what these compute units of Colab are, like how they work

midnight harbor
#

i once created an account on this XD

tacit basin
#

each gpu on colab will have different compute units cost, like A100 will be most expensive, t4 cheapest i think

midnight harbor
#

many people on reddit are saying their compute units ran out in a day. If the units run out, do they come back the next day, or is that just the end?

midnight harbor
final kiln
#

Inputted the wrong parameter for the 1000th time

#

I'm so distracted rn

#

I'm even gonna train this on CPU tbh, what's the problem in it lasting all night if it will only cost me like one dollar

#

Better than having it run in 3h and costing triple

buoyant vine
#

I think the relative scale of that should not line up

#

typically, the cost of getting enough cpu cores to match the GPU (assuming the actual math that would be done on the GPU is the limiting factor) would result in the cpu cost version being much higher

jovial heath
#

hello, i'm working on a CNN project using keras

final kiln
jovial heath
#

My result when I trained the model was val_accuracy 98.06
and my accuracy was 95.11

final kiln
#

I keep getting my quota requests wrong too

#

So I'm always operating on a subset of what's available

jovial heath
#

but my graph looks like this

#

i'm using a small dataset

buoyant vine
#

what do your other metrics say? I.e. F1, Recall, Precision?

jovial heath
#

i don't think i use those

#

my model is like this

#

i'm using reduce on plateu too

#

reduce lr on plateau

#

I saw someone saying it's good to use

final kiln
#

Try the reverse, starting with small LR, build up for a bit, then exponential decay
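
A sketch of that kind of schedule as a plain function: linear warmup to a peak learning rate, then exponential decay. The default values are placeholders, not tuned recommendations.

```python
def lr_schedule(step, base_lr=1e-3, warmup_steps=500, decay_rate=0.999):
    """Linear warmup to base_lr, then exponential decay per step."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr * decay_rate ** (step - warmup_steps)

print(lr_schedule(0))     # small starting rate
print(lr_schedule(499))   # 0.001, the peak at the end of warmup
print(lr_schedule(5000))  # decayed well below the peak
```

In Keras a function like this would need a thin wrapper that takes the epoch index before it can be handed to a learning-rate scheduling callback; the step-based version above is framework-agnostic.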

versed gulch
#

If I have a 2D array of values [[1, 2, 3, 4, 5, 6], [2, 4, 5, 1 , 1, 2]], is there way I can get a minimum of each column such that the values are above 2?
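
One way to do this with NumPy, sketched on the array from the question: mask out the values that fail the condition by replacing them with infinity, then take the column-wise minimum. A column with no value above 2 is left as `inf`.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6],
              [2, 4, 5, 1, 1, 2]])

# Replace values that fail the condition with +inf so they never win the min.
col_min = np.where(a > 2, a, np.inf).min(axis=0)

print(col_min.tolist())  # [inf, 4.0, 3.0, 4.0, 5.0, 6.0]
```

The first column contains only 1 and 2, so nothing qualifies there; `np.isinf(col_min)` can be used to detect such columns.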

primal agate
#

How much time did you spended

#

Spend*

#

Building this

amber cairn
#

Hello all, slightly off topic but I wonder if there's anybody aware of any app out of there, which in a similar manner like Duolingo, can support training in small chunks on data science challenges.
The idea is to get help developing those skill sets without digging into projects of any sort, or writing little scripts with no support to confirm that the implementation is correct

final kiln
#

Yeah Duolingo is pretty cool

#

Sometimes wish leetcode was like Duolingo

final kiln
#

the experiment is literally gonna take 24h to run

#

waiting on GPU quota again

#

I think aws makes it hard on purpose so that people don't overuse the spot instances

tacit basin
midnight harbor
#

Thanks @tacit basin

tacit basin
#

yeah for A100 gpu you get like 20 something hours per month on most expensive plan pro plus

#

the good thing about it is that they are usually available; on some other clouds that's not the case

teal lance
primal agate
#

It was nice for you

#

How much time did you spend

agile owl
#

what is the best library for GANs right now

teal lance
#

A good amount of time the next thing I need is to correlate the volume and compare assets to the vix for low volatility or high volatility

odd meteor
agile owl
#

I don't see how that's not a valid question

#

there are libraries for RNNs and CNNs

#

but yeah I guess you could say generative AI

odd meteor
# final kiln Yeah Duolingo is pretty cool

I was using this app to learn Deutsch around 2020. It was all fun and nice till they updated the app and introduced a gamified style of learning. I don't know if things have changed now

agile owl
#

they have admitted that they are not an education company but an entertainment company

serene scaffold
#

as a linguist, my professional opinion is duolingo bad

odd meteor
# agile owl I don't see how that's not a valid question

Of course, no question asked here is deemed invalid 😊
What I inferred from your original question was:

/Which library (framework) is best for GANs./

And by "library", if you were referring to Keras, PyTorch, TensorFlow etc., then I don't think there's a specific framework that's better than the others in that regard.

agile owl
#

my favorite foreign language is korean and they had the absolute worst korean lessons that's how I realized it was a scam

#

there's also libraries that have a bunch of networks already implemented like stable baselines 3

#

for use in particular contexts like reinforcement learning

final kiln
odd meteor
final kiln
#

After some time I'm gonna branch out, voice chat, see movies in German, etc

#

model keeps overfitting

#

what is it about sentiment analysis that makes it so easy to overfit

#

this stuff has been a huge success tho

#

I really just need AWS to, for the love of god, give me access to that juicy spot GPU already

#

been playing with quotas for almost a month

#

okay, I think I'm going to look for a larger dataset and data augmentation techniques, I refuse to believe that the transformer can't perform this task

final kiln
#

1.6M samples, I was working with 50k

buoyant vine
#

We have almost never any issue getting on demand instances, I dont think you'll ever get them on spot though

final kiln
#

I'm trying spot

buoyant vine
#

yeah but most of the time they are never available enough

final kiln
#

Imma cry

#

Just spent so much effort to get spot infra thing

buoyant vine
#

Normally when we do training runs we have retries on our scripts to spawn instances because we need to check other availability zones for available instances on demand

final kiln
#

Okay, I'm gonna search Google for a bit on this GPU shortage thing, see what I can come up with

buoyant vine
#

Have you tried some of the TRN / non-cuda instances?

#

does your tooling support it?

final kiln
#

It can boot up any ec2 on demand or spot instance

buoyant vine
#

nah I mean like your ML lib

#

i.e. PyTorch, since they are what are interacting with the hardware doing the math

final kiln
#

Oh, I don't know what you mean by TRN, I assumed it was some instance type

#

Is it like TPU type of thing

buoyant vine
#

there is TRN1 which is AWS' tpu thing

final kiln
#

I'm using pytorch, don't know if it supports TPU, but I assume it does

buoyant vine
#

you might have a better time getting some spot instances on those perhaps

final kiln
#

Oh that is clutch

buoyant vine
#

and still have a decent speedup

final kiln
blissful hatch
#

hey bro!

#

guess you're quite familiar with tensorflow

#

right?

final kiln
#

I got a GPU spot instance

mint palm
#

where can i learn low level working of LSTM

#

I know the states/gates and stuff

#

but i wanna know how embedding and iterations run though

amber cairn
blissful hatch
#

LSTM?

#

what does that mean?

teal lance
#

Idea to add on to my script

lavish swift
#

Does anyone have suggestions for a course on AI and more specifically LLMs? I don't mean creating a model, but topics should include:

  • Running a model LOCALLY and not just sending data to OpenAI
  • RAG
  • How to differentiate and pick a model to implement in the chain
  • LangChain (or other relevant libraries)
  • Fine-tuning (maybe?)
  • Ideally the course would also have a community to ask further questions

I'm doing some of this now, but it mostly feels like guessing. So I'd like to fill in some of my many knowledge gaps.

final kiln
#

This is so much work, y so much setup for this I don't get it

#

Y r people publishing 10GB sized images

#

What is life

#

I might've dropped the ball when defining the storage for the last AMI tho

#

Doesn't matter, might as well go over than go under

#

Storage is supposed to be cheap anyway

#

The instance I managed to catch is AMD based, which is why I'm still on this. Amazon Linux AMIs come with stuff for Nvidia

#

And it failed

#

AMI got corrupted for sure

#

Need to repeat from the original

#

Gonna give it a rest, is getting late

#

But it's a matter of time, tomorrow I'll finally have GPU

undone dust
#

hey 2 questions, should I learn how to use pytorch or tensorflow and what's a good video to get me into it?

serene scaffold
undone dust
serene scaffold
merry ridge
#

Is there a LaTeX or MathJax bot available to render math for this channel?

delicate apex
#

.help latex

strange elbowBOT
#
Command Help

```
.latex <query>
```
*Renders the text in LaTeX and sends the image.*
delicate apex
#

helpful embed, but yes - there it is

#

you can experiment with it #sir-lancebot-playground if you like, as well, especially as the resulting images do not have delete or revision features if you have incorrect latex input

serene scaffold
#

.latex \latex

#

What

delicate apex
#

.latex \LaTeX

strange elbowBOT
serene scaffold
#

Yay
Now I can be happy

merry ridge
#

Thanks

#

So I decided to enroll in a machine learning course focusing on neural networks. I don't know if it's just me, but I thought I was very comfortable with multivariable calculus, and this notation is really killing me. For example they wrote that given a model for a neural network X with depth N, the model

#

.latex $Y^i = F^i(\mathbf{X}) = f( \sum_{i_N} w_{N j_N}^i f ( w_{{N-1}, j_{N-1} }^{j_N} \ldots f(w_{1,j_1}^{j_0} X_{j_0})))$

#

I'm just going to compile it on my side and paste it as an image I guess

#

With some loss function:

#

Clearly has that the derivative with respect to the weights depends only on the outermost nested function so that

#

So my main confusion is that I have no idea how people are able to chew through this much notational complexity and just conclude something about the form of the partial derivative in such a blasé manner. Do people just not really care about the fine details? It took me nearly an hour to carefully keep every subscript and superscript in my head, understand what the equation was trying to do, and then apply the chain rule.

#

This is aimed at an upper senior undergraduate level, so it's not exactly ML for babies I guess. But I was kind of expecting a little bit more hand holding with respect to the computation.

odd meteor
odd meteor
# undone dust oh ok thanks and is there like a video a lot of people recommend or just start w...

Welcome to the most beginner-friendly place on the internet to learn PyTorch for deep learning.

All code on GitHub - https://dbourke.link/pt-github
Ask a question - https://dbourke.link/pt-github-discussions
Read the course materials online - https://learnpytorch.io
Sign up for the full course on Zero to Mastery (20+ hours more video) - https:/...

iron basalt
#

Switch courses.

undone dust
bold timber
#

I have a question about Bidirectional RNN. How does Bidirectional RNN work when there's a sentence like "I am ___ hungry, and I can eat half a pig."? Can Bidirectional RNN be used to fill in the blank?

left tartan
lavish swift
dense yarrow
#

i've been really sad lately because i'm struggling in the math course (probability, linear algebra, discrete math, but mainly probability) in my data science program. i feel inadequate and i was wondering if you guys have any tips or advice on how to improve my understanding and skills in those subjects? are there any youtube videos or anything that can help me understand how different math problems apply to different tasks in data science projects? I think if i understand how i'm going to use them in a job setting, it will help me learn better

#

i think i always struggled a little with probability even when i took stats courses before

wooden sail
#

chain rule is the name of the game
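
To make that concrete without the index notation, here is a sketch of the chain rule applied to a tiny two-layer network in matrix form. This is not the course's notation or derivation; the tanh hidden layer, the squared-error loss, and all names are assumptions chosen for illustration.

```python
import numpy as np

def forward(x, W1, W2):
    h = np.tanh(W1 @ x)   # hidden layer
    y = W2 @ h            # linear output layer
    return h, y

def loss(x, t, W1, W2):
    _, y = forward(x, W1, W2)
    return 0.5 * np.sum((y - t) ** 2)

def grads(x, t, W1, W2):
    """Gradients of the loss by the chain rule, outermost layer first."""
    h, y = forward(x, W1, W2)
    dy = y - t                  # dL/dy
    dW2 = np.outer(dy, h)       # dL/dW2 = dL/dy * dy/dW2
    dh = W2.T @ dy              # push the error back through W2
    dz = dh * (1.0 - h ** 2)    # tanh'(z) = 1 - tanh(z)^2
    dW1 = np.outer(dz, x)       # dL/dW1
    return dW1, dW2
```

Each line of `grads` is one factor of the chain rule, and comparing a single entry against a finite-difference estimate of `loss` is a cheap way to confirm that the indices were tracked correctly.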

agile owl
#

UNLIMITED POWER

agile owl
merry ridge
# left tartan As a CS grad student, I made the mistake of taking a stats class that was for bo...

I have absolutely no problem with graduate level statistical notation so if this is comfortable for you in CS I’ll just have to get used to it.

It is particularly annoying that this course uses a subscript in some cases and a super script in other cases to denote the same thing such as the index of the current epoch. It is making it very difficult for me to be able to just ignore a symbol that isn’t of interest at the moment because those symbols appear in multiple inconsistent locations.

amber cairn
#

Great day everybody.
I would love your opinion in understanding what could be the best approach in determining the impact in web site traffic changes given changes on the page.

I basically have historical data, and I know the point in time when changes occurred.

I'm not confident that causal impact analysis is the right direction, also because there's no other way to compare/confirm the impact.

What is your take/advice?

jovial heath
#

Hello, I need to do work that checks whether the game "beat saber" is being played or not. To do this, I separated some images of him standing still or playing, but the still images are very similar, does this interfere with the model?

#

If they are not like this, the database becomes too small

mild dirge
#

If they are all in the same position, the model could, for example, get a very good guess by checking just a single pixel in your data

#

Whereas you want it to learn that it is not playing when the position does not change

jovial heath
#

Is it better then for me to take similar images even if the database gets smaller?

serene scaffold
agile owl
#

all my threads are doing their duty

agile owl
#

PPO marches on in its inevitable but lengthy quest for convergence

agile owl
#

..w-when will it bend

#

lovely

teal lance
teal lance
lapis sequoia
#

honestly it still learns

#

people trained models on random labels and they still learn stuff

#

you could also train the model, see which images the loss is biggest on after training, and remove those

agile owl
#

vectorized reinforcement learning envs

#

this computer is living up to its vocation

#

it was given 64 virtual threads to be used

#

I feel bad for all the computers that are never used to their potentials, doing nothing but opening chrome tabs and copying memory around for stupid youtube video browsing

final kiln
#

with gpu and stuff

#

it was a lot of work because AMD has very bad ML support on AWS

#

in the end I found an available nvidia instance

#

so I didnt even manage to make the amd stuff work

#

their latest image is outdated, theres no aws ami for it, etc etc

merry briar
#

when u get the sus-est error ever

agile owl
#

ah that's satisfying

#

does anyone else enjoy learning curve charts

final kiln
#

I feel like I'm doing what I was doing b4 but now at an industrial level

#

Can spawn hundreds of training loops in dozens of GPU machines

#

Only limited by AWS quotas

#

And money ofc, even with spot the burn rate can become large

primal agate
#

I love data science

agile owl
#

nice. I was long TY today

final kiln
#

And I can track my loops on the go

#

ML=infra, all else is EDA

#

That's the lesson I'm taking

agile owl
#

how do I take a standard normal distribution and transform it into something that looks like a square root or log shape? what function can I use?

agile owl
#

chat gippity to the rescue

final kiln
#

I'm very tired rn, that thing took me 2 days to make

agile owl
#

we're gonna use boxcox transformation

#

from scipy.stats import boxcox

final kiln
#

Didn't even have lunch today

agile owl
#

it's called boxcox

#

lol

final kiln
#

You want a transform on a gaussian that transforms it into a sqrt or log shape, I never encountered that problem

agile owl
#

It's to make the reward function concave

#

so the agent is risk averse

final kiln
#

It's actually easy tho

#

Think point wise

#

You're solving an equation at every point

#

Like you want

#

gauss * f = sqrt

#

f = sqrt / gauss

#

Something like that

agile owl
#

yeah I see what you mean but I'd have to pick out points and do the math by hand

#

I'm not that smart

#

this is a great use for chat gpt

final kiln
#

No it's literally just dividing the samples from one by the other

agile owl
#

easily verifiable

final kiln
#

As long as no zeros

agile owl
#

the gaussian function is not easy to evaluate in my head

#

lol

final kiln
#

I'd just use numpy

#

My brain is v slow rn, I have to sleep

agile owl
#

return boxcox((self._get_return() - self.rate / 252) / self.return_volatility)

#

this is the reward function now

#

I expect a better mean variance ratio out of sample with this let's see what happens

candid spruce
#

Hi would anyone be willing to teach me ML using python 😄

willow pelican
#

If I want to go into data science, should I major in CS: ML or stats: data science

willow pelican
#

But a double major would be painful

agile owl
#

I'm actually using Yeo-Johnson with a lambda of -1 that's basically what I wanted
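
For reference, a small sketch of what Yeo-Johnson with a lambda of -1 does to values around zero, using `scipy.stats.yeojohnson`. The sample inputs are made up. Box-Cox is only defined for strictly positive inputs, which is one reason it is awkward for standardized returns that can go negative:

```python
import numpy as np
from scipy.stats import yeojohnson

# Hypothetical standardized excess returns, both negative and positive
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# With a fixed lambda, yeojohnson returns only the transformed array.
# For lambda = -1:  x >= 0 -> 1 - 1/(x + 1)          (gains saturate below 1)
#                   x <  0 -> -((1 - x)**3 - 1) / 3   (losses are amplified)
y = yeojohnson(x, lmbda=-1.0)
print(y)
```

So gains get squashed and losses get stretched, which is the risk-averse shape being described.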

#

I actually don't think it's trivial if it's named after someone tbh

#

will be interesting to see what happens with different values of lambda

primal agate
#

I would like to start with ML but I am not good enough at maths yet

#

Only one problem

#

but I'm really enjoying data science

#

it's kinda easy

#

and it's fun

#

imo

agile owl
#

ok bro lol

final kiln
# agile owl

Dude solve it numerically by sampling both functions, dividing one array by the other and cubic spline it

#

Easy

#

those equations are probably the result of the same procedure

#

But with the analytic expressions themselves instead of their samples

#

Which also does not look hard to do
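
A rough sketch of the sample, divide and spline procedure described above, read literally as solving gauss * f = sqrt pointwise. The grid range, the choice of the Gaussian density as "gauss", and the helper name `gauss_pdf` are assumptions for illustration:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def gauss_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

# Sample both functions on a grid that avoids zeros of the denominator
x = np.linspace(0.1, 3.0, 200)
ratio = np.sqrt(x) / gauss_pdf(x)      # f = sqrt / gauss, point by point

# Interpolate the sampled ratio so f can be evaluated anywhere on the grid range
f = CubicSpline(x, ratio)

# Check: gauss * f should reproduce sqrt at points between the samples
x_new = np.linspace(0.2, 2.9, 50)
max_err = np.max(np.abs(f(x_new) * gauss_pdf(x_new) - np.sqrt(x_new)))
print(max_err)
```

Note this rescales a curve. It is a different operation from transforming the samples of a random variable, which is what the Box-Cox / Yeo-Johnson route does.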

agile owl
#

yea I skipped real analysis sue me

final kiln
#

my real analysis was insane

#

the professor decided that he wanted to summarize the entire math field and teach it to 2nd year students

#

The memes were insane

#

Like dude was straight up teaching differential geometry

#

Which was only gonna be useful to the 20% of the class who would've eventually gone to MSc in physics

#

Sorry you triggered me by mentioning real analysis

#

._.

warm copper
#

hi fellow data scientists

final kiln
#

I'm eating a snack cuz otherwise I can't sleep

agile owl
#

my original idea was to just multiply everything less than 0 by 2 and everything greater than 0 by 0.5

#

which probably would have worked but I never tried it
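
That piecewise idea is a one-liner in numpy. A sketch, with `asymmetric_reward` as a made-up name and the factors 2 and 0.5 taken from the message above:

```python
import numpy as np

def asymmetric_reward(r):
    """Double negative rewards and halve positive ones."""
    r = np.asarray(r, dtype=float)
    return np.where(r < 0, 2.0 * r, 0.5 * r)

print(asymmetric_reward([-1.0, 0.0, 2.0]))
```

The result is piecewise linear with a kink at zero, so it penalizes losses more than it rewards gains without the smooth curvature of a log or Yeo-Johnson transform.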

agile owl
#

when ur algorithm makes a scientific breakthru

#

(actually these jumps are just an artefact of convolution smoothing of the episode rewards, the negative outliers make those big depressions)
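
A toy illustration of that smoothing artefact, assuming a plain moving-average kernel (the actual smoothing used for the chart isn't shown). One negative outlier in an otherwise flat reward series produces a flat-bottomed depression exactly as wide as the window, with jumps at both edges:

```python
import numpy as np

# Flat episode rewards with a single large negative outlier (made-up numbers)
rewards = np.ones(20)
rewards[10] = -99.0

window = 5
kernel = np.ones(window) / window
smoothed = np.convolve(rewards, kernel, mode="valid")

# Every window that contains the outlier averages to (4 * 1 - 99) / 5 = -19
print(smoothed)
```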

willow pelican
#

going into data science majors, is taking statistics in high school more valuable than a CS class? I feel like I can learn Python and other tools outside of school more easily than I could learn stats on my own

willow pelican
#

so, I think I'm going to play it safe with CS

#

then, If I find that I want to specialize on a certain thing, maybe I'll get a minor in it, or switch to it for my masters?

#

feel like that's the most logical way to go about it, at least for now. I like looking at the whole picture, probably don't need to though

agile owl
#

anyone have an example of using GANs to generate samples from correlated time series

#

if not I guess that's going to be my next project

final kiln
#

Milan has 16gb gpu at 7 cents

#

on aws

#

I've been applying LR as a function of the epoch, should I be doing it as a function of the current batch ?
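
On the epoch-versus-batch question, one way to compare the two is to write the schedule as a function of a global step and then feed it either the true step or the step at the start of the epoch. A sketch only: the warmup-plus-cosine shape, `base_lr`, `warmup_steps` and the epoch sizes are assumptions, not the setup from this chat:

```python
import math

def cosine_lr(step, total_steps, base_lr=1e-3, warmup_steps=100):
    """Linear warmup followed by cosine decay, as a function of the global step."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

steps_per_epoch = 500   # hypothetical number of batches per epoch
epochs = 10
total = steps_per_epoch * epochs

def lr_per_batch(epoch, batch_idx):
    # Updated every batch: the LR changes smoothly inside an epoch
    return cosine_lr(epoch * steps_per_epoch + batch_idx, total)

def lr_per_epoch(epoch, batch_idx):
    # Updated once per epoch: the LR is a staircase, constant within the epoch
    return cosine_lr(epoch * steps_per_epoch, total)

print(lr_per_batch(3, 0), lr_per_batch(3, 499))
print(lr_per_epoch(3, 0), lr_per_epoch(3, 499))
```

With few epochs the staircase version is coarse, and a warmup shorter than one epoch never gets applied properly, which is where per-batch updates tend to matter.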

#

the model overfits no matter what I change, culprit is data for sure, tho I think freezing the tokenizer and positional encoding would help a lot

#

increasing the distance between output and the tokenizer seems to help a lot

#

which is not intuitive since the number of parameters grows, so it should overfit more easily

#

my working hypothesis is that making the model grow that way slows down the convergence by a bit, so the final values on the loss val end up being shifted

agile owl
#

tfw you're not sure if you're gonna run out of RAM or not

final kiln
#

so there's two paths here

  • find a larger dataset and use that + data aug
  • modify the training procedure so that positional encoding is determined analytically and tokenizer is pre-trained

I'm tempted to tinker with the model, but experience has taught me that data is king, there's like a good chance that changing the dataset to higher quality stuff will make loss val converge in 10 nanoseconds to the planck scale ._.

peak patio
#

Hello,
I have equations like these (32 of them) that I need to solve:
i_6 + i_22 = i_3 + 83
i_12 =i_26 + i_7 - 114
i_16 =i_18 - i_5 + 51
i_30 - i_8 = i_29 - 77
i_20 - i_11 = i_3 - 76
..........................

I have tried to use sympy, but it's been 12 hours and the program is still running. Am I doing something wrong?

from sympy import symbols, Eq, solve
symbols_list = ['i_'+str(i) for i in range(32)]
vars_list = symbols(symbols_list)

equations = [
    vars_list[29] - vars_list[5] + vars_list[3] - 70,
    vars_list[2] + vars_list[22] - vars_list[13] - 123,
    .....
    vars_list[1] + vars_list[21] - vars_list[11] - vars_list[18] - 43
]

solution = solve(equations, vars_list)

for var in vars_list:
    print(f"{var}: {solution[var]}")
final kiln
#

you're trying to solve it symbolically

#

the best approach is to translate the problem into a matrix equation

#

Ax = B

#

then use numpy or scipy to solve it

#

can even solve it by hand
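
A minimal sketch of the matrix route with `numpy.linalg.solve`. The three-equation system is a made-up stand-in with the same flavor as the ones posted, not the real 32 equations:

```python
import numpy as np

# Hypothetical system, rearranged so all unknowns are on the left:
#   i_0 + i_1 = i_2 + 83         ->   i_0 + i_1 - i_2 =   83
#   i_1 = i_2 + i_0 - 117        ->  -i_0 + i_1 - i_2 = -117
#   i_0 - i_2 = 63               ->   i_0       - i_2 =   63
A = np.array([
    [ 1.0, 1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [ 1.0, 0.0, -1.0],
])
B = np.array([83.0, -117.0, 63.0])

# Requires a square, non-singular A (one independent equation per unknown)
x = np.linalg.solve(A, B)
print(x)  # values of i_0, i_1, i_2
```

If the equations turn out to be redundant or there are more equations than unknowns, `np.linalg.lstsq` is the usual fallback.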

peak patio
#

thanks

peak patio
# final kiln Ax = B

Can numpy or something else do that for me? Translate equations like ax+b=c+d into ax=c+d-b?
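
On automating the rearrangement: SymPy has `linear_eq_to_matrix`, which collects the coefficients of a list of linear equations into A and b, and the numeric solve can then be handed to numpy. A sketch on a made-up three-equation system (the equations and values are illustrative, not the real ones):

```python
import numpy as np
from sympy import symbols, Eq, linear_eq_to_matrix

i0, i1, i2 = symbols("i_0:3")

# Equations written the way they were posted, with terms on both sides
equations = [
    Eq(i0 + i1, i2 + 83),
    Eq(i1, i2 + i0 - 117),
    Eq(i0 - i2, 63),
]

# SymPy moves everything around and returns A and b such that A * x = b
A_sym, b_sym = linear_eq_to_matrix(equations, [i0, i1, i2])

# Convert to float arrays and solve numerically
A = np.array(A_sym.tolist(), dtype=float)
b = np.array(b_sym.tolist(), dtype=float).ravel()
x = np.linalg.solve(A, b)
print(x)
```

This only works when every equation really is linear in the chosen symbols.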

final kiln
#

but then the solving itself has to be a numerical approach, there are just too many equations

wooden sail
#

the best is doing that yourself on paper

#

you'd have to read the documentation of available solvers and then it's up to you to prepare the problem in a compatible way

wooden sail
#

sadly i'm not trolling you 😛 that's why people go learn this in uni

final kiln
#

it's easier than it looks, after some practice it will be second nature

#

I reckon most people who studied this can transform it into matrix form right from the equations you wrote without modifying them

wooden sail
#

you already have them in matrix form, just gotta move a few coefficients around

#

i really do suggest you grab a pencil and a piece of paper and write it down, it won't take you long

peak patio
#

Okay then

#

thanks

empty willow
#

Hey whats up guys

#

in your opinion, which language works well with Python?

lapis sequoia
#

is it possible to train a model with sine function without using LSTM? from what i see it doesnt work beyond training dataset

lapis sequoia
#

its using single value input and output, how am i supposed to use taylor series

final kiln
#

you should also normalize it

#

y = x % 2pi something of the sorts
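
A quick sketch of that wrapping step with numpy. Because sine is periodic, reducing the input modulo 2π keeps every value inside one period without changing the target, so inputs far outside the training range look like ones the model has seen. The function name `wrap_angle` and the sample inputs are made up:

```python
import numpy as np

def wrap_angle(x):
    """Map any real input into [0, 2*pi) without changing sin(x)."""
    return np.mod(x, 2.0 * np.pi)

x = np.array([0.5, 0.5 + 2.0 * np.pi, 0.5 + 20.0 * np.pi, -1.0])
wrapped = wrap_angle(x)
print(wrapped)
print(np.allclose(np.sin(wrapped), np.sin(x)))
```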

lapis sequoia
#

You will need to send like 100 last values as input

empty willow
#

hmm

agile owl
#

how do you expect it to learn sine at all just given a single datapoint

tender umbra
#

How to host a low traffic deep learning model?
So I want to host a deep learning model; an A10 seems good enough for my needs. I am not expecting a lot of traffic, so paying hourly for AWS EC2 or ECS doesn't seem like the best way. Can anyone point me to alternatives that charge on a per-API-call basis?

old radish
#

um so guys i wanna develop an ai application to detect if the user is looking at his computer what tools should i use and how do i do it?

true spade
#

Hi there, just curious, how do y'all modularize/organize your code in Jupyter notebooks?

For context, I have recently been given 2 problems to solve using different types of machine learning models (i.e. classification and regression models) and I am having difficulty splitting the code into individual functions that can be placed in another Python file (so as to avoid having the Jupyter notebook become too cluttered with long sections of code).

golden hill
#

Hi, guys. Can someone tell me some Python libraries for a beginner developer?

true spade
# golden hill data science

I see, that is a category of projects that can be done in Python, are you trying to do some data analysis on a dataset? Or are you trying to do something else?

#

If you are doing data analysis, the following libraries might be useful to you:

numpy
pandas
matplotlib (Used for plotting charts and visualizing data)
#

However, this is just a general list of libraries as I am not sure what exactly in data science you are trying to do

golden hill
#

I wanna make ai for sorting flowers

#

for example

#

dataset: rose(500img), sunflower(500img), chamomile(500img)

golden hill
#

have a nice day

#

when I make this, I'll tell you about it

golden hill
#

xd

final kiln
#

Keep learning rate small, I've done this before and LR was the final bullet

final kiln
#

Also note that the Taylor series of sine doesn't have all orders

#

x, x^3, x^5, etc.

tidal bough
#

omitting even powers is cheating :p
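
For reference, a small sketch of the odd-power Taylor expansion of sine around zero, sin(x) ≈ x - x^3/3! + x^5/5! - ... The function name `sin_taylor` and the number of terms are arbitrary choices:

```python
import math

def sin_taylor(x, n_terms=5):
    """Partial sum of the Taylor series of sin(x) around 0 (odd powers only)."""
    return sum(
        (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        for k in range(n_terms)
    )

# Accurate near zero, badly wrong far from it with the same number of terms
print(sin_taylor(1.0), math.sin(1.0))
print(sin_taylor(10.0), math.sin(10.0))
```

The second line is why wrapping the input into one period first matters: a fixed-degree polynomial only tracks sine on a bounded interval.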