final kiln Mar 10, 2024, 7:08 PM

#

sometimes the runner is my laptop, so I need to be careful

#

never heard of caddy, the automatic https thing sounds appealing

past meteor Mar 10, 2024, 7:11 PM

#

If you ever need to do anything webby like deploying MLflow, I'd use it ahead of nginx

final kiln Mar 10, 2024, 7:12 PM

#

I've been using traefik, it works well with docker compose, which has been the main orchestrator I use

past meteor Mar 10, 2024, 7:12 PM

#

Traefik is Caddy's main competitor yeah. It also works well with Docker compose (which is what I use as well)

final kiln Mar 10, 2024, 7:13 PM

#

tho the issue here is just that the free machine only has two cores, so the two workers get overrun very easily

past meteor Mar 10, 2024, 7:13 PM

#

Personally I deploy 1 Caddy for all my apps and not 1 per
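A minimal sketch of that one-proxy-for-all-apps layout, written as a Caddyfile. The hostnames and upstreams (`mlflow.example.com`, `mlflow:5000`, `app.example.com`, `app:8000`) are placeholders, not anything from this conversation; with Docker Compose the upstreams would typically be service names on a shared network.

```
mlflow.example.com {
    reverse_proxy mlflow:5000
}

app.example.com {
    reverse_proxy app:8000
}
```

Each site block gets its own certificate, so adding an app means adding one more block to the same Caddy instance rather than running another proxy. Automatic HTTPS assumes the hostnames resolve publicly to the machine running Caddy.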

final kiln Mar 10, 2024, 7:13 PM

#

I'll give it a try for sure, using nginx was a nightmare, especially with docker

past meteor Mar 10, 2024, 7:14 PM

#

Maybe you could look into getting an EC2 instance or similar to permanently host stuff for you

final kiln Mar 10, 2024, 7:15 PM

#

yeah I'll upgrade to a spot instance, use skypilot to get a new machine once it's taken away

#

it's like 5 bucks per month for one of the good ones, assuming constant usage, which wouldn't really be the case

#

5 or 10

past meteor Mar 10, 2024, 7:16 PM

#

What do you get for €5?

final kiln Mar 10, 2024, 7:16 PM

#

I don't recall, it was one of the cX machines, I just skimmed through to see what price I could get

long canopy Mar 10, 2024, 9:04 PM

#

man doing profiling in python sucks

past meteor Mar 10, 2024, 9:05 PM

#

What are you using?

long canopy Mar 10, 2024, 9:05 PM

#

am trying to see why training is slowly filling up my RAM then my swap, am using memory_profiler

#

the __call__ to DistilbertModel from huggingface's transformers increments memory usage by 100 mb on each call, so i'm trying to see what exactly is going on

#

also for some reason memray says peak memory usage is 600 mb, yet memory_profiler shows it going over 12 GB
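Profilers can disagree because they measure different things (process-level memory versus the allocations a tool is able to track), so a third, very small measurement can help narrow things down. The sketch below uses only the standard library's `tracemalloc` to record traced memory after each call of a function. `traced_growth` and `leaky_step` are hypothetical names for illustration; `leaky_step` stands in for a training step that keeps references alive. One caveat: `tracemalloc` only sees allocations made through Python's allocators, so memory allocated natively inside a tensor library may not show up here.

```python
import tracemalloc


def traced_growth(fn, n_calls=5):
    """Return Python-level traced memory (in bytes) after each call to fn."""
    tracemalloc.start()
    try:
        readings = []
        for _ in range(n_calls):
            fn()
            current, _peak = tracemalloc.get_traced_memory()
            readings.append(current)
        return readings
    finally:
        tracemalloc.stop()


# Hypothetical stand-in for a training step that accidentally keeps data alive.
_kept = []


def leaky_step():
    _kept.append(bytearray(1_000_000))  # roughly 1 MB retained per call


readings = traced_growth(leaky_step)
deltas = [b - a for a, b in zip(readings, readings[1:])]
print(deltas)  # per-call growth in bytes
```

If the per-call growth stays flat here while process memory still climbs, the retained memory is probably being allocated outside Python's allocators, which is a hint about where to look next.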

past meteor Mar 10, 2024, 9:17 PM

#

I always use Scalene https://github.com/plasma-umass/scalene

#

pretty good experience with it all round

long canopy Mar 10, 2024, 9:18 PM

#

past meteor I always use Scalene https://github.com/plasma-umass/scalene

nice had not heard of it, will be trying it out right now

#

thanks!

coral lotus Mar 10, 2024, 10:16 PM

#

Hi so I am trying to train a neural network to detect cells in an image using Faster R-CNN. I have a dataset consisting of 1300ish such images, which are labelled as well. Each image contains several cells, most of which are red blood cells and the rest of which are infected cells. Is there something pre-existing that I can use to train a network on my dataset of images?

past meteor Mar 10, 2024, 10:27 PM

#

coral lotus Hi so I am trying to train a neural network to detect cells in an image using fa...

Typically object detection is trained on the 20ish classes from the COCO dataset, if you want to do anything else (which you are) you'll have to finetune

serene scaffold Mar 10, 2024, 10:27 PM

#

coral lotus Hi so I am trying to train a neural network to detect cells in an image using fa...

Please only ask the same question in one place. You can link to the original question in other places.

coral lotus Mar 10, 2024, 10:28 PM

#

past meteor Typically object detection is trained on the 20ish classes from the coco dataset...

what do you mean

coral lotus Mar 10, 2024, 10:28 PM

#

serene scaffold Please only ask the same question in one place. You can link to the original que...

sorry my bad

desert oar Mar 10, 2024, 10:29 PM

#

past meteor I always use Scalene https://github.com/plasma-umass/scalene

new to me. it supports asyncio?

past meteor Mar 10, 2024, 10:29 PM

#

coral lotus what do you mean

Okay, there's a lot more than 20 https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/

desert oar Mar 10, 2024, 10:30 PM

#

coral lotus Hi so I am trying to train a neural network to detect cells in an image using fa...

with 1300 images you might be in the sweet spot of having enough data to train a good model but not needing a pre-trained base model. how many labels do you have? is it relatively balanced? how big are the images?

past meteor Mar 10, 2024, 10:31 PM

#

desert oar new to me. it supports asyncio?

Good question, I haven't used it for async (yet)

coral lotus Mar 10, 2024, 10:31 PM

#

desert oar with 1300 images you might be in the sweet spot of having enough data to train a...

the images are on average 2000kb and I'm pretty sure all the images are labelled. Here is the dataset if you want to take a look or in case i might be wrong: https://bbbc.broadinstitute.org/BBBC041

past meteor Mar 10, 2024, 10:31 PM

#

but I wouldn't know why

desert oar Mar 10, 2024, 10:32 PM

#

also @coral lotus from a human perspective, how hard is it to distinguish the images? are you interested in getting a useful estimate of the probability distribution over labels, or just minimizing prediction error on labels?

past meteor Mar 10, 2024, 10:33 PM

#

How good are you with object detection/segmentation already?

coral lotus Mar 10, 2024, 10:33 PM

#

just minimizing prediction error i guess. my main goal is just to make cell identification as accurate as possible

coral lotus Mar 10, 2024, 10:33 PM

#

past meteor How good are you with object detection/segmentation already?

i havent done anything with object detection or segmentation

desert oar Mar 10, 2024, 10:33 PM

#

oh yeah is this a detection/segmentation task, or are you just classifying the entire image?

past meteor Mar 10, 2024, 10:33 PM

#

coral lotus i havent done anything with object detection or segmentation

Doing 2 things at the same time is > 2x as hard as learning it one by one

coral lotus Mar 10, 2024, 10:33 PM

#

desert oar oh yeah is this a detection/segmentation task, or are you just classifying the e...

detection/segmentation

#

so this is what an average image looks like

past meteor Mar 10, 2024, 10:34 PM

#

If I were you I'd learn about object detection and segmentation first and then learn how to do what you're trying to do after

#

If you don't know enough about neural nets while you're doing that I'd advise you to do that as well

coral lotus Mar 10, 2024, 10:34 PM

#

im gonna be honest, this is for my science fair project thats coming up pretty quick so i dont have much time. But I don't think it should be too hard to do them both?

past meteor Mar 10, 2024, 10:35 PM

#

Gonna be honest and say that I (and most folks) won't want to walk you through the entire thing either but are definitely willing to help if you have specific questions

coral lotus Mar 10, 2024, 10:35 PM

#

i was going to use a faster rcnn framework to detect the cells in an image and then a cnn framework to classify the exact cells

desert oar Mar 10, 2024, 10:35 PM

#

@coral lotus if you just want to classify the entire image, this might be on par with mnist (considered easy/solved) depending on how distinct the infected cells are from healthy cells. for actually detecting/counting infected cells it might also be easy but i don't have experience with detection or segmentation and don't want to speculate

coral lotus Mar 10, 2024, 10:36 PM

#

I mean i guess its just classifying an image? not completely sure. like if there is one infected cell in an image then the whole image can just be considered as infected

#

because they are close up images of blood smears of patients

#

so if there is a single infected cell in an image then that means the patient is infected

coral lotus Mar 10, 2024, 10:37 PM

#

past meteor Gonna be honest and say that I (and most folks) won't want to walk you through t...

fair enough i cant blame you

past meteor Mar 10, 2024, 10:37 PM

#

Do you want to draw boxes on every infected cell or do you want to say "this image has infected cells"?

coral lotus Mar 10, 2024, 10:37 PM

#

I mean preferably drawing boxes, but just saying "this image has infected cells" would be enough for my project

past meteor Mar 10, 2024, 10:38 PM

#

As usual I agree with salt rock lamp

#

"This image has infected cells" is easy

#

Drawing boxes isn't too hard either but you may have to label data and that's the time consuming part

coral lotus Mar 10, 2024, 10:39 PM

#

yeah so like if this is an image, I'd just want it to say "this blood smear shows that the patient likely has malaria"

past meteor Mar 10, 2024, 10:39 PM

#

Unless your dataset is already labelled

coral lotus Mar 10, 2024, 10:39 PM

#

its already labelled

past meteor Mar 10, 2024, 10:40 PM

#

Then you can follow a guide such as https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html

coral lotus Mar 10, 2024, 10:41 PM

#

alright thank you

#

also by labelled i mean the dataset came with a json file consisting of thousands of lines of this

[{"image": {"checksum": "676bb8e86fc2dbf05dd97d51a64ac0af", "pathname": "/images/8d02117d-6c71-4e47-b50a-6cc8d5eb1d55.png", "shape": {"r": 1200, "c": 1600, "channels": 3}}, "objects": [{"bounding_box": {"minimum": {"r": 1057, "c": 1440}, "maximum": {"r": 1158, "c": 1540}}, "category": "red blood cell"},
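One of the gaps when following a torchvision-style detection guide is the box format. The annotations above store corners as row/column (`r`, `c`) pairs, while torchvision's detection models expect boxes as `[x_min, y_min, x_max, y_max]`, with x as the column and y as the row. A minimal conversion sketch follows; `to_xyxy_boxes` and `label_map` are hypothetical names, the sample record is the single object quoted above (the truncated JSON is closed by hand), and the integer ids are an assumption, chosen so that 0 stays free for the background class.

```python
def to_xyxy_boxes(record, label_map):
    """Convert one annotation record to [x_min, y_min, x_max, y_max] boxes and integer labels."""
    boxes, labels = [], []
    for obj in record["objects"]:
        lo = obj["bounding_box"]["minimum"]
        hi = obj["bounding_box"]["maximum"]
        # Columns are x coordinates, rows are y coordinates.
        boxes.append([lo["c"], lo["r"], hi["c"], hi["r"]])
        labels.append(label_map[obj["category"]])
    return boxes, labels


# The one object visible in the snippet above; the rest of the file is not shown.
sample_record = {
    "image": {
        "pathname": "/images/8d02117d-6c71-4e47-b50a-6cc8d5eb1d55.png",
        "shape": {"r": 1200, "c": 1600, "channels": 3},
    },
    "objects": [
        {
            "bounding_box": {
                "minimum": {"r": 1057, "c": 1440},
                "maximum": {"r": 1158, "c": 1540},
            },
            "category": "red blood cell",
        }
    ],
}

# Assumed mapping: id 0 is left for background, other categories would be added here.
label_map = {"red blood cell": 1}

boxes, labels = to_xyxy_boxes(sample_record, label_map)
print(boxes, labels)
```

In a dataset class these lists would then be turned into tensors for the `boxes` and `labels` entries of the target dictionary.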

desert oar Mar 10, 2024, 10:50 PM

#

coral lotus also by labelled i mean the dataset came with a json file consisting of thousand...

guess what: you actually have labeled image segments

#

at least according to that data

#

i would still start by just trying to classify the entire image infected or not. easier project, more forgiving
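For the whole-image route, the per-cell annotations can be collapsed into one label per image: an image counts as infected if any annotated object falls outside the healthy categories. This is a sketch under assumptions: `image_is_infected` is a hypothetical helper, only `"red blood cell"` is known from the snippet above, and the category `"infected (placeholder)"` is invented for the example. The dataset's own documentation should be checked for the real category names and for any other uninfected cell types.

```python
def image_is_infected(record, healthy_categories=frozenset({"red blood cell"})):
    """Return True if any annotated object is not in the set of healthy categories."""
    return any(obj["category"] not in healthy_categories for obj in record["objects"])


# Hypothetical records, trimmed to the fields this function reads.
healthy_record = {"objects": [{"category": "red blood cell"}, {"category": "red blood cell"}]}
infected_record = {"objects": [{"category": "red blood cell"}, {"category": "infected (placeholder)"}]}

print(image_is_infected(healthy_record))
print(image_is_infected(infected_record))
```

The resulting booleans can serve as targets for an ordinary image classifier, which matches the "one infected cell means the patient is infected" framing.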

#

but there should be plenty of introductory material on image segmentation as well. the data already being labeled with bounding boxes should help a lot

#

but again i would have to defer to other people regarding how to actually build a segmentation model

#

seems like a great practice task for me to learn 😆

past meteor Mar 10, 2024, 11:13 PM

#

coral lotus also by labelled i mean the dataset came with a json file consisting of thousand...

yeah, that's fine

#

you'll have to read the guide and plug in the gaps as you go

broken arch Mar 11, 2024, 12:43 AM

#

hello guys, I hope you're doing well. I was wondering if anyone can send me an interesting medium or big sized dataset for machine learning that they got some decent results with after working on it, tysm 🙏

warm copper Mar 11, 2024, 12:45 AM

#

.latex Suppose that $f_1$ is a model with $X$ features whose loss function is denoted $L_1$, whereas $f_2$ is a quadratic model with $X_2$ features whose loss function is denoted $L_2$. We want to show that:

$L_2 \geq L_1$

We know that the logistic loss function is:
$L(y,\hat{y}) = -y\log(\hat{y}) - (1 - y)\log(1 - \hat{y})$

Assume that the predicted probability for $f_1$ is $\hat{y}_1$ and $\hat{y}_2$ for $f_2$ on the same data point $(x_i, y_i)$:

Then, $\hat{y}_1 = f_1(x_i)$ and $\hat{y}_2 = f_2(x_i)$.

Since $X_2$ has the quadratic features of $X$, it contains all the features of $X$ as well. Thus, $X_2$ has at least as many features as $X$, which also means $f_2$ is more flexible to fit the data compared to $f_1$.

Because $f_2$ is at least as flexible as $f_1$ and both are optimized to minimize the loss function, we can conclude that:

$L_2 \geq L_1$

strange elbowBOT Mar 11, 2024, 12:45 AM

#

$latex.png$

warm copper Mar 11, 2024, 12:45 AM

#

heres the question:

#

.latex Suppose that $f_1$ is a model that optimally fits the data $(X,y)$, and $f_2$ is another model that optimally fits the data $(X_2,y)$, where $X_2$ are the quadratic features of $X$. Then the loss function value obtained by $f_2$ is always going to be at least equal to that for $f_1$. Try to come up with a solid mathematical argument that justifies this claim.

strange elbowBOT Mar 11, 2024, 12:45 AM

#

$latex.png$

warm copper Mar 11, 2024, 12:46 AM

#

correct idea? @wooden sail

wooden sail Mar 11, 2024, 12:47 AM

#

i'd say no

warm copper Mar 11, 2024, 12:48 AM

#

WHAT WHY?!

wooden sail Mar 11, 2024, 12:49 AM

#

none of that is true nor useful

#

by your argument, raising all the entries of x to the 0th power will also be "at least as flexible"

warm copper Mar 11, 2024, 12:53 AM

#

o.O

wooden sail Mar 11, 2024, 12:54 AM

#

i would try doing the math for the scalar case and then generalizing
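A sketch of what the scalar case might look like, under two assumptions that are not stated in the thread: the model is logistic regression with a bias term, and the quadratic feature set keeps the original feature $x$ alongside $x^2$. The symbols $w$, $v$, $b$ and $\sigma$ are introduced only for this sketch, $L_1$ and $L_2$ denote the optimal training losses, and "at least equal" is read as "at least as good", i.e. $L_2 \le L_1$.

```latex
\begin{align*}
f_1(x) &= \sigma(w x + b), \qquad f_2(x) = \sigma(w x + v x^2 + b), \\
L_1 &= \min_{w,\,b} \sum_i L\bigl(y_i,\ \sigma(w x_i + b)\bigr), \\
L_2 &= \min_{w,\,v,\,b} \sum_i L\bigl(y_i,\ \sigma(w x_i + v x_i^2 + b)\bigr) \\
    &\le \min_{w,\,b} \sum_i L\bigl(y_i,\ \sigma(w x_i + 0 \cdot x_i^2 + b)\bigr) = L_1 .
\end{align*}
```

The inequality holds because fixing $v = 0$ restricts the second minimisation to exactly the models of the first family, and a minimum over a larger set can only be smaller or equal (if a minimum is not attained, replace $\min$ with $\inf$). The step that does the work is this nesting of the two model families, not the number of features: if $X_2$ contained only $x^2$, or only $x^0$, setting a weight to zero would no longer recover $f_1$ and the argument would not go through.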

warm copper Mar 11, 2024, 12:54 AM

#

when ML is all about math

#

🥲

spring field Mar 11, 2024, 12:58 AM

#

it's always been about math, you could train models with pen and paper
it'd just take ages...

warm copper Mar 11, 2024, 12:58 AM

#

so teacher gave a line like this in the exam

#

there were tons of points on each side of the line (two different classes) separated by that line above

#

and asked this question:

#

This line of separation cannot be done using SVM

#

I said yes

#

we can't right?

#

its because SVM needs a linear decision boundary

#

unless we use a kernel trick

past meteor Mar 11, 2024, 1:22 AM

#

warm copper its because SVM needs a linear decision boundary

An RBF SVM is still an SVM

#

I think it's not a great exam question either way. You can make a new feature x' where you get a linear separation
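A small illustration of the "new feature x'" idea. The exam figure is not in this log, so the circular boundary below is purely an assumption for the example, and `ring_points`, `lift` and `threshold` are hypothetical names. Two classes that no straight line separates in (x1, x2) become separable by a single threshold once the feature x' = x1² + x2² is added.

```python
import math
import random

random.seed(0)


def ring_points(n, r_min, r_max):
    """Sample n points whose distance from the origin lies in [r_min, r_max]."""
    points = []
    for _ in range(n):
        radius = random.uniform(r_min, r_max)
        angle = random.uniform(0.0, 2.0 * math.pi)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


inner = ring_points(50, 0.0, 1.0)  # class A: inside the unit circle
outer = ring_points(50, 2.0, 3.0)  # class B: a ring surrounding class A


def lift(point):
    """New feature x' = x1^2 + x2^2 (squared distance from the origin)."""
    x1, x2 = point
    return x1 * x1 + x2 * x2


# In the lifted feature a single threshold, i.e. a linear boundary, splits the classes.
threshold = 1.5 ** 2
inner_ok = all(lift(p) < threshold for p in inner)
outer_ok = all(lift(p) > threshold for p in outer)
print(inner_ok, outer_ok)
```

A kernel SVM does a version of this lifting implicitly, which is why "an SVM needs a linear boundary" and "an SVM can fit this boundary" can both be defended depending on which feature space is meant.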

warm copper Mar 11, 2024, 2:17 AM

#

yeah

#

I was confused with that question tbh

#

i mean it can with kernel trick but also it cant @past meteor

#

question should have been specific

terse quarry Mar 11, 2024, 2:48 AM

#

Does anyone know a place where I can learn AI?

native bough Mar 11, 2024, 5:34 AM

#

youtube

abstract wasp Mar 11, 2024, 6:50 AM

#

Hi, anyone know what the N and M stand for here?

wooden sail Mar 11, 2024, 6:52 AM

#

abstract wasp Hi, anyone know what the N and M stand for here?

it means there is a set with N examples x_n, and another for y_m with M examples

abstract wasp Mar 11, 2024, 6:54 AM

#

wooden sail it means there is a set with N examples x_n, and another for y_m with M examples

Oh duh, thanks lol, appreciate you

midnight harbor Mar 11, 2024, 7:09 AM

#

Has anyone in this community started using the Google Gemini API following GPT-3, and could you provide insights into its strengths and weaknesses? Specifically, I'm interested in understanding its performance in terms of

pricing
speed
reasoning capabilities
multilingual understanding
controllability.
Any feedback would be greatly appreciated.

kindly Tag me

smoky hamlet Mar 11, 2024, 7:47 AM

#

Hey um ummm

#

Im gay

#

pithink

modern storm Mar 11, 2024, 7:58 AM

#

hi anyone that know about apis and python can help me out

#

i am trying to use novelai api to generate an image

final kiln Mar 11, 2024, 8:51 AM

#

I'm reviewing the MetaFormer paper. So the conclusion is that a local operation like average pooling can substitute global operations like scaled dot product. And by extension, a conv layer can substitute the avg pooling.

#

So here's what's bothering me. Did no one ever think of using CNNs for language modelling?

#

Even if that doesn't work well, it looks like a very small step to go from a CNN to a series of (conv + MLP) type of model

#

It would legit be the second thing I'd try

#

In fact, I thought the whole idea behind transformers is that CNNs, despite reducing dimensionality, can't capture long-range relations

#

So maybe the important feature is the multi-headed thing, which, granted, I wouldn't easily have thought of

#

Omg they even use the identity map to replace the attention module, I'm sooooo confused

#

I'm super suss'd out rn ngl

#

the -> means substitute, so in the case of the identity mapping they removed the avg pooling and used an identity, reducing the model to a series of MLPs with layer norm in between

#

that gives them 74.3%, which is obviously suss because the presence of a given token will not affect any of the others right

#

the most suss part is the hybrid stages

#

they sneakily don't include the results for [Attention, Attention, Attention, Attention]

#

and there's a clear increase as more attention modules are included

final kiln Mar 11, 2024, 9:10 AM

#

final kiln that gives them 74.3%, which is obviously suss because the presence of a given t...

this is for vision though, I need to see if there's anything funky going on with patch embedding, haven't looked into it yet

final kiln Mar 11, 2024, 9:22 AM

#

final kiln this is for vision though, I need to see if there's anything funky going on with...

does exactly what it sounds like it does

#

they have the code public tho, adding it as a todo to replicate their results and see what the missing row looks like: https://github.com/sail-sg/poolformer

desert oar Mar 11, 2024, 9:51 AM

#

final kiln that gives them 74.3%, which is obviously suss because the presence of a given t...

there can still be token interactions if there's a hidden layer

desert oar Mar 11, 2024, 9:53 AM

#

final kiln So here's what's bothering me. Did no one ever thought of using CNNs for languag...

how would you do it? same approach as transformers? mapping to low-dim space to apply the convolution filter, then un-mapping back to the original vector space?

final kiln Mar 11, 2024, 9:54 AM

#

desert oar there can still be token interactions if there's a hidden layer

the MLP acts on each individual embedding, so does layer normalization, maybe they changed normalization
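The point about per-token operations can be checked on a toy block. This is a sketch, not the paper's implementation: layer normalization is left out (it also acts per token), `per_token_mlp` is an arbitrary stand-in for the MLP, and every name here is hypothetical. With an identity mixer, changing one token leaves every other position's output untouched; with an average-pooling mixer, the change leaks into neighbouring positions.

```python
def per_token_mlp(vec):
    """Toy stand-in for the MLP: applied to one token embedding at a time."""
    return [max(0.0, 2.0 * v - 1.0) for v in vec]


def identity_mixer(tokens):
    """No token mixing: each position only sees itself."""
    return [list(t) for t in tokens]


def avg_pool_mixer(tokens, window=3):
    """Average each position with its neighbours inside a sliding window."""
    half = window // 2
    mixed = []
    for i in range(len(tokens)):
        neighbours = tokens[max(0, i - half): i + half + 1]
        mixed.append([sum(col) / len(neighbours) for col in zip(*neighbours)])
    return mixed


def block(tokens, mixer):
    """Residual token mixer followed by a residual per-token MLP."""
    mixed = mixer(tokens)
    hidden = [[a + b for a, b in zip(t, m)] for t, m in zip(tokens, mixed)]
    return [[a + b for a, b in zip(h, per_token_mlp(h))] for h in hidden]


tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0]]
perturbed = [[5.0, 5.0]] + tokens[1:]  # only the first token changes

id_base, id_pert = block(tokens, identity_mixer), block(perturbed, identity_mixer)
pool_base, pool_pert = block(tokens, avg_pool_mixer), block(perturbed, avg_pool_mixer)

print(id_base[1:] == id_pert[1:])   # positions 1..3 with the identity mixer
print(pool_base[1] == pool_pert[1])  # position 1 with the pooling mixer
```

This only shows that the blocks themselves do not mix tokens under the identity mixer. If the classifier head pools over all positions at the end, information from every patch would still reach the prediction, which could be part of why the identity variant is not hopeless for image classification.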

desert oar Mar 11, 2024, 9:54 AM

#

ooooh right

#

yeah good question then

#

i always forget that detail

final kiln Mar 11, 2024, 9:55 AM

#

desert oar how would you do it? same approach as transformers? mapping to low-dim space to ...

No I'd just slide a kernel over the batch, which is what they are doing with avg pooling

#

Like, in a way, it's a batch of images

#

I guess I can buy it for the avg pooling, but the identity mapping is suss

desert oar Mar 11, 2024, 9:56 AM

#

final kiln No I'd just slide a kernel over the batch, which is what they are doing with avg...

conv1d with 50 dimensions?

final kiln Mar 11, 2024, 9:57 AM

#

desert oar conv1d with 50 dimensions?

No like, a NxN kernel

desert oar Mar 11, 2024, 9:57 AM

#

oh i see

final kiln Mar 11, 2024, 9:57 AM

#

A single one ig

desert oar Mar 11, 2024, 9:57 AM

#

I'm sure it was tried, i just wouldn't know 😆

final kiln Mar 11, 2024, 9:58 AM

#

You'd think it was one of the first things to try right

desert oar Mar 11, 2024, 9:58 AM

#

ask during more normal US hours so one of the actual ML experts can weigh in

#

I like the interpretation of attention as a "token mixer"

#

But i think the main issue is likely to be what you said: text is less "local" than an image and you probably can't do without the long range token graph

final kiln Mar 11, 2024, 10:00 AM

#

Yeah could be an explanation, the patches of an image fit together in a very different way

desert oar Mar 11, 2024, 10:01 AM

#

@final kiln https://dennybritz.com/posts/wildml/understanding-convolutional-neural-networks-for-nlp/

#

2015

#

safe to say it has been tried

final kiln Mar 11, 2024, 10:05 AM

#

Pixels close to each other are likely to be semantically related (part of the same object), but the same isn’t always true for words. In many languages, parts of phrases could be separated by several other words.

#

Could be that it won't work for NLP

#

They used imagenet, and if they are doing classification, I can see this working

#

Even with identity, the patch embedding already packs a ton of information into the embedding

crisp raptor Mar 11, 2024, 10:57 AM

#

final kiln Could be that it won't work for NLP

Words aren't pixels...

final kiln Mar 11, 2024, 10:58 AM

#

crisp raptor Words aren't pixels...

Well they certainly are if you consider the input tokens, it's a 1xsequence_size image

crisp raptor Mar 11, 2024, 10:59 AM

#

final kiln Well they certainly are if you consider the input tokens, it's a 1xsequence_size...

It's not necessarily a semantic relationship though

final kiln Mar 11, 2024, 11:00 AM

#

When you embed the tokens you get something that resembles an image, quite a lot actually

#

Well this is actually a batch

#

But I reckon it would look fairly similar if you imshow a sequence of embeddings

desert oar Mar 11, 2024, 11:41 AM

#

final kiln When you embed the tokens you get something that resembles an image, quite a lot...

here's an idea I haven't seen: stack transformers in front of convolutions. the idea being that the transformer is learning a representation that is optimal for the CNN to operate on, which one would hope implies a kind of "localizing" effect, where you place relevant tokens adjacent to each other so that they can be picked up by a sliding filter

#

i don't think it would be necessarily better for text generation, but could be interesting linguistically

#

that's always been my intuition about what transformers do anyway, we've talked about that here before

#

i wonder if you could do some kind of "semantic filtering" on the output of a transformer stack, and then running a convolutional layer over something like a sliding average of tokens. less grammatical detail, more big picture

#

i should noodle around with that + nanogpt

#

or maybe something encoder-only like bert... need our NLP experts' opinion

final kiln Mar 11, 2024, 11:46 AM

#

desert oar here's an idea I haven't seen: stack transformers in front of convolutions. the ...

They sort of explored that idea

final kiln Mar 11, 2024, 11:46 AM

#

final kiln

In the hybrid section

#

But ofc, with avg pooling

#

The residual connections transfer information from earlier layers to later layers, so the order might not matter that much

desert oar Mar 11, 2024, 11:47 AM

#

final kiln But ofc, with avg pooling

right, but they didn't have multi-head attention at the time

final kiln Mar 11, 2024, 11:47 AM

#

desert oar i wonder if you could do some kind of "semantic filtering" on the output of a tr...

Yeah there's definitely a ton to explore here

desert oar Mar 11, 2024, 11:48 AM

#

so avg pooling of the attention-ized token sequence specifically

final kiln Mar 11, 2024, 11:48 AM

#

desert oar right, but they didn't have multi-head attention at the time

Oh, that's an ablation study on the transformer

#

They took out the scaled dot product and used an avg pooling

#

To show that the token mixer is not important

desert oar Mar 11, 2024, 11:49 AM

#

final kiln Oh, that's an ablation study on the transformer

in a way, sure. instead of five transformers, do three transformers followed by two convolutions

desert oar Mar 11, 2024, 11:50 AM

#

final kiln To show that the token mixer is not important

oh i was still thinking about text. i never actually looked into metaformer because image-related data never comes up in my work, i just glanced at the paper when you mentioned it

final kiln Mar 11, 2024, 11:50 AM

#

I'm replicating it for NLP

#

Quite curious on how it will turn out

desert oar Mar 11, 2024, 11:51 AM

#

as in, running the same experiment but on a text dataset?

#

yeah that will be interesting for sure

final kiln Mar 11, 2024, 11:51 AM

#

This one is already with a quadratic form attention, instead of Q K V

final kiln Mar 11, 2024, 11:51 AM

#

desert oar as in, running the same experiment but on a text dataset?

Yes

desert oar Mar 11, 2024, 11:51 AM

#

your metric tensor variant right?

#

how did that turn out by the way

final kiln Mar 11, 2024, 11:52 AM

#

The metric tensor one is not done yet, I'm gonna code it in cuda directly to impose all the weird conditions that I need

#

But the pytorch implementation did well for the Amazon dataset

#

It achieved close to SOTA performance

#

If you filter out those models that weren't pre trained beforehand

#

Like training Bert on next token pred with google level datasets and then fine tuning for sentiment analysis

#

I'm coding a bunch of other attention mechanisms today, including the avg pooling

desert oar Mar 11, 2024, 11:54 AM

#

cool!

final kiln Mar 11, 2024, 11:55 AM

#

It's taking a while because I was figuring out my ML workflow, I definitely got it down now

#

So after finishing the model stuff and the extra validation steps on the training loop

#

I gotta set up a couple more datasets and then I can trigger the experiments

#

Those are gonna take a looong while, so I can finally pause this and give a bit more attention to my job search 😅

final kiln Mar 11, 2024, 11:59 AM

#

final kiln

I mean, they did only use one dataset here, so I might do the same

#

Yeah I might keep it to text classification. The IMDB dataset is very good

#

Perhaps too good tho

desert oar Mar 11, 2024, 12:07 PM

#

final kiln It's taking a while because I was figuring out my ML workflow, I definitely got ...

well it's been exciting to watch your progress on all this 😆 I am enjoying it vicariously while I mess around with IoT data and maps at work

turbid drift Mar 11, 2024, 12:08 PM

#

Just tried data scraping https://www.city-data.com/ and I made sure to use a timeout between each request, anywhere from 1 to 2 seconds long (random), but my connection still got closed and the IP I was using was blocked. Thankfully I was using a VPN so I can try again, but how the heck is 1-2 seconds between requests too fast? I thought hosts would only deny requests that were like 0.1 seconds or less in between.

#

I can't make it much longer, because I don't have all that much time to wait (imagine if I was having to do this for a real job; they certainly wouldn't be that patient) and what if it encounters some problem with a random page partway through and I have to do the whole thing over again?

#

A little frustrated since I'm doing this for a portfolio project that my resume kind of hinges on. Why is 1-2 seconds too little anyway? If I wanted to ddos attack someone (which I don't), common sense would say I'd do it a lot faster than that.

final kiln Mar 11, 2024, 12:29 PM

#

desert oar well it's been exciting to watch your progress on all this 😆 I am enjoying it v...

It's been a ton of fun for me for sure. Honestly, I think I'm gonna do more of these long pauses. Definitely not frequently, but yeah, every few years I give myself 6 months to do wtv.

desert oar Mar 11, 2024, 12:45 PM

#

turbid drift Just tried data scraping https://www.city-data.com/ and I made sure to use a tim...

unfortunately their ToS prohibits automatic scraping, so we cannot help as per server rules.

https://www.city-data.com/terms.html

This license does not include any right to private or commercial collection, aggregation, copying, duplication, display or derivative use of the Service nor any use of data mining, robots, spiders, or similar data gathering and extraction tools for any purpose unless expressly permitted in advance in a written document signed by us. The sole exception is the limited right provided to general purpose internet search engines and non-commercial public archives that use such tools to gather information for the sole purpose of displaying hyperlinks to the Service, provided they comply with our robots.txt file.

#

in the future, i advise not basing a school project around ToS violation and circumvention of access control

desert oar Mar 11, 2024, 12:46 PM

#

final kiln It's been a ton of fun for me for sure. Honestly, I think I'm gonna do more of t...

it's been inspiring for me as well. i should really try to build a couple of 6-month sabbaticals into my financial plans at some point

#

@turbid drift this doesn't seem like a commercial site. so if you contact the owner of the site, they might be willing to provide you with a dataset.

turbid drift Mar 11, 2024, 12:47 PM

#

desert oar in the future, i advise not basing a school project around ToS violation and cir...

Yeah, I just sent them an email.

#

Still frustrated though, web hosts are so overly paranoid.

#

It should be obvious that I'm not an attacker.

desert oar Mar 11, 2024, 12:48 PM

#

yeah, good luck. this seems like a hell of a lot of work for a volunteer data aggregation project 🤔 are they making money off of this somehow?

desert oar Mar 11, 2024, 12:48 PM

#

turbid drift It should be obvious that I'm not an attacker.

but you are violating their stated ToS and they have every right to try to prevent you from doing that

turbid drift Mar 11, 2024, 12:49 PM

#

I know. Hopefully I can find another source with a more flexible ToS if this doesn't work.
I'd just do an analysis on an existing Kaggle project but apparently employers want me to actually come up with my own data and not just use something from there.

turbid drift Mar 11, 2024, 12:50 PM

#

desert oar yeah, good luck. this seems like a hell of a lot of work for a volunteer data ag...

There isn't anyone making money off of this; this is purely for my portfolio. It's my first major data analyst portfolio project.

#

More specifically, I'm wanting to get data on US sister cities, including information like connections between ethnic populations and whether or not they correspond with the countries represented in those sister cities.

#

For example, do US cities with sister cities in Asia tend to have higher-than-average Asian populations?

#

Seems easy enough for a beginner project, but challenging enough to demonstrate skills to an employer.

potent sky Mar 11, 2024, 12:55 PM

#

final kiln The metric tensor one is not done yet, I'm gonna code it in cuda directly to imp...

Interesting
What was this, is there some prev message you can quote?

desert oar Mar 11, 2024, 1:02 PM

#

turbid drift More specifically, I'm wanting to get data on US sister cities, including inform...

for the USA at least, that kind of data should be publicly and freely available through the US Census and related gov't agencies

#

given that this seems to be work-sponsored, why don't you bring this up with your manager / mentor / whoever and let them know that you might need to adjust your project, to avoid getting bogged down in a "gray-hat" web scraping task that's unrelated to the actual topic?

turbid drift Mar 11, 2024, 1:04 PM

#

Ah, it's not work-sponsored. I don't have a specific job I'm doing this for, this is just to make my portfolio look more attractive to any employers who are looking for data analysts.

desert oar Mar 11, 2024, 1:05 PM

#

oh

#

just pick a different project then

#

i get the desire to work on something particularly interesting, but imo there's no point wasting your time with a distraction. if you want to practice webscraping, start on wikipedia, which does permit scraping (within reasonable limits)

#

but for a data analyst job i think your attention will be better spent elsewhere. web scraping and related tasks can be extremely useful and can make you seem like a wizard. but focus on fundamentals is more important.

turbid drift Mar 11, 2024, 1:07 PM

#

So can I just look for a random, but information-rich project on Kaggle and just do a bunch of analysis on it, and have it be good enough for a portfolio project?

desert oar Mar 11, 2024, 1:07 PM

#

that job market is absolutely brutal right now as i'm sure you know. if you have a particular industry of interest, you might be able to gain an advantage by doing a project in that particular industry's domain.

desert oar Mar 11, 2024, 1:08 PM

#

turbid drift So can I just look for a random, but information-rich project on Kaggle and just...

ideally you'd still pick an interesting project with some kind of realistic "research question" that you can answer. you are always somewhat limited with public/low-cost data, but for certain topics (macroeconomics, meteorology) public data abounds, published by the USA and other governments

turbid drift Mar 11, 2024, 1:09 PM

#

I honestly don't enjoy data scraping that much, and I only really do it because I feel like I'm pressured to come up with data I gathered myself, otherwise employers won't think I'm desirable.

desert oar Mar 11, 2024, 1:09 PM

#

right, but there's a lot of data out there that you don't need to scrape from the web or call from an API in small batches

#

in general, getting data yourself can be very important. so i don't want to undersell it too much. but i think you're on the right track in not wanting to spend too much effort on it.

turbid drift Mar 11, 2024, 1:10 PM

#

I feel like I might be misunderstanding a lot about the industry too. I'm coming off the Google Data Analytics certificate by the way.

desert oar Mar 11, 2024, 1:11 PM

#

what kinds of jobs are you looking to get? are you looking for your first job in tech / data?

turbid drift Mar 11, 2024, 1:11 PM

#

Yeah, just any kind of data analyst/data science job, remote or nearby, involving Python, SQL, R, Microsoft Excel, all of which I can use well.

#

I have a Bachelor's in CS as well, along with my Google certificates.

#

So really just my portfolio is stopping me.

desert oar Mar 11, 2024, 1:12 PM

#

great, so you have the programming and technical skills. then you just need to show that you can put together a research question, make useful data visualizations, do some basic statistics, and write a coherent executive summary of your results.

turbid drift Mar 11, 2024, 1:13 PM

#

Yeah, and I was on the track to doing that with my sister cities project, and moving along quite nicely with it too. Had already gathered some very useful information.

#

So I'm overcoming that imposter syndrome.

desert oar Mar 11, 2024, 1:13 PM

#

based on that background you are probably a stronger programmer than 90-95% of data analysts and you might want to consider looking at more of a data science career path if you can get some work experience + a masters degree (ideally filling in the gaps you probably have in math, masters is optional but might be faster than grinding away for years at self study & looks stronger on a resume)

#

the sister cities thing is great, but unless you can get that data you might have to divert

#

what about just comparing US cities instead? come up with some kind of comparative analysis, the actual topic is less important than demonstrating that you can come up with interesting questions and answer them coherently

turbid drift Mar 11, 2024, 1:15 PM

#

Perhaps. I think another thing I'm afraid of is the thought in the back of my head that someone has probably done/found that info out before, and I'm just re-inventing the wheel, in which an employer might find that out and just accuse me of having copied from somewhere else.

desert oar Mar 11, 2024, 1:15 PM

#

i'd actually encourage spending less time on this particular project (maybe a few afternoons at most? just enough to answer your own question in a nice 2-page writeup) and then maybe go deploy your programming skills on some AI task. that could catch recruiter eyeballs.

#

of course, but who cares? you're not trying to get published in Econometrica, you're trying to get a job

turbid drift Mar 11, 2024, 1:17 PM

#

desert oar i'd actually encourage spending less time on this particular project (maybe a fe...

As someone whose highest level of math is Calculus 1 and discrete math, I heard AI needs at least multivariable calculus, doesn't it? I get the concept of gradient descent though, but not the implementation of it down to the little details.

desert oar Mar 11, 2024, 1:17 PM

#

turbid drift As someone whose highest level of math is Calculus 1 and discrete math, I heard ...

yes, but keep in mind that this also puts you at the 80-90th percentile of data "analysts" (as opposed to data "scientists")

#

that's probably an exaggeration but still, your imposter syndrome seems severe and you haven't even been hired anywhere yet

#

you are doing fine. scale back the project, learn to embrace everything being slightly fucked, and go get a job

#

"everything being slightly fucked" is a normal state of affairs in data. the best data analysts are the ones who get stuff done anyway.

#

your job is to show up and answer useful questions for the business. all you need to do right now is demonstrate that you can do that. the particular choice of research question is only interesting insofar as it demonstrates your ability to think about the real-world context behind the data and come up with an interesting question. but you have plenty of other skills you need to demonstrate too, so don't get hung up on that one aspect in particular.

gentle sierra Mar 11, 2024, 1:25 PM

#

Can anyone hop in my python post for a sec?

final kiln Mar 11, 2024, 1:57 PM

#

potent sky Interesting What was this, is there some prev message you can quote?

Aaaah I'd have to dig far back. The concept is simple tho, take out Q, K, V and have a metric tensor for calculating the dot product between each token and every other token. Plus a matrix at the start to project the tokens to a lower dimensional space. The rest remains the same

#

I spent 2 weeks training it thinking the performance was subpar but the dataset was just bad in and of itself, best models are getting 65% acc on Papers with Code, I was getting 55% before overfitting

potent sky Mar 11, 2024, 1:59 PM

#

final kiln Aaaah I'd have to dig far back. The concept is simple tho, take out Q, K, V and ...

To combat the problem of quadratic scaling of compute with scaling of num tokens?
Why metric tensor tho

final kiln Mar 11, 2024, 2:00 PM

#

turbid drift As someone whose highest level of math is Calculus 1 and discrete math, I heard ...

The jump from calculus to multivariate calculus is not that hard tho. The nabla operators all have an intuitive interpretation, and even come with descriptive names, "gradient" and "curl".

final kiln Mar 11, 2024, 2:00 PM

#

potent sky To combat the problem of quadratic scaling of compute with scaling of num tokens...

To make it more interpretable. It also lets me halve the number of parameters

potent sky Mar 11, 2024, 2:00 PM

#

final kiln I spent 2 weeks training it thinking the performance was subpar but the dataset ...

Nice

potent sky Mar 11, 2024, 2:01 PM

#

final kiln To make it more interpretable. It also lets me halve the number of parameters

I don't understand, how does that make it more interpretable?

final kiln Mar 11, 2024, 2:02 PM

#

potent sky I don't understand, how does that make it more interpretable?

You can view it as constructing pseudo-Euclidean metric spaces, which, despite sounding fancy, are actually easier to picture in your head than keys, queries and values

serene jolt Mar 11, 2024, 2:03 PM

#

ERROR: Failed building wheel for llama-cpp-python...I want to build a docker image and I got this error message

desert oar Mar 11, 2024, 2:05 PM

#

potent sky To combat the problem of quadratic scaling of compute with scaling of num tokens...

they noticed that it kind of just pops out of the matrix arithmetic:

Q = X @ Wq
K = X @ Wk
V = X @ Wv

Q @ K.T == (X @ Wq) @ (X @ Wk).T == (X @ Wq) @ (Wk.T @ X.T) == X @ (Wq @ Wk.T) @ X.T

you can set M = (Wq @ Wk.T) and then impose constraints like "M must be a metric tensor" (i.e. a symmetric matrix that defines the inner product, and hence distances and angles)
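
A quick numerical check of that identity, as a minimal NumPy sketch. The sizes (`n_tokens`, `d_model`, `d_head`) and the random weights are made up for illustration, and symmetrizing `M` is only one possible way to impose a constraint, not necessarily what final kiln does.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, d_head = 5, 8, 4  # illustrative sizes, not from the chat

X = rng.normal(size=(n_tokens, d_model))
Wq = rng.normal(size=(d_model, d_head))
Wk = rng.normal(size=(d_model, d_head))

# Standard attention scores (before scaling and softmax)
scores_qk = (X @ Wq) @ (X @ Wk).T

# Same scores through a single d_model x d_model matrix M
M = Wq @ Wk.T
scores_m = X @ M @ X.T
identity_holds = np.allclose(scores_qk, scores_m)

# One possible constraint (assumption): force M to be symmetric,
# so that score(i, j) == score(j, i)
M_sym = 0.5 * (M + M.T)
scores_sym = X @ M_sym @ X.T
scores_are_symmetric = np.allclose(scores_sym, scores_sym.T)
```

Note that `M` built this way has rank at most `d_head`, which is why a low-dimensional projection followed by a small matrix can stand in for it.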

past meteor Mar 11, 2024, 2:05 PM

#

Nice, I'm back on the hackathon grind and I have a top 6 placement (finalist) for my first

#

€3k pot for the winner

desert oar Mar 11, 2024, 2:06 PM

#

past meteor Nice, I'm back on the hackathon grind and I have a top 6 placement (finalist) fo...

congrats!

#

i never even tried entering a hackathon

past meteor Mar 11, 2024, 2:06 PM

#

I think you'd be great at them

#

The ones I participate in usually get won by a good mix of communication/business skills and tech stuff

#

So not just 1 or the other

#

Generally a nice pastime that sometimes gets you money and networking

desert oar Mar 11, 2024, 2:09 PM

#

past meteor I think you'd be great at them

what's the idea, you enter as a small team or individual, and then build something over 2 long sessions?

past meteor Mar 11, 2024, 2:16 PM

#

desert oar what's the idea, you enter as a small team or individual, and then build somethi...

I've done many. Some are team based and you had to build something in anything ranging from 4 hours to 2 days.

Others you enroll alone. The last one I did was actually one where you enrolled alone, got a random team and had an hour to conjure up an end-to-end data architecture and pitch it

desert oar Mar 11, 2024, 2:31 PM

#

past meteor I've done many. Some are team based and you had to build something in anything r...

how do you find them to enroll in? meetup.com kind of thing? was it "programming" or data-specific?

past meteor Mar 11, 2024, 2:32 PM

#

I only participate in data science/ML ones. Typically just LinkedIn or even Facebook ads

desert oar Mar 11, 2024, 2:32 PM

#

i'll keep an eye out!

boreal gale Mar 11, 2024, 2:34 PM

#

how would you describe the level of participants in a regular DS/ML hackathon?
always wanted to join one but didn't want to be dead weight, especially after not doing DS properly for ages..

past meteor Mar 11, 2024, 2:35 PM

#

Industry professionals

serene scaffold Mar 11, 2024, 2:36 PM

#

boreal gale how would you describe the level of participants in a regular DS/ML hackathon? a...

I would just do it. everyone who didn't form teams in advance is taking a risk wrt their teammates.

boreal gale Mar 11, 2024, 2:36 PM

#

that's a fair point

serene scaffold Mar 11, 2024, 2:37 PM

#

(and the winners will probably be a team of high-performing professionals who formed a team in advance--ngl)

past meteor Mar 11, 2024, 2:37 PM

#

Luckily the ones I participated in after graduating were ones where you didn't form teams ahead of time

#

But I think the level of the regulars here is probably higher than the people I see participating

#

Soft skills matter a lot

#

Personally the only thing I'm sure I can do well is presenting/pitching so that's always my angle

final kiln Mar 11, 2024, 3:17 PM

#

Attention mechanisms done, gonna code a bunch more metrics to get a complete report at the end of each run

#

Time to start thinking about how I can use the metric tensor symmetry to optimize for speed, and also, how am I gonna code c++ CUDA kernels through rust into torch or, through torch into rust ? Idk, but however they did it to make these rust bindings in the first place, I gotta do the same

#

I'm actually gonna think about this first before thinking how to code the kernels. That part will be much easier and idk how feasible the integration will be, so that goes first

tiny stag Mar 11, 2024, 3:28 PM

#

hmm guys do yall think i should master machine learning? like ive got a lot of knowledge and experience with simple software dev but was wondering bout ml and whether its worthwhile or not

grizzled sail Mar 11, 2024, 4:10 PM

#

tiny stag hmm guys do yall think i should master machine learning? like ive got a lot of kn...

that really seems like a question you need to ask yourself before anyone else, but beside all that, ive heard quite a few times recently that companies are hiring less and less ai/ml/ds devs after the boom that happened 5-10 years ago

desert oar Mar 11, 2024, 4:13 PM

#

serene scaffold I would just do it. everyone who didn't form teams in advance is taking a risk w...

have you done it as well? what was your experience like?

serene scaffold Mar 11, 2024, 4:30 PM

#

desert oar have you done it as well? what was your experience like?

I only did one hackathon (the year before covid started), and I gave up before it ended.

trim jewel Mar 11, 2024, 4:56 PM

#

I was doing a project on video highlights generation on python, can someone let me know how can i use the text from the video in deciding certain "key" moments from the video?

serene scaffold Mar 11, 2024, 5:07 PM

#

trim jewel I was doing a project on video highlights generation on python, can someone let ...

This would be a challenging beginner project.

You need the transcription of the video to have timestamps associated with each part. I assume you have that.

You'll want to look into techniques for determining which part of a text is most salient.

#

so you'll want to google stuff like "saliency detection nlp"
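
As a very rough starting point, here is a sketch that ranks timestamped transcript segments by how rare their words are across the whole transcript. This is a crude stand-in for proper saliency detection, not an established method; the function name `rank_segments`, the `(start, end, text)` segment format, and the example sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def rank_segments(segments, top_k=2):
    """segments: list of (start_seconds, end_seconds, text). Returns the top_k by score."""
    tokenized = [re.findall(r"[a-z']+", text.lower()) for _, _, text in segments]
    # Document frequency: in how many segments does each word appear?
    df = Counter(word for words in tokenized for word in set(words))
    n = len(segments)
    scored = []
    for (start, end, text), words in zip(segments, tokenized):
        if not words:
            continue
        # Average inverse document frequency: rare words push the score up
        score = sum(math.log(n / df[w]) for w in words) / len(words)
        scored.append((score, start, end, text))
    scored.sort(reverse=True)
    return [(start, end, text) for _, start, end, text in scored[:top_k]]

# Hypothetical transcript segments with timestamps
segments = [
    (0.0, 4.0, "welcome back to the show"),
    (4.0, 9.0, "the show is back after the break"),
    (9.0, 15.0, "unbelievable goal in the final minute"),
]
highlights = rank_segments(segments, top_k=1)
```

The returned timestamps are what you would use to cut and stitch the clips back together.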

trim jewel Mar 11, 2024, 5:08 PM

#

i was thinking about how i was even going to stitch them back up, guess i have to do timestamps, i was just happy the transcription came out fine

potent sky Mar 11, 2024, 5:33 PM

#

past meteor Nice, I'm back on the hackathon grind and I have a top 6 placement (finalist) fo...

Ayy congrats 🎉

potent sky Mar 11, 2024, 5:36 PM

#

final kiln You can view it as constructing pseudo-Euclidean metric spaces, which, despite s...

Ahh positive definite scalar forms
I'm actually doing some research at the intersection of ML, information theory and algebraic topology atm so this is quite interesting xd

potent sky Mar 11, 2024, 5:37 PM

#

desert oar they noticed that it kind of just pops out of the matrix arithmetic: ```python Q...

Interesting
Thanks for the illustration!

past meteor Mar 11, 2024, 5:37 PM

#

I think my sklearn wrapper for Torch stinks a bit 😂

Training time is pretty invariant to the size of the net which means all the time is spent loading data to the GPU

#

If I care enough I'll refactor the hyperparameter search to use raw torch and the rest my wrapper

potent sky Mar 11, 2024, 5:41 PM

#

final kiln Time to start thinking about how I can use the metric tensor symmetry to optimiz...

Rust for CUDA has been a passive problem I've been looking at for some time
The cuda ecosystem for rust isn't really mature as far as I could find
Lmk if you make any headway!

final kiln Mar 11, 2024, 5:41 PM

#

potent sky Ahh positive definite scalar forms I'm actually doing some research at the inter...

I'm not sure how much fancy stuff it'd be possible to do with it since it doesn't really give very interesting spaces like the ones you see for general relativity. It's more like Minkowski spaces and really, the only thing that matters is how it affects the angles between the embeddings.

But like, the point is just that, to get the interest of people who study this kind of math, I'm not entirely sure what kind of insights can be extracted, but I think it's a step in the right direction.

past meteor Mar 11, 2024, 5:41 PM

#

Do you guys think ML on tabular data is a solved problem?

#

If my job was more industry, less research I'd just create lags for my time series, throw it into xgboost and call it a day
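
For anyone unfamiliar with the "lags" part, a minimal sketch of turning a series into a tabular dataset. The helper name `make_lag_features` and the toy series are assumptions for illustration, not past meteor's actual pipeline.

```python
import numpy as np

def make_lag_features(series, n_lags):
    """Return (X, y) where row t of X holds the n_lags values preceding y[t]."""
    series = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    # Column k holds the value k + 1 steps before the target
    X = np.column_stack(
        [series[n_lags - k - 1 : len(series) - k - 1] for k in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y

# Toy series: each target is predicted from the two values before it
X, y = make_lag_features([1, 2, 3, 4, 5, 6], n_lags=2)
```

`X` and `y` can then go into any tabular regressor, for example xgboost's `XGBRegressor`. Just remember to split train and test by time rather than at random.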

final kiln Mar 11, 2024, 5:42 PM

#

potent sky Rust for CUDA has been a passive problem I've been looking at for some time The c...

I'm actually gonna code c++ CUDA and bind it into torch, which I'll then bind into rust

potent sky Mar 11, 2024, 5:42 PM

#

final kiln I'm not sure how much fancy stuff it'd be possible to do with it since it doesn'...

Prodding at something from all the different angles is great, if nothing else we just find a lot of angles that don't work and why they don't work

past meteor Mar 11, 2024, 5:42 PM

#

The client paid for exotic nets, so the client gets exotic nets but it's a bit of a waste of time

potent sky Mar 11, 2024, 5:42 PM

#

final kiln I'm actually gonna code c++ CUDA and bind it into torch, which I'll then bind in...

Why not bind it directly to rust

past meteor Mar 11, 2024, 5:43 PM

#

The improvement vs lags+xgboost is so marginal

wooden sail Mar 11, 2024, 5:43 PM

#

potent sky Prodding at something from all the different angles is great, if nothing else we...

maybe you're interested in "information geometry", where they look for manifolds and the corresponding geodesics and metrics to be able to train with less data and measure distances more sensibly

final kiln Mar 11, 2024, 5:43 PM

#

potent sky Why not bind it directly to rust

Because this way is easier since I can cheat by looking how they did it for the torch rust bindings I'm using

potent sky Mar 11, 2024, 5:43 PM

#

past meteor Do you guys think ML on tabular data is a solved problem?

I guess we can say we're more of a fraction through to 'solved' than we are for vision, audio etc.

past meteor Mar 11, 2024, 5:44 PM

#

They're each other's inverse in the sense collecting data for those is a bit easier but more heavy weight models are required

potent sky Mar 11, 2024, 5:45 PM

#

wooden sail maybe you're interested in "information geometry", where they look for manifolds...

Yessss exactly
That and I'm also looking into manifolds that give a better "native" representation for particular types of data
Like negative curvature Riemannian manifolds for hierarchical structures (power law)

wooden sail Mar 11, 2024, 5:46 PM

#

don't let me trick you into thinking i know what i'm doing though, i just know these things exist. i dabble at most tangentially in that i work a lot with fisher information, which happens to define a metric tensor in special cases

potent sky Mar 11, 2024, 5:46 PM

#

final kiln Because this way is easier since I can cheat by looking how they did it for the ...

oh lol true ig
Can you share those torch rust bindings btw
I've seen you share code written in that a few times here and it was so sleek and nice (obviously, rust xd)

final kiln Mar 11, 2024, 5:47 PM

#

potent sky oh lol true ig Can you share those torch rust bindings btw I've seen you share c...

https://github.com/LaurentMazare/tch-rs

potent sky Mar 11, 2024, 5:47 PM

#

wooden sail don't let me trick you into thinking i know what i'm doing though, i just know t...

Is this in any way related to fisher vectors

final kiln Mar 11, 2024, 5:48 PM

#

Information geometry is an interdisciplinary field that applies the techniques of differential geometry to study probability theory and statistics. It studies statistical manifolds, which are Riemannian manifolds whose points correspond to probability distributions.
first time I've actually been interested in studying any kind of stats

potent sky Mar 11, 2024, 5:48 PM

#

And you probably know more than me haha
I invade the algebraic topology territory starting from ML and information theory, that's more my home ground

wooden sail Mar 11, 2024, 5:48 PM

#

potent sky Is this in any way related to fisher vectors

i had never heard of them, but it looks like it. based on the fisher score

#

log likelihood and what not

potent sky Mar 11, 2024, 5:50 PM

#

ooh interesting I'll have to look into it
Fisher vectors are more of a toy I like to use from time to time
Very elegant, but DL beats them for most applications

potent sky Mar 11, 2024, 5:50 PM

#

final kiln https://github.com/LaurentMazare/tch-rs

Thanks!

final kiln Mar 11, 2024, 5:53 PM

#

potent sky Prodding at something from all the different angles is great, if nothing else we...

I'm actually yet to see any sign it won't work, I'm replicating a study (made for vision) that claims that the attention mechanism is inconsequential. they at one point even substitute it for an identity mapping

#

in all likelihood, I think the identity and avg pooling won't work as well as for vision

#

(as substitutes for attention)

potent sky Mar 11, 2024, 5:54 PM

#

final kiln I'm actually yet to see any sign it won't work, I'm replicating a study (made fo...

I think I've heard of this paper

final kiln Mar 11, 2024, 5:54 PM

#

but anything else will work and the network doesn't care as long as you give it a way of comparing the tokens

potent sky Mar 11, 2024, 5:54 PM

#

Lots of complaints when it came out 😂

final kiln Mar 11, 2024, 5:55 PM

#

potent sky Lots of complaints when it came out 😂

yeah I've had my own

potent sky Mar 11, 2024, 5:55 PM

#

I haven't dived into it myself tho

potent sky Mar 11, 2024, 5:55 PM

#

final kiln but anything else will work and the network doesn't care as long as you give it ...

True

final kiln Mar 11, 2024, 5:55 PM

#

final kiln

.

potent sky Mar 11, 2024, 5:55 PM

#

Are you trying to reduce it from quadratic scaling or is that not a concern for you

final kiln Mar 11, 2024, 5:56 PM

#

it's not currently a concern, I've just halved the number of parameters in the attention head and will use the symmetry to reduce the number of operations

#

I'm more interested in the interpretability and in replicating those guys results

potent sky Mar 11, 2024, 5:58 PM

#

ahh nice fair enough
I've very recently started keeping an eye on ways to reduce the quadratic scaling problem in attention
Massive benefits if we can find a way, but of course it's a difficult task and we're not currently in a place to drop everything else and focus on that

#

Has anyone here read the Retentive net paper

final kiln Mar 11, 2024, 5:59 PM

#

I do have a couple ideas

#

the most straightforward way is to make it a funnel like structure like you do with UNETs

#

there's no reason for the output dimension to be the same as the input dimension for the attention module

potent sky Mar 11, 2024, 6:00 PM

#

final kiln there's no reason for the output dimension to be the same as the input dimension...

We actually played around with this a bit

#

Nice

final kiln Mar 11, 2024, 6:00 PM

#

I also suspect you can take half of the network and have it do convolution, like

#

imagine two branches with a series of attention modules

#

the first branch does self attention, and the second branch does conv

#

you can then feed one into the other like you do with encoder decoder

#

the conv captures local relations, the attention mechanism captures global ones

potent sky Mar 11, 2024, 6:02 PM

#

But the self attention will still scale quadratic right

final kiln Mar 11, 2024, 6:02 PM

#

the hope would be that you wouldn't need to scale as much embedding dimensionality, since part of the burden has been shifted to a different branch

#

so it doesn't actually completely solve it

potent sky Mar 11, 2024, 6:03 PM

#

Ahh hmmmm

#

And how will the self attention be incentivized to prioritise only learning global context, just backprop?

final kiln Mar 11, 2024, 6:05 PM

#

potentially via masking of the attention scores

potent sky Mar 11, 2024, 6:19 PM

#

final kiln potentially via masking of the attention scores

Of the tokens close by? I'm not sure if that's a good idea.
For starters tokens are often sub-words and such
You might only get enough useful information to compute relative global context when taking a bunch of close tokens together

final kiln Mar 11, 2024, 6:22 PM

#

potent sky Of the tokens close by? I'm not sure if that's a good idea. For starters tokens ...

That would be captured by the conv layers, the self attention is followed by cross attention that is fed with the conv layer results

#

With residual connections info is never really lost

potent sky Mar 11, 2024, 6:23 PM

#

final kiln That would be captured by the conv layers, the self attention is followed by cro...

But it would be required in the self attention being computed, itself

potent sky Mar 11, 2024, 6:24 PM

#

final kiln With residual connections info is never really lost

Hmm yeah

final kiln Mar 11, 2024, 6:24 PM

#

Uhm, not sure if I understand

#

Each token would suffer influence from far away tokens only

potent sky Mar 11, 2024, 6:25 PM

#

The self attention computation being performed is on masked vectors? Or masking is done after the computation?

final kiln Mar 11, 2024, 6:27 PM

#

On the attention scores, like you do to make it causal
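
A minimal NumPy sketch of what that masking could look like: nearby positions get a score of negative infinity before the softmax, the same trick used for causal masks. The function name `far_only_attention` and the `window` parameter are made up for illustration, and whether this actually helps is exactly the open question in this thread.

```python
import numpy as np

def far_only_attention(scores, window):
    """Softmax over scores after hiding tokens within `window` positions."""
    n = scores.shape[0]
    idx = np.arange(n)
    # True where |i - j| <= window, i.e. the nearby pairs to hide
    near = np.abs(idx[:, None] - idx[None, :]) <= window
    if near.all(axis=1).any():
        raise ValueError("window is too large: some token has nothing left to attend to")
    masked = np.where(near, -np.inf, scores)
    # Numerically stable softmax per row
    masked = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
scores = rng.normal(size=(6, 6))  # stand-in for raw attention scores
weights = far_only_attention(scores, window=1)
```

Each row of `weights` still sums to 1, but a token puts zero weight on itself and its immediate neighbours.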

#

But really the only way to know if it works is to try it out

#

I think my next project is gonna be to write the torch bindings for that new programming language

#

Gleam

#

I'd make it cool tho, using the compiler to aid the ML dev process

final kiln Mar 11, 2024, 6:31 PM

#

final kiln But really the only way to know if it works is to try it out

But b4 that I might try this

#

There's so much cool stuff to do ._.

potent sky Mar 11, 2024, 6:38 PM

#

final kiln There's so much cool stuff to do ._.

Yeah ikr 😭

#

I just try to find justifications (excuses) to shoe-horn it in my work time 😂

final kiln Mar 11, 2024, 7:02 PM

#

That's one way to do it

lapis sequoia Mar 11, 2024, 7:52 PM

#

If anyone wants to build a chatbot with custom data, DM me to join the project

hollow reef Mar 11, 2024, 10:57 PM

#

facing a moral dilemma with my personal project
on one hand, i do not want to use AI to work on it because then it feels like less of my baby
on the other... the practicality it offers in eliminating redundancy and whatnot is unmatched, because i'm working with a lot of math and i'm not all that great at python yet

i'm considering a compromise to be only using it to handle the more rote things like big tables/dicts/definitions, or to figure out the harder coding problems that i'm still learning
maybe i could only use it for learning how to code?

idk, thoughts on coding with AI?

left tartan Mar 11, 2024, 11:42 PM

#

I think you'll find you'll get over the "not all that great at python yet" if you put down the AI and muscle through it.

desert oar Mar 11, 2024, 11:43 PM

#

wooden sail don't let me trick you into thinking i know what i'm doing though, i just know t...

what do you do with fisher information specifically?

desert oar Mar 11, 2024, 11:44 PM

#

left tartan I think you'll find you'll get over the "not all that great at python yet" if yo...

as an illustrative example, i have a junior colleague who refuses to put down the AI, and as a result his progress is very slow

#

it seems causal, because we consistently make more progress in our 1-hour socratic discussion sessions (so what do you think we should do in XYZ situation?) than he does on his own time

#

he clearly knows how to do the stuff. he just has it in his head somehow that the AI is definitely helping him even when it seemingly is holding him back

left tartan Mar 11, 2024, 11:46 PM

#

desert oar it seems causal, because we consistently make more progress in our 1-hour socrat...

I have a similar view on leetcode and similar puzzle questions: the best problems are the ones where you're stuck on them for days, and then think your way through it.

#

The process is the point, not the result

desert oar Mar 11, 2024, 11:46 PM

#

yeah! if he'd just put down the damn AI and think + write notes on pen and paper, he'd make a ton of progress and become very strong

desert oar Mar 11, 2024, 11:46 PM

#

left tartan The process is the point, not the result

this, yes. he doesn't seem to get that the part where you think hard and you feel like you don't know what you're doing is the part where you're learning

#

i used to be so afraid to feel like i didn't know what i was doing. it took me so long to embrace that feeling. i'd have been toast in school using AI for everything, even as it is, wolfram alpha did me no favors in learning calculus. i dealt with the consequences of that laziness for years.

#

so i get it

#

but, learn from my mistakes

left tartan Mar 11, 2024, 11:49 PM

#

I half-joke, but perhaps the saving grace of this AI craze is job security for the rest of us

iron basalt Mar 11, 2024, 11:50 PM

#

desert oar yeah! if he'd just put down the damn AI and think + write notes on pen and paper...

Rubber duck time.

#

Also if you don't feel uncomfortable, you are probably not learning, much like how not feeling uncomfortable while working out means you are probably not making progress.

#

(There is no way to minimize / avoid it, but we try really hard (human nature to avoid things that make us uncomfortable), and so turn to stuff like AI)

#

(This can be anything else, like watching tutorials instead of actually doing it (just watching is not uncomfortable))

midnight harbor Mar 12, 2024, 12:34 AM

#

Has anyone in this community started using the Google Gemini API following GPT-3, and could you provide insights into its strengths and weaknesses? Specifically, I'm interested in understanding its performance in terms of

pricing
speed
reasoning capabilities
multilingual understanding
controllability.
Any feedback would be greatly appreciated.

kindly Tag me

pseudo pasture Mar 12, 2024, 2:26 AM

#

I've been stuck on this project for 3 days trying to deploy the Flask app online so I can get data from anywhere. After trying to deploy the project on almost every cloud and hosting site, I keep hitting the same error (502 Bad Gateway). AWS, Azure, Google App Engine, Vercel, Heroku, Netlify, Konbey: everywhere I get the timeout error. Locally the project works perfectly, and through the CLIs too, but once I deploy successfully and hit the endpoint, it says 502 Bad Gateway.

If any of you have a solution for this then for God's sake pls tell me. I'm stuck and can't move forward.

https://github.com/saqib772/sportsodds

hollow mortar Mar 12, 2024, 3:58 AM

#

hollow reef facing a moral dilemma with my personal project on one hand, i do not want to us...

ai isnt here to beat us, its to provide more competition, :D

wooden sail Mar 12, 2024, 4:48 AM

#

desert oar what do you do with fisher information specifically?

look at estimation bounds and optimal design

spare scarab Mar 12, 2024, 6:32 AM

#

pseudo pasture I've been stuck on this project for 3 days trying to deploy the Flask app onli...

very cool project dude

final kiln Mar 12, 2024, 6:56 AM

#

desert oar he clearly knows how to do the stuff. he just has it in his head somehow that th...

One pattern I'm noticing is that everyone including me thinks that they are using AI right and it's this other group of people that are using it wrong and not learning and being held back.

Which is making me seriously reflect on how I'm using AI.

#

I don't use copilot, partly because of that, it spits out code, that isn't necessarily that good, and it's just so easy to leave it there as if it were a lib function

#

So I mostly use chat gpt, and mostly when I notice that I'm looking through the documentation and finding no success or it's just taking too long and it's just easier to just ask the omniscient chat bot about it

#

Like, what's the difference between reading the answer to a stack overflow question and a chat gpt answer to my specific question ?

#

The difference is that I had to write it and I had to cross check the answer either way

#

The me writing it part can be strangely beneficial

#

Which doesn't happen when you use copilot

past meteor Mar 12, 2024, 8:14 AM

#

final kiln Which doesn't happen when you use copilot

Interesting, I don't use copilot but do use chatGPT for 100 % the same reasons

spare scarab Mar 12, 2024, 9:55 AM

#

How can I train an AI model to become a support helper? I have all the Discord messages from the support channel saved, in any of these formats: CSV/TXT/JSON. I want to train the AI on the messages.

final kiln Mar 12, 2024, 11:21 AM

#

Totally forgot to do no grad, so that's why it was overfitting the val data

#

I'm also getting used to rust

nocturne narwhal Mar 12, 2024, 11:29 AM

#

Hey guys, I wanted to start with data science. Can someone help me get the best resources and a roadmap? I tried to search on Udemy, Coursera, and YouTube but I'm confused about where to start

past meteor Mar 12, 2024, 11:44 AM

#

nocturne narwhal Hey guys wanted to start with Data science can some one help me get best resourc...

#data-science-and-ml message

desert oar Mar 12, 2024, 12:27 PM

#

wooden sail look at estimation bounds and optimal design

ahh, is that for experiment design? of all classes i regret not taking, that was the #1.

wooden sail Mar 12, 2024, 12:29 PM

#

yeah

#

a lot of my current research deals with stuff like imaging, localization, parameter estimation, etc based on a small number of measurements. like one would do in x-rays and what not

#

and then there's always the question of "where do i put the sensors?"

final kiln Mar 12, 2024, 12:32 PM

#

100% accuracy achieved >.>

final kiln Mar 12, 2024, 12:50 PM

#

I'm either doing something wrong or I'm going right to the top of the SOTA leaderboard

#

I reckon I'm doing something wrong

desert oar Mar 12, 2024, 1:04 PM

#

wooden sail and then there's always the question of "where do i put the sensors?"

very cool. did you learn that stuff from a book or a course? or just picked it up as you went?

pseudo pasture Mar 12, 2024, 1:33 PM

#

spare scarab very cool project dude

yert brainmon

sturdy slate Mar 12, 2024, 2:02 PM

#

Hello guys, need some help with medical image segmentation. I am working on lung tumor segmentation and the model seems to work well with the train dataset (overfitting, mostly) but the testing does not go well. I am not sure if I am doing things right. I am using pytorch and segmentation models pytorch for Unet models.

final kiln Mar 12, 2024, 2:06 PM

#

I'm trying to force an overfit rn

#

That would prove no leakage

#

For anyone curious, average pooling is working here

#

But I got leakage no doubt, 1e-7 loss on both

wooden sail Mar 12, 2024, 2:16 PM

#

desert oar very cool. did you learn that stuff from a book or a course? or just picked it u...

my supervisor asked me a really hard question while i was writing my masters thesis, and then he made a suggestive comment pointing toward this stuff as a possible answer

#

5 years and many books and papers later, here we are

final kiln Mar 12, 2024, 2:51 PM

#

leakage has been found, part of my code still assumed multi processing instead of async, so they used the same duck db connection to create the tables

#

The tables all had the same name

#

Just created a uuid for the name of the tables

desert oar Mar 12, 2024, 2:58 PM

#

wooden sail 5 years and many books and papers later, here we are

i'll take resources if you have a couple that helped you in particular!

final kiln Mar 12, 2024, 4:22 PM

#

final kiln Just created a uuid for the name of the tables

That wasn't it, which in hindsight makes sense because I'm creating a different connection for each

#

So.... How am I gonna debug this >.>

#

Maybe join the two tables via the text field?

#

Alright, 123 rows are shared by both tables

#

This is out of 50k

#

So it does not explain it yet
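A minimal sketch of that overlap check, using the standard-library `sqlite3` as a stand-in for DuckDB. The table names, column names, and rows are made up for illustration; only the idea (intersect the two splits on the text field and compare against the split size) comes from the messages above.

```python
import sqlite3

# In-memory database standing in for the real one (hypothetical schema).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE train (text TEXT, label INTEGER)")
con.execute("CREATE TABLE test (text TEXT, label INTEGER)")
con.executemany(
    "INSERT INTO train VALUES (?, ?)",
    [("good movie", 1), ("bad movie", 0), ("fine", 1)],
)
con.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("bad movie", 0), ("great", 1)],
)

# Count distinct texts that appear in both splits.
shared = con.execute(
    "SELECT COUNT(*) FROM (SELECT text FROM train INTERSECT SELECT text FROM test)"
).fetchone()[0]
test_rows = con.execute("SELECT COUNT(*) FROM test").fetchone()[0]

# Fraction of the test split that also shows up in the training split.
leak_fraction = shared / test_rows
print(shared, test_rows, leak_fraction)
```

A small fraction, like 123 out of 50k, points to duplicates in the source data rather than a broken split.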

#

I do believe this is in the original dataset

#

Must be, there's no way I mixed this up in such a specific way, especially using only curl and tar

#

Matter fact, I'm gonna code the download part right in the pipeline

little arrow Mar 12, 2024, 4:48 PM

#

How do I reshape Table A into Table B? Do I melt, pivot, stack or something else entirely. Many thanks in advance!

valid quartz Mar 12, 2024, 4:54 PM

#

I need your folks help on this:

So Im making an AI assistant for my school project and there are two problems Im facing: (Im using PyCharm btw)

import os
import time
import pyaudio
import playsound
from gtts import gTTS
import openai
import speech_recognition as sr

api_key = "API"

openai.api_key = api_key

lang = 'en'

while True:
def get_audio():

r = sr.Recognizer() - here Pycharm tells me to indent the "r" right here
with sr.Microphone(device_index=1) as source:
    audio = r.listen(source)
    said = ""

    try:
        said = r.recognize_google(audio)
        print(said)
        if "BanglaGPT" in said:
            completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=[{"role": "user", "content": said}])
            text = completion.choices[0].messages.content
            speech = gTTS(text=text, lang=lang, slow=False, tld="com.au")
            speech.save = "welcome1.mp3"
            playsound.playsound("welcome.mp3")

    except Exception:
        print("Exception")

return said: - Several problems with this line,  It says its out of function, it needs End Statement

get_audio(): - An illegal target for variable annotation, and expression is expected

(Ignore the fact that the name of the AI is BanglaGPT and the language it is supposed to speak is english, its for tests ok?)

agile cobalt Mar 12, 2024, 4:55 PM

#

valid quartz I need your folks help on this: So Im making an AI assistant for my school proj...

go delete/regenerate that API Key ASAP
you should treat them as passwords, if not even more important than that

valid quartz Mar 12, 2024, 4:55 PM

#

agile cobalt go delete/regenerate that API Key ASAP you should treat them as passwords, if no...

I did

agile cobalt Mar 12, 2024, 4:55 PM

#

just editing/deleting the message is not enough - make sure you actually delete the key

valid quartz Mar 12, 2024, 4:55 PM

#

alr

#

done

valid quartz Mar 12, 2024, 5:13 PM

#

Ok so I was able to fix almost all the issues in the code

#

but now the problem is

#

It doesnt speak

#

Smh

wooden sail Mar 12, 2024, 5:21 PM

#

desert oar i'll take resources if you have a couple that helped you in particular!

statistical signal processing: detection, estimation, and time series analysis by louis l. scharf
fundamentals of statistical signal processing: estimation theory by steven m. kay
spectral analysis of signals by petre stoica and randolph moses

desert oar Mar 12, 2024, 5:27 PM

#

little arrow How do I reshape Table A into Table B? Do I melt, pivot, stack or something else...

"stack" and "unstack" convert column names into row labels (aka index levels) and vice versa. that is, they operate on labels.

"melt" and "pivot" convert data columns between long and wide format. that is, they operate on data.

both can be used for this reshaping operation. melt and pivot might be easier to reason about though.

in fact, you need both here. first you want to melt this to "long format":

Region | Country | Studies | Date

then pivot this to "wide format" with respect to date:

Region | Country | Studies | Jan 2023 | Feb 2023 | ...
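A rough sketch of the melt-then-pivot sequence in pandas. Table A is not shown in the chat, so the layout below (one row per region, country, and date, with hypothetical "Phase 1" and "Phase 2" count columns) is an assumption chosen only to make both steps visible.

```python
import pandas as pd

# Hypothetical "Table A": one row per (Region, Country, Date), one column per study type.
table_a = pd.DataFrame({
    "Region": ["EU", "EU", "EU", "EU"],
    "Country": ["FR", "FR", "DE", "DE"],
    "Date": ["Jan 2023", "Feb 2023", "Jan 2023", "Feb 2023"],
    "Phase 1": [1, 2, 3, 4],
    "Phase 2": [5, 6, 7, 8],
})

# Step 1: melt to long format, so the study columns become values of one "Studies" column.
long = table_a.melt(
    id_vars=["Region", "Country", "Date"],
    var_name="Studies",
    value_name="Count",
)

# Step 2: pivot to wide format with respect to date, one column per date.
wide = long.pivot(
    index=["Region", "Country", "Studies"],
    columns="Date",
    values="Count",
).reset_index()
wide.columns.name = None  # drop the leftover "Date" label on the column index

print(wide)
```

Note that `pivot` orders the new date columns as plain strings, so real dates would need sorting or parsing before display.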

desert oar Mar 12, 2024, 5:27 PM

#

wooden sail statistical signal processing: detection, estimation, and time series analysis b...

oh i see, very technical for signal processing! i'll take a look at the first one and see if there's anything i can get from it

wooden sail Mar 12, 2024, 5:28 PM

#

the application motivates the problems covered in those books, but i think you'll find most of the stuff is easily generalizable

#

also what constitutes a "signal" is very easy to satisfy 😛

#

a large chunk of AI/ML is parameter estimation and estimation theory in a trench coat

desert oar Mar 12, 2024, 5:31 PM

#

that's what i'm hoping to get here. i remember trying to get into this stuff back when i was an undergrad studying economics, for the same reasons you just stated, but i had a hard time connecting to the applications at the time & wasn't strong enough with math yet.

final kiln Mar 12, 2024, 6:16 PM

#

I was parsing one of the classes incorrectly and that led to only one output token, which was 0 and that was the source of it

#

Now I'm getting something more realistic: it goes up to 0.5 acc and annoyingly stays there forever

#

I suspect there's gonna be something in the SQL still, gonna write some unit tests for this

#

{
    "run_name": null,
    "experiment_id": 1,
    "data": {
        "slices": 1,
        "batch_size": 256,
        "test_source": "***/dataset/test.parquet",
        "train_source": "****/train.parquet"
    },
    "model": {
        "depth": 6,
        "heads": 10,
        "encoding": "tiktoken-gpt2",
        "dimension": 64,
        "kernel_size": null,
        "attention_kind": "quadratic",
        "context_window": 300,
        "input_vocabolary": 60000,
        "output_vocabolary": 5
    },
    "train": {
        "epochs": 100,
        "learning_rate": 0.0005
    },
    "process": {
        "use_gpu": true,
        "executable_source": "****t"
    }
}

#

in case anyone has any ideas, but hopefully it will be a temporary issue related to the data

#

that hypothesis is motivated by the fact that it's not overfitting

#

which might mean I'm messing up my randomization again, maybe I'm mixing up the labels each time

final kiln Mar 12, 2024, 6:34 PM

#

final kiln I suspect there's gonna be something in the SQL still, gonna write some unit tes...

Does not seem to be the case

jagged latch Mar 12, 2024, 6:36 PM

#

I have a question to those experienced in Excel. I'm having an issue in a sheet where after new data is generated through my Python script via Openpyxl, I am getting some values in a column with a yellow highlighted cell with bolded font. The thing is I double click on the cell and the formatting in question goes away and returns to looking like the other cells in the column. I know it's not my code because I tested the same exact code on a blank workbook and sheet and got everything without the formatting. Is there some type of hidden check that was in the original Excel template?

#

If I try and go over the cell and click no fill or try and unbold it, nothing happens. It only returns to normal after double clicking it and then clicking somewhere else.

jagged latch Mar 12, 2024, 6:57 PM

#

I found the problem. The person before me left some conditional formatting in there. I ended up removing it. Problem solved.

final kiln Mar 12, 2024, 7:08 PM

#

Alright.

#

Here's what I'm NOT gonna do, spend two weeks experimenting with this stuff.

Gotta change my approach.

What I'm gonna do instead is go through the literature and see what people did and how. And then replicate that.

#

Time to PR, wait for the checks to be done, do a release so that my binary gets built automatically and then published and then it's time for a well deserved rest.

#

Tomorrow I'm gonna freeze some of the interfaces and write integration tests for them. After that I'll collect some papers and also think about the most efficient way to take the derivative of a metric tensor.

crisp raptor Mar 12, 2024, 7:48 PM

#

right now I'm working on a neat little project on my calculator for AI generated music

desert oar Mar 12, 2024, 7:58 PM

#

final kiln Here's what I'm NOT gonna do, spend two weeks experimenting with this stuff. Go...

what was "this stuff" again?

final kiln Mar 12, 2024, 8:03 PM

#

desert oar what was "this stuff" again?

Hyper parameter exploration

#

I spent a lot of time just messing about with the parameters, when I should've just looked it up. I think it's gonna be the same thing here

#

And it's also about time I write some tests. Particularly, I'm gonna write some for the training loop itself. Just gonna generate a dataset that oughta be easy to generalize
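One way such a training-loop test could look, as a sketch only. The loop here is a plain NumPy gradient descent on a noiseless line, not the actual model from this project; `train_linear` and the thresholds are hypothetical.

```python
import numpy as np

def train_linear(x, y, lr=0.1, epochs=200):
    """Plain gradient descent on MSE for y = w * x + b; returns (w, b, losses)."""
    w, b = 0.0, 0.0
    losses = []
    for _ in range(epochs):
        err = (w * x + b) - y
        losses.append(float(np.mean(err ** 2)))
        # Gradients of mean squared error with respect to w and b.
        w -= lr * 2 * np.mean(err * x)
        b -= lr * 2 * np.mean(err)
    return w, b, losses

def test_training_loop_learns_easy_dataset():
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, size=256)
    y = 2.0 * x + 1.0  # noiseless, so a working loop must fit it almost exactly
    w, b, losses = train_linear(x, y)
    assert losses[-1] < 1e-3 * losses[0]  # loss must drop by orders of magnitude
    assert abs(w - 2.0) < 0.05 and abs(b - 1.0) < 0.05

test_training_loop_learns_easy_dataset()
```

If a test like this fails, the bug is in the loop or the data plumbing, not in the hyperparameters.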

zealous spear Mar 12, 2024, 10:10 PM

#

Hi, can anyone tell me why I get this error?

File "C:\Users\barte\AppData\Roaming\Python\Python312\site-packages\textract\parsers\utils.py", line 87, in run
    pipe = subprocess.Popen(
           ^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Python312\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\barte\Documents\GitHub\docgpt\main.py", line 10, in <module>
    doc = textract.process("spa.pdf")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\barte\AppData\Roaming\Python\Python312\site-packages\textract\parsers\__init__.py", line 79, in process
    return parser.process(filename, input_encoding, output_encoding, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\barte\AppData\Roaming\Python\Python312\site-packages\textract\parsers\utils.py", line 46, in process
    byte_string = self.extract(filename, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#

    raise ex
  File "C:\Users\barte\AppData\Roaming\Python\Python312\site-packages\textract\parsers\pdf_parser.py", line 21, in extract
    return self.extract_pdftotext(filename, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\barte\AppData\Roaming\Python\Python312\site-packages\textract\parsers\pdf_parser.py", line 44, in extract_pdftotext
    stdout, _ = self.run(args)
                ^^^^^^^^^^^^^^
  File "C:\Users\barte\AppData\Roaming\Python\Python312\site-packages\textract\parsers\utils.py", line 95, in run
    raise exceptions.ShellError(
textract.exceptions.ShellError: The command `pdftotext spa.pdf -` failed with exit code 127
------------- stdout -------------
------------- stderr -------------

#

This is my code:

import textract
import os
from transformers import GPT2TokenizerFast
from langchain.text_splitter import RecursiveCharacterTextSplitter

from dotenv import load_dotenv

load_dotenv()

doc = textract.process("spa.pdf")

with open('./dataFromPdf.txt', 'w') as f:
    f.write(doc.decode('utf-8'))

with open('./dataFromPdf.txt', 'r') as f:
    text = f.read()

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def count_tokens(text: str) -> int:
    return len(tokenizer.encode(text))

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 512,
    chunk_overlap  = 24,
    length_function = count_tokens,
)

chunks = text_splitter.create_documents([text])

final kiln Mar 12, 2024, 10:28 PM

#

zealous spear ``` File "C:\Users\barte\AppData\Roaming\Python\Python312\site-packages\textrac...

It literally says in the traceback, you gotta read those

#

Exit code 127
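To spell that out: the traceback shows textract running `pdftotext spa.pdf -` as an external command, the first error says a file (here the executable) could not be found, and exit code 127 is the usual shell convention for "command not found". A quick check, sketched with the standard library; `find_tool` is a hypothetical helper name.

```python
import shutil

def find_tool(name: str):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)

path = find_tool("pdftotext")
if path is None:
    # pdftotext ships with Poppler (and Xpdf); it has to be installed and on PATH.
    print("pdftotext is not on PATH")
else:
    print(f"pdftotext found at {path}")
```

If this prints that the tool is missing, installing the Python package alone will not fix it; the binary itself has to be installed and visible on PATH.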

#

oblique comet Mar 12, 2024, 10:48 PM

#

was stuck figuring out why cropping images took about 3-4 seconds per image and why it was running on cpu (100% usage) instead of the cuda device
after some debugging I found the related line:

visible_pixels = crop[crop > 0]

and changed it to this:

mask = crop.gt(0.0).to(crop.dtype)
visible_pixels = mask * crop + (1 - mask)

function execution time went down from 3408ms to just 28ms lol
that's a 99.18% reduction!

love optimizing stuff like this but its rare that I manage to lower it this much
this is why i love programming
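Worth noting that the two snippets above do not return the same thing. A small sketch in NumPy (used here only as a stand-in for the tensor library; the values are made up) shows the difference: boolean indexing returns a 1-D array whose length depends on the data, while the masked version keeps the original shape and puts 1 where the pixel was not visible. One plausible reason for the speedup is exactly that fixed output shape, though that is an assumption and not something measured here.

```python
import numpy as np

crop = np.array([[0.0, 0.5],
                 [0.2, 0.0]])

# Boolean indexing: output length depends on how many pixels are > 0.
selected = crop[crop > 0]

# Masked arithmetic: same shape as the input, non-visible pixels replaced by 1.
mask = (crop > 0).astype(crop.dtype)
filled = mask * crop + (1 - mask)

print(selected.shape, filled.shape)
```

So the swap is only safe if whatever consumes `visible_pixels` treats the filled-in ones as neutral, for example a product or a log.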

spring field Mar 13, 2024, 1:05 AM

#

on the topic of resources, what do you think of this? http://neuralnetworksanddeeplearning.com/

vapid storm Mar 13, 2024, 3:13 AM

#

Hi guys, (very quick question 🥺)
I recently downloaded a dataset of images of shotguns, handguns, and knives. I am using this to train a cnn used to detect potential weapons through doorbell camera images or footage. However, I don't know if i should normalize all the images to a certain size.

if I do, then the bounding boxes in the corresponding txt file for each image would be skewed.

jaunty helm Mar 13, 2024, 3:44 AM

#

I have this preprocessing step

def frequency_encode(df: pd.DataFrame, features: str | list[str]=None, inplace=False):
    if features is None:
        features = df.columns
    elif isinstance(features, str):
        features = [features]

    if not inplace:
        df = df.copy()
    for feature in features:
        frq = df[feature].value_counts()  # <-- problem
        df[f'{feature}_FrqEncode'] = df[feature].replace(frq.to_dict())
    if not inplace:
        return df
I put this in a `FunctionTransformer` in a `Pipeline`, then later I realized that at `# <-- problem`, I should instead somehow store a `df_train` that was seen during `.fit()` and use `df_train[feature].value_counts()` when `.transform()`ing
how do I do this? (while still being able to use a `Pipeline` of course)

lapis sequoia Mar 13, 2024, 4:08 AM

#

You can use the Pipeline class to wrap your custom class with the necessary preprocessing and encoding. You can do it like this

import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.metrics.pairwise import pairwise_distances
from sklearn.preprocessing import FunctionTransformer
from sklearn.pipeline import Pipeline

class DistanceTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, metric_func, **kwargs):
        self.metric_func = metric_func
        self.inplace = True

    def fit(self, X, y=None, **kwargs):
        if not self.inplace:
            raise RuntimeError("Transformers must be called with inplace=True")

        # Nothing to learn; fit must return self so a Pipeline can chain steps
        return self

    def transform(self, X, **kwargs):
        if not self.inplace:
            raise RuntimeError("Transformers must be called with inplace=True")
    
        X = self.metric_func(X)
        return X

class PairwiseDistanceEstimator(BaseEstimator, TransformerMixin):
    def __init__(self, **kwargs):
        super().__init__()
        self.kwargs = kwargs

    def fit(self, X, y=None):
        # Stateless; return self as the scikit-learn API expects
        return self

    def transform(self, X):
        X = self.get_metric(X)
        return X

    def get_metric(self, X):
        if self.kwargs['metric'] == 'euclidean':
            return pairwise_distances(X, metric='euclidean')
        if self.kwargs['metric'] == 'cosine':
            return pairwise_distances(X, metric='cosine')
        raise ValueError(f"Invalid metric '{self.kwargs['metric']}'. Expected 'euclidean' or 'cosine'")

jaunty helm Mar 13, 2024, 5:58 AM

#

jaunty helm I have this preprocessing step ```py def frequency_encode(df: pd.DataFrame, feat...

class StatedFunctionTransformer(FunctionTransformer):
    def fit(self, X: pd.DataFrame, y=None):
        def deco(fn):
            def wrapper(*args, **kwargs):
                kwargs['df_train'] = X
                return fn(*args, **kwargs)
            return wrapper
        self.func = deco(self.func)
        return super().fit(X, y)

def frequency_encode(df: pd.DataFrame, features: str | list[str]=None, inplace=False, df_train: pd.DataFrame=...):
    if features is None:
        features = df.columns
    elif isinstance(features, str):
        features = [features]

    if not inplace:
        df = df.copy()
    for feature in features:
        frq = df_train[feature].value_counts()  # <-- problem
        df[f'{feature}_FrqEncode'] = df[feature].replace(frq.to_dict())
    if not inplace:
        return df
this is what I've settled on for now, if anyone knows of a better/more conventional method, or there's a problem to what I'm doing here, please tell me
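For comparison, a more conventional pattern is a small custom transformer that learns the counts in `fit` and applies them in `transform`. This is a sketch, not a drop-in replacement: the class name `FrequencyEncoder` is made up, and unlike the `.replace` version above it maps categories unseen during fit to 0, which is a design choice you may not want.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline

class FrequencyEncoder(TransformerMixin, BaseEstimator):
    """Adds '<col>_FrqEncode' columns using counts learned from the training data."""

    def __init__(self, features=None):
        self.features = features

    def fit(self, X, y=None):
        if self.features is None:
            cols = list(X.columns)
        elif isinstance(self.features, str):
            cols = [self.features]
        else:
            cols = list(self.features)
        # Learned state lives in an attribute with a trailing underscore.
        self.counts_ = {c: X[c].value_counts().to_dict() for c in cols}
        return self

    def transform(self, X):
        X = X.copy()
        for col, counts in self.counts_.items():
            # Categories not seen during fit get a frequency of 0.
            X[f"{col}_FrqEncode"] = X[col].map(counts).fillna(0).astype(int)
        return X

train = pd.DataFrame({"city": ["a", "a", "b"]})
test = pd.DataFrame({"city": ["b", "c"]})

pipe = Pipeline([("freq", FrequencyEncoder(features="city"))])
pipe.fit(train)
encoded = pipe.transform(test)
print(encoded)
```

Because the counts are stored on the fitted object, cross-validation and `clone` work without wrapping or mutating `self.func`.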

mellow vector Mar 13, 2024, 6:10 AM

#

I know this isn't DS but you guys use jupyter... I'm reviewing jupyter and the course instructor is describing tooltip uses. When he presses Shift+Tab three times the tooltip remains open while he's typing, but my tooltip closes immediately. What am I doing wrong?

past meteor Mar 13, 2024, 7:50 AM

#

spring field on the topic of resources, what do you think of this? <http://neuralnetworksandd...

Looks fine. I'd still go for the deep learning book I linked though 😄 it's also hands-on, but it's more topical than the one you linked

boreal gale Mar 13, 2024, 9:19 AM

#

mellow vector I know this isn't DS but you guys use jupyter... I'm reviewing jupyter and the c...

tooltip remains open is not a default behaviour, my guess is he is using https://github.com/jupyter-lsp/jupyterlab-lsp or something.

GitHub

GitHub - jupyter-lsp/jupyterlab-lsp: Coding assistance for JupyterL...

Coding assistance for JupyterLab (code navigation + hover suggestions + linters + autocompletion + rename) using Language Server Protocol - jupyter-lsp/jupyterlab-lsp

mellow vector Mar 13, 2024, 9:20 AM

#

ty

red kraken Mar 13, 2024, 12:19 PM

#

hi, i'm creating a project based on detecting the presence of larvae in different water types. I'm getting the data through sensors such as turbidity, oxygen, pH level, and temperature. I wanna ask: is random forest the way to go to properly detect larvae from the data, or are there other, better ML algorithms?

boreal gale Mar 13, 2024, 12:51 PM

#

red kraken hi, i'm creating a project based on detecting a larvae presence in different wat...

some problems lend themselves to certain models, e.g. sound and wavelet models, image and CNNs.

RF is a good starting point, it's up to you to search for better models once you establish a baseline model, xgboost has always been a strong contender in kaggle for a reason, i suggest you do some more research in that regard if you aren't familiar.

also sometimes it's not so much about the model you use, but the features you craft - e.g. your problem could potentially be solved by a temporal snapshot (i.e. just sensor values in one instance of time), or an alternative maybe more useful set of features might be some aggregate of sensor values over time (diff, % change maybe?), sometimes it's worth looking into the fundamental aspect of the problem, in this case think about the biological impact of larvae presence (they might make the water warm, "more warm" than usual? idk - not a biologist.. but if so how do you describe that properly?)

potent sky Mar 13, 2024, 1:32 PM

#

Anyone tried out the LLMs in 1.58 bits paper yet?

desert oar Mar 13, 2024, 1:46 PM

#

boreal gale some problem lends itself to certain models, e.g. sound and wavelet models, imag...

i second this. although i do think RF is a great default choice for medium-size data with a relatively small number of features

final kiln Mar 13, 2024, 2:29 PM

#

Tests - are important

final kiln Mar 13, 2024, 3:05 PM

#

found one

#

#

the pre trained embeddings part might be critical

slate crystal Mar 13, 2024, 4:36 PM

#

Im training a housing price prediction model in TensorFlow with dimensions of X (20433 rows × 13 columns), loss="mae", optimizer="Adam()".
The problem I am getting is that upon training the loss initially decreases but after some epochs becomes stagnant.

Any suggestions on improving the model, and how many layers should I use?

#

tf.random.set_seed(42)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(13),
    tf.keras.layers.Dense(32),
    tf.keras.layers.Dense(1)
])

model.compile(
    loss=tf.keras.losses.mae,
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
    metrics=["mae"]
)

norm_history = model.fit(X_train_scaled, y_train, epochs=100, batch_size=64)

final kiln Mar 13, 2024, 4:41 PM

#

I know im always gonna enjoy these more cuz it took so much work, but damn these things look good

final kiln Mar 13, 2024, 4:42 PM

#

slate crystal tf.random.set_seed(42) model = tf.keras.Sequential([ tf.keras.layers.Dense(...

always check with papers with code to see how people are doing it

final kiln Mar 13, 2024, 4:43 PM

#

slate crystal tf.random.set_seed(42) model = tf.keras.Sequential([ tf.keras.layers.Dense(...

but also you are missing your non linear activations I think
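Why the activations matter, as a small NumPy sketch (random weights, shapes borrowed from the 13-feature model above): two Dense layers with no activation in between compute exactly the same function as one Dense layer, so stacking them adds no capacity. Putting a ReLU between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 13))  # 4 samples, 13 features
W1, b1 = rng.normal(size=(13, 32)), rng.normal(size=32)
W2, b2 = rng.normal(size=(32, 1)), rng.normal(size=1)

# Two linear layers applied one after the other...
stacked = (x @ W1 + b1) @ W2 + b2

# ...collapse into a single linear layer with merged weights and bias.
W, b = W1 @ W2, b1 @ W2 + b2
single = x @ W + b
same_without_activation = np.allclose(stacked, single)

# With a ReLU in between, the merged layer no longer reproduces the output.
with_relu = np.maximum(x @ W1 + b1, 0) @ W2 + b2
same_with_relu = np.allclose(with_relu, single)

print(same_without_activation, same_with_relu)
```

So a network of purely linear layers is still linear regression, however many layers it has.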

slate crystal Mar 13, 2024, 4:49 PM

#

final kiln but also you are missing your non linear activations I think

non linear activations are used for linear regression problem?

final kiln Mar 13, 2024, 4:52 PM

#

slate crystal non linear activations are used for linear regression problem?

Uhm, I don't know, I would try it yeah

slate crystal Mar 13, 2024, 4:52 PM

#

https://www.kaggle.com/datasets/camnugent/california-housing-prices this is the dataset

California Housing Prices

Median house prices for California districts derived from the 1990 census.

final kiln Mar 13, 2024, 4:52 PM

#

There was a lot of discussion recently because of the meaning of linear in linear regression

#

Just try it, can't hurt to just try rite

slate crystal Mar 13, 2024, 4:53 PM

#

Okay i'll try and say

final kiln Mar 13, 2024, 4:54 PM

#

final kiln found one

this is paying off rn, getting a slow but steady decrease instead of a plateau

slate crystal Mar 13, 2024, 4:57 PM

#

final kiln Just try it, can't hurt to just try rite

Tried it, it's worse. Before it was plateauing around 48000, now it's 53000

final kiln Mar 13, 2024, 4:59 PM

#

slate crystal Tried it, its worse. Before it was plateauing around 48000, now its 53000

uhm, not much difference really, your loss is very large, try normalizing your data somehow, and adding layer normalization too in between

#

networks will prefer stuff roughly between 0 and 1; in transformers, z-score normalization is applied across the features of each token (layer norm), followed by a trainable affine

slate crystal Mar 13, 2024, 5:02 PM

#

This is the normalization I am already using,

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(X_train)

X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

final kiln Mar 13, 2024, 5:04 PM

#

slate crystal Tried it, its worse. Before it was plateauing around 48000, now its 53000

which activation did you use ?

final kiln Mar 13, 2024, 5:04 PM

#

slate crystal This is the normalization I am already using, from sklearn.preprocessing import...

im not sure what this does

#

would be helpful to see your loss graph

slate crystal Mar 13, 2024, 5:04 PM

#

final kiln which activation did you use ?

relu for both layers

slate crystal Mar 13, 2024, 5:05 PM

#

final kiln would be helpful to see your loss graph

I'm not aware of a loss graph

#

is it done using Pandas?

final kiln Mar 13, 2024, 5:06 PM

#

slate crystal I'm not aware of a loss graph

ah, the plot of the training and validation loss against step

#

something of this sort

slate crystal Mar 13, 2024, 5:11 PM

#

Loss graph


final kiln Mar 13, 2024, 5:12 PM

#

slate crystal Loss graph

alright, can you include loss/val and loss/train ?

#

mine is also not looking too good ngl, the missing ingredient is gonna be the pre trained embedder

slate crystal Mar 13, 2024, 5:16 PM

#

just a minute i'll try to do it

#

This will do right?

[image: plot of training and validation loss per epoch]

final kiln Mar 13, 2024, 5:20 PM

#

slate crystal This will do right?

woah, are those really per epoch ?

slate crystal Mar 13, 2024, 5:21 PM

#

yeah😅

final kiln Mar 13, 2024, 5:21 PM

#

model.compile(
    loss=tf.keras.losses.mae,
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
    metrics=["mae"]
)

#

so, don't use SGD, use Adam

#

mean average error, that sounds fishy to me

#

I dont use keras

#

let me check this

#

loss = mean(abs(y_true - y_pred))

#

try using mean squared error instead

slate crystal Mar 13, 2024, 5:23 PM

#

final kiln > loss = mean(abs(y_true - y_pred))

I think that is mae - mean absolute error

final kiln Mar 13, 2024, 5:23 PM

#

yeah, mean squared error is better

#

no need for a sqrt
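For reference, the losses being compared here written out in NumPy, with made-up numbers. MAE and RMSE are in the units of the target, while MSE is in squared units, so on unscaled house prices it gets very large. Taking the square root does not change which parameters minimize the loss, which is the sense in which it is not needed.

```python
import numpy as np

y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 330.0])

err = y_true - y_pred          # [-10, 10, -30]
mae = np.mean(np.abs(err))     # mean absolute error, same units as y
mse = np.mean(err ** 2)        # mean squared error, squared units
rmse = np.sqrt(mse)            # back to the units of y

print(mae, mse, rmse)
```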

#

uhm, what is your batch size ?

slate crystal Mar 13, 2024, 5:24 PM

#

I've tried mse before it gives huge loss numbers

slate crystal Mar 13, 2024, 5:24 PM

#

final kiln uhm, what is your batch size ?

64

final kiln Mar 13, 2024, 5:24 PM

#

slate crystal I've tried mse before it gives huge loss numbers

that's fine, the issue is that the data is likely not properly normalized

slate crystal Mar 13, 2024, 5:25 PM

#

After using Adam optimizer

final kiln Mar 13, 2024, 5:25 PM

#

the validation and the training loss follow each other very closely here

#

that can be suss, after many epochs the model should start overfitting

#

try increasing your model capacity

#

more layers

slate crystal Mar 13, 2024, 5:26 PM

#

u mean the no of layers?

final kiln Mar 13, 2024, 5:26 PM

#

model = tf.keras.Sequential([
tf.keras.layers.Dense(13),
tf.keras.layers.Dense(32),
tf.keras.layers.Dense(1)
])

this is quite small

#

model = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(50),
    tf.keras.layers.Dense(25),
    tf.keras.layers.Dense(1)
])

#

something like that, plus the activations ofc

#

you want your model to have more capacity than the dataset requires

slate crystal Mar 13, 2024, 5:27 PM

#

could you suggest any activations?

final kiln Mar 13, 2024, 5:27 PM

#

GeLU

slate crystal Mar 13, 2024, 5:28 PM

#

final kiln you want your model to have more capacity than the dataset requires

okay

slate crystal Mar 13, 2024, 5:28 PM

#

final kiln GeLU

ReLU?

final kiln Mar 13, 2024, 5:28 PM

#

slate crystal ReLU?

no, GeLU

#

https://paperswithcode.com/method/gelu

slate crystal Mar 13, 2024, 5:29 PM

#

okkay

final kiln Mar 13, 2024, 5:29 PM

#

after you've managed to overfit your model, you know you got something that has the power to do the task

#

you'll then try to cripple it so that it doesn't overfit, or, overfits just a little

#

you do that using dropouts

slate crystal Mar 13, 2024, 5:30 PM

#

final kiln model = tf.keras.Sequential([ tf.keras.layers.Dense(100), tf.keras.layer...

One question here, can I make the 1st Dense layer as 13 neurons because the dataset has 13 features?

final kiln Mar 13, 2024, 5:30 PM

#

slate crystal One question here, can I make the 1st Dense layer as 13 neurons because the data...

what are the 13 features ?

slate crystal Mar 13, 2024, 5:31 PM

#

longitude latitude housing_median_age total_rooms total_bedrooms population households median_income ocean_proximity_<1H OCEAN ocean_proximity_INLAND ocean_proximity_ISLAND ocean_proximity_NEAR BAY ocean_proximity_NEAR OCEAN

the last 5 were one-hot encoded with pd.get_dummies

final kiln Mar 13, 2024, 5:32 PM

#

and what does the output of the model mean ?

slate crystal Mar 13, 2024, 5:33 PM

#

#

the output is median_house_value

final kiln Mar 13, 2024, 5:35 PM

#

alright, let's try to first normalize these, maybe with z-score along each column except for the one hot encodings

slate crystal Mar 13, 2024, 5:35 PM

#

This is the output

final kiln Mar 13, 2024, 5:35 PM

#

tho that would kinda make it dependent on the sample

#

you need to get these in more reasonable ranges

#

for example house median age, maybe I'd divide every value by 30 or something like that

#

median house value too, by 300k or something

slate crystal Mar 13, 2024, 5:37 PM

#

I actually did normalize before training

NORMALIZATION & STANDARDIZATION

features = ['longitude', 'latitude', 'housing_median_age', 'total_rooms',
'total_bedrooms', 'population', 'households', 'median_income',
'ocean_proximity']

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(X_train)

X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

X_test_scaled[0]
array([ 1.16597857, -1.33318189, -0.68338903, -0.76968499, -0.61778743,
-0.79510954, -0.64364484, -0.36439632, -0.89050504, -0.68141436,
-0.01649168, -0.35421275, 2.59982148])

final kiln Mar 13, 2024, 5:38 PM

#

oh okay

#

right, so now for the network then

slate crystal Mar 13, 2024, 5:38 PM

#

slate crystal I actually did normalize before training # NORMALIZATION & STANDARDIZATION fe...

the resulting X_test_scaled is alright?

final kiln Mar 13, 2024, 5:39 PM

#

model = tf.keras.Sequential([
    tf.keras.layers.Dense(13),
    tf.keras.layers.Dense(50),
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(50),
    tf.keras.layers.Dense(25),
    tf.keras.layers.Dense(1)
])

#

maybe something like that

slate crystal Mar 13, 2024, 5:39 PM

#

final kiln oh okay

negative values are acceptable right?

final kiln Mar 13, 2024, 5:39 PM

#

slate crystal negative values are acceptable right?

yeah

#

make sure the output is also normalized

#

which I suspect it is not because of your large loss values

slate crystal Mar 13, 2024, 5:40 PM

#

final kiln make sure the output is also normalized

Output must be normalized?

slate crystal Mar 13, 2024, 5:41 PM

#

slate crystal This is the output

This?

final kiln Mar 13, 2024, 5:41 PM

#

yes

final kiln Mar 13, 2024, 5:41 PM

#

slate crystal After using Adam optimizer

if you look here it kinda plateaus on the same order of magnitude as those values

slate crystal Mar 13, 2024, 5:48 PM

#

For y

scaler = StandardScaler().fit(np.array(y_train).reshape(-1,1))
y_train = scaler.transform(np.array(y_train).reshape(-1,1))
y_test = scaler.transform(np.array(y_test).reshape(-1,1))

this is the scaling I'm using for y values
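A rough sketch of why scaling y changes the size of the loss, and how predictions get mapped back to dollars afterwards. It uses plain Python with made-up house values; the names `scale` and `unscale` are hypothetical. With scikit-learn the way back is `inverse_transform` on the scaler that was fitted on y, which is easier if that scaler has its own name instead of reusing `scaler` from the features.

```python
from statistics import fmean, pstdev

# Hypothetical house values in dollars, standing in for y_train.
y_train = [150_000.0, 250_000.0, 350_000.0, 450_000.0]

y_mean = fmean(y_train)
y_std = pstdev(y_train)

def scale(y):
    """Z-score the targets with the training mean and std."""
    return [(v - y_mean) / y_std for v in y]

def unscale(y_scaled):
    """Map scaled values back to dollars."""
    return [v * y_std + y_mean for v in y_scaled]

y_scaled = scale(y_train)

# Pretend the network predicted these values in scaled units.
pred_scaled = [v + 0.1 for v in y_scaled]
pred_dollars = unscale(pred_scaled)

mse_scaled = fmean([(p - t) ** 2 for p, t in zip(pred_scaled, y_scaled)])
mse_dollars = fmean([(p - t) ** 2 for p, t in zip(pred_dollars, y_train)])
# mse_dollars equals mse_scaled * y_std**2 (up to float rounding), so the same
# fit looks tiny in scaled units and enormous in squared dollars.
```

So a loss that plateaus at a large number on raw targets is not necessarily a worse fit; the two numbers differ by a factor of the target variance.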

#

This is the graph now

final kiln Mar 13, 2024, 5:51 PM

#

awesome

#

uhm

slate crystal Mar 13, 2024, 5:51 PM

#

It plateaus in its scale

final kiln Mar 13, 2024, 5:52 PM

#

how long does it take to do 100 epochs ?

slate crystal Mar 13, 2024, 5:52 PM

#

2 minutes

final kiln Mar 13, 2024, 5:52 PM

#

let's go nuts then, add moar capacity

#

try to go as far as the gpu lets you

slate crystal Mar 13, 2024, 5:52 PM

#

I think generally ppl dont normalize the output y_values, right?

final kiln Mar 13, 2024, 5:53 PM

#

why not ?

slate crystal Mar 13, 2024, 5:56 PM

#

I dont know I've never seen

final kiln Mar 13, 2024, 5:59 PM

#

ive never seen a non-normalized one, usually you're even supposed to interpret the output as a set of probabilities

slate crystal Mar 13, 2024, 5:59 PM

#

That is for classification

#

right?

final kiln Mar 13, 2024, 5:59 PM

#

it's also for image segmentation

#

which is classification in disguise

slate crystal Mar 13, 2024, 6:00 PM

#

Do they use it in regression problems?

final kiln Mar 13, 2024, 6:01 PM

#

last time I did a curve fitting like the one you're doing I used normalization on the output

#

also Taylor and Fourier features

#

a small learning rate helped

#

lets gogoogoooooooooooooooooooo

#

the weight initialization helped

#

omg thank god

#

slate crystal Mar 13, 2024, 6:04 PM

#

Nicee

#

But I must figure out where it is lagging now

final kiln Mar 13, 2024, 6:05 PM

#

slate crystal But I must figure out where it is lagging now

make it larger

#

fill your gpu memory as far as possible

#

I got 84% accuracy, aint gonna need no pretraineds

slate crystal Mar 13, 2024, 6:10 PM

#

Yeah but your graph looks smooth

final kiln Mar 13, 2024, 6:11 PM

#

you have no idea how much work that took

#

gonna do 40 epochs like in the paper

#

then im just gonna push and do a release

#

tomorrow is CUDA time yo

slate crystal Mar 13, 2024, 6:13 PM

#

you do that with Docker?

final kiln Mar 13, 2024, 6:13 PM

#

which part ?

#

im using docker extensively

slate crystal Mar 13, 2024, 6:13 PM

#

packaging n everything...

slate crystal Mar 13, 2024, 6:14 PM

#

final kiln which part ?

I'm just now learning that's why I asked

final kiln Mar 13, 2024, 6:14 PM

#

yeah, so I got these base images, which are meant for production use, I then have these github actions workflows that build on top of these images to produce my development and staging environments

#

this setup allows me to very quickly switch from developing on my laptop, to developing on an aws machine with gpu

#

anything that works during development is guaranteed to work during production

#

and it's all very cost effective because I'm using interruptible instances

final kiln Mar 13, 2024, 6:16 PM

#

slate crystal I'm just now learning that's why I asked

docker is very nice, I like it a lot, tho it's a solution to a problem that the industry created

slate crystal Mar 13, 2024, 6:17 PM

#

yeah so Docker ensures all the dependencies are packed together so that in production the image can deploy and run anywhere, right?

serene scaffold Mar 13, 2024, 6:17 PM

#

slate crystal yeah so Docker ensures all the dependencies are packed together so that in produ...

each docker container is basically its own VM. so yes.

slate crystal Mar 13, 2024, 6:17 PM

#

VM is?

serene scaffold Mar 13, 2024, 6:18 PM

#

virtual machine
an instance of an operating system

slate crystal Mar 13, 2024, 6:18 PM

#

Okay, where do I learn docker and all its applications

#

I just got started with a 3hr video course on yt

serene scaffold Mar 13, 2024, 6:19 PM

#

a good way to practice would be to build a model, and then create a docker image that, when run as a container, allows users to interact with that model in a jupyter notebook.

#

which means that you'll need to write a Dockerfile that copies the model into the image, and installs all the Python dependencies

slate crystal Mar 13, 2024, 6:21 PM

#

Hmm great...

final kiln Mar 13, 2024, 6:22 PM

#

this is using avg pooling instead of the usual attention mechanism

#

it would seem that there's something about the metaformer paper

#

only slightly worse

#

now I wanna train it for next token prediction, no way it could work for that right

#

I can believe sentiment analysis, because really, all you need is to count how many bad words and how many nice words

#

in fact im using identity now instead of the attention

final kiln Mar 13, 2024, 6:46 PM

#

Identity is suss tho, identity can be suss, but also could be not suss due to the aforementioned reasoning

#

The network will operate on each embedding individually and then average out to one token which is then projected to the output probabilities

#

This is kinda suss
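A small sketch of why that setup is suspicious for anything order-sensitive. It assumes no positional information is added to the embeddings, and uses random weights and a hypothetical 4-dimensional embedding: each token goes through the same token-wise function, the results are averaged, and the average is projected to class probabilities. Reordering the tokens then cannot change the output.

```python
import math
import random

random.seed(0)
EMBED_DIM, NUM_CLASSES = 4, 2

# Hypothetical random weights; a trained model would learn these.
W_token = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(EMBED_DIM)]
W_out = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(NUM_CLASSES)]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def per_token(embedding):
    """Token-wise block: sees one embedding at a time, never its neighbours."""
    return [math.tanh(v) for v in matvec(W_token, embedding)]

def classify(embeddings):
    hidden = [per_token(e) for e in embeddings]                       # identity token mixing
    pooled = [math.fsum(col) / len(hidden) for col in zip(*hidden)]   # global average pooling
    logits = matvec(W_out, pooled)
    exps = [math.exp(z - max(logits)) for z in logits]
    return [e / sum(exps) for e in exps]                              # softmax probabilities

sentence = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(5)]
shuffled = sentence[::-1]  # same tokens, reversed order

probs = classify(sentence)
probs_shuffled = classify(shuffled)
# Both calls give the same probabilities (up to float rounding): word order is invisible.
```

That fits the "count the nice words and the bad words" intuition for sentiment, and it is also why next token prediction would need some way to tell positions apart.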

past meteor Mar 13, 2024, 6:59 PM

#

slate crystal Okay, where do I learn docker and all its applications

By installing Docker and doing the official tutorial

#

Docker ain't perfect, but it's the best we got

potent sky Mar 13, 2024, 7:07 PM

#

potent sky Anyone tried out the LLMs in 1.58 bits paper yet?

noone? ;-;

potent sky Mar 13, 2024, 7:08 PM

#

final kiln now I wanna train it for next token prediction, no way it could work for that r...

It would be interesting to see for sure

final kiln Mar 13, 2024, 8:44 PM

#

potent sky It would be interesting to see for sure

Might not be able to do it tho, I'm getting this CUDA stuff done this week and next week I gotta shift focus to my job search

#

I'm leaving the project in a good state tho, easy to extend, all that's really needed is to add a pipeline that generates the right data

#

Next token prediction is just sequence to sequence without global average pooling

#

Ah there's stuff that does require major mods

#

Like machine translation or summarization, which I'm guessing require encoder decoder

final kiln Mar 13, 2024, 8:49 PM

#

final kiln Might not be able to do it tho, I'm getting this CUDA stuff done this week and n...

I'm gonna start applying like a madman. Likely gonna focus on London cuz the EU market is so small

#

There's like 5 job openings in Switzerland ._.

odd meteor Mar 13, 2024, 8:58 PM

#

final kiln I'm gonna start applying like a madman. Likely gonna focus on London cuz the EU ...

I'm rooting for you 💪💪💪💪
This might interest you.

https://jobs.inverid.com/ml-ops-engineer/en

Inverid - creators of ReadID

ML Ops engineer - Inverid - creators of ReadID

Is innovation in your DNA? Do you love tinkering with the latest technologies, and do you understand that security is very important? Do you know what it means to create trusted scalability for our software? Then we might be looking for you!

past meteor Mar 13, 2024, 9:14 PM

#

final kiln I'm gonna start applying like a madman. Likely gonna focus on London cuz the EU ...

https://www.ml6.eu/careers/join-us

Join Our Team | ML6 - Explore Exciting Career Opportunities in AI

ML6 offers exciting career opportunities in the field of AI. Join our dynamic team of AI experts and contribute to cutting-edge projects that shape the future of artificial intelligence. Discover our open positions and embark on a rewarding career journey with ML6.

#

Look at Belgium, the ML market is really "English friendly"

open raven Mar 13, 2024, 10:58 PM

#

pandas DataFrame, to select every n-th row

Starting with pandas version 2.2.0 it becomes harder to use the iloc property to select every n-th row from a DataFrame. It happens because iloc got deprecated. What are the alternative ways when the index has the default form (it was created implicitly by the DataFrame constructor called without index-related arguments, and it was not modified afterwards)?

agile cobalt Mar 13, 2024, 11:01 PM

#

open raven pandas DataFrame, to select every n-th row Starting pandas version 2.2.0 it bec...

It happens because iloc got deprecated
what?

agile cobalt Mar 13, 2024, 11:02 PM

#

open raven pandas DataFrame, to select every n-th row Starting pandas version 2.2.0 it bec...

Pretty sure that's just misinformation. If not, show proof / link where did you see that.

open raven Mar 13, 2024, 11:03 PM

#

found it in pandas.DataFrame.iloc API reference

agile cobalt Mar 13, 2024, 11:04 PM

#

this?

open raven Mar 13, 2024, 11:05 PM

#

You're right, only one feature is deprecated. Sorry
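Since `iloc` itself is not deprecated, plain positional slicing still answers the original question. A minimal sketch on a toy frame with the default index; the column names and `n = 3` are arbitrary choices for illustration.

```python
import pandas as pd

# Toy frame with the default RangeIndex, as in the question.
df = pd.DataFrame({"A": range(10), "B": range(10, 20)})

n = 3
every_nth = df.iloc[::n]           # positional slice: rows 0, 3, 6, 9
every_nth_offset = df.iloc[1::n]   # start from the second row: rows 1, 4, 7

# With a default RangeIndex the labels equal the positions, so a mask works too.
every_nth_mask = df[df.index % n == 0]
```

The slice form depends only on row position, so it keeps working after the index has been changed; the mask form relies on the default integer labels.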

agile cobalt Mar 13, 2024, 11:06 PM

#

tbh I don't get what they mean by "Returning a tuple from a callable is deprecated.", this doesn't make sense on this page and I do not see anything about it in the 2.2.0 changelog either

#

!e oh wait, probably something like
```py
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [1,2,3,4,5,6], "C":[1,2,3,4,5,6]})
test = df.iloc[lambda frame: (len(frame.index)//2, len(frame.columns)//2)]
print(test)
```

arctic wedgeBOT Mar 13, 2024, 11:08 PM

#

@agile cobalt :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | /home/main.py:4: FutureWarning: Returning a tuple from a callable with iloc is deprecated and will be removed in a future version
002 |   test = df.iloc[lambda frame: (len(frame.index)//2, len(frame.columns)//2)]
003 |    A  B  C
004 | 3  4  4  4
005 | 1  2  2  2

agile cobalt Mar 13, 2024, 11:09 PM

#

hmm yep, that doesn't really work like I expected either
(it picked multiple rows instead of a row and a column)

#

!e that may as well be why it's deprecated lol
```py
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [1,2,3,4,5,6], "C":[1,2,3,4,5,6]})
test = df.iloc[(len(df.index)//2, len(df.columns)//2)]
print(test)
```

arctic wedgeBOT Mar 13, 2024, 11:10 PM

#

@agile cobalt :white_check_mark: Your 3.12 eval job has completed with return code 0.

agile cobalt Mar 13, 2024, 11:10 PM

#

still weird that it is not mentioned in https://pandas.pydata.org/docs/whatsnew/v2.2.0.html though

#

eh, just leaving the Issue that caused it to be deprecated in case anyone is curious https://github.com/pandas-dev/pandas/issues/53533

broken arch Mar 14, 2024, 12:12 AM

#

hey guys has anyone worked on the elec2 dataset and what should i do about drift detecting and what stuff should i apply to improve the performance

final kiln Mar 14, 2024, 7:21 AM

#

past meteor https://www.ml6.eu/careers/join-us

thank you !

final kiln Mar 14, 2024, 7:23 AM

#

past meteor Look at Belgium, the ML market is really "English friendly"

yeah I'll be looking into almost all European countries where it's customary to use english in the workplace, I'm learning german but it will take quite a while

final kiln Mar 14, 2024, 7:23 AM

#

odd meteor I'm rooting for you 💪💪💪💪 This might interest you. https://jobs.inverid.com...

thank you !

past meteor Mar 14, 2024, 7:23 AM

#

ML6 is a great company, I think you'd be a good fit and you could get in so definitely apply 🙂

final kiln Mar 14, 2024, 7:23 AM

#

sure, I will, thank you for the suggestion !

#

Being a Machine Learning Engineer at ML6 means you consider yourself as a healthy mix between a machine learning expert, a software engineer, a researcher, and a hacker! 🤖

very nice

past meteor Mar 14, 2024, 7:26 AM

#

Yup, I've seen many of their talks. They do really cool stuff. They'd be high on my list when I decide to change jobs myself.

final kiln Mar 14, 2024, 8:29 AM

#

aaah, nothing like battling nvidia right in the morning to get the heart pumping

final kiln Mar 14, 2024, 8:31 AM

#

past meteor Yup, I've seen many of their talks. They do *really* cool stuff. They'd be high ...

what kind of stuff do they do ?

past meteor Mar 14, 2024, 8:33 AM

#

Cutting edge ML, but practice focused

#

I battled my sysadmin yesterday for more resources on my VM but the result was a 24h downtime and a wiped machine

final kiln Mar 14, 2024, 8:43 AM

#

I know smart people who refuse to work in software because they don't have the patience for it, and indeed it does really require a lot of it at times

past meteor Mar 14, 2024, 8:44 AM

#

When my colleagues say BS like "bUt WhY nOT rUn oN CpU" I want to quit immediately

past meteor Mar 14, 2024, 8:45 AM

#

final kiln I know smart people who refuse to work in software because they don't have the p...

Do you mean ML people?

#

I think there's a lot of anti software snobbery with data scientists but I've mentioned this already

final kiln Mar 14, 2024, 8:49 AM

#

past meteor Do you mean ML people?

no, people outside the industry altogether, who could get in if they wanted because they have the educational background for a transition

#

I think the two worst environments are nvidia's and javascript, they compete for first place as the worst dev experience possible

#

nvidia does it by forcing pytorch to do 8gb docker images and not allowing emulation so that people have to buy their gpu

#

I think there's also no cross platforming with the layer where cuda resides

#

I think the Nvidia docker runtime is well done tho, I wonder if it is possible to map nvcc inside the container instead of having to pull their dev image

#

"no space left on device", there's literally 100gb on that machine and I'm just pulling a docker image

#

I don't get it

#

And I just added 50gb, so somehow it's downloading 50 extra gbs

#

I had pulled it locally first to test it out, says 8gb

#

No wait, says 18gb

#

The root volume of the AMI I was using is different from the other AMI, so the volume got mounted but it wasn't root.

#

Giving it 125gb for good measure tho

#

An entire morning later, I have the compiler running

final kiln Mar 14, 2024, 11:25 AM

#

I believe

#

This is gonna require some thinking

buoyant vine Mar 14, 2024, 11:27 AM

#

The Nvidia stuff with docker drives me nuts

left tartan Mar 14, 2024, 11:28 AM

#

I’m unfamiliar with this, What’s the docker stuff with nvidia?

buoyant vine Mar 14, 2024, 11:30 AM

#

It is just an inconvenience with using CUDA, the Nvidia libraries and tooling single-handedly make the docker images enormous and difficult to run reliably across environments (Things like CUDA versions not aligning sadge )

#

I am looking forward to Burn's auto fusion system using WGPU becoming a bit more mature since it solves this issue providing you don't need the absolute most efficient and fastest system possible.

left tartan Mar 14, 2024, 11:32 AM

#

Oh, sure, yah we have to build out the images, the whole cuda setup is just a pain. Thought you were saying something else about docker

buoyant vine Mar 14, 2024, 11:33 AM

#

nah, it isn't specific to docker either, but with docker images you normally care to make them as small as is reasonable since it helps start times among other things, and CUDA just throws that out the window :P

final kiln Mar 14, 2024, 11:34 AM

#

maybe that's the issue, and I should try to build this directly on the machine

#

but then it breaks the rest of the workflow I think, since im using docker for everything

buoyant vine Mar 14, 2024, 11:35 AM

#

I think that is normally the best place to start, at least that way you can start to pinpoint what might be causing it

#

If I remember right, we had some issues with mismatched cuda versions, where the CUDA v11 pytorch image didn't want to work on EC2 for some reason, but the V12 image did

#

No idea why, didn't question it, just accepted that it was working and agreed to never touch it again

final kiln Mar 14, 2024, 11:37 AM

#

if it's anything like what I experienced it was probly the ARM architecture stuff, don't know if they resolved it but some time back they didn't have arm wheels or arm images

buoyant vine Mar 14, 2024, 11:37 AM

#

Haven't attempted ARM yet

#

My end goal is to use the Inf2 EC2 instances, but I still need to write a bunch of bindings and libs to work with the compiler and things

#

personally I think AWS has actually got a CUDA killer... If they focussed on lib support and integration a bit more

final kiln Mar 14, 2024, 11:39 AM

#

after this morning I think anyone that comes forward with a better dev experience is a cuda killer

#

but i reckon it's more complicated than that cuz it's also hardware stuff

buoyant vine Mar 14, 2024, 11:43 AM

#

The inf2 instances are pretty insane compute wise

#

It is just the lib support that is -_-

#

I believe they also stopped supporting ONNX which was a weird move imo

trim jewel Mar 14, 2024, 12:08 PM

#

can someone help me if they know about topics like nlp, summarization, topic modelling, stuff like that? i'm doing a project and i need to know if there are articles/videos which specifically will be useful to my project

final kiln Mar 14, 2024, 12:13 PM

#

I'm gonna go in steps

Compile simple c++
Bind it to rust
Compile c++ with some torch in it
Bind it to rust
Compile c++ with CUDA
Bind it to rust
Compile c++ with torch and CUDA
Bind it to rust

#

Rn I'm trying to do a python binding using setup() in py. Which might not even be the right direction since I'm not binding it into py like that

#

If I bind anything into py it would be rust

#

I can't not use docker, the experience is bad, but it will only be worse without it

#

Even if some short term relief is achieved, and even that is not guaranteed

final kiln Mar 14, 2024, 12:16 PM

#

trim jewel can someone help me if they know about topics like nlp, summarization, topic mod...

See the nano gpt repo, really good to get into transformers

#

Another possible approach is to try to see how rust behaves when interoping with CUDA, and then see if there's some way to assemble that into a custom layer directly in the torch rust bindings

#

I did check, and have not found a good way to do it in the rust torch bindings. But I'd definitely have to dig deeper.

They are mostly holding on to a pointer to the tensor and then calling c++ code with it.

quaint loom Mar 14, 2024, 12:30 PM

#

Hi people. Is there anyone here who can have a look at my code?

#

So I have been trying to make a temporal view of my data. There are 2 variables. Each variable has 6 different areas (with 3, 2 or 1 sub-areas).

On the temporal subplots I've created, I want to show a mean value for each area (averaged over its sub-areas). So basically I want the mean value to be shown for each day, as well as the individual data.

I've done 2 different experiments, but the data is only shown for the first experiment and for day 11 of experiment 2.

Experiment 1 (Day 1-5)
Experiment 2 (Day 6-11). So for experiment 2, the data for days 6-10 is missing. It must be either the way I am calculating the mean value or the way I am filtering the data.

Code: https://paste.pythondiscord.com/JBVA

#

lapis sequoia Mar 14, 2024, 12:38 PM

#

Alright so if i got you correctly,

You want the mean value of all the data from each day right?

quaint loom Mar 14, 2024, 12:39 PM

#

lapis sequoia Alright so if i got you correctly, You want the mean value of the all the data...

That is accurate. But also the individual data.

lapis sequoia Mar 14, 2024, 12:40 PM

#

quaint loom That is accurate. But also the individual data.

Alright so just to clarify,

For each day you want the individual data and the mean value of that data right?

quaint loom Mar 14, 2024, 12:43 PM

#

lapis sequoia Alright so just to clarify, For each day you want the individual data and the m...

Yes.
The mean data for :

URT which would be (URT1, URT2 and URT3) and the individual for each URT1, URT2 and URT3
URC which would be (URC1, URC2 and URC2) and the individual for each URC1, URC2 and URC2 and so on

lapis sequoia Mar 14, 2024, 12:43 PM

#

quaint loom Yes. The mean data for : URT which would be (URT1, URT2 and URT3) and the ind...

Alright so when i run your code, it runs and creates the mean value, but can u let me see your raw data so that i can check it real fast?

quaint loom Mar 14, 2024, 12:45 PM

#

lapis sequoia Alright so when i run your code, it runs and creates the mean value, but can u l...

I can PM you the data.

quaint loom Mar 14, 2024, 1:01 PM

#

quaint loom So I have trying to make a temporal view from my data. 2 variable. Each variable...

Problem was not solved. I am open for hearing suggestions.

quaint spade Mar 14, 2024, 1:08 PM

#

hey everyone , just got started with computer science and theres a course on databases , mainly ERDs , i was wondering if any of you have sources for free exercises i can try and maybe even software for the diagrams , thanx in advance

desert oar Mar 14, 2024, 1:28 PM

#

quaint loom So I have trying to make a temporal view from my data. 2 variable. Each variable...

can you at least state what dimensions your data has? i see your data looks like this? Experiment, Area (is that the legend?), Day, CH4, and CO2 -- and what are the colors?

#

ideally you could share a sample dataframe

desert oar Mar 14, 2024, 1:29 PM

#

quaint spade hey everyone , just got started with computer science and theres a course on dat...

you might want to ask in #databases , but check the pinned messages & search the chat history in that channel before asking your question. "can someone recommend a course" questions tend to get the same answer over and over.

potent sky Mar 14, 2024, 1:30 PM

#

Is there a stdlib way to do memory profiling?

quaint loom Mar 14, 2024, 1:32 PM

#

desert oar can you at least state what dimensions your data has? i see your data looks like...

What do you mean by "dimension" in this context?
I call it an experiment because the same procedure has been done twice, just at two different times.
Area would be the legend. The color would represent the individual spots for each area.

I can check out #databases for course.

desert oar Mar 14, 2024, 1:33 PM

#

quaint loom What do you mean by saying "Dimension" in this contex? I call it experiment has...

a "dimension" is like an identifier. so area is a "dimension", day is a "dimension", etc. all the "IDs" that aren't "data", if you will.

quaint loom Mar 14, 2024, 1:34 PM

#

desert oar a "dimension" is like an identifier. so area is a "dimension", day is a "dimensi...

Got it.

desert oar Mar 14, 2024, 1:34 PM

#

got it. so your data is like this?

Experiment | Day | Area | Measurement | Value
---------------------------------------------
1          | 1   | URT  | δ13C-ΣCO2   | 10.5
...

quaint loom Mar 14, 2024, 1:35 PM

#

desert oar Mar 14, 2024, 1:35 PM

#

so you have 1 measurement each of CH4 and CO2, per bucket per day?

#

or do you have repeats per bucket per day?

quaint loom Mar 14, 2024, 1:37 PM

#

desert oar so you have 1 measurement each of CH4 and CO2, per bucket per day?

The first mention. I have one measurement of CH4 and CO2, per bucket, per day.

desert oar Mar 14, 2024, 1:37 PM

#

quaint loom The first mention. I have one measurement of CH4 and CO2, per bucket, per day.

okay, so what are you trying to visualize? the daily average across buckets? or you just want to plot that full data?

raw mortar Mar 14, 2024, 1:39 PM

#

potent sky Is there a stdlib way to do memory profiling?

As far as I'm aware, there isn't
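For reference, the standard library does include `tracemalloc`, which traces memory blocks allocated by Python. A minimal sketch, with a made-up workload standing in for the code under investigation:

```python
import tracemalloc

tracemalloc.start()

before = tracemalloc.take_snapshot()
data = [bytes(1000) for _ in range(10_000)]  # hypothetical workload: roughly 10 MB
after = tracemalloc.take_snapshot()

current, peak = tracemalloc.get_traced_memory()  # bytes traced now / at peak

# Source lines whose allocations grew the most between the two snapshots.
top_stats = after.compare_to(before, "lineno")
for stat in top_stats[:3]:
    print(stat)

tracemalloc.stop()
```

It only sees allocations that go through Python's memory allocators, so memory that a native extension allocates on its own may not show up; that is one reason dedicated profilers are still worth a look.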

quaint loom Mar 14, 2024, 1:39 PM

#

What I initially want to plot is the mean value for each bucket given the same name:

Mean value in grey of all URT (URT1, URT2, URT3) and additionally each individual given a color. And so on for the other. CV will only have 1 tho. and CH2 two

desert oar Mar 14, 2024, 1:39 PM

#

quaint loom What I initially want to plot is the mean value for each bucket given the same n...

mean value across days? experiments? both?

#

because i thought you only had one measurement from each bucket per experiment per day. so there's nothing to take the mean of

quaint spade Mar 14, 2024, 1:40 PM

#

desert oar you might want to ask in #databases , but check the pinned messages &...

thanks mate

desert oar Mar 14, 2024, 1:40 PM

#

@quaint loom you also mentioned an "individual" -- what's that in this context?

quaint loom Mar 14, 2024, 1:41 PM

#

desert oar mean value across days? experiments? both?

The mean value of ex URT1, URT2 and URT3. The individual would be shown as 1 plot for URT 1, one plot for URT 2 and 1 plot for URT 3

desert oar Mar 14, 2024, 1:43 PM

#

quaint loom The mean value of ex URT1, URT2 and URT3. The individual would be shown as 1 plo...

yes but the mean value aggregating across which other variable?

desert oar Mar 14, 2024, 1:44 PM

#

quaint loom

is this the table for one experiment? or is this the whole data?

#

don't be coy here

raw mortar Mar 14, 2024, 1:46 PM

#

I tried to read to get the context, but i didn't get it

#

Maybe a diagram or some sample showing input and output makes more sense?

quaint loom Mar 14, 2024, 1:47 PM

#

desert oar is this the table for _one_ experiment? or is this the whole data?

It's the same table for both experiments.

quaint loom Mar 14, 2024, 1:48 PM

#

desert oar yes but the mean value aggregating across which other variable?

It could be better if I chose the average value instead.

desert oar Mar 14, 2024, 1:48 PM

#

quaint loom Is the same tavke for both experiment.

i think you might need to share example data. or step back and provide a more detailed explanation of your data

#

i feel like you're saying conflicting things

#

# Columns: Bucket, Day, Experiment, δ13C-ΣCO2, δ13C-ΣCH4
data: pd.DataFrame = ...

data = data.set_index(["Bucket", "Day", "Experiment"])

is this what your data looks like, or no?

quaint loom Mar 14, 2024, 1:49 PM

#

https://paste.pythondiscord.com/JBVA

quaint loom Mar 14, 2024, 1:51 PM

#

desert oar ```python # Columns: Bucket, Day, Experiment, δ13C-ΣCO2, δ13C-ΣCH4 data: pd.Data...

How can I share a example of the data similar as you did here?

desert oar Mar 14, 2024, 1:53 PM

#

quaint loom How can I share a example of the data similar as you did here?

!code

arctic wedgeBOT Mar 14, 2024, 1:53 PM

#

Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

desert oar Mar 14, 2024, 1:55 PM

#

quaint loom https://paste.pythondiscord.com/JBVA

i want to know what the raw data looks like, i.e. what's in the excel sheet. the screenshot you posted shows one measurement of CH4 and one measurement of CO2 per day, per bucket. it sounds like that is in fact representative of your data. but now you're talking about averaging and i'm trying to figure out what you want to average over.

quaint loom Mar 14, 2024, 1:59 PM

#

desert oar i want to know what the raw data looks like, i.e. what's in the excel sheet. the...

My internet seems to be a bit slow, so I can't paste the data anywhere.

Maybe I am also not using the vocabulary accurately (as you've mentioned). I'm not sure whether it would be more suitable to use the word average or mean. But I would like to plot the mean/average of URT1, URT2, URT3 etc. for each day.

I will share a picture of the dataset until I find a solution, or you could add me as a friend and I can share the Excel file.

desert oar Mar 14, 2024, 2:00 PM

#

quaint loom My internet seem to be a bit slow, so I can`t paste the data somewhere. Maybe ...

plot the Mean/Average of URT 1, URT2, URT 3 etc for each day.
you want to plot that average across all 4 experiments that were carried out each day?

quaint loom Mar 14, 2024, 2:00 PM

#

potent sky Mar 14, 2024, 2:01 PM

#

Hmm then any as-good-as-stdlib de facto way?

desert oar Mar 14, 2024, 2:02 PM

#

@quaint loom for each of CH4 and CO2, you want to create a plot with an X axis that says "Day" and a Y axis that's the average CH4 or CO2 level, across all buckets?

#

or you want the average within each bucket? but then i don't know what you want to average over.

quaint loom Mar 14, 2024, 2:02 PM

#

desert oar > plot the Mean/Average of URT 1, URT2, URT 3 etc for each day. you want to plo...

Maybe that is where the issue lies. The way you phrase it confuses me, but it sounds logical somehow. Just the average/mean for each given day: 1, 2, 3 ... 12

desert oar Mar 14, 2024, 2:03 PM

#

# Columns: Bucket, Day, δ13C-ΣCO2, δ13C-ΣCH4
data: pd.DataFrame = ...

data = data.set_index(["Bucket", "Day"])

so your data is like this?

#

where Bucket and Day uniquely identify 1 measurement of δ13C-ΣCO2 and 1 measurement of δ13C-ΣCH4?

raw mortar Mar 14, 2024, 2:05 PM

#

potent sky Hmm then any as-good-as-stdlib de facto way?

For memory profiling you mean?

potent sky Mar 14, 2024, 2:05 PM

#

Yes

quaint loom Mar 14, 2024, 2:05 PM

#

desert oar @quaint loom for each of CH4 and CO2, you want to create a plot with an...

One graph with CH4 (Experiment 1), one graph for CO2 (Experiment 1), and the same for Experiment 2
X axis I want days, Y, the value of CH4 or CO2

potent sky Mar 14, 2024, 2:05 PM

#

Sorry the messages in between had not loaded in for me

raw mortar Mar 14, 2024, 2:06 PM

#

potent sky Yes

There's memory-profiler, which is commonly used

quaint loom Mar 14, 2024, 2:06 PM

#

desert oar or you want the average _within_ each bucket? but then i don't know what you wan...

Average/mean of all 3 buckets for each day.

potent sky Mar 14, 2024, 2:07 PM

#

raw mortar These memory-profiler, which is commonly used

Thanks

raw mortar Mar 14, 2024, 2:07 PM

#

I've been having a great time with scalene recently

#

A few of my colleagues have also been testing out memray from Bloomberg, heard it's pretty good

quaint loom Mar 14, 2024, 2:08 PM

#

@desert oar But I also want the individual bucket to be represented in the graph, shown with color.

desert oar Mar 14, 2024, 2:08 PM

#

so you want the individual buckets and the average across buckets?

quaint loom Mar 14, 2024, 2:08 PM

#

desert oar so you want the individual buckets _and_ the average across buckets?

Yes.

desert oar Mar 14, 2024, 2:08 PM

#

i see. that was not clear

#

@quaint loom does this work?

import matplotlib.pyplot as plt
import pandas as pd

# Columns: Bucket, Day, δ13C-ΣCO2, δ13C-ΣCH4
data: pd.DataFrame = ...

fig, axs = plt.subplots(2, 1)

# CO2
ax = axs[0]
ax.set_title("δ13C-ΣCO2")
ax.set_ylabel("Value")
ax.set_xlabel("Day")
for bucket_name, bucket_data in data.groupby("Bucket"):
    ax.scatter(bucket_data["Day"], bucket_data["δ13C-ΣCO2"], label=bucket_name)
ax.legend()

# CH4
ax = axs[1]
ax.set_title("δ13C-ΣCH4")
ax.set_ylabel("Value")
ax.set_xlabel("Day")
for bucket_name, bucket_data in data.groupby("Bucket"):
    ax.scatter(bucket_data["Day"], bucket_data["δ13C-ΣCH4"], label=bucket_name)
ax.legend()

plt.show()

#

oh i didn't add the averages, hang on

quaint loom Mar 14, 2024, 2:14 PM

#

desert oar Mar 14, 2024, 2:15 PM

#

what's this Experiment 1 and Experiment 2...

#

you didn't say anything about that. i was trying to get clarification but it sounded irrelevant

#

anyway what i posted above should at least give you the right idea. just need to separately call ax.scatter for each plot

quaint loom Mar 14, 2024, 2:15 PM

#

Think about it as different days. I thought I mentioned to you that it was the same thing, just different days.

desert oar Mar 14, 2024, 2:16 PM

#

quaint loom Think about it as different days. I thought I mentioned to you that it was the s...

but you also have day on the X axis

#

are they days, or are they something else?

quaint loom Mar 14, 2024, 2:16 PM

#

desert oar but you _also_ have day on the X axis

Experiment 1 is day 1-6 and experiment 2 is day 7-12

desert oar Mar 14, 2024, 2:17 PM

#

ah

#

# Columns: Bucket, Day, δ13C-ΣCO2, δ13C-ΣCH4
data: pd.DataFrame = ...

# reset_index() keeps "Day" as a regular column so daily_avgs["Day"] works below
daily_avgs = data.groupby("Day")[["δ13C-ΣCH4", "δ13C-ΣCO2"]].mean().reset_index()

fig, axs = plt.subplots(2, 1)

# CO2
ax = axs[0]
ax.set_title("δ13C-ΣCO2")
ax.set_ylabel("Value")
ax.set_xlabel("Day")
for bucket_name, bucket_data in data.groupby("Bucket"):
    ax.scatter(bucket_data["Day"], bucket_data["δ13C-ΣCO2"], label=bucket_name)
ax.scatter(daily_avgs["Day"], daily_avgs["δ13C-ΣCO2"], label="average")
ax.legend()

# CH4
ax = axs[1]
ax.set_title("δ13C-ΣCH4")
ax.set_ylabel("Value")
ax.set_xlabel("Day")
for bucket_name, bucket_data in data.groupby("Bucket"):
    ax.scatter(bucket_data["Day"], bucket_data["δ13C-ΣCH4"], label=bucket_name)
ax.scatter(daily_avgs["Day"], daily_avgs["δ13C-ΣCH4"], label="average")
ax.legend()

plt.show()

that should do it with 2 plots, one for each chemical you're measuring

#

if you want to split it up into 4 plots per day you'll have to do more work

#

but again you might want to look into seaborn, it automates a lot of this for you

#

i don't personally like seaborn very much, but it does make it a little easier to make multi-faceted plots like this

quaint loom Mar 14, 2024, 2:18 PM

#

desert oar i don't personally like seaborn very much, but it does make it a little easier t...

Yea, I like seaborn.

desert oar Mar 14, 2024, 2:19 PM

#

this is how you can define the experiment column easily:

data["Experiment"] = np.where(data["Day"] >= 7, 2, 1)

#

and that lets you now set Experiment as a facet
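
A minimal sketch of the facet idea, assuming the column names from the snippets above and the split given earlier (experiment 1 is days 1-6, experiment 2 is days 7-12). The helper names `add_experiment_column` and `plot_by_experiment` and the sample values are made up for illustration; the plotting call uses seaborn's `relplot` with `col="Experiment"` and is only one of several ways to do this.

```python
import numpy as np
import pandas as pd


def add_experiment_column(data: pd.DataFrame) -> pd.DataFrame:
    """Label days 1-6 as experiment 1 and days 7-12 as experiment 2."""
    out = data.copy()
    out["Experiment"] = np.where(out["Day"] >= 7, 2, 1)
    return out


def plot_by_experiment(data: pd.DataFrame, value_col: str):
    """One scatter panel per experiment, points colored by bucket."""
    import seaborn as sns  # imported lazily so the labelling step works without seaborn

    return sns.relplot(
        data=data, x="Day", y=value_col,
        hue="Bucket", col="Experiment", kind="scatter",
        facet_kws={"sharex": False},  # each experiment covers its own day range
    )


# Made-up placeholder values, only to show the shape of the data.
example = pd.DataFrame({
    "Bucket": ["A", "B"] * 4,
    "Day": [1, 1, 6, 6, 7, 7, 12, 12],
    "δ13C-ΣCO2": [-10.0, -11.0, -12.0, -13.0, -9.0, -9.5, -8.0, -8.5],
})
labelled = add_experiment_column(example)
```

Calling `plot_by_experiment(labelled, "δ13C-ΣCO2")` would then give two panels side by side, one per experiment, with the buckets distinguished by color.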

quaint loom Mar 14, 2024, 2:21 PM

#

desert oar this is how you can define the experiment column easily: ```python data["Experim...

Dank. I have been filtering the days to separate them from the other days: experiment2_filter = [6, 7, 8, 9, 10, 11] Not sure if you looked at my code

desert oar Mar 14, 2024, 2:28 PM

#

quaint loom Dank. I have been filtering the days, so separate them from other day: experimen...

i did but it was 160 lines of stuff that i didn't fully understand, so i hope you'll excuse me for not trying to analyze it too carefully 😅

quaint loom Mar 14, 2024, 2:30 PM

#

desert oar i did but it was 160 lines of stuff that i didn't fully understand, so i hope yo...

All good. Thank you for your input, Salty rocky l@mp

desert oar Mar 14, 2024, 2:31 PM

#

quaint loom All good. Thank you for your input, Salty rocky l@mp

of course. in the future it helps to define your terminology up-front to avoid back and forth interview like this

quaint loom Mar 14, 2024, 2:32 PM

#

desert oar of course. in the future it helps to define your terminology up-front to avoid b...

Im working on the terminology ☺️

tawdry plover Mar 14, 2024, 3:12 PM

#

Why is numpy.linalg.det() so inaccurate?

#

or is there something wrong with my algorithm?

#

(I'm using reduction to upper triangular matrix method)

final kiln Mar 14, 2024, 3:30 PM

#

i got tested cpp code bound to rust, tested in the rust code

#

it also compiled with torch

#

now I'm gonna try to get a tensor operation to compute in the cpu via the cpp code

merry ridge Mar 14, 2024, 3:52 PM

#

tawdry plover (I'm using reduction to upper triangular matrix method)

If you are already transforming it into an upper triangular matrix why do you even need to compute the determinant with a function like that in the first place.

tawdry plover Mar 14, 2024, 3:53 PM

#

merry ridge If you are already transforming it into an upper triangular matrix why do you ev...

I'm using np.linalg.det() for testing if my algorithm works

#

apparently I'm passing 614/10000 tests

#

and getting an average of 0.42% of error

#

wait wait

merry ridge Mar 14, 2024, 3:59 PM

#

I don't really understand what you are doing at all where you can conclude that the method is numerically unstable as opposed to whatever your algorithm is.

#

What is the condition number and error of one of these matrices that are "so inaccurate"?

tawdry plover Mar 14, 2024, 4:01 PM

#

no numpy was giving me results in the weird floats so i thought it was using heavy optimization tricks which results in inaccuracy

#

I actually fixed my algorithm now

#

numpy was correct

#

9938/10000 passed

#

my bad
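
A small sketch of how a hand-written determinant could be checked, assuming the "reduction to upper triangular matrix" method means Gaussian elimination. This is not the poster's code: `det_elimination` and `det_exact` are illustrative names, and the reference here is exact rational arithmetic rather than `np.linalg.det`. The point is that float results are compared with a relative tolerance (`math.isclose`), since exact equality between two floating-point determinants will fail on most matrices even when both are correct.

```python
from fractions import Fraction
import math


def det_elimination(matrix):
    """Determinant via reduction to upper triangular form with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]  # work on a copy
    n = len(a)
    sign = 1.0
    for col in range(n):
        # Pick the largest pivot in this column to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # no nonzero pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # each row swap flips the sign
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return sign * math.prod(a[i][i] for i in range(n))


def det_exact(matrix):
    """Same elimination with exact rational arithmetic, used as a reference."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


m = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]
approx = det_elimination(m)
exact = det_exact(m)
# Compare floats with a tolerance instead of ==.
matches = math.isclose(approx, float(exact), rel_tol=1e-9, abs_tol=1e-12)
```

The same tolerance-based comparison applies when the reference is `np.linalg.det`; the tolerances chosen here are arbitrary and may need loosening for ill-conditioned matrices.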

final kiln Mar 14, 2024, 4:56 PM

#

final kiln now I'm gonna try to get a tensor operation to compute in the cpu via the cpp co...

CUDA is compiling into rust, just gotta get the syntax right

#

But I'm gonna have to do it without torch for now.

#

No idea how their CUDA API works

#

So I'm just gonna have float pointer passed to the kernel and play around a bit with it

final kiln Mar 14, 2024, 5:38 PM

#

Just got my first kernel run on a GPU

#

Didn't do anything

#

Cuz I didn't code anything, but I can see from Nvidia smi that the process is accessing the GPU as I hit cargo test

#

This was a lot smoother than expected ngl

lapis sequoia Mar 14, 2024, 5:48 PM

#

Afternoon guys, I'm having trouble getting Cuda working on VS Code using Python 3.11 + Pipenv. Even though I have a CUDA enabled GPU and installed CUDA toolkit, torch.cuda.is_available() outputs false and that's it. I can't use my GPU in windows 10.
Does anybody know what the problem might be?

final kiln Mar 14, 2024, 5:49 PM

#

lapis sequoia Afternoon guys, I'm having trouble getting Cuda working on VS Code using Python ...

Can you run nvidia-smi ?

lapis sequoia Mar 14, 2024, 5:50 PM

#

lapis sequoia Mar 14, 2024, 5:50 PM

#

final kiln Can you run nvidia-smi ?

i think so

final kiln Mar 14, 2024, 5:50 PM

#

lapis sequoia i think so

Which command did you use to install pytorch ?

lapis sequoia Mar 14, 2024, 5:51 PM

#

final kiln Which command did you use to install pytorch ?

pipenv install torch==2.2.1

final kiln Mar 14, 2024, 5:51 PM

#

lapis sequoia pipenv install torch==2.2.1

That's not the one for GPU

lapis sequoia Mar 14, 2024, 5:52 PM

#

final kiln That's not the one for GPU

is there a different library for torch gpu?

final kiln Mar 14, 2024, 5:52 PM

#

lapis sequoia is there a different library for torch gpu?

Well actually now that I'm seeing it would work, but only if your CUDA version is 12.1

#

https://pytorch.org/get-started/locally/

lapis sequoia Mar 14, 2024, 5:53 PM

#

final kiln Well actually now that I'm seeing it would work, but only if your CUDA version i...

lapis sequoia Mar 14, 2024, 5:54 PM

#

final kiln Well actually now that I'm seeing it would work, but only if your CUDA version i...

I have cuda 12.4.99

final kiln Mar 14, 2024, 5:54 PM

#

uhm you'd think 12.4 is compatible with 12.1

lapis sequoia Mar 14, 2024, 5:55 PM

#

final kiln uhm you'd think 12.4 is compatible with 12.1

where did you check that it has to be 12.1 specifically?

final kiln Mar 14, 2024, 5:55 PM

#

in the link I sent

lapis sequoia Mar 14, 2024, 5:55 PM

#

final kiln in the link I sent

not sure if it's possible to downgrade cuda version

final kiln Mar 14, 2024, 5:56 PM

#

I've been using the nvidia docker runtime

#

it can also be frustrating at times, but it does free me from these kinds of issues

lapis sequoia Mar 14, 2024, 5:56 PM

#

final kiln I've been using the nvidia docker runtime

what's this?

final kiln Mar 14, 2024, 5:57 PM

#

it's a docker runtime that gives containers cuda capabilities

#

you just do --gpus all when running the container and it becomes available

#

the good part is that pytorch has official docker images

#

so version mismatches are usually not an issue

lapis sequoia Mar 14, 2024, 5:58 PM

#

final kiln it's a docker runtime that gives containers cuda capabilities

which containers are you referring to?

buoyant vine Mar 14, 2024, 5:58 PM

#

Note the docker daemon also needs to be configured normally to use the nvidia container toolkit

lapis sequoia Mar 14, 2024, 5:58 PM

#

sorry im very new to this

final kiln Mar 14, 2024, 5:58 PM

#

ah I code in aws machines that come with it installed

#

but last time I had to install it for some deployments I dont recall having to do that much with it

#

even had a script for it

lapis sequoia Mar 14, 2024, 5:59 PM

#

final kiln even had a script for it

thanks for all your help, I will try to fix it

final kiln Mar 14, 2024, 5:59 PM

#

lapis sequoia sorry im very new to this

this is the worst part of it, dealing with this stuff

lapis sequoia Mar 14, 2024, 6:01 PM

#

final kiln this is the worst part of it, dealing with this stuff

I just blindly ran pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 inside of my pipenv virtual environment lol

#

it's moving something, so we will have to see if it works
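
A quick way to tell which kind of torch build ended up in the environment, sketched under the assumption that the question is "CPU-only wheel or CUDA wheel?". The function name `describe_torch_build` is made up; the attributes it reads (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`) are standard torch ones. The import is guarded so the snippet also runs where torch is not installed.

```python
import importlib.util


def describe_torch_build():
    """Return a small report on the installed torch build."""
    report = {
        "installed": False,
        "version": None,
        "built_with_cuda": None,
        "cuda_available": False,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # torch is not installed in this environment

    import torch

    report["installed"] = True
    report["version"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None on CPU-only wheels
    report["cuda_available"] = torch.cuda.is_available()
    return report


build_report = describe_torch_build()
```

If `built_with_cuda` comes back as `None`, the wheel itself is CPU-only, and no CUDA toolkit version on the machine will make `is_available()` return True; reinstalling from the CUDA wheel index, as in the command above, is the usual fix.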

final kiln Mar 14, 2024, 6:02 PM

#

I think the source of this mess, if I'm not mistaken, is that Nvidia won't let us have a CUDA alternative - not too sure about it, I overheard it in a yt video

#

There's even a video of the creator of Linux flipping them the finger, must be a reason for it >.>

lapis sequoia Mar 14, 2024, 6:06 PM

#

final kiln There's even a video of the creator of Linux flipping them the finger, must be a...

Wait it started to work, but hmmm
None of the installed files are in pipfile

final kiln Mar 14, 2024, 6:07 PM

#

pipfile ?

lapis sequoia Mar 14, 2024, 6:08 PM

#

final kiln pipfile ?

yeah I'm using a pipenv virtual environment

final kiln Mar 14, 2024, 6:10 PM

#

I see, a requirements.txt replacement

lapis sequoia Mar 14, 2024, 6:10 PM

#

final kiln I see, a requirements.txt replacement

something like that

final kiln Mar 14, 2024, 6:11 PM

#

It's in the title

#

There's also poetry

#

I've replaced all this with docker, not sure if it will ever make sense to go back, don't even have a requirements.txt in my project

raw mortar Mar 14, 2024, 6:16 PM

#

final kiln I've replaced all this with docker, not sure if it will ever make sense to go ba...

The dependencies are pinned in the docker file?

final kiln Mar 14, 2024, 6:16 PM

#

Yes

raw mortar Mar 14, 2024, 6:17 PM

#

How is it any different than having a requirement file?

final kiln Mar 14, 2024, 6:17 PM

#

I can pin dependencies from 3 or 4 diff languages

#

There's also env variables and other necessary scripting that can be moved to the docker file

#

So it centralizes a lot of stuff

raw mortar Mar 14, 2024, 6:18 PM

#

Within the docker file itself it's either
Pip install packages...
Vs
Pip install -r requirements.txt

#

I don't see an improvement

final kiln Mar 14, 2024, 6:20 PM

#

The improvement lies in not having an extra requirements.txt file around and in being able to pin down the dependencies from the other languages and other parts of the project

raw mortar Mar 14, 2024, 6:20 PM

#

Lol it just combines both the files

final kiln Mar 14, 2024, 6:20 PM

#

The docker file is also production ready

raw mortar Mar 14, 2024, 6:21 PM

#

final kiln The docker file is also production ready

What does that mean?

final kiln Mar 14, 2024, 6:21 PM

#

raw mortar Lol it just combines both the files

I think you are not seeing the big picture

raw mortar Mar 14, 2024, 6:21 PM

#

raw mortar Within the docker file itself it's either Pip install packages... Vs Pip install...

This doesn't make an improvement

#

Either i have all the packages within the docker file or point the docker file to a requirement file

#

I don't see an improvement

#

Rather it makes the docker file more cluttered

final kiln Mar 14, 2024, 6:23 PM

#

I've listed my reasons already, if you have any argument against it I'm happy to listen to it. But at this time this is the best workflow I came up with.

raw mortar Mar 14, 2024, 6:23 PM

#

raw mortar Either i have all the packages within the docker file or point the docker file t...

@final kiln this basically

final kiln Mar 14, 2024, 6:24 PM

#

raw mortar @final kiln this basically

why duplicate effort and have a requirements.txt file when you can just hardcode it into the dockerfile

raw mortar Mar 14, 2024, 6:25 PM

#

final kiln why duplicate effort and have a requirements.txt file when you can just hardcode...

How is it duplicated though? The docker file can always point to the requirement file, which can get updated independently of each other

#

It's even more clear what's changed in a pr

final kiln Mar 14, 2024, 6:26 PM

#

raw mortar How is it duplicated though? The docker file can always point to the requirement...

you're stating the same thing twice, install these dependencies

#

like, I'm not gonna argue about this, I prefer stuff in one place

raw mortar Mar 14, 2024, 6:26 PM

#

final kiln you're stating the same thing twice, install these dependencies

Buddy the docker file will have a pointer to the requirement
Like
Pip install -r requirements.txt
That's it

final kiln Mar 14, 2024, 6:27 PM

#

raw mortar Buddy the docker file will have a pointer to the requirement Like Pip install -...

I don't like it and I have valid reasons to not like it, there's dependencies that have to be installed in a particular order for example

#

has happened before

raw mortar Mar 14, 2024, 6:27 PM

#

final kiln like, I'm not gonna argue about this, I prefer stuff in one place

Just want to hear your reasoning not an argument

final kiln Mar 14, 2024, 6:27 PM

#

there's dependencies that you only need in a builder step

#

I don't like having one file for setting up my environment, and then a separate file for setting up another part of it, makes no sense when I can just have it laid out right there within its context

raw mortar Mar 14, 2024, 6:28 PM

#

You can have multiple dependencies files

final kiln Mar 14, 2024, 6:29 PM

#

yes, I can have a lot of things

#

doesn't mean it's the most practical way to do it

raw mortar Mar 14, 2024, 6:29 PM

#

Usually I have like deps folder with main, dev, test etc etc

final kiln Mar 14, 2024, 6:29 PM

#

in general I avoid making people jump around my code, I try to minimize the number of indirections

raw mortar Mar 14, 2024, 6:30 PM

#

final kiln yes, I can have a lot of things

So having everything in one file is more practical?

Understandable

final kiln Mar 14, 2024, 6:30 PM

#

raw mortar So having everything in one file is more practical? Understandable

yes, it is, because it's literally just a list

raw mortar Mar 14, 2024, 6:31 PM

#

Who even reads dependencies files at all, i usually look it up once in a quarter to bulk update all together or setup a deps bot

#

pithink

final kiln Mar 14, 2024, 6:31 PM

#

raw mortar Who even reads dependencies files at all, i usually look it up once in a quarter...

I read it when there's an issue

#

or when I need to check a version

#

or when cuda is acting up again

#

hardcoding is how I do it, you do it different, it's fine

raw mortar Mar 14, 2024, 6:32 PM

#

Like core dependencies are usually defined in pyproject.toml, for packages

raw mortar Mar 14, 2024, 6:33 PM

#

raw mortar Like core dependencies are usually defined in pyproject.toml, for packages

For apps to make reproducible build, a requirement file is used to pin all dependencies

final kiln Mar 14, 2024, 6:33 PM

#

raw mortar For apps to make reproducible build, a requirement file is used to pin all depen...

I hardcode the production dependencies in my docker file and my dev dependencies are installed via github actions workflows

spring field Mar 14, 2024, 6:34 PM

#

something that just came to my mind: is cgpt, at least initially, speaking so, you could say, eloquently, because those words may have been rarer in the dataset so they were artificially inflated with weights or whatever and that turned out to just make it choose those words more often in the end?

final kiln Mar 14, 2024, 6:34 PM

#

it allows me to code in any machine in a reproducible way, it fetches me a machine from my cloud provider and exposes me a vscode instance in the browser

raw mortar Mar 14, 2024, 6:34 PM

#

raw mortar For apps to make reproducible build, a requirement file is used to pin all depen...

This is a common pattern used across many projects, you might wanna look up some oss projects on how they do it

final kiln Mar 14, 2024, 6:35 PM

#

raw mortar This is a common pattern used across many projects, you might wanna look up some...

I understand it is a common pattern, but I don't use it as I don't code exclusively in python

#

there's a lot of moving parts in my project

#

which don't fit into that structure

raw mortar Mar 14, 2024, 6:35 PM

#

Neither do I, polyglot ftw

final kiln Mar 14, 2024, 6:37 PM

#

spring field something that just came to my mind: is cgpt, at least initially, speaking so, y...

I think they just selected a good dataset right, at least in the fine tuning stages

raw mortar Mar 14, 2024, 6:37 PM

#

final kiln there's a lot of moving parts in my project

Usually it's not always as complex as we think it is, somebody would have solved the problem we have
If not that will birth a new oss utility 😁

final kiln Mar 14, 2024, 6:37 PM

#

raw mortar Usually it's not always as complex as we think it is, somebody would have solved...

I'm interoping cuda, c++ and rust with torch permeating all of them

#

and I also dont have a gpu

#

if you have a better way of achieving a burn rate of 10 cents an hour, I'm all ears tbh

raw mortar Mar 14, 2024, 6:39 PM

#

final kiln I'm interoping cuda, c++ and rust with torch permeating all of them

I don't know what led to this, but having each as a separate self contained module and using python to interact would be a viable solution

final kiln Mar 14, 2024, 6:40 PM

#

raw mortar I don't know what led to this, but having each as a separate self contained mod...

I just decided I wanted to challenge myself

#

I actually couldn't compile a cuda extension with python, ended up being a lot smoother with rust, go figure

raw mortar Mar 14, 2024, 6:40 PM

#

final kiln I just decided I wanted to challenge myself

Well good luck engineering, don't forget what you originally tried to solve 🤪

final kiln Mar 14, 2024, 6:42 PM

#

raw mortar Well good luck engineering, don't forget what you originally tried to solve 🤪

no, there's solid reasoning behind each decision, I'm exploring the use of compiled languages for machine learning, I believe that a synergy is possible because the compiler can be used to perform early math checks

#

rust is being used because the torch cpp interface is not stable, the only job of the rust torch maintainers is to keep that torch interface stable to its users

#

I decided to code cuda because it's the only way to squeeze out the performance gains from my proposed attention mechanism

#

unless there's a layer that leverages the symmetry of symmetric tensors, but I couldn't even find a way of imposing that with pytorch without a lot of overhead

#

rust runs the training loop in a process, python generates data in another, which means there's never a gpu down time thus being cost effective

past meteor Mar 14, 2024, 6:45 PM

#

My deep learning set up at work finally works 🙏

#

Still don't have all I need, in the sense I'm being bottlenecked by RAM and CPU and not by GPU/vRAM

#

Luxury problem

raw mortar Mar 14, 2024, 6:47 PM

#

@final kiln what is the actual problem you're trying to solve?

final kiln Mar 14, 2024, 6:47 PM

#

raw mortar @final kiln what is the actual problem you're trying to solve?

I'm exploring a couple of research questions

#

I've just laid out one of them

raw mortar Mar 14, 2024, 6:48 PM

#

I still didn't get the intent though, the use of certain tech could have its reasons

final kiln Mar 14, 2024, 6:50 PM

#

raw mortar I still didn't get the intent though, the use of certain tech could have its re...

the intent is to explore the use of compiled languages for machine learning, I believe that a synergy is possible because the compiler can be used to perform early math checks

and to also extract results from my new attention mechanism by reproducing the metaformer study

raw mortar Mar 14, 2024, 6:51 PM

#

But aren't most ml packages compiled to begin with?

final kiln Mar 14, 2024, 6:51 PM

#

raw mortar But aren't most ml packages compiled to begin with?

the compiler can be used to perform early math checks

raw mortar Mar 14, 2024, 6:52 PM

#

Python is used to interact and interface

final kiln Mar 14, 2024, 6:52 PM

#

which are not currently done

raw mortar Mar 14, 2024, 6:52 PM

#

pithink

final kiln Mar 14, 2024, 6:52 PM

#

I don't think you're trying to understand what I'm saying

raw mortar Mar 14, 2024, 6:53 PM

#

final kiln the compiler can be used to perform early math checks

You mean to say optimise out steps?

final kiln Mar 14, 2024, 6:53 PM

#

raw mortar You mean to say optimise out steps?

I mean to detect that a matrix operation will not pan out before running a training loop

#

yes you can have runtime checks

#

but with a compiled language you can have the compiler carry out the work and not have the overhead when training

#

I've also felt the need for stronger type checking in the models as my projects grew larger

raw mortar Mar 14, 2024, 6:57 PM

#

Still, I'm wondering: the data/matrix only comes into play at runtime, so how can a compiler find this out at compile time?

final kiln Mar 14, 2024, 6:58 PM

#

raw mortar Still, I'm wondering: the data/matrix only comes into play at runtime, so how...

you can, for example, declare a tensor to be of a shape Tensor[a, b, c] and use symbolic calculus to infer the rest

#

I managed to trick the rust compiler into it using macros
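
To make the "declare a shape, infer the rest" idea concrete, here is a rough Python sketch. It is not the Rust macro approach described above and it runs at import time rather than compile time; it only illustrates that shape compatibility can be checked from declarations alone, before any tensor data or training loop exists. `TensorSpec`, `Dim` and `matmul_spec` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Dim = Union[int, str]  # a concrete size or a symbolic name such as "batch"


@dataclass(frozen=True)
class TensorSpec:
    """Shape-only description of a tensor; holds no data."""
    shape: Tuple[Dim, ...]


def matmul_spec(lhs: TensorSpec, rhs: TensorSpec) -> TensorSpec:
    """Infer the result shape of lhs @ rhs (2-D only), or raise if inner dims differ."""
    if len(lhs.shape) != 2 or len(rhs.shape) != 2:
        raise ValueError("this sketch only handles 2-D operands")
    (rows, inner_l), (inner_r, cols) = lhs.shape, rhs.shape
    if inner_l != inner_r:
        raise ValueError(f"inner dimensions differ: {inner_l!r} vs {inner_r!r}")
    return TensorSpec((rows, cols))


x = TensorSpec(("batch", "features"))
w = TensorSpec(("features", 64))
hidden = matmul_spec(x, w)  # shape is ("batch", 64)

# A mismatched layer is rejected without ever allocating a tensor.
try:
    matmul_spec(hidden, TensorSpec((32, 10)))
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```

Symbolic dimensions are compared by name only, so two different names are treated as incompatible even if they would happen to be equal at runtime; a real system would need a richer notion of symbolic equality.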

raw mortar Mar 14, 2024, 6:59 PM

#

final kiln you can, for example, declare a tensor to be of a shape Tensor[a, b, c] and use ...

On the arch yes

final kiln Mar 14, 2024, 6:59 PM

#

but it's an open ended exploration, idk how far it will get and if rust is even the right language for it