#data-science-and-ml | Python | Page 142

lapis sequoia Aug 20, 2024, 7:55 PM

#

i said pixels but it's not necessary, i meant 'image' and said instead 'set of pixels'

#

each image is the 'item'

#

each image is a random variable, and those variables are iid

spare forum Aug 20, 2024, 8:03 PM

#

Yeah mb

lapis sequoia Aug 20, 2024, 8:04 PM

#

np, i appreciate replies :-)

spare forum Aug 20, 2024, 8:05 PM

#

If you flatten the image, it's basically a big vector, which can be considered an observation of random variables

lapis sequoia Aug 20, 2024, 8:05 PM

#

thanks. in the wikipedia though are the x_i the inputs or outputs?

#

im confused, but i think it explains why one uses logs

serene grail Aug 20, 2024, 8:06 PM

#

lapis sequoia im confused, but i think it explains why one uses `logs`

more computationally efficient it says

lapis sequoia Aug 20, 2024, 8:07 PM

#

yeah, idk why!

#

pretty low level reasons surely
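One concrete reason for the logs, as a rough sketch: a likelihood is a product of many numbers below 1, and in float64 that product underflows to 0.0, while the sum of the logs stays finite. The probabilities below are made-up values, not from any real dataset.

```python
import numpy as np

# Hypothetical per-sample probabilities for 2000 iid observations (made-up values).
rng = np.random.default_rng(0)
probs = rng.uniform(0.1, 0.9, size=2000)

likelihood = np.prod(probs)             # product of 2000 numbers < 1 underflows
log_likelihood = np.sum(np.log(probs))  # same information, numerically stable

print(likelihood)      # 0.0 in float64
print(log_likelihood)  # a finite negative number
```

Since log is monotonic, the theta that maximises the log likelihood also maximises the likelihood, so nothing is lost by switching.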

spare forum Aug 20, 2024, 8:07 PM

#

x_i are generally the observation/value from a variable X so yeah the input

lapis sequoia Aug 20, 2024, 8:08 PM

#

if one thinks of the cross entropy formula, imho there is smth off

#

i know its not the CE, but resembles it

#

i think it'd be P(y_i~theta)^y_ir

spare forum Aug 20, 2024, 8:10 PM

#

It's the log likelihood you're speaking about, I think?

lapis sequoia Aug 20, 2024, 8:10 PM

#

yes

#

just needs the power of the y_ir i.e 'real' imho

spare forum Aug 20, 2024, 8:11 PM

#

Hmm not really

lapis sequoia Aug 20, 2024, 8:12 PM

#

#

the power is P(y_ir), taking the log gives the cross entropy (image above.)

#

so i think x_i in Wikipedia can also be the outputs of the network (normally y_i), which can also be considered random variables

#

if that's correct, that would mean that the network is trying to fit the statistical distribution of the outputs (and that power/exponent P(y_ir), somehow.) ?

#

(note that the minus is because they are trying to maximise, one could add it and minimise i presume.)
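A small numeric check of the idea above, assuming binary labels: the Bernoulli likelihood has the real label as the exponent, and taking minus the log (averaged over samples) gives the usual binary cross-entropy. `y_true` and `p_pred` are made-up illustration values.

```python
import numpy as np

# Made-up real labels and predicted probabilities P(y=1 | x, theta).
y_true = np.array([1, 0, 1, 1, 0])
p_pred = np.array([0.9, 0.2, 0.7, 0.6, 0.1])

# Bernoulli likelihood: the real label appears as the exponent.
likelihood = np.prod(p_pred**y_true * (1 - p_pred)**(1 - y_true))

# Binary cross-entropy, written the usual way.
cross_entropy = -np.mean(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))

# Minus the log of the likelihood, averaged, is the same number.
nll = -np.log(likelihood) / len(y_true)
print(np.isclose(nll, cross_entropy))  # True
```

So maximising the likelihood and minimising the cross-entropy are the same optimisation, just with the sign flipped.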

#

gotta go

serene grail Aug 20, 2024, 8:21 PM

#

Bye!

wooden sail Aug 20, 2024, 8:22 PM

#

lapis sequoia (note that the minus is because they are trying to maximise, one could add it an...

the negative comes from the definition of shannon information

serene scaffold Aug 20, 2024, 8:22 PM

#

who the fuck is shannon

wooden sail Aug 20, 2024, 8:23 PM

#

this is also not a proof, it's just a list of reasonable assumptions. none of them are correct, but they're often close enough for practical use

wild coral Aug 20, 2024, 8:33 PM

#

im tryna do scipy.optimize.curve_fit

#

and when i give an initial guess for params, it literally doesnt change those params

#

when i give the default value for params, all 1, then it gives an overflow warning but at least it slightly changes it

#

anyone get this issue

iron basalt Aug 20, 2024, 8:40 PM

#

There are entire categories of NNs that don't operate under an i.i.d. assumption. But "deep learning" (backpropagation based) does (there are some exceptions).

#

As Edd also wrote, it's an assumption. You often don't have a nice i.i.d. situation.

wooden sail Aug 20, 2024, 8:45 PM

#

wild coral when i give the default value for params, all 1, then it gives an overflow warni...

what function are you trying to fit? sadly, general non convex optimization has no guarantees and your success depends very strongly on your initial guess

wild coral Aug 20, 2024, 8:48 PM

#

It’s a custom function… basically I take like 37 params and form them into a grid of 37 by 110, where in each column we have (1-1/p)^column, and then convolve that grid with my data lol

#

I do have an extra parameter that I haven’t implemented how to fit yet, but it is still being passed into the fitter could that be a reason?

wooden sail Aug 20, 2024, 8:50 PM

#

i don't get how you convolve that with the data

wild coral Aug 20, 2024, 8:53 PM

#

It’s numpy.convolve, I take a row of that grid and I take the corresponding row of my data (which is 36 by 37 by 110) and convolve the slice of the data and the row of the grid which is the kernel fully, so both inputs to convolve are array of length 110. Convolve fully and take the first 110 values
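A minimal sketch of the model as described, assuming the column index starts at 0. The names `build_kernel` and `smooth` are hypothetical, and the data and parameters are random stand-ins.

```python
import numpy as np

def build_kernel(params, n_cols=110):
    """Grid of shape (len(params), n_cols) with entry (1 - 1/p)**column."""
    cols = np.arange(n_cols)
    return (1.0 - 1.0 / params[:, None]) ** cols[None, :]

def smooth(data, params):
    """Convolve each data[k, i, :] with kernel row i; keep the first n_cols values."""
    n_cols = data.shape[-1]
    kernel = build_kernel(params, n_cols)
    out = np.empty_like(data, dtype=float)
    for k in range(data.shape[0]):
        for i in range(data.shape[1]):
            out[k, i] = np.convolve(data[k, i], kernel[i], mode="full")[:n_cols]
    return out

rng = np.random.default_rng(0)
data = rng.normal(size=(36, 37, 110))     # stand-in for the simulated signal
params = rng.uniform(2.0, 20.0, size=37)  # made-up smoothing constants

smoothed = smooth(data, params)
kernel = build_kernel(params)
print(smoothed.shape)  # (36, 37, 110)
```

With every p equal to 1 the base 1 - 1/p is exactly 0, so the kernel is [1, 0, 0, ...] and the output is just the input. For p below 0.5 the base is larger than 1 in magnitude, so high powers blow up. That might be related to the overflow warning, but it's only a guess from the description.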

wooden sail Aug 20, 2024, 8:55 PM

#

that sounds like something where a clever vectorization should let you use a pseudoinverse

#

if p and the column numbers are fixed, anyway

wild coral Aug 20, 2024, 8:57 PM

#

? wdym

#

you mean the SVD inverse?

wooden sail Aug 20, 2024, 8:57 PM

#

yeah

wild coral Aug 20, 2024, 8:57 PM

#

what would that do

wooden sail Aug 20, 2024, 8:57 PM

#

get you back the parameters, so that your estimate is the application of the function to the parameters

wild coral Aug 20, 2024, 8:59 PM

#

i could get back to the parameters by doing 1/(1- kernel[:,1])

wooden sail Aug 20, 2024, 9:00 PM

#

then what exactly are you trying to fit?

wild coral Aug 20, 2024, 9:01 PM

#

theres my simulated signal and an experimental signal, and if I apply the convolution to my simulated signal it should smooth out the signal, and we are trying to tune the constants of smoothing, our fitting parameters, such that it fits best to the experimental smoothness of signal

wooden sail Aug 20, 2024, 9:02 PM

#

and the constants are those 37 params?

wild coral Aug 20, 2024, 9:03 PM

#

yea

wooden sail Aug 20, 2024, 9:04 PM

#

wild coral i could get back to the parameters by doing 1/(1- kernel[:,1])

doing this gives you back the parameters you chose, then. not the ones that fit the data

#

the pseudo inv should give you the best fit of the params to the data. at least with the model as you described it now, since you presented only linear operations (after reparametrizing the entries of the matrix, for clarity)

#

maybe i misunderstood though. if you can present the problem a bit more clearly, someone should be able to give more help while i go sleep

wild coral Aug 20, 2024, 9:07 PM

#

wait im confused, the function I pass to scipy.optimize.curve_fit takes in 37 parameters, then in that function I tell it how to transform the list of parameters into the 37 by 110 grid, which then convolves it. so the final output of scipy.optimize.curve_fit is 37 parameters which supposedly minimize the nonlinear least squares from my simulated signal to experimental signal

wooden sail Aug 20, 2024, 9:07 PM

#

yes

wild coral Aug 20, 2024, 9:08 PM

#

so i dont get how taking pseudoinv would help

wooden sail Aug 20, 2024, 9:08 PM

#

what is p in the expression you gave above

wild coral Aug 20, 2024, 9:08 PM

#

the parameter

wooden sail Aug 20, 2024, 9:08 PM

#

aha, there we go

wild coral Aug 20, 2024, 9:08 PM

#

sry

wooden sail Aug 20, 2024, 9:08 PM

#

then no, pinv doesn't help

#

yeah that's a general nonlinear problem. things you can try include: giving curve_fit the jacobian explicitly (its `jac` argument; it doesn't take a hessian), and running it several times with different initial conditions
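A rough sketch of both suggestions on a toy two-parameter model (not the 37-parameter one from the chat): pass an analytic jacobian through `jac` and restart from several random initial guesses, keeping the best fit. The model, noise level and guess range are all made up.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(-b * x)

def model_jac(x, a, b):
    # Columns are d(model)/da and d(model)/db, shape (len(x), 2).
    e = np.exp(-b * x)
    return np.column_stack([e, -a * x * e])

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = model(x, 2.5, 1.3) + rng.normal(scale=0.01, size=x.size)

best_params, best_cost = None, np.inf
for _ in range(10):
    p0 = rng.uniform(0.1, 5.0, size=2)  # a different starting point each run
    try:
        popt, _ = curve_fit(model, x, y, p0=p0, jac=model_jac)
    except RuntimeError:                # raised when the fit does not converge
        continue
    cost = np.sum((model(x, *popt) - y) ** 2)
    if cost < best_cost:
        best_params, best_cost = popt, cost

print(best_params)  # should land near a=2.5, b=1.3
```

The jacobian function takes the same arguments as the model and returns one column per parameter.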

wild coral Aug 20, 2024, 9:10 PM

#

aite thanks

wooden sail Aug 20, 2024, 9:10 PM

#

since you have the function, you can get the derivatives fairly easily

wild coral Aug 20, 2024, 9:10 PM

#

i have absolutely no idea how to get the derivatives 💀

#

idk how derivatives handle array splicing and indexing

wooden sail Aug 20, 2024, 9:13 PM

#

kinda nastily tbh, each order of derivatives adds an extra dimension to the array (you can avoid this entirely by rewriting everything in sigma notation and differentiating component-wise, but it can be tedious)

#

but tbh since the problem is anyway nonlinear, i would take a look here https://docs.scipy.org/doc/scipy/tutorial/optimize.html#global-optimization

#

try some gradient-free optimization methods that use heuristics to try and find a global optimum (but have no nice guarantees)

#

the dual_annealing method is fairly standard

#

https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.dual_annealing.html for reference
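For reference, a minimal sketch of calling `dual_annealing` on a least-squares cost. It uses a made-up two-parameter toy model, and the bounds are assumptions chosen for the toy data.

```python
import numpy as np
from scipy.optimize import dual_annealing

def model(x, a, b):
    return a * np.exp(-b * x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = model(x, 2.5, 1.3) + rng.normal(scale=0.01, size=x.size)

def cost(params):
    # Sum of squared residuals, the same quantity curve_fit minimises.
    return np.sum((model(x, *params) - y) ** 2)

# dual_annealing needs finite bounds for every parameter; no initial guess required.
result = dual_annealing(cost, bounds=[(0.1, 10.0), (0.0, 10.0)], maxiter=200)
print(result.x, result.fun)
```

The trade-off is many more function evaluations than a local fit, which could get slow with 37 parameters and a convolution inside the model.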

wild coral Aug 20, 2024, 9:17 PM

#

thanks,

#

the thing is this was previously implemented in C using levenberg marquardt, which is relatively simpler than what I am reading here, however they did this all custom, and the convolution was defined recursively rather than absolutely

iron basalt Aug 20, 2024, 9:21 PM

#

wild coral the thing is this was previously implemented in C using levenberg marquardt, wh...

When looking at C code, it will often be all done manually, but also this manual implementation is often the result of a lot of upfront work elsewhere and that is just the end result written out which makes it hard to follow from the code alone. In Python you will usually find more high level functions called that just solve it for you.

#

Those high level functions have a ton of options, because they need to cover everything, take your time reading them.

wooden sail Aug 20, 2024, 9:23 PM

#

the scipy curve_fit func also uses levenberg-marquardt (at least by default, when you don't pass bounds)

#

if you don't explicitly pass the derivatives, some finite difference approx is used for the jacobian (the hessian is only approximated from it)

#

i'm not sure what criteria it uses to choose a step size, i'm sure it's an issue for badly behaving functions though

hard fern Aug 20, 2024, 11:16 PM

#

alright lets go, whats everyones favourite data science python libraries

#

bonus points for libraries i havent heard of

proper crag Aug 21, 2024, 1:34 AM

#

can i use hugging face for this project to host the model ?

#

like can the model inside hugging face still can accept time-streaming data from my local machine?

#

the pipeline occur locally

frosty fulcrum Aug 21, 2024, 3:28 AM

#

Does anyone know the best models to generate similar images?

small wedge Aug 21, 2024, 4:22 AM

#

frosty fulcrum Does anyone know the best models to generate similar images?

similar to what?

frosty fulcrum Aug 21, 2024, 4:42 AM

#

small wedge similar to what?

Similar to the original image duh

small wedge Aug 21, 2024, 4:45 AM

#

frosty fulcrum Similar to the original image duh

are you looking to build this from scratch or just for a tool to use?

frosty fulcrum Aug 21, 2024, 4:46 AM

#

small wedge are you looking to build this from scratch or just for a tool to use?

I’m looking for some pre-trained models to use for my research project.

small wedge Aug 21, 2024, 4:50 AM

#

basically any autoencoder trained on images, especially images in a similar style to the ones you want to input, should work

#

you could just encode the image, add an extremely tiny bit of noise, then decode

#

VAE's specifically as this is kinda their bread and butter, might not even need extra noise from the variational nature of them

faint quail Aug 21, 2024, 5:24 AM

#

```
Traceback (most recent call last):
  File "C:\Users\ekila\Downloads\Neural Network Framkework\main.py", line 247, in <module>
    val_xdata = xdata[mask]
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 1.66 PiB for an array with shape (387158016, 3, 448, 448) and data type float64
```

I think I need some more memory 💀
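A quick check of where the 1.66 PiB comes from, using only the shape and dtype printed in the traceback:

```python
import math
import numpy as np

shape = (387158016, 3, 448, 448)  # shape from the traceback
n_bytes = math.prod(shape) * np.dtype(np.float64).itemsize
print(n_bytes / 2**50)            # about 1.66 (PiB)

# One float64 image of shape (3, 448, 448) is already about 4.6 MiB.
per_image = 3 * 448 * 448 * 8
print(per_image / 2**20)
```

The 387 million on the first axis is the suspicious part. It may be worth printing `mask.shape` and `mask.dtype` before indexing, since an integer array used as an index selects one item per entry. That is only a guess without seeing the code.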

#

Theres a weird error with multiprocessing where you cant send data larger than 2gb back to the main thread, but im trying to train a large computer vision model, how can I overcome this limitation

brave yew Aug 21, 2024, 6:03 AM

#

anyone here use pycharm professional? you guys know how to enable the hugging face tool window on the left bar? i used to have it now its gone

#

nvm i was using the wrong env

hard fern Aug 21, 2024, 7:49 AM

#

why use pycharm over vscode? Genuinely curious ive been using vscode forever and love the extensions

left plover Aug 21, 2024, 7:49 AM

#

faint quail ``` Traceback (most recent call last): File "C:\Users\ekila\Downloads\Neural N...

Petabyte?

#

What the actual

lapis sequoia Aug 21, 2024, 11:25 AM

#

nice section about ML-logistic regression https://en.wikipedia.org/wiki/Logistic_regression#Other_approaches

Logistic regression

In statistics, the logistic model (or logit model) is a statistical model that models the log-odds of an event as a linear combination of one or more independent variables. In regression analysis, logistic regression (or logit regression) estimates the parameters of a logistic model (the coefficients in the linear or non linear combinations). In...

#

For binary class P(Y_i|X_i), none of Y_i|X_i is ||identically distributed||:
if X=age, and Y=disease or not, then Y will be a different distribution for each age. So Y_i are not identically distributed but ||they are independent.||; which is all that is used in the derivation (link above.)

#

visually it'd be like so:

drowsy ice Aug 21, 2024, 1:46 PM

#

So I'm getting the fabled expected 5 got 4 error. I imagine it's common?

serene scaffold Aug 21, 2024, 1:53 PM

#

drowsy ice So I'm getting the fabled expected 5 got 4 error. I imagine it's common?

hello, be sure to always show the whole error message and the code that caused it.

drowsy ice Aug 21, 2024, 1:55 PM

#

it just randomly started working and I have no idea why

#

which is even worse because now idk why I can't fix anything if it goes wrong

#

the biggest change I made was changing the python version

#

Which means the library I was using really was deprecated

#

Hate that

#

Nope that wasn't it. No idea why it worked

rigid timber Aug 21, 2024, 2:35 PM

#

http://hollywood.mit.edu/GENSCAN.html
how can I make a similar model to this

proper crag Aug 21, 2024, 2:43 PM

#

i want to host my model in hugging face but i want it to be able to receive live streaming data too

#

is it possible to do so ?

heavy lily Aug 21, 2024, 2:53 PM

#

Hii anyone will someone volunteer themselves to guide me and my 2 friends to become a good data scientist

lapis sequoia Aug 21, 2024, 2:59 PM

#

this is really cool, hadnt seen it before, i was wondering days ago whether gaussian radial activations were used anywhere

#

they are used in RBF https://en.wikipedia.org/wiki/Radial_basis_function_network

Radial basis function network

In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approxi...

#

imho the X-auto encoders should have less green neurons though, but it's a detail, and maybe im wrong.

red dust Aug 21, 2024, 3:05 PM

#

Hello, I want to create a model that is able to take a picture of some clothing and return some parameters such as color, condition, type of fastener, etc. I have a rather large database with photos and parameters. Can I train a model this way in Keras? Or can I get something already trained?

proper crag Aug 21, 2024, 3:09 PM

#

red dust Hello, I want to create a model that is able to take a picture of some clothing ...

check classification models

#

idk like logistic regression or KNN or anything ig

last rain Aug 21, 2024, 3:12 PM

#

Would you say the AI has stopped improving? The average loss is not changing much

#

Epoch 23, Average Loss: 0.02855170254285137
Epoch 24, Average Loss: 0.027263049941716924
Epoch 25, Average Loss: 0.02655800049089723
Epoch 26, Average Loss: 0.027358725149598386
Epoch 27, Average Loss: 0.02881776740671032
Epoch 28, Average Loss: 0.027877019860574767

thin plaza Aug 21, 2024, 3:13 PM

#

Hello 🤗

red dust Aug 21, 2024, 3:14 PM

#

proper crag idk like logistic regression or KNN or anything ig

Which tools should I use for this?

proper crag Aug 21, 2024, 3:15 PM

#

red dust Which tools should I use for this?

idk anything

#

any library you are comfort to use

#

or even any language

red dust Aug 21, 2024, 3:22 PM

#

proper crag idk anything

Well, I'm a newbie on the DS topic, so I'd rather take something simple. I would also appreciate a literature recommendation. I've started reading Deep Learning with Python, which is about Keras, so I thought it's the right tool

proper crag Aug 21, 2024, 3:24 PM

#

idk i havent explore pytorch/keras and tensorflow yet

#

ive been using sk learn

red dust Aug 21, 2024, 3:30 PM

#

@proper crag I've heard I should use CNN for this case, what do you think?

proper crag Aug 21, 2024, 3:55 PM

#

red dust @proper crag I've heard I should use CNN for this case, what do you thi...

don ask me

#

i havent deploy my 1st model yet

#

let alone to suggest you anything

red dust Aug 21, 2024, 3:57 PM

#

Okay, I get it

left vault Aug 21, 2024, 4:01 PM

#

Hi matplotlibs. Could someone help me plot energy diagram? I got no idea how to achieve what I desire, been trying for quite a long time (three days) and without a success...

spare forum Aug 21, 2024, 4:05 PM

#

Give more details pls

left vault Aug 21, 2024, 4:13 PM

#

I try to reproduce exactly something like this

pine escarp Aug 21, 2024, 4:47 PM

#

What is the best coding environment for data science?
I use Jupyter notebook from anaconda. like the one it comes with. not jupyter lab.
my friend uses vs code,

#

im actually comfortable with jupyter notebook

#

it aint that bad

jaunty helm Aug 21, 2024, 4:47 PM

#

left vault I try to reproduce exactly something like this

crude example

#

```py
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
a = 1 / np.arange(1, 6)
ax.scatter([1] * len(a), a, marker="_", s=5000, linewidths=2)
for y in a:
    ax.annotate(str(y), (1, y), (1.01, y))

plt.show()
```

pine escarp Aug 21, 2024, 4:48 PM

#

jaunty helm ```py import numpy as np import matplotlib.pyplot as plt fig, ax = plt.subplots...

what is 's'?

jaunty helm Aug 21, 2024, 4:49 PM

#

pine escarp what is 's'?

in that graph, how long those lines are on the x axis (strictly it's the marker size, in points squared)

pine escarp Aug 21, 2024, 4:49 PM

#

jaunty helm in that graph, how long those lines are on the x axis

oh, got it
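An alternative sketch using `ax.hlines`, where the bar length is set in data coordinates instead of through the marker size. The level values are illustrative (hydrogen-like, in eV) and the output file name is arbitrary.

```python
import matplotlib.pyplot as plt

# Illustrative energy levels in eV (hydrogen-like values, for the example only).
levels = {"n=1": -13.6, "n=2": -3.4, "n=3": -1.51}

fig, ax = plt.subplots()
for label, energy in levels.items():
    ax.hlines(energy, xmin=0.2, xmax=0.8, linewidth=2)  # one horizontal bar per level
    ax.annotate(f"{label}: {energy} eV", (0.82, energy), va="center")

ax.set_xlim(0, 1.5)
ax.set_xticks([])             # the x axis carries no meaning here
ax.set_ylabel("Energy (eV)")
fig.savefig("energy_levels.png")
```

Because `xmin` and `xmax` are in data units, the bars keep their length relative to the axes when the figure size changes, which `s` in `scatter` does not.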

unreal condor Aug 21, 2024, 5:04 PM

#

last rain Epoch 23, Average Loss: 0.02855170254285137 Epoch 24, Average Loss: 0.0272630499...

Yes, your AI probably converged, which means the loss value reached a minimum
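A quick way to look at the numbers posted above, as a sketch: find the best epoch and how far the loss moves over these six epochs.

```python
# Average losses for epochs 23 to 28, copied from the message above.
losses = [0.02855170254285137, 0.027263049941716924, 0.02655800049089723,
          0.027358725149598386, 0.02881776740671032, 0.027877019860574767]
epochs = list(range(23, 29))

best_loss = min(losses)
best_epoch = epochs[losses.index(best_loss)]
spread = (max(losses) - min(losses)) / min(losses)

print(best_epoch)       # 25
print(f"{spread:.1%}")  # relative spread over these six epochs
```

The loss bounces around inside a band of under 10% with no downward trend, which looks like a plateau. Whether it is a real minimum is a separate question, and a validation loss would say more than the training average alone.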

unreal condor Aug 21, 2024, 5:07 PM

#

red dust Hello, I want to create a model that is able to take a picture of some clothing ...

I would say your problem is probably object detection, YOLO algo is SOTA in that field right now i guess

red dust Aug 21, 2024, 5:12 PM

#

unreal condor I would say your problem is probably **object detection**, YOLO algo is SOTA in ...

Would YOLO be able to return things like kind of fabric, color etc?

unreal condor Aug 21, 2024, 5:15 PM

#

red dust Would YOLO be able to return things like kind of fabric, color etc?

Yes, if there are specific classes for them like: red cloth, yellow cloth, etc

brave yew Aug 21, 2024, 5:37 PM

#

is the hands-on machine learning with scikit-learn book enough to start off finetuning models? or are there any more prerequisites?

fiery bane Aug 21, 2024, 5:44 PM

#

depend on what model you want to finetune

#

if all you want is fine tune

#

then I think reading a book is too much

#

just follow some online tutorial

fiery bane Aug 21, 2024, 5:46 PM

#

pine escarp What is the best coding environment for data science? I use Jupyter notebook fro...

I use vscode. The answer is whatever floats your boats

fiery bane Aug 21, 2024, 5:48 PM

#

red dust Hello, I want to create a model that is able to take a picture of some clothing ...

just use chatgpt?

fiery bane Aug 21, 2024, 5:50 PM

#

red dust @proper crag I've heard I should use CNN for this case, what do you thi...

CNN or ViT are good places to start.
If this is a job that you need to get done, just use chatGPT.
If this is a learning exercise, then yea, use CNN and then ViT

red dust Aug 21, 2024, 5:50 PM

#

fiery bane just use chatgpt?

No because it's too expensive and too slow.

fiery bane Aug 21, 2024, 5:51 PM

#

how slow?

unreal condor Aug 21, 2024, 5:51 PM

#

ChatGPT is more like prompt-learning, there is no training

red dust Aug 21, 2024, 5:53 PM

#

fiery bane how slow?

Eh for several reasons I don't want to use chatGPT

fiery bane Aug 21, 2024, 5:54 PM

#

red dust Eh for several reasons I don't want to use chatGPT

like ChatGPT in particular, or any multi modal LLM?

fiery bane Aug 21, 2024, 5:55 PM

#

unreal condor ChatGPT is more like prompt-learning, there is no training

yea again right,
Is this a task that needs to get done? if it is, then it doesn't matter if there is no training

red dust Aug 21, 2024, 5:55 PM

#

fiery bane like ChatGPT in particular, or any multi modal LLM?

We tried ChatGPT and it was kinda okay, but definitely too expensive. We haven't tried any other LLM

fiery bane Aug 21, 2024, 5:57 PM

#

coz u have type of fastener, then you might need to do some fine tuning, hard to do zero shot stuff

unreal condor Aug 21, 2024, 5:57 PM

#

fiery bane yea again right, Is this a task that needs to get done? if it is, then it doesn'...

Definitely, but the quality is not ensured so it also depends on what you try to achieve.

#

Prompt-learning still got a long way to go and its performance is still not comparable to many SOTA models nowadays

lapis sequoia Aug 21, 2024, 6:01 PM

#

1990s universal fn approximators seem to have been a hot topic

#

fiery bane Aug 21, 2024, 6:01 PM

#

unreal condor Definitely, but the quality is not ensured so it also depends on what you try t...

exactly, really depends on what's the goal

serene grail Aug 21, 2024, 6:02 PM

#

lapis sequoia 1990s universal fn approximators seem to have been a hot topic

Are they a hot topic right now or do you mean in the 90s?

lapis sequoia Aug 21, 2024, 6:02 PM

#

yes, i meant in the 90s, sorry.

fiery bane Aug 21, 2024, 6:03 PM

#

red dust We tried ChatGPT and it was kinda okay, but definitely too expensive. We haven't ...

if you have sufficient amount of data, then go finetune an existing model. But the overhead is kinda big

#

overhead in terms of effort

jaunty helm Aug 21, 2024, 6:04 PM

#

multi modal is overkill if you don't need the text prompting part

lapis sequoia Aug 21, 2024, 6:04 PM

#

this is the paper in case smone wants to give it a shot https://psycnet.apa.org/record/1991-32257-001

fiery bane Aug 21, 2024, 6:04 PM

#

lapis sequoia 1990s universal fn approximators seem to have been a hot topic

I just realized fn is not function, but functional lol

lapis sequoia Aug 21, 2024, 6:05 PM

#

it's quite cited apparently

brave yew Aug 21, 2024, 6:05 PM

#

fiery bane just follow some online tutorial

no like this isn't a one time thing i want to do, i have made simple projects and now want to try finetuning and get good at it

lapis sequoia Aug 21, 2024, 6:05 PM

#

i think it's used in the same way, it's a function of functions

#

because of the hidden layers, a neural network would be a functional to some extent ig

fiery bane Aug 21, 2024, 6:05 PM

#

jaunty helm multi modal is overkill if you don't need the text prompting part

If it is not multimodal, it is kinda hard to do zero shot asking about fastener types

fiery bane Aug 21, 2024, 6:06 PM

#

brave yew no like this isn't a one time thing i want to do, i have made simple projects an...

really depends on what model have you made?

red dust Aug 21, 2024, 6:07 PM

#

fiery bane if you have sufficient amount of data, then go finetune an existing model. But t...

Any recommendations on the existing model and tools for this job? Probably I have to try this way

brave yew Aug 21, 2024, 6:07 PM

#

fiery bane really depends on what model have you made?

oh.. well its just a mnist reader, only difference i could say i have made is that i didn't use libraries so that i can understand the math, and implemented way to save the result of train curves and the model and loading pretrained models

fiery bane Aug 21, 2024, 6:08 PM

#

brave yew oh.. well its just a mnist reader, only difference i could say i have made is th...

There's this, but it is quite a jump https://github.com/EleutherAI/cookbook?tab=readme-ov-file#best-practices

GitHub

GitHub - EleutherAI/cookbook: Deep learning for dummies. All the pr...

Deep learning for dummies. All the practical details and useful utilities that go into working with real models. - EleutherAI/cookbook

brave yew Aug 21, 2024, 6:09 PM

#

i didn't know what else to make now, so i am making a review summarizer webapp, which scrapes reviews and then puts them through a pipeline for zero-shot classification, and charts a graph of sentiment. but for the summarization part i wanted to fine tune a base model instead of relying on pipeline @fiery bane

sour lark Aug 21, 2024, 6:10 PM

#

if im using the jupyter hub extension on vscode does anyone know if i can also use the vim one at the same time

#

alternatively does anyone have recommendations for vim like extensions for fast keyboard shortcuts while using jupyter lab

brave yew Aug 21, 2024, 6:11 PM

#

fiery bane There's this, but it is quite a jump https://github.com/EleutherAI/cookbook?tab=...

wow... this is great, thanks a ton, visualising was a huge problem for me

fiery bane Aug 21, 2024, 6:12 PM

#

red dust Any recommendations on the existing model and tools for this job? Probably I hav...

something like this? https://github.com/google-research/vision_transformer?tab=readme-ov-file#available-vit-models

GitHub

GitHub - google-research/vision_transformer

Contribute to google-research/vision_transformer development by creating an account on GitHub.

lapis sequoia Aug 21, 2024, 6:12 PM

#

sour lark alternatively does anyone have recommendations for vim like extensions for fast ...

vscode has them

#

also colab

fiery bane Aug 21, 2024, 6:13 PM

#

brave yew wow... this is great, thanks a ton, visualising was a huge problem for me

like, that list links to this list https://github.com/stas00/ml-engineering
But maybe just go through the cookbook and see where you end up.
Also depends on your domain as well. This is geared towards llm.
You said you started with mnist, so do you actually care about cv or anything goes?

GitHub

GitHub - stas00/ml-engineering: Machine Learning Engineering Open Book

Machine Learning Engineering Open Book. Contribute to stas00/ml-engineering development by creating an account on GitHub.

brave yew Aug 21, 2024, 6:17 PM

#

fiery bane like, that list link to this list https://github.com/stas00/ml-engineering But m...

since i am in my 3rd sem of uni i was advised to set an upper limit to ml that would be up to fine tuning models but suggested that i explore horizontally, so far i have classified ml into two fields, one is more geared towards data science while the other is more of an ml implementation route, honestly i have no idea on the fields inside ml, none of my peers are really interested nor are my profs

#

i did mnist because i was told it was the "hello world" of ml

red dust Aug 21, 2024, 6:18 PM

#

fiery bane something like this? https://github.com/google-research/vision_transformer?tab=r...

So do you recommend VIT instead of CNN?

lapis sequoia Aug 21, 2024, 6:22 PM

#

Vision transformers are quite good apparently: https://paperswithcode.com/task/image-classification, also Efficient Net (CNN) https://arxiv.org/pdf/1905.11946

fiery bane Aug 21, 2024, 6:35 PM

#

red dust So do you recommend VIT instead of CNN?

I don't have first hand experience, but from what I look around, it seems so

fiery bane Aug 21, 2024, 6:36 PM

#

brave yew since i am in my 3rd sem of uni i was advised to set an upper limit to ml that w...

I think TRL is a better way to think about, rather than vertical / horizontal. or data science vs implementation https://github.com/ai-infrastructure-alliance/mltrl

GitHub

GitHub - ai-infrastructure-alliance/mltrl: Repository for the ML Te...

Repository for the ML Technology Readiness Levels framework - ai-infrastructure-alliance/mltrl

red dust Aug 21, 2024, 6:39 PM

#

fiery bane I don't have first hand experience, but from what I look around, it seems so

Great

#

thank you all for your help

fiery bane Aug 21, 2024, 6:44 PM

#

brave yew since i am in my 3rd sem of uni i was advised to set an upper limit to ml that w...

this is a good list of the fields inside ML https://paperswithcode.com/sota

Papers with Code - Browse the State-of-the-Art in Machine Learning

11372 leaderboards • 5047 tasks • 10460 datasets • 138538 papers with code.

brave yew Aug 21, 2024, 6:45 PM

#

fiery bane this is a good list of the fields inside ML https://paperswithcode.com/sota

so at my stage would it be better to pick one and specialize? or are they interconnected enough that i can pursue more than one?

fiery bane Aug 21, 2024, 6:52 PM

#

brave yew so at my stage would it be better to pick one and specialize? or are they...

imo pick 1, but that's just my personal opinion

brave yew Aug 21, 2024, 6:53 PM

#

fiery bane imo pick 1, but that's just my personal opinion

okay, thanks a lot for the resources

lapis sequoia Aug 21, 2024, 7:06 PM

#

what...https://en.wikipedia.org/wiki/Convolution

Convolution

In mathematics (in particular, functional analysis), convolution is a mathematical operation on two functions (f and g) that produces a third function (f ∗ g)...

#

unexpected stuff

serene grail Aug 21, 2024, 7:07 PM

#

Adding up functions like this makes me think of Fourier Transform

fiery bane Aug 21, 2024, 7:09 PM

#

serene grail Adding up functions like this makes me think of Fourier Transform

because it is

lapis sequoia Aug 21, 2024, 7:09 PM

#

they mentioned it, idk what either of those are though

wooden sail Aug 21, 2024, 7:10 PM

#

you kinda use it all the time though

#

especially in the discrete case, the DFT/FFT is just a special case of matrix multiplication
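A small check of that statement, as a sketch: build the DFT matrix by hand and compare the matrix-vector product against `np.fft.fft`. The input vector is random.

```python
import numpy as np

n = 8
k = np.arange(n)
# DFT matrix: W[j, k] = exp(-2*pi*i * j*k / n)
W = np.exp(-2j * np.pi * np.outer(k, k) / n)

rng = np.random.default_rng(0)
x = rng.normal(size=n)

print(np.allclose(W @ x, np.fft.fft(x)))  # True
```

The FFT gets the same result in O(n log n) instead of the O(n^2) of the plain matrix product.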

serene grail Aug 21, 2024, 7:11 PM

#

fiery bane because it is

Oh, makes a lot of sense

lapis sequoia Aug 21, 2024, 7:18 PM

#

wooden sail you kinda use it all the time though

yeah i think im a bit lost

wooden sail Aug 21, 2024, 7:18 PM

#

how are you doing with your matrix multiplication

lapis sequoia Aug 21, 2024, 7:20 PM

#

normally either hadamard or dot product

wooden sail Aug 21, 2024, 7:21 PM

#

the "with" there meaning your understanding of it

lapis sequoia Aug 21, 2024, 7:21 PM

#

oh, no troubles so far

#

maybe i wasn't clear, i meant that i do not see the mapping between the discrete and continuous part

#

there is no reflection (apparently that's not a big deal.) and im not sure the values should be allowed to be very separated

#

(for the integral and the sum to have a similar meaning, i think you need small intervals)

hard fern Aug 21, 2024, 7:34 PM

#

lapis sequoia what...https://en.wikipedia.org/wiki/Convolution

#

There is a reflection, if you see the parts I’ve circled in maroon, g is reflected on the y axis. Also for the part I’ve circled in purple, there appears to be no reflection because f is the function reflected and it is symmetrical

lapis sequoia Aug 21, 2024, 7:35 PM

#

but there isnt in neural nets, that's what i meant

hard fern Aug 21, 2024, 7:37 PM

#

They usually use cross correlation

#

So the filter is applied directly to the input

#

Without reflection

#

What you’re referring to is convolution in the purely mathematical sense
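The difference is easy to see in NumPy with a tiny made-up signal and an asymmetric kernel: `np.convolve` flips the kernel before sliding it, `np.correlate` slides it as-is, and flipping the kernel by hand makes the two agree.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
kernel = np.array([1.0, 0.0, -1.0])

conv = np.convolve(x, kernel, mode="valid")   # flips the kernel first
corr = np.correlate(x, kernel, mode="valid")  # slides it as-is
corr_flipped = np.correlate(x, kernel[::-1], mode="valid")

print(conv)  # [2. 2.]
print(corr)  # [-2. -2.]
print(np.array_equal(conv, corr_flipped))  # True
```

For a learned filter the flip makes no practical difference, since the network can just learn the flipped weights.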

shut shoal Aug 21, 2024, 8:31 PM

#

When extracting data from a PDF, is there a method in pdfplumber to get rid of the headers and footers without figuring out the exact location of them?

lapis sequoia Aug 21, 2024, 8:42 PM

#

https://github.com/jsvine/pdfplumber/issues/843 @shut shoal

GitHub

Is there any method to remove the header and footer of a pdf? · Iss...

Please describe, in as much detail as possible, your proposal and how it would improve your experience with pdfplumber.

#

The PDF specification does not have concept of a header or footer; anything that looks like a header or footer is implemented by the particular software that is writing the PDF. For that reason, there is no generic solution for removing headers/footers (although there may be a specific solution for whatever specific PDFs you're working with).

#

so it's pretty much "styled body", you can either crop it or regex it ig
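A rough sketch of the "regex it" route, assuming you already have one text string per page (for example from pdfplumber's `page.extract_text()`). The helper names and the 60% cutoff are made up for illustration; the heuristic only looks at the first and last line of each page.

```python
import re
from collections import Counter


def _key(line):
    """Normalise a line so 'Page 1' and 'Page 2' count as the same footer."""
    return re.sub(r"\d+", "#", line.strip())


def strip_repeated_edges(pages, min_share=0.6):
    """Drop a page's first/last line when the same line opens/closes most pages."""
    if len(pages) < 2:
        return list(pages)          # nothing to compare against
    split = [(p or "").splitlines() for p in pages]
    firsts = Counter(_key(lines[0]) for lines in split if lines)
    lasts = Counter(_key(lines[-1]) for lines in split if lines)
    needed = min_share * len(pages)

    cleaned = []
    for lines in split:
        if lines and firsts[_key(lines[0])] >= needed:
            lines = lines[1:]       # repeated header
        if lines and lasts[_key(lines[-1])] >= needed:
            lines = lines[:-1]      # repeated footer
        cleaned.append("\n".join(lines))
    return cleaned


pages = ["ACME Report\nIntro text\nPage 1",
         "ACME Report\nMore text\nPage 2",
         "ACME Report\nEnd\nPage 3"]
print(strip_repeated_edges(pages))  # ['Intro text', 'More text', 'End']
```

The cropping route would instead cut a fixed band off the top and bottom of each page (pdfplumber pages have a `crop` method that takes a bounding box) before extracting text; that still needs a margin chosen for the specific PDFs.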

shut shoal Aug 21, 2024, 8:46 PM

#

lapis sequoia so it's pretty much "styled body", you can either crop it or regex it ig

https://tenor.com/view/lebron-james-crying-hug-cavs-nba-sad-gif-11873768

Tenor

#

Appreciate it

lapis sequoia Aug 21, 2024, 9:38 PM

#

i think the discrete and continuous convolutions could be mapped in certain way, as if the discrete were already bucketised areas

#

then the integral is a sum
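A small numeric check of that idea, as a sketch: the continuous convolution integral is approximated by a sum over narrow buckets (a midpoint Riemann sum). The box function, the integration range and the step size are arbitrary choices for illustration.

```python
def box(x):
    """Indicator function of the interval [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def conv_riemann(f, g, t, lo=-2.0, hi=3.0, dx=0.001):
    """Approximate (f * g)(t) = integral of f(tau) * g(t - tau) d tau by a sum."""
    steps = int(round((hi - lo) / dx))
    total = 0.0
    for k in range(steps):
        tau = lo + (k + 0.5) * dx       # midpoint of the k-th bucket
        total += f(tau) * g(t - tau)    # height of the bucket
    return total * dx                   # sum of height * width


# Convolving two unit boxes gives a triangle: t on [0, 1], 2 - t on [1, 2].
for t in (0.5, 1.0, 1.5):
    print(t, round(conv_riemann(box, box, t), 3))
```

The smaller `dx` gets, the closer the sum is to the integral, which matches the point above about needing small intervals.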

solar shard Aug 21, 2024, 9:53 PM

#

For random forest regression, does anyone have a solid resource for a complete tutorial? I can't get my model over 71% r2 and feel like I might be missing something very simple because thats normally the case. I ran cv grid and even random. When I drop my only highly correlated variable, it drops to -0.001. I know that RFR isn't supposed to be sensitive to outliers and correlations but I'm sort of stuck. Switching from onehotencoding to LE increased my performance by 1% lol

hard fern Aug 21, 2024, 11:23 PM

#

lapis sequoia i think the discrete and continuous convolutions could be mapped in certain way,...

This looks like a cave painting

hard fern Aug 21, 2024, 11:24 PM

#

solar shard For random forest regression, does anyone have a solid resource for a complete t...

There’s tons on YouTube

#

All will help

solar shard Aug 21, 2024, 11:48 PM

#

hard fern There’s tons on YouTube

You'd think that, ha. I'll keep looking. YouTube is forcing the shorter visuals to the top. I'll select a longer vid duration.

drowsy ice Aug 22, 2024, 2:48 AM

#

from stable_baselines3.common.vec_env import vec_frame_stack whenever I add this I get a no module error ideas?

serene grail Aug 22, 2024, 2:50 AM

#

Are you following a tutorial/docs? If so, linking whatever you're following will probably help

drowsy ice Aug 22, 2024, 2:53 AM

#

https://www.youtube.com/watch?v=dWmJ5CXSKdw&t=472s 33 min

YouTube

Nicholas Renotte

Reinforcement Learning for Gaming | Full Python Course in 9 Hours

Ever wanted to learn how to apply ML to games?

Here ya go!

What's happening team! This is a compilation of the RL tutorials for gaming in one mega course. In this course, you'll learn an absolute TON about best practices when training reinforcement learning models for games using Python and Stable Baselines 3.

Chapters
0:00 - START
1:07 - MA...

▶ Play video

drowsy ice Aug 22, 2024, 2:53 AM

#

serene grail Are you following a tutorial/docs? If so, linking whatever you're following will...

.

serene grail Aug 22, 2024, 2:54 AM

#

Ah, well I can't help with a video right now, sorry
Hopefully someone else can

drowsy ice Aug 22, 2024, 2:56 AM

#

alright its chill I'll figure it out

drowsy ice Aug 22, 2024, 3:04 AM

#

serene grail Ah, well I can't help with a video right now, sorry Hopefully someone else can

fixed it

#

it needed to reinstall stable baselines for whatever reason
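For this kind of "no module" error, a quick check like the following can show whether the interpreter running the script can see the package at all before reinstalling. It is only a diagnostic sketch using the standard library; `module_available` is a made-up helper name.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package of a dotted name is missing or broken.
        return False


print("interpreter:", sys.executable)
for name in ("stable_baselines3", "stable_baselines3.common.vec_env"):
    print(name, "->", "found" if module_available(name) else "missing")
```

If the package shows as missing, the usual suspects are a different virtual environment or a `pip` that belongs to another Python install. Tutorials for Stable Baselines 3 typically import the wrapper class as `from stable_baselines3.common.vec_env import VecFrameStack`.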

serene grail Aug 22, 2024, 3:09 AM

#

Oh, nice

drowsy ice Aug 22, 2024, 3:20 AM

#

if I had a dollar for how many times I've had to install and reinstall something

wooden tapir Aug 22, 2024, 4:24 AM

#

I have to create presentation for evolution of computers for tools for data science
any tips

fiery bane Aug 22, 2024, 5:09 AM

#

wooden tapir I have to create presentation for evolution of computers for tools for data scie...

find survey papers

unreal condor Aug 22, 2024, 5:15 AM

#

wooden tapir I have to create presentation for evolution of computers for tools for data scie...

You have to be more specific then

#

What kind of presentation

#

And what are u trying to achieve

left vault Aug 22, 2024, 6:37 AM

#

jaunty helm crude example

Thanks 🙂 Looks good, but certainly I'll need to change it

#

even something similar to this would be nice

wooden sail Aug 22, 2024, 7:10 AM

#

left vault even something similar to this would be nice

this would be a lot easier to make with latex + tikz than with matplotlib tbh

#

here's an example on how to do it with "raw" tikz https://tex.stackexchange.com/questions/124269/energy-level-diagrams-with-tex

#

and some other examples https://tikz.net/blackbody_oscillators/

TikZ.net

Black body oscillators

left vault Aug 22, 2024, 7:24 AM

#

@wooden sail Link not found ;D

#

Not to say, Latex is a thing I don't know at all

#

i managed to do something like this in GNUplot but need some things that can be controlled more precisely in matplotlib

wooden sail Aug 22, 2024, 7:27 AM

#

left vault @wooden sail Link not found ;D

ok, so it absolutely has to be matplotlib?

left vault Aug 22, 2024, 7:28 AM

#

what's a better option? i need to include arrows from the right of each red to the left of each blue

wooden sail Aug 22, 2024, 7:29 AM

#

i would really say latex + tikz is the easiest... if you already know how to use them

#

but regardless of how you do it, that type of plot requires you to do a fair amount of math with the coordinates of the endpoints of the lines

left vault Aug 22, 2024, 7:29 AM

#

wooden sail i would really say latex + tikz is the easiest... if you already know how to use...

but i dont 😅

#

never did anything with latex

wooden sail Aug 22, 2024, 7:30 AM

#

then i would scavenge for people's projects that have already done this in matplotlib because doing it by hand is a mess

#

mpl is not a good tool for it

left vault Aug 22, 2024, 7:31 AM

#

i searched for but they are not willing to share the code 😛

wooden sail Aug 22, 2024, 7:31 AM

#

are you sure?

#

cuz i found some rn

#

https://github.com/giacomomarchioro/PyEnergyDiagrams

GitHub

GitHub - giacomomarchioro/PyEnergyDiagrams: This is a simple script...

This is a simple script to plot energy profile diagrams using Python and matplotlib. - giacomomarchioro/PyEnergyDiagrams

left vault Aug 22, 2024, 7:31 AM

#

i know this one

wooden sail Aug 22, 2024, 7:32 AM

#

https://pypi.org/project/leveldiagram/
this one too

PyPI

leveldiagram

Energy level diagram plotting from graphs

left vault Aug 22, 2024, 7:33 AM

#

rather the former than the latter, but I tried with the former and got some weird output given I have multiple levels on the two categories

wooden sail Aug 22, 2024, 7:33 AM

#

otherwise what purplys did is your best bet: trace lines working with their coordinates and doing some math to make arrows between points

left vault Aug 22, 2024, 7:34 AM

#

okey, thank you

wooden sail Aug 22, 2024, 7:35 AM

#

(don't be scared of tikz and latex, you can try it out on overleaf)

#

i'm guessing you're aware of it by now, but making pretty plots eats a ridiculous amount of time 😛

left vault Aug 22, 2024, 7:50 AM

#

I see !

boreal nest Aug 22, 2024, 8:09 AM

#

hello everyone , I'm starting to learn about polars, coming from pandas. There seems to be a lot of issues with this library. Has anyone tried working with polars?

jaunty helm Aug 22, 2024, 9:12 AM

#

boreal nest hello everyone , I'm starting to learn about polars, coming from pandas. There s...

There seems to be a lot of issues with this library
what specifically

#

heads up: polars isn't 'pandas but faster', if you just tried to e.g. line by line convert pd to pl code you won't have a good time

boreal nest Aug 22, 2024, 9:31 AM

#

jaunty helm heads up: polars isn't 'pandas but faster', if you just tried to e.g. line by li...

yes ofc appreciate the heads up

#

I think I found my issue lol

#

I was working with schema lazyframes

#

I just noticed you can't type cast the schema with numpy types. but instead you can only do it with numpy arrays

#

i just noticed also that my polars version wasn't updated to 1.0 and I was following the older version because for some reason pip installed an older version

#

https://tenor.com/view/regert-gif-22853534

Tenor

pearl parrot Aug 22, 2024, 9:37 AM

#

Im kinda lost
Which is the correct channel for Machine Learning stuff?

boreal nest Aug 22, 2024, 9:40 AM

#

pearl parrot Im kinda lost Which is the correct channel for Machine Learning stuff?

here also for sure

pearl parrot Aug 22, 2024, 9:51 AM

#

Its me again

What libraries and modules should I master to get started with TensorFlow ML?

Can you also suggest some projects that I should finish to get myself ready?

Its that Im really passionate about ML, and lost at the same time

unreal condor Aug 22, 2024, 10:00 AM

#

I'm going to participate in SemEval 2025, which is just a NLP competition in a nutshell. Participants can publish a number of research papers (this can help boost your academic status) based on the number of tasks they took part in. I'm trying to find some partners interested in such tasks. If you are interested, maybe DM me and we could discuss further. Here is the information for the upcoming SemEval 2025: https://semeval.github.io/

SemEval

International Workshop on Semantic Evaluation

lapis sequoia Aug 22, 2024, 10:11 AM

#

hard fern This looks like a cave painting

fair, it's meant to describe how the two ways to calculate convolutions are conceptually equivalent.

lapis sequoia Aug 22, 2024, 10:12 AM

#

wooden sail this would be a lot easier to make with latex + tikz than with matplotlib tbh

there are packages directly to do those i think

#

im not sure if it achieves the same, but seems quite close. https://ftp.snt.utwente.nl/pub/software/tex/macros/latex/contrib/modiagram/modiagram_en.pdf

wooden sail Aug 22, 2024, 10:15 AM

#

yeah there's probably several, i just didn't find them in 30 seconds of googling :p

wooden tapir Aug 22, 2024, 10:47 AM

#

unreal condor You have to be more specific then

It's College project,use of powerpoint can be done for presenting

unreal condor Aug 22, 2024, 11:02 AM

#

wooden tapir It's College project,use of powerpoint can be done for presenting

why yes, but what are you doing specifically, like what is the description of the task ?, the input and the output ?, what methods are feasible ?

unreal condor Aug 22, 2024, 11:16 AM

#

pearl parrot Its me again What libraries and modules should I master to get started with Ten...

TensorFlow is a framework specialized for deep learning; however, I would recommend PyTorch as an alternative since it's more popular nowadays, so there is more material about it.
I advise you to get acquainted with basic ML concepts first, like Linear Regression, Logistic Regression, Loss Function, Gradient Descent, Regression problems, Classification problems, etc. Knowing all of them is probably enough for you to learn deep learning/neural networks, but learning some other traditional ML algos like Decision Tree, Random Forest, SVM is also a great way to get familiar with ML in general.
You can find some great ML courses on Coursera; I recommend the ML course and DL course taught by Andrew Ng, one of the leading ML researchers.
Your starting projects should be simple and get you acquainted with Regression problems and Classification problems (there are more types of problems, but those two are the most basic and common), like: house price prediction, classifying cats and dogs, etc.
Also, math is the foundation of practically all ML methods nowadays. If you are good at math you will have an easier time understanding all the concepts, but it is not compulsory unless you want to do research in specific fields.

spare forum Aug 22, 2024, 11:23 AM

#

unreal condor * **TensorFlow** is a framework specialized for deep learning, however, I would ...

Pytorch more popular and less a pain in the ass too

sour horizon Aug 22, 2024, 11:50 AM

#

I'm not sure if this is the right channel. I've created this using plotly and I'm wondering if there's a way to shift the neutral section into the middle and split it in half?

#

I'm a beginner in using plotly so I based my code from this thread https://community.plotly.com/t/need-help-in-making-diverging-stacked-bar-charts/34023/3

Plotly Community Forum

Need help in making Diverging Stacked Bar Charts

Thanks for posting this script. Successfully used with some of my own likert scale data. I’ve had real trouble changing the colours. I read the plotly documentation but I keep getting errors. Could you advise how we can implement gradient colours for each diverging stack (red for negative and green for positive responses)? Thanks

left tartan Aug 22, 2024, 11:57 AM

#

sour horizon I'm not sure if this is the right channel. I've created this using plotly and I'...

Wdym by shift to middle and split in half?

sour horizon Aug 22, 2024, 11:59 AM

#

left tartan Wdym by shift to middle and split in half?

something like this if the shaded and bar blue are both neutral and its split in half by the line at x=0

left tartan Aug 22, 2024, 12:01 PM

#

I don't understand. Neutral has some value, so the neutral areas are not all the same width? Or are you saying you want neutral centered on 0?

sour horizon Aug 22, 2024, 12:01 PM

#

left tartan I don't understand. Neutral has some value, so the neutral areas are not all the...

i want neutral center on 0

left tartan Aug 22, 2024, 12:01 PM

#

sour horizon i want neutral center on 0

Ok, paste your code for this chart plz

sour horizon Aug 22, 2024, 12:02 PM

#


```py
import plotly.graph_objects as go
import pandas as pd

d = {'y-axis': ['TEIs and RCs S3', 'TEIs and RCs S2', 'TEIs and RCs S1', 'Socioeconomic Factors S2', 'Socioeconomic Factors S1', 'Learning Modality S2', 'Learning Modality S1'],
     'Neutral': [2, 1, 1, 4, 5, 2, 0],
     'Disagree': [0, 0, 1, 1, 1, 1, 0],
     'Strongly Disagree': [1, 3, 2, 4, 4, 1, 0],
     'Agree': [3, 3, 5, 5, 4, 5, 9],
     'Strongly Agree': [7, 10, 8, 3, 3, 8, 8]}
df = pd.DataFrame(d)

fig = go.Figure()
# Neutral, Disagree, Strongly Disagree: negated so they stack to the left of x=0
for col in df.columns[1:4]:
    fig.add_trace(go.Bar(x=-df[col].values,
                         y=df['y-axis'],
                         orientation='h',
                         name=col,
                         customdata=df[col],
                         hovertemplate="%{y}: %{customdata}"))
# Agree, Strongly Agree: stack to the right of x=0
for col in df.columns[4:]:
    fig.add_trace(go.Bar(x=df[col],
                         y=df['y-axis'],
                         orientation='h',
                         name=col,
                         customdata=df[col],
                         hovertemplate="%{y}: %{x}"))

# Layout only needs to be set once, after all traces are added
fig.update_layout(barmode='relative',
                  height=400,
                  width=700,
                  yaxis_autorange='reversed',
                  bargap=0.01,
                  legend_orientation='h',
                  legend_x=-0.05, legend_y=1.3
                 )
fig.show()
```

agile cobalt Aug 22, 2024, 12:02 PM

#

try specifying the x_range?

#

ah never mind, yeah no clue

sour horizon Aug 22, 2024, 12:04 PM

#

agile cobalt ah never mind, yeah no clue

thx though for trying

unreal condor Aug 22, 2024, 12:05 PM

#

sour horizon ```py import plotly.graph_objects as go import pandas as pd d = {'y-axis': ['T...

Try to add a loop for every data point then

sour horizon Aug 22, 2024, 12:14 PM

#

unreal condor Try to add a loop for every data point then

please clarify, I've been trying to find the arg that allows me to set it at the middle of origin

left tartan Aug 22, 2024, 12:15 PM

#

Don't use relative, just set negative values for what you want on left

#

I think

#

(On mobile so my advice may be questionable)

sour horizon Aug 22, 2024, 12:16 PM

#

left tartan Don't use relative, just set negative values for what you want on left

I've noticed that but I can't find what can make it in the middle

#

its always left or right

unreal condor Aug 22, 2024, 12:17 PM

#

sour horizon please clarify, I've been trying to find the arg that allows me to set it at the...

Your problem seems to be way too specific to have any specific function for it. Try to find the middle point for each category then manually set the values' position with a for loop then. That's my method

left tartan Aug 22, 2024, 12:19 PM

#

sour horizon I've noticed that but I can't find what can make it in the middle

Try to create the neutral bar by itself, first, with no other bar:

#

Oh, no, create two series: one above and one below x=0. So half of neutral is negative and half is positive

sour horizon Aug 22, 2024, 12:21 PM

#

fig = go.Figure()
fig.add_trace(go.Bar(x=-df["Neutral"].values/2,
                     y=df['y-axis'],
                     orientation='h',
                     name="Neutral",
                     customdata=df["Neutral"],
                     xperiodalignment="middle",
                     hovertemplate = "%{y}: %{customdata}"))
fig.add_trace(go.Bar(x=+df["Neutral"].values/2,
                     y=df['y-axis'],
                     orientation='h',
                     name="Neutral",
                     customdata=df["Neutral"],
                     xperiodalignment="middle",
                     hovertemplate = "%{y}: %{customdata}"))

#

I've added this

#

#

I just need to find a way to merge both neutrals

#

can I set it to the same category or something like that?

left tartan Aug 22, 2024, 12:22 PM

#

Yah, I like that. Set the trace color to same, perhaps

#

And drop legend from one. Let me think about merging tho

#

I'd have to play with it a little, you could use a legend group to combine them
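A sketch of how the two Neutral halves could be merged visually: give both traces the same colour and the same `legendgroup`, and hide the legend entry of one with `showlegend=False` (these are standard plotly Bar attributes). The helper names `diverging_specs` and `build_figure`, the colour, and the two-row sample data are assumptions for illustration.

```python
def diverging_specs(labels, neutral, left, right, neutral_color="lightgray"):
    """Build keyword dicts for go.Bar traces with Neutral split evenly around x=0.

    `left` / `right` map category name -> counts, innermost category first.
    """
    half = [v / 2 for v in neutral]
    base = dict(y=labels, orientation="h")
    hover = "%{y}: %{customdata}"
    specs = [
        # Two halves of Neutral: same colour, same legend group, one legend entry.
        dict(base, x=[-h for h in half], name="Neutral", legendgroup="Neutral",
             marker_color=neutral_color, customdata=neutral, hovertemplate=hover),
        dict(base, x=half, name="Neutral", legendgroup="Neutral",
             marker_color=neutral_color, customdata=neutral, hovertemplate=hover,
             showlegend=False),
    ]
    for name, values in left.items():
        specs.append(dict(base, x=[-v for v in values], name=name,
                          customdata=values, hovertemplate=hover))
    for name, values in right.items():
        specs.append(dict(base, x=values, name=name,
                          customdata=values, hovertemplate=hover))
    return specs


def build_figure(specs):
    """Turn the specs into a figure (plotly is imported only when this is called)."""
    import plotly.graph_objects as go
    fig = go.Figure([go.Bar(**spec) for spec in specs])
    fig.update_layout(barmode="relative", yaxis_autorange="reversed")
    return fig


specs = diverging_specs(
    labels=["TEIs and RCs S3", "Learning Modality S1"],
    neutral=[2, 0],
    left={"Disagree": [0, 0], "Strongly Disagree": [1, 0]},
    right={"Agree": [3, 9], "Strongly Agree": [7, 8]},
)
```

Calling `build_figure(specs).show()` should then draw the chart. The Neutral traces are added first so that, with `barmode="relative"`, they sit next to x=0 and the other categories stack outward from them.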

sour horizon Aug 22, 2024, 12:26 PM

#

it technically worked

#

tysm @left tartan

#

I'll still try hunting for ways to do it more efficiently

fiery bane Aug 22, 2024, 1:10 PM

#

wooden sail i'm guessing you're aware of it by now, but making pretty plots eats a ridiculou...

my estimate is about 8 hours per plot.

fiery bane Aug 22, 2024, 1:10 PM

#

pearl parrot Its me again What libraries and modules should I master to get started with Ten...

tensorflow

proper crag Aug 22, 2024, 2:23 PM

#

i need idea
i wan to feed my model live streaming data
at 1st, i thot i wan to make network lab on a network simulator
but im a mac os user and most network simulator that their interfaces allow Wireshark to capture the .PCAP file isnt supported on ARM architecture

#

i wan to simulate works like the ETL pipeline, DBMS deployment and API while also deploying the model

obsidian sand Aug 22, 2024, 2:38 PM

#

Hello, does anyone have any advice on performing RAG on a CSV with a high number of columns? How would I go about doing it? I tried neo4j + Mistral 7B Instruct fine tuned for cypher generation but it does not work too well as the LLM does not generate the cypher query correctly, and sometimes gets it wrong.

Any tips please?

agile cobalt Aug 22, 2024, 2:39 PM

#

obsidian sand Hello, does anyone have any advice on performing RAG on a CSV with a high number...

what does the CSV have to do with Cypher?..

#

that is the query language used for graph databases right?

obsidian sand Aug 22, 2024, 2:40 PM

#

agile cobalt what does the CSV have to do with Cypher?..

I tried representing the csv attributes as nodes and relationships then i used llm to generate cypher query

agile cobalt Aug 22, 2024, 2:40 PM

#

what does your data look like in the first place? can you show some examples

#

and how do you plan to query/use it later (as in, which kind of prompt will the end user give to the model)

obsidian sand Aug 22, 2024, 2:45 PM

#

agile cobalt and how do you plan to query/use it later (as in, which kind of prompt will the ...

My data are about products - so there are attributes like name of product, use cases, and many other general information about it (cost, rating, dates etc)

The user query will be converted to a cypher query for querying the neo4j database, then returning the results as context for LLM to generate a reply

agile cobalt Aug 22, 2024, 2:48 PM

#

Which kind of queries exactly? How are you planning to evaluate how well it works?

For some cases you might want to just perform full text search over the product name, but if you use a Tool to search over it you might need to specify which values it can search in first place, e.g. provide an enum for the use cases

#

honestly I would provide a few options for the model to generate a JSON representing a query, then convert that JSON to the actual cypher query

#

cypher is kinda niche, if even using a fine-tuned model you are not getting valid queries, don't have the model generate cypher directly
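A minimal sketch of the "JSON first, Cypher second" idea: the model only fills in a small JSON object, and ordinary code validates it and builds a parameterised query. The graph schema (`Product`, `UseCase`, `HAS_USE_CASE`), the JSON field names and the limits are all hypothetical.

```python
import json

# Hypothetical schema: (:Product {name, cost, rating})-[:HAS_USE_CASE]->(:UseCase {name})
SORTABLE = {"cost", "rating"}              # properties the model may sort by
DIRECTIONS = {"asc": "ASC", "desc": "DESC"}


def json_to_cypher(raw):
    """Turn the model's JSON into (cypher_text, parameters); reject anything unexpected."""
    spec = json.loads(raw)
    use_case = spec.get("use_case")
    sort_by = spec.get("sort_by", "cost")
    order = spec.get("order", "asc")
    limit = spec.get("limit", 5)
    if (not isinstance(use_case, str) or sort_by not in SORTABLE
            or order not in DIRECTIONS
            or not isinstance(limit, int) or not 1 <= limit <= 50):
        raise ValueError(f"unsupported query spec: {spec!r}")

    # Only whitelisted identifiers are interpolated; values travel as parameters.
    query = (
        "MATCH (p:Product)-[:HAS_USE_CASE]->(u:UseCase {name: $use_case}) "
        f"RETURN p.name AS name, p.{sort_by} AS {sort_by} "
        f"ORDER BY p.{sort_by} {DIRECTIONS[order]} LIMIT $limit"
    )
    return query, {"use_case": use_case, "limit": limit}


query, params = json_to_cypher(
    '{"use_case": "video editing", "sort_by": "cost", "order": "asc", "limit": 1}'
)
print(query)
print(params)
```

The design choice here is that the model never writes query syntax, so a malformed answer fails validation instead of producing an invalid or unsafe query.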

obsidian sand Aug 22, 2024, 2:51 PM

#

agile cobalt Which kind of queries exactly? How are you planning to evaluate how well it work...

Some user queries that I would like my RAG to do well in are generally aggregation type like e.g (which product is the least expensive for xx use case) and also general questions like (what are the available products for xx use case, what is the rating for xx product)

agile cobalt Aug 22, 2024, 2:52 PM

#

That is not the sort of question you would want to use a graph database to answer in first place

obsidian sand Aug 22, 2024, 2:52 PM

#

Yeah Im still experimenting. What would you recommend?

agile cobalt Aug 22, 2024, 2:52 PM

#

good old SQL

obsidian sand Aug 22, 2024, 2:53 PM

#

Text to sql?

agile cobalt Aug 22, 2024, 2:53 PM

#

you could try to have it generate SQL directly I guess, should work better than cypher, just make sure you're giving the model a read-only connection with properly configured permissions

obsidian sand Aug 22, 2024, 2:54 PM

#

I understand that we can feed the attributes (CSV schema) into the LLM to create a SQL statement.

What about the actual rows of the database? (My data is mostly text data + numbers mixed)

agile cobalt Aug 22, 2024, 2:56 PM

#

just run a SELECT query then feed the results directly as part of the query

some frameworks have fancy ways of formatting tool calls and their outputs

trail monolith Aug 22, 2024, 2:59 PM

#

Any devs/ds from india here?

#

Need advice on getting a good paygrade lol

left tartan Aug 22, 2024, 3:03 PM

#

trail monolith Need advice on getting a good paygrade lol

#career-advice

obsidian sand Aug 22, 2024, 3:05 PM

#

agile cobalt just run a SELECT query then feed the results directly as part of the query <:TP...

Hmm, e.g. for the query (which product has the highest rating), the sql query would be something like: select product top 1 sort(rating)

Thing is my attributes (e.g rating) is not sortable and not structured - because it is a mix of strings and numbers. Hence, would that mean i need to pass the entire column for the LLM to generate an answer?

agile cobalt Aug 22, 2024, 3:05 PM

#

Clean it first so that your ratings are proper numbers.
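A sketch of what such a cleaning step could look like for a rating column that mixes strings and numbers. The sample formats ("4.5/5", "4 stars", "N/A") are guesses, since the actual data was not shown, and `parse_rating` is a hypothetical helper.

```python
import re

_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_rating(raw):
    """Return the first number found in a messy rating cell as a float, else None."""
    if raw is None:
        return None
    match = _NUMBER.search(str(raw))
    return float(match.group()) if match else None


samples = ["4.5/5", "4 stars", 3, "N/A", ""]
print([parse_rating(s) for s in samples])  # [4.5, 4.0, 3.0, None, None]
```

This does not normalise different scales (a "9/10" would come out as 9.0), so the rules have to be adapted to what the column really contains. Once the column is numeric, sorting becomes a plain `ORDER BY rating DESC LIMIT 1`.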

obsidian sand Aug 22, 2024, 3:07 PM

#

agile cobalt Clean it first so that your ratings are proper numbers.

Any suggestions if cleaning is not a feasible option in the long run?

agile cobalt Aug 22, 2024, 3:09 PM

#

abort the entire project if cleaning the data before using it is not feasible

#

remember: if trash goes in, trash comes out

lapis sequoia Aug 22, 2024, 3:13 PM

#

checking this i wonder if its used for ml in the web https://glmatrix.net/

obsidian sand Aug 22, 2024, 3:17 PM

#

agile cobalt abort the entire project if cleaning the data before using it is not feasible

wow ok i will see what i can do thanks

lapis sequoia Aug 22, 2024, 3:17 PM

#

WebGL (Web Graphics Library) is a JavaScript API for rendering high-performance interactive 3D and 2D graphics within any compatible web browser without the use of plug-ins (...) makes it possible for the API to take advantage of hardware graphics acceleration provided by the user's device.

#

i think some of you may find this list interesting https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API#libraries

proper crag Aug 22, 2024, 3:30 PM

#

obsidian sand I understand that we can feed the attributes (CSV schema) into the LLM to create...

you need classification model right for this?

#

as i understand bcuz the model has to classify each data

#

and then create column corresponding the data

fiery bane Aug 22, 2024, 3:58 PM

#

obsidian sand wow ok i will see what i can do thanks

to put it differently,
apparently you have a data cleaning project, not a data science project.

obsidian sand Aug 22, 2024, 4:17 PM

#

fiery bane to put it differently, apparently you have a data cleaning project, not a data sc...

hmm okok

shut shoal Aug 22, 2024, 4:43 PM

#

How could I fix this without removing the necessary spaces? Every time I make a function using re it seems like I remove the spaces that are needed to signify a new word.

lapis sequoia Aug 22, 2024, 4:49 PM

#

pretty neat stuff https://en.wikipedia.org/wiki/Lp_space

#

appears in many ml papers

fiery bane Aug 22, 2024, 4:59 PM

#

shut shoal How could I fix this without removing the necessary spaces? Every time I make a ...

sounds like a regex question, not sure if this is the channel

fiery bane Aug 22, 2024, 5:00 PM

#

lapis sequoia pretty neat stuff <https://en.wikipedia.org/wiki/Lp_space>

just go one level more abstract and read https://en.wikipedia.org/wiki/Norm_(mathematics)

Norm (mathematics)

In mathematics, a norm is a function from a real or complex vector space to the non-negative real numbers that behaves in certain ways like the distance from the origin: it commutes with scaling, obeys a form of the triangle inequality, and is zero only at the origin. In particular, the Euclidean distance in a Euclidean space is defined by a nor...

shut shoal Aug 22, 2024, 5:01 PM

#

fiery bane sounds like a regex question, not sure if this is the channel

What would be the correct channel?

fiery bane Aug 22, 2024, 5:01 PM

#

I'm not sure either, try #1035199133436354600 ?

shut shoal Aug 22, 2024, 5:01 PM

#

Awesome, thanks.

fiery bane Aug 22, 2024, 5:04 PM

#

lapis sequoia pretty neat stuff <https://en.wikipedia.org/wiki/Lp_space>

or https://en.wikipedia.org/wiki/Metric_space

Metric space

In mathematics, a metric space is a set together with a notion of distance between its elements, usually called points. The distance is measured by a function called a metric or distance function. Metric spaces are the most general setting for studying many of the concepts of mathematical analysis and geometry.
The most familiar example of a m...

unkempt apex Aug 22, 2024, 5:38 PM

#

Road extraction from satellite images!!

#

now it's accurate

serene grail Aug 22, 2024, 5:42 PM

#

Nice, how did you fix it?

lapis sequoia Aug 22, 2024, 5:50 PM

#

fiery bane or https://en.wikipedia.org/wiki/Metric_space

ill take a look just trying to know what they mean here exactly

#

learnt up to l_p spaces

fiery bane Aug 22, 2024, 5:54 PM

#

next is lebesgue measurable

wooden sail Aug 22, 2024, 5:57 PM

#

we're gonna lose octobass to real and functional analysis before ever getting to ML

#

i'd really redirect you to vector norms on finite dimensional spaces unless you really wanna play with infinite dimensional spaces. if you've never heard of either, all the more reason

#

https://en.wikipedia.org/wiki/Normed_vector_space
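For the finite-dimensional case, the l_p norms fit in a few lines. This is a plain-Python sketch (`p_norm` is an illustrative name) that only covers p >= 1, where the triangle inequality holds.

```python
def p_norm(vector, p):
    """l_p norm of a finite vector; p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(x) for x in vector)
    if p < 1:
        raise ValueError("p must be >= 1 for this to be a norm")
    return sum(abs(x) ** p for x in vector) ** (1.0 / p)


v = [3.0, -4.0]
print(p_norm(v, 1))             # 7.0  (sum of absolute values)
print(p_norm(v, 2))             # 5.0  (Euclidean length)
print(p_norm(v, float("inf")))  # 4.0  (largest absolute entry)
```

The p = 1 and p = 2 cases are the ones that show up most often in ML, for example as L1 and L2 regularisation penalties.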

serene grail Aug 22, 2024, 5:59 PM

#

Is real analysis the field where you have to make proofs and things like that?

wooden sail Aug 22, 2024, 6:00 PM

#

that's all of math

#

i would put it as "real analysis is the more formal version of calculus"

#

the flavor where you do go through all the proofs instead of being handed down the recipes

serene grail Aug 22, 2024, 6:00 PM

#

Ah, thank you

wooden sail Aug 22, 2024, 6:01 PM

#

for most people not studying maths, linalg, real analysis, "intro to proofs", or "discrete mathematics" will be the first and possibly only time they ever have to fight against proofs and rigor

#

and the stuff they're discussing just above is several steps after that, which is a pretty bad idea for someone without a decent feel for maths just trying to get started with ml

unkempt apex Aug 22, 2024, 6:02 PM

#

serene grail Nice, how did you fix it?

added threshold as 0.1, because output images were on that range
also added batchnorm and dropout layers in model!
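To illustrate why the cutoff matters, here is a toy sketch of binarising a predicted mask. The 3x3 scores are invented, and treating scores at or above the threshold as "road" is an assumption about the setup described above.

```python
def binarize(mask, threshold=0.1):
    """Turn a 2-D grid of model scores into 0/1 road labels."""
    return [[1 if score >= threshold else 0 for score in row] for row in mask]


# Invented scores that all sit well below 0.5, as in the situation described.
scores = [[0.02, 0.15, 0.03],
          [0.01, 0.22, 0.04],
          [0.00, 0.18, 0.12]]

print(binarize(scores, threshold=0.5))  # all zeros: the road disappears
print(binarize(scores, threshold=0.1))  # the road pixels are kept
```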

unkempt apex Aug 22, 2024, 6:03 PM

#

wooden sail and the stuff they're discussing just above is several steps after that, which i...

yeah, maths is very important for this field

serene grail Aug 22, 2024, 6:03 PM

#

What would be a good way to learn some stuff about proofs (at least as a high level overview, just for fun)? Intro to proofs sounds promising

fiery bane Aug 22, 2024, 6:04 PM

#

wooden sail we're gonna lose octobass to real and functional analysis before ever getting to...

I mean, lebesgue measurable is literally in his screenshots

wooden sail Aug 22, 2024, 6:04 PM

#

fiery bane I mean, lebesgue measurable is literally in his screenshots

i see it, but i would still say that given the background and the questions they've been asking before, it's best to read something that.. they're more likely to understand at all 😛

#

everyone has to start somewhere

fiery bane Aug 22, 2024, 6:05 PM

#

lapis sequoia ill take a look just trying to know what they mean here exactly

What I'm trying to say is, if see L_p spaces from the perspective of metric space, it would be easier to think about other stuff later on like KL div, or what contrastive learning is doing

wooden sail Aug 22, 2024, 6:05 PM

#

real analysis is probably not that somewhere

fiery bane Aug 22, 2024, 6:06 PM

#

wooden sail i see it, but i would still say that given the background and the questions they...

Well, everyone has their own path. I don't like math, but it seems that some people want to take the more theoretical and fundamental path 1st, and I can appreciate that.

wooden sail Aug 22, 2024, 6:07 PM

#

that's also my preferred path, but there are better reads to start from before jumping into lp spaces

#

starting from inner product spaces is more reasonable

lapis sequoia Aug 22, 2024, 6:12 PM

#

i pretty much have no direction, but also no goal

fiery bane Aug 22, 2024, 6:19 PM

#

lapis sequoia i pretty much have no direction, but also no goal

then you are going nowhere, but also, you have arrived.
lol

lapis sequoia Aug 22, 2024, 6:19 PM

#

pretty much how it feels indeed

fiery bane Aug 22, 2024, 6:20 PM

#

you have win at life, congratz

unkempt apex Aug 22, 2024, 6:26 PM

#

shit happens , don't worry!, just keep going..

lapis sequoia Aug 22, 2024, 6:31 PM

#

ty :)

verbal oar Aug 22, 2024, 6:36 PM

#

why does grokking ml have offset operators?
its hard to read formulas

#

sth like instead of yhat - y
yhat y -

#

do you have this issue too?

#

I see more of this type in this book

#

with text no problem just some formulas

#

#

whats wrong with it?

serene grail Aug 22, 2024, 6:39 PM

#

Looks like some sort of formatting issue, is this a pdf? Maybe try a different pdf reader or something

verbal oar Aug 22, 2024, 6:39 PM

#

maybe epub?

spare forum Aug 22, 2024, 7:36 PM

#

lapis sequoia ty `:)`

Yep tbh having to deal with Lp spaces and those are like heavy maths, depending on what you say on them it can be a master degree math topic lol

#

Like it's a Banach space with the associated norm blabla

#

Most ppl don't know shit about this and can still stay sota about ml DL, they eventually knew but only researcher mind about such things

lapis sequoia Aug 22, 2024, 7:50 PM

#

thanks for the comments @spare forum

runic parcel Aug 22, 2024, 8:22 PM

#

I have multiple tool data stored in different .txt files, which I have provided to my Langchain + OpenAI RAG model. The setup allows the user to input a prompt, and based on that prompt, the AI suggests the best tool accordingly. However, I've encountered an issue where the AI is recommending tools from inappropriate categories.
For example, if a user types 'I want to make a website,' the AI might still suggest tools related to video editing, which is incorrect. What should I do?

serene scaffold Aug 22, 2024, 8:26 PM

#

runic parcel I have multiple tool data stored in different .txt files, which I have provided ...

did you confirm that the RAG part of the pipeline picks the right txt files for the user's query?

runic parcel Aug 22, 2024, 8:26 PM

#

serene scaffold did you confirm that the RAG part of the pipeline picks the right txt files for ...

yes its even working

#

if i write tools for video editing its showing proper

#

but sometimes it's giving irrelevant results. it's showing a video editing tool under "website building tool"

spring field Aug 22, 2024, 8:28 PM

#

do you keep a record of the past conversations that are taken into account and then does this occur after talking about said irrelevant stuff or just talking with it for longer?

#

or does it happen to suggest something completely irrelevant on the first prompt as well?

#

if you don't keep a record of the past conversations, perhaps, you're fine-tuning the model instead? and over time it starts to pick up more of the more common stuff discussed
but I'm assuming here that it's only a per session thing

runic parcel Aug 22, 2024, 8:34 PM

#

spring field do you keep a record of the past conversations that are taken into account and t...

actually i dont keep the past conv

#

its like

#

when i ask it to give me tools for video editing, it gives proper tools. same for presentation tools. but when i asked it to give tools for website building, it gave me a video editing tool. so in some part it messes up

#

so what should i do to make it give proper and accurate results
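One thing that might help here, as a rough sketch rather than a drop-in LangChain fix: tag each tool document with its category and filter on that tag before ranking, so a "website" query cannot retrieve a video editing tool. Everything below (the `TOOLS` records, `CATEGORY_KEYWORDS`, `guess_category`, the word-overlap score) is made up for illustration; a real setup would more likely use the vector store's metadata filter and its own similarity score.

```python
import re

# Hypothetical tool records; in a real pipeline these would be the .txt files plus a category tag.
TOOLS = [
    {"name": "ClipCutter", "category": "video editing", "text": "cut trim and edit video clips, build a timeline"},
    {"name": "SiteForge", "category": "website building", "text": "build a website with drag and drop pages"},
    {"name": "PageLift", "category": "website building", "text": "make landing pages and host your website"},
    {"name": "SlideDeck", "category": "presentation", "text": "make slides and presentation decks"},
]

# Hypothetical keyword table used to guess which category a query belongs to.
CATEGORY_KEYWORDS = {
    "video editing": {"video", "clip", "footage"},
    "website building": {"website", "site", "landing"},
    "presentation": {"slides", "presentation", "deck"},
}


def tokens(text):
    """Lowercase word set of a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


def guess_category(query):
    """Return the category whose keywords overlap the query most, or None if nothing matches."""
    q = tokens(query)
    best = max(CATEGORY_KEYWORDS, key=lambda c: len(q & CATEGORY_KEYWORDS[c]))
    return best if q & CATEGORY_KEYWORDS[best] else None


def retrieve(query, k=2):
    """Filter by guessed category first, then rank the survivors by word overlap."""
    category = guess_category(query)
    candidates = [t for t in TOOLS if category is None or t["category"] == category]
    q = tokens(query)
    ranked = sorted(candidates, key=lambda t: len(q & tokens(t["text"])), reverse=True)
    return [t["name"] for t in ranked[:k]]


website_hits = retrieve("I want to make a website")
video_hits = retrieve("tools for video editing")
print(website_hits, video_hits)
```

The point of filtering before ranking is that a similarity score alone can still surface a near-miss from another category, whereas a hard filter cannot.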

lucid parrot Aug 23, 2024, 12:29 AM

#

what's the consensus on AI generated code?

serene scaffold Aug 23, 2024, 12:33 AM

#

lucid parrot what's the consensus on AI generated code?

it can be pretty effective for well-specified prompts. non-programmers often don't know what to ask for.

#

(that's my opinion, not necessarily the consensus. but I think I have more experience with generative language models than anyone else in this server.)

lusty patio Aug 23, 2024, 12:37 AM

#

I was wondering if anyone here knew of a good textbook covering building transformers or other deep ML topics

serene scaffold Aug 23, 2024, 12:38 AM

#

lusty patio I was wondering if anyone here knew of a good textbook covering building transfo...

check the pins

lusty patio Aug 23, 2024, 12:38 AM

#

serene scaffold check the pins

ooo thank you

lucid parrot Aug 23, 2024, 12:38 AM

#

serene scaffold it can be pretty effective for well-specified prompts. non-programmers often don...

yeah i'm just curious about how people feel about getting PRs that were generated using AI / how can they tell?

agile cobalt Aug 23, 2024, 12:38 AM

#

lucid parrot what's the consensus on AI generated code?

never trust anything AI generated before having a human double check it

it can seem to work great in small constrained toy examples, but often fails with real world larger & messier data - especially if the user doesn't properly understand how the system works

lucid parrot Aug 23, 2024, 12:38 AM

#

agile cobalt never trust anything AI generated before having a human double check it it can ...

totally agree

serene scaffold Aug 23, 2024, 12:40 AM

#

lucid parrot yeah i'm just curious about how people feel about getting PRs that were generate...

if you've confirmed that the code does what you intend it to do, and it upholds all the standards of that project, then it doesn't matter if you used AI to create it.

if someone just started making a bunch of PRs with untested, AI-generated code, that person is draining the energy of open-source maintainers and should fuck all the way off.

iron basalt Aug 23, 2024, 12:41 AM

#

lucid parrot what's the consensus on AI generated code?

If it's something that is commonly done it can work ok with a well crafted prompt(s).

#

It may fall under legal gray area though (i'm not a lawyer, look into it).

lucid parrot Aug 23, 2024, 12:42 AM

#

i'm just curious about how the maintainer can tell if something is AI generated (i'm not trying to fool anyone and don't use AI to blindly write code, just generally curious)

iron basalt Aug 23, 2024, 12:43 AM

#

lucid parrot i'm just curious about how the maintainer can tell if something is AI generated ...

When the code quality is average or worse, has hallucinations, or other oddities that you would notice if you read a bunch of open source projects made by humans.

lucid parrot Aug 23, 2024, 12:43 AM

#

also, i've been sensing some kind of tension between newcomers (who are more likely to use AI to write code) and experienced developers

serene scaffold Aug 23, 2024, 12:44 AM

#

lucid parrot also, i've been sensing some kind of tension between newcomers (who are more lik...

newcomers should be honest about their capabilities and ask for opportunities to contribute at their skill level. a lot of repos have "good first issue" tags in their issue tracker.

iron basalt Aug 23, 2024, 12:44 AM

#

iron basalt When the code quality is average or worse, has hallucinations, or other oddities...

However, the problem with this as a general answer is that it depends on which model, what it was trained on, and what it's being used to generate.

lucid parrot Aug 23, 2024, 12:45 AM

#

have any of you experienced someone submitting AI generated garbage lol

iron basalt Aug 23, 2024, 12:45 AM

#

Yes. Including automated responses to PR comments.

lucid parrot Aug 23, 2024, 12:47 AM

#

lucid parrot also, i've been sensing some kind of tension between newcomers (who are more lik...

are any of you sensing this tension too or is it just me lol

#

i wonder if this is going to reduce trust in newcomers contributions

iron basalt Aug 23, 2024, 12:47 AM

#

iron basalt However, the problem with this as a general answer is that it depends on which m...

There are also a bunch of better models that are not released publicly, nor scaled all the way up, but may already be used somewhere without anyone knowing.

iron basalt Aug 23, 2024, 12:48 AM

#

lucid parrot i wonder if this is going to reduce trust in newcomers contributions

No, this is a general problem present already. Just talk to them and you can tell, both if they are human and if they actually have read and understood the project or are trying to.

lucid parrot Aug 23, 2024, 12:48 AM

#

ah so you don't think AI is necessarily exacerbating this issue?

#

then i'm curious if you think it's going to have an effect on OSS at all

violet gull Aug 23, 2024, 12:49 AM

#

as of now AI cant write anything complicated enough to warrant an issue

iron basalt Aug 23, 2024, 12:49 AM

#

lucid parrot ah so you don't think AI is necessarily exacerbating this issue?

Open Source has all kinds of issues, spam is annoying, but security issues are worse and ever growing.

lucid parrot Aug 23, 2024, 12:51 AM

#

security issues in general? or especially because of potential "bad" AI code

iron basalt Aug 23, 2024, 12:51 AM

#

lucid parrot security issues in general? or especially because of potential "bad" AI code

Intentional exploitation, not by AI bad code.

unreal condor Aug 23, 2024, 12:53 AM

#

Is anyone interested in taking part in shared tasks ? Specifically the upcoming SemEval 2025

lucid parrot Aug 23, 2024, 12:53 AM

#

i saw online that around 60% of github users are using github copilot and they're mostly newcomers so just curious if this is going to affect the open source community, it seems like this channel mostly thinks that it won't have an effect

violet gull Aug 23, 2024, 12:54 AM

#

lucid parrot i saw online that around 60% of github users are using github copilot and they'r...

copilot also cannot write anything complicated and any decent code review will catch it

iron basalt Aug 23, 2024, 12:54 AM

#

lucid parrot i saw online that around 60% of github users are using github copilot and they'r...

For any project I'm in charge of, you can't submit copilot code. Potential legal reasons.

lucid parrot Aug 23, 2024, 12:55 AM

#

is there some kind of check to make sure the code isn't generated by copilot? or is it based on your discretion?

unreal condor Aug 23, 2024, 12:55 AM

#

lucid parrot i saw online that around 60% of github users are using github copilot and they'r...

I don't think that should be a problem as long as you don't blindly use it and treat it as a supporting tool rather than use it to do all the work for you

violet gull Aug 23, 2024, 12:55 AM

#

lucid parrot is there some kind of check to make sure the code isn't generated by copilot? or...

code review + tests

iron basalt Aug 23, 2024, 12:55 AM

#

lucid parrot is there some kind of check to make sure the code isn't generated by copilot? or...

I can't always tell, I have to take their word at some point, but if I can tell it gets deleted and you are banned.

lucid parrot Aug 23, 2024, 12:57 AM

#

interesting. i'm curious if you've banned more people after copilot launched compared to before...

unreal condor Aug 23, 2024, 12:57 AM

#

Tbf, those who don't use AI to help them write code nowadays are probably seniors with conspiracy theory. AI-generated code is not that bad lmao, just don't misuse it

lucid parrot Aug 23, 2024, 12:58 AM

#

yeah i guess but also it does hallucinate and it's annoying to keep checking code...

iron basalt Aug 23, 2024, 12:58 AM

#

lucid parrot interesting. i'm curious if you've banned more people after copilot launched com...

No, contributions are low in every OSS project beyond small changes. Most projects are driven by one or a few people.

#

Pick any OSS project on github, and look at the contributors page, the first one will probably have 80% of the code.

lucid parrot Aug 23, 2024, 12:59 AM

#

yeah 100% agree. is there a way to tell when someone has been banned by just looking at a repo's pr?

iron basalt Aug 23, 2024, 1:00 AM

#

unreal condor Tbf, those who don't use AI to help them write code nowadays are probably senior...

They are senior engineers for a reason.

iron basalt Aug 23, 2024, 1:00 AM

#

lucid parrot yeah 100% agree. is there a way to tell when someone has been banned by just loo...

I just have my own ban list.

lusty patio Aug 23, 2024, 1:01 AM

#

unreal condor Tbf, those who don't use AI to help them write code nowadays are probably senior...

It's great for efficiency, like excel autofill for repetitive code, just not actually designing applications

lucid parrot Aug 23, 2024, 1:02 AM

#

ok so @iron basalt basically what i'm getting is that from your experience there's 0 tolerance for AI code and it isn't a huge issue because the number of people contributing isn't super high so it's not that annoying. lmk if i misunderstood anything

lucid parrot Aug 23, 2024, 1:02 AM

#

lusty patio It's great for efficiency, like excel autofill for repetitive code, just not actual...

totally agree

unreal condor Aug 23, 2024, 1:02 AM

#

lusty patio It's great for efficiency, like excel autofill for repetitive code, just not actual...

Yes

iron basalt Aug 23, 2024, 1:02 AM

#

lucid parrot ok so @iron basalt basically what i'm getting is that from your experie...

Yes, this is not like posting rage bait on Twitter to pay rent. Very few people actually want to contribute.

lucid parrot Aug 23, 2024, 1:03 AM

#

iron basalt Yes, this is not like posting rage bait on Twitter to pay rent. Very few people ...

lol true

unreal condor Aug 23, 2024, 1:03 AM

#

iron basalt They are senior engineers for a reason.

I don't deny their ability, but denying the convenience AI brings due to your skepticism is pretty biased, dont u think ?

iron basalt Aug 23, 2024, 1:04 AM

#

unreal condor I don't deny their ability, but denying the convenience AI brings due to your sk...

They understand from many years of experience of seeing new things that are there to provide convenience making things worse, trust their experience.

#

This is not like senior management, they are engineers.

#

If/when it's actually good enough, they will let you know.

lusty patio Aug 23, 2024, 1:05 AM

#

unreal condor I don't deny their ability, but denying the convenience AI brings due to your sk...

I agree, I feel like senior engineers let their egos get in the way and claim that AI is nowhere near as good as them

#

I mean I'll tell you one thing,

unreal condor Aug 23, 2024, 1:06 AM

#

iron basalt They understand from many years of experience of seeing new things that are ther...

That's wrong, just like how the old gen judges the new gens how easy their life is compared to them just because all the technology and stuff in real life

lusty patio Aug 23, 2024, 1:06 AM

#

it's a lifesaver if you dont wanna read through documentation

iron basalt Aug 23, 2024, 1:07 AM

#

unreal condor That's wrong, just like how the old gen judges the new gens how easy their life ...

Again, this is not random old people yelling at clouds, they are people with decades of software development experience.

lucid parrot Aug 23, 2024, 1:08 AM

#

lusty patio I agree, I feel like senior engineers let their egos get in the way and claim th...

so do you think it's creating a tension between the seniors and juniors? have you experienced this?

iron basalt Aug 23, 2024, 1:08 AM

#

I know that in modern culture it's the norm to disregard senior's advice in the context of politics and such, but this is not that.

lusty patio Aug 23, 2024, 1:09 AM

#

lucid parrot so do you think it's creating a tension between the seniors and juniors? have yo...

I mean, it's with all levels of engineers. but I've definitely seen a correlation between ego and hate on AI

lucid parrot Aug 23, 2024, 1:09 AM

#

what do you mean by ego lol

unreal condor Aug 23, 2024, 1:10 AM

#

iron basalt Again, this is not random old people yelling at clouds, they are people with dec...

And yes, what is the good in that experience when you don't even try to adapt to new things ? I've seen seniors say that AI "helps" them do their job, not actually "does" the job for them. Those who deny AI's convenience just have big egos and are way too fixed in their ways, without actually trying it first

lucid parrot Aug 23, 2024, 1:10 AM

#

because ai could technically save the managers time by making their subordinates check with the ai for answers before going to the manager?

unreal condor Aug 23, 2024, 1:10 AM

#

I've seen profs in my school can't even code properly lmao

lusty patio Aug 23, 2024, 1:10 AM

#

lucid parrot what do you mean by ego lol

devs thinking they are just built diff

#

in the world of cs, you either have imposter syndrome or an ego

#

it's very rare to find people without either

unreal condor Aug 23, 2024, 1:11 AM

#

lusty patio in the world of cs, you either have imposter syndrome or an ego

True lmao, could not agree more

iron basalt Aug 23, 2024, 1:11 AM

#

unreal condor I've seen profs in my school can't even code properly lmao

Professors are not what i'm talking about. I mean decades of working experience.

unreal condor Aug 23, 2024, 1:11 AM

#

You say profs dont have experience ?

iron basalt Aug 23, 2024, 1:12 AM

#

unreal condor You say profs dont have experience ?

Some do, many not so much.

#

Depends on if they wanted to stay in academia.

unreal condor Aug 23, 2024, 1:13 AM

#

I don't get it, why are you so skeptical about an AI that could autofill a "for loop" for you. You could even check it afterward

lusty patio Aug 23, 2024, 1:14 AM

#

iron basalt Professors are not what i'm talking about. I mean decades of working experience.

even then, depends on the experience

#

I know a lot of senior engineers that work for the government or another "slow paced setting" and they don't know anything outside of their very specific domain

unreal condor Aug 23, 2024, 1:14 AM

#

Just don't tell that AI to build the whole system and u should be fine

lusty patio Aug 23, 2024, 1:14 AM

#

and even their domain knowledge often isn't impressive,

#

just saying, work experience does not translate to wisdom

iron basalt Aug 23, 2024, 1:15 AM

#

unreal condor I don't get it, why are you so skeptical about an AI that could autofill a "for loop" ...

This is fine and all, but now you are bordering on a snippet tool, which is not what many want to use it for.

lusty patio Aug 23, 2024, 1:15 AM

#

unreal condor Just don't tell that AI to build the whole system and u should be fine

Well, I disagree with that

#

with AI, comes integration hell

#

its pretty good at designing systems ngl and individually coding out components of the systems

#

its just

#

integration is the aids part, taking all the boiler plate based code and fitting it together like a puzzle

lucid parrot Aug 23, 2024, 1:16 AM

#

yeah i feel like not just integration but also with some repos having certain style guidelines - not sure how ai would match these guidelines and then maintainers can probably tell it's ai generated

lusty patio Aug 23, 2024, 1:16 AM

#

lucid parrot yeah i feel like not just integration but also with some repos having certain st...

It can if instructed

#

quite easily

lucid parrot Aug 23, 2024, 1:16 AM

#

for individual repos?

lusty patio Aug 23, 2024, 1:16 AM

#

its one of the things it does best. As long as you specify the style guides

lucid parrot Aug 23, 2024, 1:17 AM

#

hmm i tested it out a while ago and it generated bs

lusty patio Aug 23, 2024, 1:17 AM

#

you probably just need better prompt engineering my friend

lucid parrot Aug 23, 2024, 1:17 AM

#

lol

#

it seems like this channel has people on 2 opposite ends of the spectrum - fans and haters

#

i'm trying to be in the middle tho

unreal condor Aug 23, 2024, 1:18 AM

#

Tbh, instructing AI is prompt-learning and its performance is still not comparable to other types of learning

lusty patio Aug 23, 2024, 1:18 AM

#

unreal condor Tbh, instructing AI is prompt-learning and its performance is still not comparable ...

wdym

iron basalt Aug 23, 2024, 1:19 AM

#

lucid parrot it seems like this channel has people on 2 opposite ends of the spectrum - fans ...

I do work on such systems, I just also understand the senior engineer's concerns.

unreal condor Aug 23, 2024, 1:22 AM

#

lusty patio wdym

Thanks to the rise of LLMs, there is a new type of learning called prompt-learning. But here is the thing, u don't train or fine-tune the LLM, u just make prompts for it and make it do specific tasks like classification. And compared to other methods, the performance is really bad. It could achieve somewhat average results if the validation data is simple enough.

unreal condor Aug 23, 2024, 1:25 AM

#

lucid parrot it seems like this channel has people on 2 opposite ends of the spectrum - fans ...

Ok i'm kinda new here, but is this channel the place where we discuss the theory behind AI and how to build it or do we just give comments and reviews about AI-based tools and the views around AI ?

left tartan Aug 23, 2024, 1:46 AM

#

unreal condor Ok i'm kinda new here, but is this channel the place where we discuss the theory...

This is generally about both the practice and theory of DS, including AI/ML. Debating AI tools like copilot is not really what's covered here, altho perhaps a well framed question might be on topic, I dunno.

faint quail Aug 23, 2024, 2:29 AM

#

why is my validation so spiky?

unreal condor Aug 23, 2024, 2:45 AM

#

faint quail why is my validation so spiky?

It seems like your model diverged at random epochs. It could have happened because of a high learning rate, but it converged in the end so this should not be something that you worry about.

pine escarp Aug 23, 2024, 3:07 AM

#

Guys, whats the best web scraping tool?
I want to get data on NVIDIA GPUS and compare them with Intel.

violet gull Aug 23, 2024, 3:08 AM

#

pine escarp Guys, whats the best web scraping tool? I want to get data on NVIDIA GPUS and co...

lol what

faint quail Aug 23, 2024, 3:47 AM

#

unreal condor It seems like your model diverged at random epochs. It could have happened becau...

Now my validation loss is lower than my training 💀
I think my BatchNorm code may be incorrect because that's the only part that's different during inference

Code: https://paste.pythondiscord.com/Y5NQ

#

Maybe its because I applied batch norm after every conv layer and deep layer
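Since BatchNorm is one of the few layers that behaves differently at inference, here is a minimal NumPy sketch of that difference. It is not the pasted code (which is not shown in this log): the class name `TinyBatchNorm`, the `momentum` value and the toy batch are assumptions for illustration. In training mode it normalises with the current batch's statistics and updates running averages; in eval mode it uses only the running averages. If the running statistics are updated incorrectly, or regularization such as dropout is active only during training, a validation loss below the training loss is a plausible symptom, though not the only possible explanation.

```python
import numpy as np


class TinyBatchNorm:
    """Per-feature batch normalisation for inputs of shape (batch, features)."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.gamma = np.ones(num_features)   # learnable scale (kept fixed here)
        self.beta = np.zeros(num_features)   # learnable shift (kept fixed here)
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps

    def forward(self, x, training):
        if training:
            # Training: use statistics of the current batch ...
            mean = x.mean(axis=0)
            var = x.var(axis=0)
            # ... and fold them into the running averages used later at inference.
            self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean
            self.running_var = (1 - self.momentum) * self.running_var + self.momentum * var
        else:
            # Inference: use the stored running averages only.
            mean, var = self.running_mean, self.running_var
        x_hat = (x - mean) / np.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta


rng = np.random.default_rng(0)
bn = TinyBatchNorm(3)
batch = rng.normal(loc=5.0, scale=2.0, size=(32, 3))

train_out = bn.forward(batch, training=True)
eval_out = bn.forward(batch, training=False)
print("train-mode mean per feature:", train_out.mean(axis=0))
print("eval-mode mean per feature: ", eval_out.mean(axis=0))
```

After a single training step the running averages are still far from the data statistics, so the same batch produces very different outputs in the two modes; that gap shrinks as more batches are seen.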

unreal condor Aug 23, 2024, 4:02 AM

#

faint quail Now my validation loss is lower than my training 💀 I think my BatchNorm code m...

Why are you dreading over low loss ? Shouldn't you be glad ?

faint quail Aug 23, 2024, 4:03 AM

#

unreal condor Why are you dreading over low loss ? Shouldn't you be glad ?

yeah but its still a little strange lol

#

ig it just has to do with the regularization

unreal condor Aug 23, 2024, 4:04 AM

#

I've never seen regularization cause such a problem

#

What is the value of ur learning rate then

faint quail Aug 23, 2024, 4:04 AM

#

learning rate of 0.00001

unreal condor Aug 23, 2024, 4:04 AM

#

It could be some hardware problem too

faint quail Aug 23, 2024, 4:04 AM

#

and batch size of 32

faint quail Aug 23, 2024, 4:05 AM

#

unreal condor It could be some hardware problem too

perhaps

unreal condor Aug 23, 2024, 4:06 AM

#

Ye, then i have no ideas, i've never delved too deep into regularization or normalization. But like i said, if it converges in the end, it works just fine, don't try to fix it lol

#

I have found somewhat of a feasible answer to ur problem: https://stackoverflow.com/questions/61287322/validation-loss-sometimes-spiking


fiery bane Aug 23, 2024, 4:37 AM

#

faint quail and batch size of 32

idk what's the size of your dataset, but you can pump up those numbers a bit, maybe it'll be more stable
make sure you don't have a bigger batch size than the number of iter in epoch (just my own instinct, no actual math here)

fiery bane Aug 23, 2024, 4:38 AM

#

faint quail yeah but its still a little strange lol

i know you are trying to do some sanity check.
make sure this is not a sign of a bigger issue.
Just redo it with 10 random seeds, if all 10 converge despite the spikes, then that's fine

faint quail Aug 23, 2024, 5:20 AM

#

fiery bane i know you are trying to do some sanity check. make sure this is not a sign of ...

alr thx for the tips

simple tapir Aug 23, 2024, 5:40 AM

#

Hey

#

I'm trying to build a real time hand gesture pipeline

#

import React from "react";
import { createRoot } from "react-dom/client";
import App from "./App";
import "./style/index.css";
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-backend-wasm'; 

const root = createRoot(document.getElementById("root"));

tf.setBackend('wasm').then(() => {
  root.render(
    <React.StrictMode>
      <App />
    </React.StrictMode>
  );
});

I have basically used this approach to set the backend to wasm

#

But I get these errors

verbal oar Aug 23, 2024, 7:41 AM

#

you have only one error failed to load resource

#

I mean other are labeled as warnings but have error word in text 🤔

#

I think it's rather javascript and webassembly related

woven sundial Aug 23, 2024, 8:05 AM

#

hi im new here 🙂
i wrote an astrophoto ai denoise script in python but i have small (big) problem with it, can someone help me out?

#

it denoises well, but i have really visible tile borders, where should i start to get rid of them?

#

tried doing overlap, changing stride and its still there

#

heres sourcecode: https://paste.pythondiscord.com/ATOQ
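A possible starting point, sketched without having seen the pasted source: overlap alone usually still leaves seams if each tile simply overwrites the previous one, so a common approach is to weight every tile with a window that tapers towards its edges and divide by the accumulated weights at the end. The function name `blend_tiles`, the `denoise_fn` parameter, the tile/stride sizes and the Hann window are all assumptions for illustration; the identity function stands in for the real model.

```python
import numpy as np


def blend_tiles(image, tile=64, stride=32, denoise_fn=lambda t: t):
    """Run denoise_fn on overlapping tiles and blend them with a tapered window.

    Assumes a 2-D image that is at least `tile` pixels in each dimension.
    """
    h, w = image.shape
    # Drop the zero endpoints of the Hann window so every weight is strictly positive.
    win1d = np.hanning(tile + 2)[1:-1]
    window = np.outer(win1d, win1d)

    out = np.zeros((h, w), dtype=np.float64)
    weight = np.zeros((h, w), dtype=np.float64)

    # Tile origins; add a final tile flush with the border so no pixel is left uncovered.
    ys = list(range(0, h - tile + 1, stride))
    xs = list(range(0, w - tile + 1, stride))
    if ys[-1] != h - tile:
        ys.append(h - tile)
    if xs[-1] != w - tile:
        xs.append(w - tile)

    for y in ys:
        for x in xs:
            patch = denoise_fn(image[y:y + tile, x:x + tile])
            out[y:y + tile, x:x + tile] += patch * window
            weight[y:y + tile, x:x + tile] += window

    # Normalise by the total weight each pixel received.
    return out / weight


noisy = np.random.default_rng(1).random((100, 130))
blended = blend_tiles(noisy)
print(np.abs(blended - noisy).max())
```

With the identity function the blended result should reproduce the input, which is a handy sanity check that the weighting itself introduces no seams before the real denoiser is plugged in.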

scarlet anchor Aug 23, 2024, 9:01 AM

#

Hi, I am trying to load the Llama 3 model from Hugging Face on my colab

#

on colab -

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", low_cpu_mem_usage=True)

It takes forever and stops after running out of memory (Your session crashed after using all available RAM.)->

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards:  25%
 1/4 [00:24<01:12, 24.10s/it]

#

It happens even tho I am using GPU

#

On my jupyter notebook, this command -

! huggingface-cli login

#

takes forever to run

#

Ideally it should show something like this -

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.

#

Is there any workaround to fix this?

#

I did try using this - low_cpu_mem_usage=True but still it crashes
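A back-of-the-envelope check may explain the crash: unless a dtype is requested, the weights are typically materialised in float32, and roughly 8 billion parameters at 4 bytes each do not fit in about 20 GB of RAM. The sketch below estimates this and shows one commonly used way to load in half precision. The parameter count is rounded, and `load_llama_half_precision` is a hypothetical helper; `device_map="auto"` additionally needs the `accelerate` package, and on a small GPU even float16 may be too tight without quantization.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed just for the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


N_PARAMS = 8e9  # assumption: roughly 8 billion parameters

fp32_gib = weight_memory_gib(N_PARAMS, 4)
fp16_gib = weight_memory_gib(N_PARAMS, 2)
print(f"float32: {fp32_gib:.1f} GiB, float16: {fp16_gib:.1f} GiB")


def load_llama_half_precision(model_id="meta-llama/Meta-Llama-3-8B"):
    """Hypothetical helper: load the model in float16 and let accelerate place it."""
    # Imported lazily so the estimate above runs without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # halves the memory compared with float32
        device_map="auto",          # requires the accelerate package
        low_cpu_mem_usage=True,
    )
    return tokenizer, model
```

Calling `load_llama_half_precision()` still downloads the full checkpoint and needs access to the gated repo, so the estimate is the part worth checking first.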

placid horizon Aug 23, 2024, 10:10 AM

#

scarlet anchor I did try using this - ``low_cpu_mem_usage=True`` but still it crashes

What's ur ram size

#

Ideally it should be 16gb

hearty token Aug 23, 2024, 10:27 AM

#

#

Is this imbalance of 4 star and 5 star reviews over other classes bad for training?

jaunty helm Aug 23, 2024, 10:30 AM

#

hearty token Is this imbalance of 4 star and 5 star reviews over other classes bad for traini...

some things to keep in mind:

1. be sure that you have a healthy amount of 1-5 classes in both training & testing sets, e.g. by setting stratify in sklearn.model_selection.train_test_split
2. using accuracy might not be the best idea

hearty token Aug 23, 2024, 10:31 AM

#

jaunty helm some things to keep in mind: 1. be sure that you have a healthy amount of 1-5 cla...

Gotcha, is there something equivalent for pandas or pytorch? I split it currently like this:

train_size = 0.8
train_dataset=df.sample(frac=train_size,random_state=200)
test_dataset=df.drop(train_dataset.index).reset_index(drop=True)
train_dataset = train_dataset.reset_index(drop=True)

And what do you mean by accuracy not being the best idea?

jaunty helm Aug 23, 2024, 10:35 AM

#

hearty token Gotcha, is there something equivalent for pandas or pytorch? I split it currentl...

> equivalent pytorch / pandas
off the top of my head, I don't think so... iirc last time I did something like this, I just grabbed train_test_split or StratifiedKFold from sklearn

> accuracy not good
imagine for simplicity, you have 1000 data points for reviews, 900 gave a 5 and 100 gave a 1.
simply by predicting everything to be a 5, you achieve 90% accuracy
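The 900/100 example can be checked in a few lines. This sketch computes plain accuracy next to balanced accuracy (the mean of per-class recall), one of several metrics that are less flattering to a majority-class predictor; the variable names are made up for illustration.

```python
# 900 five-star reviews and 100 one-star reviews, as in the example above.
y_true = [5] * 900 + [1] * 100
y_pred = [5] * 1000  # a "model" that always predicts 5 stars

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_for(label):
    """Fraction of the true `label` items that were predicted as `label`."""
    idx = [i for i, t in enumerate(y_true) if t == label]
    return sum(y_pred[i] == label for i in idx) / len(idx)


classes = sorted(set(y_true))
balanced_accuracy = sum(recall_for(c) for c in classes) / len(classes)

print(f"accuracy={accuracy:.2f}, balanced accuracy={balanced_accuracy:.2f}")
```

The always-5 predictor gets every 5-star review right and every 1-star review wrong, so its per-class recalls are 1.0 and 0.0.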

#

though now that I look at it, is that kaggle? if so, just use whatever metric they use
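On the pandas question above: one pandas-only option, sketched here under the assumption that the label lives in a column called `rating` (a hypothetical name, as is the toy data), is to sample the same fraction from each class with `groupby(...).sample(...)`. It keeps the shape of the original `df.sample` split; `train_test_split(..., stratify=...)` from sklearn remains the more common route.

```python
import pandas as pd

# Toy, deliberately imbalanced data standing in for the real reviews.
df = pd.DataFrame({
    "review": [f"review {i}" for i in range(100)],
    "rating": [5] * 60 + [4] * 25 + [3] * 5 + [2] * 5 + [1] * 5,
})

train_size = 0.8
# Sample 80% of the rows within each rating, so class proportions are preserved.
train_dataset = df.groupby("rating").sample(frac=train_size, random_state=200)
test_dataset = df.drop(train_dataset.index).reset_index(drop=True)
train_dataset = train_dataset.reset_index(drop=True)

print(train_dataset["rating"].value_counts().sort_index())
print(test_dataset["rating"].value_counts().sort_index())
```

With very rare classes the per-class rounding matters, so it is worth printing the counts as above before training.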

hearty token Aug 23, 2024, 10:38 AM

#

jaunty helm > equivalent pytorch / pandas off the top of my head, I don't think so... iirc l...

Ahhh yeah that makes total sense. That wouldn't be a problem if each of the classes exist in equal quantities in the data set would it? I am actually not planning to use the entire 20k records. I have space to make each of the classes in equal amount

hearty token Aug 23, 2024, 10:38 AM

#

jaunty helm though now that I look at it, is that kaggle? if so, just use whatever metric th...

Yes, it is from kaggle

scarlet anchor Aug 23, 2024, 11:35 AM

#

placid horizon Ideally it should be 16gb

its 20 gigs RAM

lapis sequoia Aug 23, 2024, 12:13 PM

#

pretty nice post showing logit and logistic are inverses https://math.stackexchange.com/questions/3252945/how-to-justify-the-logistic-function-is-the-inverse-of-the-natural-logit-functio
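For reference, the inverse relationship can be written out in a few lines. This is the standard derivation rather than a quote from the linked post, with $\sigma$ denoting the logistic function and $p \in (0, 1)$:

```latex
\begin{aligned}
p = \sigma(x) &= \frac{1}{1 + e^{-x}} \\
\frac{1}{p} &= 1 + e^{-x} \\
\frac{1 - p}{p} &= e^{-x} \\
\operatorname{logit}(p) = \log\frac{p}{1 - p} &= x
\end{aligned}
```

So applying the logit to the output of the logistic function returns the original $x$.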

left vault Aug 23, 2024, 12:50 PM

#

halp :3 https://discord.com/channels/267624335836053506/1276523830147551263

lapis sequoia Aug 23, 2024, 1:37 PM

#

1st time reading about SVMs (and co), very neat idea, if anyone wants to discuss https://en.wikipedia.org/wiki/Kernel_method

lapis sequoia Aug 23, 2024, 2:10 PM

#

one does get into muddy waters fast https://en.wikipedia.org/wiki/Inner_product_space lol

woven sundial Aug 23, 2024, 2:54 PM

#

woven sundial it denoises well, but i have really visible tile borders, where should i start t...

???

past bramble Aug 23, 2024, 3:28 PM

#

ay

#

can anyone guide me to creating LLMs?

lapis sequoia Aug 23, 2024, 3:30 PM

#

nice image-summary of svms

jaunty helm Aug 23, 2024, 3:31 PM

#

past bramble can anyone guide me to creating LLMs?

from scratch? don't even try, the amount of data required isn't really accessible to individuals
otherwise, checkout huggingface

verbal oar Aug 23, 2024, 3:34 PM

#

yeah I read about it in grokking ml
embed in 3d space and then project back

#

but before doing embedding move triangles up and squares down

pine escarp Aug 23, 2024, 3:34 PM

#

lapis sequoia nice image-summary of svms

how do you read or understand the 3d plot

verbal oar Aug 23, 2024, 3:35 PM

#

I recommend kernel method section in grokking ml

jaunty helm Aug 23, 2024, 3:36 PM

#

pine escarp how do you read or understand the 3d plot

in 2d, there's no line that'd separate the red from blue
the idea is, to use some function to transform those 2d points into higher dimensions, in this case 3d, then in that higher dimension, you might be able to find a hyperplane that can separate the data, which is what's shown in the 3d plot
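The explanation above can be tried on toy data (not the data from the image, which is not available here): two concentric rings cannot be split by a line in 2D, but after adding a third coordinate z = x² + y² a flat plane separates them. The `lift` and `ring` helpers, the radii and the threshold are assumptions chosen for illustration.

```python
import math


def lift(point):
    """Map a 2-D point to 3-D by appending its squared distance from the origin."""
    x, y = point
    return (x, y, x * x + y * y)


def ring(radius, n=12):
    """n points evenly spaced on a circle of the given radius."""
    return [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


inner = ring(1.0)  # class "red": surrounded by the other class in 2-D
outer = ring(3.0)  # class "blue"

# In 3-D the horizontal plane z = 4 sits between the two classes.
threshold = 4.0
inner_z = [lift(p)[2] for p in inner]
outer_z = [lift(p)[2] for p in outer]

print("max z of inner ring:", max(inner_z))
print("min z of outer ring:", min(outer_z))
print("separable by the plane z = 4:", max(inner_z) < threshold < min(outer_z))
```

Projecting the plane back to 2D gives a circle of radius 2, which is the curved decision boundary one would see in the original space.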

verbal oar Aug 23, 2024, 3:36 PM

#

yes, at the start it's not linearly separable, but after the kernel trick it's separable

pine escarp Aug 23, 2024, 3:37 PM

#

jaunty helm in 2d, there's no line that'd separate the red from blue the idea is, to use som...

Yeahhhh, thanks i understand now

lapis sequoia Aug 23, 2024, 3:40 PM

#

pine escarp how do you read or understand the 3d plot

it's separating each class according to the fitted plane

#

the dots are transformed using the kernel trick, from 2D to 3D.

past bramble Aug 23, 2024, 3:40 PM

#

jaunty helm from scratch? don't even try, the amount of data required isn't really accessibl...

true but I really wanna learn llms from scratch. I don't know what kind of datasets are used for this but I could prolly find a small one at least online.

I have heard using vectors that map words with values and multiple other ways, no idea which dataset I would need for this

verbal oar Aug 23, 2024, 3:40 PM

#

hmm so unproject is same word for embedding?

lapis sequoia Aug 23, 2024, 3:41 PM

#

jaunty helm in 2d, there's no line that'd separate the red from blue the idea is, to use som...

nice explanation, i missed it

jaunty helm Aug 23, 2024, 3:42 PM

#

past bramble true but I really wanna learn llms from scratch. I don't know what kind of datas...

right now, most mainstream LLMs are built from a special DNN architecture called a transformer
map word -> values that's called the word embedding layer

jaunty helm Aug 23, 2024, 3:42 PM

#

lapis sequoia nice explanation, i missed it

also the kernel trick and SVMs are 2 separate things
it's just that usually you use them together

shut shoal Aug 23, 2024, 3:42 PM

#

import os

from langchain_groq import ChatGroq

os.environ['pipeline'] = 'code'

# Verify that the environment variable is set
print(os.environ['pipeline'])

#Create the question answer pairs using groq api
def groq_qa_pairs(text):
    #Create the client
    groq_chat = ChatGroq(
        #Keep the temperature low to maintain more precise question and answer
        temperature = 0.3,
        #Retrieve the key
        groq_api_key = os.environ['pipeline'],
        #Get the model type
        model_name= "llama3-8B"
    )
    #Give the prompt
    system_prompt = (
    "You are an expert in the Indian legal system and your job is to summarize legal documents. You will be given text from real court cases" 
    "and you will need to generate what the underlying question was of that court case and the outcome of the court case. Here is an example"
    "of the format I want you to follow: \n\n"
    "Legal text: {t}"    
    "Q:\n"
    "A: ").format(t = text)
    #Get an output
    response = groq_chat.generate(system_prompt)
    return response

groq_qa_pairs(pdf1)

TypeError: Got unknown type Y

Why is this the error for me?

#

I'm pretty new on this stuff so I don't really know what really happened.

lapis sequoia Aug 23, 2024, 3:43 PM

#

yeah, the kernel trick is just a transformation of the features.

#

right?

past bramble Aug 23, 2024, 3:43 PM

#

jaunty helm right now, most mainstream LLMs are built from a special DNN architecture called...

I have heard openai api providing embeddings, haven't looked into it. Embeddings generally in this field mean the mappings?

agile cobalt Aug 23, 2024, 3:43 PM

#

shut shoal ```py os.environ['pipeline'] = 'code' # Verify that the environment variable is...

show the full traceback

verbal oar Aug 23, 2024, 3:44 PM

#

kernel is map for svm

lapis sequoia Aug 23, 2024, 3:44 PM

#

yeah, just non linear normally, i'd expect

verbal oar Aug 23, 2024, 3:44 PM

#

svm are classically just linear

shut shoal Aug 23, 2024, 3:45 PM

#

agile cobalt show the full traceback

It's very long

agile cobalt Aug 23, 2024, 3:45 PM

#

past bramble I have heard openai api providing embeddings, haven't looked into it. Embedding...

in the general sense, embeddings are a compressed representation of data

usually they'll be vectors with a fixed size (determined by the model that produces it), and be normalized as floats in the range of -1.0 ~ 1.0

verbal oar Aug 23, 2024, 3:45 PM

#

and you have rbf, polynomial, gaussian kernel

agile cobalt Aug 23, 2024, 3:45 PM

#

shut shoal It's very long

!paste

arctic wedgeBOT Aug 23, 2024, 3:45 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

lapis sequoia Aug 23, 2024, 3:45 PM

#

but requires you to know a useful fn to map them

shut shoal Aug 23, 2024, 3:45 PM

#

agile cobalt !paste

https://paste.pythondiscord.com/DP6A

lapis sequoia Aug 23, 2024, 3:45 PM

#

those are just transformation kernels for svms or what do you mean? @verbal oar

#

cuz most are also activations in DNNs, thats a separate thing ig

verbal oar Aug 23, 2024, 3:46 PM

#

I mean when you provide as parameter kernel=
in scikit learn for example

#

'poly', 'rbf' etc.

lapis sequoia Aug 23, 2024, 3:47 PM

#

ig those are just doing rbf(x)=>x_t

#

and so on, so they are just specific $\phi$s in the wikipedia page (which is to my understanding a non-linear transformation, i.e like an activation in DNNs.)

past bramble Aug 23, 2024, 3:47 PM

#

agile cobalt in the general sense, embeddings are a compressed representation of data usuall...

thanks I'm understanding it!
i know it sounds stupid but can I go ahead to find an embeddings dataset to create an llm? I don't know how big they would be yet, I'll research on it

agile cobalt Aug 23, 2024, 3:50 PM

#

past bramble thanks I'm understanding it! i know it sounds stupid but can I go ahead to find ...

creating an LLM from scratch requires training on millions of data samples at least; large models like Llama are trained on trillions of tokens

take a look at https://github.com/karpathy/nanoGPT though, it is a bit more reasonable but won't be useful for much besides research/learning

#

you can use an open source text encoder model to create your own embeddings though, look up the architecture of some popular open source models

agile cobalt Aug 23, 2024, 3:53 PM

#

past bramble thanks I'm understanding it! i know it sounds stupid but can I go ahead to find ...

embeddings are also frequently used for vector similarity search ; if two sentences carry a similar meaning, it is assumed that their embeddings will be similar. This can also be used for documents, images, videos etc. as long as you have a model that can encode that data (and there are even some multi-modal models which can encode multiple types into the same 'space')
random example of something I did for images

with text, that's commonly used as the first step in a Retrieval Augmented Generation pipeline
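
A minimal sketch of the similarity-search idea above, assuming cosine similarity as the comparison (a common choice, not something stated in the chat). The 4-dimensional vectors and the sentences are made up for illustration; real embeddings would come from an encoder model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example
store = {
    "a cat sat on the mat": [0.9, 0.1, 0.0, 0.2],
    "a kitten rested on the rug": [0.8, 0.2, 0.1, 0.3],
    "quarterly revenue grew 5%": [0.0, 0.9, 0.7, 0.1],
}


def rank_by_similarity(query_vec, store):
    """Return (text, score) pairs, most similar first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


query = [0.9, 0.1, 0.0, 0.25]  # hypothetical embedding of a cat-related query
ranked = rank_by_similarity(query, store)
for text, score in ranked:
    print(f"{score:.3f}  {text}")
```

In a retrieval pipeline the top-ranked texts would then be passed to the language model as context.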

agile cobalt Aug 23, 2024, 3:56 PM

#

shut shoal https://paste.pythondiscord.com/DP6A

hmm idk, where did you get the code from?

shut shoal Aug 23, 2024, 3:59 PM

#

agile cobalt hmm idk, where did you get the code from?

I looked on examples on groq and based it upon there.

#

I can find the example I was basing it on.

#

https://replit.com/@GroqCloud/Groq-Quickstart-Conversational-Chatbot?v=1#main.py

#

This was what I based it upon

agile cobalt Aug 23, 2024, 4:00 PM

#

shut shoal I looked on examples on groq and based it upon there.

my guess is that you're using an incompatible combination of model class / api provider / model name

shut shoal Aug 23, 2024, 4:01 PM

#

agile cobalt my guess is that you're using an incompatible combination of model class / api p...

What can I do to fix this?

agile cobalt Aug 23, 2024, 4:01 PM

#

shut shoal https://replit.com/@GroqCloud/Groq-Quickstart-Conversational-Chatbot?v=1#main.py

that is entirely different from what you are doing

agile cobalt Aug 23, 2024, 4:01 PM

#

shut shoal What can I do to fix this?

copy/paste from somewhere that works

either langchain's documentation or groq's documentation

lapis sequoia Aug 23, 2024, 4:02 PM

#

it may be interesting to read the creator of SVMs

#

just realised he's in lex fridman

#

this is pretty cool summary ig:

In addition to performing linear classification, SVMs can efficiently perform a non-linear classification using what is called the kernel trick, representing the data only through a set of pairwise similarity comparisons between the original data points using a kernel function, which transforms them into coordinates in the higher dimensional feature space.
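
A small sketch of what that summary means, using the degree-2 polynomial kernel as an assumed example (the names `phi` and `poly_kernel` are made up here). The kernel value computed in 2D equals the dot product after explicitly mapping to 3D, so the higher-dimensional coordinates never have to be built. The last lines show why the mapping helps: points inside and outside a circle become separable by a plane.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Explicit feature map from 2D to 3D for the degree-2 polynomial kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly_kernel(x, y):
    """k(x, y) = (x . y)^2, computed without leaving 2D."""
    return dot(x, y) ** 2


x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))  # map to 3D first, then take the dot product
implicit = poly_kernel(x, y)    # same number, no 3D coordinates needed
print(explicit, implicit)

# In 3D, the plane z1 + z3 = 1 separates points inside the unit circle
# from points outside it, which no straight line can do in 2D.
w = (1.0, 0.0, 1.0)
inner, outer = (0.3, 0.4), (1.2, -1.6)
print(dot(w, phi(inner)), dot(w, phi(outer)))
```

This also illustrates the earlier point that the kernel trick and the SVM are separate things: the kernel only supplies the pairwise similarities, and the SVM is the linear classifier that uses them.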

shut shoal Aug 23, 2024, 4:04 PM

#

agile cobalt copy/paste from somewhere that works either langchain's documentation or groq's...

Okay awesome

past bramble Aug 23, 2024, 4:22 PM

#

agile cobalt creating a LLM from scratch requires training on millions of data samples at lea...

thanks! it's looking interesting so far as I'm reading it

past bramble Aug 23, 2024, 4:28 PM

#

agile cobalt embeddings are also frequently used for vector similarity search ; if two senten...

I'll experiment on simpler embeddings first.

Also that website, I don't think the output is supposed to look like this

agile cobalt Aug 23, 2024, 4:29 PM

#

past bramble I'll experiment on simpler embeddings first. Also that website, I don't think...

it is working as expected, it just does not work very well lol

#

oh, that is the example one

yeah it is very sensitive

past bramble Aug 23, 2024, 4:32 PM

#

Im finding everything except the exact same kanji

#

pithink

#

should be easier to find same ones than similar according to me, it's impressive btw

agile cobalt Aug 23, 2024, 4:33 PM

#

this worked for me, the angles and intersections are very important

past bramble Aug 23, 2024, 4:41 PM

#

jaunty helm from scratch? don't even try, the amount of data required isn't really accessibl...

I have heard the term huggingface a lot, what exactly is it? python module? some tool for creating text models?

agile cobalt Aug 23, 2024, 4:42 PM

#

past bramble I have heard the term huggingface a lot, what exactly is it? python module? som...

it's a website, a bit similar to GitHub but for machine learning models and datasets

it also has a bunch of python libraries that make it easier to access models hosted on their website

edit; oh yeah, it also has the Spaces that let you deploy python applications for free like the above derp

lapis sequoia Aug 23, 2024, 4:43 PM

#

article is pretty large, but so good. the math is overall quite digestible imho https://en.wikipedia.org/wiki/Support_vector_machine

#

sharing in case anyone wants to discuss

past bramble Aug 23, 2024, 4:46 PM

#

agile cobalt it's a website, a bit similar to GitHub but for machine learning models and data...

are the models free to use? can I use it to create new text models smarter than gpt2? (I have heard its the free fine tunable gpt)

#

if I see anything needs to be paid for I'm out

#

ducky_drawing

agile cobalt Aug 23, 2024, 4:49 PM

#

past bramble are the models free to use? can I use it to create new text models smarter than ...

most models are free to download and you can run inference locally without additional costs besides your own compute/electricity, but you have to check their licenses (just like you would need to if you were downloading something from github, or installing from pypi)

some of them are free to use via their API or inside of Spaces

you cannot really use Hugging Face's free hosting to train models though, it is focused on inference and deployment

for training you could try Google Colab or Kaggle if you want free cloud compute, and iirc that gpt repository I linked earlier is at least on the same level as GPT2 and can be trained in them

crystal talon Aug 23, 2024, 4:51 PM

#

hello! im having some problems with very long html parsing times (talking about minutes for around 30 pages), is that normal?

lapis sequoia Aug 23, 2024, 4:52 PM

#

idk but you could use multiprocessing and map those to your cpu cores in parallel, right? unless it's only one html and can't be split

crystal talon Aug 23, 2024, 4:53 PM

#

lapis sequoia idk but you could use multiprocessing and map those to your cpu cores in paralle...

ive got the code posted in #1035199133436354600 , wondering if u can have a look to see what i can do

lapis sequoia Aug 23, 2024, 4:54 PM

#

is your 'time' including the request or just the parsing?

crystal talon Aug 23, 2024, 4:55 PM

#

was the whole code so probably including the request

#

even so it takes a longer time compared to my other projects

lapis sequoia Aug 23, 2024, 4:55 PM

#

oh, i think your first step should be to identify the bottleneck

#

yeah but otherwise we dont know what to fix

past bramble Aug 23, 2024, 4:55 PM

#

agile cobalt most models are free to download and you can run inference locally without addit...

good to know free options exist, in the repo readme, they mentioned A100 CPU or GPU, I don't recall what exactly it was, is required to train them

lapis sequoia Aug 23, 2024, 4:55 PM

#

it may be a crappy server, who knows !

#

https://tenor.com/view/server-runing-server-up-and-running-hamster-gif-17522250

crystal talon Aug 23, 2024, 4:56 PM

#

lapis sequoia it may be a crappy server, who knows !

ironically it's a gov website 🤷‍♂️

#

oh well - what are some ways to isolate the bottleneck?

jaunty helm Aug 23, 2024, 4:56 PM

#

past bramble good to know free options exist, in the repo readme, they mentioned A100 CPU or...

training an AI takes a lot of computations
meaning that you need a pretty high tier gpu to have decent speeds

unreal condor Aug 23, 2024, 4:57 PM

#

past bramble good to know free options exist, in the repo readme, they mentioned A100 CPU or...

You don't need gpu to train models, you just need a cpu and at least 35 years.

lapis sequoia Aug 23, 2024, 4:57 PM

#

just time each part

import time

start = time.perf_counter()
# code to measure, e.g. the request
end = time.perf_counter()
print(end - start)  # elapsed seconds for this part
# ...repeat for the parsing step

jaunty helm Aug 23, 2024, 4:57 PM

#

crystal talon oh well - what are some ways to isolate the bottleneck?

use a profiler, like line_profiler
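
line_profiler is a third-party package; as a rough sketch of the same idea, the standard library's cProfile can be used instead (a swapped-in tool, not the one suggested above). `fetch_page` and `parse_page` are hypothetical stand-ins for the real request and parsing code.

```python
import cProfile
import io
import pstats
import time


def fetch_page():
    # Stand-in for the network request (hypothetical; replace with the real call)
    time.sleep(0.05)
    return "<html><body><p>hello</p></body></html>"


def parse_page(html):
    # Stand-in for the HTML parsing step
    return html.count("<p>")


def scrape():
    return parse_page(fetch_page())


profiler = cProfile.Profile()
profiler.enable()
scrape()
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(10)
report = buffer.getvalue()
print(report)
```

The function with the largest cumulative time is the first place to look; here that would be the simulated request.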

jaunty helm Aug 23, 2024, 4:58 PM

#

jaunty helm training an AI takes a lot of computations meaning that you need a pretty high t...

if you're just running, then the requirements drop by a lot
personally I can run an 8b model pretty comfortably on a 4060 (8gb vram)

past bramble Aug 23, 2024, 4:58 PM

#

ain't gpu "graphics" pu, what's it doing to train text models

jaunty helm Aug 23, 2024, 4:59 PM

#

past bramble ain't gpu "graphics" pu, what's it doing to train text models

graphics involve a lot of matrix multiplications
neural nets just so happen to do a lot of mat muls

past bramble Aug 23, 2024, 4:59 PM

#

unreal condor You don't need gpu to train models, you just need a cpu and at least 35 years.

i see that's cost effective 👍🏻

#

lemme see how much A100 costs

crystal talon Aug 23, 2024, 5:00 PM

#

lapis sequoia just time each part ```py import time start=time.time() #code (reqquest) end ...

10.18548583984375 request
0.7679240703582764 parsing

seems like its the request - is there any way i can increase the speed?

iron basalt Aug 23, 2024, 5:00 PM

#

jaunty helm training an AI takes a lot of computations meaning that you need a pretty high t...

Deep learning specifically, other options can be trained on a CPU.

past bramble Aug 23, 2024, 5:00 PM

#

I should not have checked 💀

jaunty helm Aug 23, 2024, 5:01 PM

#

past bramble lemme see how much A100 costs

you don't want to know, and I don't think you can buy them anyways cause it's targeted at large companies
best you can get are consumer grade cards, so like the RTX 40 series

unreal condor Aug 23, 2024, 5:01 PM

#

past bramble ain't gpu "graphics" pu, what's it doing to train text models

That is not its sole purpose; a GPU can handle certain computations way faster than a CPU because it has many more specialized cores. Just like purplys said, those computations happen to be mat muls

past bramble Aug 23, 2024, 5:01 PM

#

didn't know gpus can be used for computations

unreal condor Aug 23, 2024, 5:01 PM

#

past bramble lemme see how much A100 costs

Kaggle offers free A100 30 hours a week, even dual T4 for the same period

lapis sequoia Aug 23, 2024, 5:01 PM

#

if it's multiple documents,
https://www.reddit.com/r/learnpython/comments/woh54x/how_can_i_speed_up_python_requests/ ?
seems the first possibility.

this sends your requests sequentially but does not wait for replies (iirc.)
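
A minimal sketch of overlapping the waiting time, assuming a thread pool from the standard library (which may differ from what the linked thread recommends). `fetch` is a hypothetical stand-in that sleeps instead of calling the network, and the URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    # Stand-in for something like requests.get(url).text; sleeps to imitate latency
    time.sleep(0.2)
    return f"response from {url}"


urls = [f"https://example.com/page/{i}" for i in range(5)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    # map() returns results in the same order as the input URLs
    pages = list(pool.map(fetch, urls))
elapsed = time.perf_counter() - start

print(f"fetched {len(pages)} pages in {elapsed:.2f}s")
```

Run one after another, the five simulated requests would take about a second; with the pool the waits overlap. Against a real server, keep the worker count modest so the site is not flooded.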

agile cobalt Aug 23, 2024, 5:01 PM

#

past bramble good to know free options exist, in the repo readme, they mentioned A100 CPU or...

oh, I misremembered and underestimated GPT-2

yeah you're not going to be able to train something GPT-2 level without a pretty large budget

#

training from scratch takes a lot of compute

past bramble Aug 23, 2024, 5:02 PM

#

agile cobalt oh, I misremembered and underestimated GPT-2 yeah you're not going to be able t...

just heard i can use 30 hours free a100 👍🏻

past bramble Aug 23, 2024, 5:03 PM

#

unreal condor Kaggle offers free A100 30 hours a week, even dual T4 for the same period

I should try it out

unreal condor Aug 23, 2024, 5:03 PM

#

Even google colab offers a free T4 with limited usage time

past bramble Aug 23, 2024, 5:03 PM

#

that's 60 hours a week

jaunty helm Aug 23, 2024, 5:03 PM

#

if you're willing to compromise, e.g. not train the entire thing from scratch, then the hardware reqs also drop significantly

agile cobalt Aug 23, 2024, 5:03 PM

#

unreal condor Kaggle offers free A100 30 hours a week, even dual T4 for the same period

it is P100, not A100

unreal condor Aug 23, 2024, 5:04 PM

#

agile cobalt it is P100, not A100

My bad then

jaunty helm Aug 23, 2024, 5:04 PM

#

here's an estimate from llama factory:
https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#hardware-requirement

unreal condor Aug 23, 2024, 5:04 PM

#

Wait, isnt P100 better than A100 ?

agile cobalt Aug 23, 2024, 5:04 PM

#

unreal condor Wait, isnt P100 better than A100 ?

no, not even close

past bramble Aug 23, 2024, 5:04 PM

#

maybe i should level down a bit.
either creating another type of model from scratch or start off from a checkpoint as Purplys said

unreal condor Aug 23, 2024, 5:05 PM

#

Bruh, my bad again

lapis sequoia Aug 23, 2024, 5:05 PM

#

verbal oar I mean when you provide as parameter kernel= in scikit learn for example

past bramble Aug 23, 2024, 5:07 PM

#

any suggestions for other types of models I could build from scratch? I have already made image recognition ones(on limited objects), I want to go further

jaunty helm Aug 23, 2024, 5:08 PM

#

past bramble maybe i should level down a bit. either creating another type of model from scr...

well as a reference, the only 'people' releasing models that are trained from scratch are basically all companies
example:

llama: meta
gemma: google
qwen: alibaba
nemo: mistral + nvidia

unreal condor Aug 23, 2024, 5:09 PM

#

past bramble maybe i should level down a bit. either creating another type of model from scr...

Yes, creating an LLM straight away is way too ambitious. I don't want to discourage you, but even the predecessor of the LLM - the PLM (pretrained language model) - took a very long time to research and develop, not to mention the amount of pretraining data (TBs of text) and computing units.

jaunty helm Aug 23, 2024, 5:09 PM

#

jaunty helm well as a reference, the only 'people' releasing models that are trained from sc...

there are way more finetunes, i.e. take an existing model and tune it on some other data

agile cobalt Aug 23, 2024, 5:10 PM

#

or just try some classical ML like Kaggle's Titanic with sklearn instead of neural networks

jaunty helm Aug 23, 2024, 5:11 PM

#

if you step outside of LLMs, most ML architectures aren't that compute expensive to get started

unreal condor Aug 23, 2024, 5:11 PM

#

Only big companies nowadays can afford to develop LLMs ngl

agile cobalt Aug 23, 2024, 5:11 PM

#

jaunty helm if you step outside of LLMs, most ML architectures aren't that compute expensive...

stares at stable diffusion yeahhh maybe "outside of generative ai" /s

past bramble Aug 23, 2024, 5:11 PM

#

jaunty helm there are way more finetunes, i.e. take an existing model and tune it on some ot...

I have used openai fine tuning, it doesn't give the feel of creating something, rather using a product

uneven jewel Aug 23, 2024, 5:12 PM

#

Guys Mtech CSE or Mtech AI and ML,which should I choose?

agile cobalt Aug 23, 2024, 5:12 PM

#

past bramble I have used openai fine tuning, it doesn't give the feel of creating something, ...

you can fine tune open source models like Gemma or Llama

past bramble Aug 23, 2024, 5:13 PM

#

agile cobalt or just try some classical ML like Kaggle's Titanic with sklearn instead of neur...

alright I'll get into kaggle and try those out

uneven jewel Aug 23, 2024, 5:13 PM

#

agile cobalt you can fine tune open source models like Gemma or Llama

what does that mean?

past bramble Aug 23, 2024, 5:14 PM

#

agile cobalt you can fine tune open source models like Gemma or Llama

is it different somehow? isn't it simply providing conversation examples to get responses as per your style? or is it free?

agile cobalt Aug 23, 2024, 5:14 PM

#

uneven jewel what does that mean?

Gemma and Llama are open source Large Language Models (if that still makes no sense to you, think of them as free versions of ChatGPT)

fine tuning is a process through which you adapt a model to perform better on some specific tasks using your own data

past bramble Aug 23, 2024, 5:14 PM

#

oh its free

jaunty helm Aug 23, 2024, 5:14 PM

#

past bramble I have used openai fine tuning, it doesn't give the feel of creating something, ...

tbh, LLMs don't really 'feel' that magical if you just look at the code or something
most of the time's spent waiting for training
like I think I remember seeing the entire llama3 training file is just 300 lines of python

uneven jewel Aug 23, 2024, 5:15 PM

#

agile cobalt Gemma and Llama are open source Large Language Models (if that still makes no s...

But I asked which specialization should I choose, either Mtech CSE or Mtech AIML, I'm studying AI and DS 3rd year

unreal condor Aug 23, 2024, 5:15 PM

#

Even fine-tuning LLMs requires a behemoth amount of computing power. I tried inference only with a 7B params LLM from huggingface on a google colab T4 and still couldn't do it due to limited GPU RAM

agile cobalt Aug 23, 2024, 5:15 PM

#

past bramble is it different somehow? isn't it simply providing conversation examples to get ...

I assumed that the part you were discontent with was just using an API versus actually running the training loop, never mind

agile cobalt Aug 23, 2024, 5:15 PM

#

uneven jewel But I asked which specalization should I choose,either Mtech CSE or Mtech AIML,I...

yeah I was not replying to you in that first message 🙃

uneven jewel Aug 23, 2024, 5:15 PM

#

agile cobalt yeah I was not replying to you in that first message 🙃

ahh mane

jaunty helm Aug 23, 2024, 5:16 PM

#

unreal condor Even fine-tuning LLMs require a behemoth amount of computing power. I tried **in...

you're probably doing inference on the full-precision (unquantized) model, which isn't really needed imo

uneven jewel Aug 23, 2024, 5:16 PM

#

uneven jewel Guys Mtech CSE or Mtech AI and ML,which should I choose?

somebody Welp

past bramble Aug 23, 2024, 5:16 PM

#

agile cobalt I assumed that the part you were discontent with was just using an API versus ac...

ah wait we don't use api here?

unreal condor Aug 23, 2024, 5:16 PM

#

jaunty helm you're probably doing inference on the fully weighted model, which isn't really ...

Wdym

jaunty helm Aug 23, 2024, 5:16 PM

#

unreal condor Wdym

I can run a 8b model with 16k context really comfortably on a 4060

#

with quantization
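
A back-of-the-envelope sketch of why quantization matters here. It counts memory for the weights only; the context cache and activations come on top, so real usage is higher. `weight_memory_gb` is a made-up helper name.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


for bits in (32, 16, 8, 4):
    print(f"8B params at {bits}-bit: {weight_memory_gb(8e9, bits):.1f} GB")

# A 7B model in 16-bit needs roughly 14 GB for weights alone,
# which leaves almost no headroom on a card with about 15 GB.
print(f"7B params at 16-bit: {weight_memory_gb(7e9, 16):.1f} GB")
```

At 4 bits an 8B model's weights take about 4 GB, which is consistent with it fitting on an 8 GB card.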

agile cobalt Aug 23, 2024, 5:16 PM

#

past bramble ah wait we don't use api here?

you could use an API, but the charm of open weights models is running them yourself

*well, you don't use an API in the sense of an HTTP REST API; you'll still be using some library's API

unreal condor Aug 23, 2024, 5:17 PM

#

jaunty helm I can run a 8b model with 16k context really comfortably on a *4060*

Dude, its a goddamn 4060

#

I used free cloud GPU lmao

jaunty helm Aug 23, 2024, 5:17 PM

#

unreal condor I used free cloud GPU lmao

you said you used a T4, which has twice the vram a 4060 has

unreal condor Aug 23, 2024, 5:18 PM

#

Around 15 gb

past bramble Aug 23, 2024, 5:18 PM

#

agile cobalt you _could_ use an API, but the charm of open weights models is running them you...

could you provide a reference on fine tuning llama with python?

jaunty helm Aug 23, 2024, 5:18 PM

#

unreal condor Around 15 gb

yeah, 4060 has 8gb

spring field Aug 23, 2024, 5:18 PM

#

there are 8GB and 16GB 4060s

agile cobalt Aug 23, 2024, 5:18 PM

#

past bramble could you provide a reference on fine tuning llama with python?

https://llama.meta.com/docs/how-to-guides/fine-tuning/

jaunty helm Aug 23, 2024, 5:19 PM

#

spring field there are 8GB and 16GB 4060s

isn't the 16gb only on 4060ti

past bramble Aug 23, 2024, 5:19 PM

#

that's a new color, we have purple names?

unreal condor Aug 23, 2024, 5:19 PM

#

jaunty helm you're probably doing inference on the fully weighted model, which isn't really ...

So what do you recommend how to do it tho

spring field Aug 23, 2024, 5:19 PM

#

we don't, no
I do kekw

past bramble Aug 23, 2024, 5:19 PM

#

agile cobalt https://llama.meta.com/docs/how-to-guides/fine-tuning/

thanks! wonder why I didn't get that in results directly

shut shoal Aug 23, 2024, 5:20 PM

#

#Create the question answer pairs using groq api
def groq_qa_pairs(text):
    #Create the client
    groq_chat = ChatGroq(
        #Keep the temperature low to maintain more precise question and answer
        temperature = 0.3,
        #Retrieve the key
        groq_api_key = os.environ['pipeline'],
        #Get the model type
        model_name= "llama-3.1-8b-instant"
    )
    #Give the prompt

    messages = [
        ("system", "You are an expert in legal analysis."),
        ("user", "As an expert in legal analysis, your task is to read the following legal text and generate a corresponding question that reflects the key legal issue, followed by a concise answer that summarizes the outcome of the case. \n\n Legal text: " + text + "\n\n Q: What was the key legal issue addressed in this case?\n A: Please provide a summary of the court's decision.")
    ]
    
    # Generate response
    response = groq_chat.generate(messages)
    return response
pdf1_res = groq_qa_pairs(pdf1)

I keep getting "TypeError: Got unknown type system" and whatever I do to replace system it always returns some sort of an error.
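
A likely explanation, offered as a sketch rather than a confirmed diagnosis: LangChain chat models expect `generate()` to receive a list of message lists, so a plain string or a flat list of tuples gets iterated one level too deep. That would match both errors: "Y" is the first character of the earlier string prompt, and "system" is the first element of the first tuple here. Calling `invoke()` with the (role, content) pairs avoids this. `build_messages` is a made-up helper, and the prompt wording is shortened for illustration.

```python
import os


def build_messages(text):
    """Build the chat prompt as (role, content) pairs."""
    return [
        ("system", "You are an expert in legal analysis."),
        (
            "human",
            "Read the following legal text, then write the key legal question "
            "and a concise answer summarizing the outcome.\n\n"
            "Legal text: " + text,
        ),
    ]


def groq_qa_pairs(text):
    # Imported here so build_messages can be tried without langchain_groq installed
    from langchain_groq import ChatGroq

    groq_chat = ChatGroq(
        temperature=0.3,
        groq_api_key=os.environ["pipeline"],  # env var name taken from the code above
        model_name="llama-3.1-8b-instant",
    )
    # invoke() accepts a single conversation: a string or a list of messages
    response = groq_chat.invoke(build_messages(text))
    return response.content
```

Usage would stay the same as before, `pdf1_res = groq_qa_pairs(pdf1)`, with `pdf1_res` now being the reply text.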

past bramble Aug 23, 2024, 5:20 PM

#

spring field we don't, no I do :kekw:

how ya get it I want py_strong

shut shoal Aug 23, 2024, 5:20 PM

#

!paste

jaunty helm Aug 23, 2024, 5:20 PM

#

unreal condor So what do you recommend how to do it tho

run a quantized model (tho idk how to do this specifically on colab)

past bramble Aug 23, 2024, 5:20 PM

#

stuck with damn no color for 5 years

shut shoal Aug 23, 2024, 5:20 PM

#

shut shoal ```py #Create the question answer pairs using groq api def groq_qa_pairs(text): ...

https://paste.pythondiscord.com/XLGQ

This is the full error

unreal condor Aug 23, 2024, 5:21 PM

#

What do you mean by "quantized" ?

agile cobalt Aug 23, 2024, 5:21 PM

#

past bramble how ya get it I want :py_strong:

#roles message

agile cobalt Aug 23, 2024, 5:22 PM

#

unreal condor What do you mean by "quantized" ?

using 8-bit or 4-bit precision (integer or low-bit float formats) instead of 16/32-bit floats

jaunty helm Aug 23, 2024, 5:23 PM

#

unreal condor What do you mean by "quantized" ?

so usually each weight in the model is like 16 or 32 bit floats
quantization is trading precision for resources, so e.g. an algorithm might cut the weights' precision down to 8 bits, 4 bits, or even lower
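
A toy sketch of that trade, assuming the simplest scheme: one shared scale per tensor and rounding to 8-bit integers. Real quantization libraries use more elaborate methods; the function names and the weights are made up for illustration.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.42, -1.27, 0.003, 0.9, -0.55]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(q)
print(restored)
print(f"largest rounding error: {max_error:.4f}")
```

Each weight now needs one byte plus a shared scale instead of two or four bytes, at the cost of a rounding error of at most half the scale.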

unreal condor Aug 23, 2024, 5:24 PM

#

Oh, there is a way to trim the params down like that ?

past bramble Aug 23, 2024, 5:27 PM

#

so em where's the model stored and where's the fine tuning part

jaunty helm Aug 23, 2024, 5:28 PM

#

unreal condor Oh, there is a way to trim the params down like that ?

yeah, you don't just truncate the weights to fit in range though (you can, but you can do better)
rn the quant techniques give more/less bits to some weights, so you end up with something like 4.2 bits on average for example

agile cobalt Aug 23, 2024, 5:31 PM

#

past bramble so em where's the model stored and where's the fine tuning part

that image is only running inference, no fine tuning / training

it is downloaded from Hugging Face when you call from_pretrained iirc
generally you'll want to avoid using HF libraries for fine tuning, they're really focused on inference/deployment

unreal condor Aug 23, 2024, 5:31 PM

#

past bramble so em where's the model stored and where's the fine tuning part

When you call from_pretrained(model_name), huggingface will download the model for you if it has not been downloaded yet. And you fine-tune the downloaded model by training it with ur own data.

past bramble Aug 23, 2024, 5:33 PM

#

oh that clears it, thanks!

unreal condor Aug 23, 2024, 5:35 PM

#

jaunty helm yeah, you don't just truncate the weights to fit in range though (you can, but y...

Man i wouldn't have to go find a bajillion ways to optimize such thing if i were not dirt poor. Doing research in CS is really not for everyone lmao

jaunty helm Aug 23, 2024, 5:36 PM

#

it's always le money

verbal oar Aug 23, 2024, 6:04 PM

#

ah right gaussian rbf

lapis sequoia Aug 23, 2024, 6:27 PM

#

anyone want to join in reading this https://en.wikipedia.org/wiki/Neural_network_Gaussian_process

lapis sequoia Aug 23, 2024, 8:27 PM

#

this one is quite neat as well https://en.wikipedia.org/wiki/Gaussian_process

abstract wasp Aug 23, 2024, 9:51 PM

#

Hi, by any chance can someone who has gotten a job as a data analyst/scientist look over my resume and give me feedback 🥹 ty

#career-advice message

shut shoal Aug 24, 2024, 2:15 AM

#

processed_dataset = pd.DataFrame()

def dataset_creation(qa_text):
    q = ""
    a = ""
    answerHit = False
    #Iterate through the data and make a dataset
    for word in qa_text.split():
        if answerHit:
            # If "Answer" word is detected, start appending to answer
            a += word + " "
        else:
            if word == 'Answer:':
                # Switch to answer mode when "Answer" is detected
                answerHit = True
            elif word == 'Question:':
                # Skip the "Question" word
                pass
            else:
                # Append words to question before detecting "Answer"
                q += word + " "


    processed_dataset = pd.DataFrame([{"Question": q.strip(), "Answer": a.strip()}])
    
dataset_creation(pdf1_result)
print(processed_dataset)

Why doesn't processed_dataset create a dataset? What am I doing wrong?

serene scaffold Aug 24, 2024, 3:06 AM

#

shut shoal ```py processed_dataset = pd.DataFrame() def dataset_creation(qa_text): q =...

hello, your dataset_creation function does not return anything.

you also should pretty much never create an empty dataframe.
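
A sketch of both points, assuming each input is a plain string of the form "Question: ... Answer: ...". The original function assigns to a local `processed_dataset`, so the module-level one is never updated; returning the result fixes that. `parse_qa` is a made-up helper, and the rows are collected first so the DataFrame is built once instead of starting from an empty one.

```python
def parse_qa(qa_text):
    """Split 'Question: ... Answer: ...' text into one question/answer record."""
    question_part, _, answer_part = qa_text.partition("Answer:")
    question = question_part.replace("Question:", "", 1)
    return {
        # split/join collapses runs of whitespace, like the word loop did
        "Question": " ".join(question.split()),
        "Answer": " ".join(answer_part.split()),
    }


def dataset_creation(qa_texts):
    # Imported here so parse_qa can be tried without pandas installed
    import pandas as pd

    # Build the DataFrame once from all rows and return it to the caller
    return pd.DataFrame([parse_qa(text) for text in qa_texts])


row = parse_qa("Question: Was the appeal allowed? Answer: Yes, the appeal was allowed.")
print(row)
```

The call site would then be `processed_dataset = dataset_creation([pdf1_result])`, assuming `pdf1_result` is the reply text rather than a response object.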

thorny rivet Aug 24, 2024, 5:54 AM

#

can you guys suggest me some great projects for final year

violet gull Aug 24, 2024, 5:55 AM

#

thorny rivet can you guys suggest me some great projects for final year

of HS or College?

thorny rivet Aug 24, 2024, 5:55 AM

#

violet gull of HS or College?

college

violet gull Aug 24, 2024, 5:55 AM

#

LLM without pytorch or tensorflow

thorny rivet Aug 24, 2024, 5:55 AM

#

will it be good for 200 marks

violet gull Aug 24, 2024, 5:56 AM

#

i havent seen a rubric so i have no idea

thorny rivet Aug 24, 2024, 5:56 AM

#

yhea understandable

#

i was going for predictive analytics

#

but llm from scratch is also kinda cool

#

thanks man

violet gull Aug 24, 2024, 5:58 AM

#

thorny rivet i was going for predictive analytics

3d "image"/model classification

jaunty helm Aug 24, 2024, 6:06 AM

#

thorny rivet but llm from scratch is also kinda cool

llm from scratch is basically impossible without compute & data inaccessible to individuals

serene grail Aug 24, 2024, 6:08 AM

#

Perhaps they meant a neural network from scratch which is actually possible

slate scroll Aug 24, 2024, 6:11 AM

#

A fun idea (that's a bit out of style) is retraining the last few layers of large models. That might be a neat project. The old version was something like:

For a large image understanding model (like AlexNet) retrain the last few layers to predict something like a breed of dog. While AlexNet can do this itself, it's not really very good at it and you can fine-tune it for a specific use-case.

I think this is a really interesting aspect of LLMs that could be explored. How can we start with a pre-trained LLM and retrain it for a specific task.

jaunty helm Aug 24, 2024, 6:12 AM

#

finetune for specific usecase
sounds vaguely similar to lora (the actual technique is different I'm sure)

serene grail Aug 24, 2024, 6:14 AM

#

slate scroll A fun idea (that's a bit out of style) is retraining the last few layers of larg...

That does sound fun, I need to look into this sort of thing

violet gull Aug 24, 2024, 6:17 AM

#

jaunty helm llm from scratch is basically impossible without compute & data inaccessible to ...

both are easily obtained

slate scroll Aug 24, 2024, 6:18 AM

#

jaunty helm > finetune for specific usecase sounds vaguely similar to lora (the actual techn...

A bit, LoRA is a combination of fine-tuning and optimization (often quantization, but I'm not super familiar if that's how LoRA is doing it). I think the fine-tuning is the easiest bit for something like this. It involves collecting data and training a model. Quantization or reduction then involves optimization that may be mathematically heavy for a project like this.

pine escarp Aug 24, 2024, 6:28 AM

#

thorny rivet will it be good for 200 marks

Go through kaggle, you'll get some ideas.

wooden sail Aug 24, 2024, 8:11 AM

#

violet gull both are easily obtained

no, especially the compute as you would have to pay and college students might not be able to, and for the data there aren't good public datasets. you'd have to scrape it yourself (people training LLMs this way is what has prompted platforms like reddit to lash back against scraping and require paid API usage)

unreal condor Aug 24, 2024, 8:17 AM

#

slate scroll A fun idea (that's a bit out of style) is retraining the last few layers of larg...

You can always fine-tune an LLM for specific tasks. That is the point. An LLM is just a PLM (pre-trained language model), but large (of course), which specifically needs to be fine-tuned for specific tasks. I think people keep mistakenly assuming that the only purpose of an LLM is question answering.

unreal condor Aug 24, 2024, 8:28 AM

#

thorny rivet can you guys suggest me some great projects for final year

I feel like deciding your own project is a very bad idea. If you are planning to work on a graduation thesis, find a mentor (an MSc or PhD graduate) and do some research with them; they will suggest and validate a topic that fits the academic style. You can't just decide this sort of thing on your own, and consulting strangers on the internet is a big no-no because most (if not all) unis/colleges won't approve random projects for a thesis. This is a tedious process and you should ideally start a year in advance.

But if you want to do projects for elective courses then this is fine i guess.

thorny rivet Aug 24, 2024, 8:31 AM

#

unreal condor I feel like deciding your own project is a very bad idea. If you are planning to...

you are right

thorny rivet Aug 24, 2024, 8:31 AM

#

pine escarp Go through kaggle, you'll get some ideas.

thanks man

lapis sequoia Aug 24, 2024, 9:32 AM

#

interesting paper, ai in pharma (current open problems.) https://onlinelibrary.wiley.com/doi/epdf/10.1002/adhm.202401312

fiery bane Aug 24, 2024, 12:23 PM

#

thorny rivet can you guys suggest me some great projects for final year

just pick one from this list https://paperswithcode.com/sota

Papers with Code - Browse the State-of-the-Art in Machine Learning

11393 leaderboards • 5057 tasks • 10460 datasets • 138886 papers with code.

fiery bane Aug 24, 2024, 12:24 PM

#

slate scroll A bit, LoRA is a combination of fine-tuning and optimization (often quantization...

often quantization, but I'm not super familiar if that's how LoRA is doing it).
as far as i know, no
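
For reference, a minimal NumPy sketch of the core LoRA idea (low-rank adaptation): the pre-trained weight `W` stays frozen and only two small matrices `A` and `B` are trained, so the effective weight is `W + (alpha / r) * B @ A`. Quantization is a separate step (it is what QLoRA adds on top). The sizes, `alpha`, `r`, and the helper name `lora_forward` are illustrative assumptions, not anything from the discussion above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): layer is d_in -> d_out, adapter rank is r.
d_out, d_in, r, alpha = 8, 16, 2, 4.0

W = rng.normal(size=(d_out, d_in))           # frozen pre-trained weight
A = rng.normal(scale=0.01, size=(r, d_in))   # trainable, small random init
B = np.zeros((d_out, r))                     # trainable, zero init: update starts at 0


def lora_forward(x, W, A, B, alpha, r):
    """Frozen linear layer plus the scaled low-rank update (hypothetical helper)."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)


x = rng.normal(size=(4, d_in))
y = lora_forward(x, W, A, B, alpha, r)

# Only A and B would be trained: r * (d_in + d_out) values instead of d_in * d_out.
full_params = W.size
lora_params = A.size + B.size
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the frozen one; training then moves only the small `A` and `B` matrices.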

lapis sequoia Aug 24, 2024, 1:08 PM

#

u guys agreeing to this?

serene grail Aug 24, 2024, 1:10 PM

#

Well, it depends on how you define "increasing aptitude", if you look at the last few years, they have become better, sure
Is it still increasing? If yes, how much and is it enough to worry about? I honestly don't know

The first part about creativity and critical thinking I agree with

lapis sequoia Aug 24, 2024, 1:13 PM

#

how many of you use LLMs to summarise text?

#

(it's related to your 1st Q)

#

i think they are apt tool for the task, 95% of the time, for 95% of the people

serene grail Aug 24, 2024, 1:15 PM

#

I don't because

I wouldn't trust it to be good enough (I'm quite pedantic)
I like reading
(well, I also don't have a use case for it, I don't have a job or anything that forces me to read many texts I don't like)

left tartan Aug 24, 2024, 1:15 PM

#

lapis sequoia how many of you use LLMs to summarise text?

I've done it a few times when reviewing a text outside my domain, and used it to find relevant search topics. I was recently looking at some particular financial math paper and was unfamiliar with the algorithms involved, and used GPT to summarize and find relevant search keywords (it came down to finding the right starting point for the topic)

#

I try to force myself to not rely on GPT to explain something, but it is useful as part of a search strategy

serene grail Aug 24, 2024, 1:18 PM

#

Using it for search can be useful yeah, I sometimes try googling something and don't find what I'm looking for (or anything close) within a few minutes because I don't know the topic/field at all
Asking an LLM gives you the keywords you can then google

lapis sequoia Aug 24, 2024, 1:18 PM

#

interesting, i try to read as many things as i can fit in my head, and chatgpt is pretty much a coworker

#

think of it like parallelising reading XD

lapis sequoia Aug 24, 2024, 1:20 PM

#

left tartan I've done it a few times when reviewing a text outside my domain, and used it to...

yeah outside of domain is neat, cause it presents a relevant 'glueing' of information

fiery bane Aug 24, 2024, 1:21 PM

#

lapis sequoia u guys agreeing to this?

I disagree. It has never been an asset for any scientist, regardless of time.

lapis sequoia Aug 24, 2024, 1:21 PM

#

fiery bane I disagree. It has never been an asset for any scientist, regardless of time.

could u expand?

left tartan Aug 24, 2024, 1:22 PM

#

lapis sequoia think of it like parallelising reading XD

The problem I have with it is the incompleteness: I need to see the forest, not just the trees. A summary of a passage by itself doesn't help me understand the big picture. (Even if it were accurate)

fiery bane Aug 24, 2024, 1:22 PM

#

lapis sequoia how many of you use LLMs to summarise text?

very rarely

#

I mostly use it to write

fiery bane Aug 24, 2024, 1:24 PM

#

lapis sequoia could u expand?

I cannot find a time/scenario where a scientist, young or old, has to memorize stuff. Except maybe in cases where their lab burned down / got bombed, etc.

lapis sequoia Aug 24, 2024, 1:26 PM

#

i mean those scenarios are pretty much everywhere

#

how do you think a lab scientist conducts themselves in a laboratory? it's 90% memory

left tartan Aug 24, 2024, 1:27 PM

#

fiery bane I disagree. It has never been an asset for any scientist, regardless of time.

Ah, I didn't get your point on first read. You disagree with the premise of the passage.

fiery bane Aug 24, 2024, 1:28 PM

#

lapis sequoia how do you think a lab scientist conducts themselves in a laboratory? it's 90% memory

IDK lol, I'm a computer scientist, I just SSH to my labs.
I imagine they write notes???

lapis sequoia Aug 24, 2024, 1:28 PM

#

i dont see an llm as different from automation a computer does, but certainly ppl are segregated in the opinion space

fiery bane Aug 24, 2024, 1:28 PM

#

And they can consult notes too?

fiery bane Aug 24, 2024, 1:29 PM

#

left tartan Ah, I didn't get your point on first read. You disagree with the premise of the ...

yea haahah. There are things i like and don't like about llm, but replacing scientist memory just sounds weird.

lapis sequoia Aug 24, 2024, 1:30 PM

#

you have to memorise a lot of stuff, especially during your training, that may be unnecessary, i think that's the point

#

like asking to know what the -p flag is for accessing through ssh

fiery bane Aug 24, 2024, 1:30 PM

#

lapis sequoia you have to memorise a lot of stuff, especially during your training, that may b...

by training, do you mean like, highschool and bachelor and course work?

lapis sequoia Aug 24, 2024, 1:30 PM

#

yes, anything that requires memorising should be reduced in place of tools

#

that's how i interpret it

#

would you be ok mastering a task that a robot can do?

fiery bane Aug 24, 2024, 1:31 PM

#

lapis sequoia like asking to know what the `-p` flag is for accessing through ssh

like, the docs are always there if you forgot, you can always do man in the shell.
and if you use it often enough, you will memorize it

lapis sequoia Aug 24, 2024, 1:31 PM

#

(better and faster than yourself.)

unreal condor Aug 24, 2024, 1:31 PM

#

lapis sequoia i dont see an llm as different from automation a computer does, but certainly pp...

Yes, LLMs are getting more and more effective, but certainly cannot replace real engineers at the moment. Those who tend to hate LLMs are probably senior devs who are too skeptical to try new things, and they're probably paranoid that AI will take their jobs one day

lapis sequoia Aug 24, 2024, 1:32 PM

#

i agree with that ^

fiery bane Aug 24, 2024, 1:33 PM

#

unreal condor Yes, LLMs are getting more and more effective, but certainly cannot replace real...

orrr the senior devs who have to fix things up when they break because the junior devs are misusing and abusing LLMs?

serene grail Aug 24, 2024, 1:34 PM

#

lapis sequoia i dont see an llm as different from automation a computer does, but certainly pp...

I think LLMs just haven't been around long enough to be used in the same way as some other automation tools IMO, with time people will find (and invent) more sophisticated ways to use them despite their limitations and tradeoffs vs other tools
There's some infrastructure missing around LLMs is what I think. RAG is a good example of an enhancement that could make LLMs much more useful for a specific purpose, and as people come up with more enhancements like that (and continuously improve them), things will get better

left tartan Aug 24, 2024, 1:34 PM

#

unreal condor Yes, LLMs are getting more and more effective, but certainly cannot replace real...

I believe LLMs encourage lazy learning, doing the minimum / seeing the tree rather than understanding / seeing the forest

lapis sequoia Aug 24, 2024, 1:34 PM

#

how many senior devs would you need knowing kernels in the future, i.e. these ones: https://en.wikipedia.org/wiki/Kernel_(image_processing)
sp how to hand-craft them.

Kernel (image processing)

In image processing, a kernel, convolution matrix, or mask is a small matrix used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by doing a convolution between the kernel and an image. Or more simply, when each pixel in the output image is a function of the nearby pixels (including itself) in the input image,...
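
As a concrete picture of what hand-crafting such a kernel involves, here is a minimal NumPy sketch of a naive 2D convolution with a small edge-detection kernel. The function name `convolve2d`, the "valid" (no padding) border handling, and the tiny test image are assumptions made for illustration; real code would normally call a library routine instead of looping.

```python
import numpy as np


def convolve2d(image, kernel):
    """Naive 'valid' 2D convolution of a grayscale image with a small kernel."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]             # convolution flips the kernel
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output pixel is a weighted sum of the nearby input pixels.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out


# Hand-crafted edge-detection kernel: weights sum to 0, so flat regions give 0.
edge_kernel = np.array([[0, -1, 0],
                        [-1, 4, -1],
                        [0, -1, 0]], dtype=float)

# A single bright pixel on a dark background.
image = np.zeros((5, 5))
image[2, 2] = 1.0
edges = convolve2d(image, edge_kernel)
```

Convolving a single bright pixel reproduces the kernel itself, which is a handy sanity check when writing one by hand.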

#

not just memorising, but only a few minutes should be spent in the concept / idea, then move on

lapis sequoia Aug 24, 2024, 1:35 PM

#

serene grail I think LLMs just haven't been around long enough to be used in the same way as ...

yeah that's great criticism imho

unreal condor Aug 24, 2024, 1:35 PM

#

fiery bane orrr the senior devs who have to fix things up when they break because the jun...

That is my point, i said they are not good enough to replace real engineers at the moment. Even SOTA AIs can only execute like 17% of the engineer's tasks. A pathetic number so don't misuse it

unreal condor Aug 24, 2024, 1:37 PM

#

left tartan I believe LLMs encourage lazy learning, doing the minimum / seeing the tree rath...

That has always been the purpose of new technologies. They make your life easier or encourage "laziness" as someone says.

left tartan Aug 24, 2024, 1:38 PM

#

unreal condor That has always been the purpose of new technologies. They make your life easie...

The difference here is: the goal in education is not the task, but the learning. You're supposed to pick up the bigger picture along the way. At best, an LLM might help you with your project, but at the cost of the journey.

unreal condor Aug 24, 2024, 1:40 PM

#

left tartan The difference here is: the goal in education is not the task, but the learning....

I see, I agree with that

#

So what you were mentioning is the process, not the goal

lapis sequoia Aug 24, 2024, 1:42 PM

#

left tartan The difference here is: the goal in education is not the task, but the learning....

this seems to agree with the paragraph though, as long as one accepts that education is mostly about 'regurgitating information'

left tartan Aug 24, 2024, 1:43 PM

#

lapis sequoia this seems to agree with the paragraph though, as long as one accepts that educa...

Huh? I and the passage are saying the opposite

lapis sequoia Aug 24, 2024, 1:44 PM

#

yeah but you are not the education system

left tartan Aug 24, 2024, 1:44 PM

#

Learning is not about memorization, but understanding

serene scaffold Aug 24, 2024, 1:45 PM

#

left tartan The difference here is: the goal in education is not the task, but the learning....

"An LLM can help you get there, but cost you the journey" I'm gonna put that on a plaque above my oven in your honor.

serene grail Aug 24, 2024, 1:46 PM

#

To remind you not to use an LLM while cooking?
Why above oven?

lapis sequoia Aug 24, 2024, 1:47 PM

#

or to not buy meals but preparing them

left tartan Aug 24, 2024, 1:47 PM

#

lapis sequoia yeah but you are not the education system

I agree this is a problem: incentives. If your incentive is to pass a course with a good grade, an LLM may help. But you'll be a weaker candidate than those who learned.

lapis sequoia Aug 24, 2024, 1:47 PM

#

however, i do use a chopper, haven't regretted it.

#

(and an oven, as opposed to making a fire outside.)

#

imho it's very important to look at what aspects have already been replaced by technology

left tartan Aug 24, 2024, 1:49 PM

#

Hey, you're welcome to aim for mediocrity. More jobs for the rest of us.

serene scaffold Aug 24, 2024, 1:49 PM

#

serene grail To remind you not to use an LLM while cooking? Why above oven?

BillyBobby gives me cooking advice

spring field Aug 24, 2024, 1:50 PM

#

BillyBobby is an LLM confirmed

left tartan Aug 24, 2024, 1:50 PM

#

That was perhaps a lot of snark. My point is: you're not going to land a DS or ML job without understanding the low levels

#

lol, now where's the #cooking

lapis sequoia Aug 24, 2024, 1:50 PM

#

yeah, i mean, i didn't clarify cause it was fun

#

but we are talking about the opposite aspects of the education system

#

im saying that one should remove the aspects that are repetitive, you are saying one should not remove the learning

#

those don't always overlap

spring field Aug 24, 2024, 1:51 PM

#

left tartan Learning is not about memorization, but understanding

nah, understanding is still low-level, one must apply, evaluate, analyse, create

#

(a la Bloom's revised taxonomy)

left tartan Aug 24, 2024, 1:53 PM

#

I'm actually curious how LLMs and education will converge; how we'll see LLMs incorporate into learning systems.

fickle shale Aug 24, 2024, 1:53 PM

#

Any beginner book for stats?

lapis sequoia Aug 24, 2024, 1:54 PM

#

possibly relevant

left tartan Aug 24, 2024, 1:54 PM

#

fickle shale Any beginner book for stats?

complete beginner? No stats course ever?

fickle shale Aug 24, 2024, 1:54 PM

#

left tartan complete beginner? No stats course ever?

kinda

left tartan Aug 24, 2024, 1:54 PM

#

Does OpenStax have one?

#

(Their calc books are good)

fickle shale Aug 24, 2024, 1:54 PM

#

left tartan Does OpenStax have one?

let me check

main fox Aug 24, 2024, 1:55 PM

#

fickle shale Any beginner book for stats?

This can depend on your goals and background. Statistics by David Freedman is a decent book if you're at high school level math, for example.

fickle shale Aug 24, 2024, 1:56 PM

#

main fox This can depend on your goals and background. Statistics by David Freedman is a ...

well i need it for postgraduate, i just forgot all stats like probability distribution and statistical inference

left tartan Aug 24, 2024, 1:56 PM

#

fickle shale well i need it for postgraduate, i just forgot all stats like probability distributio...

Oh, so you want a calc based stats college level course?

#

Maybe find a recent syllabus from your Uni and grab that text?

fickle shale Aug 24, 2024, 1:57 PM

#

left tartan Oh, so you want a calc based stats college level course?

well i have stats and queueing theory as a subject

main fox Aug 24, 2024, 1:58 PM

#

Try Statistical Inference by Casella and Berger

quaint rivet Aug 24, 2024, 2:38 PM

#

is there any good resource where i can learn more about attention with its math?

#

i want to implement it on unet

fiery bane Aug 24, 2024, 2:49 PM

#

quaint rivet is there any good resource where i can learn more about attention with its mat...

the original paper?

unreal condor Aug 24, 2024, 2:51 PM

#

quaint rivet is there any good resource where i can learn more about attention with its mat...

Attention mechanism has long been a concept in NLP, dating back to RNNs. I assume you want to learn about the most recent type of attention, multi-head attention, which is the most prominent factor contributing to the success of the Transformer. Try googling the Transformer concept in general and you will learn about multi-head attention eventually. I also believe StatQuest made an excellent video explaining the multi-head attention concept

serene scaffold Aug 24, 2024, 2:56 PM

#

unreal condor Attention mechanism has long been a concept in NLP, dating back to RNNs. I assume y...

is this comment AI-generated?

unreal condor Aug 24, 2024, 2:56 PM

#

serene scaffold is this comment AI-generated?

Bruh, are you that skeptical

quaint rivet Aug 24, 2024, 2:57 PM

#

fiery bane the original paper?

videos would be good😄

#

actually i want to learn mathematics behind attention

pine escarp Aug 24, 2024, 2:58 PM

#

serene scaffold is this comment AI-generated?

Hello, I'm ChatGPT.

#

jk

#

dont ban me

unreal condor Aug 24, 2024, 2:58 PM

#

quaint rivet actually i want to learn mathematics behind attention

Tbh, i was surprised that the math behind Multi-head attention is just a bunch of matrix multiplications
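
To make the "bunch of matrix multiplications" concrete, here is a minimal NumPy sketch of single-head scaled dot-product attention, `softmax(Q K^T / sqrt(d_k)) V`, as defined in the Transformer paper. The sizes and the random projection matrices `W_q`, `W_k`, `W_v` are illustrative assumptions; multi-head attention runs several of these in parallel on smaller projections, concatenates the results, and applies one more linear projection.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(Q, K, V):
    """Return the attended values and the attention weights for one head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of every query to every key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights


rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 8, 4          # illustrative sizes (assumptions)

X = rng.normal(size=(seq_len, d_model))  # one embedding per token
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

out, weights = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)
```

Row `i` of `weights` says how much token `i` draws from every other token; those weights come from content similarity, not from position alone.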

quaint rivet Aug 24, 2024, 2:59 PM

#

unreal condor Attetion mechansm has long been a concept in NLP, dating back to RNN. I assume y...

ohk

pine escarp Aug 24, 2024, 2:59 PM

#

unreal condor Tbh, i was surprised that the math behind Multi-head attention is just a bunch of...

what is attention

quaint rivet Aug 24, 2024, 2:59 PM

#

unreal condor Tbh, i was surprised that the math behind Multi-head attention is just a bunch of...

yeah i too was surprised

unreal condor Aug 24, 2024, 3:00 PM

#

pine escarp what is attention

Imagine this ok. You want to translate a piece of text. What are you gonna do. Read each sentence then translate them 1 by 1 or translate the whole text all at once.

quaint rivet Aug 24, 2024, 3:01 PM

#

i found that it's easier application on unet

pine escarp Aug 24, 2024, 3:02 PM

#

So its related to NLP?

#

i should do a project on NLP ngl.

unreal condor Aug 24, 2024, 3:03 PM

#

unreal condor Imagine this ok. You want to translate a piece of text. What are you gonna do. R...

It's usually the first scenario so you pay attention to each sentence then translate them rather than the whole text. It's a rough explanation but in NLP the math behind it works somewhat like that. Each token will be affected more by the tokens most relevant to it (often, but not always, nearby ones), and less by the rest

unreal condor Aug 24, 2024, 3:03 PM

#

pine escarp So its related to NLP?

Yes, it originated from NLP

pine escarp Aug 24, 2024, 3:04 PM

#

unreal condor It's usually the first scenario so you pay **attention** to each sentence then t...

ohh,i get itt

unreal condor Aug 24, 2024, 3:04 PM

#

Ngl lads, Dr Andrew Ng gives a better explanation than me, and it has been too long since i last checked it

quaint rivet Aug 24, 2024, 3:07 PM

#

unreal condor Tbh, i was surprised that the math behind Multi-head attention is just a bunch of...

actually my core concern is the attention gate. I'm trying to write an attention gate which will use remote sensing techniques on the input image. Attention U-Net has a simpler architecture.

#

That's why i'm trying to learn it from scratch, especially the mathematics part

unreal condor Aug 24, 2024, 3:09 PM

#

quaint rivet actually my core concern is the attention gate. I'm trying to write an attention gate w...

What you are saying seems to be related to Computer Vision, which is apparently not the field of my expertise, I don't know about attention mechanism in CV then, sorry.

quaint rivet Aug 24, 2024, 3:09 PM

#

unreal condor What you are saying seems to be related to **Computer Vision**, which is apparent...

Ok np. Actually it's part of deep learning.

#

just looking for resource of attention

pine escarp Aug 24, 2024, 3:10 PM

#

unreal condor What you are saying seems to be related to **Computer Vision**, which is apparent...

Have you done any projects using CV?

unreal condor Aug 24, 2024, 3:11 PM

#

pine escarp Have you done any projects using CV?

Yes, but just basic ones. I delve deeper into NLP

pine escarp Aug 24, 2024, 3:11 PM

#

unreal condor Yes, but just basic ones. I delve deeper into NLP

so you expert in nlp

#

wxs_emu_woah

unreal condor Aug 24, 2024, 3:12 PM

#

quaint rivet Ok np. Actually it's part of deep learning.

Every common ML model is DL nowadays

unreal condor Aug 24, 2024, 3:12 PM

#

pine escarp so you expert in nlp

I try to

quaint rivet Aug 24, 2024, 3:12 PM

#

unreal condor Every common ML model is DL nowadays

ok

fiery bane Aug 24, 2024, 4:26 PM

#

quaint rivet actually i want to learn mathematics behind attention

I mean, I find that if you want to learn the math, going to the paper is the best way lol.
If you want a video, maybe this? https://www.youtube.com/watch?v=-QH8fRhqFHM

YouTube

Jay Alammar

The Narrated Transformer Language Model

AI/ML has been witnessing a rapid acceleration in model improvement in the last few years. The majority of the state-of-the-art models in the field are based on the Transformer architecture. Examples include models like BERT (which when applied to Google Search, resulted in what Google calls "one of the biggest leaps forward in the history of Se...

pine escarp Aug 24, 2024, 4:58 PM

#

Guys.

#

I use the classic jupyter notebook for coding

#

I recently learnt about poetry package

#

how to install it so that i can use in jupyter notebook?

#

do you guys use package/envs managers?

unreal condor Aug 24, 2024, 5:09 PM

#

pine escarp do you guys use package/envs managers?

When you install python, u have pip as a default package manager. But install anaconda for virtual envs

serene scaffold Aug 24, 2024, 5:09 PM

#

(I strongly recommend not using anaconda)

pine escarp Aug 24, 2024, 5:10 PM

#

serene scaffold (I strongly recommend not using anaconda)

what do i use then

unreal condor Aug 24, 2024, 5:10 PM

#

serene scaffold (I strongly recommend not using anaconda)

Is there better alternative ?

serene scaffold Aug 24, 2024, 5:10 PM

#

pine escarp what do i use then

You can usually just use regular virtual environments. Poetry is intended to make environments easier to reproduce.

serene scaffold Aug 24, 2024, 5:10 PM

#

unreal condor Is there better alternative ?

Just regular virtual environments.

#

Anaconda was created when native virtual environments were less mature. The use case for anaconda is pretty much deprecated, but a lot of people just haven't moved on.

unreal condor Aug 24, 2024, 5:11 PM

#

serene scaffold Just regular virtual environments.

Wut ? How do you create virtual env without conda ? Am i missing something ?

pine escarp Aug 24, 2024, 5:12 PM

#

serene scaffold You can usually just use regular virtual environments. Poetry is intended to mak...

I dont get it

serene scaffold Aug 24, 2024, 5:12 PM

#

unreal condor Wut ? How do you create virtual env without conda ? Am i missing something ?

Python comes with the ability to make virtual environments with venv.
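
A minimal sketch of what that looks like using only the standard library's `venv` module (the same machinery behind `python -m venv <dir>` on the command line). The helper name `create_env` and the directory names are assumptions for illustration.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir (hypothetical helper name).

    Returns the path of pyvenv.cfg, the marker file written into every venv.
    """
    env_dir = Path(env_dir)
    # with_pip=True bootstraps pip inside the new environment via ensurepip.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir / "pyvenv.cfg"
```

Calling `create_env(".venv")` should give the same result as running `python -m venv .venv` in a shell. To use such an environment from the classic notebook, one common approach is to install `ipykernel` inside it and register it with `python -m ipykernel install --user --name <name>`, after which it shows up as a kernel choice.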

pine escarp Aug 24, 2024, 5:12 PM

#

serene scaffold Python comes with the ability to make virtual environments with venv.

Oh

#

so you want us to use venv instead of conda

serene scaffold Aug 24, 2024, 5:12 PM

#

Yes

pine escarp Aug 24, 2024, 5:13 PM

#

serene scaffold Yes

if we use venv, can we use poetry?

serene scaffold Aug 24, 2024, 5:13 PM

#

I work for an AI company, and anaconda is banned at my company. And we all get along just fine.

serene scaffold Aug 24, 2024, 5:13 PM

#

pine escarp if we use venv, can we use poetry?

Why do you think you need to use poetry

unreal condor Aug 24, 2024, 5:14 PM

#

I see

pine escarp Aug 24, 2024, 5:14 PM

#

serene scaffold Why do you think you need to use poetry

people told me that its a good package manager

#

i liked it when they told me about

serene scaffold Aug 24, 2024, 5:14 PM

#

Native virtual environments are the default assumption. Just use those unless there's a specific limitation of them that you need to overcome.

pine escarp Aug 24, 2024, 5:15 PM

#

serene scaffold Why do you think you need to use poetry

https://discord.com/channels/267624335836053506/1276933702928302090

unreal condor Aug 24, 2024, 5:16 PM

#

Tbh, just use whatever you are comfortable with. It's just package managers, not malware

serene scaffold Aug 24, 2024, 5:16 PM

#

(I treat anaconda as malware)

pine escarp Aug 24, 2024, 5:18 PM

#

serene scaffold (I treat anaconda as malware)

jupyter notebook comes with anaconda

unreal condor Aug 24, 2024, 5:18 PM

#

serene scaffold (I treat anaconda as malware)

Eh, someone on stack overflow treats pip as malware too, so my stance is always neutral until I've tried all the new tech out

pine escarp Aug 24, 2024, 5:18 PM

#

is jupyter notebook malware too

serene scaffold Aug 24, 2024, 5:19 PM

#

pine escarp jupyter notebook comes with anaconda

You have it backwards

pine escarp Aug 24, 2024, 5:19 PM

#

unreal condor Eh, someone on stack overflow treats pip as malware too, so my stance is always neu...

crazy

pine escarp Aug 24, 2024, 5:19 PM

#

serene scaffold You have it backwards

LMAO MY BAD

serene scaffold Aug 24, 2024, 5:19 PM

#

Notebooks are fine as long as you understand how they manage state
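
A rough way to picture notebook state: every cell runs in one shared namespace that persists until the kernel restarts, so results depend on the order in which cells were actually executed, not the order they appear on screen. The sketch below imitates that with plain `exec` and a dictionary; the three "cells" are made up for illustration and this is not how Jupyter is implemented internally.

```python
# One shared namespace, standing in for the kernel's memory.
namespace = {}

cell_1 = "data = [1, 2, 3]"
cell_2 = "data = [x * 10 for x in data]"
cell_3 = "total = sum(data)"

# Run the cells top to bottom, the way the notebook reads on screen.
for cell in (cell_1, cell_2, cell_3):
    exec(cell, namespace)
in_order_total = namespace["total"]

# Re-run cell 2 and cell 3 without re-running cell 1:
# `data` was already scaled once, so it gets scaled again.
exec(cell_2, namespace)
exec(cell_3, namespace)
rerun_total = namespace["total"]
```

The second total differs from the first even though no cell was edited, which is the usual source of "works on my notebook" surprises; restarting the kernel and running all cells in order clears it.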

pine escarp Aug 24, 2024, 5:20 PM

#

serene scaffold Notebooks are fine as long as you understand how they manage state

what is a state

#

also coming back to my doubt

#

lets say i use venv to manage my envs

#

but i still use anaconda to launch my notebook

#

is it fine or not?

unreal condor Aug 24, 2024, 5:22 PM

#

pine escarp is it fine or not?

Yes, i have used it for so long without any issue.

pine escarp Aug 24, 2024, 5:22 PM

#

unreal condor Yes, i have used it for so long without any issue.

i personally like classic jupyter notebook too

#

i dont prefer vs code

#

even though it has some good features

unreal condor Aug 24, 2024, 5:23 PM

#

Umm, u can install jupyter notebook in VS code

pine escarp Aug 24, 2024, 5:23 PM

#

unreal condor Umm, u can install jupyter notebook in VS code

like the classic jupyter notebook?

unreal condor Aug 24, 2024, 5:24 PM

#

pine escarp like the classic jupyter notebook?

Yes, with all the VS code benefit

#

Download the jupyter notebook extension then u can use it

pine escarp Aug 24, 2024, 5:24 PM

#

unreal condor Yes, with all the VS code benefit

will it look like this

#

@serene scaffold what notebook do you use

#

like vscode?

#

does pycharm support notebooks

#

also you use vscode notebooks?

wet canyon Aug 24, 2024, 5:28 PM

#

is this the right channel for a doubt i have regarding saving the final image after running K means clustering on it?

pine escarp Aug 24, 2024, 5:29 PM

#

wet canyon is this the right channel for a doubt i have regarding saving the final image af...

yes it is

unreal condor Aug 24, 2024, 5:30 PM

#

pine escarp also you use vscode notebooks?

Yes, save me the hassle with manually turning on the notebook with conda prompt

pine escarp Aug 24, 2024, 5:30 PM

#

but you can also ask in #1035199133436354600

pine escarp Aug 24, 2024, 5:30 PM

#

unreal condor Yes, save me the hassle with manually turning on the notebook with conda prompt

fair enough

wet canyon Aug 24, 2024, 5:30 PM

#

alright. sorry for the spam coming up then 😅

pine escarp Aug 24, 2024, 5:30 PM

#

unreal condor Yes, save me the hassle with manually turning on the notebook with conda prompt

how do you manage envs or install packages in vscode?

#

i have never used vscode

#

im familiar with pycharm though

spare forum Aug 24, 2024, 5:30 PM

#

pine escarp does pycharm support notebooks

Not community version

wet canyon Aug 24, 2024, 5:31 PM

#

I'm currently working on running K-means clustering on a thermal map image of a waterbody. While the clustering itself is working fine, I'm not able to save the clustered image correctly.

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import cv2

from sklearn.cluster import KMeans
img = mpl.image.imread("ipv2/resources/processed/thermal_image.png")

plt.imshow(img)
img.shape

X = img.reshape(-1, 3)
X.shape

kmeans = KMeans(n_clusters=1000)
kmeans.fit(X)

clustered = kmeans.cluster_centers_[kmeans.labels_]
clustered = clustered.reshape(img.shape)

plt.imshow(clustered)

clustered = np.clip(clustered / 255.0, 0, 1).astype(np.uint8)
# clustered = clustered/255
cv2.imwrite("clustered_1.png", cv2.cvtColor(clustered, cv2.COLOR_RGB2BGR))
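
One likely cause, assuming the PNG was loaded as floats: `mpl.image.imread` returns PNG data as float32 values in [0, 1], so the cluster centers are in [0, 1] as well. Dividing by 255 and then casting to `uint8` truncates every value to 0, which would save an all-black image. A minimal sketch of the conversion, with a small made-up array standing in for `clustered`:

```python
import numpy as np

# Stand-in for `clustered`: float RGB values in [0, 1], as imread gives for PNGs.
clustered = np.array([[[0.0, 0.2, 1.0],
                       [0.25, 0.75, 0.2]]], dtype=np.float32)

# What the posted code does: every value is at most 1/255, so the cast gives 0.
broken = np.clip(clustered / 255.0, 0, 1).astype(np.uint8)

# Scale up to the 0..255 range first, then cast.
fixed = (np.clip(clustered, 0, 1) * 255).round().astype(np.uint8)
```

With that change, the existing `cv2.imwrite("clustered_1.png", cv2.cvtColor(fixed, cv2.COLOR_RGB2BGR))` call should write the expected colors. If the source PNG has an alpha channel, the array would have 4 channels and the `reshape(-1, 3)` step would also need the alpha dropped first; that is an assumption about the file, not something shown in the post.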

pine escarp Aug 24, 2024, 5:31 PM

#

spare forum Not community version

so professional version supports notebooks?

spare forum Aug 24, 2024, 5:31 PM

#

Y