#data-science-and-ml
Okay, so I can't upload a pdf file
I will drop the link to access the paper
Please look up
Weather Forecasting using Incremental K-means Clustering - arXiv https://arxiv.org/pdf/1406.4756
Am I wrong here, or are low-resource language models just glorified token-compressed lookup tables with low temperature and low top-k next-word prediction?
what language models? If you mean llms then, what lookup table? What are the keys in the table youre imagining?
like if the language model is fine tuned with a small set of Q and A and we take the prompt as the input it would be basically the key to the table
and then we write our inference code to only have a few sets of allowed keywords
then it's just a chat based knowledge vault, essentially
current draft, bit more refined, still haven't decided on some details of the notation so it might not be consistent yet, gotta lookup what people usually use and use that
not gonna be making a lot of major advances cuz im also writing my resume rn
When dealing with the gini index for the purposes of deciding a split in a decision tree, you compute the gini index of samples on either side of the split, then take a weighted average of the indexes. However, a gini index is supposed to be the probability of a sample being misclassified - that is, the probability of random.choice(samples).class_ != random.choice(samples).class_ - the correct way to compute this for the two splits would be a different formula entirely. Why is the weighted average used?
@serene scaffold i think i did something cool on my own (for once)
i parsed this excel file into a pandas dataframe. i hate parsing stuff in excel to pandas dfs.
df2 = pd.read_excel("/Users/rahuldas/Desktop/Tortilla Dataset/statistic_id1345446_corn-tortillas-consumer-price-index-change-in-mexico-2021-2023.xlsx", sheet_name="Data", skiprows=[0,1,2,3,4], names= ["Months", "Percentage Amount"], usecols=[0,1])
print(df2.head())
how would I get the following to work in python? This is some mincer thing.
Can anyone say how should i start Tensorflow?
Because you want to add the proportions to it
You wouldn't want the tree to make splits that split off 1 instance each time into a leaf. You'd much prefer splits that can split off a large amount of instances
the correct way to compute this for the two splits would be a different formula entirely
turns out I am just straight up wrong, this is in fact the correct formula.
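For reference, a minimal sketch of the weighted-average split score discussed above. The helper names (`gini`, `split_gini`) and the toy labels are made up for illustration. The weights are each side's share of the samples, which is why a split that peels off a single instance barely improves the score.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: chance that two labels drawn with replacement differ."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Impurity of a split: per-side Gini weighted by each side's share of samples."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


parent = ["a"] * 5 + ["b"] * 5
balanced = split_gini(["a"] * 5, ["b"] * 5)           # clean 5/5 split
peel_one = split_gini(["a"], ["a"] * 4 + ["b"] * 5)   # splits off one instance
print(gini(parent), balanced, peel_one)
```

The weighted average is itself a misclassification probability: pick a random sample, then compare it with a second sample drawn from the same side of the split.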
I noticed I have an off by one error (the area between blue and red has a disjointed connection where it meets the south pole area) 😦 😦 😦
Thats... going to be annoying to debug
I suspect it's related to pyproj
or bigger
I think the south pole is actually correct, and it's the prime meridian which has the off-by-1 (as well as the north pole). Since the black line at the north pole I think should be 1 pixel to the left, which looks more symmetric with the south pole
Though... wait. if black is 0, and red is 1...
Doesn't this mean my longitude is increasing clockwise around the globe? Doesn't it go the other way?
😦 Oh no, there are more bugs than I thought
Made this AI that runs a LLM locally through Python, gave it some speech recognition for commands, still very work in progress
context, its replying with "xdd" and "short and bad" answers as I have for the sake of Debugging and testing made its behaviour like that (Im running the LLM off a Bad CPU Locally so Wanted to keep the Response time low ish) and I know there is some bugs with the Text Settings still (fixing that rn)
Overall, I'm very happy with it so far, In the future would probs upgrade back to Microsoft Azure Speech Recognition and Speech Synthesisation plus probably buy a GeForce Graphics card with CUDA for faster Responses (and using bigger models)
cool
I am trying to create an AI to forecast household electricity consumption appliance-wise for a month. I have asked all the chatbots to write code to create a model using a suitable algorithm, but I've still been facing problems as I don't have strong knowledge of this. Are there any free resources for learning this, particularly for my project, or any existing research papers to learn from?
You can try and look on the website Kaggle, it hosts competitions and projects, but often it also includes guides with provided code. High chance there will be a similar project on Kaggle already meaning you can look at other people's code or even look up some youtube video
Quick question here, I'm doing a project and I'm currently in the pre-processing stage of my data. After assessing correlation I noticed that there are most likely non-linear relationships between features. Anyone know some techniques to uncover these non-linear relationships so that I can perform feature selection?
I’ve tried all those but still hard to find the dataset and the right algorithm for this specific project goal.
I later created a dataset using chatgpt and applied all the suitable algorithms but the accuracy is low. Every attempt I’ve ever made was through chatgpt provided code.
there are metrics other than (pearson) correlation you can try, like kendall's, spearman's, mutual information, etc
in fact if you look at the docs of pd.corr, you can see that you can specify a method=... to use the aforementioned kendall/spearman
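A small sketch of the `method=` switch mentioned above, using `DataFrame.corr`. The data is synthetic and the column names are made up. The relationship is monotonic but far from linear, so the rank-based measure picks it up more strongly than Pearson does.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=200)
df = pd.DataFrame({"x": x, "y": np.exp(x)})   # monotonic but strongly non-linear

pearson = df.corr(method="pearson").loc["x", "y"]
spearman = df.corr(method="spearman").loc["x", "y"]
print(f"pearson={pearson:.3f} spearman={spearman:.3f}")
```

Rank correlations still only detect monotonic relationships, so something like a U-shaped dependence would need a different tool (mutual information, for instance).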
Sure, ill give it a try. Didnt notice the different methods. The thing is also that its time-series data so that might also play a role
I checked it, there is not much difference between corr measures, all still approximately the same
I've also seen people recommend not doing manual feature selection and leaving it to regularization
I see, but isnt that highly dependent on what models you choose
I guess you can still check mutual information
there's 2 versions, this one's for regression and the one above classification
Examples using sklearn.feature_selection.mutual_info_regression: Comparison of F-test and mutual information
Examples using sklearn.feature_selection.mutual_info_classif: Selecting dimensionality reduction with Pipeline and GridSearchCV
Ill try the classif, since my target variable is either 1 or -1. See if it holds any new info.
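A hedged sketch of what `mutual_info_classif` can show that correlation misses, on synthetic data with a 1 / -1 target like the one described. The feature names and the threshold rule are invented for the example.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 1000
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = np.where(np.abs(x_signal) > 1.0, 1, -1)   # depends on x_signal, but not linearly

X = np.column_stack([x_signal, x_noise])
pearson = np.corrcoef(x_signal, y)[0, 1]      # close to zero by symmetry
mi = mutual_info_classif(X, y, random_state=0)
print(f"pearson={pearson:.3f} mi_signal={mi[0]:.3f} mi_noise={mi[1]:.3f}")
```

The scores are estimates in nats, so small values across many features are better read as a ranking than as absolute importances.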
there's also using metrics/models to select for you instead of doing it manually
e.g. SelectKBest
Examples using sklearn.feature_selection.SelectKBest: Release Highlights for scikit-learn 1.1 Pipeline ANOVA SVM Univariate Feature Selection Concatenating multiple feature extraction methods Selec...
some guide on the sklearn site
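A minimal sketch of letting `SelectKBest` pick features by mutual information instead of reading the scores by hand. The dataset is synthetic (6 columns, only columns 0 and 3 influence the target) and `k=2` is an arbitrary choice for the example.

```python
from functools import partial

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 6))
y = np.where(X[:, 0] + X[:, 3] ** 2 > 1.0, 1, -1)   # only columns 0 and 3 matter

# fix random_state so the mutual information estimates are reproducible
score_func = partial(mutual_info_classif, random_state=0)
selector = SelectKBest(score_func=score_func, k=2)
X_selected = selector.fit_transform(X, y)
kept = selector.get_support(indices=True)
print(kept, X_selected.shape)
```

With around 140 variables, `selector.scores_` can also be sorted to inspect the whole ranking before settling on a value for `k`.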
The mutual info, classif, yields more features that have some importance, although the importance values are all below 0.02. Note: i have about 140 variables, so i figured the importance would be spread out but i hoped for a few very important features. On to the selectKBest!
Hey, I want to start learning about AI development. I have ~3 years experience in programming, and have learned it by myself. How should I get started with Data science and AI? Can somebody guide me to the recommended resource for absolute beginners in this field?
!resources data science
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
Hopefully I'll be able to do summarization with decoder-only blocks, not sure if I'll have time to code cross attention
Try to setup openAI client on your computer and then go through some of their tutorials
yo, I have some regression model, that outputs 36 output dimensions, each of them being a human body length
I want to compare how well the model did on each body length, so I just calculated the MAE of it and the truth, but the longer body lengths tend to have higher errors (which isn't surprising)
So I wanted to ask: How should I normalize it? Should I use RAE or just divide by the truth mean?
Any AI/Ds dev here?
Need some career advice
How can I become a good AI dev and how can I start
I want to strengthen my fundamentals first, so how can I start with AI
and work on the fundamentals
Thank you, your advice is appreciated!
Hi, has anyone worked with Flowise AI?
has anybody worked with the pytrends library?
@brave cobalt @deep bough just ask your questions, dont ask to ask
I am facing issue with connecting custom tools with webhooks
depends on how deep you wanna go, if you wanna do research you'll need the math, if you wanna do high level gluing of AI components you can do it with good software knowledge and surface level understanding of how AI works
I personally feel like knowing at least a bit of multivariate calculus is necessary to understand concepts like gradient descent, or, just what are gradients. but multivariate calculus is not much harder than normal calculus especially if you dont get into the advanced stuff
Here is my flowise ai work flow
->
And here is my custom tool query
interesting, which part is not working ? do you have an error somewhere ?
aah javascript, what does the error printout ?
I both miss it and hate it, how is that possible
custom tool is not activating during the chat
right but what does console.error print out ?
I'm trying to make an appointment chatbot. While chatting it should ask the user for their name, and that name will be pulled with the help of a custom tool (given a name property and a JS query, which is right), so it should pull the name and post it to webhooks
🙂🙂
no error
while chatting I am giving name but its not activating the custom tool
maybe z-score normalization ? like, look for the tabulated average and standard deviation of all humans (they'd be estimates ofc)
and not even connecting to webhooks
ah okay so it's chat gpt that is not picking up on your prompt to create a request in the first place ?
I guess yes
ask if he knows about it
if he doesn't then maybe it's not made available to it in the first place
but ig this is the challenge of using and working with LLMs, they're very unpredictable
maybe reduce the temperature to 0 or wtv parameter controls the output sampling
it's having a normal conversation as it should, so it means the OpenAI tool is working
yeah but don't mean it knows about the endpoints right
Awesome
I've rewritten the gradients, now I just have to code them into the cuda code
the first one is looking kinda suss tho
cuz i did a whole thing just to get to this to avoid computing extra stuff
and one does not look like the derivative of the other
I think I gotta lower the indices on the deltas of the first equation first line, because there's an implied sum with Mkk'
yeah that was the case, dont know why im operating on the original expression anyway
im keeping it like this, but again only way im gonna know this is right is with a unit test on a fully coded layer
Hello, I am creating a 1-layer neural network using numpy that is trying to learn the AND gate and something is wrong. I am doing the forward pass and adjusting the weights but the outputs are wrong
I am doing it for my school project and if someone can help me with it I would be really grateful
your sigmoid derivative is wrong
it should be sigmoid(z) * (1 - sigmoid(z))
I don't think so; they pass y to it after all.
(more like, the argument of sigmoid_der should be called y and not z)
def forward returning a sigmoid already so i just pass it into sigmoid derivative so i dont calculate sigmoid again in it
oh, I didn't see that
ok i missed adding a biases in forward
i think your backprop is wrong - dW should involve np.dot(X, error), not np.dot(self.weights, error). (unless I can't take derivatives this early in the morning.)
shouldn't the derivative of input.dot(weights) be transposed
it started to work after i add biases and change learning rate
Your error is a vector, shouldn't it be a scalar ?
the variable name is slightly misleading but that part of the formula is correct I think
ok ur right i think, i checked the school materials i got from my teacher, changed it, and now it's working properly
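Pulling the points from this thread together, here is a sketch of a single-layer network learning AND. It is not the original poster's code (that was never pasted); the variable names, seed, learning rate and epoch count are all assumptions. It includes a bias, writes the sigmoid derivative in terms of the output `y`, and builds the weight update from the inputs.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_der(y):
    """Derivative of the sigmoid written in terms of its output y = sigmoid(z)."""
    return y * (1.0 - y)


X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets = np.array([[0], [0], [0], [1]], dtype=float)   # AND truth table

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.5, size=(2, 1))
bias = np.zeros((1, 1))
learning_rate = 1.0

for _ in range(10_000):
    y = sigmoid(X @ weights + bias)           # forward pass, shape (4, 1)
    delta = (targets - y) * sigmoid_der(y)    # error scaled by the sigmoid slope
    weights += learning_rate * X.T @ delta    # dW is built from the inputs, shape (2, 1)
    bias += learning_rate * delta.sum(axis=0, keepdims=True)

predictions = sigmoid(X @ weights + bias).round().ravel()
print(predictions)
```

Printing the shapes of `X`, `delta` and `weights` once is a cheap way to settle the "which thing gets transposed" question for a given layout.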
Ok I think I'm actually more confused by the fact that idk the dimensions of anything, but I think I see what's going on, the np.dot will do a matrix mul
I am just beginning AI and this scares me
im tryna scare someone into hiring me
its on my resume
but like, you shouldnt worry this is super specialized stuff
Is this what I have to go through in my college if I am pursuing AI/ML? 😭
i dont think tensor calculus is part of it
maybe not a lot of tensor calculus but I'd be surprised to not see linear algebra and a bit of multivariate calculus in an ML track
I wish ML papers used it tho, a lot of stuff goes under-specified with normal matrix notation
calculus
linear algebra
vectors
Matrices
Statistics
hi, im kind of new to machine learning, im not sure if this is the right chat to ask such questions either. i cant seem to get results that look right to me. i am trying to predict peat collection quantity based on weather statistics daily. ive tried multiple scikit-learn models, parameter tuning, using a standardscaler, but nothing really works out the way i think it should. can you recommend any models or just give any tips for this situation?
heres an example of what the full dataset looks like. "total_qty" is the peat collected that day
Using matplotlib how can I plot such figure where I can fit two plots in same axes frame i.e one above and other one below
I want to learn to get a job as a fresher and want strong fundamental
Read the matplotlib or seaborn documentation, I forget how to create that
are you looking for twinx/twiny? https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.twinx.html
Kind of but the y axis reference (0) is shifted for both plots that they shouldn't coincide
I will look into it more, thank you
configure the ylims and ticks i guess? unless other matplotlib gurus have other ideas that is.
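One possible alternative to `twinx`, if the goal is one plot above and one below with separate zero references: two subplots that share the x-axis with no vertical gap between them. This is a sketch under that reading of the question. The function name and the sine/cosine data are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt


def stacked_plot(x, upper, lower):
    """Draw two series one above the other inside a single shared frame."""
    fig, (ax_top, ax_bottom) = plt.subplots(
        2, 1, sharex=True, gridspec_kw={"hspace": 0}
    )
    ax_top.plot(x, upper, color="tab:blue")
    ax_bottom.plot(x, lower, color="tab:red")
    ax_top.axhline(0, color="gray", linewidth=0.5)      # each panel keeps its own zero line
    ax_bottom.axhline(0, color="gray", linewidth=0.5)
    return fig


x = np.linspace(0, 10, 200)
fig = stacked_plot(x, np.sin(x), np.cos(x))
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. If it really has to be a single Axes, `twinx` plus manually chosen `set_ylim` ranges on both axes is the other route.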
Hii how can I integrate Flowise ai chatbot with whatsapp
- where I want to store user input in excel too
ah ok, in this case this would be mathematically the same as dividing by standard deviation (when calculating MAE loss), thanks!
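A quick numerical check of that equivalence, plus the divide-by-mean option from the original question. Everything here is synthetic: three made-up length scales stand in for the 36 outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
scales = np.array([10.0, 50.0, 170.0])                  # hypothetical typical lengths (cm)
truth = scales + rng.normal(size=(200, 3)) * scales * 0.05
pred = truth + rng.normal(size=(200, 3)) * scales * 0.01

mae = np.abs(pred - truth).mean(axis=0)                 # raw MAE grows with the length
mae_over_mean = mae / truth.mean(axis=0)                # option 1: divide by the truth mean
mae_over_std = mae / truth.std(axis=0)                  # option 2: divide by the truth std

# z-scoring both arrays with the truth statistics gives the same numbers as option 2
mu, sigma = truth.mean(axis=0), truth.std(axis=0)
mae_zscored = np.abs((pred - mu) / sigma - (truth - mu) / sigma).mean(axis=0)
print(mae, mae_over_mean, mae_over_std)
```

The mean cancels in the subtraction, which is why z-scoring reduces to dividing the MAE by the standard deviation.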
depends the kind of AI job you want
you can work with AI without needing all the math tools, I mean, for example developing a new deep learning architecture layout to solve a problem.
but it's still always highly recommended to have good statistics skills
but if you want to do research and create new types of algorithms, like optimization, implementing libraries from scratch, or something related, you will need a good math background
Doing it right on the input is better tho, it reduces the risk of overflow during inference and all other sorts of floating point related trouble
which one
both
what part of the first one confuses you
what is j,n,p
p is the percentile, j is the index of the datapoint and n is the total number of datapoints
the author is arguing that the definition is ambiguous because there are many such P's
you gotta give a starting point by trying to understand, throw an hypothesis, draw something, see what's the earliest thing you understand on the text and go from there
how do i understand this formula..
i mean i know the basic stats..like Mean deviation, standard deviation, median absolute deviation etc.
i get the interquartile range but don't know why we measure percentile
and can't understand the formula...
just like mean, std and etc, its just another way to characterize the data without having to look at the entire thing
it's usually best if you do that yourself, otherwise you'll always be dependant on someone else to learn a new bit of information
I can help you unlock yourself if you get stuck in a specific place, but otherwise you should be able to study
ok fine
In statistics, a k-th percentile, also known as percentile score or centile, is a score below which a given percentage k of scores in its frequency distribution falls ("exclusive" definition) or a score at or below which a given percentage falls ("inclusive" definition).
Percentiles are expressed in the same unit of measurement as the input sco...
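To make `j`, `n` and `p` concrete, here is a sketch of one common definition, the nearest-rank percentile. This is an assumption: it is not necessarily the formula from the text being discussed, which is exactly the ambiguity mentioned above.

```python
import math


def percentile_nearest_rank(values, p):
    """Smallest value such that at least p percent of the data is at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    n = len(ordered)                 # total number of datapoints
    j = math.ceil(p * n / 100)       # 1-based index of the datapoint
    return ordered[j - 1]


scores = [15, 20, 35, 40, 50]
print([percentile_nearest_rank(scores, p) for p in (30, 40, 50, 100)])
```

Other definitions interpolate between neighbouring datapoints, so they can return values that are not in the data at all.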
this project has a nice mlops pipeline from dev to prod, it has data pipelines using prefect, deployments, uses py, rust, cpp and CUDA and has fancy math
so for my next trick, im gonna try to get it published somewhere
this is too much detail
try searching about it on google
but also, there's this
khan academy is really good for getting your basics
Hi everybody, Do you think it is worth passing the tensorflow certification now, that they are gonna end it ? and in more general, do you think it's a nice certification to get, or is there a better one ?
U just need linear algebra and statistical learning theory which is still just linear algebra. Tensor calculus is not relevant. Tensors, as you will learn in linear, are just bilinear mappings and are used because in topics like RL ur mapping matrices of different rank to try to populate Q matrices and such. Either way an undergrad degree will only teach u how to use it, not to research it. Topics like that are learned later
what plotting libraries do people use for jupyter?
you actually also need calculus
not tensor tho
There's also Plotly and Shiny for dashboarding.
Gradients have strict linear algebra relationships and a derivative is nothing more than a Picard iteration not a derivative in computer terms so you don't. It's implied you learn it prior to linear but it's not necessary to utilize a packages which is what most "a.i" people do anyways
uhm
you gotta know what a derivative is
As a matter of fact, SVMs in c++ don't use gradient descent at all bc it's a slow shitty method that is weak
Ur computer doesn't take a derivative
it also doesn't see geometry
When u do gradient descent, it's performing a Picard iteration
yet, it renders it
Ahhh it can take in geometrical images if u do an emplacement into R2
it actually understands a very small set of instructions all things considering
Which is also a linear algebra relationship
it understands +, -, if
Because there is a topological mapping between any hashmap and R2
I.e. dict{key, value} -> R^2
you need to know what a gradient is in order to understand the concept of gradient descent, or am I missing something
how do you understand back propagation without knowing about the chain rule
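For anyone following along, a tiny sketch of what the chain rule does in backpropagation, for a single sigmoid neuron with squared error. The numbers are arbitrary, and the finite-difference check is only there to confirm the hand-derived gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, x, target):
    """Squared error of a single sigmoid neuron: L = (sigmoid(w * x) - target)^2."""
    return (sigmoid(w * x) - target) ** 2


def loss_grad(w, x, target):
    """dL/dw via the chain rule: dL/dy * dy/dz * dz/dw."""
    y = sigmoid(w * x)
    dL_dy = 2.0 * (y - target)
    dy_dz = y * (1.0 - y)
    dz_dw = x
    return dL_dy * dy_dz * dz_dw


w, x, target, eps = 0.3, 1.5, 1.0, 1e-6
analytic = loss_grad(w, x, target)
numeric = (loss(w + eps, x, target) - loss(w - eps, x, target)) / (2 * eps)
print(analytic, numeric)
```

Autodiff libraries do this bookkeeping automatically, layer by layer, which is the part a package hides.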
There are many algorithms that don't use gradient descent that are much faster.
I found a maths series on youtube by free code camp. It teaches maths along with how to do that with Python, using libraries like sympy etc.
Should I learn from it or should I just learn normal maths and later learn how to implement it in Python?
You should buy the book or go online and pirate linear algebra done right
I don't think you understand your claim when you say calculus is not needed
And treat it like ur bible
U don't need calculus to do machine learning. You can use it to do some - but no u don't need it
No need to learn "Python bindings" for it rn??
but you need linear algebra ?
they go hand in hand imo, you need both
i tried training my model using conjugate gradient from scipy and turns out it tweaks every single parameter one by one and sets it back and tweaks the next one
what else is there
Any book you'd recommend?
Within gradient descent there are a shitton
I don't agree with your assessment
Conjugate gradient descent is just one. Stochastic gradient descent, vanilla gradient descent, primal dual conjugate gradient descent dual gradient descent
I think calculus is a fundamental subject to study
Oh don't get me wrong it is, but it's too low level too understand what's going on
I understand that if you just wanna do high level gluing of AI components you dont need to understand the fundamentals
Exactly my point
but that's why I always preface that it depends on what you wanna do, how deep you wanna go
Undergrads glue
but stochastic and vanilla descents are the normal differential based gradient descent
Yes u didn't c the end of what I said I said for undergrads (my assumption was he was an undergrad)
They are different there is only one vanilla there are subtleties in each u cannot ignore
Do I need to understand curves too?
alright
So that's my issue then, how can you say that linear algebra is needed but calculus is not
They're both first year subjects that always happen in the same semester
Cuz they're both fundamental
Because not all machine learning methods need calculus but all machine learning needs linear but u misinterpreted what I meant when I said what is needed for high lvl work in the space
Calculus is very fundamental and even helps with linear algebra
But there are numerous methods in machine learning that do not use calculus and are actually faster as a result
Show me an opt Algo that don't use calculus so I see what you mean
Any SVM that uses kernelization and abuses linear separability
And then u can utilize the same problem using nonlinear separability
And it can be implemented in c++ as well
So it's way faster than Python which is a snail language
I also do cutting edge research tho so applying these techniques is much more difficult than just abusing a package - and there is something to be said about the time it takes to create and build something not really making up for speed improvements
Recent algorithms for finding the SVM classifier include sub-gradient descent and coordinate descent. Both techniques have proven to offer significant advantages over the traditional approach when dealing with large, sparse datasets—sub-gradient methods are especially efficient when there are many training examples, and coordinate descent when the dimension of the feature space is high.
But I can't and don't use packages really anyway so I'm just used to doing things by hand the right way
Ahh yes the curse of dimensionality
Sounds like you'd be robbing yourself of a lot of tooling by not knowing calculus
Well we are talking abt something very different now
I'm just saying the only math an undergrad must know to know how to use the packages and write real machine learning software they just need a strong base in coding basic and intermediate stuff discrete structures the like and linear algebra to understand feature spaces
Now if they wanted to mow the underpinnings of anything else then ofc calculus is mandatory
Know*
The reason why I disagree is that I believe both subjects are important to have a high level understanding of a lot of the core concepts in ML
High lvl and that's where I would totally agree but we aren't talking high lvl
And calculus is not just derivative
If we're not talking high level, then I'd argue you'd need more math
There is a lot more meat and potatoes there as well and besides on a daily basis most people would never even touch partial derivatives anyway.
Just call a package and be on their way
You're touching one everytime you train a model
I argue, it's best to know you're doing so, that's all
Ofc it's always good to know what ur doing but as I said linear algebra done right is self contained so it will teach him the building blocks anyway
Ah that idk I don't know that book. Can't really judge
It's the canonical text I've taught from it many times and I stand by it always
It's always my rec because it's extensive, advanced, and self contained, as well as perfect explanations from a real mathematician, not some of these apezoid linear books that are a joke
I don't recall which book I used for linear algebra, but I learned my calculus from spivak
Long time ago, still remember it lol
Yeah Stewart has a good one too but I learned calculus in high school so I don't remember the specifics the book I used
But early transcendentals by Stewart or whatever it's called is good
Mostly bc of the good parts included in it for multi which if u want to understand gradient descent at that basic lvl is crucial
Then obviously Pfaffenberger or Rudin to understand what ur even doing in calculus but that's just the mathematician in me
Are you also a PhD ?
Understandable I c why people don't like it
I'm a PhD in applied math at UIUC which may seem weird that I'd say don't understand all the math but I think undergrads mostly just want a job
And don't need to actually know anything
Just let people like me write packages for them
Yeah ig it can be easy to forget that these are hard subjects if you're young and learning stuff for the first time. To me they're like stuff I've learned on my first semester, and not even close to the level of mental punishment I had to go through afterwards
Exactly u also seem like u liked it a lot and were willing to pursue it at a high lvl
Most people seriously don't care so when I offer guidance -> I do that first and if I get pushback then I'm like ok u wanna really learn? Then do this this and this
Yeah you gotta like it
Like id recommend to most people if they really wanna do machine learning? Gotta learn dynamical systems, graph theory, algorithms and recursive structures, etc
Lots of probability theory too bc most research is now in MDPs and the likes
Whelp, I didn't do half of those, I did physics
Tho ig you can say I picked up algos during my thesis
Yeh I hate physics but I'm starting to release my beef against it
I also had a ton of stats and probs cuz I did a minor in maths, I literally prefer that someone punch me before having to hear another intro to the subject >.>
Mathematicians get triggered by the handwaviness in physics, and it's kinda funny
Yes I do
It's very frustrating
But I try to not remember that they are different fields and in reality physics is crucial
Physicists say the exact same thing about maths
I know the perfect video for this but I lost it
Yeh u know mathematicians could learn something from other fields about not wasting our time so much on things that don't matter but in the same vein everyone could learn from maths that everything matters
That's why I do applied and research RL and computational graph theory both topics actually have serious real world applications so it's both for the love of math and actually trying to further civilization
Instead of just picking my butt trying to prove crazy analysis or algebraic stuff that's meaningless
I meant math not maths I hate it when people call it maths
But I'm American so xd
Honestly at some point physics becomes super similar to math, especially at the forefront of research. They're all just doing math that 90% of the time doesn't really relate to their day to day experience so it ends up being similar to just inventing more math
They say that's how the field became stale with string theory, a lot of math, no experiments. That side of things will come back once they have better hardware ig, but til then I'm not sure if they can do anything.
Super interesting stuff, did you do math first then went into ML research ?
Well ironically despite what I said earlier which was guiding for the other person, math is ML and is critical to understanding it (assuming u want to at a high lvl)
So yes I did do math first I did my undergrad at NYU in mathematics
Ah when I said high level I meant it more in the computer science sense, in the "abstracting away the details" way
But the ML research I do is itself a math problem so
That's funny cuz I see it as a complete analogue to physics where you get your hypothesis and model and test it against experiment
Well it depends I guess on if ur just doing one stage of it or the whole thing or whatever
At least in deep learning it seems to be very experimental. You don't know nor can't prove anything 90% of the time, you just kinda gotta test it out
Well I wouldn't go that far my research 100% I prove first then I implement then I go back to the drawing board try something new
Also speaks to how cutting edge the work that is being done. Mine is all in deep reinforcement learning so it's very cutting edge
But I had to do the math first and then I'll have to do more math later down the road when I'm done working with the toy problem I've been playing with
The papers I read are usually about transformers or semantic segmentation stuff
None of them are particularly mathy
Ahh if u want some super good readings do
Rcnn fast rcnn faster rcnn mask rcnn (detectron)
That's 4 super interesting paper right there I'm about to give a presentation on them along with my implementations
I'm doing this stuff rn
I feel like it's 100x more mathy than what you'd find on the literature it will refer to
Which includes Attention Is All You Need, which was a super impactful study
But idk, I'm just following my interests
I'll look into it, I feel like I've read about it b4
What these papers choose to include is also field dependent
Sometimes it's published also from the math perspective but u might just be looking at the comp sci perspective or something
Idk the specific problem tho can't know every problem
I think in that sphere at least the computer science perspective is crucial cuz it directly relates to the $ you need to train at large scales
True detectron2 I couldn't get to work cuz my computer only has 2 gpus
And u need 8 to optimally train it
I could've written better code to make it use both my cpus so 2 cpus and 2gpus but I didn't have time
So I half assed it
To replicate the Attention Is All You Need one you also need a lot of GPU and like a week or something non stop
A good direction for research is to try to find ways to democratize all this stuff. Industry is already moving in that direction
Yeh I wouldn't bet on that
Too much money to invest no one is going to want to fork it over for free
Electricity ain't free 🤷♂️
it's cheaper than renting a gpu on amazon
there are ways to do it
Wouldn't matter to me anyway I have access to Delta at UIUC so I have a massive supercomputer XD
I keep hearing that kind of stuff xddd
And my advisor headed the project so I get to work on it a lot which is lit
Yeh it's huge
And u can multi GPU and multi CPU train
I suppose that could explain the disconnect between you guys and industry needs
They built it for both
Yeh I mean I also have access to a supercomputer in industry bc I work in quant finance
And they have way more money than even my school which has billions xd
by industry I would mean like, the startup scene which is arguably a crucial backbone for innovation
would be nice that mistral ai didnt have to sellout to msft to keep doing what they do
or that anthropic didnt need to so much investment from the big dogs
True I don't know much abt the startup scene tho
I'm a big dog unfortunately
Or fortunately depending who u ask xd
Research is great and all but research don't pay bills
i dont care either way, they both have their place imo
True the little guys in this space have actually a lot of impact
Cus they have to force innovation with minimal resources
Vs just abusing supercomputers
But then ig when quantum computing comes out all this is gonna be irrelevant anyway
it's also harder to mobilize a company like google, or an old timey institution like a uni
Yeh well universities have like tons of people doing different problems so they don't really give a shit
in this space they seem to be behind tho
most impactful stuff has come from companies and not unis
Well big companies use our stuff
And at places like Nvidia those PhDs also teach
So it's like a combined effort if anything
With every1 doing a specific part of the pipeline
uhm, I suppose so, wouldn't make sense any other way anyway
but still, the major innovations I can recall seem to come from the free market, tho ofc the knowledge has to come from somewhere and it comes from the faculties
@flat token I found the video - https://www.youtube.com/watch?v=NzK11DrRdks - not suitable for all audiences, math majors be advised
This video is inspired by an example from "Professor Stewart's Casebook of Mathematical Mysteries".
"as someone with a physics background, I can't resist expand anything that is expandable" 🤣
https://en.wikipedia.org/wiki/Neumann_series just for a little completeness, cuz that video was painful lmao
A Neumann series is a mathematical series of the form
$\sum_{k=0}^{\infty} T^{k}$, where $T$ is an operator and $T^{k} := T^{k-1} \circ T$ its $k$ times repeated application.
This generalizes the geometric series.
The series is named after the mathematician Carl Neumann...
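For anyone skimming, the link to the geometric series can be written out in one line. This is a short sketch; the condition $\|T\| < 1$ (in an operator norm) is the usual sufficient condition for convergence and is added here as context, it is not part of the quoted preview:

```latex
(I - T)\sum_{k=0}^{n} T^{k} = I - T^{n+1}
\quad\Longrightarrow\quad
\sum_{k=0}^{\infty} T^{k} = (I - T)^{-1}
\quad \text{if } \|T\| < 1 .
```

With $T$ replaced by a number $r$, $|r| < 1$, this is just $\sum_k r^k = 1/(1-r)$.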
please help:
i am getting error:
RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same
i have tried following:
```
def _convert_weights_to_fp16(l):
    if isinstance(l, (nn.Conv1d, nn.Conv2d, nn.Linear)):
        l.weight.data = l.weight.data.half()
        if l.bias is not None:
            l.bias.data = l.bias.data.half()

    if isinstance(l, nn.MultiheadAttention):
        for attr in [*[f"{s}_proj_weight" for s in ["in", "q", "k", "v"]], "in_proj_bias", "bias_k", "bias_v"]:
            tensor = getattr(l, attr)
            if tensor is not None:
                tensor.data = tensor.data.half()

    for name in ["text_projection", "proj"]:
        if hasattr(l, name):
            attr = getattr(l, name)
            if attr is not None:
                attr.data = attr.data.half()

self.enc.apply(_convert_weights_to_fp16)

def forward(self, x, window_of_500_patch, visualize, epoch_num, batch_idx, idx):
    with torch.no_grad():
        with torch.cuda.amp.autocast():
            # print(x.dtype)
            # x_f = x.float()
            #! (N, 197, 768) => pick [CLS] => (N, 768)
            out = self.enc.forward(torch.rand(2, 3, 224, 224).cuda(), output_hidden_states=True)
```
Honestly it can be really awful if you're studying advanced physics from books cuz they'll do these mental acrobatics and not tell you about them so you kinda have to go back and forth endlessly til you figure it out
what would be a good way to visualise a graph network? and should i do the visualisation pre or post clustering?
hey, so this is a super simple problem, but im trying to import a csv to remove the first few rows. My csv is in the same folder and the first few rows contain text and other things. I've been using this code and keep getting the error that there are no columns to parse
do you know why this isn't working
further details supporting my initial hypothesis
i need to find more datasets though
I am very confused on how the deep Q learning algorithm can actually work. Specifically I am confused on how the loss function will mean anything. How can you be sure that reward + gamma*targetModel's prediction will guide you in the right direction if the target model's prediction can be completely off? Thank you!
can selenium access this inspect tab, im planning to get the xhr that have "multi" in the name of xhr
https://networkx.org/documentation/stable/reference/drawing.html
It internally uses other plotting packages.
Has anyone worked with Time series
Deep q learning is verification of a bellman equation
It's possible to peter out from not having enough iterations but that problem is fixed in a lot of different ways
For one the scheduler modifies the learning rate as the explore phase and exploit phase progress to continuously attempt to decrease the loss (which is ultimately the point)
Through this you build up your Q table and that completes the problem formulation
There are a lot more nitty gritty details but I don't know your background. This is quite advanced and not something you can just cursorily look at, learn and know, which is why you may be having some difficulty understanding the underpinnings
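To make the "how can a wrong target help" part concrete, here's a minimal tabular sketch (no neural net, so not deep Q learning proper). The toy chain environment, the constants and the function names are all made up for illustration. The Q-values start out as random garbage, but every update mixes in a real reward, and because of the discount the error in the bootstrapped part shrinks on each sweep:

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (the goal)
ACTIONS = (-1, +1)    # move left or right
GAMMA = 0.9
ALPHA = 0.5

def step(state, action):
    """Deterministic toy environment: reward 1 only on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(sweeps=200, seed=0):
    rng = random.Random(seed)
    # Start from deliberately wrong Q-values to mimic a bad target model.
    q = [[rng.uniform(-5, 5) for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s in range(N_STATES - 1):          # the terminal state is never updated
            for a_idx, a in enumerate(ACTIONS):
                nxt, reward, done = step(s, a)
                # Bootstrapped target: real reward + discounted current estimate.
                target = reward if done else reward + GAMMA * max(q[nxt])
                q[s][a_idx] += ALPHA * (target - q[s][a_idx])
    return q

q = train()
print([round(q[s][1], 3) for s in range(N_STATES - 1)])  # Q-values for "move right"
```

The "move right" values should approach gamma^3, gamma^2, gamma, 1 from left to right. With a neural net in place of the table there's no such guarantee, which is a big part of why target networks and replay buffers exist.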
alright, today I'm chunking away some time to code these
..
in nn.Module.register_full_backward_hook, this is the hook signature:
hook(module, grad_input, grad_output) -> tuple(Tensor) or None
why are grad_input and grad_output tuples with one element?
Hey, so when we train an AI, it needs to store the data. So how do AIs which play any game work? Is the data or something distributed along with the AI?
information is encoded in the weights
in deep learning, models are super complicated functions
as an example you can look at a super simple function
f(x) = mx+b
in this case, m and b are the weights, which you can change to make the function be the shape that you want
in deeplearning models, exact same concept, but the function is super complicated
looks complicated
potentially just for compatibility with older parts of the program, only way to know for sure is see it stated in the docs or ask one of the maintainers I think
you can also think of it as a big machine, with millions to billions of knobs, which you can alter so that the behaviour of the machine changes, you alter these knobs until the machine does what you want, and thus, the information gets encoded in them
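Here's the f(x) = mx + b idea as a tiny runnable sketch, with just two knobs. The data, learning rate and iteration count are arbitrary choices for illustration. After the loop, nothing about the data is stored anywhere except in the values of m and b:

```python
# Toy data generated from y = 2x + 1 (the behaviour we want the "machine" to have).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

m, b = 0.0, 0.0        # the two knobs, starting from an arbitrary setting
lr = 0.05              # step size (an illustrative choice)

for _ in range(2000):
    # Gradients of the mean squared error with respect to m and b.
    grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Turn each knob a little in the direction that reduces the error.
    m -= lr * grad_m
    b -= lr * grad_b

print(m, b)  # should end up very close to 2 and 1
```

A game-playing model is the same thing with millions of knobs instead of two, so what gets distributed is the final knob settings, not the training data.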
By the 1950s, science fiction was beginning to become reality: machines didn’t just calculate; they began to learn. Machine calculating was out. Machine learning was in. But we had to start small.
Donald Michie’s “Machine Educable Noughts And Crosses Engine” -- MENACE -- was composed of 304 separate matchboxes that each depicted a possible stat...
this is the first one
i feel like there's one too many if statements, but im leaving it cuz im never not lost in the middle of all this stuff
im almost done with this stuff i think, taking a while ngl
@final kiln hii you have any idea how I can connect flowise chatbot with whatsapp
or say can I integrate it with whatsapp
Where I still want to extract user name and email from chat and its working with custom tool
you probably can't, last I recall whatsapp doesnt allow it
similar to botpress
you can probably do it on telegram
even with that, idk, I wanted to do a similar thing so I could have chat gpt on my whatsapp, but wasnt able to
interesting
so is there any way I can send this user and system message in it
dont they expose an api ?
yaa they require whatsapp business token
then you can just use that api to fetch and send data
should be a pretty small python script
but How I can fetch this user and system messages
that will depend on the api of the other thing you're using
means?
it's probably described in the documentation
like you gotta see how API A works, how API B works, and then you glue them with a script
API B means whatever you're using to interact with the language model, be it open ai api directly, or whatever this is
you gotta find one
😭 yess
likely in the docs
client requirement
it can easily be done using a direct python script
like have to just make a chatbot using llm
maybe you can make it as thin as possible and do the rest in py
okk thanks for helping Ill try
I came across reddit post saying matplotlib sucks and so many people agreed
but what else do I use???
I tried a whole bunch of libs they keep deleting my plots after i save my notebook and close and reopen
i do agree matplotlib is annoying last time all i needed was major and minor ticks for different y limits and it was so painful
also I am looking for some plot that I can update in place in jupyter notebook like fastprogress plot is there anything like that
haven't found a good direct alternative tbh, but yeah the API for matplotlib could be better for sure
there's plotly
I think this is one of those tasks where you just ask gpt for some boilerplate code and modify it to your needs
it's hard--and frankly terrifying--to imagine how it could be worse.
I have like, two or three functions I use from it, plot, scatter, figure and subplot
never dared to go farther
oh and hist
I've made a class that I can say class.plot(x) or class.imshow(x) any amount of times and then when I say class.show() it creates a figure with all the things I added to it and I dont have to deal with figures and axes
but still giant drawback is that its not interactive, I need to plot 20 lines and I have no idea which is which from the legend
I liked bokeh (from little usage) but it deletes after reloading notebook
Why use conda over pip
its the only python environment manager that I was able to install and figure out
also when u install pytorch it seems to be better at installing CUDA
I was watching AI and data science course and they recommended to install conda
I used to use it a lot before having my life dominated by docker
conda is considered to be "for datascience" but idk why
anaconda comes with all the scientific packaging
yeah but they could make pip come with all of it
yea
Not really, python is a general purpose language, AI is one of a hundred uses
they could make anapip for AI users
the conda part I don't think is specifically for AI
I don't wanna argue against my own self interest here, but idk if we'll ever get special treatment like that
just a nice environment manager and can install some packages without breaking dependencies unlike pip
Yeah I see what you mean. Pip does have something like that with venv
But I've never used it a lot cuz I used conda or docker
There's also this one https://python-poetry.org/
Python dependency management and packaging made easy
Which is the yarn/npm analogue of py
Arguably better than conda, but depends on your tastes
I want to clarify a doubt regarding linear regression
I came across 2 ways to solve the problem
- gradient descent
- a mathematical equation = (X X^T)^(-1) (X Y)
is this correct and are both used in real life?
is the second method called OLS ?
Hey there, not sure if this is the right place to ask, but I'm facing an issue and wondering if anyone has stumbled across this before
I'm managing an air gapped environment where part of my job is making jupyter notebooks accessible to data scientists. We mostly do this through JupyterLab images on K8S, but we also provide the ability to work with VS code
We configured an image that has vscode-server, that way they can SSH into the remote container and leverage robust hardware, while conveniently working from vs code. But we only considered people working with regular python files in vscode
Some clients requested the ability to work with Jupyter notebooks in vscode over the remote SSH. We figured it'd be a simple case of installing the Microsoft Jupyter extension for vscode, installing IPython, ipykernel and jupyter on our images, and installing the python extension.
However, for some reason, the Jupyter extension doesn't detect any Jupyter kernels. It doesn't even detect that python is installed, which is the weirdest part because it clearly is: I can run python code with the python extension.
Does anyone have an idea as to what the problem is? I am using VScode 1.82.2
yes and yes
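To spell out the linear regression answer a bit: the closed form is what you get by setting the gradient of the squared error to zero, and it's usually called the normal equation (the OLS solution). A short derivation sketch, assuming samples are the rows of $X$ and $X^{\top}X$ is invertible:

```latex
\hat{w} = \arg\min_{w} \|Xw - y\|_2^2
\;\Longrightarrow\;
\nabla_w \|Xw - y\|_2^2 = 2X^{\top}(Xw - y) = 0
\;\Longrightarrow\;
\hat{w} = (X^{\top}X)^{-1}X^{\top}y
```

If samples are stored as the columns of $X$ instead, the same result reads $(XX^{\top})^{-1}Xy$, which matches the formula in the question. Gradient descent minimises the same loss iteratively and tends to be preferred when the matrix is too big to invert comfortably.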
I want to semantically group independent document information in the same context. For example, if there are 50 hedge fund reports, the ideal output is "two advisers predicting that stock X will increase while one predicts a decrease", etc...
I am pretty new to this, so I tried embedding and clustering, which gave me somewhat bad results but set me on the path of exploring more in that direction.
Recently, I've found out about BERTopic and Topic modelling. I think this is huge and that I am closer than ever to solving this. My BERTopic stack looks like:
embedding_model: all-MiniLM-L6-v2
representation_model: [KeyBERTInspired(top_n_words=25), MaximalMarginalRelevance(diversity=0.4)]
vectorizer_model: CountVectorizer(stop_words="english")
I want to either:
A) For every document I'm parsing, run fit and transform so I have a list of each document's topics. Then, for documents connected by topic, use something like ChatGPT to try to find the specific "overlapping" similarities.
B) Run all the documents together as a knowledge base to see mutual topics and, based on the output, search for relevant parts in the documents.
Bonus questions:
- Should I split documents into semantically grouped parts, or should I have one element/document?
Thanks
Thanks!
Yaay, I won the hackathon I spoke about recently
Choose the libraries that work for you, whatever they say on Reddit. Matplotlib is behind a lot of other stuff (pandas plot, seaborn etc.), it's worth learning a bit about it even if you don't use it directly
anyone got any news on distributed inference?
give me ML ideas
R is the "data science" language. Most other languages can do many things
in 2024, you can uninstall conda and forget you ever had it.
I've been doing ML since 2018 or so, and I've never had conda installed.
Does anyone here have experience designing machine learning pipelines using model-based parallelism so that you can effectively have bigger-than-one-gpu models
I am wondering if there's any resources someone recommends on this topic
try plotly, it's a good visualization library
So conda is just a Cpython distribution that contains pre installed packages and tools for data science?
hello, im not sure where to post this, so please let me know if there is some other place I should post this but I needed some help understanding this. I get the idea but I don't understand how to do it. How do I for example use the bigram model in this instance?
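The exercise itself isn't visible here, so this is only a generic sketch of what "using a bigram model" usually means: count which word follows which, then turn the counts into conditional probabilities. The toy corpus and the `bigram_prob` helper are made up for illustration:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()  # hypothetical toy corpus

# Count how often each word follows each other word.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def bigram_prob(prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev was never seen."""
    total = sum(follow_counts[prev].values())
    return follow_counts[prev][nxt] / total if total else 0.0

print(bigram_prob("the", "cat"))  # "the" is followed by "cat" in 2 of its 3 occurrences
```

The probability of a whole sentence under the model is then the product of these conditional probabilities over consecutive word pairs.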
i still use conda, it's an easy way of managing python in environments without any permissions. mamba is pretty good
hello guys
so I wrote my BSc final exams few days ago and my Project defence would be coming up in a month time. I just want to say I am officially unemployed😂 .
I am a Data Scientist and Machine Learning Engineer, been programming since 2021. I am currently exploring NLP and i am working on a TextSentimentAnalysis project, hoping to eventually build a Customer Review Analysis software.
I am readily available to take on any role in the Data field. So guys please hit me
sure 👊 😏 👊
hey i need some help with data preprocessing
anyone free to lend a hand and teach me
hello! I'm writing data processing code that heavily uses the pandas library and it seems kinda slow. I have no idea how I can optimize it but maybe someone here can help. Can I post my code here?
sure, send it
Sorry, we don't do look-for-hires but if you want general tips you can check out #career-advice
Hope you guys can find something to optimize. 🙂
Here is the main loop of my program:
import pandas as pd
from strategies.Strategy import Strategy

def strategyLoop(df: pd.DataFrame, strategy: Strategy, longTermMAPeriod:int=200, pipValue:float=50.0, capital:float=1000) -> pd.DataFrame:
    CAPITAL = capital #$
    inPosition = False
    entryPrice, sl, tp = 0, 0, 0
    slInPips, tpInPips = 0, 0
    pipValue = pipValue
    lot_size = 0.01
    entryDate = df["datetime"].iloc[0]
    tradesData = []
    for i in df.index[longTermMAPeriod+strategy.N:]:
        currentPrice = df["close"].iloc[i]
        if not inPosition:
            inPosition, slInPips, tpInPips, entryPrice, entryDate = strategy.checkIfCanEnterPosition(df, i, CAPITAL)
        else:
            newSlInPips = strategy.updateSl(currentPrice, entryPrice, tpInPips)
            if newSlInPips != 0: slInPips = newSlInPips
            sl, tp = entryPrice+slInPips, entryPrice+tpInPips
            lose = currentPrice <= sl
            win = tp <= currentPrice
            if lose or win:
                profit = tpInPips*pipValue*lot_size if win else slInPips*pipValue*lot_size
                #print(f"profit {profit}, tpInPips: {tpInPips}, slInPips: {slInPips}")
                CAPITAL += profit
                tradesData.append({
                    "entry_date":entryDate,
                    "exit_date":df["datetime"].iloc[i],
                    "entry_price":entryPrice,
                    "stop_loss":sl,
                    "take_profit":tp,
                    "profit":profit,
                    "capital_after_trade":CAPITAL
                })
                inPosition = False
    return pd.DataFrame(tradesData)
and here is a function used in the previous code:
def checkIfCanEnterPosition(self, df: pd.DataFrame, i: int, capital: float) -> tuple[bool, float, float, float, str]:
    inPosition, slInPips, tpInPips, entryPrice, entryDate = False, 0, 0, 0, ""
    allowedToTrade = True
    if self.uselongTermMA:
        allowedToTrade = True if df["longTermMA"].iloc[i] < df["HA open"].iloc[i] else False
    if allowedToTrade:
        shortTermMAZoneMin = df["shortTermMA"].iloc[i]-(df["close"].iloc[i]/100)*self.percentZoneFromMA # => MA - 3% of the price
        shortTermMAZoneMax = df["shortTermMA"].iloc[i]+(df["close"].iloc[i]/100)*self.percentZoneFromMA # => MA + 3% of the price
        isLastNCandlesInshortTermMAZone = False
        for j in range(i-self.N, i):
            if utility.between(df["HA close"].iloc[j], shortTermMAZoneMin, shortTermMAZoneMax):
                isLastNCandlesInshortTermMAZone = True
                break
        if df["shortTermMA"].iloc[i] < df["HA open"].iloc[i] and df["HA color"].iloc[i] == "green" and isLastNCandlesInshortTermMAZone:
            entryDate = df["datetime"].iloc[i]
            entryPrice = df["close"].iloc[i]
            if self.useSR:
                isBelowMiddleSR, slInPips, tpInPips = self.determineSlAndTp(capital, entryPrice, self.keyLevels)
                inPosition = isBelowMiddleSR
            else:
                slInPips = -utility.getSlInPipsForTrade(
                    invested = capital*self.maxRisk,
                    pipValue = 50, # pip value for the S&P 500 for a standard lot = 50
                    lotSize = 0.01 # micro lot
                )
                inPosition = True
                tpInPips = -slInPips
    return inPosition, slInPips, tpInPips, entryPrice, entryDate
Yup, that's what I suspected.
You're typically not supposed to loop over data frames
Yeah but I don't know how to do other way :/
How well do you know Pandas yet?
There's no wrong answers, I just want to see how best I can help you 😄
I know you can do queries like in a database and you can use apply function that could maybe improve the speed
I used pandas because dataframes are very "readable" datastructure and easy to use with a lot of functions but maybe I should use another datastructure to store the datas ?
So, I'd advise you to learn what functions Pandas has to offer, because you can replace your "imperative" code with Pandas methods that do the same thing in 1 line of code and without writing loops. Looping over dataframes is what makes it slower
Ok I'll try to learn more about pandas functions
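As a starting point, here's a rough sketch of how the entry check from the posted function could be computed for every row at once instead of row by row. Assumptions, loudly: `entry_signal` and the demo data are hypothetical, `utility.between` is assumed to be inclusive on both ends, `uselongTermMA` is assumed to be on, and only the entry condition is covered (the stop loss / take profit bookkeeping is stateful and would still need a much shorter loop over the signal rows):

```python
import pandas as pd

def entry_signal(df: pd.DataFrame, n: int, percent_zone: float) -> pd.Series:
    """Boolean Series: True where the row-by-row entry conditions would hold."""
    zone_half_width = df["close"] / 100 * percent_zone
    zone_min = df["shortTermMA"] - zone_half_width
    zone_max = df["shortTermMA"] + zone_half_width

    # "Any of the previous n HA closes lies inside the current row's zone":
    # compare n shifted copies of the column instead of looping over every row.
    in_zone = pd.Series(False, index=df.index)
    for k in range(1, n + 1):
        prev_close = df["HA close"].shift(k)
        in_zone |= (prev_close >= zone_min) & (prev_close <= zone_max)

    return (
        (df["longTermMA"] < df["HA open"])
        & (df["shortTermMA"] < df["HA open"])
        & (df["HA color"] == "green")
        & in_zone
    )

# Made-up demo data using the same column names as the posted code.
demo = pd.DataFrame({
    "close":       [100.0, 101.0, 102.0, 103.0, 104.0],
    "HA open":     [99.0, 100.0, 103.0, 104.0, 105.0],
    "HA close":    [100.0, 101.0, 102.0, 103.0, 104.0],
    "HA color":    ["green", "red", "green", "green", "green"],
    "shortTermMA": [100.0, 100.5, 101.0, 101.5, 102.0],
    "longTermMA":  [98.0, 98.5, 99.0, 99.5, 100.0],
})
print(entry_signal(demo, n=2, percent_zone=3.0))
```

The remaining loop here runs n times (the lookback), not once per row, and all the `.iloc[i]` lookups disappear, which is usually where most of the time goes.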
anyone done some distributed inference yet? am messing about with PiPPy atm
what is pytorch doing with my threads gahhh
so much abstraction i can't see anything
It's more than that. But in 99.9% of circumstances, you don't need it. You can just use python normally and pip install everything you need.
I do all my development in a cloud computer running a docker container with all the dependencies.
....and sometimes the cloud computer is my own laptop - it makes a lot more sense than what it sounds like once you do use it
Like, I can onboard a dev in about 20-30min or less. No one else has to install any packages or worry about env
it's a convenient way to spawn new python environments too
You can do that with normal python
not if you need to switch python versions
which will occasionally be the case with ML and numerical libraries and finding what works
also I understand in the latest versions of conda they are switching to the mamba solver if I'm not mistaken
so the main reason it was kind of unenjoyable before should be going away
Can't you do it with pyenv
idk what that is and why people keep pushing it
conda has its own repos too
which is good for business usage
I'm not pushing it, I'm saying that I think you can manage py versions with pyenv
A lot of people recommend it but I've never heard of it anywhere but this server
I favour docker over all of this personally
I think poetry uses it, might've been how I found out about it
Not too sure tho
in my professional life however, I've encountered conda independently several times
does that really mean anything? idk maybe
Same, but I've encountered all of them I think
Tho docker is everywhere I haven't been in any project where docker isn't used
In some way shape or form, there's always been docker
I use docker but just when I'm getting ready to make my stuff portable I think running everything out of docker from the beginning of development sounds like a pain in the ass
Sir, you'd be 100% correct
But - after spending the time and getting it right for the first time, it has been a breeze. Never going back.
py -m venv .venv, py -3.11 -m venv .venv? assuming you have the version you want installed
Like, there's a way to do it in which it is not troublesome. It just took me a long time to fine tune it.
isn't py some weird program that only exists on windows
I've never used that either I don't know what it is and only heard of it here
you can do the same thing with python3 in linux as far as I know
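For reference, the `py` launcher ships with the Windows installer; on Linux/macOS the usual spelling is `python3 -m venv .venv`. Either way the environment uses whichever interpreter ran the command, so that version has to be installed already. The same thing can be done from inside Python with the stdlib `venv` module. A small sketch, with a temp directory as an arbitrary location:

```python
import tempfile
import venv
from pathlib import Path

# Create a throwaway environment in a temp dir (the location is illustrative).
env_dir = Path(tempfile.mkdtemp()) / ".venv"
venv.create(env_dir, with_pip=False)   # with_pip=True would also bootstrap pip

# pyvenv.cfg records which interpreter the environment was created from.
print((env_dir / "pyvenv.cfg").read_text())
```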
yeah but you need to then manually install python versions
On that note, why do we have so many names
which is the entire point of what I was getting at
don't manage python version installations yourself
just treat python like a package
with a tool that can pull it
you still need to keep track of which versions you are using, trying to pretend that it works like magic is just throwing problems under the carpet
No, I think he means like, you can have several 3.11 installations done via conda, like it provides the API so you don't have to do it
that much I guess that I can understand, but still don't consider enough of an upside to use conda
also, technically you can install python via the command line? definitely overly complicated territory though
I understand that there is actually a way to have models bigger than a single GPU's vram if you pipeline the model into different pieces where each piece fits into a single GPU
I want to learn how to do this
does anyone have any resources on how this sort of thing can be done
Interesting problem, haven't thought about it myself.
The only technique I know that is in that vein is gradient accumulation, which is pretty simple to implement and quite effective
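A quick sketch of why gradient accumulation works, in plain Python with a one-parameter model (the data and names are made up). Worth noting it saves memory along the batch dimension, it doesn't by itself split a model across GPUs. The point is just that summing gradients over micro-batches and normalising once gives the same gradient as the full batch:

```python
# Linear model y_hat = w * x with mean-squared-error loss.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]   # toy (x, y) pairs
w = 0.5

def grad_sum(batch, w):
    """Sum (not mean) of per-sample gradients d/dw (w*x - y)^2."""
    return sum(2 * (w * x - y) * x for x, y in batch)

# Full-batch gradient: needs all samples "in memory" at once.
full_grad = grad_sum(data, w) / len(data)

# Accumulated gradient: process micro-batches of 2, add up, then normalise once.
accumulated = 0.0
for start in range(0, len(data), 2):
    accumulated += grad_sum(data[start:start + 2], w)
accum_grad = accumulated / len(data)

print(full_grad, accum_grad)  # equal up to floating point rounding
```

In a framework this usually means calling backward on each micro-batch and only stepping the optimizer every few batches.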
Ig the first question I'd ask is why are they there, do they mean anything, is it missing data, if so can I get away with removing data points that contain NaN, etc
there is in principle a different approach for each dataset and each model
interpretation of missing data can be different depending on the context
in some cases you might want to backfill it using something like KNN
in some cases you know what it means and should encode a dummy variable
from dataset with [15716 rows x 16 columns] , Number of rows without NaN in any column: 2955
Thats <20%
Considerable amount of NaNs
Can you get away with removing the column that contains NaNs ?
Missing value NaNs
I bet it's a subset of columns that are most often missing
I mean how much info can it hold if it's mostly nans rite
if it's just random missing data I think doing interpolation using KNN is actually a good idea
but if it's a few columns then you might just want to drop them
Distribution of missing values is like this:
Date - 0 NaNs
Issuer - 0 NaNs
Symbol - 0 NaNs
Exchange - 0 NaNs
Amount - 248 NaNs
Security - 4 NaNs
Performance_1Qtr_After_Deal - 1087 NaNs
Performance_1Yr_After_Deal - 938 NaNs
Performance_to_Current - 0 NaNs
Market Cap - 652 NaNs
Forward P/E - 4841 NaNs
PEG Ratio - 11353 NaNs
Price/Sales - 4184 NaNs
ROE - 3414 NaNs
Debt-to-Equity Ratio - 5152 NaNs
Net Income - 3376 NaNs
what do you mean maru? Are you suggesting using clustering to predict value range which I can plug in into missing data?
for each NaN column?
there are things built into sk learn to do this
Idk actually, seems evenly distributed almost
i forgot what it's called exactly
but with those you probably can't do it
because just from my personal knowledge
context is important
e.g. if I have a row whose basementArea is nan, but the value in hasBasement is False, then the nan is probably because there's no basement, and I'd fill that with a 0
all those missing ratios are probably cases where earnings are negative or zero
or sales are negative or zero
so they just make the ratio a nan
Agree, thats why i cant just simply drop NaN dense columns or like use mean value
you should one hot encode the missing variables
like
if you need to make them some value for your model do that
but then also have a column that says "this row had this column replaced because it was nan"
as an indicator variable
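A minimal pandas sketch of the indicator idea. The rows are made up, the column name is borrowed from the list above, and the `_was_nan` suffix and the 0.0 placeholder are arbitrary choices:

```python
import numpy as np
import pandas as pd

# Hypothetical rows; only the column name comes from the dataset above.
df = pd.DataFrame({
    "Symbol": ["AAA", "BBB", "CCC"],
    "Forward P/E": [12.5, np.nan, 30.1],
})

col = "Forward P/E"
# 1) Record where the value was missing before touching it.
df[col + "_was_nan"] = df[col].isna().astype(int)
# 2) Only then put in a placeholder so models that reject NaN can still run.
df[col] = df[col].fillna(0.0)   # the 0.0 is arbitrary; the indicator carries the meaning

print(df)
```

The model can then learn something like "when this flag is 1, ignore the placeholder", instead of treating 0.0 as a real ratio.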
It is a little more complex than that, and it isnt the point of my question, i just want to ask you for suggestions on how to think structurally about filling in missing values, or what i can do with them
I mean you can if they just signal that the data point is incomplete, but if it means something otherwise, you can just encode it somehow
you should not fill in those missing values
they have a semantic meaning in finance
they are probably nan for a reason
you can't use zero for negative earnings
Maybe try to dig in the dataset docs, if there's one
is this a specific use-case where i shouldnt try to fill in, or that is general goto?
it's because we know that the nans probably exist for a reason and aren't just missing data
the reason is that when companies have no sales or no earnings they are undefined
and filling them with any number would be inappropriate in a sense
that's why the indicator variable is important
The answer then seems to be that you gotta acquire domain specific knowledge about your problem and try to make a decision that makes sense
Yeah that makes sense
thanks @agile owl @final kiln
also, what do you mean by encoding missing values?
I would one-hot encode enums
Cause I know that 0-item1 1-item2 etc...
Depends on the model, for example if I was using a language model I'd just use a special token that represents NaN
Perhaps a numerical value that I know for sure won't appear anywhere else, maybe a -1 if all other numbers are positive, idk actually
You can also do one hot encoding
And you can actually also do the same thing as with language models
Which is to use an embeddings table
would it make sense for example to normalize it [0,1] and then to use -1 as encoder
Maybe, I can't say for sure. Normalizing tends to be a pretty good idea tho
For sure, but also, to spare you some suffering, try to search through papers with code to see if anyone has solved your problem or a similar problem
You'd be surprised at how much time you can save
It shows you the state of the art in the various areas of deep learning
And you might even find your dataset there
neat! thats what i need
how you encode missing values can also depend on the model you are using
some models just natively handle nan
which is the best imo
what framework are you guys using to run a pytorch model as a server/daemon?
If it is true, can someone explain me why this is the case?
Also I think this is extremely nicely formulated:
You can create a machine learning model without using the column and use it's performance as a baseline, and carry out a performance(accuracy) benchmarking for all the steps compared to the baseline.
Idk, I'm very wary of anything that means generating new data to fill in the gaps
My instinct tells me to just drop columns and data points rather than add synthetic data like that. Ideally the model would have some way of encoding "missing data", cuz that in and of itself could be a bit of information right, "when these values are missing, the output tends to be a certain value"
What that person says at the start is very true in my experience. Often the data quality is much more important than the model. Ig you can totally choose the wrong model, but if you have bad data not even the best SOTA models will help you out.
If I were to directly modify data points from my dataset, I would need a pristine justification to myself
guys
this is not quite right either. you can think of it as being two sliders, one for the model and another for the data. the worse one is, the better the other has to be. the issue is that most people do black box ML, i.e. the model is completely made up, and so data is everything
im doing a project in data analysis
i need a lil bit of help
i wish i could talk with someone and share my screen and stuff
What I'm saying is that you can have a fancy model and it won't work cuz the data is bad, and it's waaay easier to have a fancy model than to have good data.
but i accidentally left this server now i have to wait for 3 days to get voice verification
Not only that, at least the deeplearning models I've been using are very resilient, I can butcher them and still get good results if my data is of high quality >.>
what are you guys using for serving?
sure, this would be the extreme case where the model slider is set close to 0 but the data is good
you can use bad data with a good model and well-motivated regularization to account for data errors and that works too
why no one is replying to me
because there is nothing we can do about this
if you have questions, by all means go ahead and ask
I've spent countless hours around bad data to get nothing, then I got better data and it got solved in like 30min
It happens a lot to me
for complicated phenomena, it's very difficult to make a good model
I suppose the data slider is more important, that's how I feel
would you help me?
in those cases you're kinda screwed without good data
But ig it can depend on the problem
i can't know if you don't ask your question
Maybe I've only worked with problems where data is more important
@wooden sail why is it?
why is what
i have a dataset and they have given me some task and i've done most of the work except two tasks and nothing is helpful not even chatGPT
you're still not asking a question
imagine that you are the person trying to help you. what would that person need to know to say something helpful
how can i show you my dataset?
this is one of those english, m*, do you speak it moments
what is the structure of it? text files? images?
excel
no need to be rude
is it xlsx or csv?
its normal column and row data
the file itself: what is the extension?
okay. show a screenshot that shows the names of each column and the first few rows (and nothing else--don't include a bunch of other stuff on your screen)
okay wait
Please stop trying to upload documents. Please post a screenshot.
oh no i cant send the xlsx file here
can i send u personally?
No
If you're willing to upload the whole xlsx file here, I'm not sure why a screenshot would be an issue
the thing is idk how to take a screenshot
I'd need an example of this tbh, cuz like, I don't think it's advisable to just invent new data out of nowhere, especially if it's purely motivated by stats and not from some understanding of the underlying phenomena
so... what do you guys serve models with
what operating system are you on
well yeah, just making data up is always a bad idea 😛
intel
Not necessarily, I can think of situations where it is okay
i mean randomly making it up with no motivation behind how you made it up
64 bit
are you on windows, mac, or linux
windows
things like missing data are ok as long as there is some notion of "structure" or "low dimensionality" underlying the data
okay, open the "snipping tool"
label images according to whether the 192nd pixel's color hex is prime
Ok wait, I think I didn't understand what you said. You said regularization to account for errors, what do you mean by that ?
omg yeah got it
great. Remember to only use screenshots to share information that you cannot share as text. Text is always preferable to screenshots.
In here
depends on the type of error, but some concrete examples include the data being "bad" due to noise. if you know the noise statistics, then you can do something about it
if the data has no noise but parts are missing, and you know it follows a "simple"/predictable structure, that's also fine
a combination of the two is also ok
this would mean you know the model is "simple" and you also have a statistical model
Okay yeah that actually makes total sense and I've done that many times especially with data coming out of instruments, you know the noise cuz you can literally sample it.
if you also know your model is wrong because it's a little too simple, but it usually performs well, there is a way to measure "mismatch". you can try to make simplified models with fewer parameters that are "usually" "almost correct"
these tend to be robust to data errors, but in exchange the maximum "resolution" is poor
never too wrong, but also never quite right
Would you say it makes sense to try to identify noise in the data even tho you can't actually sample it ? Maybe using stats you see that there's like a random component to it and you remove it
My instinct is to not remove it, cuz I'm ignorant about what it is and where it came from
okay. now explain what the task is. be specific, so that we don't have to interview you.
yeah you can't even remove it if you don't know what it is. there are some decompositions that, although blackboxy in that they use deep learning, are based on the idea of decomposing the data into a "simple"/"structured" component and another "random" one with lots of detail that cannot be trusted
Task: Geographic Analysis
Plot the locations of restaurants on a map using longitude and latitude coordinates
what tool are they asking you to use?
python ofc
they didnt mention that
i just have to do the work with python thats all
I gotta do some upskilling on non-deeplearning ML
I hate stats tho >.>
okay, try using this: https://geopandas.org/en/latest/gallery/create_geopandas_from_pandas.html
ok
but non-DL ML is just stats 😮
(and DL is just doing calculus on the stats)
also you're so kind @serene scaffold
you have so much patience for someone who is dumb like me lol
@orchid forge no problem. do you know how to install stuff (like numpy, pandas, etc)?
you'll need pandas and geopandas. and probably openpyxl, to open the excel file as a dataframe
Ig I'm being unfair to the subject. I think that stats alone is extremely dry and unappealing, but when it's coupled with a subject it becomes something really good. Some of the most profound ideas that I've had the pleasure to put in my head are statistical in nature
im using google colab i hope i could import those libraries
yeah, you just need to run `!pip install pandas geopandas openpyxl` in a cell.
k
you're an admin, woowww
its nice that you help people
I only do it to offset what a horrible person I am in every other aspect of my life
no you're not, if you're helping someone like me i think you're the coolest yet very humble person
anyway, this isn't about me. what progress have you made towards making the "maps"?
i just imported the libraries
and now writing the further code
god i can't write a single line of code i wanna cry
start with just loading the excel data as a dataframe, and then converting that dataframe to a geodataframe
okay, what does gdf look like when it prints?
looks good to me. keep following the example from the geopandas website
it looks like you have cities from all over the world, so you can skip the part where they restrict the map to just south america.
what does that mean?
ok
I think you can just remove the .clip([ ]) part
ok
it's an absolute frigging pain to download these huge models
is there no better alternative than git clone? it keeps messing up
forg it, i'll download the parts file by file
huggingface?
yeah but it uses git clone
loookkkkkkkkkk
interesting. git is historically really bad at very large binary files.
i did it
it is absolute scheisse
and my instances keep dying because they run out of memory
from a DOWNLOAD
it's just not a good fit for the data model. git is meant to work well for lots of small and medium-sized files containing mostly text.
any alternatives? otherwise i'm just going to make a shell script that downloads a list of URLs
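rough sketch of that "download a list of URLs" idea, in python instead of shell. it streams each file to disk in chunks so RAM use stays flat. the function names and the `model_files` folder are made up for illustration, and it assumes you already have direct file URLs. (the huggingface_hub package also has download helpers like hf_hub_download and snapshot_download that skip git entirely, but I haven't shown those here.)

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def filename_from_url(url):
    """Use the last path segment of the URL (query string excluded) as the file name."""
    name = Path(urlparse(url).path).name
    if not name:
        raise ValueError(f"cannot derive a file name from {url!r}")
    return name


def download_all(urls, dest_dir="model_files", chunk_size=1024 * 1024):
    """Stream each URL to disk in chunks so memory use stays roughly constant."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest / filename_from_url(url)
        if not target.exists():  # finished files are skipped on a re-run
            partial = target.with_name(target.name + ".part")
            with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
                shutil.copyfileobj(response, out, chunk_size)
            partial.replace(target)  # rename only after the whole file arrived
        saved.append(target)
    return saved
```

writing to a `.part` file first means a download that dies halfway doesn't get mistaken for a finished one on the next run. it does not resume partial files, it just starts that file over.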
YAY
omgggggg
Might as well just download it directly from the hugging face web UI
huggingface makes you install an extra git module to handle large binaries
git-lfs? or something else?
Oh I didn't know that
yes
I've only just started using HF (via sagemaker)
I think it's a memory leak?
@desert oar you're a genius tho
Disk space or RAM?
Interesting
I'm just a chunk of salt but thanks
Try increasing your swap file
total download is 65 GB, my instance had 75 GB + 16 GB RAM
But no ideas beyond that
will just get a bigger instance if it comes to this lol
oops not u
cloud makes swap obsolete heheh
i mean @serene scaffold
You do pay more for more ram
0.01 USD more for the hour lol
But in any case, 16gb should be more than enough
whenever I see salt rock lamps for sale at stores, I feel bad for your trapped brethren
Woah, I don't believe that
i know right? something is wrong
yeah check out google cloud spot instances
let me tell you exactly
Spot is an auction market so that does happen
i wanna be a coder like you @serene scaffold
I've been using 32gb of ram cuz its cheaper than the 16gb
0.09 extra USD per hour for 16 to 32
half that for spot
why do people even buy computers anymore lol
seems like you're on your way
Never saw a just 10cent increase like that tbh
But yeah in that case might as well rite
gcp
oh half price for quad core non-spot
My computer has been turned into a glorified terminal for cloud machines
nice
lol mine too
I could legit just code from my cellphone browser
I'd click some buttons on my GitHub Actions workflows and it gives me a web link that opens VS Code in the browser
i'm going to start fully transitioning to cloud lol
It's worth it for sure, and you don't even need to be confined there cuz your computer can also be part of the list of machines right
So you get a perfectly reproducible env across any machine if you do it right
But mostly cloud tho
i think i could learn a lot from you
this
but also it's made me feel that turning off my computer at night is a waste lol
i still literally have a hard time believing this cloud stuff is available
and it is so FRIGGIN CHEAP
you can probably learn from any of the people who frequent this channel
hmm
before the end of the day I'm going to try running distributed inference with starcoder2 on 80 E2 1GB instances, wish me luck
also it's going to be free because I haven't used my compute this month lololo
I feel that the way I got this stuff setup is so good that I could just turn it into a product and sell it. Something cloud agnostic and not dependent on GitHub actions. You'd just need to install an agent on your machine and you'd get the whole thing. Cloud agnostic env, from dev to prod, spot instance pricing only cuz fault tolerance is pretty easy to account for
yeah i mean the time investment to get this stuff working is non negligible
i've been at it nonstop for like the last month and it's still not good enough for prod
Would 100% blow up the current competition that doesn't even have GPU, let alone give you the option to not use the cloud if you don't want
About 70-80% of the pricing too
dude way less
i have another task ..... Analyze the ratings and popularity of different restaurant chains
for the same dataset
No.
I'd need my cut hueheuehs s
Alas, my brain is not smart and prefers to do ML research for free
Hey I was trying to use the BERT model for one of my applications but it seems I'm not able to install tensorflow-text library, currently using Python3.12 any suggestions?
you should change all the Yes/No data to True and False. and you can probably ignore "Rating color" and "Rating text", since those are just non-numeric versions of "Aggregate rating".
I have to head out for a bit
what "analysis" do they want you to do, anyway? just looking for patterns?
ya i guess
i think they just want me to do it all by myself they are not specific with things
i think im free to analyze it the way i want to
k thanks for helping, you rock!!!
You know what, screw it, I'm submitting the idea to ycombinator. I gotta be trying everything, and I suppose this kinda counts as a job application
Which library should we use to create data insights for a python project
please ping me if someone answers
What is a "data insight"? Pandas is the most popular library for reading tabular data (like excel or CSV). And matplotlib is (unfortunately) the main one for creating data visualizations.
Hi. I have a containerised pytorch model which does a prediction when you give an input but the problem is that the majority of the time spent by the container image when it is loaded (on a serverless GPU platform), is spent on importing packages and dependencies like importing torch, other dependencies from other places in the directory.
Is it possible to reduce this time? Also is it possible to reduce the load time of a pytorch model as well?
Like my inference is done in less than 5 seconds but these package imports shoot up the inference time by 5x because the packages are not imported yet. How do I solve this?
Any help is appreciated!
every time you send a request to that container, is it re-loading pytorch and the model in a fresh python process, every time?
Yes. As the server spins down if it is not in use. Or isn't warm.
the model needs to always be fully loaded and ready to go, or there's nothing you can do to make it fast.
Really? 😭
Isn't there a way to deconstruct a pytorch model such that we can save it as a file and load it as is? Kinda like a cache but saved on disk?
Perhaps
I don't think it will ultimately be as satisfying as keeping the model warm.
I came across this article:
https://medium.com/ibm-data-ai/how-to-load-pytorch-models-340-times-faster-with-ray-8be751a6944c
But this is no longer supported, so I thought of another possible way to deconstruct the pytorch model: first pull the weights out as numpy arrays and copy the model structure, as the article describes. Serializing the weights to a json file is doable, but the problem is how to save the model structure with the weights removed, since it is not JSON serializable! If the model scaffolding can be saved somehow, it could be loaded in a few hundred ms instead of the 8-10 seconds it currently takes.
Ray’s Plasma object store can reduce the cost of loading deep learning models for inference almost to zero.
Random forest is primarily a bagging algorithm since it averages the trees right? On the same note, it also can be boosting (i.e. gradient boosted RF).
What could a graph like this signify? My validation loss is massive, but training loss is only around 1.
I'm doing multi-class classification with FastAI with 9 classes.
My model is using DenseNet121 and FastAI's vision_learner function.
My loss is Weighted Cross Entropy Loss.
I think it might be due to the wrong activation function, but I'm not entirely sure (I'm using ReLU).
I'm really new to ML and would be grateful for any help.
im trying to run, i have p40 (24gb) cuda version 12.2, 535.161.07 nvidia drivers and pytorch 2.2.1:
`import torch
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_pretrained("nlpconnect/vit-gpt2-image-captioning")
feature_extractor = ViTImageProcessor.from_pretrained("nlpconnect/vit-gpt2-image-captioning")
tokenizer = AutoTokenizer.from_pretrained("nlpconnect/vit-gpt2-image-captioning")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)`
but im getting this error:
RuntimeError: CUDA error: CUDA-capable device(s) is/are busy or unavailable
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
just wanted to share this gif I made: an equirectangular projection of a cubed sphere with Gaussians applied at each step and moving particles
Using: python, numpy, pyproj, and scipy (oh and cv2)
nice
heyy guys, how to enable autocompletion in jupyter notebooks? ( I mean auto closing parenthesis)
I mean auto closing parenthesis, if it makes sense
It can be found in settings > auto close brackets
Thanks 🙂
But why unfortunately

hey
wow
@final kiln did you find any discord server where people talk about cloud dev? not finding much
the AWS discord server is good tho but specific to AWS
Haven't searched for it
I know it is kind of a useless question, but is ReLU actually better than sigmoid in NN forward_prop?
it's difficult to say what makes one activation function "better" than another. it depends on the situation. and even then, it might not be straightforward to explain why one activation function seems to work better.
the important thing is that it's a non-linear function
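quick toy illustration of why the non-linearity matters (one-dimensional "layers" with made-up weights): two linear layers stacked with nothing in between are still just one linear layer, while putting a ReLU between them gives a function with a kink that no single line can match.

```python
def linear(w, b):
    """Return a one-dimensional 'layer' computing w * x + b."""
    return lambda x: w * x + b


def relu(x):
    return max(0.0, x)


layer1 = linear(2.0, -1.0)
layer2 = linear(-3.0, 0.5)


def stacked_linear(x):
    """Two linear layers with no activation in between."""
    return layer2(layer1(x))


# The same function written as ONE linear layer:
# w = -3 * 2 = -6, b = -3 * (-1) + 0.5 = 3.5
collapsed = linear(-6.0, 3.5)


def with_relu(x):
    """Same two layers, but with a ReLU between them."""
    return layer2(relu(layer1(x)))


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0):
        print(x, stacked_linear(x), collapsed(x), with_relu(x))
```

the same collapse happens with full weight matrices, since a product of matrices is just another matrix.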
Also beware of the random initialization of the weights, it has been the deciding factor for me many times
Really, how could you counter randomness?
you don't.
Soo, your model could work, but randomness can actually f it up?
Wdym ?
Hi guys
Wanted to know what it was like being too stupid to understand what a pointer is
you start with randomly initialized weights, and make slight adjustments to them during each training iteration. it doesn't mean that your model might "randomly suck"
Can you tell me
Not everyone here only programs in python bru
pointers are part of programming languages. not AI. see #❓|how-to-get-help
No shit
I code in C
The only channel that's actually active
So yeah i know what a pointer is
that doesn't mean you can ask an off-topic question.
!rule 7
7. Keep discussions relevant to the channel topic. Each channel's description tells you the topic.
What I mean is that there are various forms of random init, and choosing one over the other has sometimes been the final step in making my model train correctly
It shouldn't really matter right because the model will do back_prop and change the weights accordingly right?
I don't know all the reasons it might matter, but the one I can think of is if you do a random init where your values would make the model susceptible to overflow when computing any of the steps
Ahh yeah alright
But if you look in the torch documentation you might see the various inits and links to the corresponding research paper
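here's a small pure-python sketch of why the init range can matter. it pushes one random vector through a stack of linear layers (no activations, to keep it simple) and compares a plain uniform(-0.5, 0.5) init against the Glorot/Xavier uniform limit sqrt(6 / (fan_in + fan_out)). the width, depth and seed are arbitrary choices for illustration.

```python
import math
import random


def forward_scale(init_limit, width=64, depth=10, seed=0):
    """Push one random vector through `depth` linear layers whose weights are
    drawn from uniform(-init_limit, init_limit); return the RMS of the output."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    for _ in range(depth):
        weights = [
            [rng.uniform(-init_limit, init_limit) for _ in range(width)]
            for _ in range(width)
        ]
        x = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return math.sqrt(sum(v * v for v in x) / width)


naive = forward_scale(0.5)                         # uniform(-0.5, 0.5)
glorot_limit = math.sqrt(6.0 / (64 + 64))          # Glorot/Xavier uniform limit
glorot = forward_scale(glorot_limit)

if __name__ == "__main__":
    print("uniform(-0.5, 0.5):", naive)
    print("glorot uniform    :", glorot)
```

with 64 inputs per layer, the plain init multiplies the variance by roughly 64/12 at every layer, so the activations grow by orders of magnitude over 10 layers, while the Glorot limit keeps them around their starting size. with saturating activations like sigmoid or tanh the symptom would be saturation rather than blow-up.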
I can see, but for random init between -0.5 and 0.5 it shouldn't be much of a problem i guess
I mean it shouldn't
I hear a "but..." in that message 😅
But experience has taught me otherwise
Like I'm just saying, if you're stuck, this may be one of the knobs you gotta look at
Well yeah i have encountered overflow already but that was because my sensor data wasn't normalised xD
So it was working with sens data of a max of 255
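for anyone reading along, a tiny reproduction of that kind of overflow. this is a made-up worst case (10 inputs all at 255, all weights at the edge of the init range) with a naive sigmoid, not the actual sensor setup from above.

```python
import math


def naive_sigmoid(z):
    """Textbook sigmoid; math.exp overflows for large negative z."""
    return 1.0 / (1.0 + math.exp(-z))


weights = [-0.5] * 10          # edge of a uniform(-0.5, 0.5) init
raw = [255.0] * 10             # unnormalised sensor readings

z_raw = sum(w * x for w, x in zip(weights, raw))   # -1275.0
try:
    naive_sigmoid(z_raw)
    overflowed = False
except OverflowError:
    overflowed = True          # exp(1275) does not fit in a float

scaled = [x / 255.0 for x in raw]                  # normalise to [0, 1]
z_scaled = sum(w * x for w, x in zip(weights, scaled))   # -5.0
activation = naive_sigmoid(z_scaled)               # small but finite

if __name__ == "__main__":
    print("raw inputs overflowed:", overflowed)
    print("normalised activation:", activation)
```

dividing by the known maximum is the simplest fix here. libraries like numpy usually give you an overflow warning and inf instead of raising, so the problem can be quieter there.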
Good advice
The explanation I gave was just the first plausible thing I could think of, idk if it's the actual reason, there's papers on this stuff
Alright well good to know
Is there btw any good tutorial that actually explains backprop? It is still kind of magic in my eyes
3blue1brown has a good playlist on it
When going through any explanation, just remember that you're just doing the chain rule
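the "it's just the chain rule" point, as a minimal sketch on a single sigmoid neuron with squared error. the numbers are arbitrary, and the finite-difference check is only there to show the hand-derived gradient is right.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, target):
    """Squared error of one sigmoid neuron: (sigmoid(w*x + b) - target)^2."""
    return (sigmoid(w * x + b) - target) ** 2


def backprop(w, b, x, target):
    """Gradient of the loss w.r.t. w and b, one chain-rule factor at a time."""
    z = w * x + b
    y = sigmoid(z)
    dloss_dy = 2.0 * (y - target)      # d(loss)/dy
    dy_dz = y * (1.0 - y)              # d(sigmoid)/dz
    dloss_dz = dloss_dy * dy_dz        # chain rule
    return dloss_dz * x, dloss_dz      # dz/dw = x, dz/db = 1


def numeric_grad(w, b, x, target, eps=1e-6):
    """Central finite differences, used only to check backprop()."""
    dw = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)
    db = (loss(w, b + eps, x, target) - loss(w, b - eps, x, target)) / (2 * eps)
    return dw, db


w, b, x, target = 0.3, -0.1, 2.0, 1.0
analytic = backprop(w, b, x, target)
numeric = numeric_grad(w, b, x, target)

if __name__ == "__main__":
    print("analytic:", analytic)
    print("numeric :", numeric)
```

a full network does the same thing layer by layer, reusing the `dloss_dz` of each layer to get the gradient for the layer before it.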
alr. thanks
nothing else?!
titanic = sns.load_dataset('titanic')
nothing else what?
Hi, i'm trying to fine tune mistral using a custom dataset, but it is not working... im getting this error validation is not defined, but i defined it . https://github.com/huggingface/transformers/issues/29966
Does anyone know how to fix this?
Hello, please always give text as actual text. not screenshots.
if you get an import error of the form "cannot import name x from y", it means that you do have y installed, but y doesn't have x
langchain is actively developed, so this might be a version mismatch.
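if you want to check which case you're in, something like this works with just the standard library. `explain_import` is a made-up helper name, and it's only a rough check: a submodule that hasn't been imported yet can look "missing" to `hasattr` even though `from y import x` would work.

```python
import importlib
from importlib import metadata


def explain_import(module_name, attr_name, dist_name=None):
    """Say whether `from module_name import attr_name` should work, and if not, why."""
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError:
        return f"{module_name} is not installed"
    if hasattr(module, attr_name):
        return "ok"
    try:
        # The pip distribution name can differ from the import name.
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        version = "unknown"
    return f"{module_name} is installed (version {version}) but has no {attr_name}"


if __name__ == "__main__":
    print(explain_import("math", "sqrt"))
    print(explain_import("math", "no_such_name"))
```

once you know the installed version, you can compare it against the version the tutorial was written for.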
Is there any way to fix it?
are you following a tutorial, or what?
yeah
please link the tutorial
https://www.youtube.com/watch?v=Iyh6ftlZ2Q0 im using this tutorial and just googling a bit to try and give my llm a search tool to use
I don't see that import statement in the tutorial. what did you read that gave you the expectation that it would work?
wait i meant this tutorial https://www.youtube.com/watch?v=QI3HrPz7ZlI
that video is six months old. so you'll have to figure out what the newest version of langchain was when that video came out
looks like it was this one https://github.com/langchain-ai/langchain/tree/v0.0.283
so you'd need to do pip install git+https://github.com/langchain-ai/langchain.git@v0.0.283
if that doesn't work, copy and paste the entire error message as text.
@fathom tide
@serene scaffold thanks but im looking for a different way with the newest version
okay
@final kiln what database tech have you been using?
need to start thinking about organizing my logs, metrics and data lol
In which context
For logging training metrics and such I've been using MLFlow, which I connected to a managed postgresql db in aws
For vector db, I've so far used OpenSearch and Qdrant - and I recommend Qdrant for sure, hands down
Tho there's potential benefit with using postgresql for vector db, because you can get a managed solution in AWS
Postgres has a vector extension (pgvector), but I haven't used it
Qdrant also has managed
But it's a smaller and more recent company so there's more risk
And for normal stuff I've used MySQL
hm i see. you heard of Redis? unified model apparently for both vector + sql
thanks for comments btw
Yeah there's redis too
I use redis a lot and it's really good, so far it may actually be the best thing I've used
Like it has never given me trouble
I just set it up and I forget it exists
You don't get that a lot, OpenSearch for example is a pain in the butt, I had so much trouble with it
MySQL is very good too ofc, but I think it's needlessly complicated to setup replication and other advanced stuff
Tho you gonna wanna go for managed at that point
hm but then why don't you switch to redis-only?
Redis is in memory db, it wasn't designed to persist data, it also is noSQL
Tho I think you can set it up to persist data
It's also very useful to use for locking your processes cuz it's single threaded
it has an sql module
Didn't know that
yeah it also has a couple of persistence options
I mean if it does SQL I might try to use it
am wondering whether I should go full redis or learn the other techs individually
I built this huge multi container application
I use:
- Postgres
- Optuna
- MinIO
- MLflow
- Tensorboard
hm no overlap between MLflow and Tensorboard?
different use cases
never heard of MinIO, will look it up
It's basically self-hosted, S3-compatible object storage
good to know