#data-science-and-ml
Yeah, the MSE and the variance are the same under the condition that the estimator is unbiased
specifically, the variance of the error and the MSE of the error
Idk if you'll ever do reinforcement learning @spring field but there understanding and working with the bias of estimators has practical tradeoffs
Whereas in stats you can argue it's a bit fuzzy, unless you're making new methods and you want them to have certain sane properties
i'd also suggest to take some time to digest the idea. i've found a lot of people struggle with it
Honestly depends on what you want to do though
It's something I had to learn in statistics and it goes into the bucket of things I never use
I will, in fact, I have already, lol
i would say you do use it. this is one of the nails in the coffin of trying to say whether your model is good by looking at the values of the MSE, a mistake many people make
Maybe having passive knowledge of it helped me? We can never answer that question
There are different ways to arrive at this conclusion
certainly
we did recently submit a paper exactly on this matter btw
since it turns out a lot of results are misinterpreted in papers
the MSE do be tricky
MSE of the variance? like how far off the predicted variance is from the true one? isn't that just square of their difference? not like you can have multiple variances, can you?
In your niche it's very true / important but I'm curious how many PhDs can answer this coherently. That doesn't make it right though
But you get my point
you started off right. since your population sample is chosen "at random", that makes your estimated variance also a random variable with its own mean and variance
yes, that's understandable, I just noticed the formulas are similar enough
but I'm glad I started this whole educational thing rn :p
you can also compute its MSE
If there's one thing I've learnt is that if you stare at any concept long enough you start to see it's more nuanced than you initially thought
that's probably the case tbh, i need to talk to real people at some point
1
variance of the error means using the mean of the error as well, right?
will do
well, not necessarily
you could directly compute the variance
you could compute it through the MSE by doing a bias-variance decomposition though
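A rough sketch of the decomposition being discussed, MSE = variance + bias^2, using only the standard library. The population (Gaussian with variance 4), the sample size and the divide-by-n variance estimator are assumptions picked for illustration, not something from the conversation. The identity holds exactly for the empirical quantities, and the bias comes out negative because dividing by n underestimates the variance.

```python
import random
import statistics

random.seed(0)

TRUE_VAR = 4.0   # variance of the population we sample from (std = 2)
N = 5            # small sample size, so the bias is visible
TRIALS = 5_000

def biased_variance(sample):
    """Variance estimator that divides by n (biased low)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

# Each trial draws a fresh random sample, so the estimate is itself random.
errors = []
for _ in range(TRIALS):
    sample = [random.gauss(0.0, 2.0) for _ in range(N)]
    errors.append(biased_variance(sample) - TRUE_VAR)

mse = sum(e * e for e in errors) / TRIALS
bias = sum(errors) / TRIALS
var_of_error = statistics.pvariance(errors)

print(f"MSE               = {mse:.4f}")
print(f"variance of error = {var_of_error:.4f}")
print(f"bias^2            = {bias ** 2:.4f}")
print(f"variance + bias^2 = {var_of_error + bias ** 2:.4f}")
```

With an unbiased estimator the bias term vanishes and the MSE and the variance of the error coincide, which is the earlier point in the thread.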
Notions of what an unbiased estimator is, is kind of important ye
mmm, I have realized that the loss doesn't have any particular meaning aside from smaller is better
"small number good"
Here at least comp sci got less meaningful statistics than us
Econometrics was the best course I took in terms of statistical modelling
not yet, I kinda dropped out of the ME course I was doing, cuz... reasons
anyway, I'm planning on going to a different uni now for sth in cs, ds, stats, ml/ai, applied math, not entirely sure quite yet, so I'll probably have stats, yes
You're just self learning all this? You're not enrolled right now? If that's the case you'll be perfectly fine
The economics side of it is very boring but econometrics teaches you all the pitfalls of modelling in practice
Many of my comp sci peers didn't take it and were a lot worse at modelling
No ML course teaches you about heteroscedasticity, predicted vs actual plots, analysing residuals, omitted variable bias, multicollinearity, ...
It's just about ML models and nothing more
Which is a mistake imho
not self learning exactly, lol, it's a... special course on ML, unrelated to uni, but the stats and some other stuff I'm covering mostly myself ig, yeah
Maybe multicollinearity is one that is touched on
To motivate why you need regularisation
these should kinda be prereqs even, not even part of ML courses
I agree-ish
you activated my trap card
I'm honestly bad at giving precise definitions of things related to math
I can give you a hand wavy one
The only context in which we spoke about it in ML was ill-posed problems
most people would mention the explicit ones like multiplying or adding a term that makes the problem easier in some sense
Like a matrix where one row is a linear combination of the others
fishing out a particular solution, changing the type of convexity, etc
Something something about the rank
roughly anything that makes a problem easier by restricting the solution space
By adding an additional regularisation term you can solve it uniquely
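A small sketch of that point, assuming a least-squares problem with two identical feature columns (made-up numbers). Without regularisation the normal equations are singular; adding `lam * I` (the Tikhonov / L2 term) makes the determinant non-zero, so there is exactly one solution. The helper names here are illustrative only.

```python
# Two identical feature columns: a classic rank-deficient design matrix.
X = [[1.0, 1.0],
     [2.0, 2.0],
     [3.0, 3.0]]
y = [2.0, 4.0, 6.0]

def normal_equations(X, y, lam=0.0):
    """Return (A, b) for (X^T X + lam * I) w = X^T y with two features."""
    a11 = sum(r[0] * r[0] for r in X) + lam
    a12 = sum(r[0] * r[1] for r in X)
    a22 = sum(r[1] * r[1] for r in X) + lam
    b1 = sum(r[0] * t for r, t in zip(X, y))
    b2 = sum(r[1] * t for r, t in zip(X, y))
    return [[a11, a12], [a12, a22]], [b1, b2]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def solve2(A, b):
    """Solve a 2x2 system with Cramer's rule (needs det != 0)."""
    d = det2(A)
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / d,
            (A[0][0] * b[1] - b[0] * A[1][0]) / d]

A_plain, b = normal_equations(X, y)
print("det without regularisation:", det2(A_plain))   # 0.0, no unique solution

A_ridge, b = normal_equations(X, y, lam=0.1)
w = solve2(A_ridge, b)
print("det with lam=0.1:", det2(A_ridge))
print("ridge solution:", w)   # the weight is split evenly between the twin columns
```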
I feel like "what is regularisation" is a question you can answer in 5 different ways
Especially if you're asking for practical applications, how to interpret it and so on
i also didn't like it right now, i was gonna add "or changing its geometry"
I could talk about L2 regularisation being a Gaussian prior in the Bayes world and so on
now define "capacity"
You can add noise to all your input, that's regularisation
And it solves this problem
that would be a restriction of the parameter space
making it low dimensional
(it's not the only kind, but that works)
tikhonov walks into the room
Ah yes hearing "Tikhonov" sends me back to the university auditorium
L2 and L1 don't actually do that though, those aren't restrictions. they change the geometry
that was my beef with it
if anything L2 reg increases the dimension of the parameter space to make it smoother
as zestar says, increasing the rank of the matrix
more degrees of freedom
yes but that's not a restriction, as you pointed out before
you can explicitly do this though, e.g. with how you complain about needing to use rectangular matrices instead of low rank square ones
THAT is a restriction
an additive term changes geometry and promotes a behavior, but does not restrict it
that's exactly it. and the intuition is indeed correct
but a restriction eliminates the possibility of something happening
i bring it up because a restriction is something properly defined
i would be surprised if you could in the ML context
Why is early stopping considered regularisation
cuz we don't have a whole lot of guarantees for ML
Is there any basis for it?
i recommend looking at fisher information (this is what i study), which measures the information models carry about their parameter spaces. it's related to entropy, but they're not the same
i guess in the sense they prevent overfitting
Then the definition is reduced to "thing that prevents over fitting"
i would interpret it as wanting a low dimensional solution
overfitting means your parameter space has a structure that contains even the noise, which is higher dimensional than the original parameter space (remember the previous noise discussion)
early stopping would keep your parameter space lean before you go on to fit the noise, so something like using a subspace of the full parameter space of your model
(not necessarily a vector space)
sure
Okay this is a nice way to look at it
the whole discussion on "dimension" is pretty powerful
linalg too stronk
yeah smth like that
dimension accounts for it automatically
It's 2024 and I still haven't learnt how to use stan
treating it like a manifold with charts of dimension <= d might work
there's a reason topological methods are hot stuff
stop when d starts to increase
i think incorporating time is a problem tbh
referring to deciding after how many iterations you have reached a solution, not to time
you can make the analogy, but you may inadvertently introduce misconceptions on how the optimization process works
if you compose a function with itself 10 times, where is the time axis?
gradient-based methods are like fixed-point iterations. you compose a function with itself N times and hope you ended up at a good solution
choosing to attach the number N to time is an arbitrary choice
which is an arbitrary choice
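To make the "compose a function with itself N times" picture concrete, here is a minimal sketch on a toy objective f(w) = (w - 3)^2. The objective, learning rate and helper names are assumptions chosen for illustration. The minimiser is a fixed point of the step map, and stopping at a small N simply returns an iterate that has not reached it yet.

```python
def grad_step(w, lr=0.1):
    """One gradient step on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)."""
    return w - lr * 2.0 * (w - 3.0)

def compose(g, n):
    """Return g composed with itself n times."""
    def g_n(w):
        for _ in range(n):
            w = g(w)
        return w
    return g_n

w0 = 0.0
for n in (1, 10, 100):
    # N is just the number of compositions; nothing here needs a time axis.
    print(n, compose(grad_step, n)(w0))

# The minimiser w = 3 is a fixed point: applying the step leaves it unchanged.
print(grad_step(3.0))
```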
Is there an equivalence for gradient descent with early stopping when you have a closed form solution?
time need not be how you parametrize the curve the fixed point iteration traced
i don't think so. you'd need an explicit regularizer if it's done in one step
like L2 or L1
in that case you can formulate regularization as a bias
lead me astray daddy, but do it gently
idk why i said that but i think it captures the sentiment
It's kind of nasty though, not like L2/L1
this is fine, but one can make the argument that we use math to model things outside our standard intuition too
I use it because it's simpler to implement and reason about
that motivates the abstract, axiomatic approach of maths from the last 100 years
it's more powerful
(and less intuitive)
idk about clearly, that very much seems to be your preferred flavor only because you studied something related to physics
there's a distinction between what is actually happening and how you choose to interpret it. the way you interpret it may bring limitations and introduce misconceptions
certainly
a lot of momentum techniques, as the name implies in the first place, are studied exactly as you say
i think some of the popular interpretations of nesterov acceleration have something to do with damping of a spring or something like that
just as there are other studies that use none of that
yeah needs some statistical flavor on top
that's essentially it, yeah? you have a network that spits something out and then you evaluate it in a loss function that maps to R
so the composition of network and loss function is a functional acting on the inputs
but then you have a chicken and egg problem since calc of vars exists separately from physics
it's also there if you include the data, since now you have to include expectation operations
every calculus student: wdym integrals are easier than sums
hilbert spaces are already directly connected to ML
R^n and C^n are hilbert spaces
the complex number part is usually kinda trivial, since you don't need the structure of C^n. cost functions f:C^n -> R are anyway not complex analytic
so you study them by isomorphism to R^2N (wirtinger calculus or splitting real and imag)
not really "larger" anyway though
since complex floats are internally represented as 2 float 64s
so internally what you have in the computer is also R^2N with special multiplication and addition structure
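A tiny sketch of the "R^2N with special multiplication" view for a single complex number (N = 1). The function names are made up for illustration; the point is only that complex multiplication can be written with real arithmetic on (re, im) pairs, and that a real-valued cost on C is just a function of those two real coordinates.

```python
def as_pair(z):
    """View a complex number as a point (re, im) in R^2."""
    return (z.real, z.imag)

def pair_mul(a, b):
    """Complex multiplication written purely with real arithmetic."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def cost(z):
    """A real-valued cost f: C -> R; here |z|^2 = re^2 + im^2."""
    return z.real ** 2 + z.imag ** 2

z, w = complex(1.0, 2.0), complex(3.0, -1.0)
print(as_pair(z * w))                      # built-in complex product
print(pair_mul(as_pair(z), as_pair(w)))    # same numbers from the R^2 view
print(cost(z))
```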
I have a question and for which I will have to post a picture is that okay
sure
Hi, is this any good? Feel free to recommend, thanks. I just recently finished a basic python course and wanna jump into these stuffs. Im a complete beginner btw
might wanna practice Python for a bit first
Alright thanks
Read the ISLR textbook and just look up stuff you want to learn. Anytime there is some sort of crash course, it is usually not good. Data School is good for beginners.
ngl, but features and channels mean roughly the same thing to me, if you could provide some additional context, like that last task mentioned or a sample of the dataset or sth
If you have a multivariate time series you have n channels over t time points. Each channel has a measurement for each t ≤ T. If you "flatten" it and give it to say a traditional ML model you have T * n features
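A quick sketch of that flattening with plain lists. The sizes (4 time points, 3 channels) and the values are made up for illustration.

```python
T, n = 4, 3   # 4 time points, 3 channels (illustrative sizes)

# series[t][c] is the measurement of channel c at time point t
series = [[10 * t + c for c in range(n)] for t in range(T)]

# Flatten time and channel into one long feature vector of length T * n
features = [value for row in series for value in row]

print(len(features))   # T * n
print(features)
```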
so it's 6 params each embedded in 38 dimensions? over time
sth like
(batch, sequence, features, channels)

why is scipy.optimize never used? Are there just better methods? I know hyperparameter tuning is better, but, yeah.
Used for what exactly?
I use it infrequently
optimization. I remember a couple of years ago, I used it to optimize some residual for lasso regression. I do not know, ML is optimization heavy.
For that you can just use Lasso or ElasticNet directly in sklearn
I had to use it years ago in grad school
he wanted us to do it that way for some reason
scipy optimize will either use heuristics for the gradients and hessians, or require you to provide them explicitly
Nothing wrong with that. It just requires a bit more steps than just using sklearn. I use scipy directly when I have to
you use them when your optimization problem is nice and easy to formulate explicitly
Hessians man.
with deep learning this is not the case, so you use something different
That is when there is more than one variable to optimize, right? I do not remember
no
what is it
hessians? or which part?
How do they fit lasso again? Coordinate descent?
yeah, second order partials: [f11,f12] [f21,f22]
that's an option. usually some form of iterative shrinkage/thresholding with (possibly block coordinate) descent
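For reference, a bare-bones sketch of coordinate descent with soft-thresholding for the lasso, in pure Python. It assumes the objective (1/(2m)) * ||y - Xw||^2 + alpha * ||w||_1 and uses toy data; it is not meant to mirror what any particular library does internally, and real implementations are far more careful about convergence checks and speed.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values inside [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso_cd(X, y, alpha, n_sweeps=200):
    """Minimise (1/(2m)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    m, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_sweeps):
        for j in range(d):
            # Correlation of feature j with the residual that ignores feature j
            rho = 0.0
            for i in range(m):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= m
            z = sum(X[i][j] ** 2 for i in range(m)) / m
            w[j] = soft_threshold(rho, alpha) / z
    return w

# Toy data: y depends only on the first feature; the second is irrelevant.
X = [[1.0, 0.5], [2.0, -0.5], [3.0, 0.5], [4.0, -0.5]]
y = [2.0, 4.0, 6.0, 8.0]
print(lasso_cd(X, y, alpha=0.1))
```

The thresholding step is what sets the irrelevant weight to exactly zero, while the useful weight is shrunk slightly below its least-squares value of 2.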
youngs theorem
you can do it for the case with a single variable as well
How?
you differentiate twice
The last time I used scipy was when I handrolled some time series methods because I don't like most implementations
that is just a second derivative of one variable
well, if there is only one variable, that's all you have
the hessian is the jacobian of the gradient. in the univariate case, that's just the second derivative of the one variable
in more dimensions you get also the cross derivatives
can somebody help me make an object detection ai
ok but i want to create my own
yeah u can do it
ok i tried to download yolo but i dont know what file to choose on github
u gonna use yolo to train ur own ai model
finally having actual success
u can search on kaggle
instead on github
what do i do to install it
check my github the latest repo car detection one
u will understand
what is your github
its in my profile
you need to know the signs of so example: f12 = f21; right? This was a while ago.
i don't understand your question
the cross partials, like, f11*f22 - (f12)**2 > 0; if f11 < 0 then it is at a maximum, and then f22 < 0 as well. in order for the test to hold, the signs need to be known. And that is only known by knowing if f12 = f21 or something
for functions with continuous second order partial derivatives, the order of differentiation does not matter and the hessian is immediately symmetric
what one looks for is whether all of the eigenvalues of the hessian are positive or negative
show me
f12 does not always equal f21
no, not always, but many functions you deal with in optimization do work this way
particularly under the condition i mentioned above
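A rough numerical sketch of the eigenvalue check, assuming a smooth toy function of two variables (so the mixed partials agree and a single cross term is enough). The finite-difference step `h` and the helper names are illustrative choices.

```python
import math

def f(x, y):
    return x ** 2 + x * y + 2.0 * y ** 2   # smooth, so mixed partials agree

def hessian(f, x, y, h=1e-4):
    """Central finite-difference Hessian of a two-variable function."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return [[fxx, fxy], [fxy, fyy]]   # f12 = f21 is assumed (continuous second partials)

def eigenvalues_sym2(H):
    """Eigenvalues of a symmetric 2x2 matrix from its trace and determinant."""
    tr = H[0][0] + H[1][1]
    det = H[0][0] * H[1][1] - H[0][1] ** 2
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 - root, tr / 2 + root

H = hessian(f, 0.0, 0.0)
lo, hi = eigenvalues_sym2(H)
print(H)
print(lo, hi)   # both positive here, so (0, 0) is a local minimum
```

Both eigenvalues negative would indicate a local maximum, and mixed signs a saddle point.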
Gotcha. Sorry, it has been a while and I forgot. The Jacobian determinant is just to make sure the IFT holds or something, and the Jacobian is a matrix of first order derivatives, right? and the determinant cannot = 0
My model is giving me a recall score of 1
Yet other methods (accuracy, precision, f1 score) are showing accuracy of around 95
That's very possible
A recall of 1 means that you classified every actual positive as being of the positive class
Is that not over fitting of data?
But you can have false positives, which impacts your precision, accuracy and f1 score
Not necessarily
Imagine your data is 50/50 distributed between + and -
In your case you have 51/49 for example
It is 45/46
All those that are + were classified as + but you misclassified a few extra as + that should've been -
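A small worked example of that situation with made-up labels: every real positive is caught (recall 1), but a few negatives are flagged as positive, which pulls precision, accuracy and F1 down to about 0.95.

```python
# Hypothetical labels: 1 = positive, 0 = negative
y_true = [1] * 50 + [0] * 50
# Every real positive is predicted positive, plus 5 negatives wrongly flagged
y_pred = [1] * 50 + [1] * 5 + [0] * 45

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

recall = tp / (tp + fn)
precision = tp / (tp + fp)
accuracy = (tp + tn) / len(y_true)
f1 = 2 * precision * recall / (precision + recall)

print(f"recall={recall:.3f} precision={precision:.3f} "
      f"accuracy={accuracy:.3f} f1={f1:.3f}")
```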
then this shouldn't be too hard to figure out, look at the data
datacamp it worth for learning data science and practising projects
Are you asking a question?
is it bad that the average minimum fitness of each generation is staying relatively constant?
Topics in ML that are the most advanced, what would you guys/girls, say?
What are these generations? This generation or whatever would dominate all previous generations in terms of fitness by miles
Wdym?
Thx. After each training generation I just wrote the max, mean and min fitness of that generation to a SQLite db and then used matplotlib to plot it and scipy to make the smoothed curve
lol
Honestly I have never used dataframes before.
Any of you into Game Theory? I heard it was used in RL. Kind of obsessed with Game Theory and IO. Like, I have heard RL borrows from Game Theory, how?
Any ideas how i can optimize an onnx model for arm64 architecture?
I want to webscrape IMDB top 1000 movie subtitles to classify them using NLP. Any suggestions as to how to proceed. There are permissions issues I am running into. I'd also be ok with using a dump of these files. Thank you.
I need to speed up my model object detection for raspberry pi devices
Hi guys does the word2vec need sentences or is it fine even if we feed only the words?
I did a bit of game theory in my bachelors. I think they're quite different because the classical RL setting just has a single agent whereas game theory is more of a multi-agent thing. When you have multiple agents there are game theoretic insights.
Another way you can look at it is from this perspective: in RL you have to learn the optimal policy and game theory actually gives you an answer to that question in the form of a nash-equilibrium (wherever present)
yeah
What's the issue? Is this a scraping question or a processing question?
Theres no details here. Whats slow? Why is it slow? What is your code? Etc.
So basically im running an object detection script on a machine without a GPU so it runs on cpu. What more details do you need?
That's like zero details. So I guess explain more?
Is it possible to use lightweight charts and yfinance together?
oh god, i just downloaded a dataset from kaggle, to make a project out of it. im not able to import a simple excel file
May you share the code?
sure hold on
Is the file name correct? (Try removing the whitespace of your filename)
/ is the file in the current working directory
On my end works, considering the correct file path
it looks like this
Okay, is this file in the same directory as your notebook
i have to pip it oh oh ok
don't forget the !pip
so everything is good with my file name?
And if you have a virtual environment, use python -m if you are inside your shell
there's no white space?
what does that mean
File name should be fine, just wasn't sure how pandas reacts to whitespaces xD
oh
Okay my sr. data scientist told me that recently. If you do a normal pip install everything will land in your base python
This is bad, since some libraries might crash with different versions
so we use venvs (virtual environments).
Those are isolated environments where all libraries for your project are separated from the base python and other project environments
(At least this is how I understood it)
oh
idk if that is the real technical answer but it has worked on my end since then
does it work now?
im getting the same error
Hm ok.. wait let me check something really quick
k
You are on windows, right?
FileNotFoundError: [Errno 2] No such file or directory: 'salaries (2).csv.csv'
ya
oh k lemme check
Game theory has sequential-move games (like chess), where the second player waits for player 1 to move and then acts based on that move. Simultaneous-move games are when both players act without knowing what the other player is doing. These games have a payoff table (the most common kind of game). A pure-strategy Nash equilibrium is when no player can get a better payoff by changing only their own move, given what the other player(s) have done. There does not have to be exactly one Nash equilibrium: there can be no pure-strategy Nash equilibrium, and there can be more than one. This is just the basic groundwork of game theory. Game theory becomes nonsense hard when it comes to auctions. Auctions are extremely hard. I digress. The equilibrium concept in sequential-move games is the subgame-perfect equilibrium. There are first- and second-mover advantages. And sequential and simultaneous-move games can be combined. There are also mixing games, mixed-strategy Nash equilibria, Bayesian Nash equilibria, signaling games (hard), the infinitely repeated prisoner's dilemma (my favorite), Bertrand pricing games, Cournot games, voting games, incentive design (principal-agent), and collective action games. 'Agents' are players. I do not know. My friend is in RL and would ask me for game theory advice. Game theory is incredibly underrated and it is not a joke.
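A minimal sketch of checking for pure-strategy Nash equilibria in a simultaneous-move game written as a payoff table. The payoff numbers are illustrative (a prisoner's dilemma and matching pennies), and the function name is made up here.

```python
# payoffs[(row_action, col_action)] = (row player's payoff, column player's payoff)
# Prisoner's dilemma with illustrative numbers (higher is better).
ACTIONS = ("cooperate", "defect")
payoffs = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"):    (-3,  0),
    ("defect",    "cooperate"): ( 0, -3),
    ("defect",    "defect"):    (-2, -2),
}

def pure_nash_equilibria(payoffs, actions):
    """Return all action pairs where neither player gains by deviating alone."""
    equilibria = []
    for r in actions:
        for c in actions:
            row_ok = all(payoffs[(r, c)][0] >= payoffs[(alt, c)][0] for alt in actions)
            col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, alt)][1] for alt in actions)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria

print(pure_nash_equilibria(payoffs, ACTIONS))

# Matching pennies: a game with no pure-strategy equilibrium at all.
COINS = ("heads", "tails")
pennies = {
    ("heads", "heads"): (1, -1), ("heads", "tails"): (-1, 1),
    ("tails", "heads"): (-1, 1), ("tails", "tails"): (1, -1),
}
print(pure_nash_equilibria(pennies, COINS))
```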
do i have to add the file name right after i have pasted the whole path to that file?
So you mean something like this C:\path\to\ file (2).csv ?
ya
There is no space after the \ afaik
nope
im so sorry, i feel so embarrassed
nah dw
We all have been there
- I am not an expert either. Literally in my 2nd semester xD
omg i did it
how??
i got it done
Awesome
Nono you are all good
I initially learned Java before my uni and now I am in my 2nd semester of python so I am new as well
What was the fix btw?
i made a stupid mistake, i didnt add 'r'
Happens xD
I literally got roasted by Java sometimes for not adding a ; in line 42
see it positively: you will never forget the r
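For anyone hitting the same thing, a short sketch of what the `r` prefix changes, assuming the fix above refers to a raw string. The path is hypothetical and was chosen so that the backslash sequences happen to be valid escapes.

```python
from pathlib import PureWindowsPath

# Hypothetical path, chosen so the problem is visible: "\t" and "\n" are escapes.
plain = "C:\temp\new_salaries.csv"    # "\t" becomes a tab, "\n" a newline
raw = r"C:\temp\new_salaries.csv"     # raw string keeps every backslash

print(repr(plain))
print(repr(raw))

# Alternatives that avoid the issue: forward slashes or pathlib
also_fine = "C:/temp/new_salaries.csv"
print(PureWindowsPath(raw).name)
```

With the plain string, the file lookup is done on a name that contains a tab and a newline, which is why the file is reported as missing even though it exists.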
how do i proceed with ML if i've learnt numpy, pandas, matplotlib and scikit learn?
What helped me was doing a data visualisation project
I analysed the top 1000 streamers world wide and ran some statistics on a public dataset.
So my go-to answer would be:
1. get some dataset
2. Try to visualise the data
3. Upload to git
btw i like matplotlib as well, however plotly express is perfect for interactive elements
thanks!
Could you give like exact info what details do you need? Code? Model?
All of those, yes.
https://universe.roboflow.com/roboflow-universe-projects/license-plate-recognition-rxg4e i use this model converted to onnx using onnx cli
!pastebin
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
https://paste.pythondiscord.com/LD3A thats the code which i use for object detection
```py
if not frame_queue.empty():
    frame = frame_queue.get()
    # cv2.imshow("Frame", frame)
    height, width = frame.shape[:2]
    processed_image = prepare_image_input(frame)
    plate = detect_closest_license_plate(
        session, processed_image, height, width, app_logger
    )
```
Here is how i trigger that function for camera stream
How can I send you my model in onnx format to not get message from @arctic wedge ?
Some model info from https://netron.app/
Visualizer for neural network, deep learning and machine learning models.
@serene scaffold what else do you need?
does anyone have a really good guide to NLPs? I have been working on some, but I just want a good guide
one that is not outdated
Nothing is "an nlp". No one talks about "NLPs" to refer to programs.
What specific thing do you want to make?
ok, so like, CountVectorizer, TfidfVectorizer, when one starts the train/test split, I do not know, I get this one error all of the time. I do not know if it is because I used TfidfVectorizer instead of CountVectorizer or vice versa
There isn't an overarching guide to all of nlp because it's a broad category of AI. Whatever you're working on is concerned with a small subset of what is considered to be nlp.
If you need help in relation to an error message, be sure to always always show the whole error message
How broad? and this:
UnimplementedError: Graph execution error:
#skipping error
Node: 'binary_crossentropy/Cast'
Cast string to float is not supported
[[{{node binary_crossentropy/Cast}}]] [Op:__inference_train_function_21541]
I'm not sure how to quantify how broad it is.
It looks like you have some data structure that's supposed to contain only numbers, but it contains strings.
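A minimal sketch of the usual remedy, assuming the strings are class labels being fed to the loss (the conversation does not confirm which column holds them, and these label values are made up): map each class name to a number first.

```python
# Hypothetical string labels, as they might come out of a CSV column
labels = ["positive", "negative", "negative", "positive"]

# Map each class name to a number before handing the labels to the loss
classes = sorted(set(labels))                 # ['negative', 'positive']
class_to_index = {name: i for i, name in enumerate(classes)}
y = [float(class_to_index[name]) for name in labels]

print(class_to_index)
print(y)
```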
I will figure it out. It always happens during one step. I do not know, it is confusing me
I'm on mobile, so I can't dig in more at the moment. You might make a reproducible example of the problem you're having
alright, will not be hard to do, thank you.
it is this, every time. Sorry, not trying to bother you. This is the error, and this is the snippet of code I am talking about:
    raise ValueError(Errors.E1041.format(type=type(doc_like)))
ValueError: [E1041] Expected a string, Doc, or bytes as input, but got: <class 'list'>
def lemmatize_reviews(df):
    df['Quote'] = df['Quote'].apply(lemmatize_text)
    return df
from nltk.tokenize import word_tokenize
def lemmatize_text(text):
    doc = nlp(text)
    lemmatized_text = ' '.join([token.lemma_ for token in doc])
    return lemmatized_text
def do_tokenization(text):
    token_words = word_tokenize(text)
    return token_words
df['Quote'] = df['Quote'].apply(do_tokenization)
from sklearn.feature_extraction.text import TfidfVectorizer,CountVectorizer
tfid = TfidfVectorizer(preprocessor=do_tokenization)
df['Quote'] = df['Quote'].apply(lemmatize_text)
df['Anime'] = df['Anime'].apply(lemmatize_text)
cv = CountVectorizer()
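One plausible reading of the E1041 error, sketched with stand-ins so it runs without spaCy or NLTK: the snippet tokenizes `df['Quote']` into lists first and only then calls `lemmatize_text`, so the pipeline call receives a list instead of a string. `toy_nlp` below is a hypothetical placeholder for the real `nlp(...)` call and only imitates the "reject non-string input" behaviour; the sample sentence is made up. Separately, `TfidfVectorizer` has distinct `preprocessor` and `tokenizer` parameters, so passing a tokenizing function as `preprocessor` may also be worth a second look.

```python
def toy_nlp(text):
    """Stand-in for a spaCy pipeline call: like spaCy, it rejects list input."""
    if not isinstance(text, str):
        raise ValueError(f"Expected a string, but got: {type(text)}")
    return text.lower().split()

def lemmatize_text(text):
    # A real pipeline would return lemmas; the stand-in only lowercases.
    return " ".join(toy_nlp(text))

def do_tokenization(text):
    return text.split()

quote = "The quick brown foxes"

# Order used in the snippet: tokenize first, then lemmatize -> fails on a list
try:
    lemmatize_text(do_tokenization(quote))
except ValueError as err:
    print("fails:", err)

# Lemmatize the raw string first, tokenize afterwards
tokens = do_tokenization(lemmatize_text(quote))
print(tokens)
```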
!code could you please format that
print("Code")
Hii, can i ask for simple help on a Collaborative filtering question?
Yes. Be sure to always ask a complete question that someone can start answering.
" implemented sklearn library for cosine" ???? It sounds like you implemented that class in sklearn
As in the source code
yes sorry
Hi no worries sorted my issue my question was just dumbly asked
PER in RL sort of seems similar to DML and triplet loss where you try to get the furthest same class and closest other class to your current thing you're checking to sort of push away the other class and bring the same class closer
similarly PER should enable both retrieving low score actions in states thus making it less likely to make such actions in the future (pushing it away) and bring actions that had greater rewards closer (if making an analogy for triplet loss)
Hey everyone!
I'm diving back into programming after a bit of a break and planning to create a stock market prediction ML model. I've got some basic ML knowledge from Kaggle, but I'm looking for some buddies to learn and build this together. Programming alone can get pretty dull, and having friends makes it so much more fun! If you're interested, hit me up! Let's make this project awesome together!
what concepts do i need to learn in calculus for machine learning? Already learnt integration and differentiation, what else?
do i need to learn each and every topic for ml?
not backend engineering, ig i would need to learn different different languages for that.. currently im planning for research
research of what kind
the applied approach requires less maths, but stuff like designing new architectures and studying their properties requires a fair amount of math
all of those are made up terms trying to keep up with the tasks people end up doing at work, idk how meaningful it is to distinguish them strictly when companies don't
i don't think the experience will be much different in most places
do you have phd? @final kiln
Have you looked what europe-based ml programs for msc/phd can offer?
actually, im still in high school. currently want to learn ml as a hobby so in future i can become data scientist. and im even preparing for one of the toughest exams in the world (jee; you might know about it), i believe it consists of some of the concepts of calculus, it'll help building my foundation and then I'll start with advanced topics
Does it makes sense to do phd in ml if u are not aiming for r&d roles?
I will be soon getting into it and I'll share what I hear/find w u
The term data scientist is so dead
ML engineer will perish soon as well
Nobody, it's going to be a game of revolving chairs
Data scientist nowadays means "does anything with data"
ML engineer will soon mean "does anything with ML", including calling openai's APIs
Just wait, I saw the same thing gradually happen to data science
Well, nowadays it just means "makes dashboards in powerbi" anyway
Well we have a dedicated powerbi team separated from the AI team and Data Science team. Unfortunately, some believe that AI = LLMs and LLMs = M365 Co Pilot
Yeah, I'm working exclusively on deep learning right now but I'm not sure if there will be any other project like mine after it's done
hey uh im new to ai i tried to make a classifier ai im not sure why its saying ball is a verb though any idea? i have a feeling its lack of data because im using not a lot of data but ??
https://paste.pythondiscord.com/YGAQ
JEEEEEEZ try a bow approach
I mean I am in my 2nd semester. I honestly haven't worked so much with that but what I would do is take random text
build bigrams
This is way easier to do with a library (let me search a good one)
https://maartengr.github.io/KeyBERT/guides/countvectorizer.html#basic-usage
Leveraging BERT to extract important keywords
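For the "build bigrams" step mentioned above, a minimal sketch in plain Python. The sentence is made up for illustration, and a library such as the linked one would normally handle tokenization and counting for you.

```python
from collections import Counter

text = "the ball is red and the ball is round"   # illustrative text
words = text.split()

# Pair every word with the one that follows it
bigrams = list(zip(words, words[1:]))
counts = Counter(bigrams)

print(counts.most_common(2))
```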
alright let me try tysm
pop!
Guys i got a question about a confusion matrix: why doesnt my image show the numbers, while they are shown when i print it?
like this
hard to tell without the code
yeah, I was thinking about its applications in RL as well
can you explain what exactly a data-visualisation project is? I dont get how its related to ML sorry
thx for replying the problem was solved, the seaborn lib must be 0.13.0
hola
im making a project on this type of data
can some one help me with what type of questions i should come up with to make a project about it?
you can do time series analysis with that data, maybe predict the salary with experience level
ok
is this a personal project or related to school?
personal project
okay that's awesome, keep working on it
thanks man. on to it haha
So making an AI that does one thing, e.g, guessing a country based on its shape, is relatively simple, but how much harder is making an AI that is able to do 2 things, e.g, can recognise countries based on shape, but can also recognise countries based on their flags?
hi everyone! i was looking for an all in one ml ecosystem, as lightning studio AI, does someone use it? How about speed? do you have any better recommendation for realtime inference?
its not that hard, you can use the same model you did for predicting using shape, u can make it predict using flag: same model, different dataset
No, I mean being able to do both at the same time
How much harder would it be to create a model that can do both?
what model are you serving?
context: im building a multi agent with langgraph.
But this multiagent contains multiples models. One of that, is a Vision model.
The other ones are simples llm inference endpoints.
The vision model, currently is an NVIDIA endpoint, but it takes too much time like 2-3 seconds. Im looking for deploy a self-hosted vision model to try to reduce the inference time.
i would recommend tensorrt + triton inference server
prod ready, fast, maintained well
it seems very cool, thanks for the suggestion! i'll take a look. It contains a grpc server, great!
yes we use grpc and have had good experience
hello everyone, i wanted to learn reinforcement learning so came here to ask suggestions for good resources available online
i have completed my college math and have a decent understanding of statistics, lin alg and calculus.
i have worked with simple ANNs like making it from scratch (shoutout to andrej karpathy for it). unfortunately i couldnt find any videos from him which teach RL so wanted your opinions on where else/who else can i watch for learning how RL works from scratch. im fine with papers as well (as long as they dont include any complicated math notations/vocabulary, as i still consider myself to be a beginner)
you can start here I guess https://medium.com/free-code-camp/an-introduction-to-reinforcement-learning-4339519de419
I did it by reading this book http://incompleteideas.net/book/the-book-2nd.html and implementing every algorithm they show pseudocode for
ahhh will check them out, thank you
The devil is in the details and just reading the text was not enough for me to understand the nuance
Especially when you start comparing on-policy vs off-policy methods etc
thank you so much
It was pretty fun. After writing this stuff from scratch (just numpy + matplotlib) everything became clear
I'm doing policy stuff now... it's a bit of a confusing topic to say the least and I don't understand why it's not converging
I mean it is, so maybe the issue is more with policy gradient not being efficient enough on its own 
it's worth to start from the beginning of the book imho
maybe i will understand the magical words used once i read the books
there's a very logical trajectory to all of this
im assuming i start with full pdf link?
beginning with policy gradient methods would've confused the heck out of me
Also, many of the algorithms are proven to converge under specific conditions
I mean, I didn't begin with policy gradients, I'm there now
like, on-policy + tabular => converges
off-policy + function approximation => can possibly not converge etc
nvm, I'm gonna just go and read that book, lmao
policy gradient should be on-policy though iirc
I'm just currently going over all those methods really, I think aim is to get familiar with a bunch of network types and then specialize later? the course I mentioned a couple days ago
How do you handle imbalanced data?
tbf, from a learning perspective, making multiple passes over learning material, going deeper and deeper on each iteration is a better method than just going one by one and trying to have a deep understanding immediately
by using weights, but I think it depends on what exactly you're doing, cuz apparently for say next token prediction you don't want to do that
doing nothing
if that's your distribution, that's your distribution
hmmm
Let's say my dataset has 20% "no" cases and 80% "yes" cases
9 times out of 10 I'd agree but the book is so well written I think it's the kind of text you can go through and just get it in a single pass
Should I take 20% yes cases only?
It has a logical structure that makes RL seem coherent, if you go method by method (as is actually not a bad idea for traditional ML) you're kind of losing out on that
I was more talking about the whole course that I'm doing where we don't really go thaaaat much in depth of all the models we cover
but if I'm gonna be doing some more RL afterwards, I'll definitely check out that book, bookmarked it
If your course touched on RL it's probably a really good one
Maybe we should pin it?
As a nice "overview" type of course
it's, uhh, a private one so to speak, but it is good, yes
fair enough!
Is Levenshtein distance a good option for spell checking if I have sequences of words in a database for search recommendations?
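For reference, here is a minimal sketch of what a Levenshtein-based suggestion lookup could look like. The `suggest` helper, the `max_distance` cutoff and the stored phrases are all made up for illustration, and a linear scan like this only makes sense for a small set of phrases.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(query, candidates, max_distance=2):
    """Return candidates within max_distance edits, closest first (hypothetical helper)."""
    scored = [(levenshtein(query.lower(), c.lower()), c) for c in candidates]
    return [c for d, c in sorted(scored) if d <= max_distance]


# Hypothetical stored search phrases.
phrases = ["machine learning", "deep learning", "reinforcement learning"]
suggestions = suggest("machne learning", phrases)
```

Whole-phrase distance punishes long queries, so comparing word by word might work better in practice.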
I'll link this again as what you should do with unbalanced data
https://scikit-learn.org/stable/auto_examples/model_selection/plot_cost_sensitive_learning.html
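A minimal sketch of the "use weights" idea mentioned earlier, using the 20% / 80% split from the question. The formula `n_samples / (n_classes * count)` is the one scikit-learn uses for `class_weight="balanced"`; the helper name and the toy labels are just for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels matching the 80% yes / 20% no example.
labels = ["yes"] * 80 + ["no"] * 20
weights = balanced_class_weights(labels)
```

Here the minority class ends up weighted four times as heavily as the majority class, and no rows are thrown away.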
this is true but typically not the case, at least in applications where people talk about unbalanced data
alright, suppose it's a classification task, you'd want to handle class unbalance there for sure, right?
There's also the misconception that the method will just ignore the minority class in favour of the majority
That's not necessarily true
no
This example is classification
I can understand that specific example
yeah I agree on it too but the thing is
I wouldn't want to mention it because I don't like talking about downsampling/upsampling. It's what people like doing, and it's just the wrong thing to do in many cases
For instance, I did a course on biometrics (making models for fingerprint, retina, face, ... detection) and we didn't downsample or anything funky a single time
You can imagine how unbalanced these datasets are.
We just used common sense, less sexy ways to deal with imbalance
mmm, doesn't that use DML or sth?
DML?
Looking back at it, we did eigenfaces, fisher faces, local binary patterns and deep metric learning
I also did deep metric learning for retina detection
well, classification/retrieval
damn, sift features are literally stone age tech now
this ugly thing is a retina, but placed horizontally instead of in a circle
It's your eye
they're unique identifiers, even between twins
ah shit
true
like this stuff
was a cool course in hindsight, kind of like applied computer vision
the one I sent?
it's legit ye but these were the originals, before preprocessing
how do you get this from that?
honest answer? A bunch of preprocessing I didn't write myself
oh, I just don't see the correlation, lol
hmm?
is it like, unrolled?
yes
idk, looks a bit more than just split in half
the code I copy pasted is nasty
but it works
Whenever people at work talk about "neural nets are too heavyweight" I show them stuff like this to show how finicky pre-NN comp vision was
Way more effort for worse results
mmm, cuz I'm thinking this
Ah that's fair actually
idk, the pupil is weird anyway
fsr I didn't ask as many questions about this as you guys are, but that's a good thing on your part
I just accepted the preprocessing had processed
I actually haven't used vision transformers
CNNs only
I did not need to know that...
I saw some performance graphs and that did seem to be the case
though seeing where they pay their attention is quite fun in ViTs
and I assume ViViT probably does beat a CNN, cuz a CNN can't really do frame processing, I mean, ig you could try do an RNN + CNN 
This was a lot of fun. I made a novel method where one neural net outputs image-specific noise that, added to any image, tricks a second neural net into believing it's an airplane.
The noise is very visible but that's because I didn't tune my loss function, I used a basic heuristic
ah, the likely future of AI cybersecurity 
I wonder if something similar can be done to open source transformers
You need whitebox access to the gradients, which you absolutely do for llama etc.
Gradients, huh?
I should do more ML projects. I never do (aside my actual job)
same here (except I don't have a job yet...)
Yeah, I think Matiss is talking about video
RNN+CNN is truly a thing for those applications yeah
@final kiln https://boards.eu.greenhouse.io/otainsightltd/jobs/4334136101 thoughts? Sounds suspect to me.
Type of role where I suspect you'll be doing dashboards 24/7
I applied, nothing to lose
Want to know the secret of the high callback ratio? (I suspect I'll get a callback for this one very soon)
Here most ML people can't really code well enough to put things into prod
yes
If you can, and have demonstrable experience doing so you're ahead
Notice how using version control is a nice to have
I see it with some of my colleagues. One does all his "development" inside a browser IDE in our SaaS. Doesn't version control anything "because it is saved in the browser"
Brilliant guy, but that's reality
I detest browser IDEs
Made me really dislike databricks
I need my vs code with my very specific colour template, keybindings, ...
I sync it between work and private
I just want them to give me a way to SSH into their environment
and code with vscode
ah, then it's np
Also went for this because I'm curious how much they pay https://www.solita.fi/positions/ai-specialist-with-a-focus-on-genai-5878783003/
I wouldn't do Gen AI focused roles ngl
The risk is losing touch with actually training models and the difficulties with deploying them
If you're just doing it with existing APIs you're effectively a backend SWE
Just my 2 cents
yeah, this stuff is actually engaging
because people (incl. myself) don't really understand it
It was the same issue with my current job. All our projects were too abstract for regular people with no tangible use case
"Spinal deformity detection using wavelet features", "glucose prediction models for people with type 1 diabetes", ...
The only thing we made that people could understand was some IoT computer vision thing
It had a UI, clear output etc
That's a factor as well
I'm gonna let the NLP train pass me I think
Haven't done enough in it and for some reason I can't be bothered to either
Just did the info retrieval thing in uni (which is the foundation for RAGs etc) but didn't do the actual NLP course
The cost of training, deploying, RLHF, ...
@left tartan is data mentioned here enough?
Here is my model converted to onnx format
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
for who is that message?
I don't think that pasting a model to a paste bin makes sense
Also I'm really glad for any ideas on how I can speed up my model
oh that's actually really funny mb
I did that for myself
I was pasting code in the rust discord server but it was too big
I needed the link for pastebin but I didn't know it off the top of my head so I went to a random channel and did !paste
sorry for the confusion
I can't look today, but perhaps someone else can.
ViViT uses transformers, lol
completely agree.
I was just wondering if CNN + RNN could do sth with videos
Ok I have a long and convoluted question for someone with a bigger brain than me:
So I know there are tons of GPT/LM services online that are free where I can send a prompt and the thing returns a response. And I know the way this works is it tokenizes my input, vectorizes those tokens, sends data into a neural network, and it outputs all the possible tokens' confidence values, or how sure it is that each token follows the last. My question is, I'm wondering if there's an API or a cloud-based thing where I can send a string as input, and the output is the confidence values for the next token in the string. For example, if the input is "The sun is", the output could be [["hot", 98], ["big", 95], ...]
TLDR: Is there an API where I can get the list of the raw output nodes of a GPT?
you may as well just run Llama or Phi locally instead
much of the time the tokens are not entire words, but rather fragments that may make no sense on their own like "un"
you can specify logprobs for the OpenAI GPT API though, https://platform.openai.com/docs/api-reference/chat/create#chat-create-logprobs
openai completions api exposes logprobs but at most top 5
wonder if this is related to message size? or to make teacher-student distillation more difficult?
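Side note on what the logprobs mentioned above look like once you have them: they are natural-log probabilities, so `exp(logprob)` gives the probability of each candidate token. A minimal sketch, where the tokens and values are made up and not real API output:

```python
import math


def logprobs_to_percent(top_logprobs):
    """Convert (token, logprob) pairs into (token, percent) pairs."""
    return [(tok, round(100 * math.exp(lp), 1)) for tok, lp in top_logprobs]


# Hypothetical top candidates after the prompt "The sun is".
example = [(" hot", -0.3), (" a", -1.9), (" shining", -2.6)]
ranked = logprobs_to_percent(example)
```

Since these are probabilities over the whole vocabulary, the percentages sum to at most 100, so you would not see something like 98 and 95 side by side.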
quick question, can a TokenLearner be applied to GPTs as well, similar to how it works with ViT? so that it can learn how to split words itself or sth?
alright, ig a TokenLearner specifically probably cannot, but the general idea at least
network learns how to split words/sentences itself
TokenLearner itself appears to be specifically made for working with multi dimension data, so yeah probably not a good idea to use it directly on text
tokenizers are typically trained separately (most commonly by byte-pair encoding or unigram model https://github.com/google/sentencepiece)
there is some work like this https://arxiv.org/abs/2106.12672 on tokenizers learned e2e with pretraining but it isn't used in sota LLMs
State-of-the-art models in natural language processing rely on separate rigid subword tokenization algorithms, which limit their generalization ability and adaptation to new settings. In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model. To this end, we introduce a soft gradient-...
the overall idea... iirc the token mappings for text transformers are already generated automatically from the text distribution in the training data, and you probably wouldn't get as much benefit from it in text as you may get for images
yeah, it's used with vision transformers, it's pretty cool, I implemented one some days ago #data-science-and-ml message
mmm, I see, how does this "automatic generation" happen though 
(I could also just google it...)
I don't know in enough detail to explain it
fair enough
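For the "how does the automatic generation happen" question, a toy sketch of byte-pair encoding training, assuming a whitespace-split corpus and character-level starting symbols. Real tokenizers such as sentencepiece handle bytes, normalization and much more, so treat this as the core loop only.

```python
from collections import Counter


def most_frequent_pair(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn `num_merges` merge rules from a whitespace-separated corpus."""
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words


merges, vocab_words = train_bpe("low low low lower lowest", num_merges=2)
```

Frequent character sequences get merged into single tokens first, which is why common words end up as one token while rare words get split into fragments.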
I'd say the best thing you can do is participate in tabular playground competitions in Kaggle (look this up). Do your own EDA and look at other solutions afterwards.
Hi, I am working on a project using sklearn and I am facing an error in my final project.
If anybody can help me solve my problem, I would be grateful. Just ping me in the DMs.
Thanks
You'll get help much faster if you state what exactly you're having problem with here.
Is it on kaggle
Really?
yes, make an account and check it out
This is my favorite resource for EDA topics: https://www.itl.nist.gov/div898/handbook/.
# Screen time analysis is the task of analyzing and creating a report on which applications and websites are used by the user for how much time. Apple devices have one of the best ways of creating a screen time report.
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
data = pd.read_csv("Screentime_App_Details.csv")
# print(data.head())
# print(data.isnull().sum())
# print(data.describe())
figure = px.bar(data_frame=data, x="Date", y="Usage", color="App", title="Usage Graph")
figure.show()
Please someone help me in fixing this error
i have tried deactivating my anti-virus but still the issue persists
you need to host the server first to run it idk
how to do that?
thankyou
idk what are you trying to do
Where are you running this? From a notebook?
from vs code
In a .py or .ipynb?
.py
i wanna get into this sector, ik basic python, can anyone tell me what exactly it is and what the roadmap looks like? thank you
i m looking for this output
then use simplehttpserver or smt idk
how to do it
https://plotly.com/python/getting-started/ (write html, I meant)
or just use live server extension
thanks billy and hoang
The explore chapter, if that's what you mean, yes
Its a very thorough document
Yup, exactly. And practitioners get frustrated with the recent trend to throw ML at data and forego EDA
Too many people never look at the data
When I say look I just really mean: sample some of it (images, records, documents, whatever it may be) and read it
no graphs just looking
From the questions that come here that's often implicit (not looking)
As for EDA
I've said it many times, I think you should avoid doing ML and your first line of defence is a good EDA such that you could find heuristics that solve the problem sufficiently
This is usually a thing with tabular data only imho
Agreed but when it's about EDA I guess the assumption is that it's tabular data
Or at most time series
There's degrees to it though. One of my colleagues is a bio-engineer more analytics focused (+ has more domain knowledge in our project)
The EDAs he does are things I'd never dream of
They have tons of depth
Sometimes they're complex in a bad way but still, loads of depth
Guys, please, can someone help me plot my data just like in the graph below
I am making a stock market price prediction model. How do I give the timestamp as a feature to the model?
i have given there not sure if it is of any use
#add day of features
df['day_week']=df['Datetime'].dt.dayofweek
df['day_month']=df['Datetime'].dt.day
df['month']=df['Datetime'].dt.month
df['time']=df['Datetime'].map(lambda a:(pd.Timedelta(a-pd.to_datetime('2023-1-14 9:15:00+05:30'))/pd.Timedelta('5m'))%(24*12))
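One alternative sketch for the time-of-day feature, using only the standard library. It assumes 5-minute bars and counts slots from midnight (the snippet above counts from the 9:15 reference time instead). The sin/cos pair is an extra assumption on my side, added so the end of one day and the start of the next look close to each other.

```python
import math
from datetime import datetime

SLOTS_PER_DAY = 24 * 12  # number of 5-minute bars in a day


def time_features(ts: datetime) -> dict:
    """Calendar features plus a cyclical encoding of the 5-minute slot."""
    slot = (ts.hour * 60 + ts.minute) // 5
    angle = 2 * math.pi * slot / SLOTS_PER_DAY
    return {
        "day_week": ts.weekday(),   # Monday = 0, same convention as pandas dayofweek
        "day_month": ts.day,
        "month": ts.month,
        "slot": slot,
        "slot_sin": math.sin(angle),
        "slot_cos": math.cos(angle),
    }


features = time_features(datetime(2023, 1, 16, 9, 15))
```

Whether any of these features carry signal for price prediction is a separate question.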
!code
I don't think Lisan here is gonna like my suggestion, but... RNNs are usually what you'd use for time series data such as this and you'd simply provide the data in a sequence
probably could use transformers for it as well since they have positional encoding, but idk
other than RNNs, there are a bunch of other time series thingies in ML like ARMA, ARIMA, SARIMA, SARIMAX and whatnot
Is it better to make my own AI from scratch, or use pre-built libraries? I'm still new to AI, and I'm not sure whether I should first learn how LLMs, RNNs etc. work and then program my own from scratch, or use a library like TensorFlow, Torch or Hugging Face?
Everything is still a bit confusing for me. I do understand Python, so I guess that would be my starting point. I do suck at maths though.
I'm kinda looking for a starting point, and where to go from there.
if you are starting I would strongly suggest you to use a pre-built model to learn
Once you get the hang of it you might try coding it on your own, but I'm not sure it will ever be needed in a practical scenario. I might be wrong, that's just what I know
but of course, it's wishful thinking at best, but if you believe there are patterns to find, might as well I guess, besides it's not a terrible learning experience anyway
I would not suggest that, I'd suggest doing it all from the beginning, implementing a simple Feed-Forward network, by hand, using say only numpy and matplotlib for dataviz, then ramp up the complexity, then move onto pre-trained models
reminds me of that time monkeys throwing darts beat the market... (I just found an article on Forbes and it's actually a bit more than that, the explanations and stuff, it's quite interesting)
one can hope 
anyone familiar with assigning weights to datapoints in a dataset and then training these weights in parallel with the original dataset, in order to determine which datapoints are more reliable when putting together a prediction?
Does anyone have an idea how I can speed up my model?
Or the recognition process, without using a GPU
any generative AI videos u guys recommend to add custom knowledge to models/ create agents get used to basics thanks
This is the right answer
There's been a lot of research on this
Hey guys I have just finished these concepts in NLP:- Text preprocessing , One hot Encoding , Bag Of Words, N -grams , TF-IDF , Word2Vec and POS tagging . What else should learn to improve in NLP?
Any ideas what additional img transformations I should apply to make this img easier for OCR to read?
That's the img before any preprocessing operations
I'm using easyocr
for the OCR part
But I'm wondering if I should get better quality imgs or whether there are any additional preprocessing techniques I could use
How to do that?
Thx, let me try it
can i run mistral on a 1660 Ti?
I think that's not an easy task. I'm making OCR for license plates, so I would need a lot of time and resources to create a good model
have you tried creating this?
it maybe if you can make it see where liq is
and price is fractal
it might work to some extent
using smart money concepts
I will try it. I'm not really good at training on data sets, most of the time I used models already created by someone else
cool can u suggest any good read on 'fractal time series'
But for sure I will try it right after the connectedComponents function
i never said u should use numbers for this
price is fractal
by this i mean price action repeats on different time frames
so if u can make a model that can learn different patterns on different time frames and can also see where most of the liq is
it might be possible to predict direction to some extent from higher to lower time frames
ngl i have no idea what that means
but as a trader when we say price is fractal it means
it behaves same way on different time frames
liq runs same happens on all time frames
tag buy side and then to sell side imediatly
buy side liquidity == people whose buy orders got filled from being short (liquidated)
what do u mean by "it is not present in the numbers "
bro u don't have to beleive me
but u should look up 'smart money concepts'
that's how people understand what market is about to do
and i been trading for 3 years now
haha people do say just by looking at raw price action you can predict if something at scale is gonna happen
going by that definition you can't really predict the market ever, no matter what you do
but u can always use probability and risk management to make an educated guess
nope u can't
the only way to 100% accurately predict is by knowing what everyone involved is gonna do with certainty
but try delta neutral strategies, you are good with math, u can come up with something like that i am sure
there is the theoretical part and there is what actually happens
news has little effect on markets
people who are sources of the news usually have already taken their sides of the bet so news is almost always priced in
well let me know what you come up with
try paper trading
i have built a lot of trading bots, most of them only make money under some specific conditions or just lose
thanks for the resources hopefully i'll learn something new
Okay, I think trying to get results that good with OCR doesn't make sense. @final kiln could you share some more details about training a UNET model?
What dataset did you use
Unfortunately I don't have "my data". I mean, I found a dataset with license plate imgs here https://universe.roboflow.com/roboflow-universe-projects/license-plate-recognition-rxg4e
There is no such repo
Label you mean the location of license plate or actual text?
Nope
What do you mean by that?
Okay i could try to generate such data
Okay, assuming I have a set which includes every letter in 40 random license plates from the set above, how should I train on that data? As I told you, I'm really bad at training stuff
Oh okay
Okay, I'm glad for your advice, for sure I will try to train such a model
https://keras.io/api/data_loading/ so i should just use this code with my prepared data from license plate data set?
Okay thxx
Im currently doing a guided project (stock price predictor). I'm continually commenting throughout my code to use as a guide when I go onto make a solo ML project. Is what I'm doing useful trying to improve my ML programming skills? What can I do to improve them further?
okay tysm
I'm wondering about one thing: the imgs provided in the data set are "too good". What I mean is that my camera doesn't record in as high a quality as the imgs in the data set
Also, I know that my camera setup could be better, so I'm also thinking about first trying a different angle
Hi guys is advanced NLP known as Gen AI?
Okay
No idea. I just finished text preprocessing, One hot Encoding, Bag Of Words, N-grams, TF-IDF, Word2Vec and POS tagging. So I was wondering what to do next
I am sorry but I dont have any idea about what should I learn next
No. There's no widely recognized gradations of NLP. But even if there were, none of them would be "generative AI", since you can have generative AIs that have nothing to do with language.
Technologies like ChatGPT are interactive, generative language models. And interactive generative language models are in vogue at the moment. But they aren't the only application of language models. Or the only AI technologies that are both interactive and generative.
Okay I got that point
I am sorryyy
What should I focus on next?
Try making a model that classifies documents, which are emails, as SPAM or NOT SPAM
I did that
how did you do it?
I have used naive bayes classifier
this isn't a serious suggestion.
how did you represent each email?
In terms of vectors? I used the word2vec to give me the vectors of each word in the email and then as whole for the email
what was the dimensionality of the vectors?
I have kept the dimensionality of those vectors as 100
okay, try making a feed-forward neural network that takes those same vectors and outputs a true/false prediction
You mean instead using an ML algorithm , I should make a DL model with outputs as true or false?
deep learning is a subset of machine learning (and don't listen to anyone who tells you otherwise)
Sure bud , you are my coach ! Okay I shall make that now and let you know
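A rough sketch of what that feed-forward classifier could look like with just numpy. The data here is random stand-in vectors with a made-up label rule, not real email embeddings, and the hidden size, learning rate and step count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 200 "emails" as 100-dim vectors with a made-up label rule.
X = rng.normal(size=(200, 100))
true_w = rng.normal(size=100)
y = (X @ true_w > 0).astype(float).reshape(-1, 1)

hidden = 16
W1 = rng.normal(scale=np.sqrt(2 / 100), size=(100, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=np.sqrt(1 / hidden), size=(hidden, 1))
b2 = np.zeros(1)


def forward(X):
    h = np.maximum(0, X @ W1 + b1)          # ReLU hidden layer
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))    # sigmoid output = P(spam)
    return h, p


def bce(p, y):
    """Binary cross-entropy loss."""
    eps = 1e-9
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))


_, p0 = forward(X)
initial_loss = bce(p0, y)

lr = 0.1
for _ in range(300):
    h, p = forward(X)
    grad_logits = (p - y) / len(X)          # gradient w.r.t. pre-sigmoid output
    grad_W2 = h.T @ grad_logits
    grad_b2 = grad_logits.sum(axis=0)
    grad_h = grad_logits @ W2.T
    grad_h[h <= 0] = 0                      # ReLU passes gradient only where active
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

_, p_final = forward(X)
final_loss = bce(p_final, y)
is_spam = p_final > 0.5                     # true/false prediction per email
```

With real data you would hold out a test set and compare against the naive Bayes baseline on the same vectors.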
how does a local running ollama model hold soo much info?
it is only like 4gb
by having info i mean
it is answering all the questions?
my understanding is LLMs just copy paste knowledge that they have seen before (they aren't able to think)
so how can a 4gb model answer all those questions
this isn't quite right
it's better to think of it as building a function that assigns tokens a probability of occurring together
no copy pasting is going on
mb didn't mean literally copypasting
it's more like building a function that you feed some text into, and it tells you what it thinks comes next by assigning a probability to all the tokens it knows
and as for how much predictive power, let's do a quick calculation
wait just like any other machine learning algo
where they fit some function and do gradient descent to reduce the cost function
yep
with the distinction that the prediction is not deterministic
ohh ok ok now i see how hallucination might occur
not all models have a random sampling effect
this is magic
do they actually do a random selection with the predicted weights or do they pick the max? or is max usually used for classification?
at any rate: 4GB. these models often use 32 bit floats, i.e. 4 bytes each. we can fit 4*10^9/4 floats in that much memory
it does use a random selection among the top scores
.wa s 4*10^9/4
1000000000
that's roughly the number of parameters in the model
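The back-of-the-envelope estimate is just file size divided by storage per parameter. A small sketch, keeping in mind that file sizes are in bytes, so a 32-bit float takes 4 bytes. The 16-bit and 4-bit rows are assumptions about how local models are often stored, not something stated above.

```python
def approx_param_count(file_size_bytes: float, bits_per_param: int) -> float:
    """Rough parameter count, ignoring metadata and tokenizer files."""
    return file_size_bytes * 8 / bits_per_param


size = 4e9  # a 4 GB model file
fp32_params = approx_param_count(size, 32)  # 32-bit floats
fp16_params = approx_param_count(size, 16)  # half precision (assumption)
q4_params = approx_param_count(size, 4)     # 4-bit quantized (assumption)
```

So depending on precision, a 4 GB file holds somewhere between one and eight billion parameters.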
mmm, ig that makes sense, otherwise it would indeed be rather boring if it were deterministic
right, it makes it sound less natural
asking exactly the same question on different sessions should net you slightly different responses
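A minimal sketch of "random selection among the top scores" (top-k sampling) next to the always-pick-the-max alternative. The logits are made-up numbers for the prompt "The sun is", and the `k` and `temperature` defaults are arbitrary.

```python
import math
import random


def sample_top_k(logits, k=3, temperature=1.0, rng=random):
    """Keep the k highest-scoring tokens, softmax them, and sample one."""
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    scaled = [score / temperature for _, score in top]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]  # unnormalized softmax
    tokens = [tok for tok, _ in top]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token scores.
logits = {" hot": 4.0, " big": 3.5, " bright": 3.0, " purple": -2.0}

rng = random.Random(0)
samples = [sample_top_k(logits, k=3, rng=rng) for _ in range(1000)]
greedy = max(logits, key=logits.get)  # deterministic: always the top token
```

Greedy decoding gives the same answer every time, while sampling picks among the plausible tokens and never reaches the ones outside the top k.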
it kinda makes sense
i have only done very basic neural networks and the statistics did work
but to think that it can work for something like natural language, "guess" what comes next and still be able to make a sentence which makes sense is unbelievable
whether it learns how to assign probabilities correctly depends entirely on the training data though
there are lots of cool experiments tricking language models into spitting out tokens in ways that make no sense
you can get stuff that doesn't make up words at all
is there a basic grammar defined, like does it have to map tokens to real words
so how does it know what "Apoptosis" is
it doesn't
but in the training data, the tokens making up that word only appear in special combinations, so seeing that in a sentence in the correct context (with other tokens in the correct order) tells it to assign the upcoming tokens a high probability
is there like a very basic LLM you can train yourself
to better understand the process
there should be, try scrolling up. i recall someone training a model of their own in the past few days
maybe matiiss, actually
i also remember that person getting random tokens sometimes
I mean, it makes sense
like for ml i did an example for
[1,0,0,1] so had to fit the line by first 3 elements and train to predict the 4th
that wasn't me then
thanks i will
that was an RNN
ah ic, i wasn't paying attention
I mean, I did attempt next token with GPTs as well, I just forgot to actually do the rollouts
and also I implemented it like a translator...
that does cut the parameters from N^2 to 2N - 1
N if you use a symmetric kernel
we used a similar trick in a recent paper cuz a model required a few petabytes of memory otherwise
you give something up though. CNNs enforce shift invariance, which is something not all sentences have
could be interesting. there will definitely be a tradeoff between memory gains and accuracy as you play with the window size
@final kiln hey, sorry for the ping, but this question has been sitting here for some time now and i see you've got a lot of knowledge about AI. i'm wondering how i could speed up my object detection model? (using CPU, can't use GPU)
in the pinned message there is some info about the model
Also sent the model itself
Im using onnxruntime
They told me here that using onnxruntime is a really good solution for deploying object recognition in docker
i've had a problem with putting my yolo object detection into docker
and here they told me to use onnxruntime for it
Oh i remember what the problem was
The size
@final kiln
I need to deploy it to field devices, so it cant be that big
Thats the size of my docker now
And i know i can cut like 100MB more from it's compressed size
Nope, wait i will send you my docker
There is a bit of a mess here because I was trying some different stuff with multi-platform imgs, but here you go:
```dockerfile
FROM --platform=$TARGETPLATFORM python:3.11.9-slim-bullseye
ARG TARGETPLATFORM
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    libgl1 \
    libglib2.0-0 \
    gcc \
    libhdf5-dev \
    cmake \
    g++ \
    libffi-dev \
    pkg-config && \
    if [ "$TARGETPLATFORM" = "linux/arm/v7" ]; then \
    apt-get install -y --no-install-recommends ninja-build; \
    fi
RUN apt-get install -y --no-install-recommends libssl-dev
WORKDIR /app
RUN pip install --upgrade pip setuptools wheel && \
    pip install --no-binary=h5py h5py --no-cache-dir
COPY requirements.txt .
# COPY /wheels .
# RUN if [ "$TARGETPLATFORM" = "linux/arm/v7" ]; then \
# pip install onnxruntime-1.16.0-cp311-cp311-linux_armv7l.whl; \
# apt-get install python3-scipy -y --no-install-recommends; \
# fi
RUN pip install --no-cache-dir -r requirements.txt --find-links https://download.pytorch.org/whl/cpu
COPY . .
RUN echo "Building for $TARGETPLATFORM"
CMD ["python", "main.py"]
```
What do you mean "no you did"?
You can cut a bit by using multi-stage builds
But it wont be much
No, I followed the installation for CPU only
Oh, you mean in the old img
I thought you're talking about that one
Ye so i installed it with CUDA there
But even with out CUDA i had like 5GB or like 4.7GB of img
I heard about model quantization but never looked into the subject
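For intuition only, here is roughly what weight quantization does: store each weight as a small integer plus one shared scale. This is a toy symmetric int8 scheme written from scratch, not onnxruntime's implementation, and the weights are made up.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


# Hypothetical layer weights.
weights = [0.42, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now takes 1 byte instead of 4, at the cost of a rounding error of at most half the scale. On CPU that usually means a smaller model and often faster inference, with some accuracy loss that you have to measure.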
I tried a bunch of ways but none of them gave me any improvements
Okay, so I should grab some controllers with a GPU or start being an AWS slave XD
Ye I know, I'm just joking
I will read those options really carefully, I'm really glad for all your support
Ye i know that
I swear it always feels like im going back to square one with my ML journey. i see you all with somewhat clear aims to be in ML/AI/DS judging by the level of knowledge you all have, whereas I'm just a uni dropout doing this shit because it seems interesting, not knowing how much fucking work it requires with no aim in what to use it in
I dropped out of 1st year. Got sick of the studying. Applying to apprenticeships (engineering and it). Idk wtf i want to do
Still waiting for confirmation if im successful or not. It just feels shit. When i have no aim or guide, i cant focus in doing something. Like with the ML/AI. I cant put the hours needed to be good at it if i dont know what im going to do in the future with it
Well its what i plan to do. Do an apprenticeship and then use that to get into a degree apprenticeship scheme. I just want to do something to pass the time and have it be beneficial for me in the future
I swear bro it's like I'm constantly going back to square one every few days
It used to be my dream but when you spend so many years studying and then for it to not go anywhere and seeing your dreams collapse in front of you with barely anyone helping you out, it destroys and changes how you think about things
Are apprenticeships a common thing where you live? I've never heard of anyone in the US getting into AI from an apprenticeship.
Im from the UK. Ton of apprenticeship schemes over here
I applied to an IT apprenticeship with BT and an engineering apprenticeship with another company. If i do well, ill do a degree apprenticeship where i can get a degree through my apprenticeship (no need to pay for anything)
is there any good interactive free data science tutorial/course? Most yt's are boring. A full of non-stop talking
oh nvm I mean just python, not the whole data science stuffs.
sry about that
Ill see how it goes. Hopefully i get something back from them
@final kiln
how was the job interview?
on paper im millionaire
on paper valuation , my share price
you'll get the job
nice.
do u have LangChain knowledge tho?
as someone who hires,
I would say it's not about just interview
Some ppl interview to actually hire and some to reject, it's just a matter of requirement, urgency and scarcity
for backup
they have options, but to be on the safer side they say "we'll get back to u"
sometimes even after 6 months, when another hire fucks up, they'll get back to u
that's sad and bad, but it's reality.
oh, what was different?
Maybe i will try to make our company like this
what was so different in this Interview Vs Others @final kiln
can u give me a bit detailed insight, maybe i could apply it too
notice everything and please do lemme know with insights
so she took personalized interest in ur work and gave u more ideas and stuff
very very good, i do this but i'll take this more into consideration
crazy
where are u from ?
which country
oh
why?
why didnt u like the HR
(im trying to scrape everything that a developer like u thinks of a interviewer/a company, so i can maybe identify something wrong or apply something seeming to be useful)
ok, what kind of personality ppl
"developers" like you wanna work with
hi! sorry for interrupting, quick question. I was wondering how I could figure out how much memory capacity/processing power would be ideal for my AI model? i'm using a python wrapper called faster whisper to live transcribe voice into text and running it on my normal CPU isn't cutting it at all
okay, i get it.
okay
i'm eventually planning to move it to a raspberry pi
how would that work?
oh hm
me and my friends were planning to make a robotic dog with speech recognition and a camera
is a raspberry pi not ideal for that?
so essentially use an online/cloud API
gotcha ty
Let's say I have a dataframe that consists of 1000 rows. 700 of those rows belong to class A and the rest 300 belong to class B. Is there a function in Pandas that takes in an imbalanced dataframe and randomly removes rows from the class with excess rows such that the final dataframe contains equal number of each class?
there is no function to specifically do that
but you can identify the major class, get the number of samples to be dropped from it, sample that amount of bad indexes, and drop them from the frame
a way
built-in, no. if you can't be bothered, imblearn has samplers. if you just want to use pandas, do what Nahita suggested
like .groupby the class, .apply a .sample(n) with n equal to the number of elements in the minority class
Alrighty, I'll do it. Not that I'm too lazy to do it, I just thought it'd be great if there was an in-built way of doing it. However, it's a trivial task.
Oh, alright.
or actually, if you don't need any randomness you can just .groupby('class').head(100) to get the first 100 of each class (and .tail for the last 100)
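The groupby-and-sample approach described above can be sketched as follows. This is a minimal illustration rather than a built-in balancing function: the column names `class` and `value` and the toy data are assumptions made for the example.

```python
import pandas as pd

# Toy imbalanced frame: 700 rows of class A, 300 rows of class B.
# The column names "class" and "value" are assumptions for this sketch.
df = pd.DataFrame({
    "class": ["A"] * 700 + ["B"] * 300,
    "value": range(1000),
})

# Size of the smallest class; every class is cut down to this many rows.
n_min = int(df["class"].value_counts().min())

# Randomly keep n_min rows per class; random_state makes the draw repeatable.
balanced = df.groupby("class").sample(n=n_min, random_state=0)

# Deterministic variant: the first n_min rows of each class, no randomness.
first_rows = df.groupby("class").head(n_min)

print(balanced["class"].value_counts())
```

`DataFrameGroupBy.sample` needs pandas 1.1 or newer; on older versions the `.apply(lambda g: g.sample(n_min))` form mentioned above does the same job.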
what is the absolute fastest embedding model with acceptable performance (must be open source) right now?
the "acceptable performance" will vary greatly based on which task you want to use it for, and the model performance may vary a lot depending on what your data is like
traditional search engine? RAG? Which type of data you're working with?
(QA, text or pdf documents, how structured is it, which language(s))
I'm guessing only text, but the way you framed your question it's also ambiguous whether you want a model that only accepts text inputs, only images, or is multi-modal
see the above questions
but seems like https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 might be a good default choice
how does it compare to something more primitive such as TfidfVectorizer?
it's a RAG with a mix of structured and unstructured textual data
you have to fit TfidfVectorizer on your data, and it lacks much of the semantic meaning which embedding models try to preserve
I'm also not sure about what its output shape would end up as for a large dataset?
most embedding models create a vector with a few hundred to a few thousand dimensions
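On the output-shape question, here is a small sketch with scikit-learn. The three documents are made up; the point is only that the TF-IDF width follows the fitted vocabulary, whereas an embedding model returns a fixed-size vector regardless of the corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up corpus, only here to show the shape of the result.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock prices rose sharply today",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)   # SciPy sparse matrix

n_docs, n_terms = tfidf.shape
vocab_size = len(vectorizer.vocabulary_)

# One row per document, one column per distinct term seen during fit.
print(n_docs, n_terms, vocab_size)
```

For a large dataset the column count grows with the vocabulary, often into the tens or hundreds of thousands, which is why the result is kept sparse. `TfidfVectorizer(max_features=...)` is one way to cap it.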
i am running mistral on localhost and it is taking a few seconds before responding
unless you're running it on a GPU, that is to be expected
also ollama services is running on my system as soon as my system starts
whenever i call the mistral endPoint does it start mistral each time? bcz i don't see it using too much resources
i don't see ollama/mistral using my gpu
unless those --variation seed versions are ollama??
it should take way more memory than that
if your GPU does not support it, you could try a smaller model like phi3, but I wouldn't expect amazing performance or quality from anything you can run without a GPU
mistral 7B is running on my CPU (i5 5-600k)? shouldn't it also be able to run on a 1660Ti with 6gb vram, i think
do you ever use training data on a grid search?
bit pytorch specific but what's the difference (if any) between
model.forward(...)
model.forward(...)
...
optimizer.step()
and
with torch.no_grad():
    model.forward(...)
    model.forward(...)
    ...
optimizer.step()
that's what I thought, I just don't understand it
like, aren't gradients calculated and added during backprop?
oh the gradients of only that layer?
for context I was doing A2C RL and the former worked while the latter did whatever it did, lol
also speaking of RL it seemed to me that using too big of a hidden size overfit the score function without actually increasing the total reward
I recommend going into a debugger and looking at the gradients of your weights in #1 and #2 to compare
I went into the debugger a couple times while trying to figure out why it was doing whatever it did and that never crossed my mind, will do
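A minimal sketch of the difference between the two variants asked about above. It uses a single `nn.Linear` layer as a stand-in for the A2C network, which is an assumption. Outside `torch.no_grad()` autograd records the forward pass, so `backward()` can fill in `.grad` for `optimizer.step()` to use. Inside it nothing is recorded, so there is nothing to backpropagate through.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)   # stand-in for the real network (assumption)
x = torch.randn(8, 4)

# Variant 1: normal forward pass, autograd records the graph.
out_tracked = model(x)
out_tracked.sum().backward()
grad_after_tracked = model.weight.grad.clone()

# Variant 2: forward pass under no_grad, no graph is recorded.
model.zero_grad()
with torch.no_grad():
    out_untracked = model(x)

try:
    out_untracked.sum().backward()
    backward_failed = False
except RuntimeError:
    # The output has no grad_fn, so backward() has nothing to walk.
    backward_failed = True

print(out_tracked.requires_grad, out_untracked.requires_grad, backward_failed)
```

Comparing `model.weight.grad` after each variant in a debugger, as suggested above, shows the same thing on the real model.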
anyone knows how to make jupyter vs code extension store outputs and variables when i close vs code? kinda annoying having to retrain the model every time i wanna do smt with it after closing vs code
i could save and load the model into a file but jupyter saving outputs wouldve been more convenient
Hi, does anyone know how I could threshold that img so the letters get really high contrast?
I tried a bunch of techniques, but even with this img I'm still having trouble with the OCR
maybe that's because I'm using EasyOCR for that task
Idk
```python
import cv2
import numpy as np

def preprocess_license_plate_image(img):
    # Upscale 2x so small characters cover more pixels
    img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Blur to suppress noise, then sharpen the edges again
    img = cv2.GaussianBlur(img, (5, 5), 0)
    kernel_sharpening = np.array([[0, -1, 0],
                                  [-1, 5, -1],
                                  [0, -1, 0]])
    img = cv2.filter2D(img, -1, kernel_sharpening)
    # Adaptive threshold with an 11x11 neighbourhood and constant 2
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                cv2.THRESH_BINARY, 11, 2)
    return img
```
Code used for preprocessing
I even achieved that preprocessed img
But I'm still having trouble reading it correctly
Why dont you fine-tune some neural net?
Also try tesseract-ocr
I had a very pleasant exp w it
Ye I know that, but I'm still concerned about your method because EasyOCR is already using an AI model for that
does anyone know how to create a chatbot?
btw, I can't tell if it's an S or a 5
really looking forward to that 4th point 
In a good environment, yes
But in the field, not really
i mean i still have some options for different camera setups
anyway, currently I'm doing GAN stuff
That's actually how the license plate letters are printed
That's an S
That's like the base img which I preprocess
mmm, had my suspicions on that, but like... a model understanding that? meh, maybe
What do you mean by that?
well, a bit in reverse, but yeah
yes, but I also know that I didn't set up the camera in the best possible way
this is gonna be a bit sidestepped, but have you considered using a better quality camera?
Currently not XD
I mean, I still have a few untested configurations, which I'll definitely start with
hi i have this picture and this is my code:
from PIL import Image
from pytesseract import pytesseract
import enum
import pyautogui
import os


class OS(enum.Enum):
    Windows = 1


class Language(enum.Enum):
    ENG = 'eng'
    ARB = 'ara'  # Tesseract's language code for Arabic is 'ara'


class ImageReader:
    def __init__(self, os: OS):
        if os == OS.Windows:
            # Point pytesseract at the Tesseract executable on Windows
            windows_path = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
            pytesseract.tesseract_cmd = windows_path
            #print("Running on: Windows\n")

    def extract_text(self, image_path: str, lang: Language) -> str:
        img = Image.open(image_path)
        # Strip the text to remove newlines and spaces
        extracted_text = pytesseract.image_to_string(img, lang=lang.value).strip()
        return extracted_text


if __name__ == '__main__':
    save_path = r'C:\Users\falcon\Desktop\my_lexis_plus_bot\images1'
    file_name = "big_question.png"
    save_path1 = os.path.join(save_path, file_name)
    # Screen region to capture
    left = 1250
    top = 400
    width = 600
    height = 42
    screenshot = pyautogui.screenshot(region=(left, top, width, height))
    screenshot.save(save_path1)
    screenshot.show()
    ir = ImageReader(OS.Windows)
    # Read back the screenshot that was just saved
    bigq = ir.extract_text(save_path1, lang=Language.ENG)
    print(bigq)
is there any way to make my code detect the underscores and write them with the output?
because the output is only the sentence without the underscores, like this: Nadia is much more than her sister.
use an img reader that supports _
how come LLMs are not deterministic? if they have predefined weights and they predict what word should come next, shouldn't they be deterministic?
they might be if you turn the temperature to 0
there's this: https://152334h.github.io/blog/non-determinism-in-gpt-4/
It's well-known at this point that GPT-4/GPT-3.5-turbo is non-deterministic, even at temperature=0.0. This is an odd behavior if you're used to dense decoder-only models, where temp=0 should imply greedy sampling which should imply full determinism, because the logits for the next token should be a pure function of the input sequence & the model...
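To make the temperature point concrete, here is a small sketch of how next-token sampling is usually described: a softmax over logits divided by the temperature. The four logits are made up, and the function name `next_token_probs` is hypothetical. This illustrates the idealised case only, not how any particular hosted model is served.

```python
import numpy as np

def next_token_probs(logits, temperature):
    """Softmax over the logits scaled by 1/temperature."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()            # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = [2.0, 1.5, 0.3, -1.0]        # made-up scores for a 4-token vocabulary

probs_t1 = next_token_probs(logits, temperature=1.0)
probs_cold = next_token_probs(logits, temperature=0.01)

# In the limit temperature -> 0, sampling collapses to picking the argmax.
greedy_token = int(np.argmax(logits))

rng = np.random.default_rng(0)
samples_t1 = rng.choice(len(logits), size=20, p=probs_t1)      # varies between tokens
samples_cold = rng.choice(len(logits), size=20, p=probs_cold)  # effectively always the argmax

print(probs_t1.round(3), probs_cold.round(3), greedy_token)
```

With fixed weights the logits are a pure function of the input, so the randomness at temperature 1 comes from the sampling step. The linked post is about why hosted models can still differ even at temperature 0.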
I am building a predictive model, consider it to be based on stocks etc. There are columns for search trends which show trends rated from 1 to 100. Could be a very useful variable for the model but most of the data for those search trend columns is missing. Any idea on how I could leverage whatever is given?
Is anyone of you experienced in opencv in python
Please just ask the question directly. (Don't ask to ask: https://nohello.net/en/)
I need some software developers (good in opencv) to work in my startup
Hi guys, I am attempting to generate vector embeddings for a very structured document. It is divided at four levels: the book, the division, the title and then the content. I have been thinking about how to prepare the material for chunking, especially in a way that makes the most of the rigid structure of the documents. My plan is to divide them by the smallest block, the content, and put a block of text above the content, containing the book, division and title. Every chunk will have this header block and the content. How does this idea sound? One concern I have is the repetition of the header block for divisions that have a lot of titles and content, so it would be the same header block over and over, and I wonder if that will be counterproductive to search.
Anyone knows here how could i quntize my object detection model?
you mean quantize?
From my understanding of your description, this is not what we'd consider a "very structured document". Regardless, having the "header block" repeated shouldn't be an issue as long as you tokenize it as one unit.
You might also exclude the header block from each chunk, if it's only there to tell you where chunk boundaries are.
My intention is to supply some metadata that is not contained within the chunk content, rather than to use as a delimiter for the chunk boundaries. It is good to know that the repetition won't cause any issues since i feel that it is a key component for each chunk to make sense. Thanks for your comment.
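The header-block plan described above could look roughly like this. It is a sketch only: the `Section` class, the `build_chunk` helper and the sample rows are all hypothetical, and storing the same fields as separate metadata is an optional extra rather than something the plan requires.

```python
from dataclasses import dataclass

@dataclass
class Section:
    """One smallest unit of the document (hypothetical structure)."""
    book: str
    division: str
    title: str
    content: str

def build_chunk(section: Section) -> dict:
    """Prefix the content with a header block and keep the fields as metadata."""
    header = (
        f"Book: {section.book}\n"
        f"Division: {section.division}\n"
        f"Title: {section.title}"
    )
    return {
        "text": f"{header}\n\n{section.content}",   # this is what gets embedded
        "metadata": {                                # this can be used for filtering
            "book": section.book,
            "division": section.division,
            "title": section.title,
        },
    }

# Made-up rows: two titles sharing the same book and division.
sections = [
    Section("Book I", "Division 2", "Title A", "Text of the first provision."),
    Section("Book I", "Division 2", "Title B", "Text of the second provision."),
]

chunks = [build_chunk(s) for s in sections]
print(chunks[0]["text"])
```

Keeping the header in the embedded text gives each chunk its context, while the metadata copy lets a vector store filter by book or division without relying on similarity alone.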
Easiest way IMO is convert the model to ONNX, and then use onnxruntime to quantize the model
i already have my model in onnx
But im still having trouble in doing that
What have you tried so far? are you following https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#quantizing-an-onnx-model ?
tried a bunch of methods and they didn't speed up my model
they are typically not magic
the speedup will depend on hardware
typically they only reliably reduce memory overhead, not speed.
https://www.easypaste.org/file/xqYJMfw0/license.plate.detector.onnx?lang=en that's my ONNX model file
I'm using arm64 architecture in production
and what type are you trying to quantize to?
From float32 to uint8
https://github.com/microsoft/onnxruntime-inference-examples/tree/main/quantization/image_classification/cpu followed the repo mentioned in the docs
And it didn't work well
I'm not sure if you're going to gain much via float32 -> int8
their intermediary results are likely going back to int32 operations
I can't see anything wrong with the model itself
Looking at the cpu provider code, most of the operations do not work via int8 they get cast internally
Well in theory if the data is int8, you can at minimum do 16 operations per AVX2 register on the CPU compared to 8 fp32 operations.
But on Arm this is a bit different, since NEON & SVE behave a bit differently
But normally smaller int operations = more throughput loosely
Okay, so you're telling me that even if my quantization process went well, I still wouldn't gain that much speed
GPUs normally have a lot of specialization around the quantization, so they more commonly see a significant speedup with the quantization
But on the CPU you are more at the mercy of the execution provider's willingness to specialize to datatypes
It depends, but I can't see a huge amount of specialization for int8 on CPU providers with onnxruntime.
If you're on ARM, you can try using the ARMNN provider, which might have more specializations
Okay, what's the ARMNN provider?
There is also https://onnxruntime.ai/docs/execution-providers/community-maintained/ACL-ExecutionProvider.html
Not sure what Arm chip it will be running on so not sure if you can run any of these
Thx, maybe this will give me better results
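To illustrate the point made above that float32 to uint8 reliably saves memory while speed depends on the hardware and provider, here is a NumPy-only sketch of affine quantization on a random weight matrix. It is a toy illustration of the arithmetic and does not reproduce what onnxruntime does internally.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random float32 "weights" as a stand-in for one layer of a model.
weights = rng.normal(0.0, 0.5, size=(256, 256)).astype(np.float32)

# Affine (asymmetric) quantization: real ~= scale * (q - zero_point)
w_min, w_max = float(weights.min()), float(weights.max())
scale = (w_max - w_min) / 255.0
zero_point = int(round(-w_min / scale))

# Map to the 0..255 range and store as uint8.
q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)

# Map back to float32 to see how much precision was lost.
dequant = (q.astype(np.float32) - zero_point) * scale

max_abs_error = float(np.abs(weights - dequant).max())
size_ratio = weights.nbytes / q.nbytes   # float32 is 4 bytes, uint8 is 1 byte

print(size_ratio, scale, max_abs_error)
```

The storage drops by a factor of four no matter what. Whether the arithmetic gets faster depends on whether the kernels actually compute in 8-bit or cast back to wider types, which is the concern raised above about the CPU provider.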
Hi there,
I developed a webapp with Flask recently. I'd like to add a scraping task in the background (something I've already dev'd) but things get complicated after :
I have lots of videos, from which I'd like to extract some texts but not necessarily all. I've tested a bit with Google's Vision AI, which works very well, but I end up with a lot of extraneous text. So my question is, what do you think is the best way to clean up the data with OCR tech? I have several solutions in mind, for example training a model to detect only the texts I want (but I don't know where to start with this method), I've also thought about using classic regexes but this solution seems extremely limited and not suitable for me. Can you think of any other viable solutions for me? Any frameworks? Ways to train AI models? etc.
Thanks for taking the time to read already and have a nice day!
hi, can someone clarify what exactly should i be doing for this question ( i don't particularly understand cuz i'm new to pytorch)
Oh yeah the quoted "question 1" is just this:
Anyway this is my code for question 1, but idk what to do for q3), could someone help?
If you have a model, you should make a batch size of 1, and give this batch of 1 image (shape would probably be something like (1, nr_channels, height, width)), and then pass the result to the loss function with the desired output (i.e. the input image).
This should then give you a loss for a single image. Do this multiple times, and plot the resulting losses. @buoyant shoal
Bit unsure about the terminology your teacher uses. The grammar is also a bit poor. I assume with "total loss" (Q1) they mean the "MSE" over 600 images. But that would give the same answer as question 3, so they probably mean "sum of squared error"?
Does that make sense @buoyant shoal?
Could also be the sum of MSE of the 100 epochs for question 1. That would make more sense I guess. Question 3 would then be the MSE of epoch 0 (untrained). But they also want a plot of the loss of each separate image I would think.
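The per-image loss described above can be sketched like this. The tiny autoencoder and the random images are assumptions so that the example runs without downloading MNIST. The assignment's actual model is not shown in the chat.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in autoencoder for 28x28 greyscale images (architecture is an assumption).
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 32),
    nn.ReLU(),
    nn.Linear(32, 28 * 28),
)
loss_fn = nn.MSELoss()

# Random tensors in place of MNIST images.
images = torch.rand(5, 1, 28, 28)

per_image_losses = []
model.eval()
with torch.no_grad():                        # evaluation only, no weight updates
    for img in images:
        batch = img.unsqueeze(0)             # shape (1, 1, 28, 28): a batch of one image
        recon = model(batch)                 # shape (1, 784)
        loss = loss_fn(recon, batch.flatten(start_dim=1))
        per_image_losses.append(loss.item())

    # For comparison: the MSE over the whole batch at once.
    batch_loss = loss_fn(model(images), images.flatten(start_dim=1)).item()

mean_of_per_image = sum(per_image_losses) / len(per_image_losses)
print(per_image_losses, batch_loss)
```

Because every image has the same number of pixels, the mean of the per-image MSEs matches the batch MSE, so a plot of the individual values (for example with `plt.plot(per_image_losses)`) is what adds information.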
hi, so wait this is wrong?
```python
# Create a data loader for training data with a batch size of 600
train_dl = torch.utils.data.DataLoader(train_dataset, batch_size=600)
```
Do you have the entire assignment description?
I think this is it basically
The code was like already there for the most part and was just filling in trivial details
but iirc they already had batch_size = 100 on the code
i just changed it to 600
This is the original code
oh
okay yes it's more or less the same
```python
# Import torch library and the other usual libraries
# Based on code by Naveen on nomidl.com
import torch
import torchvision
import torch.nn as nn
import matplotlib.pyplot as plt
import numpy as np
from torchvision import transforms
```
```python
# Define a data transformation to convert digit images to tensors
transform = transforms.ToTensor()
# Load the MNIST datasets for training and validation
# images are 28x28 pixel images of handwritten digits in a greyscale
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
valid_dataset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
# Create a data loader for training data with a batch size of 100
train_dl = torch.utils.data.DataLoader(train_dataset, batch_size=100)
```
yeah the rest is all the same actually
i moved all the stuffs from the jupyter notebook
Hmm alright. There is just some contradicting stuff, like they initially use a batch size of 100, which means they give 100 images, calculate MSE, update model, do this until all images are used, then the epoch is done.
But they want to plot the MSE of a batch of 600 for every epoch (for question 1), which is a bit random, Why pick 600 random images and save the MSE of that? why not just use the MSE of all images?
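One possible reading of the 600, which the assignment text in the chat does not confirm: the MNIST training split has 60,000 images, so a batch size of 100 gives 600 batches per epoch. The sketch below checks that arithmetic with a dummy dataset of the same length, so nothing has to be downloaded.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset with as many samples as the MNIST training split (60,000).
dataset = TensorDataset(torch.zeros(60_000, 1))

# Original setting from the assignment code: batch size 100.
train_dl = DataLoader(dataset, batch_size=100)
batches_per_epoch = len(train_dl)

# Changed setting: batch size 600.
big_batch_dl = DataLoader(dataset, batch_size=600)
big_batches_per_epoch = len(big_batch_dl)

print(batches_per_epoch, big_batches_per_epoch)
```

If that reading is right, "600" would describe the number of model updates per epoch with the original batch size, and changing `batch_size` to 600 would instead give 100 larger batches.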
wait i think maybe it has to do with the train_dl line?
train_dl = torch.utils.data.DataLoader(train_dataset, batch_size=100)
this thing like doesn't even run entirely
What do you mean?