#data-science-and-ml
Pasting large amounts of code
If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.
whole code. https://paste.pythondiscord.com/viducocuxu. i will also try to find what you are asking for
2secs
ys, xs, confs = heatmap_to_label(y_pred=y_pred[0, ...],
                                 keypoint_names=keypoint_names,
                                 label=label)
i think that's the first time the code mentions xs
i see. did you write this function, heatmap_to_label?
from ScantensusPT.utils.heatmap_to_label import heatmap_to_label
it looks like it's from here
its my supervisors code. yeah he must have done it for some reason
but my graph plot is last chunk of code i
well it looks like xs is a list and not an array. what happens if you do np.asarray(xs).shape?
ill check now
when i plot xs and ys it works. but when i do xs and file.flow_true it doesn't, so yeah it must be some different kind of format or something. i'll check now
yes, matplotlib accepts both lists and arrays
sorry 2secs laptop just running a bit slow just waiting for code to run
okay done
np.asarray(xs).shape
Out[6]: (913,)
only one number displayed
ys is also same shape
np.asarray(file.flow_true).shape
Out[8]: (473612,)
but file.flow_true is different
@desert oar
yeah, that will be a problem
they need to be the same size
how can i make it so that it only plots same size. so keep xs same as it is lower number and i want to make file.flow_true plot same number as xs. i tried :
ax.plot(xs,file.flow_true[0:913],"r-", linewidth=0.5)
still got error. x and y must have same first dimension, but have shapes (1,) and (913,)
ye been trying for many days lol
hi, so i have been practicing to create virtual environments in pycharm using conda
at first i created Project Folder in D drive but by default the envs were in C drive in shown location
then i tried to remove all envs
now while trying to create again, the envs location seems to be different.
which one is the preferred/default location?
FYI: i was creating the virtual environments from within pycharm
like this
I think there's no preferred. As long as you know where they are it's fine.
ok. thanks miwojo
i am surprised to see it says 1 and not 913
what happens if you do ax.plot(np.asarray(xs), ...)? and what is type(xs)?
okay will try now
hmm same error
x and y must have same first dimension, but have shapes (913,) and (473612,)
in the ... i put file.flow_true, not sure if that's what you wanted me to do
and type(xs) is a list
ye weird
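A minimal sketch of one way around the shape mismatch above. It assumes the two series are meant to be compared point by point from the start, which nothing in the chat confirms (913 vs 473612 points suggests they may be sampled at very different rates). `align_for_plot` is a made-up helper name, not part of matplotlib or the supervisor's code.

```python
import numpy as np


def align_for_plot(x, y):
    """Flatten both inputs to 1-D arrays and cut them to the shorter length.

    Hypothetical helper: it only guarantees matching shapes. Whether the
    first n points of each series actually correspond is an assumption.
    """
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    n = min(x.size, y.size)
    return x[:n], y[:n]


# Stand-in data with the shapes reported in the chat.
xs_demo = list(range(913))
flow_demo = np.zeros(473612)

x_plot, y_plot = align_for_plot(xs_demo, flow_demo)
print(x_plot.shape, y_plot.shape)
```

With real data the call would look like `ax.plot(*align_for_plot(xs, file.flow_true), "r-", linewidth=0.5)`. Printing the two shapes right before plotting is a quick way to see which argument is not the size you expect.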
Would you guys recommend a basic course for machine learning, apart from python basics?
anyone know answer to my question
if you know python already this one is a great course: https://course.fast.ai/
can i also learn that course if i know very little about coding
yes, you will have a bit more to learn though. learning python and ML stuff at the same time. but it's possible
what's your question?
what's the syntax for pointing my sparksession to an existing application? any time i run the following it starts a new application with the same name, instead of attaching
spark = SparkSession \
    .builder \
    .appName('SparkSQL::IP') \
    .enableHiveSupport() \
    .config('spark.master', 'spark://IP:7077') \
    .getOrCreate()
What are y'all's thoughts on the advancement of AR? Do you think there are any practical uses of AR? And is computer vision related to AR?
Sorry lots of questions
Definitely think computer vision is related to AR
not sure how relevant it will be, rn it seems more like a nerds dream than something everyone will use in like 5 years
But I guess that is what people said about computers and stuff in the past so

Is this a beginner territory?
You can ask beginner questions about DS/AI sure
I am wondering if it's recommended to start with machine learning being a beginner in Python
machine learning is not tied to knowing python
But if you want to do it in python it's good to at least know the very basics yeah
and know how to use pandas, and some commonly used libraries like sklearn, tensorflow, pytorch etc.
And to understand machine learning you also want to learn a bit about linear algebra and statistics
Downloading Anaconda... looks like it comes with those libs
Yes its recommended for ML
VS is my favorite otherwise..
I personally use pycharm
does someone know how do I get individual colors for each bar in matplotlib
it gives this messy thing rn
what you're doing should work, but it seems to be mangling the bar values together for some reason. Assuming https://stackoverflow.com/a/18973430 doesn't work, I would suspect matplotlib doesn't like the dataframe.
Side note: the parts of the petrol column that are green is quite a bit. Do the heights of the bar graphs fit the data (i.e. expecting ~600k petrol, 100k diesel, ~10k automatic), or is some of the data being moved to other columns (so the large green section under petrol is stealing ~350k from automatic)?
It didn't work
I looked up this link before
Oh. You are suggesting that the first bar is not actually plotting the petrol column?
the mixed colors suggest to me that the first and second bars contain - incorrectly - combination of all data values instead of just their own categories. Whether this is true is dependent on whether the bar heights match their expected-by-you values
I think you might be right
Right
this is how the dataset looks.
I just tried to plot the 2 columns together. And thought it worked.
Male petrol users are a lot higher than 60k and so are diesel users than the given value
I've created a neural net in python structured like this with x0 being bias and x1,x2,x3 being inputs. But I'm confused whether or not I need to use an activation function on a7 and a8? I've used sigmoid on a4,a5,a6.
try this
df2 = df.groupby(['Fuel'], sort=False).sum()
plt.bar(df2.index.array, df2['Male'], color=['r', 'g', 'b'])
the sort=False keeps the same order as you had before (first appearance in column) and speeds up the grouping process
Works alright. But the values are too high to give a good description. I made it fancy by using percentages via normalisation.
your df2 generator is more clean. Thanks
for data sanity checking, can i ask what the automatic fuel type is? I know of auto transmission, but not fuel. Is that code for electric or something else?
yes I think so.
May i know. Why you converted index into an array
It's working fine with df2.index as well
Btw. Is there a cleaner way to normalise an array than mine?
I didn't try it with just df2.index as stackoverflow posts suggest df2.index.values, and https://pandas.pydata.org/docs/reference/api/pandas.Index.values.html suggests .values is semi-deprecated in favor of either .array or .to_numpy() depending on use case
tl;dr didn't know it'd work
Piechart does it automatically. I was checking if bargraphs had such a feature
numpy, being an array-based math library, makes normalizing easy (e.g. divide the array by its sum), but you have the non-third-party solution. Should mention, however, that instead of turning the numbers into 0-100 percentages in the norm, python's string formatting has a format specifier for that purpose:
>>> x = 1/4
>>> print(f'{x:.1%}')
25.0%
>>> print('{:.1%}'.format(x))
25.0%
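A small sketch tying the two ideas together: normalise counts to fractions with numpy, then let the `%` format specifier handle the display. The counts are made up for illustration (loosely based on the rough bar heights mentioned earlier), and `to_fractions` is a hypothetical helper name.

```python
import numpy as np


def to_fractions(values):
    """Return values divided by their total, so the result sums to 1."""
    arr = np.asarray(values, dtype=float)
    total = arr.sum()
    if total == 0:
        raise ValueError("cannot normalise values that sum to zero")
    return arr / total


# Made-up counts per fuel type, for illustration only.
counts = [600_000, 100_000, 10_000]
fractions = to_fractions(counts)

# Keep the data as fractions and only format as percentages for display.
labels = [f"{f:.1%}" for f in fractions]
print(labels)
```

Keeping the values as fractions and formatting only at display time avoids having two differently scaled copies of the same column.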
I recommend this book for someone new to DS/ML and new to python https://allendowney.github.io/ElementsOfDataScience/README.html
Anaconda is a set of DS tools with an optional GUI launcher: jupyter, Spyder, RStudio, Orange (it was some time ago when I had it installed, so it may have changed), conda as the virtual env manager, and lots of packages preinstalled like pandas and numpy, like 100+ of them. It's like a 500MB+ download though.
the thing is
who really uses all those tools tho

theres a lot more tools now too
Couldn't find the current list of tools on their website. Do you know what is preinstalled currently? I'm not going to download and install it just to find that out
I used Enthought
before I went Conda
hallo
what is Enthought?
cannot stop thinking about your mice setup Chris. share some pics pls
no
I had a project on sci compute then just grabbed that lmao
yeah reading it. not sure what is it. it's like anaconda?
It is
Conda better
Well it shows Python has ecosystem for sci compute and there are many who try to make it like Matlab
what's the difference between Machine Learning Engineer and Machine Learning Operations Engineer (MLOps)?
Ah one is dev other is dev ops
So the MLOps has moar on his plate ....ML Eng builds data pipelines...ML Ops builds deploys and monitors etc I guess but its kinda fuzzy
agree, very fuzzy
to a point that i don't see a difference
Yep
Recently downloaded one on my 2nd pc. I'll check for the list when I'm up in the morning
on top of that you have data scientist, which confuses the picture even more. bcs if MLE builds models, then what DS do?
Maybe the ML Ops takes moar ops than ML Eng
Prototype?
maybe
Maybe make the math
maybe
Honestly, I'd say MLOps is just the combination of machine learning and DevOps to automate, track, pipeline, monitor, and package machine learning models.
It began as a set of best practices but has now slowly morphed into an independent ML lifecycle management approach.
Just as a Backend engineer is different from a DevOps engineer, a Machine Learning Engineer is different from an MLOps Engineer.
What MLE would do in such a setup?
MLOps is a special type of ModelOps, according to Gartner. However, MLOps is concerned with operationalizing machine learning models, whereas ModelOps focuses on all sorts of AI models.
- Deployment & Monitoring: This is the final step, which is mostly about MLOps and includes things like packaging your final model, creating a docker image, writing the scoring script, and then making it all work together.
A Machine Learning Engineer focuses on the whole ML pipeline and an MLOps Engineer only focuses on the Deployment & Monitoring part, I guess
But really, has anyone seen someone with MLOps Engineer as his/her job title? I haven't seen though.
Nice lmao
Not me
Nvidia recruiter reached to me with job advert for MLOps... Triggered my questions
Take it if you want lmao at least hardware will not be an issue
Yeah
Tried ML in desktop can be slow ...nao you hab all the toys ...and free or discounted GPU I hope...say its for testing lmao
That would be nice. It's just the recruiter screening phase or something, but I'm considering playing this game with them
Go lmao and negotiate a good deal and gud luck
Thanks!
Now that's interesting... Wishing you success!
Found this. How To Choose Right Data Visualization Charts For Your Data?
https://medium.com/coders-mojo/how-to-choose-right-data-visualization-charts-for-your-data-f4dd49061aea?sk=7015ece56ed3f68f9b857d535e6b8c16
Thanks! I replied that I'm interested to the recruiter. Will see...
If anyone has knowledge on why this behavior may be happening, please let me know:
Your model is pretty small and it's not outweighing the overhead of transferring data to the GPU and back. 1 ms per step is pretty high for such a powerful CPU too though, likely due to TF having really high overhead.
Or something else is wrong with it.
tried reading online. havent found anything useful yet
Maybe DS use libraries where ml devs make from scratch?
what is DS?
Data scientist
oh no. this is a simple OOPs tutorial im watching
im learning python, no prior coding info
wish there was a IDE/editor channel
yes 2 hrs
I've not yet needed to use classes in my day to day
oh there. is. ill go ask there
@iron basalt Data isn't transferred back and forth. It's loaded into GPU memory and then processed there until the very end.
Show what the overview looks like in TensorBoard's profiler.
i think it would be hard to create from scratch libraries like numpy, pandas, pytorch, etc...
who here can help me with web scraping using bs4
im having trouble with finding the things i want since the html is bigg
just ping or pm me if you can
Why not just ask the question here?
Intro This is a part of the series of blog posts related to Artificial Intelligence Implementation. If you are interested in the background of the story or how it goes: Previous blog post links How to scrape Google Local Results with Artificial Intelligence Real World Example of Machine Learning on
Hello people
I have an important question: do you have any suggestions where I can read about creating an AI for myself / a self-learning system?
Or could you teach me?
Because I would love to program a home AI
But so far what I made is more like a series of if commands that I have to write
Home AI like google home and alexa are really complex
If you don't have any experience with AI and machine learning you are going to have a bad time
@lapis sequoia consumer products like Alexa and google assistant aren't really AIs in and of themselves, but they have lots of individual components that are the actual AIs. Each one probably leverages pretty advanced techniques.
I have a list of inputs and a function. I want to find the Jacobian for each individual input. Rather than looping, could I speed this up somehow?
I'm using autograd.
import autograd.numpy as np
from autograd import grad, jacobian
x = np.array([5,3], dtype=float)
def cost(x):
    return x[0] ** 2 / x[1] - np.log(x[1])
jacobian_cost = jacobian(cost)
did you try something like that?
I think using Jacobian returns shape (# of outputs, size of output, # number of inputs, size of input)
I am looking for shape (# of inputs, size of output, size of input)
each output row corresponds to one input row
actually never mind, I don't think this can be done more efficiently than looping
because to get to that format the Jacobian function probably loops itself
What does a low Silhouette Coefficient mean? I know 1 means the clusters are well defined and far from one another and 0 means they're closer. But does a lower score invalidate the results?
Yeah guess so
Hello,
What is the best data visualization tool for time series forecasting in Python?
are you trying to make a line plot?
I need more advanced plot with tools like zoom and move
matplotlib should be enough
plotly, if you want it interactive
%matplotlib widget in lab
%matplotlib notebook in notebook
Enjoy your interactive plots :)
so im trying to implement KNN from scratch, and using np.random.normal to generate my training and test data. but i dont really have an idea on how to proceed from there. can anybody suggest a pseudo-code/ algorithm
and im not using sklearn
you would need to assign labels to all of them (which can be arbitrary, I guess), and then put all the points in a kd tree. do you know what a kd tree is?
no, i dont
the "k" in kd tree is not the same k as the one in kNN
i am not that familiar with trees
i know how to implement in C, but that's about it
I have to head out for a bit. but you can use kd trees to keep track of which points are closest to each other
ah okay, i'll check it out
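For a first version, a brute-force kNN is probably easier to get right than a kd tree: compute every test-to-train distance, take the k closest, and vote. This is a sketch of that simpler approach, not the kd-tree method suggested above, and it will be slow on large datasets. The two Gaussian blobs and the name `knn_predict` are made up for illustration.

```python
import numpy as np


def knn_predict(train_X, train_y, test_X, k=3):
    """Classify each row of test_X by majority vote among its k nearest
    training points (squared Euclidean distance, brute force)."""
    train_X = np.asarray(train_X, dtype=float)
    train_y = np.asarray(train_y)
    test_X = np.asarray(test_X, dtype=float)

    # Pairwise squared distances, shape (n_test, n_train).
    diffs = test_X[:, None, :] - train_X[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)

    # Indices of the k closest training points for every test point.
    nearest = np.argsort(dists, axis=1)[:, :k]
    neighbour_labels = train_y[nearest]

    predictions = []
    for row in neighbour_labels:
        values, counts = np.unique(row, return_counts=True)
        predictions.append(values[np.argmax(counts)])  # ties go to the smaller label
    return np.array(predictions)


# Made-up training data: two well-separated 2-D Gaussian blobs.
rng = np.random.default_rng(0)
class0 = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
class1 = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
train_X = np.vstack([class0, class1])
train_y = np.array([0] * 50 + [1] * 50)

test_X = np.array([[0.0, 0.0], [5.0, 5.0]])
preds = knn_predict(train_X, train_y, test_X, k=5)
print(preds)
```

Once this works, swapping the distance step for a kd-tree lookup is an optimisation; the voting logic stays the same.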
hey guys,
i want to do a kmeans label based on sorted attribute value,
let's say n_cluster=3,
then:
the smallest values of the attribute will be labeled as 0
and the mid values will be labeled as 1
and the bigger values will be labeled as 2
any idea how to set kmeans, to do that, or at least how to change the label after the process?
s
sorry forgot to mention that the data im working on is 1D array where i reshape it for (-1,1) and then put it on the model
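k-means assigns label numbers arbitrarily, so one way to get this is to relabel after fitting: rank the cluster centers and map each old label to the rank of its center. The sketch below is numpy only; `relabel_by_center` is a hypothetical helper, and the labels and centers are made-up stand-ins for what a fitted model would give you.

```python
import numpy as np


def relabel_by_center(labels, centers):
    """Map cluster labels so that 0 is the cluster with the smallest center,
    1 the next, and so on. Assumes 1-D data, i.e. one value per center."""
    labels = np.asarray(labels)
    centers = np.asarray(centers, dtype=float).ravel()

    order = np.argsort(centers)              # order[rank] = old label
    rank_of = np.empty_like(order)
    rank_of[order] = np.arange(len(order))   # rank_of[old label] = new label
    return rank_of[labels]


# Made-up example: old cluster 1 has the smallest center, old cluster 0 the largest.
old_labels = np.array([0, 1, 2, 1])
centers = np.array([[10.0], [1.0], [5.0]])
new_labels = relabel_by_center(old_labels, centers)
print(new_labels)
```

With scikit-learn's `KMeans`, the inputs would be the fitted model's `labels_` and `cluster_centers_` attributes.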
So ML engineer can just use sklearn?
am i reading this correctly? 8 nvidia GPUs + 2 amd CPUs (x12) ?

96 GPUs...is that enough GPUs for you @serene scaffold

The weights alone take up around 40GB in GPU memory and, due to the tensor parallelism scheme as well as the high memory usage, you will need at minimum 2 GPUs with a total of ~45GB of GPU VRAM to run inference, and significantly more for training.
oof
just for inference

why
are solutions that require this much computation power practical for anything?
imo this is still "basic research"
would you say the same about particle accelerators, space telescopes, deep-sea submarines, or nuclear fusion reactors that are still deeply energy-negative?
this is as much supercomputing research as it is ml/ai research
"how big can it go" is a valid research question imo
yeah where is the line where we say
this isnt cost effective
or this is providing diminishing returns, etc.

also honestly "we did the biggest model" is good press
especially if your gpu brand is attached to it (nvidia)
can someone help me with this
from keras.models import load_model
model = load_model('best_model.hdf5')
it says OSError: SavedModel file does not exist at: best_model.hdf5{saved_model.pbtxt|saved_model.pb}
yeah how else are you supposed to get more funding ig

best_model.hdf5 is not a file in your current working directory, apparently.
if you don't know your current working directory, one solution is to provide the entire file path, from the root of the file system. that way the CWD doesn't matter.
Why not?
and now it says raise IOError(f'No file or directory found at {filepath_str}')
OSError: No file or directory found at speech_commands_v0.01/best_model.hdf5
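Both errors point at the path rather than at Keras, so it can help to resolve the path and check it before calling `load_model`. A small sketch using only the standard library; `resolve_model_path` is a made-up helper name.

```python
from pathlib import Path


def resolve_model_path(path_str):
    """Turn a possibly relative path into an absolute one and fail early,
    with the current working directory in the message, if no file is there."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found (current working directory: {Path.cwd()})"
        )
    return path


# Shows where relative paths are being resolved from.
print(Path.cwd())
```

Usage would be something like `model = load_model(str(resolve_model_path('best_model.hdf5')))`. If that raises, the message shows the absolute path that was actually tried.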
I've seen analogy somewhere that these huge models are like F1 cars to 'normal' cars that get ppl from A to B :)
right, and as wasteful and goofy as f1 seems, the tech does actually filter down to real cars (at least once in a while)
(f1 is also fun to watch)
There are lots of GPU cloud services. Does anyone have experience with any of these services? What's the best budget option? I need to get some Python code running on GPU via Jax and Numba but I don't have an Nvidia computer.
i think the go-to "freebie" recommendation is google colab. but i think maybe you have to put your data on google drive and use the stinky stinky colab notebook interface
unless there's a way to upload a notebook or .py file
I mean, colab isn't that bad tbh...
for a free service
I think it might be a problem tho if you want to host it 24/7 for free tho, because colab will ask you to verify you're still there after a while I think, you can't just afk
the interface is pretty horrible, but free is free
yea true
the colab notebook shuts down after a few hours and you might get scaled up or down
so yeah it's not useful if you need always-on compute
Google Colab is not an option. I'm developing a Python package so I need to edit Python files, run tests, etc.
yea, but that makes sense tho, I mean it's a free service, they also still need to make profit for all that computing
hmm..
I have heard that Paperspace CORE is pretty good, never tried it tho
oh apparently kaggle has gpu compute hosting
i think it's like colab though, notebook-based
idk i saw it on r/machinelearning
im seeing recommendations for paperspace, vast.ai, and lambda labs in some threads
thought they only had like shared models/spaces and stuff
idk! never used it
A lot of the services are just "user-friendly" wrappers around Google cloud, Amazon aws/ec2 or whatever they call it now, and Microsoft Azure or whatever they call it now. If you go straight to the source, it will be cheaper and you have any features you want.
hmm, could this be it?: https://www.kaggle.com/code/dansbecker/running-kaggle-kernels-with-a-gpu/notebook
this is at least a guide ig
idk, I don't rlly use kaggle that much tho, I mostly use it to get datasets from ig
You will need to know how to ssh into a server and setup whatever it is you are doing yourself (not too hard).
Inconclusive findings L. Did vader sentiment analysis of musk tweets with doge and doge prices
Beginner, so choice of tools may have been the issue or small sample size
JarvisLabs gets good recs in the fastai community for price and performance. I didn't use it myself though. All the free options are just for toy projects i think. Colab can block you from using GPU if used too much, sagemaker studio lab and paperspace have limited free GPU resources, meaning there are times you won't get a free GPU. Kaggle has a limit of ~40hrs per week.
If you need some actual speed (not tiny GPUs), and you are just testing things for now. I would buy your own GPU since you will be using it a lot, and the cloud services will end up costing the same or more for that high speed over that long usage time.
Cloud services are when you don't want to buy like 16 GPUs.
(Worth it if you can afford it though)
or 96 
sorry i couldnt help it
ignore me
Also if you are using your own custom kernels and not something like tensorflow, you can run them on AMD GPUs. Pytorch has ROCm now, but it's only kind of there.
Numba is an option, but there are also other options like pyopencl.
You need to think about power consumption, a place to keep it, the heat it generates, and how fast it becomes obsolete ...
Something like an Nvidia Titan is still not obsolete and won't be for a long time (and it's old). And if it is, it's fast enough for DL stuff.
For power, it's one GPU, not like 16, so standard gaming PC power consumption.
If you use it a lot it uses a lot of power. Now power is cost. Worth including in your calculations.
You need at least one GPU for DL, but not more unless you want to make really big models.
Titan is like 3, soon 4 generations behind?
I do, and it's still less than the cloud options (depending on where you live).
That's because the cloud is charging you extra for the whole UI and setup and such, and scalability on demand (to more GPUs).
But if you are just testing for now, one GPU is enough.
Oh and the noise. Need to be happy to live with constant noise from the thing
Yea, get some cheap sound pads that you can stick on the walls.
I prefer pictures on the wall to sound pads
Unless you have some spare room for the rig, you are going to feel, hear it.
Yeah, either you are ok with your room looking like a video game streamer's room (with the loud GPU and sound pads) or you will have to pay extra for cloud.
Especially if you use it a lot, which i assume you do
Personally I have a server room with a bunch of small computers and larger ones running things. But not everybody is a computer nerd with a computer room.
(And old computers because I like messing around with them for fun)
That makes sense
looks conclusive to me: sentiment is very weakly and noisily related to price. i would be interested to see price and sentiment time series
theres certainly other correlations to consider
e.g. lead/lag cross-correlation and autocorrelation
Thank you! I will look into those approaches.
I think I'll get an Nvidia Jetson. Seems to be a good compromise on cost and power requirements.
can't you build a decent workstation for not much more than that?
depends on budget of course
Yeah, the Jetson series is for when you want something smaller, especially to run on a robot.
And/or lower power consumption.
Most of them are still bought out though I think. It's hard to get anything from Nvidia right now.
certainly for $1300 you can buy a lot of cloud compute hours
Perhaps try looking at other things, such as the (log) change in price or volatility
Maybe you'll find something interesting
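A rough sketch of the two suggestions above: turn prices into log changes, then check the correlation at several leads and lags rather than only at lag 0. It assumes both series are sampled at the same regular interval (e.g. daily); the function names and the synthetic data are made up for illustration.

```python
import numpy as np


def log_returns(prices):
    """Log change between consecutive prices; the result is one element shorter."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))


def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for each lag in
    [-max_lag, max_lag]. A peak at a positive lag means x tends to lead y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of the same length")

    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result


# Synthetic check: y is x delayed by 3 steps, so the peak should be at lag 3.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=500)
y_demo = np.roll(x_demo, 3)
correlations = lagged_correlation(x_demo, y_demo, max_lag=5)
best_lag = max(correlations, key=correlations.get)
print(best_lag)
```

With real data, `x` would be the sentiment series and `y` the log returns of the price, trimmed to the same length, since `log_returns` drops one point. A constant series gives `nan` correlations, and with small samples any peak should be treated with suspicion.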
Yeah for something not too crazy in terms of speed, like 43 days (running 24/7). Assuming like $1.25/hr.
A 3080 TI is going for about the same amount as that Jetson AGX Xavier.
And that is a pretty big difference.
3080, not TI, is much less and still faster.
Since you only really care about the GPU, you could buy a real cheap CPU or if you already have one, reuse it.
Guys, I needed some help understanding the theory behind feature selection. How is it determined that the current score with the newer feature added would be higher than the one without it or not?
I am not asking how to implement it. That i understand.
the problem is that there basically is no theory
Oo
How is it determined that the current score with the newer feature added would be higher than the one without it or not?
either you actually fit 2 different models, or you use heuristics
I used heuristics thing. That hill climbing technique
e.g. mutual information is a principled heuristic based in information theory, but it's still just a heuristic
Though I don't understand how does that hill climb graph come up.
what do you mean? "heuristic" is fancy jargon for "rule of thumb" or "educated guess"
it'd be useful to see what you mean by "that hill climbing technique"
Oh. My English is weak.
it does look like you can apply hill climbing to feature selection, but i haven't personally used it
What i was taught is. We added one feature to the data randomly. And then checked if model accuracy is higher or lower than without it. And based on that we keep it or discard it.
Sometimes features by themselves don't say much, but in combination might say a lot
say for predicting how a recipe will be rated, if it has caramel it might not be rated well, if it has apple it might not be rated well, but caramel and apple would be rated well
So some features are only useful in combination
In which case adding 1 feature and checking the performance wouldn't work too well
Now, there was this dataset with 13 features which gave 73% accuracy without Feature selection. Then with FS i got 86% accuracy by selecting 12 features.
So i thought maybe it was just 1 "bad feature" and rest all good features. But then later on it worked even better with some other 9 features. Now I don't understand if there were 3 more "bad features" why weren't they dropped when I ran my FS first time.
That kinda depends on a lot of stuff
the features themselves, the model, if that is training or testing accuracy etc.
Most deep learning models are very capable of figuring out what features are useful
Ah. Got it. So let's say i was doing that FS and added just caramel first. It might not improve my accuracy and my algorithm might end up discarding it.
But if in my random order, the apple comes first before caramel, then caramel would turn into a "good feature", right?
Yeah, if you add them separately they might be worthless
and you discard them
but if you add them at the same time it would make it better
We were asked to use that randomness a couple of times until you stumble across the best state.
i like shapley values for visualizing feature importance, not sure how those relate to deep learning, they provide some good insight on supervised learning though. .. unless there's heavy multicollinearity, in which case the related features are competing for importance
I think L1 regularization also just makes less important features converge to zero
Which would be some way to determine which features might be important
@tacit basin what tf does a ML engineer do that DS can't
Maybe we could have run all combinations of say 5 features. That would be 6^5 i think.
it would be 13*12*11*10*9
I took example of a case where there are 5 features only
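The counts are easy to mix up, so here is a quick check with the standard library. It distinguishes three different questions: ordered picks of 5 out of 13 (the 13*12*11*10*9 figure), unordered 5-feature subsets of 13, and every possible subset of a 5-feature dataset.

```python
import math

n_features = 13
k = 5

# Picking 5 of 13 features where order matters: 13 * 12 * 11 * 10 * 9.
ordered_picks = math.perm(n_features, k)

# Picking 5 of 13 features where order does not matter.
unordered_subsets = math.comb(n_features, k)

# Every subset of a dataset with only 5 features: each one is either in or out.
all_subsets_of_five = 2 ** 5

print(ordered_picks, unordered_subsets, all_subsets_of_five)
```

For feature selection the order of features inside a subset does not matter, so the unordered counts are the relevant ones. Trying every subset of a 5-feature dataset is cheap, while the number of subsets grows exponentially with the feature count.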
But something that is also important with this is to start in a few different initial states
And hopefully one will lead to a high local maximum
Yes that's what we were told. Though I don't really understand how that graph came
You don't understand the graph?
i am not sure myself, the whole DS/MLS/MLE/MLOps is confusing to me
but i don't believe majority of MLEs create libs from scratch
The graph is pretty abstract
Or maybe I don't understand the graph
also not sure why the y-axis is objective function
you would think it would be performance
Ah. So it just tries to denote that if caramel came before apple, then there's only a local maximum you can reach, right?
Objective function is performance only I think.
data scientists, who curate datasets and build AI models that analyze them. It also includes ML engineers, who run those datasets through the models in disciplined, automated ways. (source: Nvidia)
It just shows that the method you are using where you keep adjusting your model slightly only when it increases the performance
And if you do that, you will reach a local optimum
It's just an abstraction of what we observed through that way of FS
as you can only go up
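A sketch of the procedure as described in this conversation: visit features in a random order, keep one only if the score goes up, and repeat from several random orders, keeping the best result. The scoring function is a made-up toy that mimics the caramel/apple example (a real one would train and evaluate a model on the chosen columns), and `hill_climb_selection` is a hypothetical name.

```python
import random


def hill_climb_selection(features, score, n_restarts=5, seed=0):
    """Greedy one-at-a-time feature selection with random restarts.

    `score` takes a frozenset of feature names and returns a number where
    higher is better. Each restart can only move uphill, so it ends in a
    local optimum; restarts give several chances at a better one.
    """
    rng = random.Random(seed)
    best_subset = frozenset()
    best_score = score(best_subset)

    for _ in range(n_restarts):
        order = list(features)
        rng.shuffle(order)
        selected = frozenset()
        current = score(selected)
        for feature in order:
            candidate = selected | {feature}
            candidate_score = score(candidate)
            if candidate_score > current:   # keep the feature only if it helps
                selected, current = candidate, candidate_score
        if current > best_score:
            best_subset, best_score = selected, current
    return best_subset, best_score


def toy_score(subset):
    """Made-up accuracy: caramel alone hurts, apple alone barely helps,
    but the two together help a lot."""
    value = 0.70
    if "sugar" in subset:
        value += 0.05
    if {"caramel", "apple"} <= subset:
        value += 0.10
    elif "apple" in subset:
        value += 0.01
    elif "caramel" in subset:
        value -= 0.02
    return value


best_subset, best_score = hill_climb_selection(
    ["caramel", "apple", "sugar"], toy_score, n_restarts=50
)
print(sorted(best_subset), round(best_score, 2))
```

In this toy setup, a run that visits caramel before apple discards caramel and gets stuck at a lower score, while a run that visits apple first ends up keeping both. That is the kind of order dependence the hill-climbing graph is hinting at.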
Hey, whats the best method to recognize the mood of a piece of text?
try huggingface
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")
[{'label': 'POSITIVE', 'score': 0.9598047137260437}]
i read through some documentation regarding sentiment analysis and it seems pretty daunting to use in a beginner project, what do you think?
Yeah if you need to understand how it works and you are a beginner in ml and python, definitely not in reach rn
You can just use it, but you wouldn't understand how it gets the results
If that's not a problem then no biggy
im just looking for projects to work on rn,i completed my discord bot project for my server and the only other project idea revolved around this topic
It wouldn't be much of a project, just downloading a model someone else made that is really complex
There are simpler methods, but they wouldn't work as well
bro where is @serene scaffold
guess what
my boss wants to try use that model now and create some sort of internal API
even tho the paper recommended at a minimum 2 GPUs

i know negative DevOps knowledge but he wants to try to stand up something

this is karma
for me poking fun at GPU architectures

deploy models into production
ok you created the model
now get it out of your notebook and integrate it into a codebase and production environment

unless you want the DS to create their own APIs and containers and do cloud configuration

imho i think that would be a valuable skill though but thats me
also
this is really cool if you havent seen it
join waiting list :/
lmao, you have to fill in all your social media, so only influencers get access probably
Does look really cool
I'm an MLE. Most of my time is spent doing pipelining and helping models get from notebook-to-production. We also help DS with internal tooling. I'm on a smaller team that also functions as Data Engineers, so some time is split there. EDIT: This was to respond to an above question and the reply didn't work on my phone. I'm not just randomly trying to flex about MLE stuff.
Did you get access then? @misty flint
So why do they get paid more, what's hard about taking DS code and running the real data into it
for DALL-E? no
ah question. if you had to stand up something for this model for like an internal API on aws, how would you do it? https://github.com/EleutherAI/gpt-neox
they say you need 2 GPUs at a minimum
for model inference
stand up something?
otherwise i heard you can just use lambdas/serverless stuff
uhhhh idk its just what my boss said
i think it means make available for API calls

ah yes CLIP is for sure cool
That they use pca to compress the semantic information
and it actually gives like more raw images of the important parts
I'm honestly not sure what this is --- lemme take a look for a sec.
yeah i wonder why that is
I have no clue how most of this works lol
Ah, okay, well, it can be containerized. So, that's a good start. So, there'd be two parts to this:
- Development
- Deployment
For the former, is it intensive to work locally [jupyter, etc.]? For the latter, I think I'd prob either spin up a docker container (if not very intense) or have an EC2 which I can pop up and run the stuff on-demand.
But the more I see this kinda stuff, I really get into computer vision and image generation
If the former is too work intensive, there's some AWS solutions, but a lot of them are $$$$.
For it to be a training platform (it sort'a looks like it might kind of go along with some platform tools) I might spin up a K8s cluster with TensorBoard + Jupyter + some training images or something. I'm not 100% sure since I don't know the service well.
Yeah. If you know you're going to be running a lot, or you don't particularly care when it's run, you might be able to save money with spot instances / reserving.
But tl;dr, the easiest solution to spin one'a these things up is to try to Dockerize it locally. Once you get that, you will prob know the gist of how you want it to look / work in general and can translate to on-demand/whatever EC2 stuff.
do it. im doing a project on GANs just to learn more about it. i dont really like CV tho 
i see, i see

wait
Yeah, I think cv might be a good basis though rn
wont we run into the problem of not having enough GPUs though?
Actually, doing Docker with GPU is a little weird. Hm. I don't think I have done that before.
It's not all relevant, but still some useful stuff in there
I think this [https://docs.docker.com/compose/gpu-support/] might tell you a bit more --- it seems like you may be able to specify it in docker, but the last time I tried this was a few years ago, so I'm not sure.
Usually for our GPU stuff we're not testing new things, it's kind'a old stuff, so we already know the way to configure the EC2s et al.
oh thanks buddy. really appreciate it.

i will let you know what i find out
I wish I could be more helpful, but this is out of my paygrade! Haha, if you find cool stuff out, lemme know.

Oops wrong channel my bad but thanks for answering
I just haven't done much with NLP atm
i think its bc its easier for me to see product opportunities
So can't say that much about it
I like CV more I'm just a visual person
with those models
RNNs in general confuse me a little bit atm
But I guess you'd have to know a lot about NLP too if you wanted to create stuff like Dall-e 2
Yeah not sure either
but transformers are super important in understanding modern NLP yeah
do most transformer models not use RNN anymore?
Or is transformer just implicitly not recurrent?
the latter
they are not recurrent and they dont use convolutions
thats what allows them their speed and parallelizability
dall-e 2 definitely uses transformers it seems
The paper actually doesn't look too too bad, might read it sometime
only computer vision to go
good luck
tyty, you too
transformers use attention, attention is not recurrent. it's just pairwise relevance scores for every pair of items in the sequence
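For anyone following along, the "pairwise relevance scores" idea can be sketched in a few lines of NumPy. This is only a minimal single-head sketch, not how any particular library implements it: the function name `scaled_dot_product_attention` and the toy input are made up here, and using the same array for queries, keys and values is a simplifying assumption (real transformers apply learned projections first).

```python
import numpy as np


def scaled_dot_product_attention(q, k, v):
    """Single-head attention for one sequence.

    q, k: arrays of shape (seq_len, d_k); v: array of shape (seq_len, d_v).
    """
    d_k = q.shape[-1]
    # One relevance score for every (query token, key token) pair.
    scores = q @ k.T / np.sqrt(d_k)
    # Row-wise softmax (shifted by the row max for numerical stability).
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value rows.
    return weights @ v, weights


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # 4 toy tokens with 8 features each (assumed data)
out, weights = scaled_dot_product_attention(x, x, x)
print(weights.shape)  # one weight per pair of tokens
print(out.shape)
```

Nothing here loops over positions one after another, which is the sense in which the computation is not recurrent.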
bag of words wins again
Yeah, I just thought transformers could be used with recurrent connections
But it was attention that was used with RNNs too
But like I said, not really comfortable with transformers or attention haha
yeah i'm not actually sure if you can mix recurrent and transformer
maybe there's a way
if anything, transformers are an alternative to graph neural networks (but as far as i know they have some special behavior that makes them not a strict special case thereof)
its a dual architecture, CLIP is used for rankings while a diffusion model (GLIDE) is used for sampling
transformers model recurrence implicitly - autoregressive models show that more than anything.
it's just pairwise relevance scores for every pair of items in the sequence
and that interpretation breaks down very quickly. specifically, its meta-learning (learning to learn) and as schmidhuber likes to claim, a special case of his fast weights paper. in essence, attention is hard to understand because we can't really interpret what its doing - but the standard guess is that it also learns which weights to "attend" to (weigh more) as well as paying attention to tokens as well
its pretty hairy, but suffice to say attention does a lot of things at once - and it being Turing complete allows it to learn things we won't expect
for instance, it can also learn the positional encodings when the sequence is not provided with them, with only a little drop in performance. amazing stuff
what's ig?
what're the biggest differences between autoregressive models vs. bidirectional ones (besides architecture). you find any good resources on that? 
what's a transformer ?
and whats a bidirectional model, RNN, attention, autoregressive model, and diffusion model?
instagram data sets?
oh ok
not sure what ig stands for lol
bidirectional are usually encoder-only, the attention of each token is calculated w.r.t. every other token
doubt it's instagram..
is it true bidirectional are better for classification tasks
autoregressive masks future tokens while training, because that's what it's supposed to predict
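The difference between the two can be sketched as a mask on the attention scores. This is a minimal NumPy sketch under assumptions: `attention_weights` is a made-up helper, and the all-zero score matrix is just a toy input so the effect of the mask is easy to see.

```python
import numpy as np


def attention_weights(scores, causal=False):
    """Row-wise softmax over attention scores, optionally hiding future tokens."""
    scores = np.array(scores, dtype=float)  # copy so the caller's array is untouched
    if causal:
        seq_len = scores.shape[-1]
        # True strictly above the diagonal: position i must not see j > i.
        future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores[future] = -np.inf
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


scores = np.zeros((3, 3))  # toy scores: every pair equally relevant
bidirectional = attention_weights(scores)
autoregressive = attention_weights(scores, causal=True)
print(bidirectional)
print(autoregressive)
```

In the bidirectional case every token spreads its weight over the whole sequence; with the causal mask each token only spreads it over itself and earlier tokens.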
yep, mostly. though it doesn't really matter because an autoregressive model can be treated as a classification model too
I've no idea what that means
i see
lol
Is there any recommendations for getting more into deep learning, like on transformers and stuff?
prolly yannic Kilcher's videos
Kinda went through most of the stuff but kinda abstractly in a deep learning course
what's ur bg?
I'm an AI master
huh. I suppose then you'd have enough knowledge to read the paper directly?
i found this really helpful for DL https://d2l.ai/
no idea what masters in AI means on a technical level
@mild dirge helps give some context
Yeah i've read the paper, understood the gist of it
and it has code too
ah that's cool thx
It's broad broad tbh
We learn a bit about machine learning, but also just coding in general and cognitive modeling etc.
interesting.
But I kinda like going into the machine learning part more than the others
in my program, you specialize. i chose machine learning as my concentration
for a deep dive into NLP, I suppose you'd have to strike out on your own
but there's a ton of information online.
I kinda started liking books a bit more lately
well, most SOTA applications boil down to transformers
Reading blogs just feels like a very abstract overview
there are some really good ones out there
and most online material is then just videos or blogs
I watch his videos, and pretty much all the popular ai channels, but I think they either assume prior knowledge or are surface level
where can i find a good online resource to learn stuff
its more like a primer b4 reading the paper
tons
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
so many
how do you find them?
oh ty
I figured, but idk where to look lol
I had a talk like a few days ago with my prof, and I mostly addressed that the course was quite broad, and maybe would have liked some more in-depth stuff
I don't pretend I understand all the deep mathematical details in multiple papers anyways; but its a good idea to see his review, then read the paper, then go over the maths
well, what do you want to focus on lol
But he kinda argued that deep learning is just broad, and you should do other courses for in-depth
it really is too broad
is it?
for now, just a basic understanding of everything, before deciding what to focus on
just seems like a few architectures dominating
its not like they're forcing you to learn NeRF
maybe you should take an online course then
that can help you decide
what kind of course

@coarse burrow I would recommend brushing up on linear algebra and other topics
there's plenty resources in pinned comments
Is this actually free?
yeah
wow, cool
completely
ook
the code too lol
i would probs just choose the chapters youre interested in and skip around as needed
at least thats how i did it
wow, I'll check it out
ik
might be overwhelming but maybe it can give you an idea of whats out there
Lmao thats how I do it too...imma drinking coffee too rn as usual
I like it so far
some of it is familiar from yannic's vids
bro me too. sweet coffee tho since its friday night

and im doing homework last minute
yet again

see if you can get your boss to buy one more GPU than you actually need to make it happen, and then take it home.
i wish. we will probably go cloud native with this project

also the only reason why he can get funding for this project is bc it would be tied to our AWS account
which we can just throw money at
bc...
cloud

that was literally his reason

doesn't it literally work by computing pairwise relevance scores? whether or not you get interesting emergent behavior out of it is kind of unrelated
dont think MLEs are paid more than DS. source?
or possibly MLE is the new DS, new buzzword?
it's a bit different skill set (according to some theory, in practice i think it's a big overlap). i wouldn't say one is easier than the other.
One profession is paid more than the other not because it's harder, it's because someone is willing to pay more for it due to some business reason or hype i think.
Do data scientists even code that much
Usually people just slap whatever name they deem fit onto the job listing but there's not much of a difference between them
I think so too.
I would think that a machine learning engineer is someone doing research and/or making what they found more readily available for use (e.g. making ML based libraries). Someone in DS still needs to have a decent idea of how the ML stuff works, because the details can't be perfectly encapsulated, but may not be spending nearly as much time on it. They have other things to do. Someone in DS may not even make use of ML, just plain old statistics (not a statistician though, that is similar to a machine learning engineer in that they focus more heavily on it (and depending on what you are doing, you may need a statistician)). A data scientist uses whatever they can to analyze data and may actually be a data engineer if they also work on more than just the analysis of the data (not uncommon). But, in the end, it's just a job title. I'm ok with it not being well defined, it's like when "scientist" used to be an actual job title, because you can do more than one thing. The need to categorize it properly to limit job scope is also something though, so having a vague title can have upsides and downsides.
any idea how I can make the scale of the graph bigger so everything fits on my screen size
Sorry if this is the wrong channel. I have a question on plotting probability distributions here, and I'm still not sure about it: #help-dumpling message
ig = I guess
but it technically could also mean instagram lol
Yeah I read it as if ig was the one you got the datasets from instead of kaggle
but that makes sense haha
not really, which is what I said, attention to tokens is a byproduct. Attention originally was a meta-learning architecture that attends to matrix weights - a derivative similar to FWP. Its a pretty big mystery, but at any given time an attention head could be doing anything - one can't claim any particular head to be doing x unless they can prove otherwise
Attention maps by their very definition aren't interpretable. on papers, they constrain to create pretty maps which go well with their theories, but in reality its a pretty mess - as it should be
howdy folks, struggling with a pandas question over at the coconut channel if anyone is available
#help-coconut message
Thanks @serene scaffold for helping me solve this issue
hello all. Pls, does anyone have experience web-scraping meteorological data from the ERA5 website?
plt.figure(figsize=(12,6), dpi=100)

i converted my h5 model to tflite using this
tflite_model = tf.keras.models.load_model('my_model.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(tflite_model)
tflite_save = converter.convert()
open("my_model.tflite", "wb").write(tflite_save)
but when i try to classify images it just predict the same class always its like broken model
while the h5 model is working fine i just converted it to tflite and its like broken
i used the resnet50 model from keras and trained with preprocess_input for resnet50 with
from keras.applications.resnet import ResNet50, preprocess_input
do i need to apply this preprocess on inputs for my tflite model?
reading this https://www.ecmwf.int/en/computing/software/ecmwf-web-api says they have an API you can use, so you don't need to scrape?
Great, thanks @cursive wing
Any idea on the best way for finding boundaries of a basketball court? I have tried several solutions, color space segmentation, hough line transform to find the boundary lines, was planning on trying GMN but i don't have a dataset available. Any suggestions?
Boundaries or the Court its self as a mask, either works
They're paid vastly more and the position requires a phd
You mean MLEs earn more and require PhD? Do you have any data for that. Would be interested to read it.
I didn't notice this vast pay gap. But i didn't do any proper study on it. From my experience PhD is not a requirement for MLE jobs. I would say PhD would be needed for ML Researchers i think
For organisations that intend to write papers and submit to NeurIPS etc. Bcs that's what PhDs are trained for.
For say applied ML PhD is not required in my opinion
Or if you are startup and need PhD in a team to convince investors. That also I've seen.
Just looked up indeed for US. Average yearly DS: 120k, MLE: 130k. That's indeed around 10% more on average.
Glassdoor median : DS 122k, MLE 123k. That's less than 1% more.
Hello, i have been trying to implement grad cam and i have a question, can i take the outputs after training my model and pass it into grad cam?
Or does grad cam only work with one image?
Come to london
U will see
No chance to win these jobs
guess I need some real help here. So I used a tool to annotate my image
however I already got annotations in the format of images, but I am not sure how to get it into JSON/COCO format for my Mask RCNN
I would rather not spend ages on each single one using the annotation tool if the work has already been done
I see. Where to look for London jobs?
Good question
North end
i need help with installing some packages
i'm trying to install nltk
but every time i install it
there is something missing in the resources
How do you install it? Os?
with pycharm just click install
You could try install instructions from nltk website and see if that works
Trying to generate a bunch of (0, 1)'s in jax
Like this:
[[0, 1]
[0, 1]
...
]
Don't know how to do it since jax arrays are immutable
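One way to sidestep the immutability issue is to build the array in one go instead of filling it in place. The sketch below assumes the goal is just `n_rows` copies of `[0, 1]`; `n_rows` is a made-up name, and the NumPy fallback is only there so the snippet still runs where JAX is not installed (`tile` takes the same arguments in both).

```python
try:
    import jax.numpy as xp  # the question is about JAX
except ImportError:  # fallback so the sketch still runs without JAX
    import numpy as xp

n_rows = 5  # hypothetical number of rows
pair = xp.array([0, 1])
# tile() returns a brand new array, so nothing has to be mutated afterwards.
rows = xp.tile(pair, (n_rows, 1))
print(rows.shape)
print(rows)
```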
hi guys
can someone pls help me figure out a solution to a problem im facing
basically i have this review file
or another
want to implement something like this
on every date taking avg of the numbers from all previous periods
but i dont want to assign same weight to everything
i want the more recent ones (top X percentile most recent ones) to get more weight
can we do this
@lapis sequoia there won't be an idiomatic way to do "the average of all previous values", but you can still do it iteratively.
can you be specific about how you want the weights to be calculated?
Sometimes I wish I had the time to learn actual coding
Can't wait to graduate and focus on that
what is your degree in?
thanks bro
it's a DS degree
does anyone know how to do Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample. in python jupyter notebook
I mean more like data structures and algorithms
but im looking for a way to give it a weight
eg
it's just like on 1/1/2021, let's say we assign a value to every review that comes before that date
eg if it's 1 period ago, it's 1
if it's 2 periods ago, 2
etc
and then , let's say we get N
we find closest whole number to 0.2N
so let's say N is 60, so 0.2N (top 20%) is 12
so the first 12 reviews, on that date, we multiply all the values by W , eg 3 - and then we calculate the mean as we did before
and then on the next date we repeat the same thing
etc
u know?
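The scheme described above can be sketched iteratively in plain Python. Several things are assumptions here, not something stated in the chat: one review score per period, the result being a weighted average (dividing by the sum of the weights rather than by N), and the names `recency_weighted_mean` and `running_weighted_means`. Python's `round` is used for "closest whole number", which rounds exact halves to the even neighbour.

```python
def recency_weighted_mean(values, top_fraction=0.2, weight=3.0):
    """Weighted mean of `values`, which are ordered from most recent to oldest.

    The most recent round(top_fraction * N) values get `weight`, the rest get 1.
    """
    n = len(values)
    if n == 0:
        return None  # nothing came before this date
    k = round(top_fraction * n)
    weights = [weight if i < k else 1.0 for i in range(n)]
    total = sum(w * v for w, v in zip(weights, values))
    return total / sum(weights)


def running_weighted_means(scores, top_fraction=0.2, weight=3.0):
    """For each period, the weighted mean of all scores from earlier periods.

    `scores` is in chronological order, one score per period (an assumption).
    """
    results = []
    for t in range(len(scores)):
        previous = scores[:t][::-1]  # earlier periods, most recent first
        results.append(recency_weighted_mean(previous, top_fraction, weight))
    return results


print(recency_weighted_mean([5, 4, 3, 2, 1]))
print(running_weighted_means([1, 2, 3]))
```

With a pandas DataFrame the same loop could run over the rows sorted by date; the per-date logic would stay the same.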
the answer will be the same for a given version of numpy whether you're using a jupyter notebook or any other Python environment.
what do you mean by "single sample"? if there's only one element in the array, reshaping to (-1, 1) or (1, -1) will both be (1, 1)
I'm not mentally invigorated enough to follow this at the moment, unfortunately
you can wait and see if someone else answers, or you can ask again another time. but I can't commit to looking at it again at a specific time.
ok thanks
Hey @livid lance!
You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.
because https://pastebin.com/NYs44rfb
thats the error i got
thank you for showing the error message. is [8.21304911e-01 1.44999358e-37 8.76388929e-01] a "sample" in your case? is it always three elements?
yes
the total number of elements is the size of the array. if the size of the array is 3, then it is one sample. so you can use that to make an if statement that does the right reshaping.
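The if statement suggested above might look something like this. It is only a sketch: `to_2d` is a made-up helper name, and the default of three features comes from the "always three elements" answer earlier in the conversation.

```python
import numpy as np


def to_2d(array, n_features=3):
    """Reshape a 1-D array into the 2-D shape scikit-learn estimators expect."""
    array = np.asarray(array)
    if array.ndim == 2:
        return array  # already (n_samples, n_features)
    if array.size == n_features:
        return array.reshape(1, -1)  # a single sample with n_features features
    return array.reshape(-1, 1)  # many samples with a single feature each


sample = np.array([8.21304911e-01, 1.44999358e-37, 8.76388929e-01])
print(to_2d(sample).shape)
print(to_2d(np.arange(10)).shape)
```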
ok
from sklearn.model_selection import train_test_split
Xtrain, Xtest, ytrain, ytest = train_test_split(pca.explained_variance_, pca.explained_variance_ratio_, test_size=0.35,
random_state=42)
SXtrain, Xvalid, Sytrain, vtest = train_test_split(Xtrain, ytrain, test_size=0.3,
random_state=42)
this is the code for the array
@livid lance this doesn't really tell me anything unless I know what every variable is. but scaler = preprocessing.StandardScaler().fit(SXtrain) is where your code starts to go wrong, per the error message
as print(pca.explained_variance_)
print(pca.explained_variance_ratio_)
used that
@livid lance you probably need to reshape SXtrain before passing it to StandardScaler().fit
Ok
How does this model look?
Should I add a pooling layer between those Conv2d layers that follow each other, to reduce trainable params?
Is this tf?
Mhmm Keras
does keras not show activation in this table?
```python
model = keras.Sequential([
    keras.layers.Conv2D(input_shape=(400,400,3), filters=64, kernel_size=9, strides=2, activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2),
    keras.layers.Conv2D(filters=128, kernel_size=7, strides=2, activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2),
    keras.layers.Conv2D(filters=256, kernel_size=5, strides=2, activation='relu'),
    keras.layers.Conv2D(filters=512, kernel_size=3, strides=2, activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(5, activation = 'softmax')
])
```
Doesn't look like it, but I used relu for my Conv2d layers and then softmax for my dense layer
Seems like quite a lot of channels for the last few layers, not sure if you need that many with images of size 400x400*
But then again, I haven't made that many CNNs, so just give it a go ig
I tried training it a couple of days ago and got 0.8 accuracy and the matrix looked fine, but i wanted to try improve it so I done some more data augmentation and just trying it now again
There's so many parameters you can change
But what sparked the question, I read on a paper they were trying to lower the number of trainable parameters, so was just wondering if I should do that too
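To see where the trainable parameters in the model above actually sit, the counts can be worked out by hand. This is a rough sketch in plain Python, assuming Keras' default `'valid'` padding for both the Conv2D and MaxPooling2D layers; `conv_out` and `conv_params` are made-up helper names.

```python
def conv_out(size, kernel, stride):
    """Output length along one spatial axis with 'valid' padding."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


size, channels, total = 400, 3, 0
layers = [
    ("conv", 9, 64), ("pool", 3, None),
    ("conv", 7, 128), ("pool", 3, None),
    ("conv", 5, 256), ("conv", 3, 512), ("pool", 3, None),
]
for kind, kernel, filters in layers:
    size = conv_out(size, kernel, 2)  # every layer in the model uses strides=2
    if kind == "conv":
        total += conv_params(kernel, channels, filters)
        channels = filters
    print(kind, "-> spatial size", size, "channels", channels)

flat = size * size * channels  # what Flatten hands to the Dense layer
total += flat * 5 + 5  # Dense(5): weights plus biases
print("flattened features:", flat, "trainable params:", total)
```

Pooling layers have no trainable parameters of their own. They shrink the spatial size, so they mainly affect how many features reach Flatten and the Dense layer, while the Conv2D counts depend only on kernel size and channel counts.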
It's good to just try a few configurations out if it doesn't take too long to train and test
Guys I am trying to classify my text, I have multiclass classification, are these layers good? I am trying to improve accuracy.
Takes about 15 min per epoch >.>
I'm not, I've been trying to get it working but it's not been working tbh
gpu is definitely much quicker than cpu
I've deleted my tensorflow and downloaded tensorflow-gpu
but I can't even get past the imports
@mild dirge could you take a look please? seems u got an understanding of it.
absolutely not haha, I have just used CNNs a few times and know how they generally work, not an expert on what shapes and layer combinations are optimal
You can't even know without knowing the problem and the setting
):
There's some other people here that know more about nlp, not for me to answer srr
yeah im not sure why it doesn't see it
this weird there is no dropout
I have a dropout layer at the end, should i be adding more?
can someone help me make loop
500 ?
I have an array of size (10000, 784) (as you guessed, mnist), how do I reshape it to (10000, 28, 28)?
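A minimal NumPy sketch for this, using an all-zero stand-in array in place of the real MNIST data (the names `flat` and `images` are made up):

```python
import numpy as np

flat = np.zeros((10000, 784), dtype=np.uint8)  # stand-in for the real data
# -1 lets NumPy infer the number of images; each row of 784 becomes 28 x 28.
images = flat.reshape(-1, 28, 28)
print(images.shape)
```

Each row is split into 28 consecutive chunks of 28 values, which matches the usual row-by-row pixel order of flattened images.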
When you know tell me
Are there Arabs here?
Guys, are there Arabs here
I need a little help
please use English to the best of your ability
the problem is i'm bad in english but ok i will try
is there arabs here guys i need some help
So FastText,
Is it only used to classify text into categories and word embedding?
it's for learning word embeddings. whether you then use those embeddings for a classification task is up to you.
Is 400x400 pixels too large for training, if so what should I try set it as?
400x400 is 160,000 pixels, whereas something like 244x244 is 59,536 pixels
So a HUGE decrease in pixel numbers
anyone know how to delete all image above red line or make image transparent above red line
Why I get an error like this? anyone can help me?
i have coordinates of line if easier
hello everyone
Howdy
Your input is not matching what it's expecting, what is being run? Where is the input?
this is my input
You may read the error. It is written in plain form.
Your model expects 6 features. I assume one of these is your Y. So make another df which has just 6 features.
i am a high school student having crazy love for coding and want to establish my career in electrical engineering
If you mean econometrics as in maths for economics, I know an economics Discord server I can refer you to, in case no one here can help you
Just tag me, when you feel like you've waited long enough.
how does one learn machine learning ?
I'm currently reading https://mml-book.github.io/book/mml-book.pdf rn
Didn't read it. But seems great to learn mathematics for machine learning. Happy learning
Learn the models
I've studied how machine learning works in my chemistry degree as an undergrad
but now that i've gotten deep into software development, I want to have machine learning under the belt
is there any machine learning projects I can start off with?
ngl, I'm struggling a bit reading it
Try making a MLP from scratch?
Learning by doing works best for me. I started by finding a dataset on Kaggle that interested me. I wanted to implement it in my own environment (not a notebook). I looked at PyTorch docs and tried to copy logic from different notebook examples and try to get it running. Then find a way to write it in my own way.
I've probably done a hundred mistakes so far and will continue to do them, it's a part of the process. But I learned a ton more the week it took me to implement it than the books and courses I've taken previously.
yeah book are annoying to read @astral storm
best way is to just do it
just want to understand the maths behind it before i do it
@astral storm how did you get to know the maths ?
by reading a book haha
It's hard to learn the maths by just doing it
The best would be maybe looking at the series by 3 blue 1 brown, and then trying to implement it yourself
Written/interactive form of this series: https://www.3blue1brown.com/topics/neural-networks
I'm more of a top down kind of guy when it comes to programming. I dont fully grasp the underlying maths of ML and tbh you don't need to to be able to build ML applications. Don't get me wrong, the underlying maths is still important but if I was to start with learning that I would've lost my interest in ML a long time ago.
It's like you want to learn how a radio works, first I want to know what happens when I press Play, Stop, Pause etc. and then pick it apart; not the other way around.
But everyone is different and learns in different ways. There is no right or wrong.
Cheers for this
got it
and don't just use videos and blogs, a lot of that material just simply contains mistakes or explain the topic too abstractly
Books really are good for getting more in-depth knowledge, even if you don't like reading that much
any recommendation for books
the book i'm currently reading (https://mml-book.github.io/book/mml-book.pdf) is a bit annoying to read, wish it was simpler but I can just about manage it
I liked (Deep Learning with PyTorch by Eli Stevens and Luca Antiga) for how to do an entire deep learning project
But for the more basic subjects I haven't really found a good book tbh
okok thanks!
In this book they do explain most of the basics, but some of it is kinda assumed to be known
So they won't explain it too in-depth
got it
Hi, got a Question, is the idea of gradients the only thing needed for ML from multi variable calculus? I am going through it using Khan academy and its a lot of videos
"only" is a strong statement, but calculating the gradient is the first thing that comes to mind, yes. did you complete the part about partial derivatives?
yeah, I reached till multi variable chain rule, its only like the 30th video out of 175 but I was thinking of moving on to other topics
you should at least take away some intuition for multi variable integrals, if you plan to do any probability (which you should)
change of variables can be an important technique for probability
but otherwise honestly yeah... for a hobbyist/self-taught person, gradients are good enough to start with
you will want to learn about hessians too (2nd derivative) at some point, to build intuition about numerical optimization techniques that are in common use
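As a small worked example of both ideas, here is a finite-difference sketch of a gradient and a Hessian in plain Python. The function `f` is a made-up quadratic chosen so the exact answers are easy to check by hand (gradient `(2x + 3y, 3x + 4y)`, Hessian `[[2, 3], [3, 4]]`); `gradient` and `hessian` are illustrative helpers, not a library API.

```python
def f(x, y):
    return x**2 + 3 * x * y + 2 * y**2  # toy function for illustration


def gradient(func, x, y, h=1e-5):
    """Central-difference estimate of the two partial derivatives."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy


def hessian(func, x, y, h=1e-4):
    """Central-difference estimate of the matrix of second partial derivatives."""
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]


grad = gradient(f, 1.0, 2.0)
hess = hessian(f, 1.0, 2.0)
print(grad)  # compare with the exact gradient (8, 11) at (1, 2)
print(hess)  # compare with the exact Hessian [[2, 3], [3, 4]]
```

The gradient says which direction is uphill at a point; the Hessian says how quickly that slope itself changes, which is what second-order optimization methods make use of.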
the linear algebra is probably more important, and it's really useful to learn how to do calculus on matrices and vectors
I see, I see, came here to ask cause i spent some weeks learning game theory and the most I got out of it was the discount factor used for reinforcement
yeah, I tried completing the khan series on that but opted for 3blue1brown for now
yeah game theory is useful for certain applications but not for "garden variety" data science
you will want to actually take a real linear algebra course at some point. MIT 18.06 is very good (and free)
same with probability and stats. 30 hours of snippets from khan and similar are not equivalent to 30 hours of structured course material
XD I just saw what math topics were needed for ML and just went balls to the walls on topics. multivariable was the last one and I wanted to skip it.
thanks, i think I should give MIT free lectures a try
yeah you need it, but not in the way eg. a mechanical engineer might need it
the linear algebra professor in particular is famous for his 2005 course which was put online, his name is Gil Strang
you will have to spend time actively working through problems. do not assume you can just absorb knowledge by listening to a video
this was the mistake i made for my first ~2 years in university
it also seems to be a common fallacy among young people today
I do try to go through the questions but stopped caring half way. I will try to work through the problems for the MIT videos
you won't regret it!
so far I have tried to relearn linear algebra, probability, calculus, multivariable calc, game theory and discrete math
edit: statistics too, forgot about that (from watching statquest)
are there any subjects/topics you recommend?
that I might be missing out on
Found this interesting. Dimensionality Reduction using an Autoencoder in Python
https://medium.datadriveninvestor.com/dimensionality-reduction-using-an-autoencoder-in-python-bf540bb3f085?sk=70e0c203d872195d6b61b460d08a724b
Someone sent an article about DALL-E 2 the other day, which was pretty cool
Pretty impressive
hi need some quick help. found this code online for r. not sure what part of the code i replace with my code. like do i write your_df or write my own variable. im new
couldnt be me
how would one deploy a mobile app that involves machine learning algorithms originally written in python?

tensorflow lite
Hi does anyone have some materials on credit assignment problem and wants to share them?
Hi everyone, i have a deadline for school tomorrow, i have to pass my JSON variable to my python and execute it. I would appreciate it if someone could help me, thanks!
@raven linden looks like this is a #web-development question
Anyone know this error?
@modern cypress if that's all the information you can provide, your best bet is to Google the salient part of the error message. But if you want help here, it has to be text. No one wants to manually retype your error message
Sorry
```
---------------------------------------------------------------------------
InternalError Traceback (most recent call last)
Input In [16], in <module>
4 labels = []
6 for features, label in all_data:
----> 7 img = tf.convert_to_tensor(features, dtype=tf.float32)
8 x.append(img)
9 y.append(label)
File ~\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\util\traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
151 except Exception as e:
152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
154 finally:
155 del filtered_tb
File ~\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\constant_op.py:102, in convert_to_eager_tensor(value, ctx, dtype)
100 dtype = dtypes.as_dtype(dtype).as_datatype_enum
101 ctx.ensure_initialized()
--> 102 return ops.EagerTensor(value, ctx.device_name, dtype)
InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.```
I've tried to uninstall and reinstall tensorflow
I've tried removing tensorflow-gpu
so as to just run on the cpu
but none of that has worked so far
If anyone has an idea, I'd really appreciate it ^^
hey guys what is the difference between tangent and differentiation?
whats is their use?
cite some real life examples
a tangent is a line that just touches a curve at a given point and has the same slope as the curve there. let me make you an example
@frozen marten the dark blue curve is the curve of interest. the light blue line is tangent to it at the bottom point. the yellow line is tangent to it at one of the points on the side. see?
but whats the use of tangent? @serene scaffold
I'll get to that. do you understand so far?
keep in mind that mathematical principles are abstract, and they might not have obvious real-world applications. and the applications that they do have might not be related to each other.
but the point of differentiating a function is to figure out its rate of change
a real-world example is that speed is the rate-of-change of location, and acceleration is the rate-of-change of speed.
so every curve has a tangent and are called differentiable?
do you know the notations f(x) and f'(x)?
yep
f'(x) is the slope of the tangent to f(x)
yep! in my diagram, the light blue tangent line has a slope of zero. and the slope of the yellow line would be 3, or something like that.
depends on the function
what does it mean when a slope value varies?
you know how f(x) = 4x is just a straight, diagonal line? anywhere you go on that line, the slope is going to be 4.
yes.. but not true for the parabolic one u drew
you are right 
the slope gets lower and lower towards the "bottom", and closer and closer to infinity (or negative infinity) towards the tips
so that we can estimate
you don't need to estimate. the derivative tells you.
right i understood, the point where we want to can be replaced to obtain it
why infinity at tips?
because as the parabola continues forever and ever, the rate of change keeps getting higher
(or lower, on the left side)
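The parabola discussion can be checked numerically. This sketch assumes the curve in the diagram is something like `f(x) = x**2` (the chat never gives the actual formula), and `slope_estimate` is a made-up helper that approximates the tangent slope from two nearby points.

```python
def f(x):
    return x**2  # assumed stand-in for the parabola in the diagram


def f_prime(x):
    return 2 * x  # exact derivative of x**2


def slope_estimate(func, x, h=1e-6):
    """Slope of the line through two points very close to x."""
    return (func(x + h) - func(x - h)) / (2 * h)


for x in (-3.0, 0.0, 3.0):
    print(x, f_prime(x), slope_estimate(f, x))
```

The exact derivative and the two-point estimate agree closely: the slope is zero at the bottom of the parabola and grows in size the further out along the curve you go, negative on the left side and positive on the right.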
I always thought these two were the same
But then I figured I couldn't pass the second one into torch.mm
Any fix to this?
so acceleration has varying slope but velocity has const slope?
depends. if someone is going at the same speed constantly (they never speed up or slow down), what would their acceleration be?
0
if no change in speed
if they inc speed periodicaly in same direction and magnitude then there is possibility of const slope ig?
not periodically--they would have to be increasing their speed continually
at that time velocity will have non const slope...
hmm whyyy
i raise 4kmph 1st and 4kmph after an hour..
because you can only take the derivative where something is continuous
so my case shud form a step like figure
That requires you to have a radio that someone already made. But in the case of learning ML, unless you can find some library for the specific thing you want to learn about (unlikely, many things in ML are not public libraries sitting around, either because they are too simple, too complex, or not as effective as what is commonly used), you won't have that. You have to have the ability to create a radio from basic ideas / principles (e.g. learn the math). In addition, even if you find that "radio" for ML, it's software which is much harder to play around with than something physical (because it's abstract / not physical / can't perfectly encapsulate all its details / hide them from you via a nice user interface).
cos there'll be 0 in between
like a 4 and a horizontal line and then 8 and a horizontal line...
will that not be continuous
I guess so. the derivative function would be flat in between each timespan that your speed is increasing
yea.. so continuous right..?
yes. though if you want a derivative that's a smooth curve, you'd need to always be increasing the speed slightly
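A small sketch of the step-vs-ramp idea, using made-up numbers loosely based on the "4 km/h, then 8 km/h after an hour" example. The functions `step_speed`, `ramp_speed` and `rate_of_change` are hypothetical names for illustration only:

```python
def step_speed(t):
    # hypothetical: 4 km/h during the first hour, then an instant jump to 8 km/h
    return 4.0 if t < 1.0 else 8.0

def ramp_speed(t):
    # hypothetical: speed increases continually by 4 km/h every hour
    return 4.0 * t

def rate_of_change(speed, t, h=0.01):
    """Central-difference estimate of acceleration (km/h per hour) at time t."""
    return (speed(t + h) - speed(t - h)) / (2 * h)

for t in (0.5, 1.0, 1.5):
    print(t, rate_of_change(step_speed, t), rate_of_change(ramp_speed, t))
```

On the flat parts of the step the estimated acceleration is 0. At the jump the estimate depends entirely on `h` and grows without bound as `h` shrinks, which is another way of saying the derivative doesn't exist there. The ramp gives roughly 4 everywhere.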
it is
so the derivative will be changing right..?
so can we say derivative helps us understand the continuity of a function?
and does every continuous function have a derivative?
so this represents the speed, right? the parts where the speed is a horizontal line are impossible, because you can't change speeds instantly
but if we assume that they're diagonal segments that are almost horizontal, that is fine
acceleration
anyways speed can be changed by applying brakes
I think we've strayed from the original question, but it seems that you actually understand all of this quite well
Hmm, how is the validation accuracy exactly the same? I'm confused
also i have another question... in a linear regression why is it preferred to have least squares as a measure rather than absolute difference (i understand that taking non absolute values can be misleading when summed) @serene scaffold
I'm not sure
mmm well alright
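Not a full answer to the least-squares question, just one concrete difference that can be checked by hand: for a single constant prediction, the sum of squared errors is smallest at the mean, while the sum of absolute errors is smallest at the median. The data and the brute-force search below are made up for illustration:

```python
from statistics import mean, median

ys = [1.0, 2.0, 3.0, 4.0, 100.0]  # made-up data with one outlier

def sum_squared_error(c):
    return sum((y - c) ** 2 for y in ys)

def sum_absolute_error(c):
    return sum(abs(y - c) for y in ys)

# brute-force search over candidate constants 0.0, 0.1, ..., 100.0
candidates = [i / 10 for i in range(0, 1001)]
best_for_squares = min(candidates, key=sum_squared_error)
best_for_abs = min(candidates, key=sum_absolute_error)

print(best_for_squares, mean(ys))
print(best_for_abs, median(ys))
```

So squared error reacts much more strongly to the outlier than absolute error does. Squared error is also smooth everywhere, so its minimum can be found by setting the derivative to zero, whereas the absolute value has a corner at 0.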
+1, and the reason why is that concepts taught are abstract and humans understand abstract things in terms of some prototypes / concrete examples (some mix of them). So even if you think you understand something, you don't really unless you have either played around with some concrete examples (context), or you already know a lot of stuff and can sort of do a knowledge web trick in which you can relate it to enough other things (which may be tied to concrete examples) (yes, this is similar to an embedding / generalization).
*If you learn something as general purpose as linear algebra (with concrete examples / applications), that knowledge web explodes in scale and there are many things you can understand without even messing around with them (because you can reduce the problem to a linear algebra problem (problem reduction like in computer science)).
*So at the start, you just want to play around with as many examples as possible, and it will be a slow annoying process at first (not really any way around that).
(it's fun if you want it to be though, if you are a mathematician you probably love that kind of puzzle grind and discovery)
yeah sometimes you find that you dont actually understand something until you have to implement it


How should you discuss your model choices for the layers in a report?
I sort of just looked around and tried some different layers with some parameter changes, so I'm not really sure what I should be saying
```python
from tensorflow import keras  # assumed import; not shown in the original message

model = keras.Sequential([
    keras.layers.Conv2D(input_shape=(400, 400, 3), filters=64, kernel_size=9, strides=2, activation='relu'),
    keras.layers.Conv2D(filters=128, kernel_size=7, strides=2, activation='relu'),
    keras.layers.Conv2D(filters=256, kernel_size=5, strides=2, activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2),
    keras.layers.Dropout(0.3),
    keras.layers.Conv2D(filters=256, kernel_size=4, strides=2, activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(1, 1), strides=2),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(500, activation='relu'),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(5, activation='softmax')  # 5 output classes
])
```
The model I ended up with if anyone has any improvements (multi class image classification)
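For the report, one thing that can be stated concretely is how the spatial size shrinks through the stack. This is a back-of-the-envelope sketch, assuming every Conv2D and MaxPooling2D layer uses the Keras default `padding='valid'` (the posted model doesn't set it). The `out_size` helper is made up for illustration:

```python
def out_size(n, kernel, stride):
    """Output width/height for 'valid' padding: floor((n - kernel) / stride) + 1."""
    return (n - kernel) // stride + 1

# (name, kernel size, stride) for each layer that changes the spatial size
layers = [
    ("conv 9x9, stride 2", 9, 2),
    ("conv 7x7, stride 2", 7, 2),
    ("conv 5x5, stride 2", 5, 2),
    ("maxpool 3x3, stride 2", 3, 2),
    ("conv 4x4, stride 2", 4, 2),
    ("maxpool 1x1, stride 2", 1, 2),
]

size = 400  # input images are 400x400
sizes = []
for name, kernel, stride in layers:
    size = out_size(size, kernel, stride)
    sizes.append(size)
    print(f"{name}: {size}x{size}")

flattened = size * size * 256  # the last conv layer has 256 filters
print("values going into Flatten:", flattened)
```

Under those assumptions the size goes 400, 196, 95, 46, 22, 10, 5, so Flatten sees 5 * 5 * 256 = 6400 values. The final 1x1 pooling with stride 2 takes no maximum over a neighbourhood; it just keeps every second pixel.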
Can anyone give me examples where python stands out from other data science tools?
all of the major Deep Learning frameworks (pytorch, tensorflow and high level wrappers around them) are based in Python
Thank you! That was what I thought too
just wanted to say , there is a fair amount of resources on learning the maths behind machine learning
i found this one
hot take: people who preach only "top down" approaches are either lazy or just dumb
that's a stereotype I've picked up from quite a few ML servers
It's mostly the intersection b/w top-down and bottom-up where things get interesting, and you learn something new
I have a burning question that I need answering real bad. Are weights in a neural network updated per input or per set of inputs?
they are updated every "batch". with standard SGD, the batch size is 1, so they are updated every input.
you might say that they're all based in C++, but python is the "first class" interface for them
so with this being the set of inputs, the weights would update just after 0.5, 1, 0.75?
what kind of report? you can say "this architecture gave decent results, in the future we can spend more time looking for more-optimal architectures"
oh, i see what you mean by "per input". no, all the weights are updated at once, for all the features.
if you did one feature at a time (or subsets of features), that would be a different algorithm called "coordinate descent", which isn't in general use for neural networks
alright thank you so much
recall that the gradient of the loss function is the vector of all the partial derivatives of the loss function, i.e. the vector of derivatives of the loss function with respect to each individual weight
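To make "vector of partial derivatives" concrete, here is a tiny sketch with a two-weight linear model and an MSE loss. The model, the data and the function names are all made up for illustration. It nudges one weight at a time to estimate each partial derivative and compares that with the formula:

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # made-up data following y = 2x + 1

def prediction(w, x):
    return w[0] + w[1] * x

def loss(w, xs, ys):
    return sum((prediction(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def analytic_gradient(w, xs, ys):
    """Partial derivatives of the MSE with respect to w[0] and w[1]."""
    n = len(xs)
    d_w0 = sum(2 * (prediction(w, x) - y) for x, y in zip(xs, ys)) / n
    d_w1 = sum(2 * (prediction(w, x) - y) * x for x, y in zip(xs, ys)) / n
    return [d_w0, d_w1]

def numeric_gradient(w, xs, ys, h=1e-6):
    """Estimate each partial derivative by nudging one weight at a time."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        up[i] += h
        down = list(w)
        down[i] -= h
        grad.append((loss(up, xs, ys) - loss(down, xs, ys)) / (2 * h))
    return grad

w = [0.5, -1.0]
print(analytic_gradient(w, xs, ys))
print(numeric_gradient(w, xs, ys))
```

Both give one number per weight, and the two versions should agree to several decimal places.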
ok so since I only have 1 set of weights, but multiple outputs? Is that when I would get the MSE to then apply that to the weights?
i don't follow your question
gradient descent minimizes the loss function: loss(w, x, y) = mean((prediction(w, x) - y)**2)
or whatever your loss function is
where the w is the weights
the mse gets "applied" to the predictions from the model
you choose an initial value for w, compute the gradient, and then update the weights. repeat until convergence or until you give up
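A minimal sketch of that loop, assuming a toy linear model with two weights and noiseless made-up data. The names `fit` and `gradient`, the learning rate and the fixed epoch count are illustrative choices, and a fixed number of epochs stands in for a real convergence check. The `batch_size` argument shows the "updated every batch" point: all weights change together once per batch:

```python
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 4.0, 7.0, 10.0, 13.0]  # made-up data following y = 3x + 1

def prediction(w, x):
    return w[0] + w[1] * x

def gradient(w, xs, ys):
    """Gradient of mean((prediction(w, x) - y)**2) with respect to w."""
    n = len(xs)
    g0 = sum(2 * (prediction(w, x) - y) for x, y in zip(xs, ys)) / n
    g1 = sum(2 * (prediction(w, x) - y) * x for x, y in zip(xs, ys)) / n
    return [g0, g1]

def fit(xs, ys, batch_size, lr=0.02, epochs=2000):
    w = [0.0, 0.0]  # initial value for the weights
    updates = 0
    for _ in range(epochs):
        for start in range(0, len(xs), batch_size):
            batch_x = xs[start:start + batch_size]
            batch_y = ys[start:start + batch_size]
            g = gradient(w, batch_x, batch_y)
            # every weight is updated at once, using the same batch
            w = [w[0] - lr * g[0], w[1] - lr * g[1]]
            updates += 1
    return w, updates

w_full, n_full = fit(xs, ys, batch_size=len(xs))  # one update per epoch
w_sgd, n_sgd = fit(xs, ys, batch_size=1)          # one update per input
print(w_full, n_full)
print(w_sgd, n_sgd)
```

With batch size 1 there are five updates per epoch here (one per input), and with the full batch there is one. Both runs should end up close to the weights [1, 3] used to generate the data.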
i've done some research but i can't seem to find if vs (not vsc) 2022 is compatible with cuda 11.2.2
so yeah, is it?
and i'm just tryna get tensorflow set up, and from what i've seen the most recent version
of tensorflow only supports 11.2
trying to run an RNN for text classification on colaboratory, but when I try to fit the model, I don't even get through the first epoch (>10 minutes).
I don't know what to do at this point, my model is as simple as can be but it's just taking eons to run
this is what it is currently
any advice would be appreciated
@safe viper did you make sure that it's using a GPU?
Yeah, I got Colab Pro and it's running on GPU with High-ram
hello, can anyone recommend resources to help me learn how to calculate process runtimes in pandas?
65 mil params is quite a lot no?
It took me about 1-2 mins on rtx 2080 with 5 mil parameters
or should i even spend time studying that? novice data analyst speaking
I ran a CNN with 65 mil as well and each epoch only took 2 min even though I had like 10 layers
also depends on how much data
If you're giving it the entirety of wikipedia then that might explain something
how did you do your word embeddings before feeding it into the RNN?
usually people reduce down the dimensions
before doing so
for example if i had some corpus with 20,000 unique tokens, if you did one-hot encoding for each token, your dimension size would be 20,000
There are 100 dimensions... is that too much? My embedding was taken directly from an example my professor provided
but in reality, practitioners reduce dimensionality down to 300-500 for an example like this (or smaller)
Yeah my embedding matrix only has 100, so that should be fine right?
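A toy sketch of the one-hot vs embedding comparison, using the 20,000-token and 100-dimension figures from the messages above. The tiny random embedding table and the function names are made up for illustration:

```python
import random

vocab_size = 20_000  # unique tokens, from the example above
embed_dim = 100      # embedding size mentioned above

def one_hot(index, size):
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

# each token as a one-hot vector has 20,000 entries, all but one of them zero
token_vector = one_hot(7, vocab_size)
print(len(token_vector), "inputs per token with one-hot,", embed_dim, "with the embedding")

# toy-sized embedding table so the arithmetic is easy to follow
random.seed(0)
toy_vocab, toy_dim = 6, 3
embedding = [[random.random() for _ in range(toy_dim)] for _ in range(toy_vocab)]

def embed_via_matmul(index):
    """Multiply the one-hot row vector by the embedding matrix."""
    oh = one_hot(index, toy_vocab)
    return [sum(oh[i] * embedding[i][j] for i in range(toy_vocab)) for j in range(toy_dim)]

def embed_via_lookup(index):
    """Just pick the matching row of the table."""
    return embedding[index]

print(embed_via_matmul(4) == embed_via_lookup(4))
```

An embedding layer is effectively the row lookup, so the RNN only ever sees `embed_dim` numbers per token. With 100 dimensions, the input width is unlikely to be what makes the epoch slow.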
hmm maybe it has to do with how you built your model then
