#data-science-and-ml
I could use them for my work (medical stuff, modelling something as a function of vital signs + behaviour)
It's not just the coding part, but moreso the questions that arise during the process that are more valuable imo
But I just don't get why people are using them for time series. You don't want permutation invariance there
I almost certainly go on deeper math rabbit holes than coding when implementing something from scratch
So they have temporal fusion transformers that take away the permutation invariance of transformers, why not use a regular RNN at that point etc...
I find that a lot of details I might've glossed over when just reviewing the theory become difficult to ignore once you're implementing it
We account for that using positional embeddings
The other benefits of Transformers are exceptional for time series. And positional embeddings are powerful enough to make it work
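For anyone wondering what "positional embeddings" look like in practice, here is a minimal sketch. It assumes the fixed sinusoidal scheme from the original Transformer paper (learned embeddings are another common choice), and the function names are made up for illustration.

```python
import math

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def add_positions(embeddings):
    """Add the position encoding element-wise to a list of embedding vectors."""
    pe = sinusoidal_positional_encoding(len(embeddings), len(embeddings[0]))
    return [[e + p for e, p in zip(vec, pos)] for vec, pos in zip(embeddings, pe)]

# Two identical readings at different time steps become distinguishable.
series = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
encoded = add_positions(series)
```

Because the encoding differs per time step, shuffling the inputs changes what the model sees, which is how the permutation invariance gets broken.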
Spare me the work for my undergraduation and make a Transformer to predict probabilities of disease diagnoses according to the symptoms a patient has related 
Transformers are not restricted to processing the input sequence one-at-a-time
This leads to arguably some of the biggest benefits of Transformers, modelling very long sequence dependencies (theoretically infinite) and single shot computation in parallel
gtg now ;-;
I'd say that there's a trend of using Transformers for anything... 
There's the Transformer for Stable Diffusion's text conditioning, there are Transformers in AlphaStar, DeepMind's AI that reached Grandmaster in StarCraft 2, and there are Transformers for image classification, video classification, face recognition, text classification...
Maybe there's also one for audio. I just haven't found it yet 
https://ai.googleblog.com/2022/10/audiolm-language-modeling-approach-to.html
AudioLM is a pure audio model that is trained without any text or symbolic representation of music. AudioLM models an audio sequence hierarchically, from semantic tokens up to fine acoustic tokens, by chaining several Transformer models, one for each stage. Each stage is trained for the next token prediction based on past tokens, as one would train a text language model. The first stage performs this task on semantic tokens to model the high-level structure of the audio sequence.
I won't deny it. In fact I was going through this paper sometime ago (not sequence processing) and it frustrated me; looked like they'd just thrown a transformer each at 7 parts of the problem and hoped it'd work.
But I can't complain. It works. That paper reports SOTA performance and defined a completely new task and training procedure
And tbh often there's a lot of thinking and mathematical justification that goes into where to throw a transformer and what kind of transformer
Yeah this is fun lol
Yeah, I just don't have enough finesse with transformers
I'm quite prone to think that, in most applications, Transformers are thrown into a task and everything else is adjusted in order to make it work.
I'll try them out for my work if I have spare time
At the very least it'll be a learning experience lol
How would you guys suggest I learn PyTorch? I want to train a simple model over the next few days, taking in ~140 inputs; I'm not sure how many hidden layers, but only 7 outputs. I have learned a bit about AI and how it works, the math behind it and all that, but I just never learned PyTorch. Most tutorials on PyTorch seem kind of confusing, and the main docs are very in-depth for a starting guide.
the input and output data is repeated data of:
1. type
2-4. position
5-7. rotation
ABOVE IS COPY FROM #python-discussion to continue a conversation.
If anyone wants to help: I don't know how to do this well. I can generate the data in this form and would prefer to use this rather than wave function collapse to generate a map for a game.
@nova pollen hola
what would the training data be?
the more context I get the less I think deep learning is suitable
fine, here is the long version
I want to take a racing game that has track pieces for a lot of things, and use the previous track pieces to generate the next one, and so on. The training data will come from maps that have no boosters or any sort of speed increase, and will be sorted by distance from the start block before going into the model. The reason I have to sort it is that all maps store their blocks unsorted; without any maps with speed boosters, I can assume everything goes downhill, or at least away from the spawn point, which lets me find the next piece and get a good training set.
Using that training set I will try to generate the next block that was placed in the map, assigning weights to how similar blocks are to each other so the AI still gets points for being wrong
Also, when choosing points, if the selected piece is less than 20 pieces away from the start, I will just fill in 0s for the types and positions
*and rotation, allowing me to slowly generate a map using this AI after it has trained on enough data from the maps and for long enough
This approach is better than wave function collapse because I want to learn AI and I don't want to have to find all the parts myself, along with the fact that wave function collapse would eliminate any jumps or obstacles
I may also add one more data point giving how many track pieces exist in total, allowing the AI to end the track at a reasonable point.
I know this is possible, it's just a question of how hard it is
mm apart from the sequence modelling point i mentioned earlier
this is a generative problem
since there isn't really a "ground truth" next object
There is for the training data, I can use the one that is placed afterwards as that object
I would train it on a lot of community-made maps
*recently there was a no-power competition where people could tag their maps as not using any powered objects like boosters
right, but if I gave you the sentence
"I am currently eating a BLANK"
and asked you to complete it, there would be many possible valid continuations
yes, that's why I hoped to solve it using weights for how similar each piece is, but I don't know if that would help at all
this makes training it as if there were only one correct answer difficult
yeah...
I just wanted to do this for the reasons specified before
mm
as wave function collapse generally only looks at its direct neighbours, but I could have it extend its size ig, or make premade assets for each road section
but either way it would not be as good as training it directly...
I have a small ARM computer that is already on 24/7 running nothing rn, so I could just give it the task on the CPU for a few days.
doesnt hurt to try i suppose
yeah.... if we treat it as having 1 correct answer, it would still train, just not the best
it would have conflicting information
but it would work
The tutorials in the torch docs should be a good option
JavaLim in ai channel
Yeah. They're very powerful and very pervasive in many sub-fields of ML
Maybe, maybe not. Still seems to give better performance than any other option
do you think this is a good way of trying to solve this, or do you suggest I do something different?
u wanna replicate how the game generate track pieces?
The first rule of machine learning is: don't use machine learning
i.e. try to find a simpler solution, mathematical or algorithmic
I read your long version but tbh it wasn't clear to me what exactly you're trying to do
Like wdym by "sorted into the ai model"
not replicate, I want to generate a track
assuming the game ure using doesnt just randomly generate track pieces and stitch them together, ure just replicating that algorithm by training a model on that data no?
Sorry to interrupt the lively discussion here: I have a problem with the training time of my convolutional neural network. The inputs are 512x512 (grayscale) images and i want to perform image segmentation. For this i am choosing the U-Net architecture. Now even for a reduced training set of only ~100 samples, a single epoch takes ~1h to finish. My total amount of training data is ~1600 samples (not even including additional data augmentation). What would be smarter to do, in order to cut down on training time while keeping some of the performance: (i) "Reduce" the images to 256x256 or even 128x128 by some kind of "blurring" , (ii) reducing the networks architecture by removing a few layers or (iii) something else.
ure using gpu i hope?
the track pieces are completely out of order in the data, but I have their positions; then, using the start block, I can try to find the next track piece by sorting them by distance, using only tracks that have no speed increase
track pieces are out of order; I am sorting by their distance from the start block, using only tracks that have no external power for your vehicle, which lets me know what order they are in and use that as input data, as without sorting it would be hard
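The sorting step described here could be sketched roughly as follows. The dictionary keys, the type id used for the start block, and the sample coordinates are all hypothetical; the real map file format is not shown in this conversation.

```python
import math

def order_pieces(pieces, start_type=1):
    """Order track pieces by straight-line distance from the start block.

    pieces: list of dicts with hypothetical keys "type" and "pos" (x, y, z).
    """
    start = next(p for p in pieces if p["type"] == start_type)
    return sorted(pieces, key=lambda p: math.dist(p["pos"], start["pos"]))

pieces = [
    {"type": 4, "pos": (0, -2, 20)},
    {"type": 1, "pos": (0, 0, 0)},   # start block (assumed type id)
    {"type": 3, "pos": (0, -1, 10)},
]
ordered = order_pieces(pieces)
```

This only recovers the driving order under the assumption stated above, that the track always moves away from the start; a track that loops back toward the start would be ordered incorrectly.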
I am trying to train on community-made maps, there is nothing in the game for this
why not just write an algorithm (non ml) to generate tracks?
I am not using wave function collapse because it's not fun, I want to learn AI, and I cannot allow for things such as jumps with that.
oh my god i just checked. I am beyond stupid lmao. Thanks for the quick reply
tensorflow on windows?
yes
Pain
the only reasonable one would be wave function collapse, or to make all jumps and transitions myself and then have it basically do wave function collapse on that data to get the output I want
sounds like seq to seq
Why can't you do jumps with waveform collapse. What are jumps here
not sure of the details if its non fixed length, etc tho
Makes sense
and how u would restrict certain combination of track pieces. ig it will be learnt based on community made maps
check the map, and there was a challenge recently for best maps without any external power, so I can use those as a training set
disregarding any scenery
Are the "track pieces" to be selected from a finite, discrete, pre-determined set?
The thing with time series at least is that very, very simple models, like the naive forecast (predict x_t = x_{t-1}) or exponential smoothing, can outperform complex models
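To make those two baselines concrete, here is a small sketch. The heart-rate numbers are invented for illustration, and `alpha=0.5` is an arbitrary smoothing factor, not a recommended value.

```python
def naive_forecast(series):
    """Predict the next value as the last observed value (x_t = x_{t-1})."""
    return series[-1]

def exponential_smoothing(series, alpha=0.5):
    """Simple exponential smoothing; returns the one-step-ahead forecast."""
    level = series[0]
    for value in series[1:]:
        # New level is a weighted mix of the latest value and the old level.
        level = alpha * value + (1 - alpha) * level
    return level

heart_rate = [72.0, 74.0, 73.0, 75.0]  # made-up readings
print(naive_forecast(heart_rate))         # 75.0
print(exponential_smoothing(heart_rate))  # 74.0
```

Baselines like these are cheap to compute, so they are worth running before reaching for anything heavier.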
Yes
I think where transformers will matter in my case is when I start doing long-horizon forecasting with a large conditioning window
Because that's the space where basic models fall flat
It's trained on community maps and, as I said later, it might be good for me to include one more value about the number of track pieces placed, so it can determine how far away the end should be
if all tracks are fixed length, could just do fixed length seq to seq
Or when there is complex structure inherent in your sequence. Time series forecasting tasks work well with simpler methods, like extrapolating stock prices maybe.
But you'd be hard pressed to compete with transformers on say speech or language tasks
^
If i may ask: How does this come into play? I.e. why does it matter whether i use tensorflow and whether i'm on windows for training time?
tensorflow doesnt support gpu on windows anymore. would need wsl
hey, not really a coding question but does anyone know where i can download the "tesseract executable"
Ok, well i'm working on Kaggle notebook, so i should be good, right?
What are these track pieces? Are they limited choices like {A, B, C} are they infinite choices like [1, INF) or are they a continuous variable...? Or smtg else?
yep, itll be linux
That's an OCR module ig check their project page?
there are, let's just say, 10 pieces of road, a start, a checkpoint, and an end piece. 13 possible pieces, then all rotations of those
Refrain from downloading executables from unauthorised or untrusted sources
ur right, i did a scan of their github and of google's tesseract-ocr github but haven't found anything yet
Okay, so the track length will always be 13? Or it'll sometimes stop at 6?
i will go through again thank you very much
Or 7 or 3
Is it not pip installable? That should give you the wheels
the length it generates is determined by when it places the end block; it is given the track length at the current point in time, and as part of the training data it is given the number of pieces placed, so it can determine when the end is
it only generates one piece at a time so that should be fine
Yes but it isn't necessary for
each track generated to be 13 pieces long is it?
yeah i installed it but it doesn't come with the executable, which has to be installed separately
kinda weird but i think i did find it, had to do some digging in the original tesseract-ocr engine page
no, those are the possible pieces to select from; it can generate a track of any length given those pieces
but based on the training data, I want it to generate the end piece
This seems like you can just sample from two distributions, one containing your set of tracks and one to regulate when it ends. You can add a bias to tune it.
I don't think it requires ML but sure you can use it if you want
one sec let me rewrite this as a long thing
Look into sequence modelling, RNNs, etc. There should be tutorials on pytorch docs.
And remember, one of the most important parts of an ML problem is formulating the data and model inputs in the right manner. You could be stuck in a simple problem for ages if you don't do this right.
Don't rush to the modelling part, give data all the time it demands and you should be better off for it
huh that's weird.
You could build from source either way ig
If you want it to be like the tracks other players have generated, you can add those to a population and sample from that instead of sampling arbitrarily
I want to generate a track in a game consisting of only track pieces, starts, ends, and checkpoints. Let's just say this is an array of the real in-game object IDs, mapped to 0-13, with 0 being nothing and only occurring before the start block.
The tracks to train on will first have to be filtered down to ones with 0 boosters / external power, so that the direction is always downwards and away from the start block. This will allow us to then sort the track pieces, which are currently placed randomly in the file, into a neat sequence from start to end.
We then use this data to train a model by taking a random piece from a random track and selecting that piece as the one to be generated. This can be anything but a start block (start blocks only ever exist once and will never be placed by the AI). The AI then takes the following data about blocks:
1x -
current track length
20x -
type (mapped between 0-13)
position (x,y,z) (clamped to a 1/4 grid tile)
rotation (x,y,z) (clamped to 45-degree increments)
This data is given to the model, which then has to guess the track piece that was selected. This will repeat over and over, attempting to generate blocks in the positions that are most common in tracks online.
This will not be very accurate, but with enough training, it should be close enough.
Wave function collapse is not a good option because, for things like jumps or the end, it needs more information that is easier to provide to an AI model.
generated format:
type (mapped again)
position (clamped again)
rotation (clamped again)
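The input layout above works out to 1 + 20 × 7 = 141 values, with 7 values as the target. A rough sketch of how such an input vector could be assembled is below; the tuple layout and function name are assumptions, and the zero padding follows the earlier remark about filling in 0s when fewer than 20 pieces exist.

```python
WINDOW = 20    # number of previous blocks fed to the model
FEATURES = 7   # type + position (x, y, z) + rotation (x, y, z)

def build_input(blocks, track_length):
    """Flatten the last WINDOW blocks into a 1 + WINDOW * FEATURES vector.

    blocks: list of (type, px, py, pz, rx, ry, rz) tuples, oldest first.
    """
    recent = blocks[-WINDOW:]
    # Left-pad with all-zero blocks when fewer than WINDOW blocks exist.
    padding = [(0,) * FEATURES] * (WINDOW - len(recent))
    flat = [value for block in padding + recent for value in block]
    return [track_length] + flat

history = [(1, 0, 0, 0, 0, 0, 0), (3, 0, -1, 10, 0, 45, 0)]
vector = build_input(history, track_length=len(history))
print(len(vector))  # 141
```

Note that the piece type is a category rather than a quantity, so feeding it in as a raw number 0-13 is a simplification; a one-hot or embedding representation is a common alternative.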
I think I wrote that better
stargazer?
busy with some work
What are jumps
ok, sorry just didnt know
No it's alright dw about it. I just check this when I can
its a car racing game thing, so you can jump spaces with enough speed
for wave function collapse, it normally only checks the surrounding spaces, and because jumps have multiple blank air spaces, we can't use that by itself
so you either make bigger sections for wave function collapse
containing multiple pieces
or have to expand on wave function collapse
But your task is only to create the track right? Why consider jumps
thats part of the track
So there are jumps between certain pairs of track elements (say 2-7) and not between others?
Or jump is one of the 13 elements?
the model will not predict the piece in a certain place, but rather predict a piece and a position
this makes jumps easily possible
the model doesnt necessarily need to predict piece and position
yes, but I would like it to
just piece should be fine if u set it in such a way that the outputs are already in its designated position
yes, but that makes it so jumps cant be done
I feel there's a bunch of information here that isn't apparent to us as it is to you since we don't know the game you're working with
I believe you can try to go for sequence modelling using ML. If nothing else, the process of preparing and structuring the data for the model should help you gain a lot of clarity about the problem
here is the exact game i wanted to try to do it on
https://store.steampowered.com/app/1440670/Zeepkist/
Zeepkist is a racing game for 1-4 players, or up to 64 online, in which players race down extreme downhill soapbox courses to set the best times possible! If you like weird physics, soapbox racing, and/or creating your own crazy tracks, then this is the game for you! Race against time itself in Adventure mode! Crash into your friends in 4-playe...
Just remove all non track blocks
I do think this can be solved using some probability and statistics, without ML. But you can try it out and see
and use maps with no boosters
I don't think so, it's a more difficult question
I guess I could use some sort of modified wave function collapse
but it would be difficult
I'm thinking more like how a sentence is generated, previous words matter to the next word being generated
That's possible. I'm not familiar with the game (and so the problem) as you are
From my brief reading about wave function collapse, I don't see why u don't wanna use it
certain things like jumps would require structures made of multiple blocks; doing this would also mean checking multiple blocks, and I just think that would be harder
along with the fact I want to learn about using PyTorch
Yeah like if a jump block has been selected for piece 5, piece 6 cannot be another jump right?
I think if ure learning something for the first time (in this case pytorch), best to start with something simple too that is well documented on how to approach
Whether or not it can be solved without ML, it does look like something that can be usefully solved with ML. So if you want to use it as a project to dive into learning ML, go for it
It depends on how you generate this, and it's hard to explain right now in short sentences, but I believe AI is what I want for this rather than wave function collapse
that's part of it, and I think the results will be cooler / better with AI than with wave function collapse and me making it basically all by hand
Look into sequence modelling
RNNs, GRUs, LSTMs and the like
Transformers might be overkill
Also look into some of the simpler mathematical sequence modelling functions before that. You can derive inspiration from them if nothing else
Attention is all you need
yeah ok, I have to learn the difference between all of them and how I should do this in PyTorch, but I think my explanation earlier was pretty good about inputs and outputs
what?
It refers to Transformers
Haha nothing it's a title of a paper on transformers
Or rather, the paper
A machine learning method for sequence modelling
Mhm. The tutorials are pretty good imo
The paper xd
previous input and the length of the input is not fixed
I planned for them to be fixed, should I just ignore that part
only more recent track pieces affect the outcome
is it just me or does it make more sense to use permutation_importance instead of feature_importances_ or coef_ for the importance_getter of SelectFromModel? (sklearn)
Yeah maybe start with simpler pieces to get a better idea
I was just gonna fix the length of the thing and only supply 20 last blocks
Also look into autoregression
would be soo helpful if u provide the yt link for yolo + ocr.
I-
Nvm there you go:
https://youtu.be/FKGtdSJu3X4
Your exact project, have fun lol
this uses some theos api
What's the issue with that
Oh last thing, is 141 inputs a good amount, is it large or small? Also how many hidden layers / nodes should I have? Remember, only 7 outputs.
but that would simply not be training with yolo right..? thanks but
Fair enough
Look, your whole solution is neatly divided into 2 models
YOLO to detect and extract the license plate
And OCR to convert that to digital text
Just look up a yolo tutorial even without OCR and you should be fine
There are tons of yolo training tutorials. I'm a little busy rn so I can't search but it should be easy enough to find
@potent sky what do you think?
It really depends on the problem. To get clarity about things like this is partly why I suggested you go for it.
Think about what information the input carries, what output you want, how much information is relevant and necessary, how much feature extraction you need etc
Have a meeting now gtg
ttyl
In a convolutional layer with 3x3 filter, why should the number of channels increase to 64? I understand that due to the filter being 3x3 a 572x572 image is mapped to a 570x570 image, but how come we now get 64 channels instead of just 1? (This is a snapshot from the U-Net architecture)
Because we don't have one 3x3 kernel, but we have 64 independent 3x3 kernels
Each generating a new image that is 570x570
That get stacked together
@rose dagger
And only in the first to second layer is the kernel actually 3x3(x1) because the input image has 1 channel
In the second one the kernel is actually 3x3x64
Oh i see. Thank you. Then in the remaining encoding block (left side), do we then have a 3x3x2 kernel in the second part (since we go from 64 to 128) or a 3x3x128 kernel?
No, each kernel shifts over the entire input image from left to right, and top to bottom, because it's a 2d convolution
So when you go from 64 depth to 128, you have 128 kernels that each are 3x3x64
As each kernel will generate a single image
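The shapes discussed above can be checked with a bit of arithmetic. This sketch assumes unpadded, stride-1 convolutions with one bias per kernel (as in the original U-Net); the helper name is made up, and the 284x284 input for the 64 to 128 step assumes the 568x568 feature map has been halved by 2x2 max pooling first.

```python
def conv2d_shapes(in_channels, out_channels, kernel_size, height, width):
    """Weight shape, output shape and parameter count of an unpadded, stride-1 2D convolution."""
    weight_shape = (out_channels, in_channels, kernel_size, kernel_size)
    out_shape = (out_channels, height - kernel_size + 1, width - kernel_size + 1)
    # Each kernel has in_channels * k * k weights plus one bias.
    n_params = out_channels * (in_channels * kernel_size * kernel_size + 1)
    return weight_shape, out_shape, n_params

first = conv2d_shapes(1, 64, 3, 572, 572)     # 64 kernels of 3x3x1
second = conv2d_shapes(64, 64, 3, 570, 570)   # 64 kernels of 3x3x64
deeper = conv2d_shapes(64, 128, 3, 284, 284)  # 128 kernels of 3x3x64

print(first)   # ((64, 1, 3, 3), (64, 570, 570), 640)
print(second)  # ((64, 64, 3, 3), (64, 568, 568), 36928)
print(deeper)  # ((128, 64, 3, 3), (128, 282, 282), 73856)
```

The number of output channels is simply the number of kernels, and the depth of each kernel always matches the number of input channels.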
Ok, now i understand what you mean. Thank you, that makes more sense!
This is my first time building my own neural network from scratch. I just wrote the training part; if anybody sees anything wrong with it, feel free to let me know. It is a neural network with 3 layers and 3 neurons in each layer.
trying to use Voice_Cloning package, this error comes back:
Traceback (most recent call last):
  File "c:\Users\Code\Documents\GitHub\Test\ref.py", line 12, in <module>
    from voice_cloning.generation import *
  File "C:\Users\Code\AppData\Local\Programs\Python\Python310\lib\site-packages\voice_cloning\generation.py", line 27, in <module>
    from encoder import inference as encoder
ModuleNotFoundError: No module named 'encoder'
Looking at Voice_Cloning, inference.py is a script within the encoder folder, which is on the same directory level as generation.py
Is there a way I can just modify this import statement so that it imports the file correctly?
this may not be the right chat for this so if someone could direct me to the right chat that would be helpful as well
How often should you retrain your model? Generally, let's say you train and test on time series data with a 70/30 split in days. After you deploy, would you forward-test for 30 days and then retrain?
Everyone has suggested fancy generative models. But let me suggest a simple one: A Markov model. In the simplest Markov model, you track the last block that was placed. For each of these, you use your training data to find the probability distribution of next blocks. To generate a new track, you pick blocks one at a time: The initial state is the start block; you randomly pick a next block from the distribution of blocks that follow the start block; then you randomly pick a next block, and so on. One of your blocks should be an "end of track" block (maybe this is an actual block, or maybe you stick it onto the end of each track in your data); when you generate the end of track block, your track is over.
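A minimal sketch of that first-order Markov model, using only the standard library. The block names and the two toy tracks are invented for illustration; real data would use the game's block types.

```python
import random
from collections import Counter, defaultdict

END = "end"  # hypothetical label for the end-of-track block

def fit_transitions(tracks):
    """Count how often each block follows each other block."""
    counts = defaultdict(Counter)
    for track in tracks:
        for current, nxt in zip(track, track[1:]):
            counts[current][nxt] += 1
    return counts

def generate_track(counts, start="start", max_len=100, rng=random):
    """Sample blocks one at a time until the end block is drawn."""
    track = [start]
    while track[-1] != END and len(track) < max_len:
        options = counts[track[-1]]
        if not options:  # block never seen with a successor; stop early
            break
        blocks, weights = zip(*options.items())
        track.append(rng.choices(blocks, weights=weights)[0])
    return track

tracks = [
    ["start", "straight", "curve", "straight", END],
    ["start", "straight", "jump", "straight", END],
]
counts = fit_transitions(tracks)
new_track = generate_track(counts, rng=random.Random(0))
print(new_track)
```

The `max_len` cap is a safety net added here so generation always terminates, even if the sampled chain keeps avoiding the end block.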
not exactly what I had in mind but, I wanted it to get the position ect too
so kinda diffrent
You can add extra information to the state space.
There's a trade-off between how detailed your state space is and how much training data you have.
Sometimes it helps to reparametrize (e.g., maybe there's a way to use relative positions?).
You can also create a hierarchical model. The traditional example of this is a hidden Markov model. In these, your states don't correspond to blocks. Your states are something abstract with no well-defined meaning. However, your states also have an "output distribution," which is a probability distribution over blocks. At each step, you pick a new state; using the output distribution you pick a block. Then you pick a new state (which depends on the current state but not on the block you just placed), and so on.
Another option is to use a higher-order Markov model, where the next block depends not just on the current block but on the current and previous blocks.
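For the higher-order variant, the only real change is that the lookup key becomes a tuple of the last few blocks. A short sketch with made-up block names and `order=2`:

```python
from collections import Counter, defaultdict

def fit_higher_order(tracks, order=2):
    """Count next blocks conditioned on the previous `order` blocks."""
    counts = defaultdict(Counter)
    for track in tracks:
        for i in range(order, len(track)):
            context = tuple(track[i - order:i])
            counts[context][track[i]] += 1
    return counts

tracks = [
    ["start", "straight", "curve", "straight", "end"],
    ["start", "straight", "jump", "straight", "end"],
]
counts = fit_higher_order(tracks, order=2)
after_jump = counts[("jump", "straight")]
print(dict(after_jump))  # {'end': 1}
```

The number of distinct contexts grows quickly with the order, so each context gets fewer examples from the same data, which is the state-space versus training-data trade-off mentioned above.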
Markov models are not as strong as fancier and trendier models. Their advantages are that they require less data, are faster, are easier to implement, and their training has fewer gotchas.
hi guys, I have a question
I want to create a Python Tkinter application for plotting crypto charts. Do you have any idea what would be the best library for this?
no matter how hard i try i can't implement my code so it runs on the gpu
do you guys know any good wrappers or libraries to run on gpu
numba doesn't work because it doesn't support a lot of things i use
like class inheritance and such
Tensorflow and Pytorch
will it work in a project where I use different classes and such
all classes i made using no external libraries
@hasty mountain
Pytorch is a framework that loves classes
k
In fact, I had to learn how they work so I could use Pytorch
i see
in my case i have a project where im making a 3d render and would like it to run on the gpu instead of the cpu
*raytracing
only thing im worried about is that a lot of these programs are ml based
guys im a student and wanna do a good ai course , not a beginner
but all the courses on coursera and udacity with certificates are expensive asf, and i already have two courses from udemy, so do u guys know any places i can get a cheap course?
Which comes first, machine learning or deep learning, for a beginner who wants to get into this field?
deep learning is an area of machine learning that uses neural networks
no one's going to care about AI/ML certificates from those websites anyway, but there's a plethora of free content on youtube.
you're a student. at university? can you take an AI course?
You can think of it as "deep learning is part of machine learning, and machine learning is part of AI"
I see
I've a very basic knowledge of machine learning and I think I could learn some deep learning without any issue
you will have issues.
oh dang
to learn is to suffer.
but in all seriousness, machine learning and deep learning take a long time to understand. that's why you can make a lot of money once you do.
I'm in my first year at university, studying computer science and engineering. Next year I'll take an artificial intelligence course, but I'm willing to go through this field on my own as well to improve myself. Would it be a waste of time to take some machine learning classes online?
what courses are you taking right now? and what math courses will you have taken by the time you start the AI course(s)?
(when I say "course", that might be what you call a "module")
I've already taken a PyTorch for deep learning and machine learning course and had no problem at all. But it wasn't that theoretical
your university teaches a course that's specifically about pytorch?
nope, I took it online
not from my uni
In the first semester, we took calculus 1 and this semester we have calculus 2 classes
will you be taking linalg?
yes
what was the loss for the first epoch?
how many epochs did you do?
hundreds, I see. what does this model do?
hmm, okay
anyway, it's hard to say if a given loss is "normal" or not
what you really care about is how it changes between epochs.
i am trying to optimize a plan which reflects the contemporary skills needed...
let me know what you guys think
hi guys, I want to ask about preprocessing for Naive Bayes: I have preprocessed all the data and dropped unused columns, but when it comes to detecting outliers with Z-score or IQR my result is empty, or rather NaN. Do you guys have any idea why the result is like that?
any reputable guides on ML to train an AI that can be used within a python script?
It's impossible to answer unless you specify what kind of ai. What do you want the AI to do?
right, my bad
detect car plates (then, OCR)
and
see if a plant is a "bad" or "good" plant
like growing well or not, prolly needs some supervised training im guessing
You'd need a dataset of healthy and unhealthy plant images, yes
Though I think that would be difficult for a model to learn
Unless there's some visual property shared by all unhealthy plants
probably is i believe
like if they're straight or not
Guess I'm an unhealthy plant
lmao
Anyway, I wouldn't follow any tutorials on towards data science. Those tend to be trash tier.
what would you recommend then?
i need some material to start lawl
machinelearningmastery is a good website
imo towardsdatascience has some quality write-ups.
But as a beginner if you don't know your way around it can be easy to get into the bad articles on there (and there are many of them) and consequently adopt wrong understanding, bad ways of approaching a problem etc. which can be difficult to unlearn.
So I agree with Stel here
Do you not think it's a useful resource?
It takes some filtering but I find quality write-ups on there sometimes
what you both say is my general experience with it. you can certainly find very good content there sporadically, but there is poor quality control at best
i don't think there's any quality control. Someone I know writes for TWDS and honestly she started writing there when she was learning about data science
So her intentions were good but the things were just not correct as you would expect from someone beginning to learn anything
I'd say to prefer to search for tutorials in the docs of the frameworks you're using. Tensorflow/Keras and Pytorch got some interesting tutorials.
You can use Towards Data Science articles, but...eh...be careful. Usually the folks that write there also have a small bio. If you see someone that at least seems to understand ML, that could be a good start
The best tutorial I found about Variational AutoEncoders was in Towards Data Science, and it was written by an AI Engineer from Meta
hloww 
ValueError Traceback (most recent call last)
<ipython-input-41-8236c67b5777> in <cell line: 15>()
13 metrics = ["accuracy"])
14
---> 15 history = model4.fit(tf.expand_dims(x,axis = -1),y,epochs = 100,verbose = 0)
1 frames
/usr/local/lib/python3.10/dist-packages/keras/engine/training.py in tf__train_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1284, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1268, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1249, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1051, in train_step
loss = self.compute_loss(x, y, y_pred, sample_weight)
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1109, in compute_loss
return self.compiled_loss(
File "/usr/local/lib/python3.10/dist-packages/keras/engine/compile_utils.py", line 265, in __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "/usr/local/lib/python3.10/dist-packages/keras/losses.py", line 142, in __call__
losses = call_fn(y_true, y_pred)
File "/usr/local/lib/python3.10/dist-packages/keras/losses.py", line 268, in call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "/usr/local/lib/python3.10/dist-packages/keras/losses.py", line 2156, in binary_crossentropy
backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
File "/usr/local/lib/python3.10/dist-packages/keras/backend.py", line 5707, in binary_crossentropy
return tf.nn.sigmoid_cross_entropy_with_logits(
ValueError: `logits` and `labels` must have the same shape, received ((None, 2, 1) vs (None,)).
Yep but docs tutorials are fully code oriented. You preferably need math too. That's where twds comes in sometimes. Machinelearningmastery otherwise, pretty reliable
^^
I am trying to understand why some people square the cost function of a neural network and some people don't square it. It seems to me that if you square the error when you are training the network it will overcorrect because the error will be bigger than what it actually is
You mean the loss function L = (f(x) - y) ^ 2? @crimson summit
That is because you want to minimize the function, so the minimum would be if f(x) and y are the same. And if there is a difference between the two (positive or negative) then it should be larger than 0. That way minimizing this function gives the best results.
And also remember that we have a learning rate that we use for correcting the weights, which should be set low enough to not overcorrect.
There's a bunch of reasons and the ones listed above are definitely part of them
Sometimes you also just don't want large errors so squaring it makes total sense. There's other loss functions that don't do this.
wouldnt you be making the error bigger if you square it not minimizing it ?
It would mean that larger errors are more heavily penalized than smaller errors yes
You can also have absolute difference as loss pretty sure
Or smooth l1 loss is another one
Yup you can
oh oh makes sense
You can also just predict the log of Y, that's a common trick
Here's l1 f(x) - y, l2 (f(x) - y)^2 and smooth l1 (which is a bit more complicated)
As long as it's differentiable and continuous it can be used pretty much
I'm not sure but I think MSE is just a tradition that is carried over from statistics
what does that do differently than just squaring
In statistics, minimizing the sum of squared errors is equivalent to maximizing the likelihood when the errors are assumed to be Gaussian, which has certain good properties.
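To spell that equivalence out, here is a short derivation. It assumes (this is an assumption, not something stated above) that each target is the model output plus independent Gaussian noise with a fixed variance, i.e. y_i = f(x_i) + e_i with e_i ~ N(0, sigma^2):

```latex
\log p(y \mid x)
  = \sum_{i=1}^{n} \log\!\left[ \frac{1}{\sqrt{2\pi\sigma^2}}
      \exp\!\left( -\frac{(f(x_i) - y_i)^2}{2\sigma^2} \right) \right]
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (f(x_i) - y_i)^2
```

The first term does not depend on f, so maximizing the log-likelihood over f is the same as minimizing the sum of squared errors. With a different noise assumption (Laplace noise, for example) the same argument leads to the absolute error instead.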
This ^^
It's almost like a mix of l1 and l2, when close to 0 it behaves like l2, and further from 0 it's basically linear
As to not penalize very large errors too much
But penalizing large errors can be really bad
This*
Look at: huber loss for example
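For anyone who wants to see the three losses side by side, here is a minimal sketch in plain Python. The function names and the `delta` parameter are just illustrative choices; with `delta=1.0` this Huber form matches the usual smooth L1 definition as far as I know.

```python
def l1_loss(pred, target):
    """Absolute error |f(x) - y|: grows linearly with the error."""
    return abs(pred - target)


def l2_loss(pred, target):
    """Squared error (f(x) - y)^2: large errors are penalized much more."""
    return (pred - target) ** 2


def huber_loss(pred, target, delta=1.0):
    """Quadratic for |error| <= delta, linear beyond it."""
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err ** 2
    return delta * (err - 0.5 * delta)


# Compare how each loss reacts to a small, medium and large error.
for error in (0.1, 1.0, 10.0):
    print(error, l1_loss(error, 0.0), l2_loss(error, 0.0), huber_loss(error, 0.0))
```

For the large error the squared loss is far bigger than the other two, which is the "don't penalize very large errors too much" point made above.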
Selecting loss functions and models is something you can / need to do based on your "knowledge" of the problem. If you're worried about large errors ruining you, you should be looking at techniques from robust regression
so its just kind of like a standard practice that works on a wide range of situations
would you square the cost of the hidden layer as well or only the final layer in a 3 layer neural network ?
wdym with this?
Oh, so this explains the MSE for Variational AutoEncoders... 
Though I admit I'm really enjoying the Gaussian Likelihood because it appears to me more accurate...and more interesting...all that thing of the Decoder having to predict the most likely value between an infinite range of possibilities...
Where can i start learning AI with python?
from the plot and comments above, seems like some discussion on L2 ignoring small errors unlike L1, and L1 not being differentiable at 0. i would mention that it's subdifferentiable though, and most autodiff libs use a subderivative of 0 or 1 at 0
It looked like they were equating f(x) - y to L1
ah that's what you mean
Oh I forgot the abs there yeah
smooth L1 is new to me though. Initially I thought it was just ML people renaming elasticnet but it's something else
it's something else indeed
you see it in many places though. gradient-based methods are nice because for well-behaved functions, you can find local minima
You can find the formula here, saw it used for a reinforcement learning project
whenever you have good reason to use a non-differentiable cost but also want to use gradient methods, smooth approximations are interesting
stuff like softmax falls here when used as a smooth argmax
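A small sketch of the "softmax as smooth argmax" idea, using only the standard library. The `temperature` parameter is my own addition for illustration: as it shrinks, the softmax output gets closer to a one-hot vector at the argmax.

```python
import math


def softmax(scores, temperature=1.0):
    """Softmax of scores / temperature, shifted by the max for numerical stability."""
    scaled = [s / temperature for s in scores]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 3.0, 2.0]
hard = max(range(len(scores)), key=scores.__getitem__)  # index of the largest score
soft = softmax(scores, temperature=1.0)    # smooth weights, differentiable in the scores
sharp = softmax(scores, temperature=0.05)  # nearly one-hot at the argmax
print(hard, soft, sharp)
```

The hard argmax has zero gradient almost everywhere, while the softmax weights change smoothly with the scores, which is why it works with gradient methods.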
Also quite similar to Huber loss I see.
ah, that does appear to be the case
@potent sky another question for you, so I have my streamlit app up and running and I'm having an issue w/ the st cache data ttl. It's set to 1 hour but it doesn't actually clear the cache after an hour. It's still loading the same df from last night but when I edit ttl to a few seconds and test this change locally, it clears just fine. Is the ttl only valid while the app is actually in use? I was assuming if I close it and reopen the next day that it would be cleared on rerun but maybe I misunderstand how that works
Wdym by close it and reopen? Are you shutting down the program? Streamlit cache is persisted on disk too iirc so it could repopulate if you're shutting the program and restarting it later, but this will reset the timer
Phew... Finally managed to make a functional VAE...
now...onward to ~~creating abominations~~ have some fun with the architecture 
I want to make an experiment with GANs using latent vectors
The idea is to try using a GAN to create latent vectors rather than creating an entire image.
An idea that came to me after seeing the latent diffusion idea, which applies diffusion into a latent vector to make an image
Oh wait... LDM = Latent Diffusion Model, right?
So...almost for that 
Ooh we actually do this in RL-GAN-NET iirc
Very interesting paper, look it up if you want
Yep, a class of models
Aw... Then they did it before me 
The idea was exactly train a GAN on latent vector and then try to make a GAN-RL
2019 ICLR I think
Lmao this happens a lot, I relate with you. Feels like every good idea under the sun that strikes you has been done before
I was just beginning work on LDMs for music/audio when they published AudioLDM this year Feb I think
I think it's still worth trying tho, you might get a different idea to solving the problems you encounter
Yes. I'll take a look.
Maybe I could at least make something simpler/cheaper and get an average performance, since those papers usually go for absurd things...
Hm... They didn't use PPO for it 
Thanks for the recommendation!
I was planning to-
Go for it, then.
My university vacation will end soon, so I may take a while to work on it 
Maybe you'll give me some inspiration
Follow up on my RNN from scratch
I have a question
I have some previous programming knowledge like I know the basics of python so I was wondering if I should get this course first: https://www.udemy.com/course/100-days-of-code/ or just find a course that is specific to machine learning and get into it right away
I think if i dive into a course that's specific to machine learning it would be way harder to finish/get into
That's a good course to get a baseline understanding of Python, which will definitely help if you go towards ML later on
meant when a user just closes the web page for the app. That explains it though. Thought that this is what was occurring
I finally got through the ChatGPT noise and found a book that goes beyond Prompt Engineering and talks about OpenAI API integration
a book for that sounds like a waste to me? specially at this point in time in which things are still moving ultra fast, to the point that something from 6 months ago may already be outdated
Agreed, it would be more beneficial to learn how transformers work, rather than how a specific transformer reacts
Hey everyone, I'm looking for some help to connect different dataframes using pandas for a uni project I am working on. If anyone has experience here and can help please reach out, thanks in advance
If the question doesn't require hours of guidance, it's probably best to just directly ask it here so people can immediately answer. People generally don't dm to find out what the question even is
Yeah that's on me hahah, a bit desperate to find a solution so forgot to provide details
So I have 3 different dataframes that contain 4 similar variables which are yearly time series data for companies (multiple companies can have multiple scores). What I've been trying to do here is make a function that assigns a label (high,low,mid) every year for each company depending on whether its value is below or above a certain quantile and store it in a separate column. Don't have a lot of experience with python and couldn't really find a similar issue on stackoverflow
Would this work even if I have to iteratively (every year) write all the labels to a single column in the dataframe? Additionally, I'm using a dataframe as an input where as I've only encountered applymap being used with dictionaries or lists as input
I think this work
I don't think that there's a need for apply/applymap at all?
are the variables similar or exact the same for each dataframe though? (same metrics / column names for different values, or actually different columns in each df)
metrics are different, the lists before the function list the relevant columns and because some of the values are inverted (higher values are worse instead of the other way around) I know I'll have to take that into consideration when making the labels
```py
pandas.qcut(x, q, labels=None, retbins=False, precision=3, duplicates='raise')
```
Quantile-based discretization function.
Discretize variable into equal-sized buckets based on rank or based on sample quantiles. For example 1000 values for 10 quantiles would produce a Categorical object indicating quantile membership for each data point.
I'll look into this, thanks!
without it you could do some tricks to get which quantile each record fits into, but that function seems to just do it for you with a much simpler api than check which bucket each record fits yourself
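A minimal sketch of `pd.qcut` for the high/mid/low case, on made-up numbers. Passing a list of quantile edges to `q` gives buckets of unequal size (bottom 30%, middle 40%, top 30% here), and `labels` then needs exactly one entry per bucket. The label names are borrowed from the discussion; the data is invented.

```python
import pandas as pd

scores = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], name="score")

# Quantile edges 0 / 0.3 / 0.7 / 1 give three buckets: bottom 30%, middle 40%, top 30%.
# labels needs exactly one entry per bucket (three here).
labels = pd.qcut(scores, q=[0, 0.3, 0.7, 1.0], labels=["LScores", "MScores", "HScores"])
print(labels.tolist())
```

If higher values are worse for a given metric, reversing the label list for that column is one simple way to handle it.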
Thanks a lot for the suggestion, looks like this should solve the issue! Just a quick follow up, I noticed that since I create a temp dataframe that gets the yearly data and use it to assign values to the original I get nothing. Do I need to use a separate function like df[].apply to do this?
share code as text. not screenshots
!code
```py
# Generate yearly rankings labels for each provider based on ESG score
top_quantile_30 = 0.3
bottom_qunatile_30 = 0.7
top_quantile_10 = 0.1
bottom_qunatile_10 = 0.9
years = [2013, 2014, 2015, 2016, 2017, 2018, 2019]
reprisk_scores = ['peak_yearly_RRI', 'yearly_environmental_score', 'yearly_social_score', 'yearly_governance_score']
sustainalytics_scores = ['total_esg_score', 'environment_score', 'social_score', 'governance_score']
capitaliq_scores = ['ESG_score', 'Environmental_score', 'Social_score', 'Governance_score']
labels_30th_p = ['LScores', 'LScores', 'LScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'HScores', 'HScores', 'HScores']
labels_30th_p = ['LScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'HScores']

def get_score_rankings(df, score_type):
    #Match relevant provider with correct score type!

    #Rankings for top/bottom 30%
    df['ESG_measure_sorts_30'] = ''
    df['env_measure_sorts_30'] = ''
    df['gov_measure_sorts_30'] = ''
    df['soc_measure_sorts_30'] = ''

    #Rankings for top/bottom 10%
    df['ESG_measure_sorts_10'] = ''
    df['env_measure_sorts_10'] = ''
    df['gov_measure_sorts_10'] = ''
    df['soc_measure_sorts_10'] = ''

    for year in years:
        yearly_df = df.loc[df['year'] == year, ['isin'] + score_type]
        for score in score_type:
            if score == 0:
                df['ESG_measure_sorts_30'] = pd.qcut(yearly_df[score], q=10, labels=labels_30th_p)
            if score == 1:
                df['env_measure_sorts_30'] = pd.qcut(yearly_df[score], q=10, labels=labels_30th_p)
            if score == 2:
                df['gov_measure_sorts_30'] = pd.qcut(yearly_df[score], q=10, labels=labels_30th_p)
            if score == 3:
                df['soc_measure_sorts_30'] = pd.qcut(yearly_df[score], q=10, labels=labels_30th_p)

get_score_rankings(sustainalytics, sustainalytics_scores)
```
when you do df[...] = ... or series[...] = ... in pandas, it tries to align the index of the objects for you
idk how you are creating each dataframe so I have no idea what their indexes look like, but that could be an issue
oh wait 
why the ['isin']?
Right after the for loop you can see where I define a temp dataframe which just get the data for each year and the relevant columns I need quantiles for. The dataframe I want to assign them to has the data for the entire range of the years
hmm ok it sounds like you are iterating over a list of strings and checking if the value is equal to a string number?
This is because I knew indexing would be an issue and is an identifier for a company
```py
# ['total_esg_score', 'environment_score', 'social_score', 'governance_score']
for score in score_type:
    if score == 0:
```
just what?.....
that is not going to work how you want
Yeah this might be very unnecessary since I could have just renamed the columns to have the same name in each dataframe but since I also need to set a few conditions that are specific for each data frame I kept it as is
Its not in use so dw about it
I recommend either converting everything to one standard format, or creating one separate script for each different input data you want to transform
after you decide on that, start (from scratch, not copy/pasting what you have right now) prototyping in something interactive like a Jupyter Notebook or an IPython terminal
only after you get the operations right try to organize it into a function
Is this what's causing the issue when trying to send the labels back to the reference df?
what do you think that the score variable contains when you are doing score == 0 / score == 2 etc?
position of item in list no?
what do you think that would happen if you did yearly_df[0]?
you have to organize your process in your head first, and only after that start coding - and even then, doing it in small steps, testing each part.
I'm still fairly new to python so might be messing up basic stuff
!e ```py
strings = ['a', 'b', 'c']
for string in strings:
    print(string)
for i in range(len(strings)):
    print(i)
for i, string in enumerate(strings):
    print(i, string)
```
@agile cobalt :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | a
002 | b
003 | c
004 | 0
005 | 1
006 | 2
007 | 0 a
008 | 1 b
009 | 2 c
for x in thing: iterates over each value in the thing, not over each position
Yeah I understand but the score == 0, 1 etc.. is just to match the column names of the yearly_df and the original dataframe column. Like I said before, I know this might not even be needed if the column names were the same for each dataframe I'm applying the function for
Nvm I'm stupid
Got what you mean
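Putting the two points together (iterate over column names, and let pandas align on the index), one possible shape for the per-year labelling is sketched below. This is not the original poster's final code: the helper name `add_yearly_labels`, the 30/40/30 quantile edges and the toy data are all assumptions for illustration.

```python
import pandas as pd

QUANTILE_EDGES = [0, 0.3, 0.7, 1.0]
LABELS = ["LScores", "MScores", "HScores"]


def add_yearly_labels(df, score_cols, label_cols):
    """Label each score as L/M/H within its own year, one new column per score."""
    out = df.copy()
    # zip pairs each score column with the column its labels should go to,
    # so no "if score == 0" style position checks are needed.
    for score_col, label_col in zip(score_cols, label_cols):
        # transform returns a result indexed like the original rows, so the
        # assignment lines up without building a temporary per-year frame.
        out[label_col] = out.groupby("year")[score_col].transform(
            lambda s: pd.qcut(s, q=QUANTILE_EDGES, labels=LABELS).astype(object)
        )
    return out


# Toy data (made up): two years, five companies each.
toy = pd.DataFrame({
    "isin": list("ABCDE") * 2,
    "year": [2013] * 5 + [2014] * 5,
    "total_esg_score": [5, 1, 4, 2, 3, 10, 50, 20, 40, 30],
})
labelled = add_yearly_labels(toy, ["total_esg_score"], ["ESG_measure_sorts_30"])
print(labelled)
```

Note that `pd.qcut` raises if a year has too many tied values for the edges to be unique; the `duplicates` parameter is there for that case.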
Has anyone used the mask rcnn model - im trying to set it up on windows machine (https://github.com/matterport/Mask_RCNN/blob/master/samples/demo.ipynb) please dm to screenshare so that i can get it up and running.
Any suggestion on graphs that I should pick when it comes to having a lot of parameters? I have tried to make double y axis but it seems like it still looks like a mess.
If you were sure you know how to program you wouldn't need it.
If you are not sure you know how to program you better learn programming first. Not specifically by this course but it looks nice. However, you'll need only "beginner" and "intermediate" lessons, which is roughly 1/3
Lol 6 months is a long timeline considering how fast paced changes are now. I think it does warrant content that can skill the reader up and eventually help bridge gaps.
Any data engineers here?
What's your question?
How exactly is SQL used by data engineers? Like mainly for what purpose?
To transform data and do ad-hoc analysis.
There's also more and more tools that let you do dataviz with SQL
Can someone look at my codes and tell me how to remove this graph 6? There is no data there.
i have these projects on my resume(applying for ML/DS role FULL Time), how do they look? in terms of difficulty, required time, how impressive are they?
and should i add more? or 3 are enough?
plt.delaxes(axes[2,1])
Would you describe what I should do with this? just add it into my code?
just add this to your code
Hey, what are some great ways to run concurrent requests in python? I'm working on https://github.com/apolloapi/apolloapi and want to structure concurrent requests for our request wrappers. We'll probably implement some call_api method but I noticed python has an async keyword but I've heard python isn't the most friendly language for concurrency.
Apollo is a model management tool for training AI models, automating tasks and catching regressions. I'm currently working on adding a new provider this week that allows for LLM based grading against LLM generated output to produce a grading system for the regression testing feature of the project.
While Python doesn't support concurrent programs as well as, say, Go, async is how you do it in python. there isn't some other way in python that's going to be better.
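A minimal sketch of what that can look like with the standard library's `asyncio`. `fake_request` is a stand-in that only sleeps (a real wrapper would call an async HTTP client there), and `call_all` / `max_concurrency` are hypothetical names, not part of any existing API.

```python
import asyncio


async def fake_request(i, delay=0.05):
    """Stand-in for a real HTTP call; only sleeps and returns a payload."""
    await asyncio.sleep(delay)
    return {"id": i, "status": "ok"}


async def call_all(n, max_concurrency=5):
    """Run n requests concurrently, at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(i):
        async with semaphore:
            return await fake_request(i)

    # gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*(bounded(i) for i in range(n)))


results = asyncio.run(call_all(10))
print(len(results), results[0])
```

The semaphore is optional, but capping in-flight requests is usually a good idea when the other side has rate limits.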
Hello, does anyone have any resources on Multi Task Learning in general to recommend or on Multi-Head architectures more specifically?
you might implement the request handlers with FastAPI
Hi everyone
I wanted to ask what things I have to learn in the AI field in the python programming language, I am a little confused as this field is very vast and beyond my knowledge. Hope you can guide me
most of what you need to learn has nothing to do with python. I would start with a book like "data science from scratch" to start wrapping your head around what "data" is in the context of AI.
Ok thank you I will check out that book
But Can you just tell me what things I have to learn in programming, like I know basics of tensorflow, keras, numpy,
No, because that ultimately isn't the point. If you just "learn tensorflow", you will have accomplished nothing in terms of understanding neural networks.
Then what more should I learn
I don't want to sound gatekeepey but as for data science, getting a solid background in traditional statistics will help a lot
the book I recommended would take a few weeks to work through
Most of ML is turbocharged statistics. Knowing basic regression well helps you understand neural nets better later on
it's not gatekeepy to make a universally accepted statement about how it be.
Yes, I think learning statistics first makes sense
is it possible to update an NLP model like Stanza to fix certain incorrect dependency parsing values?
Calc and stats is what machine learning is built off of. Discrete mathematics is also heavily used in the algorithm side of AI (graph theory, etc)
This is kinda the under hood of a simple NN and why calc is important
.92*1=0.92 heavy math indeed
i was in your place at some point, so i decided to document everything i learned from when i started till now in this git repo: https://github.com/ahmedbelgacem/awesome-datascience i hope it helps you. It isn't a list of technologies and frameworks, it's a list of topics with the articles, books and courses i used to learn each topic. It isn't exhaustive as this is what i have learned up till now. I just recently landed a job as a Deep Learning engineer focusing on vision problems so much of this is on computer vision but you can find enough to learn. There's some french courses since i understand french and you may not. Hope this helps
i also agree with this
should i be a software engineer or AI scientist ?
do what you enjoy
you seem to have strong opinions about which career tracks are more future proof that I don't think anyone can change, so I don't think we can entertain this question.
and that's the end of that.
will data scientists role replaced by gpt-4 ?
if you want to talk about that, go to an off-topic channel, or another server entirely.
this will be your only warning.
depends on what you like most and what you're good at. I initially started as a software engineer and i studied software for 5 years. Then i added a masters degree in AI engineering. I thought that i'm good at computer science and found that easy enough. I also liked maths but it was more challenging for me and felt like doing maths was making me think and try hard while i wasn't trying hard in computer science alone. That's why i switched. Today I really like what i do (deep learning engineering). I find that most of my work on a day-to-day basis is pure software and coding but everything needs intuition, mathematical background and critical thinking. I find hard aspects on a day-to-day basis and i like the challenge. And no, it won't be replaced by gpt-4.
Hi everyone, I am new here, and this is my first message on this Discord server community.
I'm a software engineering student considering taking a neural networks course next semester. I've seen a few presentations on the course presentation, and it seemed a bit heavy on the theoretical side. I'm trying to see its applicability to real-life situations, but I think I fail to. I had a similar experience with discrete math that I took last semester; I thought it is beneficial for AI/ML, but I didn't find it particularly useful since we did only the pure theoretical part of it.
I'm curious if studying neural networks is a prerequisite for diving into other areas of AI? And how strongly correlated are the concepts covered in this course to the wider field of AI? I consciously used the term 'AI' in the messages above, as I don't want to decide what part I want to delve into before I inspect each aspect and possibility. I hope that makes sense and give you some overview of my question
Thank you
as long as the hallucination part is not fixed in the LLMs, they should not replace anything
Discrete math can be relevant for AI/ML but it's definitely more abstract than taking a neural network course
I'd say a NN course is definitely a good idea for most CS majors even if you don't want to go into AI proper. There's a good chance you'll be working on/with a service that uses AI in the future.
Apart from what others have said and a little off the point, but I think in time you'll find discrete math to be useful for software engineering and other types of problem solving in general
Neural networks are overwhelmingly the concept on which most of modern deep learning is based (note: not all)
Deep Learning is a subset of Machine Learning. It has seen great visibility recently in powering technologies like voice assistants, recommendation systems (think "The Algorithm"), better camera quality, and a load of other things.
Machine Learning is one of the ways of manifesting AI and currently the most popular and successful one by far.
Hope this series of associations gives you some clarity!
Yes, i think that neural networks are prerequisite for modern ai but not sufficient. So if you're planning to study more AI courses in the future neural networks are a must but if you're going to study only that it can be beneficial for your culture but nothing more in my opinion
i also agree with this
I really appreciate your replies guys. I think of myself that I am a hard-working guy willing to put in the work, and I'm not demoralized by the course, even if it is tough or abstract, as long as it'll benefit me in the long run. Given that, do you think it would be a good idea for me to start independently studying the neural networks course material over the summer before the formal semester begins? Because I think it might make the learning experience smoother when the actual lectures start, as I won't be encountering the topics for the first time if that makes sense
(Here I go again) Start with statistics if you want to learn something independently
And then connect the ideas you see in your neural nets course to the ideas you saw in stats. It'll make your knowledge a lot stronger in the long run
Agree, I am taking Probability and statistics course as we speak. I mean, I will finish it in couple of weeks.
Then a second prereq before going into neural nets is imo traditional machine learning methods
It's a hot take but I'd say all of ML is statistics but it tends to be called ML if it's done by someone from a comp sci/engineering background. Traditional stats, "traditional" ML and NN's are imo all part of a big toolbox you can use to solve many problems. Different problems will need different techniques so knowing a bit of everything helps. Reason being that if you "skip" regular ML then you might overengineer things (especially on tabular datasets).
if you really have time, i would suggest you refresh/study linear algebra, it's a must for neural networks
But tbh, I think there's a lot of people now that are working exclusively on speech, text, images, video, ... and I think these profiles can get away with not having a super in-depth knowledge of the traditional stuff. It's more specialised and nearly exclusively deep learning now.
I still think it's useful to have a good understanding of traditional ML, even though you might not use the exact techniques
LinAlg, stats, probability and information theory (maybe vector spaces too if you're interested)
I think you can get away with vector spaces unless you're going for the theoretical route
hey guys..
i am very much interested in ml/dl
but idk where to learn or how to learn
i am good with math like linalg, prob and stats..
can anyone please help me
i have done some random courses.. but idk how much i have learnt and stuff.. i didnt do any projects and stuff too.. guide me pls
yea bro. i did some courses in kaggle.com
learn ones
but the thing is i dont get a pathway kinda.. like how to develop
Assuming you already have the prerequisite linalg, prob, stats then you should do "easy" Kaggle competitions (tabular playground series)
have seen many utube tutorials. have all fundamentals but cant map them and learn
Solve the case yourself, submit your predictions and then look at other people's notebooks
oh.. i never heard. lemme check
Beware that Kaggle only trains a subset of the skills you need to work in data though
ok this way i can do for practice.. what about learning? like how to learn new things? like for free.. i can't afford courses so yea..
didnt get u... can u come again?
there's much more to data science than training models
Books
implementation?
These are all free: https://mml-book.github.io/ https://www.statlearning.com/ and http://www.mmds.org/
For theory https://www.deeplearningbook.org is also free
yea i read this. (some parts)
thanks bro!
actually, last week i started aeroplane object detection using RCNN. like ik what is cnn, how cnn works, architecture of cnn but idk how to code for it
how to build model for it.. so how to learn all these? this is what i wanted to know actually
hello i have a question about a deep learning model can anyone help me with that ?
More books: https://d2l.ai/
This one in particular covers the theory and implementation of most, if not all, common architectures
oh cool
After that (reading these 4 I sent will take a very very long time if you do it properly) then what's left is the cutting-edge in papers + actually using what you've learnt in those to do projects
The docs of Tensorflow / Pytorch / MXnet have examples that are typically well explained indeed
actually im now in 3rd year of graduation, just 2 years left. i see in linkedin all my friends are doing lots and lots of things.. idk why am unable to, it's depressing btw
in this model 2 LSTM layers are added in sequence using return_state=True, so does that make it a stacked LSTM network or not ?
But fundamentally - you need to decide if you want to be designing novel architectures or if your interest is in applying say the cutting-edge on specific problems
Imo these are wildly different skillsets
yea true
LinkedIn is 80 % inflating the truth 10 % straight up lies and 10 % factual
I remember I did a workshop track on computer vision in the cloud with a fancy consulting firm when I was a student. We got a (useless) certificate at the end.
Afterwards I was browsing LinkedIn and I saw someone post about a super cool thing they did. Turns out I was in exactly the same track as them but they inflated it so much I had no idea I even attended the same thing as them.
lol
I don't know if it's a good idea to essentially dox yourself haha
nah just asking ur opinion
I'm a books person so my suggestion is to take https://www.statlearning.com/ and read it diagonally and experiment with the techniques you're learning there in Kaggle competitions.
btw for this
i have used selective search for plane detection
and this is what i get
so many borders.. idk how to correct them..
ok.. thanks..
If you're good with the math I would then recommend to study python in depth. After that start with the basics of machine learning (Andrew Ng has a really good free course on coursera with Stanford University if you want to start). Then i would recommend some deep learning, neural networks etc. After that it would be nice to try different things, choose what you like the most and play around with different projects you find, for example try a computer vision thing, study it specifically then try to build a classifier for something you like. Then study for example nlp and try to build something with it etc
cool
i can redirect you to my previous answer. I can send you some specific links for something in particular you want to start learning. For python you can start with some easy book called Automate the boring stuff with python then go to more advanced things. If you like exercises along the way datacamp is really cool.
sure
please do send
and yea i did datacamp for somedays. statistical thinking with python thing ig..
```py
ss.setBaseImage(imtest)
ss.switchToSelectiveSearchFast()
ssresults = ss.process()
imout = imtest.copy()
for e,result in enumerate(ssresults):
    if e < 2000:
        x,y,w,h = result
        timage = imout[y:y+h,x:x+w]
        resized = cv2.resize(timage, (224,224), interpolation = cv2.INTER_AREA)
        img = np.expand_dims(resized, axis=0)
        out = model_final.predict(img)
        if out[0][0] > 0.97:
            cv2.rectangle(imout, (x, y), (x+w, y+h), (0, 255, 0), 1, cv2.LINE_AA)
plt.figure()
plt.imshow(imout)
```
This is the code for detection part btw..
andrew ng ml this one?
wait i will send you the link
ok
they changed the name
ok.. and u said some other links.. what r those?
i said if you have something in particular you are looking for, say you tell me i want to learn reinforcement learning, i would send you a link for that particular subject
Are you familiar with Non Maximal Suppression
what kinds of things can i do to improve image classification tasks?
Currently, im just trying out various models and fine tuning them on my dataset. Not sure what else I can explore to improve performance
recommend any starter courses for ds ml? the one i picked out on udemy is pretty dated
Scroll up, we had this discussion just now haha
What are your train and val curves looking like?
Augmentation and/or other regularization strategies might be a good idea
If you have the time for it you can also just hyperparam tune
dont have a val set which may be a mistake now
hi, I'm having problems with my project and I would appreciate if anyone could get on a call with me and help me maybe?
rightt
i have auto transforms that i get from the pre trained model itself
```
ImageClassification(
    crop_size=[288]
    resize_size=[288]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)
```
I think at some point you are beginning to overfit so you can play around with adding dropout in your FC layers, augmentation, ...
there is one dropout layer already but i can increase the proba
Yeah if you have the compute for it, I'd do it with some sort of hyper parameter tuner
hmm, would u do it across diff models too? like effnet b0 to b4 and with various hyperparameter values
what language is that
output of
```py
# Get the transforms used to create our pretrained weights
auto_transforms = weights.transforms()
auto_transforms
```
ah alright
I used KerasTuner a bunch in the past and it allows you to have a "context" for hyperparameters so you can search in a better way
So across models and also "remembering" that hyperparam1_1 is related to model1 and hyperparam1_2 is related to model2 etc
Maybe Optuna has this too - my issue with KerasTuner is that it depends on Tensorflow and installing TF just to get this is crazy
ah hmmm
im not even sure if putting this much effort on just a project to showcase i know how to work with image classification stuff is worth it
ah.. ok.. can u share me related to neural networks? ann, cnn, rnn, etc etc
no..
Look it up. Should help with this problem
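For anyone following along: non-maximum suppression (NMS) keeps the highest-scoring box and drops any other box that overlaps it too much, which is why it can help when a detector like the selective-search loop above draws many near-duplicate rectangles. A minimal sketch in plain Python, assuming boxes are (x, y, w, h) tuples like the selective-search results; the function names, the sample boxes, and the 0.3 IoU threshold are illustrative assumptions, not anything from the original code:

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two near-duplicates and one separate box.
boxes = [(10, 10, 50, 50), (12, 12, 50, 50), (200, 200, 40, 40)]
scores = [0.99, 0.98, 0.975]
kept = non_max_suppression(boxes, scores)
print(kept)
```

In the detection loop this would run once after all boxes above the 0.97 score threshold have been collected, and only the kept indices would be drawn.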
But isn't the analysis part for the data analysts?
People wear multiple hats. It's common to be a data engineer that also does analysis / data science
Okay, so it will depend on the company I work for?
But yeah, even if you don't do it yourself the analyst that is working downstream relative to yourself might do their analysis with SQL
yes
Also, which is heavily used for the ETL? Python or sql?
Probably SQL?
Why though? Because what can be done in sql can also be done in python.
Many data engineers don't know Python
Ohh. So, SQL and database knowledge is top priority if I want to become a data engineer?
imo yes
You work as a data engineer yourself?
I'm an applied AI engineer. I'm the one that does all the data engineering on the team though
In the past I did internships in data engineering specifically
Nice. Can you provide your opinion on the roadmap I am following for data engineering?
Probably better placed people to do that than me :/ maybe @boreal gale
If not, try Reddit
Do you guys think it's worth it to get a teacher for learning python and machine learning?
I am currently on udemy course for data warehousing
The roadmap is fine so long as you do enough projects
I wouldn't spend time on Inmon, Data vault, data mesh or what have you. Just good ol' star schema's are fine for entry level
Star schema and snowflake schema, right?
Yeah just star schema's are fine to focus on in the beginning
Maybe people that do data engineering full time might disagree so I'd go on r/dataengineering and ask their opinion
Got it. I appreciate all the suggestions you gave.
galaxy schema xD
How are you? I'm creating a shopping app using Python Kivy. When I send information from the user interface to SQLite it gets saved, but my app doesn't update in real time. For example, on the marketplace page, if I add an item to my cart it says it's added, but it doesn't appear when I go to my cart; it only shows up if I rebuild the app. How can I make things update in real time?
err.. replying because i got pinged heh.
- is solid, it's where i would start if i were to start over
- is okay, i mean it's nice to know the concepts and all, but imo the value is limited unless you put it into practice
- spark is good to know, but imo is optional, people abuse spark way too often (when you have a hammer, everything looks like nail type of thing), i would just ignore hadoop hive pig, only research them if the job you are applying requires it/you have an unnatural interest in them
- can always help you job hunt, it's a plus but not essential
5):
- airflow is not a must, but sure you need to come to grips with some orchestration tooling, prefect and dagster are viable contenders (heck even luigi depending on your usecases)
- compute: no comment really, but if you know spark then this is probably not a big step up, again not essential imo
- cicd: only CI is relevant to your core duties, knowing how to test your code is a big plus
- docker: hell yes. you can't escape them containers these days.

6): 10000% yes, put it all into practice, do something original, it's the best way to drill some core concept into your brain and it serves well inside a portfolio
but i must say, imo data engineer is not a job you can easily land without some experience in another dev related role, companies that hire junior DEs are few and far between.
also this is quoted often in the DE discord https://github.com/datastacktv/data-engineer-roadmap
and DDIA is almost a religious text in DE https://dataintensive.net/
good luck!
Hi guys, I have a piece of coding instructions and I am using anaconda3, should I type these into the Anaconda prompt or into my VS Code application?
```
start Anaconda3
type:
cd E:\Xfer\NC\MCT2000_LOG_FILE
Press Enter
```
Is there some way to reduce/manage the needed memory for a neural net in tensorflow? I'm building a network with roughly 30 million parameters and i'm using the GPU provided on Kaggle, which has roughly 16 GB of GPU memory. When initializing the model it immediately runs out of memory. Any tips?
(I know one obvious option would be to reduce the complexity of the neural net, i.e. remove a few layers / connections, but say i want to improve the memory usage for a given fixed neural network)
@rose dagger I don't have an actual answer for you, but i know there are several memory optimization things especially around Stable Diffusion (popular/open source) that you MIGHT be able to apply in some way? I'm guessing you are already familiar with some, but there are things like xformers, cunumeric, and several other things. Have you looked into any of those?
I'm actually trying to look into if/how I could potentially convert the ZoeDepth models to use TensorRT for performance boost...lol but so far I've only been trying to "use" AI stuff, not even sure where to start yet.
I have not heard of those yet, but will look into them. Thanks!
Ah ok, then there is hope for you yet lol...good news is this is a common problem, bad news is that it is really hard to find quality information.
That will probably be your single biggest gain. I tried to get it working early on and failed many times. Finally got a better understanding of python environments and such, but it is a near drop in improvement. It DOES have a potential downside, certain things (not sure exactly what all) are not deterministic.
but also check out cunumeric, drop in replacement for some core python stuff that I've read can give performance/memory improvements
Interesting. It potentially not being deterministic probably won't bother me too much. I'll try it, but it'll probably take quite some time to implement
Any idea how far out you are on memory?
like do you need to shave a bit or cut it in half?
I just finished coding my first Neural Network. It is a simple 3 layer neural network. For some reason it is not working. I double checked the math part and everything seems right. If anybody sees any glaring errors please let me know.
here is my code ^
MNIST is a great dataset in awful packaging. Here's a CSV instead of that crazy format they are normally available in. Enjoy!
this is the data that I am working with ^
I am supposed to get something similar to this as my answer for the #7 which is the first number in the test data set
instead I am just getting this
Oh well i cut my number of parameters in half and was still out of memory lol. I'll explore it some more tomorrow and try to get a more precise estimate.
there is a good and extensive write up about optimization on HuggingFace
i was calculating it to show the steps
could that mess up the network if it is not being used ?
No, just a waste of processing time but it won't affect anything
Apparently all outputs are very high, which means the weights might be very high
You could check if that is the case
Might not even be that the network is broken, but f.e. too high learning rate (0.3 is quite high for general models)
i messed with the learning rate but that didnt do anything
i have not messed with the weights yet though
Did you try values like 0.001 ?
Try something like 0.001 see if that makes any difference at all
Checking the manual gradients calculations would take quite a while for me as well, so if it's anything else that would be nice ;P
Btw, why do you have this inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
Are you scared of zeros or something?
its supposed to scale and shift the inputs between 0.1 and 1
Normally you'd normalize to values between 0 and 1
Did the book suggest this (may be because you don't have a bias in your NN)
sorry between 0.01 and 1
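Quick check of that scaling formula, written as a plain-Python sketch (the helper name `scale_pixel` is just an illustrative assumption; the arithmetic is the same as the `inputs = ...` line quoted above):

```python
def scale_pixel(value):
    """Map a raw pixel value in [0, 255] to the range [0.01, 1.0]."""
    return value / 255.0 * 0.99 + 0.01


lowest = scale_pixel(0)     # darkest pixel
highest = scale_pixel(255)  # brightest pixel
print(lowest, highest)
```

So the shift only keeps inputs away from exactly zero; it does not change which pixels are brighter than others.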
yea i am just following along in the book
but the guy in the book did some super weird math that is incorrect so I trained my neural network differently
i am not too surprised that the results are different, just trying to figure out what I need to adjust
if you look in the train section he calculates the cost of the hidden layer output by just multiplying the weights times the output cost or "error" how he calls it
I also just tried making the weights way smaller but that did not do anything
Do you have the csv, can you send in dm?
yea sure
sent you a friend request
Alright let me just check some stuff out then
So the values grow big after the hidden layer, so from hidden to output they get to like 14 on average
When pulling those values through sigmoid they will basically all be close to 1
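To put numbers on the saturation point, here is a small sketch using only the standard library; the value 14 is the average pre-activation mentioned above, and the function names are just for illustration:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    # Derivative of the sigmoid: s * (1 - s)
    s = sigmoid(z)
    return s * (1.0 - s)


for z in (0.0, 2.0, 14.0):
    print(z, sigmoid(z), sigmoid_grad(z))
```

At z = 14 the output is within about one millionth of 1 and the derivative is below 1e-6, so every output looks "on" and almost no gradient flows back through that unit.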
Not sure why those weights are so high yet though
are the weight values that i have between -0.5 and 0.5 super big ?
Nah shouldn't be
lol
found it
[2.31742179e-03 4.87518635e-06 6.80298229e-04 7.63022959e-05
1.15368135e-07 2.46824449e-05 4.67039119e-08 9.99906227e-01
2.01392314e-07 2.47477905e-05]
Getting this output now, with 0.99999 at index 7
The way I found it was by printing out the output_errors_deriv, and found that almost all derivatives were positive
Which means that the model would try to correct the weights to increase those values, but it wanted to increase all values but the one that was the correct target
You swapped targets and final_outputs in your error derivative
output_errors_deriv = 2 * (targets - final_outputs)
And not
output_errors_deriv = 2 * (final_outputs - targets)
@crimson summit
Or actually...
That was correct
doing final_outputs- targets helps me cancel out the -1 i think
But swapping those also fixed it, you should actually change
self.who += self.lr * numpy.dot((output_errors_deriv * final_outputs_deriv), final_inputs_deriv2)
to
self.who -= self.lr * numpy.dot((output_errors_deriv * final_outputs_deriv), final_inputs_deriv2)
instead of swapping targets and final outputs
Because atm you are doing gradient ascent instead of gradient descent
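The sign issue is easier to see on a one-weight toy problem. This is only a sketch under assumed values (a single weight, input 1.0, target 0.0, learning rate 0.1), not the network from the chat: with the derivative written as 2 * (output - target), subtracting the step reduces the loss, while adding it makes the loss grow.

```python
def loss(w, x=1.0, target=0.0):
    return (w * x - target) ** 2


def grad(w, x=1.0, target=0.0):
    # d/dw (w*x - target)^2 = 2 * (output - target) * x
    return 2.0 * (w * x - target) * x


lr = 0.1
w_descent = w_ascent = 1.0
for _ in range(20):
    w_descent -= lr * grad(w_descent)  # gradient descent: step against the gradient
    w_ascent += lr * grad(w_ascent)    # gradient ascent: same gradient, wrong sign

print(loss(w_descent), loss(w_ascent))
```

If the derivative is instead written as 2 * (target - output), that is already the negative gradient, so `+=` is the matching update. Mixing the two conventions is what flips descent into ascent.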
should I swap the sign to negative on the other weight calculating formula aswell
Yeah I'm just checking that
I am now getting the correct largest value for the number 7 so it is working fine now
I made them both negative btw
I just need to make the numbers decimals
I think
Hmm, still something wrong even after swapping, getting 1k of 10k correct (basically random guessing)
yea never mind when I try the second number in the data set its incorrect
Yeah I'm not sure atm, it takes me too long to find too. I'd probably have to write it from scratch myself to see how I would do it and then compare it with your solution, but that takes a bit too long right now.
I don't think I can really help much further :/
I'm doing a deep learning project, and my partner tried out all kinds of hyper params, these were the learning rates he tried out for the grid search ... :/
gotta stay within the same order of magnitude, or the computer will explode /s
Also set learning rate decay to 0.97, with training taking about 10k update steps (learning rate of 10^-14 after 1000 steps or so)
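The arithmetic behind that, as a tiny sketch assuming the decay is applied multiplicatively once per update step (the chat does not show the actual schedule code):

```python
decay = 0.97
steps = 1000

# Factor by which the initial learning rate has shrunk after `steps` updates.
factor = decay ** steps
print(f"{factor:.2e}")
```

Whatever the starting learning rate was, it gets multiplied by a factor on the order of 1e-14 after 1000 steps, so the remaining ~9k updates change essentially nothing.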
At least it's not the other way of hyper 5e4 6e4 7e4
I actually forgot the minus at some point in this project, caused a big head ache haha
Lmao, oh I could only imagine
How would I make a list that follows a distribution that looks something like this, for a given minimum, maximum, and number of items?
most scientific computing libraries with random modules allow for you to specify which distribution to use, for example in numpy's case it would be using one of these methods https://numpy.org/doc/stable/reference/random/generator.html#distributions
(as for which one exactly fits your particular use case, I have no clue though)
Thanks
I am learning data science and learning statistics. Can anyone shed some light on 2 histograms I have, along with how I determine the bins, and tell me if it is a normal distribution? It's confusing when it's not perfect and never seems to be lol
nm, I figured out how to plot against QQ plot
In search algorithms in game, how do we know we have taken good actions?
transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) is this normalization a thing?
like, does anyone actually use that, or is it just usable?
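For what it's worth, transforms.Normalize computes (x - mean) / std per channel, so with mean 0.5 and std 0.5 it simply rescales inputs. A plain-Python sketch of that arithmetic, assuming the pixel values are already in [0, 1] (the `normalize` helper is illustrative, not the torchvision implementation):

```python
def normalize(x, mean=0.5, std=0.5):
    """Same per-value arithmetic as Normalize: (x - mean) / std."""
    return (x - mean) / std


low, mid, high = normalize(0.0), normalize(0.5), normalize(1.0)
print(low, mid, high)
```

So inputs in [0, 1] end up in [-1, 1], centred on zero.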
I'm not from a coding background. Can I learn data science and get a job? Please give suggestions.
when you frame the problem, you generally have some kind of reward. Can you elaborate your question so i can give you a better answer?
take a look at the slides part here: https://www.lamsade.dauphine.fr/~cazenave/MonteCarloSearch.html
this is the course i followed at my Uni
whats your question?
in discord.py
You know it?
please write a full sentence framing your problem
General question. I'm starting to learn ML and i'm wondering if training a ML model to determine even and odd numbers is a smart beginning. Is this a hard goal? is this a simple process?
Essentially:
Feed the model 100'000 numbers between 0 and 60'000,
Train it for idk, 14 epochs,
Save the model and test it with 10'000 numbers between 70'000 and 120'000.
Would that be a doable beginner project?
Hi peeps, wondering if anyone can help me with something. I have a pandas dataframe and I'm running a function through it, but it's getting tripped up by null values. The problem is I can't remove the null values, I just want to skip those rows, I can't find a way to do that, there just seems to be dropna() or fillna() but those null values are supposed to be there, I'm just not working on those bits, is there an ignore null and move on method in pandas?
Why do I feel like someone has already asked this before here
What specifically r u trynna do? Maybe show some code examples
!code
Sure one second
I believe it's my first time writing in this channel.
I'm a beginner at ML and i wonder if that is a smart beginner project.
U can give it a try ig
U want to use a NN?
I was thinking about it yes. I feel like a decision tree would do fine, but i'd like to try a NN, yes
What features will u pass in that makes u think a decision tree model will work?
I believe with enough trial and error it might figure out to follow the simple rules of "if odd: else:" which would result in a 1.0 accuracy.
The training data would consist of random numbers like i explained before and the correct answer for each current number it's training on
I mean, think about it tho
There needs to be some sense in how the model will work right
Yes.
So if ure a decision tree, how would u 'split' the data?
In decision trees, numerical features are treated as 'Is X > 5?' for e.g.
Will any form of >, >=, <= or < work?
@cold osprey I have a largish dataset, 130,000-odd rows. There are two columns I am working with: one has an array which I have exploded, so the values are now single strings on separate rows; the other column is a key-value pair that looks like JSON, though to be fair it's in single quotes, but I can deal with that bit. So after stripping off any excess white space and then applying json.dumps and json.loads, I am now trying to apply the following line:
df[["workflow", "cost_centre"]] = df[["workflow", "cost_centre"]].applymap(ast.literal_eval)
after narrowing all this down, it works as expected until it gets to a row where both of these columns are null values. I need to just skip them, not remove or alter them, if at all possible
No
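To illustrate the point about threshold splits on the raw number: whichever threshold is picked, both sides still contain about as many even as odd values, so a single "n > t" split can barely beat guessing. A brute-force sketch in plain Python (the range 0..999 and the helper names are assumptions made for the demo):

```python
numbers = range(0, 1000)


def is_even(n):
    return n % 2 == 0


def best_threshold_accuracy(values):
    """Best accuracy of one 'n > t' split where each side predicts its majority class."""
    values = list(values)
    best = 0.0
    for t in values:
        left = [is_even(n) for n in values if n <= t]
        right = [is_even(n) for n in values if n > t]
        correct = max(sum(left), len(left) - sum(left)) + max(sum(right), len(right) - sum(right))
        best = max(best, correct / len(values))
    return best


raw_accuracy = best_threshold_accuracy(numbers)

# With a hand-made feature (the lowest bit) the rule becomes trivial.
bit_accuracy = sum(((n & 1) == 0) == is_even(n) for n in numbers) / len(numbers)
print(raw_accuracy, bit_accuracy)
```

A deep enough tree can still memorise the training range one split at a time, but that would not carry over to unseen numbers like the 70'000 to 120'000 test range; the feature representation is what matters here.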
well, i already pieced together a simple feedforward MLP, just to see what happens, but since i have no clue what i'm doing it has an incredible accuracy of 0.5.
I can show you if you'd like.
Accuracy of 0.5 is no better than randomly guessing
AI is so cool!
I'm aware.
Which is what I would expect
What activation functions r u using?
I think u would need some non linear stuff to get it to work, not sure
Okay more general question. What model would be suited for such a task. I'm currently playing around with something like this:
```py
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# GENERATE
training_numbers = np.random.randint(0, 60001, size=100000)
# LABEL
training_labels = np.where(training_numbers % 2 == 0, 1, 0)
# DEFINE
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# TRAIN
model.fit(training_numbers, training_labels, epochs=14, batch_size=10)
# GEN-TEST
test_numbers = np.random.randint(0, 60001, size=10000)
# LABEL-TEST
test_labels = np.where(test_numbers % 2 == 0, 1, 0)
# TEST
_, accuracy = model.evaluate(test_numbers, test_labels)
print('Average Accuracy:', accuracy)
# ANALYZE
#removed for discord
model.save("model.h5")
```
But like i said, i'm just experimenting around, not really knowing what i'm doing
Problems like odd even where there is defined way to calculate it isn't usually solved by ML
Yes, but it seemed like an easy "environment" with simple rules and it's easy to test.
you're talking about % 2 == 0, 1, 0 i assume?
Do you have one in mind that offers a beginner goal?
Iris, or mnist, or fashion mnist
These are like the typical first project datasets before moving onto something that interests u more and u have some domain knowledge over to apply
Fashion mnist was my intro to CNNs
something like this?
Can probably also use a regular MLP for (fashion) mnist because the images are so small
Ye tbh just pick any that interests u
Braining this rn, maybe u could show some example rows before the applymap step?
Otw home rn, will look in more detail in a bit
Yeah u can use a notebook
what is the random_state variable in X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=42)
okay
okay i solved it with a decision tree since it's just numbers.
I wonder if
hm
Haha wdym solve
try much lesser parameters
sure, here is a sample:
workflow cost_centre
220 56860820 "ott" {"ott": "2000920243", "txt": " "}
221 56860822 "txt" {"txt": " "}
222 56860823 "txt" {"ott": "2000920243", "txt": " "}
223 56860823 "ott" {"ott": "2000920243", "txt": " "}
224 56860824 "txt" {"txt": " "}
225 56860825 "txt" {"txt": " "}
226 56860827 "txt" {"txt": " "}
227 15694706 "txt" {"txt": " "}
228 9877816 "txt" {"txt": " "}
229 56860828 nan {"processing": "DE"}
230 56860828 nan {"processing": "DE"}
231 56860830 nan {"processing": "DE"}
232 56860831 "txt" {"txt": " "}
for the record, the result should come out as the following, and does until it sees "nan", which I had no idea was there until I drilled into the data:
workflow cost_centre
220 56860820 "ott" {"ott": "2000920243"}
221 56860822 "txt" {"txt": " "}
222 56860823 "txt" {"ott": "2000920243"}
223 56860823 "ott" {"ott": "2000920243"}
224 56860824 "txt" {"txt": " "}
225 56860825 "txt" {"txt": " "}
226 56860827 "txt" {"txt": " "}
227 15694706 "txt" {"txt": " "}
228 9877816 "txt" {"txt": " "}
229 56860828 nan {"processing": "DE"}
230 56860828 nan {"processing": "DE"}
231 56860830 nan {"processing": "DE"}
232 56860831 "txt" {"txt": " "}
For more context I've labelled the columns though this was taken from the start of the fail row 229. So it is using the value in workflow to match the key in the cost_centre column, which can have up to 5 key value pairs in. They do match up, as the workflow has been exploded so that there is a workflow per row, this is just the last piece of the puzzle so that the correct cost centre is also showing on that row.
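One way to "skip" such rows without dropping or altering them is to make the parsing function itself tolerant. A sketch, assuming the cells are either strings or nulls; the helper name `parse_or_keep` and the four sample rows are made up for illustration, and it maps each column with Series.map rather than DataFrame.applymap:

```python
import ast

import numpy as np
import pandas as pd


def parse_or_keep(value):
    """Parse string cells with ast.literal_eval; return anything else unchanged."""
    if not isinstance(value, str):
        return value  # e.g. a real NaN or None: leave it alone
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value  # e.g. the literal text "nan": leave it alone


# Hypothetical rows shaped like the sample above.
df = pd.DataFrame(
    {
        "workflow": ['"ott"', '"txt"', np.nan, "nan"],
        "cost_centre": [
            '{"ott": "2000920243", "txt": " "}',
            '{"txt": " "}',
            np.nan,
            '{"processing": "DE"}',
        ],
    }
)

for col in ["workflow", "cost_centre"]:
    df[col] = df[col].map(parse_or_keep)

print(df)
```

Rows that cannot be parsed stay exactly as they were, so they can be inspected later instead of crashing the whole pass.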
"the challenge"
maybe solve wasn't the right word
this seems about right
Okay, i now watched a bunch of libraries create a tree which works .
A bit of an odd error: I trained a neural net with the following architecture (see image) and called model.predict(x) on one of the training data points and got the following error. The training worked without any errors, so what's the issue here?
The data point x is a 512x512 array
hmm, need to see what literal_eval does under the hood
The error says the input must be 4-dimensional. Usually that's datapoint_index, height, width, channel, so I'd guess you want to reshape it to (1,512,512,1), assuming you only have one sample and the images are one-channel.
https://stackoverflow.com/questions/52232742/how-to-use-ast-literal-eval-in-a-pandas-dataframe-and-handle-exceptions seems related
Thanks, i'll try that. Why exactly is a datapoint_index necessary? What if i want to predict new data points, for which i may not have an index?
u can pass 2 images at once (dependant on model) so thats another dimension
thats how i see it
Oh i see.
meaning if i wanted to predict 3 data points x1,x2,x3 simultaneously i'd have the input shape as (3,512,512,num_channels), where the first dimension is merely indexing the data points i give as an input
time to learn what that is
not 100% sure, i was thinking more of like having a model that takes 2 images as inputs
Yup, it's just that the input has to be 4d even if you only have one sample (in which case the shape along axis 0 should be 1).
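The reshaping described above, as a NumPy-only sketch; it assumes single-channel 512x512 images and a channels-last layout (batch, height, width, channels), which is what the error message suggested:

```python
import numpy as np

x = np.zeros((512, 512), dtype=np.float32)  # one single-channel image

# Add a batch axis in front and a channel axis at the end.
batch = x[np.newaxis, ..., np.newaxis]
print(batch.shape)

# Three images at once: stack along a new first axis, then add the channel axis.
three = np.stack([x, x, x])[..., np.newaxis]
print(three.shape)
```

The first axis is just "how many samples are in this call", so new, unlabeled data points work the same way.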
how difficult would it be to train a model with pytorch that detects bugs and inserts a print statement after that line
exceptionally
that's like god tier
Is there a go-to or default method for model explainability?
I've standardized the data and also trained a model with that data:
```
data_mean = data.mean()
data_std = data.std()
norm_data = (data - data_mean) / data_std
```
**Now I want to invert the predicted values**
```inverse_data = (predicted_arr * data_) + data_mean```
**But it gives me the below error how can i handle this?**
```ValueError: Length of values (14604) does not match length of index (2)```
what does your dataset_std and dataset_mean look like?
An AI capable of debugging itself?
Is this the next step towards Skynet? 
Hey @potent sky, since you're into trying some things on latent generative models, maybe this may be useful to you:
https://arxiv.org/pdf/2006.10273.pdf
It's a tutorial on Variational AutoEncoders, which explains more of the theory and mathematics around VAEs. It also talks about the confusion around the Decoder Loss (MSE or Likelihood).
My professor sent me this yesterday. Seems interesting.
I just don't really get one thing, though: if the ELBO loss is more accurately applied when using a likelihood metric (like Gaussian likelihood)... why does it work with MSELoss in Diffusion Models?
I mean...I remember the sampling function for diffusion probabilistic models is based on ELBo... 
Hey thanks! I'll check it out
Yeah I read into it long ago I don't remember all the details. I do remember being very satisfied tho the math was goood xd
https://arxiv.org/abs/2107.00630 this might help a bit I think
Diffusion-based generative models have demonstrated a capacity for
perceptually impressive synthesis, but can they also be great likelihood-based
models? We answer this in the affirmative, and introduce a family of
diffusion-based generative models that obtain state-of-the-art likelihoods on
standard image density estimation benchmarks. Unlike o...
Thanks!
sorry, I've edited the question; it's the same 'data_mean' and 'data_std', and their values are mentioned below
```
data_mean value:
column1    22.346957
column2    21.629736
dtype: float64
data_std value:
column1    6.098700
column2    4.249352
dtype: float64
```
@fleet heath here is the complete question kindly look at it
https://stackoverflow.com/questions/76416813/inverse-standardization-of-predicted-values
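A sketch of the inverse step that avoids index alignment between a NumPy array and a pandas Series. It assumes the predictions form a plain array of shape (n_samples, 2) with one column per original column; the actual shape of predicted_arr is not shown in the chat, so a flat array would need reshaping first. The toy `data` frame is made up:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {"column1": [10.0, 20.0, 30.0, 40.0], "column2": [15.0, 20.0, 22.0, 30.0]}
)
data_mean = data.mean()
data_std = data.std()
norm_data = (data - data_mean) / data_std

# Stand-in for the model output: an (n_samples, 2) NumPy array.
predicted_arr = norm_data.to_numpy()

# Convert the statistics to arrays so broadcasting happens by column position,
# not by matching index labels.
inverse_data = predicted_arr * data_std.to_numpy() + data_mean.to_numpy()
print(inverse_data)
```

Here the round trip returns the original values, which is a handy sanity check before applying the same two lines to real predictions.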
What if it will only detect Exceptions? It'll only be for Java code
Hello all, qq. is it fine to run on old pandas version forever, as new pandas versions throwing merge error
this merge error was just a warning in older versions
ideally you should adjust your code so that it gives you neither warnings nor errors
if it works on an old version, technically you can just never update anything and keep using it exactly as is, but if you ever need to add new features to it, or if security is a concern (e.g. web servers), you may want to update things
How do you guys decide where you're going to publish especially if you're doing more applied stuff (like in my case personal health)?
Like what helps you decide if you're going for an AI journal or a health (or any other) journal?
its in my Local, im never gonna move it to prod.
What's the difference between Tensorflow's .numpy() and Numpy's np.array()? How does functionality change if I choose one over the other?
I suppose Tensorflow will simply call np.array() while manipulating the data so the operation can be as efficient as possible
i'd expect no difference at all
maybe .numpy can avoid copying the data.
anyone hear about 'openchatkit' from redpajama? https://twitter.com/togethercompute/status/1666067674382888961
released this morning, is apparently one of the best open source chat models to-date. if you had to say, what do you believe is the best open source LLM
I coded my first neural network and I finally got it to work lets goooo
97% accuracy
What was the mistake in the end? @crimson summit
Has anyone watched this ? https://www.youtube.com/watch?v=pdJQ8iVTwj8&list=PL4_UwQwZnULUCwyjPOczIE3wE5FAH1Tfl&index=4&ab_channel=LexFridman
Chris Lattner is a legendary software and hardware engineer, leading projects at Apple, Tesla, Google, SiFive, and Modular AI, including the development of Swift, LLVM, Clang, MLIR, CIRCT, TPUs, and Mojo.
I had to multiply the cost with respect to a2 by -2 instead of 2 and I had to cut my learning rate in half
I can't stand lex's voice, he sounds high as a kite and his questions are so weird(?) sometimes
I saw some clips. Is mojo going to be the new lang?
Lex also has some hot takes but who am i to judge on that front
Hi, I have a question regarding how to assign values to series in dataframes. I have a (main) dataframe that is divided into multiple years which also contains various kinds of scores. Each year has 1 unique score attached to an identifier. I would like to calculate deciles for every year (shown in code below) and do this using the yearly dataframe. Unfortunately, I have issues assigning this back to the original dataframe. Additionally, although the code below is only for 1 year, I would like to make a for loop function that does each year in the original dataframe. Any help is really appreciated
```py
sustainalytics_scores = ['total_esg_score', 'environment_score', 'social_score', 'governance_score']
sustainalytics_c = sustainalytics.copy()
labels_30th_p = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
yearly = sustainalytics.loc[sustainalytics['year'] == 2014, ['isin'] + sustainalytics_scores]
sustainalytics_c.loc[sustainalytics_c['year'] == 2014, ['ESG_measure_sorts_30']] = pd.qcut(yearly['total_esg_score'], q=10, labels=labels_30th_p)
```
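One possible way to get per-year deciles without a loop is groupby plus transform, which writes the result straight back in the original row order. This is only a sketch on made-up data: the column names are taken from the snippet above, the scores are random, and the deciles come out as integers 1 to 10 rather than the string labels used there.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 50 identifiers per year, random scores.
rng = np.random.default_rng(0)
sustainalytics = pd.DataFrame(
    {
        "year": np.repeat([2014, 2015], 50),
        "isin": [f"ID{i:03d}" for i in range(100)],
        "total_esg_score": rng.normal(50, 10, size=100),
    }
)

sustainalytics_c = sustainalytics.copy()
# qcut runs separately inside each year; labels=False returns bin codes 0..9.
sustainalytics_c["ESG_measure_sorts_30"] = (
    sustainalytics_c.groupby("year")["total_esg_score"]
    .transform(lambda s: pd.qcut(s, q=10, labels=False) + 1)
)

print(sustainalytics_c.groupby(["year", "ESG_measure_sorts_30"]).size())
```

If a year has many tied scores, qcut can raise an error about duplicate bin edges; passing duplicates="drop" is one way to handle that, at the cost of fewer bins for that year.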
Congrats!
how did you do it? trying to learn myself
i was watching andrej karpathy's let's build gpt' from january, but i feel that models have progressed since then
Go forth. The models might have evolved, but if you know the fundamentals, you may be able to adapt pretty fine.
im so bored, im desperate for a project if anyone is working on something
where might a look for some open source projects i might be able to contribute to while i'm learning ds?
you can play around with open datasets on Kaggle and try participating in their Competitions
is an idea, i don't really thrive in competitive settings
feel like id be more motivated if it was something i could invest myself in
that's a useful lead though, seeing a lot of libraries in use that i'm currently studying
Kaggle competitions are generally months long so ig you could invest yourself
Hi I hope you had found a solution. If not, could you clarify what your data frame looks like? So you have a main data frame, with columns ['year', 'total_score', 'env_score', 'soc_score', 'gov_score'], so each year is one row? Or you have an additional column like 'city', so each year is N rows, where N is the number of cities?
guys why does precision have two values when produced using a classification report in scikit learn
1 and 0
I thought precision = TP/(TP + FP) where TP = True positive, FP = false positive
how are there separate values for 1 and 0
is it that the classification_report function is not assuming that 1 means positive and 0 means negative and is thus calculating the precision twice
once for 1 as positive and then 0 for positive
it is made to support multiclass classification, not just binary classification
take a look at the documentation https://scikit-learn.org/stable/modules/model_evaluation.html#classification-report
There are 3 different APIs for evaluating the quality of a model's predictions: Estimator score method: Estimators have a score method providing a default evaluation criterion for the problem they ...
guys...
a doubt regarding how to use precision and recall
actually i am building a cnn model for plane detection
i got this precision and recall values but it didnt recognise the third aeroplane only. so is this right or wrong? or .. any comments
please suggest something..
first list is predicted boxes
second list is ground truth boxes
The metrics are shown on a per-class basis. This is because we might want to know how the model performed per class in the response variable (Y), since one may be more interested in the model's performance on the positive or the negative class (for a binary classification problem) separately.
For example (Assume class label 1 is the positive class here and this is a titanic dataset), this affords you the liberty to infer that:
Precision: Out of all the passengers the model predicted would survive, only 84% actually survived.
Recall: Out of all the passengers that actually did survive, the model only predicted this outcome correctly for 89% of those passengers.
(Now you can also make such inference for the negative class with ease by focusing on the label 0)
Finally, you can as well get a general overview of each metric performance (not per-class level this time) by looking their respective average score.
You can find the complete documentation for the classification_report function here https://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html
Examples using sklearn.metrics.classification_report: Recognizing hand-written digits Recognizing hand-written digits Faces recognition example using eigenfaces and SVMs Faces recognition example u...
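The per-class idea can be reproduced by hand: compute TP / (TP + FP) twice, once treating 1 as the positive class and once treating 0 as positive. A plain-Python sketch with made-up labels (the `precision` helper is illustrative, not the scikit-learn function):

```python
def precision(y_true, y_pred, positive):
    """TP / (TP + FP), treating `positive` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]

precision_for_1 = precision(y_true, y_pred, positive=1)
precision_for_0 = precision(y_true, y_pred, positive=0)
print(precision_for_1, precision_for_0)
```

The two numbers differ because they answer different questions: how trustworthy a predicted 1 is versus how trustworthy a predicted 0 is.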
I have data with 50m rows in Postgres. I could easily manage a 1m-row part of this data even with Pandas.
Now I want to write this data to parquet using PySpark, but it gives a memory error (Java heap). I even partitioned the data by 100. Why can't Spark handle it and do it in small batches?
I'm working on an image segmentation task where i'm currently trying out the U-Net architecture. When making predictions with the trained model, i am getting images of the following form (see attached image). The boundaries seem to be causing some issues here. My guess is that the cause is a combination of (a) the convolution blocks "downsizing" the images together with (b) the decoding part of the U-Net then "upsizing" the images again.
What are some steps to take to remedy this problem? Note that the inputs are WxH sized and the outputs are WxH sized as well, i.e. the same size as the input images. One idea i had was to slightly "crop" the output/train images so that they are of size (W-n)x(H-n), so that i have removed the boundary. It seems that in this Kaggle competition (https://www.kaggle.com/competitions/hubmap-kidney-segmentation/discussion/238198) the winning solution did exactly that. Any thoughts?
Hi there, wonder if anyone can help, I've been trying to drop na values from a dataframe and it just will not go. I've tried the following:
df = df.dropna(subset=['workflow'])
I've tried :
df.dropna(subset=['workflow'], inplace=True)
I've tried:
test_df = df[['workflow']]
test_df = test_df.dropna()
and I've tried:
test_df = df[['workflow']]
test_df.dropna(inplace=True)
Bonus round, I've tried
df = df[df['workflow'].notna()]
In fact, the nan values in the dataframe do not even show up as True if isna() is applied. What else can I do to rid my data frame of this plague please?
show us your dataframe, that should have worked
Hi, here:
224 "soip"
225 "soip"
226 nan
227 "soip"
228 "soip"
This is now just the single column and it still won't go, I'm actually just trying to check for na to see if this is what is messing up my function on the larger dataframe, but I just don't understand why I can never get this to work without a fight
Hello everyone, is anyone interested in Kaggle competitions?
np.nan()
colum dtype?
prob a dtype thingy
224 "soip"
225 "soip"
226 nan
227 "soip"
228 "soip"
Name: Workflow, dtype: object
replace with this to None
capital W
u did workflow
do Workflow
good eye, if that doesn't work then see what type(col.loc[227]) gives you
yes sorry, i've just changed the name of the column as it's work data and I don't want to get into trouble, that was just a typo
ah then nvm
type(new_df.loc[227])
returns str
but if there r multiple data types in a column it will give that particular dtype na?
yea thats what
well ig its nan not np.nan
so replace nan with np.nan
and then do dropna
ah, ok, how can I fix that? The original dataframe is 139,000 rows lol
i see, your data is really weird..
you have value of "soip" which is a string of literally "soip" including the "
and you also have value of nan which is also a string of literally nan
the above suggestions should work, but i would look into why your data is like that in the first place
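A minimal sketch of that suggestion, assuming the missing entries really are the literal text nan. The small Series below is a hypothetical stand-in for the real Workflow column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the problem column: the "missing" entry is the
# three-character string "nan", not a real missing value.
col = pd.Series(['"soip"', '"soip"', 'nan', '"soip"'], name="Workflow", dtype=object)

# isna() only recognises real missing values, so the string is not flagged.
print(int(col.isna().sum()))

# Turn the text into a real NaN first, then dropna() behaves as expected.
cleaned = col.replace("nan", np.nan)
n_missing = int(cleaned.isna().sum())
dropped = cleaned.dropna()
print(n_missing, len(dropped))
```

This works the same way on a 139,000-row column, since `replace` and `dropna` are applied to the whole Series at once.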
yea true..
which dataset are you working on btw? @lost pier
Anyone here a PyTorch whisperer?
I've attempted to build a SqueezeNet, and it blows.
@dusk bear I have a large data set with two columns I am working with: one is an array which has been exploded, the other is a json object that I am trying to map with the result of the exploded column. I hadn't seen the nan values till yesterday, so now I am trying to find a way to skip over the nan values, as this is just a pipeline transformation for financial data, so nothing can be dropped
Wondering if I could talk through my hyperparameters with someone, along with a sanity check.
Yes this data is very nasty it seems, I have put it through JSON.dumps and JSON.loads in an attempt to clean it but that might be what is causing the problem now I look at it
ahh.. ok..
I am working with, one is an array which has been exploded, the other is a json object that I am trying to map with the result of the exploded column
could you share some redacted examples? maybe there is a better way than json.dumps/loads?
The Json loads and dumps was an attempt yesterday, i've removed that now, I'll show you the code that works up to the nan values, one second
oh my apologies, i somehow took it as the json.dumps/loads caused this weirdness.
but yes, showing what you have got would be useful
file = glob(f"{file_path}*.csv")[0]
df = pd.read_csv(file, encoding='utf-8')
df = df.replace({'\'': '"'}, regex=True)
df["Workflow"] = df["Workflow"].str.strip("[]").astype(str)
df["Workflow"] = df["Workflow"].str.split(",")
df = df.assign(Item_Cost_This_Month=df["Cost This Month"] / df["Workflow"].str.len())
df = df.assign(Item_Cost_Next_Month=df["Cost Next Month"] / df["Workflow"].str.len())
df = df.explode("Workflow").reset_index(drop=True)
df[["Workflow", "Cost_Centre"]] = df[["Workflow", "Cost_Centre"]].applymap(ast.literal_eval)
So the above code, works really well so long as there are no null values, here is a sample row of the whole data:
04/30/2023 23:24:26 1242360.0 LongForm 04/30/2023 23:24:26 05/30/2023 00:00:00 True 0 1 29 0.0 0.12 3.34 ['soip', 'ott'] uk {'ott': '1234567890', 'soip': ' '} abc xyz prd NaN NaN NaN
workflow is the array, and cost code is the key value pair
I've never posted on ai stackexchange, so could someone who is more active on that site tell me whether a question like this would be appropriate to ask there? Or is the site not meant for such questions?
The above only trips up when it gets to a row where the value in the array field is nan, as this value is used to map the key value pair, I just didn't expect skipping over it would be such a battle
this is a very good start to nailing what issue is plaguing you, now could you post some problematic (and some normal) rows? redact info if necessary
df["Workflow"] = df["Workflow"].str.strip("[]").astype(str)
this is mildly concerning to me; I suspect it might be what's stringifying everything
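A rough sketch of one way around it, assuming the goal is to parse the list-like text while leaving genuinely missing cells alone. Converting a NaN to text gives the string nan, which is why casting the whole column to str can hide missing values from isna() (the exact behaviour of astype(str) may differ between pandas versions). `parse_list` is a hypothetical helper, not something from the code above.

```python
import ast
import numpy as np
import pandas as pd

# Hypothetical two-row sample: one list-like string and one real missing value.
raw = pd.Series(["['soip', 'ott']", np.nan], dtype=object)

# str() of a NaN is the text "nan", which isna()/dropna() no longer recognise.
nan_as_text = str(float("nan"))

def parse_list(cell):
    """Parse a list literal; pass real missing values through untouched."""
    if pd.isna(cell):
        return cell
    return ast.literal_eval(cell)

parsed = raw.map(parse_list)
print(parsed.tolist())
```

Because the missing cell stays a real NaN, a later `explode` keeps the row and `isna()` can still find it.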
yes, so here is a row after the explode, but before the ast line:
True 0 1 29 0.0 0.12 3.34 'ott' uk {'ott': '2000920243', 'soip': ' '}
and here is a line that is causing an issue:
False 0 0 0 0.0 0.0 0.0 nan de {'content_processing': 'XYZ'}
the above line is the first one that fails, and after looking at the csv and manually copying it out, I saw the issue and then after more digging, found that this was what was stopping it. Today I thought if I dropped all the nan's I could validate that theory lol
Ah right let me explain that one for you, the data looks like an array, but is not, so I convert to a string to remove the square brackets so I can then split it back into an array, but yes I see what you are referring to
That's the column that's causing the problem too
Possibly you want to do something like a json.loads followed by pandas.json_normalize
can we have the header as well so we are on the same page? or just highlight which column is which (the ones you have used anyway)
Yes it's the columns that have nan and the key value pair, these ones:
workflow cost_centre
'ott' {'ott': '2000920243', 'soip': ' '}
nan {'content_processing': 'XYZ'}
I have to split this up for these two columns and produce another csv that can then carry on down the pipeline into google big query I think it goes
i understand now.
is workflow really 'ott'?
or is it "ott"?
only the latter is valid JSON
It actually comes in from the csv as ['ott'], but I don't know if that's python doing that
same with the key value pair, it looks like json with single quotes, which I thought was not valid json,
I did put this in there:
df = df.replace({'\'': '"'}, regex=True)
not sure if it was in the above code, I've tried all sorts of things to clean this up, I'm getting a little lost
So after that I tried the json.dumps and json.loads, and that did get the cost centre column into a valid json format, however literal_eval was working on the single quote dictionary version to be fair.
hopefully this gives you some inspiration.
!e
import pandas as pd
import ast
df = pd.DataFrame({"itemgetters": ["['a', 'b']", "['a']"], "lookup": ["{'b': 'quack', 'a': 'meow'}", "{'b': 'quack', 'a': 'meow2'}"]})
df_parsed = df.applymap(ast.literal_eval).explode('itemgetters').reset_index()
lookup_values = pd.concat(
    [
        df_group['lookup'].str[key]
        for key, df_group in df_parsed.groupby('itemgetters')
    ]
)
res = pd.concat([
    lookup_values,
    df_parsed,
], axis=1)
print(df)
print(df_parsed)
print(lookup_values)
print(res)
@boreal gale :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | itemgetters lookup
002 | 0 ['a', 'b'] {'b': 'quack', 'a': 'meow'}
003 | 1 ['a'] {'b': 'quack', 'a': 'meow2'}
004 | index itemgetters lookup
005 | 0 0 a {'b': 'quack', 'a': 'meow'}
006 | 1 0 b {'b': 'quack', 'a': 'meow'}
007 | 2 1 a {'b': 'quack', 'a': 'meow2'}
008 | 0 meow
009 | 2 meow2
010 | 1 quack
011 | Name: lookup, dtype: object
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/xadogowike.txt?noredirect
i gotta bail now, good luck!
@boreal gale thank you for your help sir, it has been very inspiring for sure
@tidal bough thanks very much for your help also, you have indeed correctly identified what was causing that problem. I am able to dropna() straight after ingest
Ok, i have posted my question on StackExchange: https://ai.stackexchange.com/questions/40742/convolutional-neural-network-struggling-at-the-boundary-of-images
I hope some of you might be able to help, but even if not, i'd appreciate an upvote on the question, if you think it is well-posed and interesting, in order to increase its visibility.
can someone help me with object detection? how do I convert xml to csv for tf records?
Hey, I'm currently researching about digital twins and simulation and I wanted to ask whether someone here has some knowledge and could answer me some questions and give me a small overview on the topic (pm if its ok)
people aren't likely to want to DM you to find out if the questions are ones that they know the answer to, so you should ask your questions here.
import nltk
# nltk.download()
from nltk.tokenize import word_tokenize
from spellchecker import SpellChecker
from gingerit.gingerit import GingerIt
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Step 1: Tokenization
def tokenize_text(text):
    return word_tokenize(text)

# Step 2: Spell Checking
def correct_spelling(tokens):
    spell = SpellChecker()
    corrected_tokens = [spell.correction(token) for token in tokens]
    return corrected_tokens

# Step 3: Grammar Correction
def correct_grammar(text):
    parser = GingerIt()
    result = parser.parse(text)
    corrected_text = result['result']
    return corrected_text

# Step 4: Missing or Extra Words
def correct_missing_or_extra_words(text):
    tokenizer = AutoTokenizer.from_pretrained("grammarly/coedit-large")
    model = T5ForConditionalGeneration.from_pretrained("grammarly/coedit-large")
    input_text = text
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids
    outputs = model.generate(input_ids, max_length=256)
    edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return edited_text

# Example usage
input_text = "Thiiss is aa testt sentnce with spelng mistakas."
tokens = tokenize_text(input_text)
corrected_tokens = correct_spelling(tokens)
corrected_text = ' '.join(corrected_tokens)
corrected_text = correct_grammar(corrected_text)
corrected_text = correct_missing_or_extra_words(corrected_text)
print(corrected_text)
# for i in range (0, len(corrected_tokens)):
#     print(corrected_tokens[i])
this code is extremely slow bcz of the 4th function
also the expected output was:
This is a test sentence with spelling mistakes
but what I got:
This is a test sentence to see if I can spot mistakes.
which is more likely to cause overfitting in random forests: a high number of estimators or a low one?
import math
import time
from pynput import keyboard, mouse

is_active = False
last_toggle_time = 0

def on_press(key):
    global is_active, last_toggle_time
    try:
        if key.char.lower() == 'c':
            current_time = time.time()
            if current_time - last_toggle_time > 0.5:
                is_active = not is_active
                last_toggle_time = current_time
                if is_active:
                    start_spinbot()
    except AttributeError:
        pass

def start_spinbot():
    screenSize = mouse.Controller().position
    centerX = screenSize[0] / 2
    centerY = screenSize[1] / 2
    radius = 200
    angularSpeed = 0.1
    mouseController = mouse.Controller()
    angle = 0
    while is_active:
        x = centerX + radius * math.cos(angle)
        y = centerY + radius * math.sin(angle)
        mouseController.position = (x, y)
        angle += angularSpeed
        time.sleep(0.01)

def on_release(key):
    if key == keyboard.Key.esc:
        return False

def main():
    print('Press "c" to activate/deactivate the spinbot. Press "Esc" to exit.')
    with keyboard.Listener(on_press=on_press, on_release=on_release) as listener:
        listener.join()

if __name__ == '__main__':
    main()
i can't really find a channel for my issue, but my code is meant to spin the cursor around the middle of a native 1440p screen. Not only does it not spin in the middle, it also doesn't stop after pressing C again
should it work on a UW screen?
trynna run it rn
if i run it it works fine but like i said just doesnt even spin in the middle of the screen and it does not stop no matter what i press or well until i alt f4 out of it
ok sec lemme try
Does anybody know whether its possible to simulate a digital twin of a CAD model in python?
and if yes does anybody have a paper or link to a readme or w.e.
did it work
sec setting up env
wanna install pynput in separate venv
hmm
mouse aint spinning
i can press esc to exit tho
wait nvm
i didn't press c lol
im thinking coz when start_spinbot() is running, it doesnt register any keystrokes?
very possible
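If that guess is right, one possible fix is to run the spin loop in its own thread so the key handler returns immediately, and to take the centre from the screen size rather than from the current cursor position (`mouse.Controller().position` is where the cursor is, not how big the screen is). The sketch below is only an illustration: the 2560x1440 resolution is an assumption, `Spinner` is a hypothetical helper, and `move` stands in for whatever sets `mouseController.position`.

```python
import math
import threading
import time

SCREEN_W, SCREEN_H = 2560, 1440  # assumed 1440p resolution

def circle_point(angle, radius=200, center=(SCREEN_W / 2, SCREEN_H / 2)):
    """Point on a circle of the given radius around the screen centre."""
    return (center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle))

class Spinner:
    """Runs the spin loop in a background thread so key events keep arriving."""

    def __init__(self, move, step=0.1, delay=0.01):
        self._move = move          # callback that receives each (x, y) position
        self._step = step
        self._delay = delay
        self._stop = threading.Event()
        self._thread = None

    @property
    def running(self):
        return self._thread is not None and self._thread.is_alive()

    def _run(self):
        angle = 0.0
        while not self._stop.is_set():
            self._move(circle_point(angle))
            angle += self._step
            time.sleep(self._delay)

    def toggle(self):
        if self.running:
            self._stop.set()
            self._thread.join()
            self._thread = None
        else:
            self._stop.clear()
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()

# Demo without touching the real mouse: collect the positions in a list.
positions = []
spinner = Spinner(positions.append)
spinner.toggle()      # start
time.sleep(0.05)
spinner.toggle()      # stop
```

In the original script, `on_press` would call `spinner.toggle()` instead of `start_spinbot()`, with `move` set to a function that assigns the position to the mouse controller.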
what do I put as a checkpoint for tensorflow?
on line 145
Why do you want checkpoints? Do you know exactly what they are?
the guide said to use a checkpoint
Can we take a step back for a second, what are you trying to do?
i am trying to train an object detector with mobilenet-ssd v2 320x320
i have these files, im not sure which one to use
https://towardsdatascience.com/custom-object-detection-using-tensorflow-from-scratch-e61da2e10087
this was the guide I am using, it doesn't go in depth though
Custom Dataset Training for Object Detection using TensorFlow | Dog Detection in Real time Videos | Perfect Guide for Object Detection
Is the object you're trying to detect not part of the coco classes?
Okay great, sorry for asking. Just wanted to be sure
In all honesty I don't know either but I'm going to have a look as well
iirc checkpoint of the model during training?
Yeah but they seem to have multiple check point files
oh it's starting from the ssd_mobilenet_v2_coco checkpoint to train
on step 8, if you download the zip file, there are 3 checkpoint files
https://towardsdatascience.com/custom-object-detection-using-tensorflow-from-scratch-e61da2e10087
am I missing something?
This answers it: https://stackoverflow.com/questions/41265035/tensorflow-why-there-are-3-files-after-saving-the-model
so basically the meta file is the checkpoint file i'm looking for
@past meteor that doesn't work
If you want to use the checkpoint for training, all of them are important
The meta file describes the graph structure etc. The .data file has the actual model Weights
You can create a checkpoint object with tf.train.latest_checkpoint and then load weights in using the model.load_weights() method
This will probably be the simplest way
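To illustrate the "all of them are important" point: the files of one checkpoint share a common prefix, and tools that load a checkpoint are usually given that prefix rather than any single file. Here is a small pure-Python sketch that finds the prefix from a list of file names; the file names and the `checkpoint_prefixes` helper are hypothetical, and no TensorFlow call is involved.

```python
import re

def checkpoint_prefixes(filenames):
    """Return the distinct checkpoint prefixes found among the file names."""
    pattern = re.compile(r"^(?P<prefix>.+)\.(?:index|meta|data-\d+-of-\d+)$")
    prefixes = []
    for name in filenames:
        match = pattern.match(name)
        if match and match.group("prefix") not in prefixes:
            prefixes.append(match.group("prefix"))
    return prefixes

# Hypothetical contents of an extracted model zip.
files = [
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "pipeline.config",
]
prefixes = checkpoint_prefixes(files)
print(prefixes)
```

With files like these, the value to point a loader at would be the shared prefix, not the .meta file on its own.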
@potent sky can I use you as a sounding board for a second?
Or at least it used to be, last I used it
TF undergoing too many changes atm ;-;
I want to make synthetic data (tabular use cases).
I was thinking of going with graphical models because I can specify how everything is related to each other first. Afterwards I sample from it and send it through a (V)AE to add a bit of unpredictable/non-boring noise.
Am I severely overengineering/overthinking this?
alright, let me try that
If all the relationships are linear and everything is independent w.r.t. each other then I'd obvs just sample my N variables and make a predetermined function that determines f(X_1, ..., X_N) but that's just too boring
Hmm I think it depends on the eventual requirements of the synthetic data, what level of information you want it to carry, what it's going to be used for no?
Your reasoning for using graphical models seems pretty sound. If I wanted data realism and had to capture relationships between different variables, this would be a good option
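A minimal sketch of the graphical-model part of that idea: fix a small DAG, then sample each variable given its parents (ancestral sampling). The structure x1 -> x2 and (x1, x2) -> y, the coefficients, and the noise levels are all invented for illustration, and the (V)AE noise step is left out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical DAG: x1 -> x2, and (x1, x2) -> y.
# Root node: sampled on its own.
x1 = rng.normal(loc=0.0, scale=1.0, size=n)

# Child node: a non-linear function of its parent plus noise.
x2 = 0.5 * x1**2 + rng.normal(scale=0.3, size=n)

# Binary outcome: Bernoulli draw whose probability depends on both parents.
logits = 1.5 * x1 - x2
p_y = 1.0 / (1.0 + np.exp(-logits))
y = rng.binomial(1, p_y)

# One row per synthetic record, columns in the order x1, x2, y.
data = np.column_stack([x1, x2, y])
print(data.shape)
```

Because each variable is drawn only after its parents, the dependencies written into the graph show up in the sampled table, which is what separates this from sampling N independent columns.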
Be sure to check out the docs
Are you on windows? TF GPU is not supported on windows anymore I think.
Overall tf is undergoing many changes
nvm, i'm back at the same error
linux
do I just wait for this?
