#data-science-and-ml

potent sky
#

Possible. I generally try to build from scratch all the basic stuff

past meteor
#

I could use them for my work (medical stuff, modelling something as a function of vital signs + behaviour)

potent sky
#

It's not just the coding part, but moreso the questions that arise during the process that are more valuable imo

past meteor
#

But I just don't get why people are using them for time series. You don't want permutation invariance there

potent sky
#

I almost certainly go on deeper math rabbit holes than coding when implementing something from scratch

past meteor
#

So they have temporal fusion transformers that take away the permutation invariance of transformers, why not use a regular RNN at that point etc...

potent sky
#

I find that a lot of details I might've glossed over when just reviewing the theory become difficult to ignore once you're implementing it

potent sky
#

The other benefits of Transformers are exceptional for time series. And positional embeddings are powerful enough to make it work

hasty mountain
potent sky
#

gtg now ;-;

hasty mountain
#

There's the Transformer for Stable Diffusion's text conditioning, there are Transformers in AlphaStar (DeepMind's AI that reached Grandmaster in StarCraft II), there are Transformers for image classification, for video classification, face recognition, text classification...

#

Maybe there are also ones for working with audio. I just haven't found any yet

agile cobalt
#

https://ai.googleblog.com/2022/10/audiolm-language-modeling-approach-to.html

AudioLM is a pure audio model that is trained without any text or symbolic representation of music. AudioLM models an audio sequence hierarchically, from semantic tokens up to fine acoustic tokens, by chaining several Transformer models, one for each stage. Each stage is trained for the next token prediction based on past tokens, as one would train a text language model. The first stage performs this task on semantic tokens to model the high-level structure of the audio sequence.

potent sky
#

And tbh often there's a lot of thinking and mathematical justification that goes into where to throw a transformer and what kind of transformer

past meteor
hasty mountain
past meteor
#

I'll try them out for my work if I have spare time

#

At the very least it'll be a learning experience lol

merry roost
#

How would you guys suggest I learn PyTorch? I want to train a simple model over the next few days, taking in ~140 inputs, I'm not sure how many hidden layers, but only 7 outputs. I have learned a bit about AI and how it works, the math behind it and all that, but I just never learned PyTorch. Most tutorials on PyTorch seem kind of confusing, and the main docs are very in-depth for a starting guide.

The data for input and output is repeated data of

  1. type
  2-4. position
  5-7. rotation

ABOVE IS COPY FROM #python-discussion to continue a conversation.

If anyone wants to help: I don't know how to do this well. I can generate the data in this form and would prefer to use this rather than wave function collapse to generate a map for a game.

#

@nova pollen hola

nova pollen
#

what would the training data be?

#

the more context I get the less I think deep learning is suitable 😅

merry roost
# nova pollen what would the training data be?

fine, here is the long version

I want to take a racing game that has track pieces for a lot of things, and use the previous track pieces to generate the next one, and so on. The training data will come from maps that have no boosters or any sort of speed increase, and will be sorted for the AI model by distance from the start block. The reason I have to sort it is that all maps store their blocks unsorted; by using only maps without speed boosters, I can assume everything goes downhill, or at least away from the spawn point, which lets me find the next piece and get a good training set.

#

Using that training set I will try to generate the next block that was placed in the map, assigning weights to how similar blocks are to each other so the AI still gets points for a near miss

#

Also, when choosing points, if the selected piece is less than 20 away from the start, I will just include 0's as the types and positions

#

*and rotations, allowing me to slowly generate a map using this AI after it has trained on enough map data for long enough

#

This approach is better than wave function collapse because I want to learn AI and I don't want to have to find all the parts myself, along with the fact that wave function collapse would eliminate any jumps or obstacles

merry roost
#

I know this is possible, the question is just how hard it is

nova pollen
#

mm apart from the sequence modelling point i mentioned earlier

#

this is a generative problem

#

since there isn't really a "ground truth" next object

merry roost
#

I would train it on a lot of community-made maps

#

*recently there was a no-power competition where people could tag their maps as not using any powered objects like boosters

nova pollen
#

right, but if I gave you the sentence
"I am currently eating a BLANK"
and asked you to complete it, there would be many possible valid continuations

merry roost
nova pollen
#

this makes training it as if there were only one correct answer difficult

merry roost
#

I just wanted to do this for the reasons specified before

nova pollen
#

mm

merry roost
#

as wave function collapse generally only looks at its direct neighbors, but I could have it extend its size ig, or make premade assets for each road section

#

but either way it would not be as good as training it directly...

merry roost
# nova pollen mm

I have a small ARM computer that I already have on 24/7 running nothing rn, so I could just give it the task on the CPU for a few days.

nova pollen
#

doesnt hurt to try i suppose

merry roost
#

it would have conflicting information

#

but it would work

potent sky
#

JavaLim in ai channel 👀

potent sky
potent sky
merry roost
cold osprey
#

u wanna replicate how the game generate track pieces?

potent sky
#

The first rule of machine learning is: don't use machine learning
i.e. try to find a simpler solution, mathematical or algorithmic

#

I read your long version but tbh it wasn't clear to me what exactly you're trying to do

#

Like wdym by "sorted into the ai model"

merry roost
cold osprey
#

assuming the game ure using doesnt just randomly generate track pieces and stitch them together, ure just replicating that algorithm by training a model on that data no?

rose dagger
#

Sorry to interrupt the lively discussion here: I have a problem with the training time of my convolutional neural network. The inputs are 512x512 (grayscale) images and I want to perform image segmentation. For this I am choosing the U-Net architecture. Now even for a reduced training set of only ~100 samples, a single epoch takes ~1h to finish. My total amount of training data is ~1600 samples (not even including additional data augmentation). What would be smarter to do in order to cut down on training time while keeping some of the performance: (i) "reduce" the images to 256x256 or even 128x128 by some kind of "blurring", (ii) reduce the network's architecture by removing a few layers, or (iii) something else?

merry roost
# potent sky Like wdym by "sorted into the ai model"

the track pieces are completely out of order in the data, but I have their positions. Then, using the start block, I can try to find which track piece comes next by sorting them by distance, using only tracks that have no speed increase;

track pieces are out of order, so I am sorting by their distance from the start block, using only tracks that have no external power for your vehicle. That lets me know what order they are in and use that as input data; without sorting, it would be hard

merry roost
cold osprey
#

why not just write an algorithm (non ml) to generate tracks?

merry roost
#

I am not using wave function collapse because it's not fun, I want to learn AI, and I cannot allow for things such as jumps with that.

rose dagger
rose dagger
#

yes

potent sky
#

Pain

merry roost
cold osprey
#

sounds like seq to seq

potent sky
#

Why can't you do jumps with waveform collapse. What are jumps here

cold osprey
#

not sure of the details if its non fixed length, etc tho

cold osprey
#

and how u would restrict certain combination of track pieces. ig it will be learnt based on community made maps

merry roost
#

disregarding any scenery

potent sky
#

Are the "track pieces" to be selected from a finite, discrete, pre-determined set?

past meteor
#

The thing with time series at least is that very very simple models (t = t-1) type things or exponential smoothing can outperform complex models
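For reference, the two baselines mentioned here fit in a few lines. This is a minimal sketch in plain Python; the function names, the smoothing factor `alpha=0.5`, and the toy series are illustrative assumptions, not anything from the conversation.

```python
def naive_forecast(series):
    """Persistence baseline: predict the next value as the last observed one."""
    return series[-1]


def exp_smoothing_forecast(series, alpha=0.5):
    """Simple exponential smoothing: a running weighted average that
    gives weight alpha to the newest observation at every step."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Made-up toy series, purely for illustration.
series = [10.0, 12.0, 11.0, 13.0]
print(naive_forecast(series))
print(exp_smoothing_forecast(series, alpha=0.5))
```

Any fancier model should beat both of these on held-out data before it is worth the extra complexity.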

potent sky
#

Yes

past meteor
#

I think where transformers will matter in my case is when I start doing long-horizon forecasting with a large conditioning window

#

Because that's the space where basic models fall flat

merry roost
cold osprey
#

if all tracks are fixed length, could just do fixed length seq to seq

potent sky
#

Or when there is complex structure inherent in your sequence. Time series forecasting tasks work well with simpler methods, like extrapolating stock prices maybe.
But you'd be hard pressed to compete with transformers on say speech or language tasks

rose dagger
cold osprey
foggy kestrel
#

hey, not really a coding question but does anyone know where i can download the "tesseract executable"

rose dagger
#

Ok, well i'm working on Kaggle notebook, so i should be good, right?

potent sky
potent sky
merry roost
potent sky
#

Refrain from downloading executables from unauthorised or untrusted sources

foggy kestrel
potent sky
foggy kestrel
#

i will go through again thank you very much

potent sky
#

Or 7 or 3

potent sky
merry roost
#

it only generates one piece at a time so that should be fine

potent sky
#

Yes but it isn't necessary for each track generated to be 13 pieces long is it?

foggy kestrel
#

kinda weird but i think i did find it, had to do some digging in the original tesseract-ocr engine page

merry roost
#

but based on the training data, I want it to generate the end piece

potent sky
#

This seems like you can just sample from two distributions, one over your set of track pieces and one to regulate when the track ends. You can add a bias to tune it.
I don't think it requires ML but sure you can use it if you want

merry roost
potent sky
#

Look into sequence modelling, RNNs, etc. There should be tutorials on pytorch docs.
And remember, one of the most important parts of an ML problem is formulating the data and model inputs in the right manner. You could be stuck in a simple problem for ages if you don't do this right.
Don't rush to the modelling part, give data all the time it demands and you should be better off for it

potent sky
potent sky
merry roost
#

I want to generate a track in a game consisting of only track pieces, starts, ends, and checkpoints. Let's just say this is then an array of those real in-game object IDs, mapped to 0-13, with 0 being nothing and only occurring before the start block.

The tracks to train on will first have to be filtered down to those with 0 boosters / external power, to limit the direction to downwards and away from the start block. This will allow us to then sort the track pieces, which are currently randomly placed in the file, into a neat sequence from start to end.

We then use this data to train a model by taking a random piece from a random track and selecting that piece as the one to be generated. This can be anything but a start block (start blocks will only ever exist once and will never be placed by the AI). The AI then takes the following data about blocks:

1x -
current track length

20x -
type (mapped between 0-13)
position (x,y,z) (clamped to a 1/4 grid tile)
rotation (x,y,z) (clamped to 45° increments)

This data is given to the model, which then has to guess the track piece that was selected. This will repeat over and over, attempting to generate blocks in the positions that are most relevant in tracks online.

This will not be very accurate, but with enough training it should be close enough.

Wave function collapse is not a good option because, for things like jumps or the end, it needs more information that is easier to provide to an AI model.

generated format:
type (mapped again)
position (clamped again)
rotation (clamped again)
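The layout described above (1 length value plus 20 pieces of 7 fields each, giving 141 inputs and 7 outputs) can be sketched as follows. This is only an illustration under stated assumptions: each piece is taken to be a tuple `(type, x, y, z, rx, ry, rz)`, "current track length" is read as the number of pieces already placed, and the helper names `sort_by_distance` and `make_example` are hypothetical.

```python
import math

WINDOW = 20  # number of previous pieces fed to the model
FIELDS = 7   # type + position (x, y, z) + rotation (x, y, z)


def sort_by_distance(pieces, start_pos):
    """Order unsorted pieces by Euclidean distance from the start block."""
    return sorted(pieces, key=lambda p: math.dist(p[1:4], start_pos))


def make_example(ordered, idx):
    """Build (inputs, target) for predicting ordered[idx] from the
    up-to-WINDOW pieces before it, zero-padded at the front."""
    history = ordered[max(0, idx - WINDOW):idx]
    padding = [(0,) * FIELDS] * (WINDOW - len(history))
    inputs = [float(idx)]  # assumed meaning of "current track length"
    for piece in padding + history:
        inputs.extend(float(v) for v in piece)
    return inputs, [float(v) for v in ordered[idx]]


# Made-up pieces: (type, x, y, z, rx, ry, rz); type 1 stands in for the start.
pieces = [
    (3, 0.0, -2.0, 8.0, 0, 45, 0),
    (1, 0.0, 0.0, 0.0, 0, 0, 0),
    (2, 0.0, -1.0, 4.0, 0, 0, 0),
]
ordered = sort_by_distance(pieces, start_pos=(0.0, 0.0, 0.0))
inputs, target = make_example(ordered, idx=2)
print(len(inputs), len(target))
```

Sorting by straight-line distance only recovers the true order when the track never doubles back toward the start, which is the reason given above for restricting training to booster-free maps.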

potent sky
merry roost
potent sky
#

No it's alright dw about it. I just check this when I can

merry roost
potent sky
#

Why is this problematic for jumps

#

*for waveform collapse

merry roost
#

so you either make bigger sections for wave function collapse

#

containing multiple pieces

#

or you have to expand on wave function collapse

potent sky
#

But your task is only to create the track right? Why consider jumps

merry roost
potent sky
#

So there are jumps between certain pairs of track elements (say 2-7) and not between others?

#

Or jump is one of the 13 elements?

merry roost
#

this makes jumps easily possible

cold osprey
#

the model doesnt necessarily need to predict piece and position

merry roost
#

yes, but I would like it to

cold osprey
#

just piece should be fine if u set it in such a way that the outputs are already in its designated position

merry roost
#

yes, but that makes it so jumps cant be done

potent sky
#

I feel there's a bunch of information here that isn't apparent to us as it is to you since we don't know the game you're working with

#

I believe you can try to go for sequence modelling using ML. If nothing else, the process of preparing and structuring the data for the model should help you gain a lot of clarity about the problem

merry roost
# potent sky I believe you can try to go for sequence modelling using ML. If nothing else, th...

here is the exact game i wanted to try to do it on
https://store.steampowered.com/app/1440670/Zeepkist/

Zeepkist is a racing game for 1-4 players, or up to 64 online, in which players race down extreme downhill soapbox courses to set the best times possible! If you like weird physics, soapbox racing, and/or creating your own crazy tracks, then this is the game for you! 🔸 Race against time itself in Adventure mode! 🔸 Crash into your friends in 4-playe...

#

Just remove all non track blocks

potent sky
#

I do think this can be solved using some probability and statistics, without ML. But you can try it out and see

merry roost
#

and use maps with no boosters

merry roost
#

I guess I could use some sort of modified waveform collapse

#

but it would be difficult

cold osprey
#

I'm thinking more like how a sentence is generated, previous words matter to the next word being generated

potent sky
cold osprey
#

From my brief reading about wave function collapse, I don't see why u don't wanna use it

merry roost
#

along with the fact I want to learn about using PyTorch

cold osprey
#

Yeah like if a jump block has been selected for piece 5, piece 6 cannot be another jump right?

cold osprey
potent sky
#

Whether or not it can be solved without ML, it does look like something that can be usefully solved with ML. So if you want to use it as a project to dive into learning ML, go for it

merry roost
#

It depends on how you generate this, and it's hard to explain right now in short sentences, but I believe AI is what I want for this rather than wave function collapse

merry roost
potent sky
#

Look into sequence modelling

#

RNNs, GRUs, LSTMs and the like
Transformers might be overkill

#

Also look into some of the simpler mathematical sequence modelling functions before that. You can derive inspiration from them if nothing else

cold osprey
#

Attention is all you need

merry roost
merry roost
potent sky
#

It refers to Transformers

cold osprey
#

Or rather, the paper

potent sky
#

A machine learning method for sequence modelling

potent sky
potent sky
merry roost
#

only the more recent track pieces affect the outcome

cold osprey
#

Start with what's simpler and easier to do

#

U can always build from there

sleek harbor
#

is it just me or does it make more sense to use permutation_importance instead of feature_importances_ or coef_ for the importance_getter of SelectFromModel? (sklearn)

potent sky
merry roost
potent sky
#

Also look into autoregression

errant bison
#

would be soo helpful if u provide the yt link for yolo + ocr.

potent sky
# errant bison would be soo helpful if u provide the yt link for yolo + ocr.

I-
Nvm there you go:
https://youtu.be/FKGtdSJu3X4

Your exact project, have fun lol

potent sky
merry roost
#

Oh last thing, is 141 inputs a good amount? Is it large or small? Also how many hidden layers / nodes should I have? Remember, only 7 outputs.
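For scale, 141 inputs is small by deep learning standards. Below is a minimal PyTorch sketch of a network with that shape. The two hidden layers of 64 units, the MSE loss, the Adam settings, and the random stand-in data are all assumptions chosen for illustration, not a recommendation from the thread.

```python
import torch
from torch import nn

torch.manual_seed(0)

N_INPUTS, N_OUTPUTS = 141, 7
HIDDEN = 64  # assumed width; worth tuning on real data

model = nn.Sequential(
    nn.Linear(N_INPUTS, HIDDEN),
    nn.ReLU(),
    nn.Linear(HIDDEN, HIDDEN),
    nn.ReLU(),
    nn.Linear(HIDDEN, N_OUTPUTS),
)

# Random stand-in data; real inputs would come from the sorted track pieces.
x = torch.randn(256, N_INPUTS)
y = torch.randn(256, N_OUTPUTS)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

losses = []
for epoch in range(20):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass and loss
    loss.backward()              # backpropagate
    optimizer.step()             # update the weights
    losses.append(loss.item())

n_params = sum(p.numel() for p in model.parameters())
print(n_params, losses[0], losses[-1])
```

Treating the piece type as a regression target is a simplification here; since the type is one of a fixed set of IDs, a classification head for it would be a natural alternative.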

errant bison
potent sky
# errant bison but that would simply not be training with yolo right..? thanks but

Fair enough
Look, your whole solution is neatly divided into 2 models
YOLO to detect and extract the license plate
And OCR to convert that to digital text
Just look up a yolo tutorial even without OCR and you should be fine
There are tons of yolo training tutorials. I'm a little busy rn so I can't search but it should be easy enough to find

potent sky
#

Have a meeting now gtg

merry roost
rose dagger
#

In a convolutional layer with 3x3 filter, why should the number of channels increase to 64? I understand that due to the filter being 3x3 a 572x572 image is mapped to a 570x570 image, but how come we now get 64 channels instead of just 1? (This is a snapshot from the U-Net architecture)

mild dirge
#

Because we don't have one 3x3 kernel, but we have 64 independent 3x3 kernels

#

Each generating a new image that is 570x570

#

That get stacked together

#

@rose dagger

#

And only in the first to second layer is the kernel actually 3x3(x1) because the input image has 1 channel

#

In the second one the kernel is actually 3x3x64

rose dagger
#

Oh i see. Thank you. Then in the remaining encoding block (left side), do we then have a 3x3x2 kernel in the second part (since we go from 64 to 128) or a 3x3x128 kernel?

mild dirge
#

No, each kernel shifts over the entire input image from left to right, and top to bottom, because it's a 2d convolution

#

So when you go from 64 depth to 128, you have 128 kernels that each are 3x3x64

#

As each kernel will generate a single image
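The shapes described above can be checked with a small NumPy sketch. It assumes an unpadded ("valid") convolution as in the U-Net figure, uses random weights, and shrinks the image to 32x32 so it runs quickly; `conv2d_valid` is a hypothetical helper, not a library function.

```python
import numpy as np


def conv2d_valid(x, kernels):
    """Unpadded multi-channel 2D convolution (cross-correlation, as in
    deep learning libraries).
    x: (C_in, H, W); kernels: (C_out, C_in, kH, kW)
    returns: (C_out, H - kH + 1, W - kW + 1)
    """
    c_out, c_in, kh, kw = kernels.shape
    assert x.shape[0] == c_in, "each kernel must span all input channels"
    # windows has shape (C_in, H - kH + 1, W - kW + 1, kH, kW)
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw), axis=(1, 2))
    # Each output channel sums one kernel over all input channels and offsets.
    return np.einsum("chwij,ocij->ohw", windows, kernels)


rng = np.random.default_rng(0)
x = rng.standard_normal((1, 32, 32))      # 1-channel input (572x572 in U-Net)

k1 = rng.standard_normal((64, 1, 3, 3))   # 64 kernels, each 3x3x1
h1 = conv2d_valid(x, k1)                  # 64 feature maps, each 30x30

k2 = rng.standard_normal((128, 64, 3, 3)) # 128 kernels, each 3x3x64
h2 = conv2d_valid(h1, k2)                 # 128 feature maps, each 28x28

print(h1.shape, h2.shape)
```

The weight tensor's second axis always matches the number of input channels, which is why going from 64 to 128 channels needs 128 kernels of size 3x3x64.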

rose dagger
#

Ok, now i understand what you mean. Thank you, that makes more sense!

crimson summit
#

This is my first time building my own neural network from scratch. I just wrote the training part; if anybody sees anything wrong with it, feel free to let me know. It is a 3-layer neural network with 3 neurons in each layer.

foggy kestrel
#

trying to use Voice_Cloning package, this error comes back:

Traceback (most recent call last):
  File "c:\Users\Code\Documents\GitHub\Test\ref.py", line 12, in <module>
    from voice_cloning.generation import *
  File "C:\Users\Code\AppData\Local\Programs\Python\Python310\lib\site-packages\voice_cloning\generation.py", line 27, in <module>
    from encoder import inference as encoder
ModuleNotFoundError: No module named 'encoder'

Looking at Voice_Cloning, inference.py is a script within the encoder folder, which is on the same directory level as generation.py
Is there a way I can just modify this import statement so that it imports the file correctly?

#

this may not be the right chat for this so if someone could direct me to the right chat that would be helpful as well

sterile wyvern
#

How often should you retrain your model? Generally, let's say you train and test on time series data with a 70/30 split in days. After you deploy, would you forward test for 30 days and then retrain?

queen cradle
# merry roost I want to generate a track in a game consisting of only track peices, starts, en...

Everyone has suggested fancy generative models. But let me suggest a simple one: A Markov model. In the simplest Markov model, you track the last block that was placed. For each of these, you use your training data to find the probability distribution of next blocks. To generate a new track, you pick blocks one at a time: The initial state is the start block; you randomly pick a next block from the distribution of blocks that follow the start block; then you randomly pick a next block, and so on. One of your blocks should be an "end of track" block (maybe this is an actual block, or maybe you stick it onto the end of each track in your data); when you generate the end of track block, your track is over.
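The first-order Markov model described above can be sketched in plain Python. The block names and the three toy tracks are made up for illustration, and the `START`/`END` markers are the artificial "start block" and "end of track" states from the description.

```python
import random
from collections import Counter, defaultdict

START, END = "start", "end"


def fit_markov(tracks):
    """Count, for every block, how often each block follows it."""
    counts = defaultdict(Counter)
    for track in tracks:
        seq = [START, *track, END]
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts


def generate(counts, rng, max_len=100):
    """Sample blocks one at a time until the end-of-track state is drawn."""
    track, cur = [], START
    for _ in range(max_len):
        options = counts[cur]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == END:
            break
        track.append(nxt)
        cur = nxt
    return track


# Made-up training tracks, each an ordered list of block types.
tracks = [
    ["straight", "curve", "straight"],
    ["straight", "jump", "straight", "curve"],
    ["curve", "straight", "straight"],
]
counts = fit_markov(tracks)
print(generate(counts, random.Random(0)))
```

`random.choices` normalizes the weights itself, so the raw counts act as the transition probabilities. A generated track can only contain transitions that occurred somewhere in the training data.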

merry roost
#

so kinda diffrent

queen cradle
#

You can add extra information to the state space.

#

There's a trade-off between how detailed your state space is and how much training data you have.

#

Sometimes it helps to reparametrize (e.g., maybe there's a way to use relative positions?).

#

You can also create a hierarchical model. The traditional example of this is a hidden Markov model. In these, your states don't correspond to blocks. Your states are something abstract with no well-defined meaning. However, your states also have an "output distribution," which is a probability distribution over blocks. At each step, you pick a new state; using the output distribution you pick a block. Then you pick a new state (which depends on the current state but not on the block you just placed), and so on.

#

Another option is to use a higher-order Markov model, where the next block depends not just on the current block but on the current and previous blocks.

#

Markov models are not as strong as fancier and trendier models. Their advantages are that they require less data, are faster, are easier to implement, and their training has fewer gotchas.

copper crow
#

hi guys, I have a question
I want to create a Python Tkinter application for plotting crypto charts. Do you have any idea what would be the best library for this?

regal vault
#

no matter how hard I try I can't implement my code so it runs on the GPU
do you guys know any good wrappers or libraries to run on the GPU
numba doesn't work because it doesn't support a lot of things I use
like child inheritance and such

regal vault
#

will it work in a project where I use different classes and such

#

all classes I made using no external libraries

#

@hasty mountain

hasty mountain
#

Pytorch is a framework that loves classes

regal vault
#

k

hasty mountain
#

In fact, I had to learn how they work so I could use Pytorch

regal vault
#

i see

#

in my case i have a project where im making a 3d render and would like it to run on the gpu instead of the cpu

#

*raytracing

#

only thing im worried about is that a lot of these programs are ml based

hardy depot
#

guys im a student and wanna do a good ai course , not a beginner

#

but all the courses on Coursera and Udacity with certificates are expensive asf, and I already have two courses from Udemy, so do u guys know any places I can get a cheap course?

simple tapir
#

Which should come first for a beginner who wants to get into this field: machine learning or deep learning?

agile cobalt
#

deep learning is an area of machine learning that uses neural networks

simple tapir
#

So I better take ml courses first then dl courses?

#

@agile cobalt .

serene scaffold
#

you're a student. at university? can you take an AI course?

serene scaffold
simple tapir
#

I see

#

I've a very basic knowledge of machine learning and I think I could learn some deep learning without any issue

simple tapir
#

oh dang

serene scaffold
#

to learn is to suffer.

#

but in all seriousness, machine learning and deep learning take a long time to understand. that's why you can make a lot of money once you do.

simple tapir
#

I'm in my first year at university, studying computer science and engineering. Next year I'll take an artificial intelligence lecture, but I'm willing to go through this field on my own as well to improve myself. Would it be a waste of time to take some machine learning classes online?

serene scaffold
#

(when I say "course", that might be what you call a "module")

simple tapir
#

I've already taken PyTorch for deep learning and machine learning and had no problem at all. But it wasn't that theoretical

serene scaffold
#

your university teaches a course that's specifically about pytorch?

simple tapir
#

nope, I took it online

#

not from my uni

#

In the first semester, we took calculus 1 and this semester we have calculus 2 classes

serene scaffold
#

will you be taking linalg?

simple tapir
#

yes

serene scaffold
#

what was the loss for the first epoch?

#

how many epochs did you do?

#

hundreds, I see. what does this model do?

#

hmm, okay

#

anyway, it's hard to say if a given loss is "normal" or not

#

what you really care about is how it changes between epochs.

severe topaz
#

i am trying to optimize a plan which reflects the contemporary skills needed...

#

let me know what you guys think

tidal scroll
#

hi guys, I want to ask about preprocessing for naive Bayes. I have preprocessed all the data and dropped the unused columns, but when it comes to detecting outliers with Z-score or IQR, my result is empty, or rather NaN. Do you guys have any idea why the result is like that?

versed heron
#

any reputable guides on ML to train an AI that can be used within a python script?

serene scaffold
versed heron
#

detect car plates (then, OCR)
and
see if a plant is a "bad" or "good" plant

#

like growing well or not, prolly needs some supervised training im guessing

serene scaffold
#

You'd need a dataset of healthy and unhealthy plant images, yes

#

Though I think that would be difficult for a model to learn

#

Unless there's some visual property shared by all unhealthy plants

versed heron
#

like if they're straight or not

serene scaffold
#

Guess I'm an unhealthy plant

versed heron
#

lmao

serene scaffold
#

Anyway, I wouldn't follow any tutorials on towards data science. Those tend to be trash tier.

versed heron
#

i need some material to start lawl

potent sky
#

machinelearningmastery is a good website

#

imo towardsdatascience has some quality write-ups.
But as a beginner if you don't know your way around it can be easy to get into the bad articles on there (and there are many of them) and consequently adopt wrong understanding, bad ways of approaching a problem etc. which can be difficult to unlearn.
So I agree with Stel here

potent sky
wooden sail
#

what you both say is my general experience with it. you can certainly find very good content there sporadically, but there is poor quality control at best

past meteor
#

i don't think there's any quality control. Someone I know writes for TWDS and honestly she started writing there when she was learning about data science

#

So her intentions were good but the things were just not correct as you would expect from someone beginning to learn anything

hasty mountain
#

I'd say to prefer to search for tutorials in the docs of the frameworks you're using. Tensorflow/Keras and Pytorch got some interesting tutorials.

You can use Towards Data Science articles, but...eh...be careful. Usually the folks that write there also have a small bio. If you see someone that at least seems to understand ML, that could be a good start

#

The best tutorial I found about Variational AutoEncoders was in Towards Data Science, and it was written by an AI Engineer from Meta

dull flare
#

hloww e_skulllaugh

#
ValueError                                Traceback (most recent call last)
<ipython-input-41-8236c67b5777> in <cell line: 15>()
     13                metrics = ["accuracy"])
     14 
---> 15 history = model4.fit(tf.expand_dims(x,axis = -1),y,epochs = 100,verbose = 0)

1 frames
/usr/local/lib/python3.10/dist-packages/keras/engine/training.py in tf__train_function(iterator)
     13                 try:
     14                     do_return = True
---> 15                     retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16                 except:
     17                     do_return = False

ValueError: in user code:

    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1284, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1268, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1249, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1051, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1109, in compute_loss
        return self.compiled_loss(
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/compile_utils.py", line 265, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "/usr/local/lib/python3.10/dist-packages/keras/losses.py", line 142, in __call__
        losses = call_fn(y_true, y_pred)
    File "/usr/local/lib/python3.10/dist-packages/keras/losses.py", line 268, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "/usr/local/lib/python3.10/dist-packages/keras/losses.py", line 2156, in binary_crossentropy
        backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
    File "/usr/local/lib/python3.10/dist-packages/keras/backend.py", line 5707, in binary_crossentropy
        return tf.nn.sigmoid_cross_entropy_with_logits(

    ValueError: `logits` and `labels` must have the same shape, received ((None, 2, 1) vs (None,)).
potent sky
#

Yep but docs tutorials are fully code oriented. You preferably need math too. That's where twds comes in sometimes. Machinelearningmastery otherwise, pretty reliable

crimson summit
#

I am trying to understand why some people square the cost function of a neural network and some people don't. It seems to me that if you square the error when you are training the network, it will overcorrect because the error will be bigger than what it actually is

mild dirge
#

You mean the loss function L = (f(x) - y) ^ 2? @crimson summit

#

That is because you want to minimize the function, so the minimum should be where f(x) and y are the same. And if there is a difference between the two (positive or negative), then the loss should be larger than 0. That way minimizing this function gives the best results.

#

And also remember that we have a learning rate that we use for correcting the weights, which should be set low enough to not overcorrect.

past meteor
#

There's a bunch of reasons and the ones listed above are definitely part of them

#

Sometimes you also just don't want large errors so squaring it makes total sense. There's other loss functions that don't do this.

crimson summit
mild dirge
#

It would mean that larger errors are more heavily penalized than smaller errors yes

#

You can also have absolute difference as loss pretty sure

#

Or smooth l1 loss is another one

past meteor
#

Yup you can

mild dirge
past meteor
#

You can also just predict the log of Y, that's a common trick

mild dirge
#

Here's l1 f(x) - y, l2 (f(x) - y)^2 and smooth l1 (which is a bit more complicated)

#

As long as it's differentiable and continuous it can be used pretty much

past meteor
#

I'm not sure but I think MSE is just a tradition that is carried over from statistics

crimson summit
mild dirge
#

Which one?

#

the smooth l1?

crimson summit
#

ya

#

both i guess

past meteor
#

In statistics minimizing the sum of squared errors is equivalent to maximizing the likelihood, which has certain good properties.
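To spell that equivalence out, here is a short derivation. It assumes (the usual textbook assumption, not something stated in the chat) that each target is the model output plus i.i.d. Gaussian noise with a fixed variance $\sigma^2$:

```latex
y_i = f_\theta(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)

\log p(y_1, \dots, y_n \mid x, \theta)
  = -\frac{n}{2} \log\bigl(2\pi\sigma^2\bigr)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \bigl(y_i - f_\theta(x_i)\bigr)^2

\arg\max_\theta \, \log p(y \mid x, \theta)
  = \arg\min_\theta \sum_{i=1}^{n} \bigl(y_i - f_\theta(x_i)\bigr)^2
```

The first term does not depend on $\theta$ and the factor $1/(2\sigma^2)$ is a positive constant, so maximizing the likelihood and minimizing the sum of squared errors pick the same parameters. With a different noise model (for example Laplace noise) the same argument leads to a different loss (absolute error).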

mild dirge
#

It's almost like a mix of l1 and l2, when close to 0 it behaves like l2, and further from 0 it's basically linear

#

As to not penalize very large errors too much
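For reference, a minimal sketch of the three losses being compared. This is not code from the chat; `beta` is an assumed name for the threshold where smooth L1 switches from quadratic to linear, following the commonly used definition.

```python
def l1_loss(pred, target):
    """Absolute error: grows linearly with the size of the mistake."""
    return abs(pred - target)


def l2_loss(pred, target):
    """Squared error: large mistakes are penalized much more than small ones."""
    return (pred - target) ** 2


def smooth_l1_loss(pred, target, beta=1.0):
    """Quadratic (like L2) for errors below beta, linear (like L1) above it."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff ** 2 / beta
    return diff - 0.5 * beta


# Compare how each loss reacts as the error grows.
for error in (0.1, 0.5, 1.0, 3.0, 10.0):
    print(
        f"error={error:5.1f}  l1={l1_loss(error, 0.0):8.3f}  "
        f"l2={l2_loss(error, 0.0):8.3f}  smooth_l1={smooth_l1_loss(error, 0.0):8.3f}"
    )
```

For small errors the smooth L1 values track the squared error (scaled by one half), while for large errors they grow at the same rate as the absolute error.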

past meteor
#

But penalizing large errors can be really bad

past meteor
#

Look at: huber loss for example

#

Selecting loss functions and models is something you can / need to do based on your "knowledge" of the problem. If you're worried about large errors ruining you, you should be looking at techniques from robust regression

crimson summit
#

would you square the cost of the hidden layer aswell or only the final layer in a 3 layer neural network ?

hasty mountain
#

Though I admit I'm really enjoying the Gaussian Likelihood because it appears to me more accurate...and more interesting...all that thing of the Decoder having to predict the most likely value between an infinite range of possibilities...

short moth
#

Where can i start learning AI with python?

wooden sail
# past meteor wdym with this?

from the plot and comments above, seems like some discussion on L2 ignoring small errors unlike L1, and L1 not being differentiable at 0. i would mention that it's subdifferentiable though, and most autodiff libs use a subderivative of 0 or 1 at 0

past meteor
#

It looked like they were equating f(x) - y to L1

wooden sail
#

ah that's what you mean

mild dirge
#

Oh I forgot the abs there yeah

past meteor
#

smooth L1 is new to me though. Initially I thought it was just ML people renaming elasticnet but it's something else

wooden sail
#

it's something else indeed

wooden sail
#

you see it in many places though. gradient-based methods are nice because for well-behaved functions, you can find local minima

mild dirge
#

You can find the formula here, saw it used for a reinforcement learning project

wooden sail
#

whenever you have good reason to use a non-differentiable cost but also want to use gradient methods, smooth approximations are interesting

#

stuff like softmax falls here when used as a smooth argmax
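A small sketch of that idea, assuming a `temperature` parameter (my naming, not from the chat) that controls how sharp the softmax is. As the temperature shrinks, the output approaches the one-hot vector that a hard argmax would give, but it stays differentiable.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax; lower temperature gives a sharper output."""
    scaled = [s / temperature for s in scores]
    shift = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot_argmax(scores):
    """Hard argmax written as a one-hot vector (not differentiable)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(scores))]


scores = [1.0, 3.0, 2.0]
print("argmax :", one_hot_argmax(scores))
for temperature in (1.0, 0.1, 0.01):
    probs = softmax(scores, temperature=temperature)
    print(f"T={temperature:<5}:", [round(p, 4) for p in probs])
```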

past meteor
#

Also quite similar to Huber loss I see.

wooden sail
#

ah, that does appear to be the case

keen gust
#

@potent sky another question for you, so I have my streamlit app up and running and I'm having an issue w/ the st cache data ttl. It's set to 1 hour but it doesn't actually clear the cache after an hour. It's still loading the same df from last night but when I edit ttl to a few seconds and test this change locally, it clears just fine. Is the ttl only valid while the app is actually in use? I was assuming if I close it and reopen the next day that it would be cleared on rerun but maybe I misunderstand how that works

potent sky
#

Wdym by close it and reopen? Are you shutting down the program? Streamlit cache is persisted on disk too iirc so it could repopulate if you're shutting the program and restarting it later, but this will reset the timer

hasty mountain
#

Phew... Finally managed to make a functional VAE...
now...onward to creating abominations have some fun with the architecture brainmon

potent sky
#

VAEs are fun

#

What're you using it for, LDMs?

hasty mountain
#

I want to make an experiment with GANs using latent vectors

#

The idea is to try using a GAN to create latent vectors rather than creating an entire image.
An idea that came to me after seeing the latent diffusion idea, which applies diffusion into a latent vector to make an image

#

Oh wait... LDM = Latent Diffusion Model, right?

#

So...almost for that pithink

potent sky
#

Very interesting paper, look it up if you want

potent sky
hasty mountain
potent sky
#

2019 ICLR I think

potent sky
#

I was just beginning work on LDMs for music/audio when they published AudioLDM this year Feb I think

#

I think it's still worth trying tho, you might get a different idea to solving the problems you encounter

hasty mountain
#

Yes. I'll take a look.
Maybe I could at least make something simpler/cheaper and get an average performance, since those papers usually go for absurd things...

#

Hm... They didn't use PPO for it brainmon

#

Thanks for the recommendation!

hasty mountain
#

Go for it, then.
My university vacation will end soon, so I may take a while to work on it yert

#

Maybe you'll give me some inspiration

plain jungle
dull pike
#

I have a question

#

I think if i dive into a course that's specific to machine learning it would be way harder to finish/get into

past meteor
#

That's a good course to get a baseline understanding of Python, which will definitely help if you go towards ML later on

keen gust
molten atlas
agile cobalt
#

a book for that sounds like a waste to me? specially at this point in time in which things are still moving ultra fast, to the point that something from 6 months ago may already be outdated

plain jungle
vale idol
#

Hey everyone, I'm looking for some help to connect different dataframes using pandas for a uni project I am working on. If anyone has experience here and can help please reach out, thanks in advance 🙂

mild dirge
#

If the question doesn't require hours of guidance, it's probably best to just directly ask it here so people can immediately answer. People generally don't dm to find out what the question even is 😛

vale idol
#

Yeah that's on me hahah, a bit desperate to find a solution so forgot to provide details 😆

#

So I have 3 different dataframes that contain 4 similar variables, which are yearly time series data for companies (multiple companies can have multiple scores). What I've been trying to do here is make a function that assigns a label (high, low, mid) every year for each company depending on whether its value is below or above a certain quantile, and stores it in a separate column. I don't have a lot of experience with Python and couldn't really find a similar issue on Stack Overflow

vale idol
# vale idol

Ignore the last two lines since its copy pasted

vale idol
uncut ember
#

I think this work

agile cobalt
#

I don't think that there's a need for apply/applymap at all?

agile cobalt
vale idol
agile cobalt
#

oh, this might be useful (from googling pandas map quantile)

#

!d pandas.qcut

arctic wedgeBOT
#

```py
pandas.qcut(x, q, labels=None, retbins=False, precision=3, duplicates='raise')
```
Quantile-based discretization function.

Discretize variable into equal-sized buckets based on rank or based on sample quantiles. For example 1000 values for 10 quantiles would produce a Categorical object indicating quantile membership for each data point.
vale idol
agile cobalt
#

without it you could do some tricks to get which quantile each record fits into, but that function seems to just do it for you with a much simpler api than check which bucket each record fits yourself

vale idol
serene scaffold
#

!code

arctic wedgeBOT
#
Formatting code on discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

vale idol
#
# Generate yearly rankings labels for each provider based on ESG score
top_quantile_30 = 0.3
bottom_qunatile_30 = 0.7
top_quantile_10 = 0.1
bottom_qunatile_10 = 0.9

years = [2013, 2014, 2015, 2016, 2017, 2018, 2019]
reprisk_scores = ['peak_yearly_RRI', 'yearly_environmental_score', 'yearly_social_score', 'yearly_governance_score']
sustainalytics_scores = ['total_esg_score', 'environment_score', 'social_score', 'governance_score']
capitaliq_scores = ['ESG_score', 'Environmental_score', 'Social_score', 'Governance_score']

labels_30th_p = ['LScores', 'LScores', 'LScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'HScores', 'HScores', 'HScores']
labels_30th_p = ['LScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'MScores', 'HScores']

def get_score_rankings(df, score_type):
    #Match relevant provider with correct score type!
    #Rankings for top/bottom 30%
    df['ESG_measure_sorts_30'] = ''
    df['env_measure_sorts_30'] = ''
    df['gov_measure_sorts_30'] = ''
    df['soc_measure_sorts_30'] = ''
    #Rankings for top/bottom 10%
    df['ESG_measure_sorts_10'] = ''
    df['env_measure_sorts_10'] = ''
    df['gov_measure_sorts_10'] = ''
    df['soc_measure_sorts_10'] = ''

    for year in years:
        yearly_df = df.loc[df['year'] == year, ['isin'] + score_type]
        for score in score_type:
            if score == 0:
                df['ESG_measure_sorts_30'] = pd.qcut(yearly_df[score], q=10, labels=labels_30th_p)
            if score == 1:
                df['env_measure_sorts_30'] = pd.qcut(yearly_df[score], q=10, labels=labels_30th_p)
            if score == 2:
                df['gov_measure_sorts_30'] = pd.qcut(yearly_df[score], q=10, labels=labels_30th_p)
            if score == 3:
                df['soc_measure_sorts_30'] = pd.qcut(yearly_df[score], q=10, labels=labels_30th_p)

get_score_rankings(sustainalytics, sustainalytics_scores)

agile cobalt
#

idk how you are creating each dataframe so I have no idea what their indexes look like, but that could be an issue

#

oh wait pithink

#

why the ['isin']?

vale idol
#

Right after the for loop you can see where I define a temp dataframe which just get the data for each year and the relevant columns I need quantiles for. The dataframe I want to assign them to has the data for the entire range of the years

agile cobalt
#

hmm ok it sounds like you are iterating over a list of strings and checking if the value is equal to a string number?

vale idol
agile cobalt
#
# ['total_esg_score', 'environment_score', 'social_score', 'governance_score']
for score in score_type:
    if score == 0:
agile cobalt
#

that is not going to work how you want

vale idol
vale idol
agile cobalt
#

I recommend either converting everything to one standard format, or creating one separate script for each different input data you want to transform

#

after you decide on that, start (from scratch, not copy/pasting what you have right now) prototyping in something interactive like a Jupyter Notebook or an IPython terminal

#

only after you get the operations right try to organize it into a function

vale idol
agile cobalt
#

what do you think that the score variable contains when you are doing score == 0 / score == 2 etc?

agile cobalt
#

what do you think that would happen if you did yearly_df[0]?

#

you have to organize your process in your head first, and only after that start coding - and even then, doing it in small steps, testing each part.

vale idol
#

I'm still fairly new to python so might be messing up basic stuff

agile cobalt
#

!e ```py
strings = ['a', 'b', 'c']
for string in strings:
    print(string)
for i in range(len(strings)):
    print(i)
for i, string in enumerate(strings):
    print(i, string)
```

arctic wedgeBOT
#

@agile cobalt :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | a
002 | b
003 | c
004 | 0
005 | 1
006 | 2
007 | 0 a
008 | 1 b
009 | 2 c
agile cobalt
#

for x in thing: iterates over each value in the thing, not over each position

vale idol
#

Yeah I understand but the score == 0, 1 etc.. is just to match the column names of the yearly_df and the original dataframe column. Like I said before, I know this might not even be needed if the column names were the same for each dataframe I'm applying the function for

vale idol
#

Got what you mean
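Putting the suggestions above together, one possible shape for the per-year labelling is sketched below. It is only an illustration under assumptions: the toy data, the helper name `label_by_year`, and the 30% / 40% / 30% split are made up here, and only `pd.qcut` and the column names `total_esg_score` and `ESG_measure_sorts_30` come from the conversation.

```python
import pandas as pd

LABELS = ["LScores", "MScores", "HScores"]
QUANTILE_EDGES = [0.0, 0.3, 0.7, 1.0]  # bottom 30%, middle 40%, top 30%


def label_by_year(df, score_col, label_col, year_col="year"):
    """Add label_col with a low/mid/high label computed separately for each year."""
    out = df.copy()
    out[label_col] = out.groupby(year_col)[score_col].transform(
        # qcut bins the scores of a single year; astype(str) keeps plain strings
        lambda s: pd.qcut(s, q=QUANTILE_EDGES, labels=LABELS).astype(str)
    )
    return out


# Hypothetical toy data: 10 companies, 2 years, one score column.
toy = pd.DataFrame(
    {
        "isin": [f"C{i:02d}" for i in range(10)] * 2,
        "year": [2013] * 10 + [2014] * 10,
        "total_esg_score": list(range(1, 11)) + list(range(100, 0, -10)),
    }
)

labelled = label_by_year(toy, "total_esg_score", "ESG_measure_sorts_30")
print(labelled.groupby(["year", "ESG_measure_sorts_30"]).size())
```

Because the score column and the output column are passed in by name, the same function can be called once per provider with that provider's column names (for example by zipping a list of score columns with a list of label columns), which avoids comparing column names to integers.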

wanton laurel
quaint loom
#

Any suggestion on graphs that I should pick when it comes to having a lot of paramters? I have tried to make double y axis but it seems like it still looks like a mess.

pseudo spire
molten atlas
celest vine
#

Any data engineers here?

past meteor
celest vine
past meteor
#

To transform data and do ad-hoc analysis.

#

There's also more and more tools that let you do dataviz with SQL

quaint loom
mint palm
#

i have these projects on my resume(applying for ML/DS role FULL Time), how do they look? in terms of difficulty, required time, how impressive are they?

#

and should i add more? or 3 are enough?

quaint loom
uncut ember
#

just add this to your code

obtuse flax
#

Hey, what are some great ways to run concurrent request in python? I'm working on https://github.com/apolloapi/apolloapi and want to structure concurrent request for our request wrappers. We'll probably implement some call_api method but I noticed python has an async keyword but I've heard python isn't the most friendly language for concurrency.

Apollo is a model management tool for training AI models, automating tasks and catching regressions. I'm currently working on adding a new provider this week that allows for LLM based grading against LLM generated output to produce a grading system for the regression testing feature of the project.

GitHub

A radically simple LLMOps framework for automation, monitoring and management for performance. - GitHub - apolloapi/apolloapi: A radically simple LLMOps framework for automation, monitoring and man...
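A minimal sketch of one common approach with the standard library's `asyncio`. The `call_api` coroutine here is a hypothetical stand-in that only sleeps (it is not the project's real method), and `max_concurrency` is an assumed parameter for capping how many requests run at once.

```python
import asyncio


async def call_api(provider, payload):
    """Hypothetical stand-in for a real request: sleeps instead of using the network."""
    await asyncio.sleep(0.01)
    return {"provider": provider, "echo": payload}


async def call_all(requests, max_concurrency=5):
    """Run all requests concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(provider, payload):
        async with semaphore:
            return await call_api(provider, payload)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(bounded(p, d) for p, d in requests))


results = asyncio.run(call_all([("a", 1), ("b", 2), ("c", 3)]))
print(results)
```

For I/O-bound work like HTTP calls this style works well in Python; if the request library in use is blocking rather than async, `concurrent.futures.ThreadPoolExecutor` is the usual alternative.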

serene scaffold
lapis sequoia
#

Hello, does anyone have any resources on Multi Task Learning in general to recommend or on Multi-Head architectures more specifically?

serene scaffold
ripe sapphire
#

Hi everyone

#

I wanted to ask that what things do I have to learn in AI field in python programming language, I am little confused as this field is very vast and beyond my knowledge. Hope you guide me

serene scaffold
ripe sapphire
#

But Can you just tell me what things I have to learn in programming, like I know basics of tensorflow, keras, numpy,

serene scaffold
past meteor
serene scaffold
past meteor
#

Most of ML is turbocharged statistics. Knowing basic regression well helps you understand neural nets better later on

serene scaffold
ripe sapphire
alpine rain
#

is it possible to update an NLP model like Stanza to fix certain incorrect dependency parsing values?

plain jungle
#

This is kinda the under hood of a simple NN and why calc is important

alpine rain
#

.92*1=0.92 heavy math indeed 😄

lapis sequoia
# ripe sapphire Hi everyone

i was in your place at some point, so i decided to document everything i learned from when i started till now in this git repo: https://github.com/ahmedbelgacem/awesome-datascience i hope it helps you. It isn't a list of technologies and frameworks; it's a list of topics with the articles, books and courses i used to learn each topic. It isn't exhaustive as this is what i have learned up till now. I just recently landed a job as a Deep Learning engineer focusing on vision problems so much of this is on computer vision but you can find enough to learn. There's some french courses since i understand french and you may not. Hope this helps

GitHub

A curated list of awesome python, machine learning, computer vision and data science resources, articles, guides, courses and books. - GitHub - ahmedbelgacem/awesome-datascience: A curated list of ...

old echo
#

should i be a software engineer or AI scientist ?

kind jetty
#

do what you enjoy

serene scaffold
#

and that's the end of that.

old echo
#

will data scientists role replaced by gpt-4 ?

serene scaffold
#

this will be your only warning.

lapis sequoia
# old echo should i be a software engineer or AI scientist ?

depends on what you like most and what you're good at. I initially started as a software engineer and i studied software for 5 years. Then i added a masters degree in AI engineering. I thought that i'm good at computer science and found that easy enough. I also liked maths but it was more challenging for me and felt like doing maths was making me think and try hard while i wasn't trying hard in computer science alone. That's why i switched. Today I really like what i do (deep learning engineering) . I find that most of my work on a day to day base is pure software and coding but everything needs intuition, mathematical background and critical thinking. I find hard aspects on a day to day basis and i like the challenge. And no, it won't be replaced by gpt-4.

toxic mortar
#

Hi everyone, I am new here, and this is my first message on this Discord server community.

I'm a software engineering student considering taking a neural networks course next semester. I've seen a few presentations on the course presentation, and it seemed a bit heavy on the theoretical side. I'm trying to see its applicability to real-life situations, but I think I fail to. I had a similar experience with discrete math that I took last semester; I thought it is beneficial for AI/ML, but I didn't find it particularly useful since we did only the pure theoretical part of it.

I'm curious if studying neural networks is a prerequisite for diving into other areas of AI? And how strongly correlated are the concepts covered in this course to the wider field of AI? I consciously used the term 'AI' in the messages above, as I don't want to decide what part I want to delve into before I inspect each aspect and possibility. I hope that makes sense and give you some overview of my question

Thank you 😄

alpine rain
#

as long as the hallucination part is not fixed in the LLMs, they should not replace anything

past meteor
#

I'd say a NN course is definitely a good idea for most CS majors even if you don't want to go into AI proper. Chances are you'll be working on/with a service that uses AI in the future.

potent sky
potent sky
# toxic mortar Hi everyone, I am new here, and this is my first message on this Discord server ...

Neural networks are overwhelmingly the concept on which most of modern deep learning is based (note: not all)
Deep Learning is a subset of Machine Learning. It has seen great visibility recently in powering technologies like voice assistants, recommendation systems (think "The Algorithm"), better camera quality, and a load of other things.
Machine Learning is one of the ways of manifesting AI and currently the most popular and successful one by far.
Hope this series of associations gives you some clarity!

lapis sequoia
toxic mortar
#

I really appreciate your replies guys. I think of myself that I am a hard-working guy willing to put in the work, and I'm not demoralized by the course, even if it is tough or abstract, as long as it'll benefit me in the long run. Given that, do you think it would be a good idea for me to start independently studying the neural networks course material over the summer before the formal semester begins? Because I think it might make the learning experience smoother when the actual lectures start, as I won't be encountering the topics for the first time if that makes sense

past meteor
#

And then connect the ideas you see in your neural nets course to the ideas you saw in stats. It'll make your knowledge a lot stronger in the long run

toxic mortar
#

Agree, I am taking Probability and statistics course as we speak. I mean, I will finish it in couple of weeks.

past meteor
#

Then a second prereq before going into neural nets is imo traditional machine learning methods

#

It's a hot take but I'd say all of ML is statistics but it tends to be called ML if it's done by someone from a comp sci/engineering background. Traditional stats, "traditional" ML and NN's are imo all part of a big toolbox you can use to solve many problems. Different problems will need different techniques so knowing a bit of everything helps. 🙂 Reason being that if you "skip" regular ML then you might overengineer things (especially on tabular datasets).

lapis sequoia
past meteor
#

But tbh, I think there's a lot of people now that are working exclusively on speech, text, images, video, ... and I think these profiles can get away with not having a super in-depth knowledge of the traditional stuff. It's more specialised and nearly exclusively deep learning now.

potent sky
#

LinAlg, stats, probability and information theory (maybe vector spaces too if you're interested)

past meteor
#

I think you can get away with vector spaces unless you're going for the theoretical route

dusk bear
#

hey guys..
i am very much interested in ml/dl
but idk where to learn or how to learn🥲
i am good with math like linalg, prob and stats..
can anyone please help me
i have done some random courses.. but idk how much i have learnt and stuff.. i didnt do any projects and stuff too.. guide me pls🥲

dusk bear
#

learn ones

#

but the thing is i dont get a pathway kinda.. like how to develop

past meteor
#

Assuming you already have the prerequisite linalg, prob, stats then you should do "easy" Kaggle competitions (tabular playground series)

dusk bear
#

have seen many utube tutorials. have all fundamentals but cant map them and learn 🥲

past meteor
#

Solve the case yourself, submit your predictions and then look at other people's notebooks

past meteor
#

Beware that Kaggle only trains a subset of the skills you need to work in data though

dusk bear
past meteor
#

there's much more to data science than training models

dusk bear
past meteor
potent sky
dusk bear
dusk bear
#

actually, last week i started aeroplane object detection using RCNN. like ik what is cnn, how cnn works, architecture of cnn but idk how to code for it

#

how to build model for it.. so how to learn all these? this is what i wanted to know actually

slender kestrel
#

hello i have a question about a deep learning model can anyone help me with that ?

past meteor
#

This one in particular covers the theory and implementation of most, if not all, common architectures

past meteor
#

After that (reading these 4 I sent will take a very very long time if you do it properly) then what's left is the cutting-edge in papers + actually using what you've learnt in those to do projects

potent sky
#
  • GitHub and docs are your friends
#

ofc be sure not to simply copy

past meteor
#

The docs of Tensorflow / Pytorch / MXnet have examples that are typically well explained indeed

dusk bear
slender kestrel
#

in this model 2 LSTM layers are added in sequence using return_state=True, so does that make it a stacked LSTM network or not?

past meteor
#

But fundamentally - you need to decide if you want to be designing novel architectures or if your interest is in applying say the cutting-edge on specific problems

#

Imo these are wildly different skillsets

dusk bear
past meteor
#

I remember I did a workshop track on computer vision in the cloud with a fancy consultancy when I was a student. We got a (useless) certificate at the end.

Afterwards I was browsing LinkedIn and I saw someone post about a super cool thing they did. Turns out I was in exactly the same track as them but they inflated it so much I had no idea I even attended the same thing as them.

past meteor
#

I don't know if it's a good idea to essentially dox yourself haha

dusk bear
past meteor
#

I'm a books person so my suggestion is to take https://www.statlearning.com/ and read it diagonally and experiment with the techniques you're learning there in Kaggle competitions.

dusk bear
lapis sequoia
# dusk bear hey guys.. i am very much interested in ml/dl but idk where to learn or how to ...

If you're good with the math, I would recommend studying Python in depth. After that, start with the basics of machine learning (Andrew Ng has a really good free course on Coursera with Stanford University if you want to start). Then I would recommend some deep learning, neural networks, etc. After that it would be nice to try different things, choose what you like the most, and play around with different projects you find. For example, try computer vision: study it specifically, then try to build a classifier for something you like. Then study, for example, NLP and try to build something with it, etc.

lapis sequoia
dusk bear
dusk bear
# dusk bear btw for this i have used selective search for plane detection and this is what i...
```py
ss.setBaseImage(imtest)
ss.switchToSelectiveSearchFast()
ssresults = ss.process()
imout = imtest.copy()
for e,result in enumerate(ssresults):
    if e < 2000:
        x,y,w,h = result
        timage = imout[y:y+h,x:x+w]
        resized = cv2.resize(timage, (224,224), interpolation = cv2.INTER_AREA)
        img = np.expand_dims(resized, axis=0)
        out = model_final.predict(img)
        if out[0][0] > 0.97:
            cv2.rectangle(imout, (x, y), (x+w, y+h), (0, 255, 0), 1, cv2.LINE_AA)
plt.figure()
plt.imshow(imout)
```
This is the code for detection part btw..
lapis sequoia
#

wait i will send you the link

dusk bear
lapis sequoia
#

they changed the name

dusk bear
lapis sequoia
#

i said if you have something in particular you are looking for, say you tell me i want to learn reinforcement learning, i would send you a link for that particular subject

potent sky
cold osprey
#

what kinds of things can i do to improve image classification tasks?

Currently, im just trying out various models and fine tuning them on my dataset. Not sure what else I can explore to improve performance

somber panther
#

recommend any starter courses for ds ml? the one i picked out on udemy is pretty dated

past meteor
past meteor
#

Augmentation and/or other regularization strategies might be a good idea

#

If you have the time for it you can also just hyperparam tune

cold osprey
#

dont have a val set which may be a mistake now

past meteor
#

Well, your test is your validation isn't it

#

so you'd need another dataset
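The point being made is that once the test set is used to pick models or hyperparameters, it is no longer an unbiased estimate, so a separate validation set is needed. A rough sketch of a three-way split follows; the function name and the 70/15/15 fractions are assumptions for illustration only.

```python
import random


def train_val_test_split(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then carve out disjoint test, validation and training sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

Tune on `val`, and touch `test` only once at the very end. For image classification with imbalanced classes a stratified split (per class) would be the safer variant.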

sullen kernel
#

hi, I'm having problems with my project and I would appreciate if anyone could get on a call with me and help me maybe?

cold osprey
#

rightt

#

i have auto transforms that i get from the pre trained model itself

#
```
ImageClassification(
    crop_size=[288]
    resize_size=[288]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)
```
past meteor
#

I think at some point you are beginning to overfit so you can play around with adding dropout in your FC layers, augmentation, ...

cold osprey
#

there is one dropout layer already but i can increase the proba

past meteor
#

Yeah if you have the compute for it, I'd do it with some sort of hyper parameter tuner

cold osprey
#

hmm, would u do it across diff models too? like effnet b0 to b4 and with various hyperparameter values

cold osprey
shut yoke
#

ah alright

past meteor
#

So across models and also "remembering" that hyperparam1_1 is related to model1 and hyperparam1_2 is related to model2 etc

#

Maybe Optuna has this too - my issue with KerasTuner is that it depends on Tensorflow and installing TF just to get this is crazy 😛
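To make the "hyperparameters tied to a model" idea concrete without depending on any tuner library, here is a plain random-search sketch. Everything in it is an assumption for illustration: the search ranges, the model keys, and especially `fake_validation_score`, which stands in for actually training a model and measuring validation accuracy.

```python
import random

# Each model has its own search space; ranges are made up for illustration.
SEARCH_SPACES = {
    "effnet_b0": {"lr": (1e-4, 1e-2), "dropout": (0.1, 0.5)},
    "effnet_b4": {"lr": (1e-5, 1e-3), "dropout": (0.2, 0.6)},
}


def sample_config(rng):
    """Pick a model first, then sample only that model's hyperparameters."""
    model = rng.choice(sorted(SEARCH_SPACES))
    params = {
        name: rng.uniform(low, high)
        for name, (low, high) in SEARCH_SPACES[model].items()
    }
    return {"model": model, **params}


def fake_validation_score(config):
    """Placeholder objective; a real one would train and evaluate the model."""
    return -abs(config["dropout"] - 0.3) - abs(config["lr"] - 1e-3)


def random_search(n_trials=20, seed=0):
    """Sample n_trials configurations and return the best-scoring one."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(n_trials)]
    return max(trials, key=fake_validation_score)


best = random_search()
print(best)
```

Dedicated tuners add smarter sampling and early stopping on top of this, and learning rates are usually sampled on a log scale rather than uniformly; the structure (choose the model, then its own parameters) stays the same.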

cold osprey
#

ah hmmm

#

im not even sure if putting this much effort on just a project to showcase i know how to work with image classification stuff is worth it

dusk bear
potent sky
celest vine
past meteor
#

People wear multiple hats. It's common to be a data engineer that also does analysis / data science

celest vine
past meteor
#

But yeah, even if you don't do it yourself the analyst that is working downstream relative to yourself might do their analysis with SQL

celest vine
past meteor
#

Probably SQL?

celest vine
past meteor
#

Many data engineers don't know Python

celest vine
celest vine
past meteor
#

In the past I did internships in data engineering specifically

celest vine
past meteor
#

Probably better placed people to do that than me :/ maybe @boreal gale

#

If not, try Reddit

celest vine
#

Give your opinions on the roadmap

cold osprey
#

if u plan on reading kimball, we can discuss it too

#

im on chapter 2

dull pike
#

Do you guys think it's worth it to get a teacher for learning python and machine learning?

celest vine
past meteor
#

The roadmap is fine so long as you do enough projects

#

I wouldn't spend time on Inmon, Data vault, data mesh or what have you. Just good ol' star schema's are fine for entry level

celest vine
past meteor
#

Yeah just star schema's are fine to focus on in the beginning

#

Maybe people that do data engineering full time might disagree so I'd go on r/dataengineering and ask their opinion

celest vine
#

Got it. I appreciate all the suggestions you gave.

cold osprey
#

galaxy schema xD

small heron
#

How are you? I'm creating a shopping app using Python Kivy. When I send information from the user interface to SQLite it gets saved, but my app does not update in real time. For example, on the marketplace page, if I add an item to my cart it says it was added, but when I go to my cart it does not appear; it only shows up if I rebuild the app. How can I make things update in real time?

boreal gale
# celest vine Give your opinions on the roadmap

err.. replying because i got pinged heh.

  1. is solid, it's where i would start if i were to start over
  2. is okay, i mean it's nice to know the concepts and all, but imo the value is limited unless you put it into practice
  3. spark is good to know, but imo is optional, people abuse spark way too often (when you have a hammer, everything looks like nail type of thing), i would just ignore hadoop hive pig, only research them if the job you are applying requires it/you have an unnatural interest in them
  4. can always help you job hunt, it's a plus but not essential
  5.
    • airflow is not a must, but sure you need to come to grips with some orchestration tooling, prefect and dagster are viable contenders (heck even luigi depending on your usecases)
    • compute: no comment really, but if you know spark then this is probably not a big step up, again not essential imo
    • cicd: only CI is relevant to your core duties, knowing how to test your code is a big plus
    • docker: hell yes. you can't escape them containers these days. ducky_ducker
  6. 10000% yes, put it all into practice, do something original, it's the best way to drill some core concept into your brain and it serves well inside a portfolio

but i must say, imo data engineer is not a job you can easily land without some experience in other dev related role, companies that hire junior DEs are few and far between.

also this is quoted often in the DE discord https://github.com/datastacktv/data-engineer-roadmap
and DDIA is almost a religious text in DE https://dataintensive.net/

good luck!

proud beacon
#

Hi guys, I have a piece of coding instructions and I am using anaconda3, should I type these into the Anaconda prompt or into my VS Code application?
```
start Anaconda3

type:

cd E:\Xfer\NC\MCT2000_LOG_FILE

Press Enter
```

rose dagger
#

Is there some way to reduce/manage the needed memory for a neural net in tensorflow? I'm building a network with roughly 30 million parameters and i'm using the GPU provided on Kaggle, which roughly has 16 GB of GPU memory. When initializing the model it immediately runs out of memory. Any tips?

#

(I know one obvious option would be to reduce the complexity of the neural net, i.e. remove a few layers / connections, but say i want to improve the memory usage for a given fixed neural network)

tacit knot
#

@rose dagger I don't have an actual answer for you, but i know there are several memory optimization things especially around Stable Diffusion (popular/open source) that you MIGHT be able to apply in some way? I'm guessing you are already familiar with some, but there are things like xformers, cunumeric, and several other things. Have you looked into any of those?

#

I'm actually trying to look into if/how I could potentially convert the ZoeDepth models to use TensorRT for performance boost...lol but so far I've only been trying to "use" AI stuff, not even sure where to start yet.

rose dagger
tacit knot
#

Ah ok, then there is hope for you yet lol...good news is this is a common problem, bad news is that it is really hard to find quality information.

#

That will probably be your single biggest gain. I tried to get it working early on and failed many times. Finally got a better understanding of python environments and such, but it is a near drop in improvement. It DOES have a potential downside, certain things (not sure exactly what all) are not deterministic.

#

but also check out cunumeric, drop in replacement for some core python stuff that I've read can give performance/memory improvements

rose dagger
tacit knot
#

Any idea how far out you are on memory?

#

like do you need to shave a bit or cut it in half?

crimson summit
#

I just finished coding my first Neural Network. It is a simple 3 layer neural network. For some reason it is not working. I double checked the math part and everything seems right. If anybody sees any glaring errors please let me know.

#

here is my code ^

#

this is the data that I am working with ^

#

I am supposed to get something similar to this as my answer for the #7 which is the first number in the test data set

#

instead I am just getting this

rose dagger
tacit knot
#

there is a good and extensive write up about optimization on HuggingFace

mild dirge
#

@crimson summit

#

Here you calculate output_errors but don't use it?

crimson summit
#

could that mess up the network if it is not being used ?

mild dirge
#

No, just a waste of processing time but it won't affect anything

#

Apparently all outputs are very high, which means the weights might be very high

#

You could check if that is the case

#

Might not even be that the network is broken, but f.e. too high learning rate (0.3 is quite high for general models)

crimson summit
#

i have not messed with the weights yet though

mild dirge
#

Did you try values like 0.001 ?

crimson summit
#

oh no I went down to 0.1

#

let me try that real quick

mild dirge
#

Try something like 0.001 see if that makes any difference at all

#

Checking the manual gradient calculations would take quite a while for me as well, so if it's anything else that would be nice ;P

#

Btw, why do you have this inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01

#

Are you scared of zeros or something?

crimson summit
#

it's supposed to scale and shift the inputs to between 0.01 and 1

mild dirge
#

Normally you'd normalize to values between 0 and 1

#

Did the book suggest this (may be because you don't have a bias in your NN)

crimson summit
crimson summit
#

but the guy in the book did some super weird math that is incorrect so I trained my neural network differently

#

i am not too surprised that the results are different, just trying to figure out what I need to adjust

mild dirge
#

What is incorrect about it?

#

Was it the derivative of the sigmoid?

crimson summit
#

if you look in the train section he calculates the cost of the hidden layer output by just multiplying the weights times the output cost, or "error" as he calls it

#

I also just tried making the weights way smaller but that did not do anything

mild dirge
#

Do you have the csv, can you send in dm?

crimson summit
crimson summit
mild dirge
#

Alright let me just check some stuff out then

#

So the values grow big after the hidden layer, so from hidden to output they get to like 14 on average

#

When pulling those values through sigmoid they will basically all be close to 1
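
A minimal sketch of the saturation being described, using a plain-Python sigmoid (not the code from the network under discussion): at a pre-activation of about 14 the output is essentially 1 and the gradient is almost zero.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the logistic function: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Compare a centred input, a moderate one, and the ~14 seen in the output layer
for z in (0.0, 2.0, 14.0):
    print(z, sigmoid(z), sigmoid_grad(z))
```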

#

Not sure why those weights are so high yet though

crimson summit
mild dirge
#

Nah shouldn't be

#

lol

#

found it

#
[2.31742179e-03 4.87518635e-06 6.80298229e-04 7.63022959e-05
 1.15368135e-07 2.46824449e-05 4.67039119e-08 9.99906227e-01
 2.01392314e-07 2.47477905e-05]
#

Getting this output now, with 0.9999 at index 7

#

The way I found it was by printing out output_errors_deriv, and I found that almost all derivatives were positive

#

Which means that the model would try to correct the weights to increase those values, but it wanted to increase all values but the one that was the correct target

#

You swapped targets and final_outputs in your error derivative

#

output_errors_deriv = 2 * (targets - final_outputs)

#

And not
output_errors_deriv = 2 * (final_outputs - targets)

#

@crimson summit

#

Or actually...

#

That was correct

crimson summit
mild dirge
#

But swapping those also fixed it, you should actually change

self.who += self.lr * numpy.dot((output_errors_deriv * final_outputs_deriv), final_inputs_deriv2)

to

self.who -= self.lr * numpy.dot((output_errors_deriv * final_outputs_deriv), final_inputs_deriv2)

instead of swapping targets and final outputs

#

Because atm you are doing gradient ascent instead of gradient descent
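
The effect of the sign can be sketched on a toy one-parameter loss (a hypothetical example, not the network above): subtracting the gradient moves toward the minimum, adding it moves away.

```python
def loss(w):
    """Toy quadratic loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)

def run(sign, steps=50, lr=0.1, w=0.0):
    """Apply w <- w + sign * lr * grad(w) repeatedly and return the final w."""
    for _ in range(steps):
        w = w + sign * lr * grad(w)
    return w

w_descent = run(sign=-1.0)  # like `-=`: gradient descent
w_ascent = run(sign=+1.0)   # like `+=`: gradient ascent

print(w_descent, loss(w_descent))
print(w_ascent, loss(w_ascent))
```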

crimson summit
#

should I swap the sign to negative on the other weight update formula as well?

mild dirge
#

Yeah I'm just checking that

crimson summit
#

I am now getting the correct largest value for the number 7 so it is working fine now

#

I made them both negative btw

#

I just need to make the numbers decimals

#

I think

mild dirge
#

Hmm, still something wrong even after swapping, getting 1k of 10k correct (basically random guessing)

crimson summit
#

yea never mind, when I try the second number in the data set it's incorrect

mild dirge
#

Yeah I'm not sure atm, it takes me too long to find too. I'd probably have to write it from scratch myself to see how I would do it and then compare it with your solution, but that takes a bit too long right now.

#

I don't think I can really help much further :/

crimson summit
#

No worries bro

#

thank you for the help

mild dirge
#

I'm doing a deep learning project, and my partner tried out all kinds of hyper params, these were the learning rates he tried out for the grid search ... :/

serene scaffold
mild dirge
#

Also set learning rate decay to 0.97, with training taking about 10k update steps (learning rate of 10^-14 after 1000 steps or so)
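
The arithmetic behind that remark, assuming the decay factor is applied multiplicatively once per update step (an assumption; the message does not say how the schedule is applied):

```python
decay = 0.97

# Factor by which the initial learning rate has been multiplied after `step` updates
for step in (100, 500, 1000):
    print(step, decay ** step)

factor_1000 = decay ** 1000
```

Under that assumption the learning rate is effectively zero for the last ~9k of the 10k steps.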

plain jungle
mild dirge
#

I actually forgot the minus at some point in this project, caused a big headache haha

plain jungle
#

Lmao, oh I could only imagine

rare fog
#

How would I make a list that follows a distribution that looks something like this, for a given minimum, maximum, and number of items?

agile cobalt
#

(as for which one exactly fits your particular use case, I have no clue though)

rancid widget
#

I am learning data science and statistics. Can anyone shed some light on 2 histograms I have, along with how I determine the bins, and tell me if it is a normal distribution? It's confusing when it's not perfect and never seems to be lol

rancid widget
#

nm, I figured out how to plot against QQ plot

sweet crypt
#

In search algorithms in game, how do we know we have taken good actions?

dense crane
#

transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) is this normalization a thing?

#

like does anyone ever use that, or is it just usable?
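
For reference, `transforms.Normalize` computes `(x - mean) / std` per channel, so with mean 0.5 and std 0.5 an input already scaled to [0, 1] ends up in [-1, 1]. A NumPy sketch of that arithmetic (the `normalize` helper is hypothetical and stands in for the torchvision transform):

```python
import numpy as np

def normalize(x, mean=0.5, std=0.5):
    """Same arithmetic as a per-channel (x - mean) / std normalization."""
    return (x - mean) / std

# Pixel values already scaled to [0, 1]
pixels = np.array([0.0, 0.25, 0.5, 1.0])
scaled = normalize(pixels)
print(scaled)
```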

winged lance
#

I'm not from a coding background. Can I learn data science and get a job? Please give suggestions.

lapis sequoia
lapis sequoia
sonic creek
#

I need helppp

#

@agile cobalt

#

@stable wing

lapis sequoia
sonic creek
lapis sequoia
#

please write a full sentence framing your problem

sonic creek
#

But it is a very simple thing

#

Ok !

#

I have an error

#

Can I send it?

#

@lapis sequoia

uneven thunder
#

General question. I'm starting to learn ML and i'm wondering if training a ML model to determine even and odd numbers is a smart beginning. Is this a hard goal? is this a simple process?

Essentially:
Feed the model 100'000 numbers between 0 and 60'000,
Train it for idk, 14 epochs,

Save the model and test it with 10'000 numbers between 70'000 and 120'000.

Would that be a doable beginner project?

lost pier
#

Hi peeps, wondering if anyone can help me with something. I have a pandas dataframe and I'm running a function through it, but it's getting tripped up by null values. The problem is I can't remove the null values, I just want to skip those rows, I can't find a way to do that, there just seems to be dropna() or fillna() but those null values are supposed to be there, I'm just not working on those bits, is there an ignore null and move on method in pandas?

cold osprey
cold osprey
#

!code

arctic wedgeBOT
#
Formatting code on discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

uneven thunder
cold osprey
#

U want to use a NN?

uneven thunder
#

I was thinking about it yes. I feel like a decision tree would do fine, but i'd like to try a NN, yes

cold osprey
#

What features will u pass in that makes u think a decision tree model will work?

uneven thunder
#

I believe with enough trial and error it might figure out to follow the simple rules of "if odd: else:" which would result in a 1.0 accuracy.

The training data would consist of random numbers like i explained before and the correct answer for each number it's training on

cold osprey
#

There needs to be some sense in how the model will work right

uneven thunder
#

Yes.

cold osprey
#

So if ure a decision tree, how would u 'split' the data?

#

In decision trees, numerical features are treated as 'Is X > 5?' for e.g.

#

Will any form of >, >=, <= or < work?
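
A small sketch of why a threshold on the raw number struggles: no single cut-off separates even from odd better than chance, while a cut-off on the engineered feature `n % 2` is perfect. `best_threshold_accuracy` is a hypothetical helper that mimics one split, not a real decision tree.

```python
import numpy as np

numbers = np.arange(0, 1000)
is_even = numbers % 2 == 0

def best_threshold_accuracy(feature, labels):
    """Best accuracy achievable by a single 'feature > t' split (either polarity)."""
    best = 0.0
    for t in np.unique(feature):
        pred = feature > t
        acc = max(np.mean(pred == labels), np.mean(pred != labels))
        best = max(best, float(acc))
    return best

raw_acc = best_threshold_accuracy(numbers, is_even)         # split on n itself
parity_acc = best_threshold_accuracy(numbers % 2, is_even)  # split on n % 2

print(raw_acc, parity_acc)
```

A deep tree can still memorize the training numbers by isolating them one at a time, but that would not carry over to numbers outside the training range.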

lost pier
#

@cold osprey I have a largish dataset, 130,000-odd rows. There are two columns I am working with: one has an array which I have exploded, so they are now single strings on separate rows; the other column is a key-value pair that looks like JSON, though to be fair it's in single quotes, but I can deal with that bit. So after stripping off any excess white space and then applying json.dumps and json.loads, I am now trying to apply the following line:

df[["workflow", "cost_centre"]] = df[["workflow", "cost_centre"]].applymap(ast.literal_eval)

after narrowing all this down, it works as expected until it gets to a row where both of these columns are null values. I need to just skip them, not remove or alter them, if at all possible
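
One way to skip nulls without dropping or filling them is to wrap the parser so anything that is not a parseable string passes through unchanged. This is a sketch on made-up rows; `parse_or_keep` is a hypothetical helper, and `Series.map` per column is used here in place of `applymap`.

```python
import ast

import numpy as np
import pandas as pd

def parse_or_keep(value):
    """Parse a Python-literal string; leave NaN/None and unparseable text untouched."""
    if isinstance(value, str):
        try:
            return ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return value  # e.g. the literal text 'nan'
    return value  # real NaN / None pass straight through

# Made-up rows: one normal, one with real NaN, one with the string 'nan'
df = pd.DataFrame({
    "workflow": ["'ott'", np.nan, "nan"],
    "cost_centre": ["{'ott': '2000920243', 'txt': ' '}", np.nan, "{'processing': 'DE'}"],
})

cols = ["workflow", "cost_centre"]
df[cols] = df[cols].apply(lambda col: col.map(parse_or_keep))
print(df)
```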

uneven thunder
#

well, i already pieced together a simple feedforward MLP, just to see what happens, but since i have no clue what i'm doing it has an incredible accuracy of 0.5.

I can show you if you'd like.

cold osprey
#

Accuracy of 0.5 is no better than randomly guessing

gentle zenith
#

AI is so cool!

uneven thunder
cold osprey
#

Which is what I would expect

#

What activation functions r u using?

#

I think u would need some non linear stuff to get it to work, not sure

uneven thunder
#

Okay more general question. What model would be suited for such a task. I'm currently playing around with something like this:


import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# GENERATE
training_numbers = np.random.randint(0, 60001, size=100000)

# LABEL
training_labels = np.where(training_numbers % 2 == 0, 1, 0)

# DEFINE
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# TRAIN
model.fit(training_numbers, training_labels, epochs=14, batch_size=10)

# GEN-TEST
test_numbers = np.random.randint(0, 60001, size=10000)

# LABEL-TEST
test_labels = np.where(test_numbers % 2 == 0, 1, 0)

# TEST
_, accuracy = model.evaluate(test_numbers, test_labels)
print('Average Accuracy:', accuracy)

# ANALYZE
#removed for discord
model.save("model.h5")

But like i said, i'm just experimenting around, not really knowing what i'm doing

cold osprey
#

Problems like odd/even, where there is a defined way to calculate the answer, aren't usually solved with ML

uneven thunder
#

Yes, but it seemed like an easy "environment" with simple rules and it's easy to test.

cold osprey
#

The rule is modulus

#

So u would need to teach a model how to do modulus

uneven thunder
cold osprey
#

Why not just use some dataset on kaggle?

#

Yeah

#

Remainder of modulus to be specific

uneven thunder
cold osprey
#

Idk I mean Titanic dataset for classification?

#

Iris datasets

uneven thunder
#

Titanic?

#

lemme look it up rq

mild dirge
#

Iris, or mnist, or fashion mnist

cold osprey
#

These are like the typical first project datasets before moving onto something that interests u more and u have some domain knowledge over to apply

#

Fashion mnist was my intro to CNNs

uneven thunder
#

something like this?

mild dirge
#

Can probably also use a regular MLP for (fashion) mnist because the images are so small

cold osprey
#

Ye tbh just pick any that interests u

uneven thunder
#

Okay. I'll give it a shot.

#

is this something that's best done in a notebook?

cold osprey
#

Otw home rn, will look in more detail in a bit

cold osprey
uneven thunder
#

what is the random_state variable in X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=42)

cold osprey
#

Just a number to randomize the split

#

For consistency between runs
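
The idea can be illustrated with a seeded shuffle in NumPy. This is only a sketch of the concept, not scikit-learn's implementation; `split_indices` is a hypothetical helper.

```python
import numpy as np

def split_indices(n_samples, test_size=0.2, seed=None):
    """Shuffle row indices with a seeded generator, then cut off a test portion."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_size))
    return order[n_test:], order[:n_test]

# Same seed -> same shuffle -> same train/test rows on every run
train_a, test_a = split_indices(10, seed=42)
train_b, test_b = split_indices(10, seed=42)
print(test_a, test_b)
```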

uneven thunder
#

okay

#

okay i solved it with a decision tree since it's just numbers.

#

I wonder if

#

hm

cold osprey
#

Haha wdym solve

lost pier
# cold osprey Braining this rn, maybe u could show some example rows before the applymap step?

sure, here is a sample:

                      workflow       cost_centre
220     56860820     "ott"     {"ott": "2000920243", "txt": " "}
221     56860822     "txt"     {"txt": " "}
222     56860823     "txt"     {"ott": "2000920243", "txt": " "}
223     56860823     "ott"     {"ott": "2000920243", "txt": " "}
224     56860824     "txt"     {"txt": " "}
225     56860825     "txt"     {"txt": " "}
226     56860827     "txt"     {"txt": " "}
227     15694706     "txt"     {"txt": " "}
228     9877816     "txt"     {"txt": " "}
229     56860828     nan     {"processing": "DE"}
230     56860828     nan     {"processing": "DE"}
231     56860830     nan     {"processing": "DE"}
232     56860831     "txt"     {"txt": " "}

for the record, the result should come out as the following, and does until it sees "nan", which I had no idea was there until I drilled into the data:

                      workflow       cost_centre
220     56860820     "ott"     {"ott": "2000920243"}
221     56860822     "txt"     {"txt": " "}
222     56860823     "txt"     {"ott": "2000920243"}
223     56860823     "ott"     {"ott": "2000920243"}
224     56860824     "txt"     {"txt": " "}
225     56860825     "txt"     {"txt": " "}
226     56860827     "txt"     {"txt": " "}
227     15694706     "txt"     {"txt": " "}
228     9877816     "txt"     {"txt": " "}
229     56860828     nan     {"processing": "DE"}
230     56860828     nan     {"processing": "DE"}
231     56860830     nan     {"processing": "DE"}
232     56860831     "txt"     {"txt": " "}

For more context I've labelled the columns though this was taken from the start of the fail row 229. So it is using the value in workflow to match the key in the cost_centre column, which can have up to 5 key value pairs in. They do match up, as the workflow has been exploded so that there is a workflow per row, this is just the last piece of the puzzle so that the correct cost centre is also showing on that row.

uneven thunder
#

maybe solve wasn't the right word

#

this seems about right

#

Okay, i now watched a bunch of libraries create a tree which works .

cold osprey
#

u can add on to it

#

create more features

#

tune hyper parameters

rose dagger
#

A bit of an odd error: I trained a neural net with the following architecture (see image) and called model.predict(x) on one of the training data points and got the following error. The training worked without any errors, so what's the issue here?

#

The data point x is a 512x512 array

cold osprey
tidal bough
cold osprey
rose dagger
cold osprey
#

thats how i see it

rose dagger
#

Oh i see.

#

meaning if i wanted to predict 3 data points x1,x2,x3 simultaneously i'd have the input shape as (3,512,512,num_channels), where the first dimension is merely indexing the data points i give as an input
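
A NumPy sketch of adding that leading batch dimension before prediction, assuming a single channel (the channel count is not stated above):

```python
import numpy as np

x = np.zeros((512, 512))  # one data point, no batch or channel axis yet

# One sample: add a batch axis in front and a channel axis at the end
single = x[np.newaxis, ..., np.newaxis]

# Three samples stacked along the new first axis
batch = np.stack([x, x, x])[..., np.newaxis]

print(single.shape, batch.shape)
```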

uneven thunder
cold osprey
uneven thunder
#

Okay so but like why can't we task an AI to build a better AI?

#

sorry.

tidal bough
manic cave
#

how difficult would it be to train a model with pytorch that detects bugs and inserts a print statement after that line

serene scaffold
#

that's like god tier

hazy knot
#

Is there a go-to or default method for model explainability?

tall tulip
#

I've standardized the data and also trained a model with that data:

```py
data_mean = data.mean()
data_std = data.std()

norm_data = (data - data_mean) / data_std
```
**Now I want to invert the predicted values:**
```py
inverse_data = (predicted_arr * data_) + data_mean
```
**But it gives me the error below. How can I handle this?**
```
ValueError: Length of values (14604) does not match length of index (2)
```
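
The error message is consistent with a 1-D array of 14604 predictions being combined with a length-2 Series (one mean/std per column), though that is a guess from the numbers alone. A sketch on made-up data, with hypothetical column names `a` and `b`: either pick the statistics of the predicted column, or keep predictions as an `(n, 2)` array and use plain NumPy arrays for the statistics.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})
data_mean = data.mean()  # Series with one entry per column
data_std = data.std()
norm_data = (data - data_mean) / data_std

# Case 1: predictions for a single column -> use that column's scalars
predicted_arr = norm_data["a"].to_numpy()
inverse_a = predicted_arr * data_std["a"] + data_mean["a"]

# Case 2: predictions for all columns, shape (n, 2) -> broadcast over columns
predicted_all = norm_data.to_numpy()
inverse_all = predicted_all * data_std.to_numpy() + data_mean.to_numpy()

print(inverse_a)
print(inverse_all)
```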
fleet heath
hasty mountain
#

Hey @potent sky, since you're into trying some things on latent generative models, maybe this may be useful to you:
https://arxiv.org/pdf/2006.10273.pdf

It's a tutorial on Variational AutoEncoders that explains more of the theory and mathematics around VAEs. It also talks about the confusion around the Decoder Loss (MSE or Likelihood).
My professor sent me this yesterday. Seems interesting.

#

I just don't really get one thing, though: if the ELBo Loss is more accurately applied when using a Likelihood metric (like Gaussian Likelihood)... why does it work with MSELoss in Diffusion Models?
I mean... I remember the sampling function for diffusion probabilistic models is based on ELBo...

potent sky
potent sky
# hasty mountain I just don't really get one thing, though: if the ELBo Loss is more accurately a...

https://arxiv.org/abs/2107.00630 this might help a bit I think

hasty mountain
#

Thanks!

tall tulip
manic cave
cunning vector
#

Hello all, quick question: is it fine to stay on an old pandas version forever? Newer pandas versions throw a merge error

#

this merge error was just a warning in older versions

agile cobalt
#

ideally you should adjust your code so that it gives you neither warnings nor errors

#

if it works on an old version, technically you can just never update anything and keep using it exactly as is, but if you ever need to add new features to it, or if security is a concern (e.g. web servers), you may want to update things

past meteor
#

How do you guys decide where you're going to publish especially if you're doing more applied stuff (like in my case personal health)?

#

Like what helps you decide if you're going for an AI journal or a health (or any other) journal?

cunning vector
coral field
#

What's the difference between Tensorflow's .numpy() and Numpy's np.array()? How does functionality change if I choose one over the other?

hasty mountain
#

I suppose Tensorflow will simply call np.array() while manipulating the data so the operation can be as efficient as possible

tidal bough
#

maybe .numpy can avoid copying the data.

night kernel
#

anyone hear about 'openchatkit' from redpajama? https://twitter.com/togethercompute/status/1666067674382888961

Announcing RedPajama 7B trained on 1T tokens! 🚀

• Instruct, chat, base, and interim checkpoints on @huggingface
• The instruct model outperforms all open 7B models on HELM benchmarks
• The 5TB dataset has been used to train over 100 models

Details👇

https://t.co/oUNKqYBmlS

#

released this morning, is apparently one of the best open source chat models to date. if you had to say, what do you believe is the best open source LLM?

crimson summit
#

I coded my first neural network and I finally got it to work lets goooo

#

97% accuracy

mild dirge
#

What was the mistake in the end? @crimson summit

toxic mortar
#

Chris Lattner is a legendary software and hardware engineer, leading projects at Apple, Tesla, Google, SiFive, and Modular AI, including the development of Swift, LLVM, Clang, MLIR, CIRCT, TPUs, and Mojo. Please support this podcast by checking out our sponsors:

crimson summit
mild dirge
#

I can't stand lex's voice, he sounds high as a kite and his questions are so weird(?) sometimes

crimson summit
past meteor
#

Lex also has some hot takes but who am i to judge on that front

vale idol
#

Hi, I have a question regarding how to assign values to series in dataframes. I have a (main) dataframe that is divided into multiple years which also contains various kinds of scores. Each year has 1 unique score attached to an identifier. I would like to calculate deciles for every year (shown in code below) and do this using the yearly dataframe. Unfortunately, I have issues assigning this back to the original dataframe. Additionally, although the code below is only for 1 year, I would like to make a for loop function that does each year in the original dataframe. Any help is really appreciated 🙂

```py
sustainalytics_scores = ['total_esg_score', 'environment_score', 'social_score', 'governance_score']
sustainalytics_c = sustainalytics.copy()
labels_30th_p = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

yearly = sustainalytics.loc[sustainalytics['year'] == 2014, ['isin'] + sustainalytics_scores]

sustainalytics_c.loc[sustainalytics_c['year'] == 2014, ['ESG_measure_sorts_30']] = pd.qcut(yearly['total_esg_score'], q=10, labels=labels_30th_p)
```
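
One possible approach that avoids both the per-year slice and the loop is `groupby(...).transform`, which computes the deciles within each year and returns a result aligned to the original index. This is a sketch on synthetic data that reuses the column names from the snippet above; it produces integer deciles 1 to 10 rather than string labels, and `decile_labels` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real frame: 20 identifiers per year
rng = np.random.default_rng(0)
sustainalytics = pd.DataFrame({
    "year": np.repeat([2014, 2015], 20),
    "isin": [f"ID{i:03d}" for i in range(40)],
    "total_esg_score": rng.uniform(0, 100, size=40),
})

def decile_labels(scores):
    """Return decile numbers 1..10 for one year's scores."""
    return pd.qcut(scores, q=10, labels=False) + 1

# transform() keeps the original index, so the result assigns straight back
sustainalytics["ESG_measure_sorts_30"] = (
    sustainalytics.groupby("year")["total_esg_score"].transform(decile_labels)
)

print(sustainalytics.head())
```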

night kernel
#

i was watching andrej karpathy's 'let's build gpt' from january, but i feel that models have progressed since then

hasty mountain
thorn swift
#

im so bored, im desperate for a project if anyone is working on something

somber panther
#

where might i look for some open source projects i might be able to contribute to while i'm learning ds?

agile cobalt
#

you can play around with open datasets on Kaggle and try participating in their Competitions

somber panther
#

is an idea, i don't really thrive in competitive settings

#

feel like id be more motivated if it was something i could invest myself in

#

that's a useful lead though, seeing a lot of libraries in use that i'm currently studying

potent sky
faint marten
# vale idol Hi, I have a question regarding how to assign values to series in dataframes. I ...

Hi, I hope you have found a solution. If not, could you clarify what your data frame looks like? So you have a main data frame with columns ['year', 'total_score', 'env_score', 'soc_score', 'gov_score'], so each year is one row? Or do you have an additional column like 'city', so each year is N rows, where N is the number of cities?

lapis sequoia
#

guys why does precision have two values when produced using a classification report in scikit learn

#

1 and 0

#

I thought precision = TP/(TP + FP) where TP = True positive, FP = false positive

#

how are there separate values for 1 and 0

#

is it that the classification_report function is not assuming that 1 means positive and 0 means negative and is thus calculating the precision twice

#

once for 1 as positive and then 0 for positive

agile cobalt
#

it is made to support multiclass classification, not just binary classification

dusk bear
#

guys...
a doubt regarding how to use precision and recall
actually i am building a cnn model for plane detection
i got these precision and recall values, but the only plane it didn't recognise was the third one. so is this right or wrong? or .. any comments
please suggest something..
first list is predicted boxes
second list is ground truth boxes

odd meteor
# lapis sequoia guys why does precision have two values when produced using a classification rep...

The metrics are shown on a per-class basis. This is because we might want to know how the model performed per class in the response variable (Y), since it's possible to be more interested in seeing the model's performance on the positive or negative class (for a binary classification problem) separately.

For example (Assume class label 1 is the positive class here and this is a titanic dataset), this affords you the liberty to infer that:

Precision: Out of all the passengers the model predicted would survive, only 84% actually survived.
Recall: Out of all the passengers that actually did survive, the model only predicted this outcome correctly for 89% of those passengers.

(Now you can also make such inference for the negative class with ease by focusing on the label 0)

Finally, you can also get a general overview of each metric (not at the per-class level this time) by looking at their respective average scores.

You can find the complete documentation for the classification_report function here https://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html
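
The per-class idea can be sketched in plain Python: each label in turn is treated as the "positive" class and TP/(TP + FP) is computed for it. The labels below are made up, and `precision_for` is a hypothetical helper, not part of scikit-learn.

```python
def precision_for(label, y_true, y_pred):
    """Precision when `label` is treated as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t != label)
    return tp / (tp + fp) if (tp + fp) else 0.0

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]

precision_1 = precision_for(1, y_true, y_pred)  # 1 as the positive class
precision_0 = precision_for(0, y_true, y_pred)  # 0 as the positive class
print(precision_1, precision_0)
```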

shadow quiver
#

I have data with 50m rows in Postgres. I can easily handle a 1m-row part of this data even with Pandas.

Now I want to write this data to parquet using PySpark, but it gives a memory error (Java heap). I even partitioned the data by 100. Why can't Spark handle it and do it in small batches?

rose dagger
#

I'm working on an image segmentation task where i'm currently trying out the U-Net architecture. When making predictions with the trained model, i am getting images of the following form (see attached image). The boundaries seem to be causing some issues here. My guess is that the cause is a combination of (a) the convolution blocks "downsizing" the images together with (b) the decoding part of the U-Net then "upsizing" the images again.
What are some steps to take to remedy this problem? Note that the inputs are WxH sized and the outputs are WxH sized as well, i.e. the same size as the input images. One idea i had was to slightly "crop" the output/train images so that they are of size (W-n)x(H-n), so that i have removed the boundary. It seems that in this Kaggle competition (https://www.kaggle.com/competitions/hubmap-kidney-segmentation/discussion/238198) the winning solution did exactly that. Any thoughts?
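
The cropping idea can be sketched in NumPy, assuming a border of `n` pixels is removed on every side (so the result is `(H-2n) x (W-2n)`); `crop_border` and the border width of 16 are hypothetical choices for illustration, and the same crop would be applied to predictions and targets alike.

```python
import numpy as np

def crop_border(image, n):
    """Drop `n` pixels from each side of a 2-D (or H x W x C) array."""
    return image[n:-n, n:-n] if n > 0 else image

rng = np.random.default_rng(0)
prediction = rng.random((512, 512))
target = rng.random((512, 512)) > 0.5

# Apply the identical crop to both before computing the loss or metric
prediction_c = crop_border(prediction, 16)
target_c = crop_border(target, 16)
print(prediction_c.shape, target_c.shape)
```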

lost pier
#

Hi there, wonder if anyone can help, I've been trying to drop na values from a dataframe and it just will not go. I've tried the following:

df = df.dropna(subset=['workflow'])

I've tried :

df.dropna(subset=['workflow'], inplace=True)

I've tried:

test_df = df[['workflow']]
test_df = test_df.dropna()

and I've tried:

test_df = df[['workflow']]
test_df.dropna(inplace=True)

Bonus round, I've tried

df = df[df['workflow'].notna()]

In fact, the nan values in the dataframe do not even show up as True if isna() is applied. What else can I do to rid my data frame of this plague please?

boreal gale
lost pier
#

This is now just the single column and it still won't go, I'm actually just trying to check for na to see if this is what is messing up my function on the larger dataframe, but I just don't understand why I can never get this to work without a fight

potent garnet
#

Hello everyone, is anyone interested in Kaggle competitions?

cold osprey
#

np.nan

cold osprey
#

prob a dtype thingy

lost pier
#
224    "soip"
225    "soip"
226       nan
227    "soip"
228    "soip"
Name: Workflow, dtype: object
cold osprey
dusk bear
#

u did workflow

#

do Workflow

boreal gale
#

good eye, if that doesn't work then see what type(col.loc[227]) gives you

lost pier
#

yes sorry, i've just changed the name of the column as it's work data and I don't want to get into trouble, that was just a typo

cold osprey
#

try checking with .isnull()

#

see if it returns True

lost pier
#
type(new_df.loc[227])

returns str

dusk bear
dusk bear
#

well ig it's the string 'nan', not np.nan

#

so replace 'nan' with np.nan

#

and then do dropna

lost pier
#

ah, ok, how can I fix that? the original dataframe is 139,000 rows lol

cold osprey
#

^

#

replace then drop

dusk bear
#

new_df = new_df.replace('nan', np.nan)

#

and then the dropna code u wrote with subset
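
A small reproduction of the symptom and the suggested fix, on made-up values: the text `'nan'` is an ordinary string, so `isna()` reports nothing until it is replaced with a real `np.nan`.

```python
import numpy as np
import pandas as pd

new_df = pd.DataFrame({"Workflow": ['"soip"', '"soip"', "nan", '"soip"']})

# The string 'nan' is not a missing value, so nothing is flagged here
missing_before = int(new_df["Workflow"].isna().sum())

# replace() returns a new frame; after it, dropna() sees a real NaN
cleaned = new_df.replace("nan", np.nan).dropna(subset=["Workflow"])

print(missing_before, len(cleaned))
```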

boreal gale
#

the above suggestions should work, but i would look into why your data is like that in the first place

dusk bear
#

which dataset are you working on btw? @lost pier

alpine temple
#

Anyone here a PyTorch whisperer?

I've attempted to build a SqueezeNet, and it blows.

lost pier
# dusk bear which dataset are you working on btw? @lost pier

@dusk bear I have a large data set with two columns I am working with: one is an array which has been exploded, the other is a JSON object that I am trying to map with the result of the exploded column. I hadn't seen the nan values till yesterday, so now I am trying to find a way to skip over the nan values, as this is just a pipeline transformation for financial data, so nothing can be dropped

alpine temple
#

Wondering if I could talk through my hyperparameters with someone, along with a sanity check.

lost pier
dusk bear
#

ahh.. ok..

boreal gale
lost pier
#

The Json loads and dumps was an attempt yesterday, i've removed that now, I'll show you the code that works up to the nan values, one second

boreal gale
#

oh my apologies, i somehow took it as the json.dumps/loads caused this weirdness.
but yes, showing what you have got would be useful

lost pier
#
file = glob(f"{file_path}*.csv")[0]
df = pd.read_csv(file, encoding='utf-8')
df = df.replace({'\'': '"'}, regex=True)
df["Workflow"] = df["Workflow"].str.strip("[]").astype(str)
df["Workflow"] = df["Workflow"].str.split(",")
df = df.assign(Item_Cost_This_Month=df["Cost This Month"] / df["Workflow"].str.len())
df = df.assign(Item_Cost_Next_Month=df["Cost Next Month"] / df["Workflow"].str.len())
df = df.explode("Workflow").reset_index(drop=True)
df[["Workflow", "Cost_Centre"]] = df[["Workflow", "Cost_Centre"]].applymap(ast.literal_eval)

So the above code, works really well so long as there are no null values, here is a sample row of the whole data:

04/30/2023 23:24:26     1242360.0     LongForm     04/30/2023 23:24:26     05/30/2023 00:00:00     True     0     1     29     0.0     0.12     3.34     ['soip', 'ott']     uk     {'ott': '1234567890', 'soip': ' '}     abc    xyz    prd     NaN     NaN     NaN

workflow is the array, and cost code is the key value pair

rose dagger
lost pier
#

The above only trips up when it gets to a row where the value in the array field is nan, as this value is used to map the key value pair, I just didn't expect skipping over it would be such a battle

boreal gale
tidal bough
lost pier
#

yes, so here is a row after the explode, but before the ast line:

True     0     1     29     0.0     0.12     3.34     'ott'     uk     {'ott': '2000920243', 'soip': ' '} 

and here is a line that is causing an issue:

False     0     0     0     0.0     0.0     0.0     nan     de     {'content_processing': 'XYZ'}     

the above line is the first one that fails, and after looking at the csv and manually copying it out, I saw the issue and then after more digging, found that this was what was stopping it. Today I thought if I dropped all the nan's I could validate that theory lol

lost pier
#

That's the column that's causing the problem too

tidal bough
#

Possibly you want to do something like a json.loads followed by pandas.json_normalize

boreal gale
#

can we have the header as well so we are on the same page? or just highlight which column is which (the ones you have used anyway)

lost pier
#

Yes it's the columns that have nan and the key value pair, these ones:

workflow      cost_centre
'ott'      {'ott': '2000920243', 'soip': ' '}
nan        {'content_processing': 'XYZ'}
#

I have to split this up for these two columns and produce another csv that can then carry on down the pipeline into google big query I think it goes

boreal gale
#

i understand now.
is workflow really 'ott'?
or is it "ott"?
only the latter is valid JSON

lost pier
#

It actually comes in from the csv as ['ott'], but I don't know if that's python doing that

#

same with the key value pair, it looks like json with single quotes, which I thought was not valid json,

#

I did put this in there:

df = df.replace({'\'': '"'}, regex=True)

not sure if it was in the above code, I've tried all sorts of things to clean this up, I'm getting a little lost

#

So after that I tried the json.dumps and json.loads, and that did get the cost centre column into a valid json format, however literal_eval was working on the single quote dictionary version to be fair.

boreal gale
#

hopefully this gives you some inspiration.

#

!e

import pandas as pd
import ast
df = pd.DataFrame({"itemgetters": ["['a', 'b']", "['a']"], "lookup": ["{'b': 'quack', 'a': 'meow'}", "{'b': 'quack', 'a': 'meow2'}"]})
df_parsed = df.applymap(ast.literal_eval).explode('itemgetters').reset_index()

lookup_values = pd.concat(
[
  df_group['lookup'].str[key]
  for key, df_group in df_parsed.groupby('itemgetters')
]
)
res = pd.concat([
lookup_values,
df_parsed,
],axis=1)

print(df)
print(df_parsed)
print(lookup_values)
print(res)
arctic wedgeBOT
#

@boreal gale :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 |   itemgetters                        lookup
002 | 0  ['a', 'b']   {'b': 'quack', 'a': 'meow'}
003 | 1       ['a']  {'b': 'quack', 'a': 'meow2'}
004 |    index itemgetters                        lookup
005 | 0      0           a   {'b': 'quack', 'a': 'meow'}
006 | 1      0           b   {'b': 'quack', 'a': 'meow'}
007 | 2      1           a  {'b': 'quack', 'a': 'meow2'}
008 | 0     meow
009 | 2    meow2
010 | 1    quack
011 | Name: lookup, dtype: object
... (truncated - too many lines)

Full output: https://paste.pythondiscord.com/xadogowike.txt?noredirect

boreal gale
#

i gotta bail now, good luck!

lost pier
#

@boreal gale thank you for your help sir, it has been very inspiring for sure

#

@tidal bough thanks very much for your help also, you have indeed correctly identified what was causing that problem. I am able to dropna() straight after ingest

rose dagger
#

Ok, i have posted my question on StackExchange: https://ai.stackexchange.com/questions/40742/convolutional-neural-network-struggling-at-the-boundary-of-images
I hope some of you might be able to help, but even if not, i'd appreciate an upvote on the question, if you think it is well-posed and interesting, in order to increase its visibility.

brave sand
#

can someone help me with object detection? how do I convert xml to csv for tf records?
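One possible starting point, assuming the annotations are Pascal VOC-style XML (the format many labelling tools export). The sample XML and the column names are illustrative, not taken from any particular tutorial:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical Pascal VOC-style annotation; real files come from the labelling tool.
SAMPLE_XML = """<annotation>
  <filename>img1.jpg</filename>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>widget</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def voc_to_rows(xml_text):
    """Return one row per <object> box in a single annotation document."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append([filename, width, height, obj.findtext("name"), *coords])
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(voc_to_rows(SAMPLE_XML))
print(csv_text)
```

For a folder of annotations, the same `voc_to_rows` call can be run on each file's contents and the rows concatenated before writing.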

frail dune
#

Hey, I'm currently researching digital twins and simulation, and I wanted to ask whether someone here has some knowledge of the topic and could answer a few questions and give me a small overview (pm if it's ok)

serene scaffold
crystal obsidian
#
import nltk
# nltk.download()
from nltk.tokenize import word_tokenize
from spellchecker import SpellChecker
from gingerit.gingerit import GingerIt
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Step 1: Tokenization
def tokenize_text(text):
    return word_tokenize(text)

# Step 2: Spell Checking
def correct_spelling(tokens):
     spell = SpellChecker()
     corrected_tokens = [spell.correction(token) for token in tokens]
     return corrected_tokens

# # Step 3: Grammar Correction
def correct_grammar(text):
     parser = GingerIt()
     result = parser.parse(text)
     corrected_text = result['result']
     return corrected_text

# # Step 4: Missing or Extra Words
def correct_missing_or_extra_words(text):

    tokenizer = AutoTokenizer.from_pretrained("grammarly/coedit-large")
    model = T5ForConditionalGeneration.from_pretrained("grammarly/coedit-large")
    input_text = text
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids
    outputs = model.generate(input_ids, max_length=256)
    edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)


    return edited_text


# Example usage
input_text = "Thiiss is aa testt sentnce with spelng mistakas."
tokens = tokenize_text(input_text)
corrected_tokens = correct_spelling(tokens)
corrected_text = ' '.join(corrected_tokens)
corrected_text = correct_grammar(corrected_text)
corrected_text = correct_missing_or_extra_words(corrected_text)

print(corrected_text)
# for i in range (0, len(corrected_tokens)):
#   print(corrected_tokens[i])

this code is extremely slow because of the 4th function
also, the expected output was:
This is a test sentence with spelling mistakes
but what I got:
This is a test sentence to see if I can spot mistakes.
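Two things may be worth trying here, sketched below under assumptions. First, the 4th function reloads the tokenizer and model on every call, so caching them should remove most of the repeated cost. Second, instruction-tuned editing models such as CoEdIT are normally prompted with a task instruction in front of the text; the exact wording used below follows the style of the model card and should be treated as an assumption. `build_prompt`, `load_model` and `correct_text` are made-up helper names:

```python
from functools import lru_cache

# Instruction prefix in the style the CoEdIT model card uses; treat the exact
# wording as an assumption and check the card for your model version.
INSTRUCTION = "Fix grammatical errors in this sentence: "


def build_prompt(text):
    """Prepend the task instruction so the model edits instead of rewriting freely."""
    return INSTRUCTION + text.strip()


@lru_cache(maxsize=1)
def load_model(name="grammarly/coedit-large"):
    """Load tokenizer and model once; later calls reuse the cached pair."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = T5ForConditionalGeneration.from_pretrained(name)
    return tokenizer, model


def correct_text(text):
    """Run one instruction-prefixed edit through the cached model."""
    tokenizer, model = load_model()
    input_ids = tokenizer(build_prompt(text), return_tensors="pt").input_ids
    outputs = model.generate(input_ids, max_length=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


print(build_prompt("Thiiss is aa testt sentnce with spelng mistakas."))
```

Without an instruction the model has no stated task, which may be why it paraphrased the sentence instead of only fixing it. The first call to `correct_text` will still be slow because the weights have to be loaded.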

lapis sequoia
#

which is more likely to cause overfitting in random forests: a high number of estimators or a low one?

lapis sequoia
#
import math
import time
from pynput import keyboard, mouse

is_active = False
last_toggle_time = 0

def on_press(key):
    global is_active, last_toggle_time
    try:
        if key.char.lower() == 'c':
            current_time = time.time()
            if current_time - last_toggle_time > 0.5:
                is_active = not is_active
                last_toggle_time = current_time
                if is_active:
                    start_spinbot()
    except AttributeError:
        pass

def start_spinbot():
    screenSize = mouse.Controller().position
    centerX = screenSize[0] / 2
    centerY = screenSize[1] / 2
    radius = 200
    angularSpeed = 0.1

    mouseController = mouse.Controller()

    angle = 0
    while is_active:
        x = centerX + radius * math.cos(angle)
        y = centerY + radius * math.sin(angle)

        mouseController.position = (x, y)

        angle += angularSpeed

        time.sleep(0.01)

def on_release(key):
    if key == keyboard.Key.esc:
        return False

def main():
    print('Press "c" to activate/deactivate the spinbot. Press "Esc" to exit.')
    with keyboard.Listener(on_press=on_press, on_release=on_release) as listener:
        listener.join()

if __name__ == '__main__':
    main()

i can't really find a channel for my issue, but my code is meant to spin the cursor around the middle of a 1440p native screen. not only does it not spin in the middle, it also doesn't stop after pressing C again
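Two likely causes, with a sketch of one way around them. `mouse.Controller().position` returns the current pointer position, not the screen size, so the "centre" ends up being half of wherever the cursor happened to be. And the `while` loop runs inside the `on_press` callback, which blocks the listener, so the second C press is never handled. The version below hardcodes an assumed 2560x1440 resolution and moves the loop into a worker thread; `circle_point`, `spin` and `toggle` are illustrative names:

```python
import math
import threading
import time

# Assumed 1440p resolution (2560x1440), taken from the message above.
SCREEN_W, SCREEN_H = 2560, 1440
RADIUS = 200
STEP = 0.1  # radians per tick
DEBOUNCE = 0.5  # seconds between accepted toggles, as in the original

active = threading.Event()
_last_toggle = float("-inf")


def circle_point(angle, cx=SCREEN_W / 2, cy=SCREEN_H / 2, radius=RADIUS):
    """Return the (x, y) point on the circle around the screen centre."""
    return (cx + radius * math.cos(angle), cy + radius * math.sin(angle))


def spin(controller):
    """Move the pointer in a circle until `active` is cleared."""
    angle = 0.0
    while active.is_set():
        controller.position = tuple(int(v) for v in circle_point(angle))
        angle += STEP
        time.sleep(0.01)


def toggle(controller, now=None):
    """Start spinning in a worker thread, or stop it; never blocks the caller."""
    global _last_toggle
    now = time.monotonic() if now is None else now
    if now - _last_toggle < DEBOUNCE:
        return  # ignore key repeat while the key is held down
    _last_toggle = now
    if active.is_set():
        active.clear()
    else:
        active.set()
        threading.Thread(target=spin, args=(controller,), daemon=True).start()


def main():
    # pynput is imported here so the geometry above can be checked without it.
    from pynput import keyboard, mouse

    controller = mouse.Controller()

    def on_press(key):
        if getattr(key, "char", None) in ("c", "C"):
            toggle(controller)

    def on_release(key):
        if key == keyboard.Key.esc:
            active.clear()
            return False  # returning False stops the listener

    with keyboard.Listener(on_press=on_press, on_release=on_release) as listener:
        listener.join()


# Call main() to run it interactively.
```

Because the callback returns immediately, the listener stays free to see the next C or Esc press.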

cold osprey
#

trynna run it rn

lapis sequoia
#

if i run it, it works fine, but like i said it doesn't spin in the middle of the screen, and it doesn't stop no matter what i press, until i alt+f4 out of it

cold osprey
#

ok sec lemme try

frail dune
#

Does anybody know whether it's possible to simulate a digital twin of a CAD model in python?

#

and if yes does anybody have a paper or link to a readme or w.e.

lapis sequoia
cold osprey
#

sec setting up env

#

wanna install pynput in separate venv

#

hmm

#

mouse aint spinning

#

i can press esc to exit tho

#

wait nvm

#

i didn't press c lol

cold osprey
brave sand
#

what do I put as a checkpoint for tensorflow?

#

on line 145

past meteor
brave sand
past meteor
#

Can we take a step back for a second, what are you trying to do?

brave sand
#

i am trying to train an object detector with mobilenet-ssd v2 320x320

#

i have these files, im not sure which one to use

past meteor
#

Is the object you're trying to detect not part of the coco classes?

brave sand
#

no, it isnt

#

i already labelled my data

#

and converted to csv for tfrecords

past meteor
#

Okay great, sorry for asking. Just wanted to be sure 🙂

brave sand
#

i'm on the last step, training the model

#

i'm unsure on what the checkpoint is

past meteor
#

In all honesty I don't know either but I'm going to have a look as well

cold osprey
#

iirc checkpoint of the model during training?

past meteor
#

Yeah but they seem to have multiple check point files

cold osprey
#

oh, it's starting from the ssd_mobilenet_v2_coco checkpoint to train

brave sand
#

am I missing something?

past meteor
brave sand
#

so basically the meta file is the checkpoint file i'm looking for

#

@past meteor that doesn't work

potent sky
#

If you want to use the checkpoint for training, all of them are important
The meta file describes the graph structure etc. The .data file has the actual model weights
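In practice the "checkpoint" you point at is usually the shared filename prefix of those files, not any single one of them. If this is the TF Object Detection API, that prefix typically goes into the `fine_tune_checkpoint` field of pipeline.config. A small sketch for finding the prefix from a folder listing; the listing and the helper name `checkpoint_prefixes` are made up:

```python
import re

# Matches the per-checkpoint files TensorFlow writes next to each other.
_CKPT_FILE = re.compile(r"^(?P<prefix>.+)\.(?:index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefixes(filenames):
    """Return the sorted checkpoint prefixes found in a file listing."""
    prefixes = set()
    for name in filenames:
        match = _CKPT_FILE.match(name)
        if match:
            prefixes.add(match.group("prefix"))
    return sorted(prefixes)


# Hypothetical listing of a downloaded pretrained-model folder.
files = [
    "checkpoint",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "pipeline.config",
]
print(checkpoint_prefixes(files))
```

With this listing the value to use would be the folder path followed by `model.ckpt`, without any of the `.meta`, `.index` or `.data-...` suffixes.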

potent sky
past meteor
#

@potent sky can I use you as a sounding board for a second?

potent sky
#

Or at least it used to be, last I used it
TF is undergoing too many changes atm ;-;

past meteor
#

I want to make synthetic data (tabular use cases).

I was thinking of going with graphical models because I can specify how everything is related to each other first. Afterwards I sample from it and send it through a (V)AE to add a bit of unpredictable/non-boring noise.

Am I severely overengineering/overthinking this?

past meteor
#

If all the relationships are linear and everything is independent w.r.t. each other then I'd obvs just sample my N variables and make a predetermined function that determines f(X_1, ..., X_N) but that's just too boring
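To make the non-boring version concrete, here is a minimal sketch of sampling from a hand-specified graph: each variable is drawn after its parents, with non-linear links and noise. The variables, coefficients and threshold are invented for illustration, and the (V)AE step is left out:

```python
import random


def sample_patient(rng):
    """Ancestral sampling: draw each node only after its parents."""
    # Hypothetical graph: age -> resting_hr -> risk, and active -> resting_hr.
    age = rng.uniform(20, 90)
    active = rng.random() < 0.4
    # Non-linear dependence on age, shifted down for active patients, plus noise.
    resting_hr = 60 + 0.002 * (age - 20) ** 2 - (8 if active else 0) + rng.gauss(0, 3)
    # Risk depends on heart rate through a noisy threshold, not a straight line.
    risk = 1 if resting_hr + rng.gauss(0, 2) > 68 else 0
    return {"age": age, "active": active, "resting_hr": resting_hr, "risk": risk}


def sample_dataset(n, seed=0):
    """Draw n independent rows from the graph with a reproducible seed."""
    rng = random.Random(seed)
    return [sample_patient(rng) for _ in range(n)]


rows = sample_dataset(1000)
print(rows[0])
```

The dependency structure lives entirely in the order and form of the draws inside `sample_patient`, so changing an edge means changing one line.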

brave sand
#

i just activated tensorflow for my gpu

#

now the old code won't work

#

how the hell

potent sky
#

Your reasoning for using graphical models seems pretty sound. If I wanted data realism and had to capture relationships between different variables, this would be a good option

potent sky
potent sky
# brave sand how the hell

Are you on Windows? TF GPU is not supported on Windows anymore, I think.
Overall TF is undergoing many changes

brave sand
#

nvm, i'm back at the same error