#data-science-and-ml
Thank you, I activated it
And thank you too because I see my problem with the input data
can someone give me an example of epoch? im new
you mean the graphics ?
what graphics?
the example of epoch ..!! you need the script?
yes
def plot_training_history(self, model):
    fig, axs = plt.subplots(2, 1, figsize=(8, 6))
    axs[0].plot(model.history.history['loss'], label='Training loss')
    axs[0].plot(model.history.history['val_loss'], label='Validation loss')
    axs[0].set_title('Model Training History')
    axs[0].set_ylabel('Loss')
    axs[0].set_xlabel('Epoch')
    axs[0].legend()
i'm working on a trading bot
thanks
welcome
hey can i know what does loss in that pic mean?
I'm trying to convert my total precipitation data into float. But it's not converting
```py
data['total_precipitation'] = data.total_precipitation.astype(float)
data['date'] = pd.to_datetime(data['id'])
print(type(data['total_precipitation']))  # <class 'pandas.core.series.Series'>
```
Epoch X/Y: Indicates the current epoch (X) and the total number of epochs (Y) being run.
171/171: Represents the number of training steps completed out of the total for the current epoch. In this case, all 171 training steps of the epoch were completed.
━━━━━━━━━━━━━━━━━━━━: Provides a graphical visualization of the epoch's progress.
7s 41ms/step: Shows the total time taken by the epoch and the average time per training step. In this case, the whole epoch took about 7 seconds, and each step took about 41 milliseconds on average.
loss: 0.8608 - val_loss: 2.5735: Shows the loss on the training set and on the validation set at the end of the current epoch. In this case, the loss on the training set was approximately 0.8608, while the loss on the validation set was approximately 2.5735.
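A quick way to sanity-check how the timing fields relate is to pull the numbers out of one progress line and multiply steps by time per step. This is only a sketch: the log line below is stitched together from the pieces quoted above, so its exact layout is an assumption, and the regular expression is an illustration, not something Keras provides.

```python
import re

# Log line assembled from the pieces quoted above; the exact layout is an assumption.
line = "171/171 ━━━━━━━━━━━━━━━━━━━━ 7s 41ms/step - loss: 0.8608 - val_loss: 2.5735"

# Hypothetical pattern for this one layout: steps, epoch time, step time, losses.
pattern = re.compile(
    r"(?P<done>\d+)/(?P<total>\d+).*?"
    r"(?P<epoch_s>\d+)s\s+(?P<step_ms>\d+)ms/step.*?"
    r"loss:\s*(?P<loss>[\d.]+).*?val_loss:\s*(?P<val_loss>[\d.]+)"
)

match = pattern.search(line)
total_steps = int(match["total"])
step_ms = int(match["step_ms"])

# Steps per epoch times average time per step should roughly equal the epoch time.
estimated_seconds = total_steps * step_ms / 1000
print(f"{total_steps} steps x {step_ms} ms/step = {estimated_seconds:.3f} s")
print("reported epoch time:", match["epoch_s"], "s")
```

The product lands at about 7 seconds, which matches the reported epoch time, so the first number is the duration of the whole epoch and not the duration of a single step.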
import pandas as pd
# Assuming 'data' is your DataFrame
# Check the data type of the 'total_precipitation' column
print(data['total_precipitation'].dtype)
# Convert 'total_precipitation' to float
data['total_precipitation'] = pd.to_numeric(data['total_precipitation'], errors='coerce')
# Check if conversion was successful
print(data['total_precipitation'].dtype)  # Should be float64
When i tried dtype. It's giving me float64. But the problem is when i try to plot it. It's giving me this error Series is not callable
import matplotlib.pyplot as plt
# Assuming 'data' is your DataFrame
plt.plot(data['date'], data['total_precipitation'])
plt.xlabel('Date')
plt.ylabel('Total Precipitation')
plt.title('Total Precipitation Over Time')
plt.show()
i know it's training/validation loss but i dont get what it means
The "val_loss" (loss on the validation set) is a crucial metric for evaluating the model's performance and its ability to generalize to unseen data. If the loss on the validation set is significantly higher than the loss on the training set, it could indicate that the model is overfitting to the training data.
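To make the overfitting check concrete, here is a minimal sketch that compares two loss curves. The helper name `overfitting_report` and the per-epoch numbers are made up for illustration; only the final pair (0.8608 and 2.5735) comes from the log line discussed above.

```python
def overfitting_report(train_loss, val_loss):
    """Summarize the gap between training and validation loss per epoch."""
    if not train_loss or len(train_loss) != len(val_loss):
        raise ValueError("need two non-empty loss lists of equal length")
    # Epoch (1-based) at which validation loss was lowest.
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        "best_epoch": best_index + 1,
        # A large positive gap means the model fits the training data much better.
        "final_gap": val_loss[-1] - train_loss[-1],
        # Validation loss climbing after its minimum is the classic warning sign.
        "val_rising": val_loss[-1] > val_loss[best_index],
    }


# Hypothetical curves; only the last values are taken from the chat.
train = [2.10, 1.40, 1.05, 0.90, 0.8608]
val = [2.30, 2.00, 2.10, 2.40, 2.5735]

report = overfitting_report(train, val)
print(report)
```

With these numbers the training loss keeps falling while the validation loss bottoms out at epoch 2 and then rises, which is the pattern usually read as overfitting.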
you're welcome bro
matplotlib is cringe after experiencing ggplot
i use matplotlib a lot, but ggplot is awesome...!!
guys how to make this work
import pandas as pd
from pandasai import PandasAI
from pandasai.llm.openai import OpenAI
i get this error:
ImportError Traceback (most recent call last)
Cell In[26], line 2
1 import pandas as pd
----> 2 from pandasai import PandasAI
3 from pandasai.llm.openai import OpenAI
ImportError: cannot import name 'PandasAI' from 'pandasai' (C:\Users\****\anaconda3\Lib\site-packages\pandasai\__init__.py)
"you need to import a SmartDataframe or a SmartDatalake, check out the docs:
https://docs.pandas-ai.com/en/latest/getting-started/" -gventuri
.
thanks
I'm happy most of these (so, a tiny subset of NNs) don't look so alien to me anymore
a considerable improvement since I first saw it
@serene scaffold I'm using the parser in the 2nd dataframe pic to analyze the first one.
I will not look at pictures of text.
to give a dataframe as text, do print(df.head().to_dict('list'))
though for full disclosure, I have a meeting that's about to start, so I might not be able to help. Posting the dataframe as text and the code increases your chances of getting help from someone else.
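For anyone unsure what that produces, here is a small sketch. The frame is a hypothetical stand-in with made-up column contents; the point is that the printed dict can be pasted into a message and turned back into a DataFrame by whoever is helping.

```python
import pandas as pd

# Hypothetical stand-in for the real data: one reading per minute.
df = pd.DataFrame({
    "Date time": pd.date_range("2001-04-15 00:00", periods=5, freq="min"),
    "Activity Counts": [0, 0, 0, 0, 0],
})

# Column name -> list of values; this is the text to paste into the chat.
as_text = df.head().to_dict("list")
print(as_text)

# A helper can rebuild a frame from the pasted dict.
rebuilt = pd.DataFrame(as_text)
print(rebuilt.dtypes)
```

Pasting the output this way also preserves the values as timestamps, which a screenshot cannot do.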
'dt1': [Timestamp('2001-04-15 23:30:00'), Timestamp('2001-04-17 00:28:00'), Timestamp('2001-04-17 23:44:00'), Timestamp('2001-04-18 23:48:00'), Timestamp('2001-04-19 23:12:00')],
'dt2': [Timestamp('2001-04-16 08:06:00'), Timestamp('2001-04-17 07:28:00'), Timestamp('2001-04-18 07:02:00'), Timestamp('2001-04-19 07:06:00'), Timestamp('2001-04-20 06:56:00')]}
the first one is rlly long since it's one data every minute
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
actually if it's just head:
'Activity Counts': [0, 0, 0, 0, 0]
'Date time': [Timestamp('2001-04-15 00:00:00'), Timestamp('2001-04-15 00:01:00'), Timestamp('2001-04-15 00:02:00'), Timestamp('2001-04-15 00:03:00'), Timestamp('2001-04-15 00:04:00')]
it's not standard notation, they probably define it to mean exactly this
where did you find this?
well, initially I found it here
https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks
there seems to be no definition of what it does there
it's also used here
https://towardsdatascience.com/grus-and-lstm-s-741709a9b9b1
and here
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
and probably other places
those are separate images
my only question is about the notation used in the first one and how it corresponds to the much more readable notation in the second
dot is probably dot product and * is Hadamard product
no way to know if it doesn't say it explicitly; it's probably taken from the original paper
alright, looking at the diagrams, my best guess is that it's supposed to be concatenation
but if it's concat, why does the other notation have two different weight matrices 
it does look like it means concatenation, which would make sense if the vectors were row vectors, but then the whole multiplication doesn't make sense
better look at the original paper
all non standard (which is fine, if you explicitly define the symbols. not otherwise though)
The diagrams for LSTMs and GRU do a lot more damage than help
we've had this discussion with zestar before
alright, concat makes sense, but then why do the "expanded" versions have different weight matrices
Have we?
why look at a diagram with like 10 nodes and arrows everywhere when you can represent it all with like 4 equations
yes hahaha
when you asked about a weird network that still worked even when you took the "wrong output", remember?
ah yes
and looking at the equations immediately elucidated that was just like removing half a layer
oh wait, does it mean the weights are also concatenated? is it a vstack?
i would honestly recommend to look for the original paper and check the equations there
The diagrams kind of try and convey the intuition behind why the original author thinks they work
or just a better source that is more explicit
I found a paper, all it says is that it's a generalized form...
As per usual I'll recommend ISLR https://www.statlearning.com/ they have a good chapter on LSTMs and GRU
clearly not the original paper, but cmon
I think the whole cell state thing is just:
- A way to have more parameters and non-linearities
- A way to prevent vanishing gradient by having something that is essentially a skip connection
Forget gate, output gate, ... they all obfuscate what it's doing
welcome to diagrams + time series models
einsum is the #1 reason i prefer numpy to matlab
I'm pretty sure he intended us to use do-calculus
so someone, at some point, just mashed it together
this may or may not be the original paper, it's not even called GRU there https://arxiv.org/pdf/1406.1078v3
any one knows pyinstaller?
yeah, ok, so ig those are almost implementation details, you can either concat and apply a larger weight matrix or you can apply separate weight matrices and sum the products
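That equivalence is easy to check numerically. A minimal sketch with NumPy, assuming column vectors and made-up sizes; the names `W`, `U`, `x` and `h` are just placeholders for an input weight matrix, a hidden weight matrix, an input and a hidden state.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4  # arbitrary sizes for illustration

W = rng.normal(size=(n_hidden, n_in))      # applied to the input x
U = rng.normal(size=(n_hidden, n_hidden))  # applied to the hidden state h
x = rng.normal(size=n_in)
h = rng.normal(size=n_hidden)

# "Expanded" notation: separate weight matrices, products summed.
separate = W @ x + U @ h

# "Concatenated" notation: one wider matrix applied to the stacked vector.
W_cat = np.hstack([W, U])        # shape (n_hidden, n_in + n_hidden)
xh = np.concatenate([x, h])      # shape (n_in + n_hidden,)
concatenated = W_cat @ xh

print(np.allclose(separate, concatenated))
```

With column vectors the two weight matrices sit side by side (an hstack). Under a row-vector convention, where the product is written with the vector on the left, the same idea would stack the transposed matrices vertically, which may be where the vstack question comes from.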
I just saw Dune 2 last night. I opened this server now and I'm seeing Lisan Al Gaib (Muad'Dib)
I hope they make a part 3 'cos they didn't wrap up things properly in part 2.
Hi guys , I have just finished with regression , classification and clustering , what should I do next?
computer vision
can anyone suggest a good crash course to learn tensorflow and pytorch? also, looking at current data science trends, which one is the better choice? I have seen lots of data scientists using pytorch, but most of the courses I learned machine learning from use tensorflow and the keras library
but tbh i havent much explored the tensorflow too tho
just wanted to ask which is best to use these days and easier and faster to learn?
Build personal projects on those topics or test out what you've learned by participating in a ML Hackathon.
pytorch is the one you should use, yes
and yeah, lots of old tutorials use tf, but it's how it is
You'll find your path once you've explored long enough. I started with Keras, then moved to PyTorch. I had to learn PyTorch when I got into ML Research. I tried PyTorch and I haven't looked back since. You might prefer TensorFlow to other frameworks, you just have to start from one, I guess.
In summary: I'd say, "what's easier and faster to learn" is subjective. These frameworks are just tools, so just pick up one already and start 'cooking'. It doesn't matter if you started with a cutlass, hoe, tractor, or shovel, just pick a tool already and get started.
thanks! by chance do you know any crash course on pytorch which is really easy to understand and has almost all concepts covered
unfortunately no, but you can look at the pinned messages here
Deep Learning paper of the year material.
That everyone cites, but nobody understands or implements themselves.
Welcome to the most beginner-friendly place on the internet to learn PyTorch for deep learning.
All code on GitHub - https://dbourke.link/pt-github
Ask a question - https://dbourke.link/pt-github-discussions
Read the course materials online - https://learnpytorch.io
Sign up for the full course on Zero to Mastery (20+ hours more video) - https:/...
New Tutorial series about Deep Learning with PyTorch!
Part 01: Installation
I show you how I install PyTorch on my Mac using Anaconda. Installation on Linux or Windows can be ...
btw, I find the diagrams in this paper pretty good https://proceedings.mlr.press/v63/gao30.pdf
"All you need is all you need"
Did I win ML?
Good diagrams make things way easier, but they are hard to come up with and often not the best idea. Symbolic / algebraic is pretty good.
It's now boring tbh. I thought I'm the only one who's tired of that
woah thats a good journey!
tysm!
all things considered this GRU model is avoiding learning anything...
well, I was just gonna ask about that as well
ohhh, yk what
the first graph is pretty misleading
I'd really check for bugs in your implementation, the train accuracy going down is quite worrying?
any idea why its not working
i downloaded it and even installed everything
how can i check it
hit the windows key and start typing: "environment variables", a window will pop up, click on the button that says "environment variables", a window will pop up, find a variable named Path, double click it and create a new entry that is a path to your conda installation
i can see this path
it's echo on win
$PATH does not work on windows though
either echo %PATH% or just use set with no arguments
(both assuming cmd.exe, idk about powershell)
where do i do this
^
oki ill try this
Win+R -> cmd -> echo %PATH%
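If a Python interpreter is already available, the same check can be done from Python, which behaves the same in cmd.exe and PowerShell. A small sketch using only the standard library; the helper names are made up, and `conda` is simply the executable being searched for here.

```python
import os
import shutil


def path_entries():
    """Return the directories on PATH, one per list element."""
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]


def find_on_path(executable):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(executable)


for entry in path_entries():
    print(entry)

# None means the shell will not find it either.
print("conda found at:", find_on_path("conda"))
```

If the last line prints `None` after the Path entry was added, the shell was probably started before the change and needs to be reopened.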
yeah, idk
here
click new first?
yep
like this?
# (B, Seq, F) => (Seq, B, F)
x_seq = x_unpacked.permute(1, 0, 2)
out = []
for x_t in x_seq:
    # update gate
    z = torch.sigmoid(
        (self.W_z @ x_t[..., None])[..., 0]
        + (self.U_z @ hidden[..., None])[..., 0]
        + self.b_z
    )
    # reset gate
    r = torch.sigmoid(
        (self.W_r @ x_t[..., None])[..., 0]
        + (self.U_r @ hidden[..., None])[..., 0]
        + self.b_r
    )
    # candidate hidden state
    new_h = torch.tanh(
        (self.W_h @ x_t[..., None])[..., 0]
        + (self.U_h @ (r * hidden)[..., None])[..., 0]
        + self.b_h  # candidate state gets its own bias
    )
    hidden = z * hidden + (1 - z) * new_h
    out.append(hidden)
out_seq = torch.stack(out)
# (Seq, B, F) => (B, Seq, F)
out_seq = out_seq.permute(1, 0, 2)
it should be a textbook implementation of GRU
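Written out as equations, a GRU step looks like the following. The last line uses the same interpolation convention as the loop above, with the update gate multiplying the old state (some references swap the roles of the two factors), and $b_h$ denotes the candidate state's own bias.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t
\end{aligned}
```

Here $\sigma$ is the logistic sigmoid and $\odot$ is the elementwise (Hadamard) product.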
perhaps, if that's the right path, then yes
yes
you'll need to restart whatever shell you were using after you apply the changes
anaconda3, right?
I actually don't know that
still this
did you restart it
restart what?
the shell
yeah i closed it and opened it again
it also might be possible the executable is somewhere in a ./bin directory
u mean in this?
which one?
sorry im not good at terminal stuff and all
ok
copy and what do i do with it?
oo alright
omg finally!
tysm lisan and matiiss!
now everything will workfine, right?
oh
I'm currently working on a mini project extracting and manipulating stock data (S&P500) to practice using NumPy and pandas, while I learn the prereq maths for ML. Can anyone recommend features to add to this project to help improve my skills (intermediate to advanced)?
is it normal to take like 4-5 seconds to get this output?
i even checked time like this but i dont think the creation of the tensor is taking the time
importing torch can take some time,
```py
import time
start = time.time()
import torch
end = time.time()
print(end - start)
```
you may want to use Jupyter Notebooks or IPython to avoid having to re-import libraries and re-load data
Predictions / forecasting, economic data, technical indicators, etc
I've been thinking of making a SMA. I guess that may be a good start
Ill try to figure out myself, then get pissed off and find someone whos already done it XD
Sma is like one pandas operation, don't stress it
The interesting question is about forecasting
For example; how accurate is SMA as a predictor
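As a rough sketch of both points: the SMA itself is a single `rolling(...).mean()` call, and its value as a predictor can be judged by comparing it with a naive baseline. The prices below are made up, and using yesterday's SMA as today's forecast is just one simple way to frame the question.

```python
import pandas as pd

# Hypothetical daily closing prices.
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0])
window = 3

# The SMA itself: one pandas operation.
sma = prices.rolling(window).mean()

# Forecast today's price with yesterday's SMA (shift avoids peeking at today).
forecast = sma.shift(1)
valid = forecast.notna()

# Mean absolute error of the SMA forecast.
sma_mae = (prices - forecast)[valid].abs().mean()

# Baseline on the same days: "tomorrow equals today".
naive_mae = (prices - prices.shift(1))[valid].abs().mean()

print(f"SMA({window}) MAE: {sma_mae:.2f}")
print(f"naive MAE:  {naive_mae:.2f}")
```

On this toy series the naive baseline actually wins, which is a common outcome for trending prices and a good reason to always compare against it.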
The forecasting I think is too difficult atm as my coding and maths skills arent to that level yet. Unless, you recommend I just go for it and see how it goes
Many things aren't so complicated to 'do' because it's simply running some code. Like: a linear regression using sklearn is maybe a dozen lines of code. You don't have to know the math, but just how to assemble the code. Once you can do that, then lots of things are possible. But, it's daunting at first
Ill give it a go. So I'll need to learn how to use the sklearn libraries or are there other libraries I could use?
Sklearn is the thing to learn. Once you know that pattern, you can then use other libraries (ie: Xgboost)
But, being able to fit a linear regression model to a historical data set, and forecast the next N days is probably a good starting point
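A minimal sketch of that starting point, assuming scikit-learn is installed. The price history is made up (and deliberately a perfect straight line so the result is easy to verify by eye), and using the day index as the only feature is the simplest possible choice, not a recommendation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical history: rises by 1.5 per day.
history = np.array([100.0, 101.5, 103.0, 104.5, 106.0, 107.5])

# sklearn expects a 2-D feature matrix: one row per day, one column (the day index).
days = np.arange(len(history)).reshape(-1, 1)

model = LinearRegression().fit(days, history)

# Forecast the next N days by extending the day index.
n_future = 3
future_days = np.arange(len(history), len(history) + n_future).reshape(-1, 1)
forecast = model.predict(future_days)

print(forecast)
```

The same `fit` then `predict` pattern carries over to most other scikit-learn estimators, so swapping in a different model later changes only one line.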
Ill have a go with it. Thing is with me, I have a bad habit of needing to know everything before I get onto a project when it'd be better if I just learn the 'basics' and then get on with it
I've only really in the past few months trying to develop that mindset
Oh, coding is like cave diving; you sometimes have to dive in and figure out where you are
There's an art to finding projects that are hard enough but solvable.
What's nice is: once you make it work, you can then take a step back and ask: 'why did it work?'
tysm for this suggestion! yeah it was taking 7.5s lol
Should I do mini projects or focus on one at a time and use as many different functions and skills on that one project (if that makes sense)?
jupyter notebook is gamechanger!
What ever you find more fun
As long as you're writing code, you're learning.
I always feel doing multiple projects overwhelms me as I feel a need to complete a project fully
Fully might be a step too far, at least for me. Nothing ever is fully done.
When I mean fully, I mean the program I created has fulfilled the criteria I wanted it to fulfill
That's fair, I wouldn't worry about your method: you have so much to learn (we all do) all that matters is that you're learning something
Thank you so much for your help
it actually kiiinda seems to be pretty big yes, when running locally on the CPU it only took 200 samples yet the graphs looked normal at least
there are like 800 batches, each containing 32 sequences
though I use 256 sequences per batch to speed up the training
100
170 epochs? 🤯
could someone pls help me in my #1240058452647084103 discussion please, I am unsure about this normal distribution i am using
Software can be done, but it's only done when you decide it's done. You can extend it forever, and it will degrade over time, not improve, if it starts to run away from the original. Make a new project instead.
Or at least, it could be, if the platform it's built on did not constantly change things that break it (outside of your control and the real reason software is never considered complete).
Unfortunately can't ever tell if Google is not just faking it. Their previous lying ruined it for themselves. With cherry picking I can make it seem as if I have solved any problem in the world.
They need to start showing a lot of cases in which it fails too.
Boston Dynamics started doing this and it helped them a lot.
(Robotics loves cherry picking, it's almost all faked)
(And when it's not, it's not impressive for those outside of robotics)
They showed that their demo videos are cherry picked, but since they are admitting it and showing how it fails most of the time, it's fine.
No need to be paranoid about showing off the failure cases.
Yeah, they went even further.
If you look at the old Google I/Os where they were presenting their AI assistant way back in like 2015, it made it look like ChatGPT-4o is now.
It was all faked though.
And that is why it was also never released.
They then made the excuse that it was because of AI safety and the public was not ready for it or some nonsense.
Yeah, and they give out the free version that you can try out, which is the most important step.
It's like with a lot of ML papers. I'm in the "show code" camp. Or it at least has to be very simple for me to implement myself either from scratch or with a small tweak in like Pytorch or something.
Also because bugs are easy in ML, and there have been cases where someone did show the code and it was just bugged and either did not work because of the reasons they thought, or when fixed worked even better
A funny thing about bugs in ML is that they can often introduce small (pseudo)random noise, which can often improve its results.
Hey folks, I am in high school and have just completed my research paper which I will be publishing to github if everything checks out, it is my first time doing something like this and would like to know your thoughts! Here is its copy - https://docs.google.com/document/d/1RMx3emOys4p3OvgNvbzYJY-gE3c-1SshNoOMt4wqC6w/edit?usp=sharing
Would love for y'all to share your thoughts, thanks!
Software really needs it because it's not tangible.
imagine reading the literature
LOL
?
it's a shitpost
What is? "review of literature"?
imagine reading the literature
LOL
alright, this is rigged
what are you on about
it's where you read all (or many of) the academic papers that relate to a topic of interest, so that you know basically everything there is to know about the state of the art for that """art"""
my first contribution to ""the body of knowledge"" was about developing a system for automating it https://www.sciencedirect.com/science/article/pii/S1532046421002999
Oh, so you are recommending I read a bunch of academic papers to get the style of writing and editing required for said papers?
idk, I was just shitposting about literature reviews.
alright
I see
when I said "imagine reading the literature", the subtext is that being up-to-date is cumbersome, and that academics secretly don't do it.
I always read a paper before bed, helps me fall asleep /s
when you read a paper, do you speculate about how many of the authors and reviewers are native english speakers?
ohk i get it
(I am yet to actually properly read a paper 😬 I have a couple I know I should actually read because they're more or less part of course content)
my advice for reading papers: don't feel bad if none of it makes sense after your first pass. or seventh pass. a lot of academics are (a) writing for an audience that presumably knows more than you and (b) are profoundly unskilled writers.
both run the same code, this is an RNN, not even a GRU
the dataset is basically a bunch of text
the first one, runs locally on my computer on the cpu, only uses like the first 200 samples from the dataset
the second one, runs on a gpu on cloud, using all, idk, like 25k samples from the dataset and it looks terrible
yep, literally copy pasted it
and I was using pretty much (but like, literally the same in terms of reproducibility) RNN implementation on numerical time series data and it worked pretty ok
so maybe it is the loss function, maybe it is the metric
you mean batch size?
uh oh, the test loss in the first one is way way above
relatively speaking, the difference is much greater in the first graph
I know, I don't like it either, we're just given some template code and that's how it be there...
yeah yeah, I have changed it, I was just running the older code cuz I thought it was working somewhat at least....
I want more people to make papers like Schmidhuber, with the funky iconic diagrams / images. Not because it's useful, but because it makes me more willing to read it.
yeahhh, it's just overfitting
Simplicity and maintainability in software often fails to account for the fact that how simple, readable, or easy to maintain something is, depends also largely on the skill of person reading it. This is obvious for mathematics, for example, a novice looking at something like a differential equation might consider it complex nonsense, but to someone more skilled it's actually much easier to understand than via plain English. However, it seems that this same realization has not been made for software at large.
it's as basic an RNN as an RNN can be basic
You can optimize for simplicity for a novice, but like in mathematics, this can greatly hold it back.
(But if what you are making is simple software (e.g. mostly business logic stuff, just an app), then this totally makes sense)
yeah, can do that
(But there is also a lot of stuff in software that has never really been measured, just promised improvement of readability and maintainability (e.g. Java style OOP, where are all the supposed gains? (i'm in the mood for hot takes)))
alright
thanks for the help btw
ah, it's 4am here, gn, thanks again
howwww is this happening 
batch size: 8, embedding dimensions: 32, rnn hidden size: 64
how much simpler can it possibly get
yeah, nope, I tried with torch.nn.RNN, it just wouldn't budge, there's clearly an issue somewhere else then
will figure it out tomorrow
it's probably something to do with the text then, maybe it's just not what RNNs are good for... it's trying to predict text, training on quotes, tries to predict the next word
why text? I didn't pick the data, but now that I think back to my previous stuff working on that other data, that only had 3 inputs and 2 outputs and it was struggling with pretty much cyclic data, this has a ton of inputs and a couple thousand outputs technically, idk... kind of a multivariate time series
oh well, I'm fairly certain my implementation of the layers is correct though
is anyone familiar with python simple diarizer package? I'm getting speechbrain error
This is the error message
SpeechBrain could not find any working torchaudio backend. Audio files may fail to load. Follow this link for instructions and troubleshooting: https://pytorch.org/audio/stable/index.html
torchvision is not available - cannot save figures
SpeechBrain could not find any working torchaudio backend. Audio files may fail to load. Follow this link for instructions and troubleshooting: https://pytorch.org/audio/stable/index.html
Traceback (most recent call last):
File "C:\Users\HP\Pycharmprojects\proyek_ta\test.py", line 6, in <module>
from simple_diarizer.diarizer import Diarizer
File "C:\Users\HP\Pycharmprojects\proyek_ta\.venv\Lib\site-packages\simple_diarizer\diarizer.py", line 9, in <module>
from speechbrain.pretrained import EncoderClassifier
ModuleNotFoundError: No module named 'speechbrain.pretrained'
Process finished with exit code 1
How can I test data for accuracy or significance of variable-to-variable dependency?
I have a weather dataset but I'm not sure if it's me or if there's data not showing significance to the temperature. Should I visualize it or use some algorithm? I'm lost so any guidance is greatly appreciated here
```
Model: "model_1"
 Layer (type)                     Output Shape            Param #
=================================================================
 input_5 (InputLayer)             [(None, 60, 55, 6)]     0
 conv2d_17 (Conv2D)               (None, 60, 55, 32)      1760
 max_pooling2d_11 (MaxPooling2D)  (None, 30, 27, 32)      0
 conv2d_18 (Conv2D)               (None, 30, 27, 64)      18496
 max_pooling2d_12 (MaxPooling2D)  (None, 15, 13, 64)      0
 up_sampling2d_5 (UpSampling2D)   (None, 30, 26, 64)      0
 up_sampling2d_6 (UpSampling2D)   (None, 60, 52, 64)      0
 conv2d_19 (Conv2D)               (None, 60, 52, 32)      18464
 conv2d_20 (Conv2D)               (None, 60, 52, 1)       33
=================================================================
Total params: 38753 (151.38 KB)
Trainable params: 38753 (151.38 KB)
Non-trainable params: 0 (0.00 Byte)
```
my input is (60, 55, 6) and the output shape should be (60, 55, 1), but why does it end up as (60, 52, 1)? how can i configure it?
I have faced this multiple times. can you guys give me a direction so that i can figure out why there is a shape mismatch?
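One way to see where the 3 columns go is to trace a single dimension through the pooling and upsampling steps from the summary. This sketch assumes 2x2 pooling with the default behaviour of rounding odd sizes down, and 2x upsampling, which is what the shapes in the summary suggest; the helper names are made up.

```python
def pool(size, factor=2):
    """Size after pooling: odd sizes lose their last row or column."""
    return size // factor


def upsample(size, factor=2):
    """Size after upsampling."""
    return size * factor


def trace(size):
    """Follow one dimension through pool, pool, upsample, upsample."""
    sizes = [size]
    for op in (pool, pool, upsample, upsample):
        sizes.append(op(sizes[-1]))
    return sizes


def round_up(size, multiple):
    """Smallest multiple of `multiple` that is >= size."""
    return -(-size // multiple) * multiple


print("height:", trace(60))  # divisible by 4, so it survives the round trip
print("width: ", trace(55))  # not divisible by 4, so it shrinks
print("padded width:", round_up(55, 4), "->", trace(round_up(55, 4)))
```

The width drops from 55 to 27 and then from 27 to 13 because each pooling step discards the odd leftover, and upsampling can only double what is left. One common workaround, offered here as a suggestion and not as the only fix, is to pad the width to a multiple of 4 before the first pooling layer (for example with Keras's `ZeroPadding2D`) and crop back to 55 at the end (for example with `Cropping2D`).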
try visualization, correlation analysis
plotting and just looking is unironically very good (monke brain good at spot patterns)
there are also some statistical tests (pearson correlation being one of the most well-known), but dont just rely on them
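A small sketch of both halves of that advice, using NumPy and made-up weather-style numbers. The first pair is close to linear, so Pearson's r picks it up; the second pair has an exact but non-linear dependency that r misses completely, which is why plotting still matters.

```python
import numpy as np

# Hypothetical readings: humidity falls as temperature rises.
temperature = np.array([10.0, 12.0, 15.0, 18.0, 21.0, 25.0])
humidity = np.array([80.0, 78.0, 70.0, 65.0, 60.0, 50.0])

# Pearson correlation coefficient: off-diagonal entry of the 2x2 matrix.
r_linear = np.corrcoef(temperature, humidity)[0, 1]
print(f"temperature vs humidity: r = {r_linear:.3f}")

# y depends entirely on x, but not linearly.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2
r_nonlinear = np.corrcoef(x, y)[0, 1]
print(f"x vs x squared:          r = {r_nonlinear:.3f}")
```

A coefficient near zero therefore does not prove that a variable is irrelevant to temperature, only that there is no straight-line relationship.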
The question box on my chat gpt account is very small. Does anyone know how to solve this issue?
Does anybody know a fast way to check models on Huggingface for their context size?
yes, accuracy is measured as number of correct predictions over number of all predictions made
there are 11k tokens
embedding (I'll assume at least 1 weight + bias, it's the built-in one) + RNN (3 params there (2 weights, 1 bias)) + linear/fc layer (so 1 weight, 1 bias)
I can get the state dict ig
or just len of .parameters()?
it's 6
I am generating text along the graphs
love amy repository moonlight murders .happiness asses shakespeare companions hideous ephemeral tissue dent laziness stale cliche anxious below theoretical distract paul ago saja buys entry goats napoleon pets fireflies fireflies yous lang glows innovation stealing ,god dearly bind profanity freaking trained recalled races princesses claw copying visualize pulses deciphered ,all ,god begun begun picks instagram tricks praised traffic conclude crash introverts dogma
this is from the GRU network (which has like 2x the params), but it shows a similar graph anyway
Hi guys, I would like to make a Facial Recognition system using tensorflow. I know basic statistics, Machine learning and data analysis. As a Beginner to Deep Learning , what am I supposed to do ? Do I see a Youtube Channel and implement it directly or do I have to learn the fundamentals? I also know Web backend for the application side of things .
yeah, it's weighted too
wait wait, when you say parameters, what do you mean exactly?
alright, slightly confused, do you mean like features or like torch.nn.Parameter parameters?
alright, I was counting Parameter objects
there are like this many parameters then 2197498
this one had 1098234
the GRU model is at 4.5M in its standard config if you will
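The two ways of counting differ a lot: 6 is the number of parameter tensors, while the totals in the millions are the number of individual weights. A back-of-the-envelope sketch in plain Python, assuming the layout described above (embedding weight, an RNN with two weight matrices and one bias, a linear layer with weight and bias). The vocabulary size of 11258 is an assumption: the chat only says "11k tokens", and this value was picked because it reproduces the quoted totals.

```python
def simple_rnn_lm_param_count(vocab_size, embedding_dim, hidden_size):
    """Count individual weights in an embedding + RNN + linear language model."""
    embedding = vocab_size * embedding_dim            # 1 tensor
    rnn = (embedding_dim * hidden_size                # input-to-hidden weight
           + hidden_size * hidden_size                # hidden-to-hidden weight
           + hidden_size)                             # single bias
    linear = hidden_size * vocab_size + vocab_size    # weight + bias
    return embedding + rnn + linear


VOCAB = 11258  # assumed, not stated in the chat

small = simple_rnn_lm_param_count(VOCAB, embedding_dim=32, hidden_size=64)
large = simple_rnn_lm_param_count(VOCAB, embedding_dim=64, hidden_size=128)

print(small)
print(large)
```

Under that assumption, embedding 32 with hidden size 64 gives the smaller total quoted above, and doubling both sizes gives the larger one. Either way, almost all of the weights sit in the embedding and the output layer, because both scale with the vocabulary size.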
alright
idk, if I don't do that, it just tries really hard to predict mostly the ones that are most frequent in the dataset
alr, I'll find a paper
although... I'll give it some time, at least it's progressing the right way
this makes sense too
found a paper, from what I can tell it suggests using a bunch of hidden layers, meanwhile I'm here using one 
if anyone's interested: https://www.isca-archive.org/interspeech_2010/mikolov10_interspeech.html
hello
Does anyone know some free API for accessing a LLM through Python? I took a look into ChatGPT but it costs money. I tried running Llama2 using Ollama in my computer but it's too slow. Is there any alternative solution?
ChatGPT but it costs money
actually it doesn't have to, there's a free tier according to here
not really. you need tokens which they dont give you. you have those limits but you have to pay for the tokens
ah I see, I stand corrected (twice in such a short amount of time too 😦 )
I doubt you'll find any online solution that doesn't cost you money, but you can try running a smaller model
thank u!
how many parameters is the llama2 you're running?
it gets as small as 7b iirc
as small as 7b in the llama series anyway
the phi series have even fewer params
1.Collect and pre-process datasets of facial expressions captured in different contexts (e.g., cultural variations, lighting conditions).
2.Train separate models for each context or develop a multi-context model that adapts its predictions based on additional input data (e.g., location, cultural background).
these are 2/5 of the tasks i received from my internship provider
how can I check this
not sure, how did you download it?
how can i provide multiple context to an image
ollama pull llama2
first downloaded and set up ollama ofc
from here I'd assume you're already running 7b (the smallest one)
try the phi series ig
in their github in the examples, it shows the gemma model which has even less params than phi
I'm not sure if this actually helps, but there are other less mainstream architectures like rwkv, mamba, etc. which might use less computing power (really am not sure on this)
im gonna see how phi3 goes. thank u for helping!
Hello everyone, sharing a notebook on association rule mining. https://www.kaggle.com/code/jaepin/association-analysis-using-grocery-dataset
I'm curious as to what the minimum support is for large transactional datasets. I've seen others go as low as 1% for apriori, but that seems too low in my opinion, and as a consequence they get trivial or inexplicable associations. Does anyone here work in retail and/or have performed association rule mining at work before? What are your thoughts/comments?
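For readers new to the terms, here is a minimal sketch of what support, confidence and lift measure, in plain Python on a made-up basket list. It is not the apriori algorithm itself, only the three quantities that a minimum-support threshold and the resulting rules are based on.

```python
# Hypothetical baskets; each set is one transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets with the antecedent, the fraction that also hold the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence relative to how common the consequent is anyway."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


pair_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
rule_lift = lift({"bread"}, {"milk"}, transactions)

print(pair_support, rule_confidence, rule_lift)
```

Here the rule "bread implies milk" has high support and confidence but a lift below 1, meaning bread buyers are slightly less likely than average to buy milk. That is the kind of trivial-looking rule that filtering on lift as well as support can help weed out.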
cohere!
Thank you guys @jaunty helm @gritty vessel
Solved?
Just woke up lol but I've got a better idea of how I'll go about it
Would Google Colab work with this?
the smaller the dataset is, the more epochs the model needs to train for high accuracy, right?
How about AWS for using AI?
When you have less data, the model is more likely to overfit on this small dataset, as there is not enough data to generalize.
Guys, who knows about the asji server or something like that?
Sorry I checked the name it is asgi
Hey everyone, why would you want to embed the tabular data when applying transfer learning for a regression task?
so, after reading that paper (#data-science-and-ml message), which suggested between 30 and 500 hidden units, I picked 50, so now I have 50 GRU layers each with a hidden size of 256, embedding dims is only 128 though, in total it's ~24M parameters
either way, an epoch takes about 10 to 15 minutes now, but at least the graphs make sense (this is after epoch 10)
the sentences are maybe not the best ones yet, but at least the graphs...
consider . until until until until is . . . .
your sometimes , . . . . . . . . . . . . . . . . . . real point dark . . . . . . . . . . the . . a rather the the , , the is the affects the the rest his the
you . . . . . . .
all humor , . . . . strength , . . earth . . . . . the just the . , . . . . . . . . the nature nature and . . . . so that . . , , like [any] than day waiting before go it . . . . . .
when . . . . . , . . , . . . , . . . .
[any] . . . .
ah [any] a . . . . . . . . . . . of all . . . . no all well make
it . . the w , , [any] . . . . . the happy to person the [any] [any] person . . . . . and you has w w a . . . . . . . . . . . . . . . w a . . . is . . . . . . . . .
i . . . . . . . . .
[any] . . . . . . . . .
my work the an the . . . .
different a the . . the make w raised .
wdym? tbf, the tokens are just whole words here
mainly practice, we just had it as a homework
No, forever
pretty sure google colab is gonna shoot the party down pretty soon 
that's not the point
this is literally a 2010 paper as well, lol
dw, I'll move on to transformers soon
may I call it DGRUN: Deep Gated Recurrent Unit Network? lol
Guys, I have achieved 61 val accuracy for my CNN human emotion classification model, I have around 3k images for each class. There are 8 classes in total, is this good? How can I potentially increase val accuracy and reduce loss values?
Hey guys, got this end result on my LSTM and i wanted to code the prediction for the next days. Can anyone help me, im super lost
uhh u dont have any non test predictions yet?
we dont even know which package you used for the predictions
need more context
No I dont
How do i make the future predictions for the next 7 days for instance
or maybe 15
def load_model_and_predict(self):
    # Load the trained model
    model = load_model('trading_model.keras')
    # Retrieve historical trading data
    rates = mt5.copy_rates_range(self.symbol, mt5.TIMEFRAME_M5, datetime.now() - timedelta(days = 30), datetime.now())
    if rates is None or len(rates) < 60:
        print("Not enough data for prediction")
        return
i have this for 30 days
uh this is not the prediction part of the code
there should be a part that calls model.predict
it should be something like this for future predictions:
n_future = 5 # Number of future time steps you want to predict
predictions = []
for _ in range(n_future):
    # Make a prediction for the next time step
    next_pred = model.predict(input_data)
    # Append the prediction to the list of predictions
    predictions.append(next_pred[0, 0])
    # Update the input data by removing the first element and appending the prediction
    input_data = np.append(input_data[:, 1:, :], [[next_pred[0]]], axis=1)
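For anyone who wants to try that loop without a trained network, here is a minimal runnable sketch of the same rolling-window idea. `LastValueModel` and `forecast` are hypothetical names made up for illustration (the stand-in just returns the last value plus one), and the window is assumed to have a single feature.

```python
import numpy as np


class LastValueModel:
    """Hypothetical stand-in for a trained model (not from the chat)."""

    def predict(self, x):
        # x has shape (batch, timesteps, features); return shape (batch, 1)
        return x[:, -1, :1] + 1.0


def forecast(model, window, n_future):
    """Roll a (1, timesteps, 1) window forward n_future steps."""
    window = window.copy()
    predictions = []
    for _ in range(n_future):
        next_pred = model.predict(window)  # shape (1, 1)
        predictions.append(float(next_pred[0, 0]))
        # Drop the oldest step and append the prediction as the newest one
        new_step = next_pred.reshape(1, 1, 1)
        window = np.concatenate([window[:, 1:, :], new_step], axis=1)
    return predictions


window = np.arange(5, dtype=float).reshape(1, 5, 1)  # values 0..4
preds = forecast(LastValueModel(), window, n_future=3)
print(preds)
```

With several input features (close price plus indicators, say) the model only predicts one of them, so the other columns of the new step would have to be recomputed or estimated before appending it.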
oh wrong reply xD
.
i have this predictions call
def load_model_and_predict(self):
# Load the trained model
model = load_model('trading_model.keras')
# Retrieve historical trading data
rates = mt5.copy_rates_range(self.symbol, mt5.TIMEFRAME_M5, datetime.now() - timedelta(days = 30), datetime.now())
if rates is None or len(rates) < 60:
print("Not enough data for prediction")
return
# Prepare the data for prediction.
close_prices, ma5, rsi_values, cci_values = self.get_indicators(rates)
features = np.column_stack((close_prices, ma5, rsi_values, cci_values))
features_scaled = self.scaler.transform(features) # Use the same scaler as during training.
# Prepare input data for the LSTM model
X, y = self.prepare_data(features_scaled)
# Make predictions using the loaded model
predictions = model.predict(X)
# Here you can do whatever you need with the predictions.
print("Predictions:", predictions)
And the results i have this,
177/177 ββββββββββββββββββββ 8s 45ms/step - loss: 0.7418 - val_loss: 3.4327
Epoch 25/30
177/177 ββββββββββββββββββββ 8s 44ms/step - loss: 0.7563 - val_loss: 3.4275
Epoch 26/30
177/177 ββββββββββββββββββββ 8s 44ms/step - loss: 0.7597 - val_loss: 3.4299
Epoch 27/30
177/177 ββββββββββββββββββββ 8s 45ms/step - loss: 0.7648 - val_loss: 3.4291
Epoch 28/30
177/177 ββββββββββββββββββββ 9s 51ms/step - loss: 0.7499 - val_loss: 3.4423
Epoch 29/30
177/177 ββββββββββββββββββββ 8s 45ms/step - loss: 0.7444 - val_loss: 3.4330
Epoch 30/30
177/177 ββββββββββββββββββββ 9s 50ms/step - loss: 0.7396 - val_loss: 3.4489
197/197 ββββββββββββββββββββ 5s 21ms/step
Predictions: [[-0.17073686]
[-0.17073686]
[-0.17073686]
...
[-0.17073686]
[-0.17073686]
[-0.17073686]]
but i know i have a mistake in my code that i'm trying to fix. it's crazy though, because my trading bot makes profits and predicts the movement of the trading signal and trend
Time: 2024-05-15 18:15:00, Open: 1.08653, High: 1.08666, Low: 1.08633, Close: 1.08658, Direction: buy
Time: 2024-05-15 18:20:00, Open: 1.08658, High: 1.08692, Low: 1.08657, Close: 1.0868, Direction: buy
Time: 2024-05-15 18:25:00, Open: 1.0868, High: 1.0872, Low: 1.08669, Close: 1.08715, Direction: buy
Time: 2024-05-15 18:30:00, Open: 1.08716, High: 1.08744, Low: 1.08708, Close: 1.08723, Direction: buy
Time: 2024-05-15 18:35:00, Open: 1.08723, High: 1.08724, Low: 1.08686, Close: 1.08692, Direction: sell
Time: 2024-05-15 18:40:00, Open: 1.08692, High: 1.08697, Low: 1.0867, Close: 1.08678, Direction: sell
Time: 2024-05-15 18:45:00, Open: 1.08678, High: 1.08687, Low: 1.08633, Close: 1.08633, Direction: sell
RSI: 57.673367175840006
last value CCI: 55.203492465835076
MA5: 1.104268
Current Price: 1.1042
Cjurrent Price EURUSD: Bid - 1.08751, Ask - 1.08751
Current Signal trade: SELL
current trend: DOWN
Trend Down, Recommend sell.
MA5: 1.104268
Current Price: 1.1042
Current Trend: UP
succefull order buy send
Stop Loss: 1.08733, Take Profit: 1.08798
Hey Guys, any suggestions on which would be the best resources to practice Python for Data Analysis? Some free alternatives for Leetcode and Stratascratch?
Do any of you feel that ChatGPT just makes you worse off, gives you wrong answers and wastes time by saying you are wrong?
not sure about the last one, more often than not it should be the other way around (you wasting your time telling GPT that it is wrong), but for the first half yes
yes, ChatGPT always gives wrong answers
but you have to ask for the right answer and show it the mistake
I do not even use it much. I put my 3000 hours into this before even touching ChatGPT.
Had the sickest two-part tariff ever. Also, people should use two-part tariffs and stuff for market segmentation
that is money on like an insane level
bruv why use google colab
is it just because its like a virtual gpu or smthing
i never used it i just want to know
because so many ai tutorials on youtube use colab like why not just use vscode
gpu usage indeed
ah ok
anyone?
I do not even use it. I question myself and I am like: "I would have been done if I just went to stack overflow for a quick answer"
Was ChatGPT always this bad?
hello guys
what would be the best way to correct a word: given an incorrect input, the closest word from a premade dictionary gets predicted
eg: input word "BAD"
word from dictionary "BAT"
so basically we are getting some word prediction from a model, but to increase its accuracy further we are trying this method
pls mention me whenever any of you responds
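One simple baseline for this, before reaching for a neural network, is plain string similarity against the dictionary. The sketch below uses `difflib` from the standard library; the word list, the `correct` helper and the `cutoff` value are all made up for illustration.

```python
import difflib

dictionary = ["BAT", "BALL", "CAT", "DOG"]  # hypothetical word list


def correct(word, vocab, cutoff=0.6):
    """Return the closest dictionary word, or the input if nothing is close enough."""
    matches = difflib.get_close_matches(word, vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else word


print(correct("BAD", dictionary))
```

This only looks at character overlap, so it knows nothing about context or about which words are more likely; it is a starting point to compare fancier models against.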
using python and experimenting without looking at the answers
only look at documentation
u can just take data from kaggle and experiment on ur own
learn matplotlib, sns, plotly(maybe)
if u are trying to do data analysis only i would just ditch python altogether for R
R slaps python in anything not deep learning
and even then it has h2o which is industry standard level
tidymodels also work very good
but is not meant for deep learning
gonna sound like a broken record, but you could try an RNN class network
but for real though, an RNN + DQN might do the trick here, where it learns to prioritize the words in the dictionary...

eh, probably, attention and all, speaking of which, I gotta go and revise that, btw, colab shut the party down... I got until epoch 16, it wasn't making much sense in rollouts, but it was continuing to improve by the other metrics
actually, some of the stuff it was generating made a teeny tiny bit of sense
these were some of my favourites
- it every much meaning god want suffering suffering idea w first ideas element inclined . . . . long . [any] happy off knew still . . even same making even come [any] . . the you . you . is is . . . you . . . . . . . [any] . those the
- your believe the the . . lot is gives both give science believeth science science . out .
- your w against the is the the , . a . out . out . , your . been i . great nobody day day worst important how ever - look passion pleasurable alec times kisses aslan aslan alone . knows best found doesnt waffle waffle aslan st value something . . soul is . to . to to the the the
- different the is w w w fact . the the . the w determination isnt because today ?its wednesday end . end success world used the the better [any] between go
How do you guys work on solo projects, i.e finding huge data sets
Hugging face Fine Web dataset is huge
Thanks buddy
how many epochs? (which ig is a silly way to compare them, but oh well...)
training loop as in the code for it?
Any idea of models trained on desktop and software interactions, capable of recognizing UI elements of OS, softwares and windows, without relying on HTML (ie. Electron App)
for epoch in range(1, 1000 + 1):
    for dataloader in [dataloader_train, dataloader_test]:
        losses = []
        accs = []
        if dataloader is dataloader_train:
            model.train()
            torch.set_grad_enabled(True)
        else:
            model.eval()
            torch.set_grad_enabled(False)
        for x_padded, y_padded, x_length in tqdm(dataloader, file=sys.stdout):
            x_padded = x_padded.to(DEVICE)
            y_padded = y_padded.to(DEVICE)
            x_packed = pack_padded_sequence(
                x_padded, x_length, batch_first=True, enforce_sorted=False
            )
            y_packed = pack_padded_sequence(
                y_padded, x_length, batch_first=True, enforce_sorted=False
            )
            y_prim_packed, _ = model.forward(x_packed)
            idxes_batch = range(len(y_packed.data))
            idxes_y = y_packed.data # (B * Seq, F)
            loss = -torch.mean(
                torch.log(y_prim_packed.data[idxes_batch, idxes_y] + 1e-8)
            )
            losses.append(loss.cpu().item())
            idxes_y_prim = y_prim_packed.data.argmax(dim=-1)
            acc = torch.mean((idxes_y_prim == idxes_y) * 1.0)
            accs.append(acc.cpu().item())
            if dataloader is dataloader_train:
                loss.backward()
                optimizer.step()
                optimizer.zero_grad()
Are you using lightning + tensorboard?
I have no idea what either of those are
Aha, there's a whole load of boilerplate you can be saved from
https://lightning.ai/docs/pytorch/stable/starter/introduction.html PyTorch Lightning reduces the amount of boilerplate you need to write for training models (just write the step (forward pass + how to compute the gradients) and it does the rest, including things you typically want like early stopping and so on).
Also integrates well with Tensorboard https://lightning.ai/docs/pytorch/stable/visualize/logging_basic.html, plots the val and train loss
Very nice, thank you
I'm typically very skeptical of 3rd party deps but lightning automates all the things you don't want to be spending time on so you can just focus on getting the architecture right and you can forget the rest of the plumbing
odd you say? you want the loader or the dataset? cuz the loader is torch...DataLoader
hello
I assume this is what you want to see
def __getitem__(self, idx):
    x_raw = np.array(self.final_quotes_sentences[idx], dtype=np.int64)
    y = np.roll(x_raw, -1)
    # lag
    # [this is fun] => [is fun this]
    y = y[:-1]
    x = x_raw[:-1]
    x_length = len(x)
    pad_right = self.max_sentence_length - x_length
    x_padded = np.pad(x, (0, pad_right))
    y_padded = np.pad(y, (0, pad_right))
    return x_padded, y_padded, x_length
i cannot understand a single line of code in it--
oh right, there's this thing where it's padded with a bunch of zeros
that actually might be throwing it off a bit
because the rnn cells just include those 0s during the recurrence
instead of one hot encoding and dotting it just takes out the values by index
I'm aware
there is, idxes_y is the ground truth
Question, will python help me become a Data Scientist?
yeah
y_packed is the ground truth
look, I didn't write all of this code myself
we are given templates
the output of the model is not a token, it's a 1d tensor of size equivalent to the number of classes
it sure is a tool to be used in the field, but only you can help yourself
Alright, thanks.
I'm a teen learning Python, with the only ambition of mastering it, I just wanna know what benefit can I assert myself if I learn this language?
so, you can either one hot encode the token and you get a one hot tensor that's the same size as the output of the model and then you dot them together or you can just index the output tensor
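To make the "one hot and dot vs. just index" point concrete, here is a small NumPy sketch (NumPy instead of torch only to keep it dependency-light; the probabilities and targets are invented). Both routes pick out the same per-position probability of the true token, which is what the negative log-likelihood is computed from.

```python
import numpy as np

# Hypothetical model output: 3 positions, 4 classes, each row is a probability vector
probs = np.array([
    [0.1, 0.7, 0.1, 0.1],
    [0.25, 0.25, 0.25, 0.25],
    [0.6, 0.2, 0.1, 0.1],
])
targets = np.array([1, 3, 0])  # ground-truth token ids

# Option 1: one-hot encode the targets and take the dot product row by row
one_hot = np.eye(probs.shape[1])[targets]
picked_dot = (probs * one_hot).sum(axis=1)

# Option 2: index the probabilities directly with (row, target) pairs
picked_idx = probs[np.arange(len(targets)), targets]

# Negative log-likelihood, with the same small epsilon as in the training loop
loss = -np.mean(np.log(picked_idx + 1e-8))
print(picked_dot, picked_idx, loss)
```

Indexing skips building the one-hot matrix, which matters once the vocabulary has tens of thousands of entries.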
Well, Python is widely used in the field and has a large ecosystem of libraries and such
I'm certain it'll help me get a decent pay job as well?
I'll dedicate most of my time learning this language, to be fair.
I agree
what did you find odd about this loss graph though?
I would have, but I think I ran out of the free colab compute minutes or however they measure that
I usually run models in paperspace now, I'll give it another try and more epochs, yeah
it's a bit of a first that epochs take this long too
it was GRU specifically btw
they were SOTA token predictors in 2k10 
though interestingly the paper did say it took them 6 hours to train the model
https://ai.google.dev/competition if anyone is interested
btw, what is steps? is that how many batches have gone through? so like if you have say 10 batches per epoch and 5 epochs, that's 50 steps?
oh, right, so like, stepping opposite the gradient
yeah, gotcha, I see
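The arithmetic from that question, as a tiny sketch. It assumes one optimizer update per batch (no gradient accumulation); `total_steps` is just an illustrative helper.

```python
import math


def total_steps(n_samples, batch_size, epochs):
    """Number of optimizer steps when every batch triggers one update."""
    steps_per_epoch = math.ceil(n_samples / batch_size)  # last batch may be smaller
    return steps_per_epoch * epochs


# 320 samples in batches of 32 is 10 batches per epoch; over 5 epochs that is 50 steps
print(total_steps(n_samples=320, batch_size=32, epochs=5))
```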
this is something I'd do with tensorboard
For what it's worth, I like Matiiss' approach, even if the models they're trying suck it's a good didactic experience
hey
Tbf babysitting models and doing "grad student descent" isn't what you should be doing though
uh oh
I used to do this when I was a masters student myself. Now I just code up hyperparameter tuning stuff after doing 1-2 trial runs and let GPU go brrr
It's far better than tweaking things manually - it doesn't work nor scale
Unless you have intuitions for why a certain hyperparameters should be set to a specific value
I read papers, drink coffee or just idk sleep and monitor my logs on tensorboard (train/val curves), optuna (hyperparameter importance) and mlflow (hyperparam results obtained per model instance) every so often
I don't think I used their hyperparams, it was just suggested to have between 30 to 500 layers of RNN hidden stuffs, they used 100 in their paper, I used 50, didn't touch other params
I'm not sure there were any hyperparams listed though 
I think there's actually funny insight from the models I'm running
they're kiiinda scattered around, those hyperparams
they used a learning rate of 0.1 or 0.3, something along those lines
"worse" architectures have better results because they get to do more hyper parameter searches / hour
Are you just trying to recreate their results?
well, no, I just don't want my model to suck
What's your reasoning?
I mean, reading their paper did help, cuz I went from having 1 hidden layer to 50... that like, severely improved performance
Oh, you said don't not use Adam
I think if you're willing to use SGD with a very low learning rate and high patience on your early stopping it might stomp adam
At least, that's what I've seen empirically
But yeah it's annoying advice but the easiest way to improve models is to actually hyperparameter tune them "automatically"
alrighty, I'm confused, the paper mentions hidden/context size several times and I don't understand whether that means size of matrix or how many matrices there are
Size of context (hidden) layer s is usually 30 – 500 hidden units.
https://www.isca-archive.org/interspeech_2010/mikolov10_interspeech.pdf
I mean, given current empirical evidence of what I have seen, it seems to mean how many matrices there are (i.e., how many RNN cells), but that means that the hidden size of a single cell is not mentioned as far as I can tell
no
Are you using learning rate halving?
I think it's just 1 hidden layer with 30 to 500 neurons, could be wrong tho
Size of context (hidden) layer s is usually 30 – 500 hidden units.
mmm, cuz I thought it basically meant this...
I don't think there's hidden-to-hidden connections here
cool...
well, I did try 256, 512, 1024 (well, whatever they're called) for a single hidden layer, that yielded pretty bad results
chaining hidden layers immediately improved the model
I was gonna move to transformers, but then I noticed that transformers use batch norm... so, I decided to take a deeper look at that again and reading the original paper, idea being that it would speed up the model I had...
oh well, starting with the batch norm paper wasn't too bad of an idea either
However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks
https://arxiv.org/pdf/1607.06450 (Layer Normalization)
whenever i see ur name the dude comes to my mind, not the protagonist but the old person that shouts "As its written" xD
the thing that is sus about batch norm is that the authors seem to change their mind if it should be before or after the activation
how do u guys find the time to read papers xD
its beyond me
how the f
well if ur finances are ok i would do the same
Hey folks
is it allowed to share projects one is working on?
or do we need permissions from mods?
What are the best resources for meta-prompting?
why would u send a whole project at once?
just ask the questions in snippet form
hey guys new to the data science and ai subject basis any recommendations on what i should learn
first*
Im in the same boat. Tbh the way I've been learning changes a lot. For the maths, read the pinned messages for the resources and for the coding, look up Python for Data Analysis by Wes McKinney
get comfortable in either R (perfect for anything other than deep learning or deployment) or python (good for deep learning and deployment)
you too
deployment means making a server ready for the model with everything set up for the machine to work
then try to implement/use and learn the ml algorithms, no need to learn EVERY ml algo since there are bajillions of em, just learn the big ones (as in popularity) such as linear/logistic regression, svc, random forest
this should take u at least 1 month
after that come back here
but deployin models require more than necessary computer knowledge
thats the hard part of getting experience for data science
without proper equipment u cannot really see real world applications
like u gotta setup a server (requires networking and linux knowledge) or buy/rent a server, you gotta set up api (superficial webdev knowledge) etc etc
true tho if u work on these than u become what industry wants
but its too much for someone new
cuz if they dont then they most prob cant xD
heyyy im a student as well
tru
is that french?
or latin
oh its like saying thus my hypothesis is the truth
or maybe demonstration complete?
i dk
i will never create a proof
i hate proofs
yea but they are not presented to a person to grade
blegh
compiler is mah bitch
i would prob just remove all the safety from the compiler
i cannot have a compiler say what i can and cannot do
yea that was counterintuitive
actually, i need for the compiler tell me what i can do
now that i think about it
since if it doesnt, then im not using that language at all
hmm
i need to eat, all this thinking made me hungry
do u like trying new food?
or maybe i should dm u xD
I swear to god, i fucking hate matplotlib. im trying to set the x axis to be labelled with the years from a dataframe and have it formatted so its clear to read but every solution i find is not working
bro just switch to R
ggplot is like heaven compared to matplotlib
or do cell magic to just run r in one cell of the notebook
if u ever want to change dm me, ill teach u
the ways of R
ill be alright XD. ive briefly used r before. simple af to use but sticking to python for now.
suit urself, R is heaven
AgglomerativeClustering.__init__() got an unexpected keyword argument 'affinity'
Any idea why I can't pass affinity?
cuz init method of the class doesnt take that argument?
which package is it?
hc = AgglomerativeClustering(n_clusters = 5, affinity='euclidean', linkage = 'ward')
i found this stackoverflow post https://stackoverflow.com/questions/53849107/sklearn-agglomerative-clustering-custom-affinity
So I should change affinity to metric?
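For context: scikit-learn deprecated the `affinity` parameter of `AgglomerativeClustering` in 1.2 in favour of `metric` and removed it in 1.4, which is why newer installs reject it. A minimal sketch on made-up toy data (the blobs are invented purely for illustration):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
# Hypothetical toy data: 5 well-separated blobs of 20 points each
centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 20]])
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

# linkage="ward" only supports Euclidean distances, which is also the default,
# so the old affinity='euclidean' argument can simply be dropped.
# On scikit-learn >= 1.2 it can be spelled out as metric="euclidean" instead.
hc = AgglomerativeClustering(n_clusters=5, linkage="ward")
labels = hc.fit_predict(X)
print(np.bincount(labels))
```

Leaving the argument out keeps the call working on both old and new scikit-learn versions.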
never used the method so i have no idea
Manage to fucking solve the issue. I may switch to R if matplotlib makes me want to commit an act of terrorism
it did to me, multiple times, now if i am to do visualisation i just dont use python at all
dplyr is superior to pandas anyways
Ive only just started using matplotlib and already it feels fucking frustrating to use. Going to start a few guided ML projects just to get a grasp on what ML programming is like
Ill see how it goes as I advance. Im still very much a beginner. Tbh ive only been practicing with pandas plus a bit of matplotlib; so really just doing the data prep and cleaning of data if thats what its called
pandas is pretty much just numpy so it's really all the same
huge thanks!
what does this mean, instead of calling it a dataframe you call it an n-dimensional array? haha
kaggle has a huge dataset db
is that one weird exception when dtype is a list of named types?
which I assume they still probably sort of organize it such that it's multiple arrays with same type of data
ah
this got me looking at the pandas source code
and
they have a class called ExtensionArray which is basically what all their other "arrays" are inheriting
which is why it can support so many different types of data
exactly yeah
oh I absolutely agree
I'm saying this is why they are slower
I don't know why we have so many fancy ds libs when pretty much everything can be done with pure numpy
patsy is cool too
lower level stats modeling used a lot with numpy
hahahaha yeah
what that means to me is it defines the models but doesnt execute the computations, but its mostly used for like linear modeling
Can anyone help me find a better road map for ai and ml
Hey guys so I just finished studying about regression , classification and clustering . I did do some projects also .. What should be my next focus?
Neural Networks or NLP or Computer Vision or Generative AI?
good question, i think by declaring the models it makes the calculations faster
I just know its used in statsmodels which I used to use a lot
yeah they use cuda python right?
right right
The Release Notes for the CUDA Toolkit.
The worst api ever is pyspark man
just be glad you don't have to use that haha
well they function very differently than pandas or R dataframes, which like you said makes it more reliable and faster
I'm just complaining because it took me a while to get used to... its basically a wrapper for apache spark so it was kinda like learning a new language
But I'm pretty used to it now. and I love the concept of cluster compute
really? on a single machine?
I guess yeah if you don't force it to infer the schema maybe, but without a cluster polars has been the fastest library for general EDA
for me
Hey everyone, I know it's a really simple question but why am I getting this error?
stack expects each tensor to be equal size, but got [30432, 72, 8] at entry 0 and [30432, 1, 1] at entry 1?
Can't I have different shapes for X and y?
you can
!paste could you share your code?
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
and the entire traceback as well
Here is the code and traceback..
https://paste.pythondiscord.com/FMJA
Basically what it does is create X, which has shape (30432, 72, 8), and y, which should normally be (30432, 1), and then my neural network learns from this
alright, well, that's the data, but how are you processing it? like, what's the model/network?
pyspark definitely has better support for non tabular data, which seems to be your use case
I'm pretty sure the problem isn't in the network; previously, when I was using a plain PyTorch dataset, it was working like a charm. Now when trying Lightning I get errors...
well, does the model's input and output match your data?
yep, it does. I think it's due to the L.LightningDataModule. Cuz when I comment out the y part and create an empty numpy array shaped the same as X, it starts to work..
can you still show the model
I just added them here. Model is a little bit not really correct now because of the data shape this data loader model spits out. But with pytorch normal dataset, I used to be able to run it with X tensor size (32,72,9) and Y tensor (32, 1)
https://paste.pythondiscord.com/5Z4A
I think the error is there because the val_dataloader() method spits out a tensor of size torch.Size([2, 30432, 72, 9]), which is not what it should spit out
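The pasted code is not reproduced in the log, so this is only a guess, but both symptoms (the stack error and the leading dimension of 2) are what you get when a list or tuple like `[X, y]` is handed to `DataLoader` as if it were the dataset: it then has length 2, and its two "samples" are the whole X and the whole y. A sketch of the usual fix with `TensorDataset`, using smaller stand-in tensors:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins for the real arrays (smaller N to keep it quick)
X = torch.randn(100, 72, 8)  # (samples, timesteps, features)
y = torch.randn(100, 1)      # one target per sample

# TensorDataset pairs X[i] with y[i], so each dataset item is one (x, y) sample
# and the default collate function stacks the x's and the y's separately.
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, shuffle=False)

xb, yb = next(iter(loader))
print(xb.shape, yb.shape)
```

Inside a LightningDataModule the same `DataLoader(dataset, ...)` would be what `train_dataloader()` and `val_dataloader()` return.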
I need help with reading data from a CSV file.
Here's the code:
import pandas as pd
file_path = "Train.csv"
df = pd.read_csv(file_path)
here's the error:
DtypeWarning: Columns (1,7,16,35) have mixed types. Specify dtype option on import or set low_memory=False.
df = pd.read_csv(file_path)
about the CSV file:
it has around 98K lines and 40 columns.
some data values are missing, and data consist of several types int, float, string, etc.
that's not an error, it's more of a warning telling you that some columns have mixed data types
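Two common ways to deal with that warning, sketched on a tiny made-up CSV (the column names and values are invented; the real Train.csv will differ). Besides the `low_memory=False` option that the warning itself suggests, you can tell pandas the dtype of the offending columns, or read them as text and convert afterwards.

```python
import io

import pandas as pd

# Hypothetical miniature of Train.csv: the "code" column mixes numbers and text
csv_text = "id,code,amount\n1,100,1.5\n2,A7,\n3,300,2.0\n"

# Option 1: declare the dtype of the mixed column up front
df = pd.read_csv(io.StringIO(csv_text), dtype={"code": str})
print(df.dtypes)

# Option 2: if the column should be numeric, coerce bad values to NaN afterwards
df["code_num"] = pd.to_numeric(df["code"], errors="coerce")
print(df)
```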
sorry, I have no idea at this moment
all these norms are kinda interesting btw, batch, instance, group, layer..., well, tbf, ig group norm could be considered a generalization of instance and layer norms, but still
whats interesting about them @spring field ?
that they normalize stuff
idk, they speed up training and such
Does anyone have a good understanding about Generative Adversarial Neural Networks, especially Super-Resolutions GANNs and Enhanced Super-Resolutions GANNs?
it's best to just ask your question directly instead of asking whether someone can answer the question first, it's just an unnecessary step
I actually want to have a conversation around this topic?
Which is the reason to my post above.
If anyone can best explain what is a GANN, and how it works (preferably in basic english).
Well, then you may want to begin that conversation, otherwise someone will have to ask you what you want to talk about which again is an unnecessary step, just begin.
might I suggest https://www.youtube.com/watch?v=Sw9r8CL98N0
I have seen this video before
I think there is a way to make a thread, so that way, I don't have a conversation about this on this channel.
But I am not sure how to do that @spring field
You can create a post in #1035199133436354600 but I don't think that's what you want, I suggest just sticking to this channel
it's not thaaat active anyway
Thank you...
Now, do you know much about the subject asked: #data-science-and-ml message
Have you had any personal experience about this?
What are your thoughts?
Not yet, in the coming weeks I will actually learn about them, so, I'll certainly be able to chime in then, maybe a bit sooner
If that is the case, would you permit me to collaborate with you?
I would like to embark on this journey with you and we can both share knowledge together.
What do you say? @spring field?
Thank you for the invitation, but I'll have to decline it for now
Ok
Does a tensor turn into a list if I slice it?
I got this traceback error message
Traceback (most recent call last):
File "C:\Users\HP\Pycharmprojects\proyek_ta\test.py", line 42, in <module>
result = model.transcribe(speech)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\HP\Pycharmprojects\proyek_ta\.venv\Lib\site-packages\whisper\transcribe.py", line 135, in transcribe
decode_options["language"] = max(probs, key=probs.get)
^^^^^^^^^
AttributeError: 'list' object has no attribute 'get'
Here's a snippet of the code in question
signal, fs = torchaudio.load(WAV_FILE)
model = whisper.load_model("base")
segments = diar.diarize(WAV_FILE, num_speakers=NUM_SPEAKERS, outfile="segments.txt", silence_tolerance=2)
speeches = []
for segment in segments:
    speech = signal[:, segment['start_sample']:segment['end_sample']]
    result = model.transcribe(speech)
    speeches.append(f"speaker_{segment['label']}: {result}")
you can just print(type(speech)) before passing it to model.transcribe
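Slicing does not change the type, so the list in the traceback is probably coming from inside Whisper. One likely cause, stated here as an assumption rather than something confirmed in the chat, is the shape: `torchaudio.load` returns `(channels, samples)`, so `signal[:, a:b]` is still 2-D, while `transcribe` appears to be written for 1-D mono audio. The sketch below uses a NumPy array as a stand-in for the tensor just to show the shapes; indexing works the same way on tensors.

```python
import numpy as np

# Hypothetical stand-in for the (channels, samples) array from torchaudio.load
signal = np.zeros((1, 16000), dtype=np.float32)
start, end = 4000, 8000

speech_2d = signal[:, start:end]  # slicing keeps the type; shape is still (1, 4000)
speech_1d = signal[0, start:end]  # picking channel 0 gives a 1-D array, shape (4000,)

print(type(speech_2d).__name__, speech_2d.shape)
print(type(speech_1d).__name__, speech_1d.shape)
```

If that is the cause, passing the 1-D version (and making sure the audio is at the sample rate Whisper expects) would be the thing to try.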
oh right
i just got a new laptop and im using anaconda after like 1 year, from last 1 year i was coding on online platforms like google colab and kaggle now im gonna use anaconda
and now i dont need online link of data to connect to my python platform
would anaconda be good for my every data analysis project?
i mean i wont need to install anything different for any other new library of python for data analysis right?
anyone using vLLM to host the LLM as an webserver API to interact with? The question is how do I send multiple prompts via a post API request to generate in parallel? (as I can with the offline inference of vllm llm class)
can someone give me a roadmap to get a job in data science (currently learning basic python)
Like what tech stack do I need to become good at data science
https://roadmap.sh/ is one I see people recommending
there is a roadmap for data analyst. Is it the same as data science?
well next to that is AI and Data Scientist
is learning the django framework a good idea for data science?
probably not
it's a massive framework on its own
I mean, ig you could learn just enough for whatever you need, but I'd maybe suggest first picking up flask if you really need to build a backend for some DS project or sth, it's much more lightweight
but generally, that's backend related, probably not that much of a concern for you in the DS field, especially not now, since now I assume you're just starting, so, just focus on DS
okie
not needed at all, just use fastapi
guys
Not required
Did you get the laptop @orchid forge
Creating a Series by passing a list of values, letting pandas create a default RangeIndex.
what does RangeIndex mean here and why does it create a default RangeIndex?
ya
Its a range starting from 0 up to the no. of rows you have in a DF
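A quick way to see it (the values are arbitrary): when no index is passed, pandas labels the rows 0, 1, 2, ... with a `RangeIndex`, which only stores start, stop and step instead of every label.

```python
import pandas as pd

s = pd.Series([10, 20, 30])  # no index given, so pandas creates a default one
print(s.index)               # RangeIndex(start=0, stop=3, step=1)

s2 = pd.Series([10, 20, 30], index=["a", "b", "c"])  # explicit labels instead
print(s2.index)
```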
No isoos
im studying from the beginning
Which laptop is it btw?
Its alright
I suggested a HP i7 that day
How would u guys show off a scraping project? I have one where I thought the process of getting the data and formatting it was pretty interesting and I wanna show the process, not just the code.. Ive been writing up a word document but idk if there's like a standard way to do this or what
True
Hey guys so I just finished studying about regression , classification and clustering . I did do some projects also .. What should be my next focus?
Neural Networks or NLP or Computer Vision or Generative AI?
have you implemented ml algos in python?
from scratch just using numpy?
do that
lol im scraping video thats y i kinda wanna focus more on the process
data isnt useful besides for training data i just wanna show that I was able to use the hidden api and construct the footage from raw .ts
maybe project just sucks idk
im a student just trying to get some scraping gigs on upwork rn
want to show ppl that i can do anything
prolly will stick to the word document and copy some of it to the github readme, and link to the repository on the document
that way I can have the document to show on upwork and the github for my actual resume applying for internships and stuff
well the readme at least I'd have the github regardless ig
If I was training a network to recognize bus, bike and car in an image, would the output neurons look something like this?
yeah was considering that but thought it might be overkill.. once I have a website for some other projects I'll throw it up there but it's not really worth for just this one for me
anyone familiar with regression models?
Just ask the question plz.
Henlo?
I'm running into some problems, I'm trying to use Jupyter…
I tried downloading the module from the terminal via pip but it's still showing the error
Any help is appreciated
hi guys does anyone know if there's a pretrained model from huggingface that classify tone/pitch, avg. decibels, max. decibel from audio?
is it easier than training my own model to do classification task?
i only have 10 hours of training data without classification label yet only transcription
#importing libraries
import numpy as np
import pandas as pd
#pull the data
dataset = pd.read_stata("eitc.dta")
#preparing dummy variables
dataset['post93'] = np.where(dataset['year'] > 1993, 1, 0)
dataset['mom'] = np.where(dataset['children'] > 0, 1, 0)
dataset['mom_post93'] = dataset['mom'] * dataset['post93']
#Isolate X and Y variables
Y = dataset.loc[:, 'work'].values
X = dataset.loc[:, ['post93', 'mom', 'mom_post93']].values
#Do logistic regression
import statsmodels.api as sm
X = sm.add_constant(X)
model1 = sm.Logit(Y, X).fit()
model1.summary(yname = "Work",
               xname = ("intercept", "After 1993", "Is mom",
                        "Mom after 1993"),
               title = "Impact of tax credit on employment - model1")
Can anyone explain why a constant/intercept is needed to perform the regression?
try to create a line passing through these points using only y = a*x, without an intercept b
Without an intercept your regression line would always go through (0,0) for instance
Think about modelling time spent studying versus your grade. Let's assume the relationship is linear..if you study 0 hours your grade isn't 0/100 right?
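The studying example can be sketched in a few lines, assuming NumPy and made-up data generated from grade = 40 + 5 * hours. Without the column of ones (the "constant"), the fitted line is forced through (0, 0) and the slope has to compensate:

```python
import numpy as np

# Hypothetical data: hours studied vs. grade, generated from grade = 40 + 5 * hours.
hours = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
grade = 40.0 + 5.0 * hours

# Fit y = a*x (no intercept): the line is forced through (0, 0).
a_only, *_ = np.linalg.lstsq(hours[:, None], grade, rcond=None)

# Fit y = a*x + b by adding a column of ones (the "constant").
X = np.column_stack([hours, np.ones_like(hours)])
(a, b), *_ = np.linalg.lstsq(X, grade, rcond=None)

print("no intercept:   slope =", a_only[0])
print("with intercept: slope =", a, "intercept =", b)
```

The fit without an intercept still runs without error; it just answers a different (more restricted) question, which is why libraries do not reject it.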
impossible
so shouldn't the program return an error if i don't pass a constant/intercept with that logic?
because the program should perceive something impossible as an error
I see
but can you kindly inform me how this constant/intercept is made
and how this constant/intercept affects my other values
By solving an optimization problem. Others call it curve fitting.
i think i got it. ty
Yeah, you can look up maximum likelihood estimation (MLE) and you'll find tons of resources explaining it in detail
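As a rough illustration of "solving an optimization problem", here is a sketch of maximum likelihood for logistic regression using Newton's method in NumPy. The data, the true coefficients (-1.0 and 2.0) and the iteration count are all invented for the example; this is not a claim about how any particular library implements its fit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one binary feature, true intercept -1.0 and slope 2.0.
x = rng.integers(0, 2, size=5000).astype(float)
p_true = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * x)))
y = (rng.random(5000) < p_true).astype(float)

X = np.column_stack([np.ones_like(x), x])  # first column is the constant
beta = np.zeros(2)                         # [intercept, slope], starting guess

# Newton's method: repeatedly step towards the maximum of the log-likelihood.
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))       # predicted probabilities
    gradient = X.T @ (y - p)                  # gradient of the log-likelihood
    hessian = -(X.T * (p * (1.0 - p))) @ X    # its second derivative
    beta = beta - np.linalg.solve(hessian, gradient)

print("estimated intercept and slope:", beta)
```

The intercept is just one more coefficient: it multiplies the column of ones and is estimated together with the others, so leaving it out changes what the remaining coefficients have to absorb.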
now *coughs aggressively
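I don't know statsmodels' (clunky) API well, but as far as I know sm.Logit with plain arrays does not add an intercept by default, which is why the code calls sm.add_constant; the formula API does add one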
Using scikit-learn feels a lot more natural
Does anybody here have any experience working with Yolov8?
Trying to get it to read some documents, hoping I could get a question or two answered.
Ask your question directly please
Sure thing.
Right now, I'm trying to use it as an OMR for detecting checkmarks/unfilled checkboxes in a series of documents. I've fed it ~50 yes boxes and no boxes and gave it ~250 epochs of training, and it hasn't detected a single one, which leads me to think somethin's wrong with the dataset I gave it.
For a simple black and white image, where the checks/boxes are relatively small, about how many do you think would be necessary for it to start identifying them correctly?
hello, how is everyone? I have a quick question about implementing GPT into a python project. I've tried implementing the basic project that the documentation has, but I have an error that basically says that I ran out of free trials, but I never used it before. Am I missing any configurations for it to work?
the error is 429 I think
Can someone help? I'm having a problem with Visual Studio Code: it says numpy is not accessed.
Import numpy could not be resolved
For reference, below is the 'ideal' image for the form. They wind up getting printed, filled, then scanned, so there's some noise/orientation stuff goin' on there.
Do you have the right python interpreter selected, where numpy is installed?
OpenAI API free trial credits expire a few months after you create your API Key iirc, have you created it a while ago?
yeah, I guess I did a couple of months ago
you could try using Google's Gemini instead, it has a fairly generous free tier, though is not as good
I have python 3.12 interpreter selected on my vscode but haven't downloaded numpy, i thought it comes with the python interpreter
nope, you have to pip install it - it is not part of the standard library.
How do I go about it?
assuming you have selected the correct interpreter, pip install numpy
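A quick way to check the "correct interpreter" part is to ask Python itself. This is a small sketch using only the standard library; the helper name `is_installed` is made up for the example:

```python
import importlib.util
import sys

# The interpreter actually running this code; VS Code should point at the same one.
print("interpreter:", sys.executable)

def is_installed(module_name):
    """Return True if this interpreter can find the module."""
    return importlib.util.find_spec(module_name) is not None

print("numpy installed:", is_installed("numpy"))
# If this prints False, install with this exact interpreter, e.g. in a terminal:
#   <path printed above> -m pip install numpy
```

Running pip through `-m pip` with that path avoids installing into a different Python than the one VS Code uses.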
Hi, can anyone please help with this. I believe there's something very simple here which I am missing. Thanks!
Just trying to make sure that the age input is within the given range and that it is an integer but it doesn't seem to be working
input always returns a string
I thought I fixed that by doing int(age)
But maybe not
what do you think isinstance does?
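Since the original code was only shared as a screenshot, this is a guess at the intent rather than a fix of the actual script. The sketch assumes a hypothetical range of 0 to 120 and a made-up helper `parse_age`; the point is that `input()` always returns a `str`, so `isinstance(age, int)` on the raw value is always False and the conversion has to be attempted explicitly:

```python
def parse_age(raw, low=0, high=120):
    """Return the age as an int if raw is a whole number within [low, high], else None."""
    try:
        age = int(raw.strip())   # raises ValueError for "abc" or "2.5"
    except ValueError:
        return None
    return age if low <= age <= high else None

# Typical use (left as a comment so the example runs without a prompt):
#   age = parse_age(input("Age: "))
print(parse_age("25"), parse_age("abc"), parse_age("200"))
```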
redirect for further inquiries as you're being helped in pygen already #python-discussion message
i was wrong
scipy conjugate gradient is better
its just using finite differences to calculate the gradient and i thought thats part of conjugate gradients
Guys what do you mean by a pipeline?
It can mean several different things. It can mean like an sklearn pipeline which is encapsulating a model with its preprocessing or it could mean something like data processing pipelines
you should be able to specify an analytic gradient too, but yeah, by default most optimizers will approximate the gradient numerically with finite differences, and then use the gradient to approximate the hessian if 2nd order
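For reference, a minimal sketch of both options with `scipy.optimize.minimize` and `method="CG"`, assuming SciPy is installed. The objective is a toy quadratic invented for the example, with its minimum at (3, -1):

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Simple convex bowl with its minimum at (3, -1); chosen only for illustration.
    return (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2

def grad_f(x):
    # Analytic gradient of f.
    return np.array([2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)])

x0 = np.zeros(2)

# jac omitted: SciPy estimates the gradient with finite differences.
res_fd = minimize(f, x0, method="CG")

# jac supplied: the analytic gradient is used instead.
res_an = minimize(f, x0, method="CG", jac=grad_f)

print("finite differences:", res_fd.x, "function evaluations:", res_fd.nfev)
print("analytic gradient: ", res_an.x, "function evaluations:", res_an.nfev)
```

Both runs should land on the same point; the difference shows up in how many function evaluations are spent estimating the gradient.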
I have 2 YoE currently working as a data scientist, I can't decide whether I pursue a Master's degree in DS. I've read a lot of blogs but still confused, I'd love if someone gives me a hint or something?
What about an nlp pipeline?
Same as the sklearn, preprocessing + predictions in one. Look at spacy for an example.
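A tiny sketch of "preprocessing + predictions in one" using scikit-learn's `Pipeline`, assuming scikit-learn is installed. The texts, labels and step names are made up purely to show the mechanics:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny made-up dataset, only to show the mechanics.
texts = ["great movie", "loved it", "great acting",
         "terrible movie", "hated it", "terrible acting"]
labels = [1, 1, 1, 0, 0, 0]

pipe = Pipeline([
    ("vectorize", CountVectorizer()),     # preprocessing: text -> word counts
    ("classify", LogisticRegression()),   # model: word counts -> label
])

pipe.fit(texts, labels)                      # fits the vectorizer, then the classifier
print(pipe.predict(["great", "terrible"]))   # raw text in, labels out
```

The benefit is that the exact same preprocessing is applied at fit time and at predict time, and the whole thing can be saved or cross-validated as one object.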
interesting, but did you try what the error message suggested? .venv/bin/pip install textblob
How can i speed up my object detection solution/model?
I'm currently getting 11 FPS on my computer and I'm looking for ways to get more, because that's unfortunately not enough
I need to deploy that object detection code to a Raspberry Pi, that's why I'm using onnxruntime for that task
Any ideas?
Also idk if thats even possible to use GPU on raspberry pi so im currently using my CPU for such tasks
that might be part of the issue
actually I was coming here to ask, cuz I just read about how thunderbolt lets you connect to an external GPU
does it mean, I could buy a CUDA-supported GPU and get models running on that through my laptop?
Hi everyone good morning from mumbai. Can anyone in the chat recommend me good sources like YouTube channels and video links or PDFs to learn how to build an LLM?
you might have python aliased in your .bashrc, so the python from the venv is not being called
huggingface
Is data science more programming or math? In reference to ML clearly being more math and theory, would data science also be like that? Or is it about the same amount of math used in software/website development (definitely not)
any good IDE for SQL?
you can do it with a lot of math or with barely any math
data science?
yes
you can derive theoretical proofs of convergence of various data science algorithms, or you can just tinker around
Is doing it with barely any math really data science at all? Or is it like majority of YT ML courses which use pre-trained models and claim it to be ML
I'd say it is, automl often wins data science competitions
but if no one does math there will be less progress in the field
So if there is anyone with some knowledge about it, it would be cool if you could give me some tips on how to do it
Anyone got experience with external GPUs? I just yesterday came across this article mentioning how Thunderbolt 3 sort of had support for that. Would that be an okay alternative to like having to buy/build an entire PC if I could hypothetically use it through my laptop. Which I'm realising doesn't have a Thunderbolt connection... Can regular USB ports be used for this? say USB 3 (some gen). Idea being that I could practice ML locally much easier. Is it worth the cost anyway? Currently the cheapest option I have found is like $0.55 an hour and sometimes there are free options as well, they are like A4000, P5000, RTX4000 GPUs which I assume are more powerful than anything I'd be willing or able to buy anyway. So yeah, thoughts?
Okay so, It seems to be majorly statistics
I need to pick something for my integrated MSc if I do take Sc, and data science isnt looking so bad
This page has some info, but I'm guessing they also sell this stuff and you might want a more technical review. Don't have personal experience with it.
Have you checked Google colab prices?
You can also fully go the VM route, on Azure or somewhere else. If you pick a spot VM with a less popular GPU you can probably get a decent price
I checked Colab plans first, yes, they used some mysterious compute units, so I had no clue how much computing I'd actually get, I read on reddit that it wasn't that much and that it was just an opaque pricing model, it also seemed to be a limited amount of those units a month, soo, that's even worse. So I found paperspace which apparently was recently acquired by DigitalOcean and I already use DO for hosting, so they seemed like a good choice and their pricing is transparent, you know how much you're gonna pay per hour and there is no monthly limit on how many hours you can run their containers per month afaik. So yeah, that's what I use rn, they also occasionally have some free containers available for running, so that's also great.
Hmm yeah it looks decent. If you go for a spot VM someplace else you might be able to save some money if you're willing to let the VM go down occasionally
Unless paperspace also has spot
Although depending on the price of the GPU and how much usage you're getting out of it, it might make sense to buy one
Anyway to your question external GPUs might be ok but anything that isn't physically connected to your computer will have latency
Google Colab is decent. There's a GPU that costs 1.8 compute units per hour and 100 of them is €11
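Turning those compute units into an hourly price is simple arithmetic. The sketch below uses only the two figures quoted in the message (1.8 units per hour, 100 units for €11); actual Colab rates vary by GPU and over time:

```python
units_per_hour = 1.8    # compute units consumed per GPU hour (from the message above)
units_per_pack = 100    # compute units bought at once
price_per_pack = 11.0   # euros

hours_per_pack = units_per_pack / units_per_hour
price_per_hour = price_per_pack / hours_per_pack

print(f"{hours_per_pack:.1f} GPU hours per pack")
print(f"about {price_per_hour:.3f} euros per GPU hour")
```

That works out to roughly 55 hours per pack, or about €0.20 per hour for that GPU.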
But, this is a question for @final kiln
That's what I'm considering yes, but at this point I'd probably have to buy/build a PC for that, so, not fun in terms of money
I'm not using them thaaaat much right now and as I said, occasionally they are available for free for some time.
makes sense yes, but I'm also realising I probably don't have a computer at this time that would even support it
What are you using?
Anyone... is Excel also needed to become a data scientist/analyst?
yeah, that's what I'm doing right now and yes, it's in bursts at least for now
Honestly, unless you need it for gaming I wouldn't invest in getting a big GPU
You're better off with a desktop because laptops get chunky really fast. You can SSH into your desktop anyway if it's for model training.
Source: someone with a laptop with a 3070 card
alright, in that case I'll get a new laptop cuz my current one's display broke (second time a display has gone out for a laptop of mine), and I kinda want a bit of mobility
(I could also try replacing the display, but... idk)
and nvm, didn't read zestar's message...
Honestly if I could go back I'd get an M2/M3
yeah, that was my exact thought process as well, I could get a PC and use my current laptop to SSH into it and such, but yeah... laptop's display broke and such... not fun
for the price of a framework motherboard (if you wanna change cpu) you can get an elitebook or a thinkpad
those also have all user-replaceable components
Alright, thanks y'all for the advice, appreciate it 
u can't with framework either other than buying a new gpu module or a new mobo
both as expensive as a whole new laptop
Needed? Probably not. Is it used in the field? Probably yes. It's not that difficult to do the basic stuffs with it though and it's somewhat intuitive, especially if you will come from a programming background IMO, so yeah, I wouldn't worry about it too much maybe, but I'll let others with more experience elaborate.
Also Excel added (limited) TypeScript and Python support, so yeah, that's a thing as well.
They rebranded a lot of jobs that don't have an iota of science to data scientist and for some of those you need Excel
ok this is cheaper than i remembered, but by no means cheap still
here's the mobos too
only the old ones are reasonable~ish
the new ones are on the same scale as a thonkpad
i agree with their philosophy, but you pay a premium for it
at any rate, even if you had a desktop computer, you would never want to pay for an A100 or H100. i agree with you and zestar that using an online service is best
or if you're lucky, your company/university has specialized compute hardware for you to use
that'd require laptop cpu makers to change how they make cpus
not something framework can tackle alone
yes lmao
yeah but you have no way of dissipating that much heat in a laptop chassis. what does exist is "DTR" (desktop replacement) "mobile computers". they are massive because they bring a desktop cpu, which also means you now need massive cooling
(gpus on laptops are also different from desktop ones, to the extent that 4050 and 4090 are almost the same thing on mobile)
that's something to take up with intel, amd, and nvidia, not with framework
essentially
if you're not playing, just forget about gpu and dock your laptop into 2~3 monitors and nice peripherals
i've derailed us way off topic from DS and AI by now, but yeah, tangentially related since it means you can never do big scale AI tasks on a laptop, and hardly on a desktop as well
keep us posted, please
Let P be a language of polynomial definitions
p(x) = a0x^0 + · · · + anx^n
I have 2 requirements I am sort of struggling with rn, I am not really sure if I have solved them.
If the exponent for a variable is 0, the variable is allowed to be completely omitted, and if the exponent is 1, then the variable is allowed to be written without an exponent.
Here is my take on it:
```
<T> -> <Coefficient> <VariablePart> | <Coefficient>
<VariablePart> -> <Variable> <ExponentPart> | <Variable>
<ExponentPart> -> ^ <Power> | ""
<Coefficient> -> <CoeffTerm> + <Coefficient> | <CoeffTerm>
<CoeffTerm> -> <polVariable> × <Coefficient> | <polVariable>
<polVariable> -> [a-w][yz] | <"("Coefficient")">
<Power> -> [2-9] <[0-9]>*
```
hello everyone,
needed some guidance in getting started with learning more on Data Science and AI, I'm 17 just finished high school and i'll be joining college studying for a degree on AIML in roughly 3 months and i'll be trying to use that free time in just knowing the basics and getting things brushed.
I'm done with the basics of python but confused on what to get started with next.
would be great if someone would be able to guide me via DM for the next 3 months, just need help with knowing what to do next not doubts.
@pine peak what is the name of the degree program? Computer science, or artificial intelligence?
i actually have a choice its either,
Artificial intelligence and data science
or
Bachelor of Technology with specialization in AI and machine learning
depends on what college i get into really (results come out in 2 days)
The one that has more theoretical math requirements is probably "better"
that would be first one, the second degree is more of using the applications of AI and machine learning
What you'd learn in the second one would become outdated more quickly
makes sense
i'm done with python (basics) for now what do you think i should go with next?
Can somebody please explain why my graph looks like this?
In theory shouldn't the F1 score have a positive relationship with confidence?
Or does the confidence variable just mean "don't accept predictions with confidence ratings less than x"
what... is confidence?
Chance the AI thinks the classification it gave the object is correct
basically "how confident" it is that it's correct
hmm
i struggle to see why the two should have a linear r/s
ur F1 is the harmonic mean of precision and recall, right?
precision = TP / (TP+FP), recall = TP/(TP+FN) if i rmb correctly, but the confidence is basically saying how 'sure' the model is in its prediction, which could be either 1 (TP) or 0 (TN), right? (the numerators are both TP, there's no TN in either the numerator nor denominator of both precision & recall)
perhaps another way to say this would be, high confidence doesn't mean high TP. it actually means high TP and TN. whereas f1-score/precision/recall are more focused on TP and not TN?
so if u had another 'score' that measures how sure the model is in its prediction that the target is 1 (positive), then yes perhaps that 'score' (whatever its called) might show a positive r/s with f1-score
also, i guess these are with the assumptions that your model is actually tuned properly and having good evaluation scores -> otherwise if your model is tuned poorly it'll prolly throw out high confidence scores for predictions that turn out to be mostly wrong?
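If the x-axis of that graph is a confidence threshold ("don't accept predictions below x"), the shape is easy to reproduce. The sketch below uses made-up scores and labels and a hypothetical helper `f1_at_threshold`; F1 is computed as the harmonic mean of precision and recall:

```python
def f1_at_threshold(scores, labels, threshold):
    """Precision, recall and F1 when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up confidences and ground truth (1 = the object is really there).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1,    1,    0,    1,    1,    0,    1,    0]

for t in (0.1, 0.5, 0.85, 0.99):
    p, r, f1 = f1_at_threshold(scores, labels, t)
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Raising the threshold tends to raise precision but lowers recall, so F1 typically climbs to a peak and then falls towards 0 instead of increasing all the way.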
Does anyone here have any idea what this means? How do they get the values of p and q here
It reads to me that they are plotted on Figure 3-4 which you didn't post.
looks like p and q are given to you, not something to be computed. it's just an illustrative example. as reptile says, the vectors should be in the figure. that being said, you are given T and you are also given the coordinates of p and q in the basis S (the canonical basis), so you also know what p and q are in that basis
the basis S is just the identity matrix, so p and q are exactly the vectors mentioned there: p = [3, 1] and q = [-6, 2]
What's up guys
I just made my first AI with the digit classification MNIST dataset. How far behind am I?
That's a good first project.
Hi, I'm running my ONNX model using onnxruntime on my CPU and I'm wondering how I can speed it up
Without using a GPU
Also, that's an object detection model, and dynamic quantization doesn't seem to work: it gives me worse results. Before that I had about 11 FPS, and after, something like 7 FPS
Hey guys how can someone find teams to work in for the kaggle competitions?
maybe try Forums/Discussions
appreciate it. Do you know any active/popular forums? (i can google though peer advice is more valuable)
I meant mostly Kaggle's own, e.g.
- https://www.kaggle.com/discussions
- https://www.kaggle.com/competitions/ai-mathematical-olympiad-prize/discussion
but looks like they have a feature for that?
https://www.kaggle.com/discussions/product-feedback/341195
[Product Update] Improvements to the Competition Team Tab and Finding Teammates.
Hello guys, how's going?
I recently acquired a very powerful GPU but my training times only went down by half compared to training locally on my laptop. im using pytorch lightning and i see that GPU usage is very low (< 5%) and 100% cpu usage. training till early stopping still takes about 1.5 hours per set of hyperparams im testing which is a little too slow for me.
can i have a bit of intuition/suggestions as to what might be the potential issues? thanks!
edit: im training a bi-directional LSTM - does anyone have insights on if this model is not as efficient on a GPU (i've heard some comments about this) - dataset is relatively small tho
It might help to put the code in the pastebin and indicate which parts you changed after migrating to the better GPU
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
It's like integrating y-axis to x-axis to find the median
i dont have access to my code right now but from what i can recall, i changed the accelerator='gpu' and devices=[1] (we have multiple GPUs) in the Trainer object of pytorch lightning. also some misc stuff such as calling .numpy() on the Tensor objects but i think that's mostly it.
i kinda thought one of the points about using lightning was the easy transition from cpu to gpu :/
ill paste my code when i have access. but honestly it's... super long
what kind of maths like which topics in maths are required for datascience and like ml and ai
I'm trying to predict the lightning events
Using cnn
But my model is not able to learn lightning events instead it's giving high accuracy in predicting non lightning events
I can share the whole code like I did to prepare feature data and target data but I'm pretty sure I have to explain it first as it is little messy
Lightning that happens in cloud
Data consists of 6 variables: tir1, tir2, swir, vis, latitude and longitude, and in the target I have whether lightning occurred or not at that point
Coordinate where lightning will occur
Wait I m writing
Each variable is a 2d array of 1536,1392
So images yeah
6 images as features and in output one image
Sure
I am following unet arch
Yup
Oh sorry Auto correct
Model architecture is like this. Input = 1536,1392,6
32 conv2d
Maxpool
64 conv2d
Maxpool
128 conv2d
Maxpool
256 conv2d
Upscale
128 conv2d
Upscale
64 conv2d
Upscale
32 conv2d
1 output = 1536,1392,1
Arch is like this
The basic one only
I curated data myself
I am following a paper
Kk
But there is little bit diff
Something like this only I was following I am following that lightning cast one
ProbSevere LightningCast: A deep-learning model for satellite-based lightning nowcasting
Oh I can't share pdf here
The problem is lightning events are less than 1%
Non lightning events are 99.9%
Of data
Can that be the problem ?
The paper I am following?
@wooden sail how do you feel about Einops at a glance? https://github.com/arogozhnikov/einops
i saw their code, they didn't do anything
output_tensor = rearrange(input_tensor, 't b c -> b c t') is interesting
i love it, though at face value, rearrange is not needed
yeah touching it would be actual is a bad idea
Compared to just transposing?
as in a real case scenario we will have a faulty model
technically transposing is also never needed
the whole point of einstein notation is that reshaping is not necessary
just careful treatment of the indices suffices to define any contraction
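A small NumPy sketch of that point, with shapes made up for illustration. The first `einsum` reproduces the `'t b c -> b c t'` reordering by naming indices, and the second contracts over `c` directly on the original layout without transposing or reshaping first:

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)   # axes: t, b, c
w = np.ones((4, 5))                          # axes: c, d

# Reordering axes by name; same result as x.transpose(1, 2, 0).
x_bct = np.einsum("tbc->bct", x)

# Contracting over c on the original layout, output axes chosen by name.
y = np.einsum("tbc,cd->btd", x, w)

print(x_bct.shape, y.shape)
```

The index string doubles as documentation of the shapes, which is much of the appeal discussed here.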
You can express the subsequent expressions in einstein notation on the original tensor?
they have just gridden the data like data is in different scales
Is that what you're getting at?
so to bring dat to similar resolution they did those steps
the bigger problem is that all of the modules they list compatibility with already have einsum
they probably just wrap the underlying einsum implementation (is my guess)
I feel like this is what I needed all along because I'm going around commenting tensor shapes like a madman
you should do that anyway imo
That's fair, but I suppose with this the intent matches the output more
how do you mean?
To use programmer buzzwords: this looks self-documenting
generally where i was working they didn't talk about testing the model, they talked about validation more
while with Torch I do a bunch of operations and really have to comment why to save the reader (myself in a week) the trouble of trying to figure it out
so it is possible from validation they mean testing
nup they have test on new data
Anyhow, I'll try it out. Or do you recommend just using einsum itself in Jax, Torch and numpy?
i have some amount of beef with this, in the direction of turing machines being able to represent the same thing as lambda calculus but in different syntax. the problem comes from using a representation of the math that is just not really suitable for "self documentation"
and even that aside, in papers using only math and text, you still reiterate the dimensions whenever confusion might arise. i wouldn't expect imperative code to be able to circumvent this
so from 1536,1392,1 i.e 2138112 total events only 30k are lightning events
i would suggest just using einsum from the respective module unless you plan on making your pipelines uniform by always using einops
in total then when we divide it based on time it gets even lesser
I'll start from there then yes
i did but the model then started predicting lightning values in excess
then i talked here and they told me that making changes to the data is not advisable, as in a real-life scenario this will be the case, that lightning is rare
would you like to see the code and data? i can stream it and explain what i did
till now
alright in voice chat 1
damn i dont have streaming perms
of model ?
this is X_data
i didn't use exactly the same arch they used, i used a little simpler one
still that model is also pretty complex
my point is it should predict at least something, why does it only predict non lightning events
this is y_data
model
one more thing, should i feed latitude and longitude as features or not?
yeah just number of layers is less
like they used 2-3 32 layers than 2-3 64 layers
do they?
they have 4 features i believe
didnt made any instead i i plotted it on map
and compared it with original
some of predicted looked pretty similar
Epoch 1/10
22/22 [==============================] - 245s 11s/step - loss: 23.8827 - accuracy: 0.9090
Epoch 2/10
22/22 [==============================] - 240s 11s/step - loss: 0.1823 - accuracy: 0.9974
Epoch 3/10
22/22 [==============================] - 237s 11s/step - loss: 0.1417 - accuracy: 0.9891
Epoch 4/10
22/22 [==============================] - 233s 11s/step - loss: 0.1220 - accuracy: 0.9966
Epoch 5/10
22/22 [==============================] - 246s 11s/step - loss: 0.1100 - accuracy: 0.9995
Epoch 6/10
22/22 [==============================] - 245s 11s/step - loss: 0.0749 - accuracy: 0.9989
Epoch 7/10
22/22 [==============================] - 240s 11s/step - loss: 0.0743 - accuracy: 0.9995
Epoch 8/10
22/22 [==============================] - 237s 11s/step - loss: 0.0447 - accuracy: 0.9991
Epoch 9/10
22/22 [==============================] - 244s 11s/step - loss: 0.0458 - accuracy: 0.9990
Epoch 10/10
22/22 [==============================] - 235s 11s/step - loss: 0.0425 - accuracy: 0.9991
it is going down
that i dont have currently
its not overfitting i checked it
in my lab i did plotted loss val loss
but when i completed the work they didnt allowed me to take the work outside the lab
can i somehow make this run on my system?
i have 16gb ram and rtx 3050 laptop gpu it crashes down everytime
i try to run
yeah i need to create a loss func that monitors how good it predicts the lightning events
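One common way to make a loss care about the rare class is to weight the positive pixels more heavily. Below is a NumPy sketch of a weighted binary cross-entropy, not the loss from the paper being followed; the function name, the toy data and the weight of 99 (roughly negatives divided by positives in the toy data) are assumptions for illustration:

```python
import numpy as np

def weighted_bce(y_true, y_prob, pos_weight, eps=1e-7):
    """Binary cross-entropy where errors on positive (lightning) pixels count pos_weight times."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)   # avoid log(0)
    per_pixel = -(pos_weight * y_true * np.log(y_prob)
                  + (1.0 - y_true) * np.log(1.0 - y_prob))
    return per_pixel.mean()

# Toy example: 1 lightning pixel out of 100, model predicts "almost no lightning" everywhere.
y_true = np.zeros(100)
y_true[0] = 1.0
y_prob = np.full(100, 0.01)

print("unweighted:", weighted_bce(y_true, y_prob, pos_weight=1.0))
print("weighted:  ", weighted_bce(y_true, y_prob, pos_weight=99.0))
```

With the weight applied, a model that ignores lightning is penalised far more, so "predict no lightning everywhere" stops being a cheap solution. Tracking recall or F1 on the lightning class alongside the loss gives a more honest picture than accuracy.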
now i just have to figure out how can i run this on my laptop lol
ok
that is because it is predicting non lightning cases accurately
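The numbers quoted earlier in the chat (a 1536 x 1392 grid with about 30k lightning points) show how high accuracy can be with zero useful predictions. A quick arithmetic sketch for a "model" that always outputs non-lightning:

```python
total_pixels = 1536 * 1392   # grid points per image
lightning_pixels = 30_000    # rough count quoted in the chat

# A "model" that always predicts non-lightning:
true_negatives = total_pixels - lightning_pixels
accuracy = true_negatives / total_pixels
recall_lightning = 0 / lightning_pixels   # it never finds a single lightning pixel

print(f"accuracy = {accuracy:.4f}, lightning recall = {recall_lightning:.1f}")
```

So an accuracy around 0.99 says very little here; it is roughly what the do-nothing baseline scores.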
alright i will check it
that might be the case
ok
how to segregate images based on their features?
I want to segregate all the shapes with curved edges in them and segregate them into the circular category, there are 3 in total, circular, intersecting and overlapping
Hi, im using easyocr as my main OCR and im running it on device with out GPU and im wondering if it's possible to install only libs needed for CPU
It's PyTorch based right? PyTorch has a cpu install option.
Ye
You're right
I forgot about It, thx
import string
exclude = string.punctuation
def remove_punc(text):
    return text.translate(str.maketrans(' ',' ',exclude))
Could anyone explain what does this code do?
I got a part of it understood . It is used to replace characters in the string
hey i'm new here and i need help with a virtual assistant project. My problem: I tokenize the question, remove stop words and then apply stemming. After that, the two sentences "What is your name" and "what is my name" are both left with only "name" to work with, so they get the same answer. How can I fix that? Can anyone help me?
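One possible direction, sketched in plain Python with a made-up stop word list: keep the words that change the meaning of a question ("my", "your") out of the stop word set, so the two sentences stay distinguishable after preprocessing. The names `DEFAULT_STOP_WORDS`, `KEEP` and `preprocess` are hypothetical:

```python
# Hypothetical stop word list; real ones (e.g. from an NLP library) are much longer.
DEFAULT_STOP_WORDS = {"what", "is", "my", "your", "the", "a"}

# Words that change the meaning of a question for an assistant, so keep them.
KEEP = {"my", "your"}
STOP_WORDS = DEFAULT_STOP_WORDS - KEEP

def preprocess(question):
    """Lowercase, split on whitespace and drop stop words."""
    tokens = question.lower().rstrip("?").split()
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("What is your name"))
print(preprocess("what is my name"))
```

The same idea applies with any stop word source: start from the library's list and subtract the pronouns the assistant needs.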
str.maketrans(...)
If there are two arguments, they must be strings of equal length, and
in the resulting dictionary, each character in x will be mapped to the
character at the same position in y. If there is a third argument, it
must be a string, whose characters will be mapped to None in the result.
so basically the first 2 are just dummy elements (mapping `' '` to `' '`, basically doing nothing), and everything in `exclude` will be mapped to None... so be excluded from the result
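Here is the same function as a runnable sketch, with empty strings for the first two arguments (an equivalent choice, since they only need to be equal in length) and a made-up test sentence:

```python
import string

def remove_punc(text):
    # Third argument: every character in it is mapped to None, i.e. deleted.
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table)

print(remove_punc("Hello, world! (It's 5 o'clock.)"))
```

The translation table is just a dict from character codes to replacements, so `str.maketrans("", "", "!")` is `{33: None}`.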
!e ```py
print("abc".translate(str.maketrans('a', 'z')))
```
@left tartan :white_check_mark: Your 3.12 eval job has completed with return code 0.
zbc
Could you provide more details? Share your code perhaps. A help thread might be better: #how-to-get-help
playing with this dataset, trying to do multiclass classification on Target(3 possible values, Dropout Enrolled Graduated)
I'm finding that models have a hard time discerning Enrolled (using SVC as an example):
```
-----SVC-----
              precision    recall  f1-score   support
     Dropout       0.84      0.73      0.78       284
    Enrolled       0.52      0.33      0.40       159
    Graduate       0.77      0.94      0.84       442
```
the data's imbalanced,
```py
ALL['Target'].value_counts()
Graduate    2209
Dropout     1421
Enrolled     794
```
I tried `class_weight='balanced'` on SVC which improved it a little (f1 of `Enrolled` sits around `0.47`), then I tried down sampling to get
```py
down_sampled = pd.concat([
    ALL.loc[ALL['Target'] == 'Graduate'].iloc[:800],
    ALL.loc[ALL['Target'] == 'Dropout'].iloc[:800],
    ALL[ALL['Target'] == 'Enrolled'].iloc[:800]])
```
```
-----SVC-----
              precision    recall  f1-score   support
     Dropout       0.78      0.67      0.72       200
    Enrolled       0.62      0.66      0.64       199
    Graduate       0.75      0.81      0.78       200
```
which (I think) made it overall better, but it's still having a hard time classifying `Enrolled`, so I think the bad performance isn't *only* due to imbalance
are there any other techniques I can use to improve this?
have you tried using another model type?
randomforest or tree based models could work better
yea, I get a similar trend using trees, I just used SVC as example cause discord character limit
e.g. lightgbm
original dataset
-----LGBMClassifier-----
              precision    recall  f1-score   support
     Dropout       0.79      0.74      0.76       284
    Enrolled       0.54      0.38      0.44       159
    Graduate       0.79      0.91      0.85       442
    accuracy                           0.76       885
   macro avg       0.71      0.68      0.68       885
weighted avg       0.75      0.76      0.75       885
after down sample
-----LGBMClassifier-----
              precision    recall  f1-score   support
     Dropout       0.77      0.69      0.73       200
    Enrolled       0.64      0.70      0.67       199
    Graduate       0.80      0.81      0.80       200
    accuracy                           0.73       599
   macro avg       0.74      0.73      0.73       599
weighted avg       0.74      0.73      0.73       599
it's basically the same anyways so I thought it's not important
instead of doing downsampling u can try giving weights to the classes
cuz downsampling makes u lose data
or u can try deep learning as well
just let the neural network figure it out
I'm not sure how to do that manually (manually with reason and not just "this feels ok"), and I already tried SVC(class_weight='balanced') or LGBMClassifier(is_unbalance=True), etc. (basically using the models' parameters that signal the data being imbalanced) and it didn't seem to have too much of an effect
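For setting weights by hand with some reasoning behind them, one starting point is the heuristic scikit-learn documents for `class_weight='balanced'`, `n_samples / (n_classes * count)`, computed here in plain Python from the class counts quoted earlier. The extra 1.5 multiplier for `Enrolled` is an arbitrary knob for illustration, something to tune with cross-validation rather than a recommended value:

```python
counts = {"Graduate": 2209, "Dropout": 1421, "Enrolled": 794}  # from value_counts() above

n_samples = sum(counts.values())
n_classes = len(counts)

# Heuristic documented for class_weight='balanced':
# n_samples / (n_classes * count_of_class)
balanced = {label: n_samples / (n_classes * n) for label, n in counts.items()}

# A manual variant: push Enrolled harder than 'balanced' does (1.5 is an arbitrary knob).
custom = dict(balanced, Enrolled=balanced["Enrolled"] * 1.5)

for label in counts:
    print(f"{label}: balanced={balanced[label]:.3f} custom={custom[label]:.3f}")
```

A dict like `custom` can be passed as `class_weight=custom` to estimators such as `SVC`, which makes it possible to search over the multiplier and watch the `Enrolled` recall and precision trade off.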
what are the variable names?
wdym by that
i am trying to see if there is anything we can do to make data more meaningful
cuz this issue needs more context
here's the dataset ig
it has 37 columns, and rn I've just been looking around, the SVC and lightgbm result you see above had minimal feature engineering
let me work on this dataset a bit
this is from the GRU network (which has like 2x the params), but it shows a similar graph anyway