#data-science-and-ml
I've always loved math, I just haven't had the chance to learn it.
i guess i will just have to create time
i mean mathematics is the deepest core to every stem field and if you want to REALLY learn these topics, you're gonna have to do the math
like i said, if you dont care and just want a very superficial understanding then it doesnt matter
I could use a hand
For one, the classifier doesn't seem to be learning. The loss starts at 100% and doesn't go down
thank you so much. whats your research about?
Beyond that, I'm in a chicken-egg situation. The encoder needs the classifier to prevent collapsing into the zero vector for all classes, and the classifier needs robust encodings to make decent predictions.
i am a phd in applied mathematics, my research is in high dimensional multi-agent deep reinforcement learning and i also developed an algorithm for ultra fast n-nomial tree traversal
i will put that second paper in preprint soon
wow, this is good stuff
honestly this sounds impressive. your research and the algorithm you developed for ultra fast n-nomial tree traversal are groundbreaking. now i'm even more inspired
the first one, not so much, it's applying things that have long been researched in a.i. The algorithm...I hope so. I think it will get very well received because it is the first progress on the problem in 40 ish years. max tree before was like depth 60 and i get to depth like....175? 200? depends
sounds cool, what's the difference between high dimensional and traditional multi-agent rl?
high dimensional is exactly that, super high dimensional inputs. Think combining things like computer vision, which has super high dimensional feature space inputs
multi agent RL has difficulty with these things but there are many clever ways to get around it
my research in that field is like seafood stew -> i throw a bunch of really well known mathematical techniques together and hope the stew that comes out still tastes good
but this is why i said mathematics is so important to learning a.i. at a high lvl, i can barely use anything prebuilt because it doesn't conform well to the problem. if you dont understand the nuances and try to just throw basic solutions at complex problems, you just won't get anywhere
especially when you are working with custom environments and not something like openai gym environments that are super well tuned
rl is a fun domain though, all my rl projects are a joy to watch
assuming you're rendering your environments
ig you could just have a sim in memory
what kinda models are you working with?
honestly, thats still amazing. yes, it has been around but your ability to push the boundary with such a significant advancement is amazing. and taking the depth from 60 to 175 or even 200 is also a big progress especially after 40 years. i know it will be received well in the field
there arent necessarily models? I need to craft my own networks. I can use some open research that are good suggestions, but they dont work with the problem. Like i can show an example of what I am playing with right now:
import numpy
import torch

# myBox (state/action space sizes) and SEED are defined elsewhere in the project
class Q(torch.nn.Module):
    #self.scaling = torch.diag([1/175, 1/175, 1, 1/175, 1/175, 1])
    def __init__(self, first_layer_dim = 64, second_layer_dim = 32, seed=42):
        if seed is not None:
            torch.manual_seed(seed)
        super().__init__()
        self.linear_in = torch.nn.Linear(myBox.D_state_space, first_layer_dim, bias = False)
        self.hidden_linear = torch.nn.Linear(first_layer_dim, second_layer_dim)
        self.hidden_linear2 = torch.nn.Linear(second_layer_dim, second_layer_dim)
        self.linear_out = torch.nn.Linear(second_layer_dim, myBox.D_action_space)
        self.tanh = torch.nn.Tanh()
        # the layers below are defined but not used in forward() yet
        self.softrelu = torch.nn.Softplus()
        self.ReLU = torch.nn.ReLU()
        self.bn1 = torch.nn.BatchNorm1d(64)
        self.bn2 = torch.nn.BatchNorm1d(32)

    def exponential_activation(self, x):
        return torch.exp(x / 100)

    def forward(self, x):
        #(175,175,2,1,1,1)
        # accept lists, tuples and numpy arrays as well as tensors
        if isinstance(x, (list, tuple)):
            x = torch.tensor(x, dtype=torch.float32)
        if isinstance(x, numpy.ndarray):
            x = torch.from_numpy(x).float()
        #print(f"Input to the network: {x}")
        out = x
        out = self.linear_in(out)
        out = self.tanh(out)
        out = self.hidden_linear(out)
        out = self.tanh(out)
        out = self.hidden_linear2(out)
        out = self.tanh(out)
        out = self.linear_out(out)
        return out

    def policy(self, x):
        # greedy action: index of the largest Q-value
        return torch.argmax(self.forward(x))

myQ = Q(seed = SEED)
it has a bunch of problems and i keep messing with it but there is a lot of meat and potatoes to things like this
like i could do things like add backward passes, do away with torch all together and work on the graph itself with custom weights. squeeze means and variances and then put them back together
or alternatively i could do something contracting like 32 -> 64 -> 128 -> 64 - > 128 -> 64 -> 32
that way it is contracting midway and then return back but there is a lot of meat and potatoes to this as well and adding all this complexity doesnt help learning necessarily even tho most people just say "throw more layers at it"
Fixed!
Just needed another term to enforce unit magnitude. Kept it from collapsing, not sure though if it's producing distinct vectors for each class
The thing is, I often see people use “R&D” as a blanket to not deliver
Unless it’s a lab I’d say: don’t make ML/DL a goal, but rather something concrete. Make something bad first and iterate. Also think of how you’ll do the deployments, monitoring, …
Hey, I'm about to use/create an object detection model where I have both RGB, and a depth channel. I would preferably use all of this data, but it seems most detection models (and especially pre-trained ones) expect RGB data (so only 3 channels). If you have any other ideas of how to solve this, please let me know. Some current considerations are:
- Combine RGBD to 3 channels by performing PCA or some other method?
- Simply replacing one of the RGB channels with the depth channel.
- Using two models, giving one model the Depth (as grayscale?), and the other RGB, and combining the results somehow?
you can consider slapping a few layers in front that map the 4 channels to 3, and/or replace a few of the initial layers of the pretrained models
Yeah, was considering the adding of layers as well to map from 4 to 3, but it would get more complicated to train the entire model.
But ig if the model is written in pytorch, it would be pretty easy to combine. thx for the tip!
Have you had a chance to take a look?
more complicated in what sense?
Well it requires me changing the model architecture. If the model is given as just some black box it might be difficult to modify it. But it depends on what library it is implemented in.
i'm not sure you can circumvent this tbh, since pretrained models that weren't trained on the same types of features will just perform poorly if you don't explicitly do something about it
you're just not feeding in the type of data they were trained on
I will finetune it on my data still.
Taking a walk rn, I can show an image of the data later if ya interested.
i would place the approach of adding layers that map from 4 to 3 channels under fine tuning
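A minimal sketch of the "few layers in front" idea, written in NumPy so it runs without a deep learning framework. It assumes the adapter is a single 1x1 convolution (in PyTorch the matching layer would be `torch.nn.Conv2d(4, 3, kernel_size=1)`); the function names `make_rgbd_adapter` and `apply_adapter` are made up for illustration. The weights start as identity on RGB and zero on depth, so the pretrained detector initially sees exactly the RGB input it was trained on, and fine-tuning can then learn how much depth to mix in.

```python
import numpy as np

def make_rgbd_adapter():
    """Return (weight, bias) for a 1x1 convolution mapping 4 channels to 3.

    The weight is [identity | zeros]: RGB passes through unchanged and the
    depth channel is ignored until training updates the last column.
    """
    weight = np.zeros((3, 4), dtype=np.float32)
    weight[:, :3] = np.eye(3, dtype=np.float32)
    bias = np.zeros(3, dtype=np.float32)
    return weight, bias

def apply_adapter(rgbd, weight, bias):
    """Apply the 1x1 convolution to a batch of shape (N, 4, H, W)."""
    # A 1x1 convolution is the same linear map applied at every pixel.
    return np.einsum("oc,nchw->nohw", weight, rgbd) + bias[None, :, None, None]

# Random stand-in for a batch of two 8x8 RGB-D images.
rgbd = np.random.default_rng(0).random((2, 4, 8, 8)).astype(np.float32)
weight, bias = make_rgbd_adapter()
rgb_like = apply_adapter(rgbd, weight, bias)  # shape (N, 3, H, W), ready for an RGB model
```

Whether this beats replacing the first convolution of the pretrained model probably depends on the detector and the amount of RGB-D data available.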
can someone help me in how to use linear regression in coding
Yo
Does anyone know how I can implement a split neural network? Having a hard time with it. Basically I have a client and a server, and say a 10-layer NN is split with 4 layers on the client side and 6 on the server side. Does anyone have an idea how I can backpropagate in such a framework?
what part of it troubles you?
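A rough sketch of how backpropagation can cross the client/server boundary, assuming a toy two-part model (one tanh layer on the client, one linear layer on the server) and plain NumPy with hand-written gradients. All names here are hypothetical, and the "network" is just function calls. The idea is that the client sends its activations forward, the server sends back the gradient of the loss with respect to those activations, and the client finishes the chain rule locally.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))            # inputs, held by the client
y = rng.normal(size=(8, 1))            # labels, held by the server
W_client = rng.normal(size=(4, 3)) * 0.5
W_server = rng.normal(size=(3, 1)) * 0.5

def client_forward(x, W_client):
    """Client half of the model: returns the activations sent to the server."""
    return np.tanh(x @ W_client)

def server_step(h, y, W_server):
    """Server half: forward pass, loss, and gradients.

    Returns the loss, the gradient for the server's own weights, and the
    gradient with respect to the received activations (sent back to the client).
    """
    pred = h @ W_server
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)
    grad_pred = err / err.size
    grad_W_server = h.T @ grad_pred
    grad_h = grad_pred @ W_server.T
    return loss, grad_W_server, grad_h

def client_backward(x, h, grad_h):
    """Client finishes backprop using only the gradient it received."""
    grad_z = grad_h * (1.0 - h ** 2)   # derivative of tanh
    return x.T @ grad_z

h = client_forward(x, W_client)                             # 1. client -> server: h
loss, grad_W_server, grad_h = server_step(h, y, W_server)   # 2. server -> client: grad_h
grad_W_client = client_backward(x, h, grad_h)               # 3. client updates locally
```

With an autograd framework the same pattern usually amounts to detaching the activations on the server side, calling backward there, and then calling backward on the client's activations with the returned gradient.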
I know linear regression theoretical part
I just don't know how to use it in coding can u give me learning resources that can explain how to use it
In coding
what level are you looking for? with just the math part, implementing it in numpy should be pretty straightforward
if you only care about getting the result, scikit learn should have you covered
e.g. you could use this https://scikit-learn.org/1.5/modules/generated/sklearn.linear_model.LinearRegression.html to black-box pretty much everything. then you only need to specify inputs and outputs
pretty similar if you use this https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html
if you want to do it more manually, numpy's pseudo inverse https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html is pretty much all you need, since the least squares solution of linear regression is given by the pseudo inverse
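A small sketch of the two NumPy routes mentioned above, on made-up data generated from y = 2x + 1. The column of ones in the design matrix is what lets the fit learn an intercept.

```python
import numpy as np

# Toy data from y = 2*x + 1 (illustrative values only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix: one column for x, one column of ones for the intercept.
X = np.column_stack([x, np.ones_like(x)])

# Least-squares solution through the pseudo-inverse: w = pinv(X) @ y
w_pinv = np.linalg.pinv(X) @ y

# Same problem through lstsq, which returns
# (solution, residuals, rank, singular values).
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

slope, intercept = w_pinv
print("slope:", slope, "intercept:", intercept)
```

`sklearn.linear_model.LinearRegression` wraps the same computation behind `fit` and `predict`, which is convenient once the maths is clear.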
Guys just wanted to ask
We do feature selection right!
Then we apply that on training dataset... which is Xtrain
Do we also select the same features from test data Xtest for prediction??
well, you kind of have to, since your model will not have been trained on (nor even have an input shape compatible with) the other features.
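To make that concrete, here is a sketch that assumes a very simple selection rule (drop constant columns) and random stand-in data. The point is only that the selection is decided on the training set and the same column mask is then applied to the test set.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
X_test = rng.normal(size=(20, 5))
X_train[:, 2] = 0.0  # make one training column constant, so it carries no signal

# Decide which features to keep using the TRAINING data only.
selected = np.var(X_train, axis=0) > 1e-8  # boolean mask over columns

# Apply the same mask to both splits so the shapes stay compatible.
X_train_sel = X_train[:, selected]
X_test_sel = X_test[:, selected]

print(X_train_sel.shape, X_test_sel.shape)
```

scikit-learn selectors follow the same pattern: `fit` on the training data, then `transform` both splits.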
hello there, I am an accounting student and want to explore the field of data science. I have done some research on the internet and found out that I should learn python. Can anyone suggest a course that would be easier for me to learn python with, for someone who has no coding background?
!resources
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
Hi...anyone interested in chatting about stuff and ideas or anything in general can dm me...i would love to talk to u
Hey, I'm looking for someone to help me through an Automated Essay Scoring deep learning project
It's a project course in my undergraduate degree
Where can I find such people?
How to get into reinforcement learning and is C++ needed?
C++ is not.
Why not? Don’t you need a raspberry cpu and stuff?
pretty good book on the topic imo
no you don't need any specific language or cpu to start reinforcement learning
you could code it in scratch and run it on your samsung smart fridge if you want
Thank you
I need some input.
My encoder consumes batches of images all in the same style and attempts to produce identical vectors across the batch. I'm using euclidean distance for this. To keep the vectors from collapsing to the zero vector, I'm using another constraint to reward the encoder for producing unit vectors. That's done.
The problem gets a bit more nuanced from here though. I'm hesitant to use orthogonalization or contrastive loss to explicitly push styles apart because many styles are quite similar. What I need is an embedding space in which similar styles' vectors point in similar directions. If I explicitly push styles apart simply on the basis of belonging to different classes, the encoder could default to locating the first distinguishable feature and, based on that, pointing the vector in some arbitrary direction. In short, I need a smooth embedding space.
I know KL-divergence does this to an extent.
In my mind, an adversary seems like a natural choice, but it comes with some caveats of its own. What I don't want is for the encoder to try to fool the classifier by building the most ambiguous possible vectors. I need the encoder to capture salient details, the classifier to learn those details, and then the encoder be encouraged to look deeper to fool the classifier.
As such, I think I have two problems: forcing the encoder to capture actually salient information, and forcing the encoder to do so in a smooth way such that similar classes have similar vectors
Anyone have any thoughts?
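For reference, a sketch of the two terms described at the start of this message, written in NumPy. It is not the poster's actual loss: the pull term here is the mean squared Euclidean distance to the batch mean (an assumption), and `style_batch_loss` and `norm_weight` are made-up names. It shows why the unit-norm term removes the all-zero solution.

```python
import numpy as np

def style_batch_loss(vectors, norm_weight=1.0):
    """Loss for a batch of embeddings that all come from the same style.

    vectors: array of shape (batch, dim).
    """
    center = vectors.mean(axis=0, keepdims=True)
    # Pull term: embeddings in the batch should agree with each other.
    pull = np.mean(np.sum((vectors - center) ** 2, axis=1))
    # Norm term: penalise deviation from unit length, so the zero vector
    # is no longer a minimum.
    norms = np.linalg.norm(vectors, axis=1)
    unit = np.mean((norms - 1.0) ** 2)
    return pull + norm_weight * unit

collapsed = np.zeros((4, 3))                           # every embedding is the zero vector
aligned = np.tile(np.array([[1.0, 0.0, 0.0]]), (4, 1)) # identical unit vectors

print(style_batch_loss(collapsed), style_batch_loss(aligned))
```

Neither term says anything about how different styles relate to each other, which is the open question in the message above.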
Does anyone know what's going on when I'm using subplots with matplotlib and I'm trying to set the xlim's of the subplots but both subplots get affected?
Wooooooooopsies
I had my dropout set of 1.0 instead of 0.1
XD Might explain a thing or two
Code?
import matplotlib.pyplot as plt
import csv
n_bins = 3000
volumes = []
# read the second column of the CSV as floats
with open('bgm_volume.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in spamreader:
        volumes.append(float(row[1]))
f, [ax, ax2] = plt.subplots(1, 2, sharex=True)
ax.hist(volumes, bins=n_bins, histtype="step")
ax2.hist(volumes, bins=n_bins, histtype="step")
ax.set_xlim(-90,-80)
ax2.set_xlim(-5,13)
# hide the spines between ax and ax2
ax.spines['right'].set_visible(False)
ax2.spines['left'].set_visible(False)
ax.yaxis.tick_left()
ax.tick_params(labeltop=False) # don't put tick labels at the top
ax2.yaxis.tick_right()
# Make the spacing between the two axes a bit smaller
plt.subplots_adjust(wspace=0.15)
plt.show()
Set sharex or sharey to False
Oh that worked, thanks
hey guys how do i post code like this?
```py
code
```
Hi guys, so im currently learning about SHAP values, and im going through a dataset from kaggle (https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset), its a stroke prediction dataset. the code below is to calculate the SHAP values and ultimately plot a force plot and a summary plot to visualize feature importance when predicting the likelihood of a stroke. shap_values[1] always results in an index out of bounds error, i cant access the class predictions of 1, because there seems to be none, anyone know where im going wrong? sorry kinda new to this stuff, any help would be much appreciated
X_train, X_test, y_train, y_test = data_loader.get_data_split()
X_train, y_train = data_loader.oversample(X_train, y_train)
svm = SVC(kernel='rbf', probability=True, random_state=42, class_weight='balanced')
svm.fit(X_train, y_train)
y_pred = svm.predict(X_test)
print(f"F1 Score: {f1_score(y_test, y_pred, average='macro')}")
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")
explainer = shap.KernelExplainer(svm.predict_proba, X_train.iloc[:10], link="logit")
start_index = 1
end_index = 2
X_test_row = X_test.iloc[start_index:end_index]
shap_values = explainer.shap_values(X_test_row)
print("Shape of SHAP values for class 0:", shap_values[0].shape)
print("Shape of SHAP values for class 1:", shap_values[1].shape)
if len(shap_values) > 1:
    print("Shape of SHAP values for class 0:", shap_values[0].shape)
    print("Shape of SHAP values for class 1:", shap_values[1].shape)
else:
    print("SHAP values for class 1 are not available")
shap.initjs()
prediction = svm.predict(X_test_row)[0]
print(f"The SVM predicted: {prediction}")
shap.force_plot(explainer.expected_value[1], shap_values[1][0], X_test_row)
shap.summary_plot(shap_values[0], X_test
this is the first part of the code with the dataloader class and libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import shap
from sklearn.svm import SVC
from sklearn.metrics import f1_score, accuracy_score
from imblearn.over_sampling import RandomOverSampler
from sklearn.model_selection import train_test_split

class DataLoader():
    def __init__(self):
        self.data = None

    def load_dataset(self, path="data/healthcare-dataset-stroke-data.csv"):
        self.data = pd.read_csv(path)

    def preprocess_data(self):
        # one-hot encode the categorical columns
        categorical_cols = ["gender", "ever_married", "work_type", "Residence_type", "smoking_status"]
        encoded = pd.get_dummies(self.data[categorical_cols], prefix=categorical_cols)
        self.data = pd.concat([encoded, self.data], axis=1)
        self.data.drop(categorical_cols, axis=1, inplace=True)
        self.data.bmi = self.data.bmi.fillna(0)
        self.data.drop(["id"], axis=1, inplace=True)

    def get_data_split(self):
        # the last column is the target
        X = self.data.iloc[:, :-1]
        y = self.data.iloc[:, -1]
        return train_test_split(X, y, test_size=0.20, random_state=2021)

    def oversample(self, X_train, y_train):
        oversample = RandomOverSampler(sampling_strategy='minority')
        x_np = X_train.to_numpy()
        y_np = y_train.to_numpy()
        x_np, y_np = oversample.fit_resample(x_np, y_np)
        x_over = pd.DataFrame(x_np, columns=X_train.columns)
        y_over = pd.Series(y_np, name=y_train.name)
        return x_over, y_over

data_loader = DataLoader()
data_loader.load_dataset('data/healthcare-dataset-stroke-data.csv')
data_loader.preprocess_data()
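One possible explanation for the index error, not confirmed anywhere in this thread: recent SHAP releases appear to return a single array of shape (n_samples, n_features, n_classes) for multi-output models instead of a list with one array per class. With a single test row, `shap_values[1]` would then index the second sample, which does not exist. The sketch below uses stand-in NumPy arrays rather than real SHAP output, and `shap_values_for_class` is a hypothetical helper that accepts either layout.

```python
import numpy as np

def shap_values_for_class(shap_values, class_index):
    """Return the (n_samples, n_features) SHAP matrix for one class.

    Accepts either a list with one array per class, or a single array of
    shape (n_samples, n_features, n_classes).
    """
    if isinstance(shap_values, list):
        return shap_values[class_index]
    values = np.asarray(shap_values)
    if values.ndim == 3:
        return values[:, :, class_index]
    raise ValueError(f"unexpected SHAP value shape: {values.shape}")

# Stand-in data: 1 sample, 5 features, 2 classes (not real SHAP output).
as_array = np.arange(10.0).reshape(1, 5, 2)
as_list = [as_array[:, :, k] for k in range(2)]

class1_from_array = shap_values_for_class(as_array, 1)
class1_from_list = shap_values_for_class(as_list, 1)
print(class1_from_array.shape, class1_from_list.shape)
```

Printing `type(shap_values)` and `np.shape(shap_values)` right after the `explainer.shap_values(...)` call would show which layout the installed version produces.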
I mean you have sharex=True so I'm not sure what you're expecting
I didn't realize sharex being true did that, I'm new to matplotlib. I thought it was for like sharing other properties
the sharex and sharey literally means they're sharing 1 x/y axis
so when you change the shared axis, you'll see the effect on both plots
lmao
reminds me of Little Bobby Drop Tables
ok ok thanks
hi
hi, quick question: does anyone here have some insight into the nlp model flair? it seems to be a model and a framework at the same time
flair is an nlp library and there are models that come with flair.
it's a similar arrangement as spaCy
can you give some example of that specific model
A very simple framework for state-of-the-art Natural Language Processing (NLP) - flairNLP/flair
can we introduce some algorithm that makes use of flair NLP?
If you had the choice would you learn numpy before matplot or the other way around?
for me i think numpy first
Neither
I'd learn them as I needed them
numpy is the foundation of all python data science.
matplotlib is a horrible thing. I'm trying to move to plotly.
hello, i'm thinking about web scrape a website but it only has price and the name of the product
i mean is that all enough ?
hi, does anyone know local models for spoken language recognition that work on cpu, besides the models from speechbrain? it should only return which language the audio file is in, thanks
start by making a part-of-speech tagger.
i know the basics of python
idk anything about ai development
It's easier then. Just start off with libraries like numpy and pandas.
Those will help with data manipulation, then go on to scikit learn, pytorch and tensorflow. Purely ml libraries. Ofc u gotta learn them
whats the level of math i gotta know to start
cuz my math aint that good
Start off with linear algebra. There are a few channels. I recommend 3Blue1Brown's playlist. Watch and learn it
the whisper models can do language recognition (when language is not specified), and I don't think there's anything strictly preventing you from running it on CPU
of course, it's a bit overkill for just it, since generally whisper is for transcription.
You wanna practice web scraping ??
I agree about matplotlib and plotly here
I'm just a bit stockholm'd in terms of matplotlib
yeah
Why? It looks like you're someone who writes bots. Lol
I wanna do a project in university in which i have to use python
some ai related task but I wanna make gui also
we can use tkinter in py but it is not good
i also know c# and I wanna make gui in c#
what should I do to combine both languages?
lol
what u using for scraping??
selenium ? soup ? lxml ?
I used to practice here
beautiful and selenium
also instead of doing so much bullshit with putting command on the terminal to upload the file from the computer, YOU CAN SIMPLY DRAG AND DROP THE FILE LIKE A NORMAL PERSON
what??
is this even relevant here?
how do you use both arent they both parsers
and how do you request the html
late question but what do i do after the learning the math, i dont think i can start building my own ai after learning maths
actually, u can, a.i. is just math, there is nothing more to it
once you know the math, its a breeze to learn how to translate the math into code
which is all machine learning is
So Ive been tinkering with focal loss for my class imbalance
I got promising results on BiLSTM with focal loss
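For readers who have not met focal loss, a compact NumPy sketch of the binary form. It is a simplification of whatever was used on the BiLSTM (sentiment tasks are often multi-class), and the default `gamma` and `alpha` values are just commonly quoted starting points, not tuned settings. The `(1 - p_t) ** gamma` factor shrinks the loss of examples the model already gets right, so hard and minority-class examples dominate the gradient.

```python
import numpy as np

def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss.

    probs:   predicted probability of the positive class, shape (n,)
    targets: 0/1 labels, shape (n,)
    """
    probs = np.clip(probs, eps, 1.0 - eps)
    # p_t is the probability the model assigned to the true class.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # alpha_t weights the positive class by alpha, the negative by 1 - alpha.
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t))

probs = np.array([0.9, 0.6, 0.1])
targets = np.array([1, 1, 0])

focal = binary_focal_loss(probs, targets)
# With gamma=0 and alpha=0.5 the focal loss is half the usual cross-entropy.
half_bce = binary_focal_loss(probs, targets, gamma=0.0, alpha=0.5)
print(focal, half_bce)
```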
honestly tho sentiment analysis is really hard
in terms of having great metrics
its not like your regular CNN
how you preprocess text, how you set up your model, how you deal with class imbalance
its just way too much tinkering and adjusting
to use lemmatization or not, to remove stop words or not, what kind of embedding to use? TF-IDF, Word2Vec or GloVe, what type of regex to use? Do you need to remove $ sign from your text and bla bla bla
at the end of it I turned into a zombie
I wish there was an easier way to deal with the regex part of it
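One way to keep the regex part manageable is to put each decision behind a named pattern and a flag, so experiments only flip a parameter. A sketch using the standard library `re` module follows; the patterns and the `keep_symbols` flag are illustrative choices, not a recommended pipeline.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# Keep letters, digits, whitespace, apostrophes and a few sentiment-bearing symbols.
KEEP_SYMBOLS_RE = re.compile(r"[^a-z0-9\s$%!?']")
# Stricter variant: drop the symbols as well.
DROP_SYMBOLS_RE = re.compile(r"[^a-z0-9\s']")
SPACE_RE = re.compile(r"\s+")

def clean_text(text, keep_symbols=True):
    """Lowercase, strip URLs and @mentions, filter characters, collapse spaces."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    pattern = KEEP_SYMBOLS_RE if keep_symbols else DROP_SYMBOLS_RE
    text = pattern.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()

sample = "Loved it!!! $5 well spent @shop https://x.co/a"
print(clean_text(sample))
print(clean_text(sample, keep_symbols=False))
```

Running the model once with each setting is usually the only reliable way to answer questions like "should the $ sign stay".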
What’s harder: NLP or CV? Go!
thanks
you can actually learn all about ML without being a math wizard. understanding what math is applied and what it achieves is critical. it can be misleading though. you dont need to be an Actuary, but you need to understand the concepts and why we apply them or don't.
can you ask help questions here?
Probably...
Either ways i have a question
Im tryna learn AI with python, what module to start with?
Sklearn, Keras, Pytorch, Tensorflow?
I'm making an ai voice assistant that uses gemini ai, it works fine for the first question but when i try to ask it once again it does not work anymore, it takes the prompt perfectly well, while it is listening for wake word, the mic icon is lit up for me but after that it turns off, any help?
https://paste.pythondiscord.com/DGTA
numpy and pandas/polars
Hoi, just need to ask if what i want to try is doable, or rather even feasible/worth it
So with A.I models, they can chug all your vram and then some. Can I make, say, comfyui not use shared vram (aka system ram), but dump it over to a second gpu whose only purpose is to hold the memory for the main gpu to read? For instance a 3060 12GB
And as PCIe gen 4 is about 2 GB/s per lane, and a second gpu with bifurcation would make the main gpu x8 and the 3060 x4, would that make it even slower than ram holding the dump of A.I nonsense?
GPT spat out this code, but don't know if it'd even work as intended lol.
import torch
primary_gpu = 'cuda:0'
secondary_gpu = 'cuda:1'
# Simulate workload
try:
    data_primary = torch.zeros((5000, 5000), device=primary_gpu)  # Allocate on primary GPU (~95 MB of float32)
    print("Primary GPU memory usage:", torch.cuda.memory_allocated(0) / 1024**2, "MB")
    # Check if primary GPU is nearing capacity, then offload
    if torch.cuda.memory_allocated(0) > 2 * 1024**2:  # Example: 2 MB threshold (2 GB would be 2 * 1024**3)
        print("Offloading to secondary GPU...")
        data_secondary = data_primary.to(secondary_gpu)  # this copies; data_primary still occupies GPU 0
        print("Secondary GPU memory usage:", torch.cuda.memory_allocated(1) / 1024**2, "MB")
except RuntimeError as e:
    print("Error:", e)
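A back-of-the-envelope check of the bandwidth question, using the 2 GB/s per lane figure from the message above (real-world throughput is a little lower). It assumes the bifurcated layout described there (main GPU at x8, 3060 at x4), that a GPU-to-GPU copy is limited by the narrower of the two links, and that system RAM itself is not the bottleneck. Protocol overhead and whether peer-to-peer transfers are available on consumer cards are ignored.

```python
GB_PER_S_PER_LANE_GEN4 = 2.0  # figure quoted in the question; approximate

def link_bandwidth(lanes, per_lane=GB_PER_S_PER_LANE_GEN4):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    return lanes * per_lane

def transfer_seconds(size_gb, bandwidth_gb_s):
    """Time to move size_gb over a link with the given bandwidth."""
    return size_gb / bandwidth_gb_s

main_gpu_link = link_bandwidth(8)    # main GPU at x8 after bifurcation
second_gpu_link = link_bandwidth(4)  # second GPU (the 3060) at x4

# Data parked on the second GPU crosses both links; the x4 link is the ceiling.
peer_bandwidth = min(main_gpu_link, second_gpu_link)
# Data parked in system RAM only crosses the main GPU's own link.
ram_bandwidth = main_gpu_link

size_gb = 4.0  # hypothetical amount of spilled data
print("via second GPU:", transfer_seconds(size_gb, peer_bandwidth), "s")
print("via system RAM:", transfer_seconds(size_gb, ram_bandwidth), "s")
```

Under these assumptions the second GPU would be the slower place to park data, roughly by a factor of two, and the main GPU also loses half its lanes for everything else.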
Alr, thanks
How does one webscrape YouTube comments? They weren't in a “div” or the usual tags
This channel isn't about web scraping, but you can't download YouTube videos like that without violating their terms
DM me. I can point you in the right directions.
A Gradio-based tool using SAM and Stable Diffusion for interactive image segmentation and inpainting, runnable in just 3 lines of code. https://github.com/SanshruthR/Stable-Diffusion-Inpainting_with_SAM
Sklearn
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
Just the comments for NLP cleaning
hey everyone, trust we're all having a good day.
is DSA required for data science?
DSA is required in every Computer Science field.
DSA is inherent to writing any code really
can anyone suggest some good ML projects that can get me internships
hello to everyone. Im doing my masters in DS but feel like idk sh!t about python. The rest is history😂 . Hope this server will help!
https://www.dataquest.io/blog/jupyter-notebook-tutorial/
I recommended something interactive as you follow along and learn, there's a decent tutorial. I would search for this kind of content though.
Hey, has anyone managed to get BipedalWalker working with an actor-critic before? Like the normal one from Gymnasium. SAC keeps sucking, I tried different lr and different batch sizes and it is just consistently stuck at around the -70 to -40 range for rewards
What is the easiest and least taxing way to fine tune BERT from a pandas dataset?
What kind of quant are you trying to be?
Ask lots of questions and hang out in #python-discussion . You'll learn something new every day. I do.
If you are trying to be a quant (I assume this because of your username) you need to get down: Markowitz, CAPM, and single index models. And you need to be able to do them like they are nothing, because they are nothing. You should be able to do it on the spot at any given time.
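As a reference point for two of those, a short NumPy sketch of the single index model (an ordinary least squares fit of asset returns on market returns) and the CAPM expected return formula. The return series are made up so that the true beta is 1.5, and the rates in the CAPM call are placeholder numbers.

```python
import numpy as np

def single_index_fit(asset_returns, market_returns):
    """Fit r_asset = alpha + beta * r_market + noise by least squares."""
    cov = np.cov(asset_returns, market_returns, ddof=1)[0, 1]
    beta = cov / np.var(market_returns, ddof=1)
    alpha = np.mean(asset_returns) - beta * np.mean(market_returns)
    return alpha, beta

def capm_expected_return(risk_free, beta, market_expected):
    """CAPM: E[r] = r_f + beta * (E[r_m] - r_f)."""
    return risk_free + beta * (market_expected - risk_free)

# Illustrative data: the asset moves 1.5x the market plus a constant 0.2%.
market = np.array([0.01, -0.02, 0.03, 0.015, -0.005])
asset = 0.002 + 1.5 * market

alpha, beta = single_index_fit(asset, market)
expected = capm_expected_return(risk_free=0.02, beta=beta, market_expected=0.08)
print(alpha, beta, expected)
```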
Are there any good papers to research on machine learning so I can learn how to use my twitch better?
John McCarthy is quite cool http://jmc.stanford.edu/articles/index.html. He has multiple research papers on machine learning
Thank you
Which one should I look out for specifically?
https://youtu.be/HiOtQMcI5wg?si=tDUlbSkkO5vSVbHS
why this guy didn't use selenium as we know that amazon is a dynamic website?
Web Scraping isn't just for those fancy "programmers" and "software developers". Us analysts can use it too! In this project I walk through how to scrape data from Amazon using BeautifulSoup and Requests.
LINKS:
Code in GitHub: https://github.com/AlexTheAnalyst/PortfolioProjects/blob/mai...
Oooh, I see what you're doing. Add more negative data, and mark I hate you as negative
Selenium requires extra set up.
Like?
I thought that we can't use requests on dynamic websites
Only selenium is possible
Yo guys, right now im learning linear algebra with book "Linear Algebra Done Right" by Axler. Do you think its a good choice for learning linear algebra for ML?
How can I label a data set?
you can use a labelling tool
Hear my words: do not try to create your own dataset. Just find an existing one and train the neural networks on that
What kind of tool?
Well, yes and yes. I'm going to be placing this into a robotic dog and I want to make it so that it can avoid objects. I don't know if there's a furniture and room dataset, so that it doesn't bump into walls etc.
and it will be preferrable to use Yolo architecture to train the data
you can consider CNN but i will prefer yolov8 or v9
what do u want to scrape
How to fix this error on colab?
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-6-9f1d71c52d24> in <cell line: 1>()
----> 1 from inltk.inltk import get_embedding_vectors
2 vectors = get_embedding_vectors(text, 'kn')
6 frames
/usr/local/lib/python3.10/dist-packages/fastai/imports/core.py in <module>
7
8 from abc import abstractmethod, abstractproperty
----> 9 from collections import abc, Counter, defaultdict, Iterable, namedtuple, OrderedDict
10 import concurrent
11 from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
ImportError: cannot import name 'Iterable' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.
To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------
I am trying to do a simple use of inltk ->
from inltk.inltk import get_embedding_vectors
vectors = get_embedding_vectors(text, 'kn')
other than downgrading Python, what's my other solution?
What have you tried + what is your challenge?
the traceback is related to the collections module, not to 'inltk'
read this
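For context: `Iterable` and the other abstract base classes live in `collections.abc`, and the old aliases on `collections` were removed in Python 3.10, which is what the traceback is complaining about. A commonly used workaround, sketched below, is to restore the aliases before importing the old library. This is a stopgap rather than a fix: `patch_collections_aliases` is a made-up helper name, and an old fastai release may hit other incompatibilities afterwards, so pinning a compatible environment remains the more reliable route.

```python
import collections
import collections.abc

# ABC names that old code still imports from `collections` directly.
_MOVED = (
    "Iterable", "Iterator", "Mapping", "MutableMapping",
    "Sequence", "MutableSequence", "Set", "MutableSet", "Callable",
)

def patch_collections_aliases(names=_MOVED):
    """Re-add the removed aliases to `collections`; return the names patched."""
    patched = []
    for name in names:
        if not hasattr(collections, name):
            setattr(collections, name, getattr(collections.abc, name))
            patched.append(name)
    return patched

patch_collections_aliases()

# After patching, the old-style import from the traceback no longer fails.
from collections import Iterable
print(isinstance([1, 2, 3], Iterable))
```

In a notebook this has to run in a cell before `from inltk.inltk import get_embedding_vectors`.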
i have this code which tries to make gridded plots. it works fine without the grids, but now i get an error: ```Traceback (most recent call last):
File "/home/fbwdw/docs/polgeo/annealer/python/solver.py", line 649, in <module>
plot_path(3, 3)
File "/home/fbwdw/docs/polgeo/annealer/python/solver.py", line 594, in plot_path
hist_ax1.tick_params(axis="y", labelcolor="blue")
^^^^^^^^^^^^^^^^^^^^
AttributeError: 'numpy.ndarray' object has no attribute 'tick_params'
def plot_path(num_plans: int, num_problems: int):
    hist_fig, hist_ax1 = plt.subplots(num_plans, num_problems)
    for plan in range(num_plans):
        for problem in range(num_problems):
            data = json.load(open(f"results/solutions{problem}_plan_{plan}.json"))
            assignment = data["assignment"]
            hist = data["hist"]
            sim_annealer_history = []
            mlp_fit_indices = []
            mlp_fits = []
            current_index = 0
            for annealer_history, mlp_fit in hist:
                sim_annealer_history.extend(annealer_history)
                mlp_fits.append(mlp_fit)
                current_index += len(annealer_history)
                mlp_fit_indices.append(current_index)
            hist_ax1[plan, problem].plot(
                [i[0] for i in sim_annealer_history],
                label="Objective Function",
                color="blue",
            )
            hist_ax1[plan, problem].set_xlabel("Iteration")
            hist_ax1[plan, problem].set_ylabel(
                "Simulated Annealer History", color="blue"
            )
            hist_ax2 = hist_ax1[plan, problem].twinx()
            hist_ax2.scatter(
                mlp_fit_indices, mlp_fits, color="red", label="MILP Fit", zorder=5
            )
            hist_ax2.set_ylabel("MILP Fit", color="red")
            hist_ax1[plan, problem].tick_params(axis="y", labelcolor="blue")
            hist_ax2.tick_params(axis="y", labelcolor="red")
```
how many columns in your dataframe have?
hist_ax1 is currently a multi-dimensional array of axes, that's why you can't call tick_params on it directly
Installing the selenium driver
ok i made some changes (edited original code) and now all the plots are empty
I think you don't need to install that in newer version of selenium
it just works smoothly with chromium
hist_ax1[plan, problem] should be different. It returns an array of axes, you should do hist_ax1[plan][problem]
looking at the docs i think that part is fine: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.subplots.html
I want a 2D grid of plots
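A small standalone sketch of a 2D grid with `plt.subplots`, using dummy data and the headless Agg backend so it runs without a display. It illustrates two points about the return value: with more than one row and column it is a NumPy array of Axes (so methods such as `tick_params` have to be called on an element, not on the array), and `axes[plan, problem]` and `axes[plan][problem]` refer to the same Axes object. Passing `squeeze=False` keeps the array 2D even when one dimension is 1.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

num_plans, num_problems = 3, 3
# squeeze=False guarantees a 2D array of Axes, even for a 1xN or Nx1 grid.
fig, axes = plt.subplots(num_plans, num_problems, squeeze=False)

for plan in range(num_plans):
    for problem in range(num_problems):
        ax = axes[plan, problem]  # same object as axes[plan][problem]
        ax.plot(np.arange(10), np.arange(10) ** 2, color="blue")
        ax.set_xlabel("Iteration")
        ax.tick_params(axis="y", labelcolor="blue")  # called per Axes, not on the array

fig.tight_layout()
```

With an interactive backend, `plt.show()` (or `fig.savefig(...)`) is still needed after the loops for anything to appear.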
Hello, I am looking to clean texts extracted from a pdf, to remove unnecessary information, the goal is to remove footnotes from a page. (the project is larger, the aim is to train a model from texts extracted from a company's documents to make a small model).
- Train a model to determine whether the page contains footnotes (I already have 1000 docx files that are very easy to determine whether the page contains footnotes or not). my pages are therefore labelled
- Extract features from all the pdf pages to train a model.
I've tried using pymupdf but I can't extract certain elements such as exponent numbers (which are often used in footnotes).
so to summarise I'm training a model that determines whether the page has footnotes or not, and if it does, I remove them (part not yet done).
Do you have a general idea for removing footnotes from a pdf, or has someone already done this? I don't really see any other solution.
You're right, mb. I see what's up now
hist_ax1.tick_params(axis="y", labelcolor="blue") what do you want to achieve with this?
it just changes the color of the ticks
i dont think it affects the code much
yeah... without it all the plots are still empty
hmm, it seems you were right, and the docs were wrong. i will open an issue in the github
I mean, don’t go overboard. It doesn’t matter as much as people say. Know basic stuff. Know the fundamentals, which is saying little. The main determinant that will make you good is programming a lot. No way around that. You need to know the language and libraries and standards VERY well. I don’t care what anyone says; just program as much as possible and stay in the right direction in terms of how you practice; hardcore ml/dl programming skills dominate really everything else.
what would you give this roadmap on a scale of 1 to 10?
I give all roadmaps 0's.
And this particular one seems worse than most.
i feel like all devs are pessimistic, roadmaps are terrible, bootcamps are a waste of money, courses are meaningless and yt videos are a waste of time. However everyone agrees that building projects is a good way to learn
what do you think is the best way to learn?
I think yt and courses and bootcamps are fine ways to learn.
Projects, of course, is where you practice the skills... and you can't get good without practice.
BUT roadmaps: There is no roadmap. None of us took the same path. The only important decision is: What to learn/do next?
alternatively: the roadmap to becoming a Data Scientist is: get a bachelors degree, then a masters degree, then a phd 🙂
sup guys, i have recently finished re-implementing a GPT model in code after learning the transformer architecture, and now im trying to learn about LLMOps (LangChain and all). im not clear what the road map should be tbh, can someone suggest what to do? (I did machine learning with maths, deep learning with math)
And i would highly appreciate if u guys can suggest me some good projects to practise and hone my skills.
I really need some good project ideas
this could be the only decent path to becoming a data scientist
just curious, did you do your bachelor in CS or DS?
doing CS
bachelor?
yea
how did you make/train the model?
GPT architecture
@left tartan if u could help pro 🫠
World model is getting exciting recently. Anybody know if there are any good definition and taxonomy of what is exactly a world model?
https://www.worldlabs.ai/about
https://www.sciencedirect.com/science/article/pii/S0893608021003610
Just do it. People talk so much. If you want to do it, go learn. Practice. No one stops anyone. That’s it. It’s so simple. Just go put thousands of hours into this. People just talk so much. If you really wanted to do this, you couldn’t be bothered to not do it, so I know you do not want to do it. Just learn. It has to naturally dominate your thoughts, not for money but just the fact it exists.
if am going to use a nlp library like flair we are allowed to use different model right
and when it comes to algorithm we can apply some algo into
i was able to use Chrome() without any web driver now idk why suddenly its asking for chrome web driver
kaggle gives you 60gb, if thats not enough, host them and download them in batches with requests
you could zip it, upload, unzip with python
you prolly already uploaded it by now 🤷♂️
Then go to dataset and create one
Public/private as per you
And upload zip file there
Then you can easily use that dataset in notebook
guys I need someone to do code review for me, can someone help??
#❓|how-to-get-help Click code review
does T4 work on something akin to a free-trial system? because I got a message that was like
Sorry bud. No more fast compiling for you. You've run out of GPUs.
How much for premium? Screw you [or literally "it depends"]
but i'm dumb so im probably misinterpreting things. All I know is the option that makes my face recognition pipeline go from running in 30 mins to running in 2 mins can't be used anymore.
Why on earth does this make more money than computer science?
because data is everywhere and a key component in today's decision making process
Yes, that makes sense. I don’t think the economy needs another software engineer with crusty old compilers
these are all means to an end, not end in themselves
I was joking
What is community involvement?
Anyone here that hosts local LLMs? How much time and effort does it take and is it any good to do so?
is there any difference between torch.nn.ReLU and torch.nn.functional.relu?
haven't u heard of llama?
"torch.nn.ReLU" is the module version
it is used layer-wise (while defining the layers of the nn)
in simple terms
torch.nn.ReLU -> is created in your neural net class and then called like any other layer
torch.nn.functional.relu -> is called directly in your forward method (only if you have not specified relu layers); both compute the same thing
I have
how much RAM do you have?
there are different ways to download
20 GB
ohh nice you can easily run 3B one
llama is a model series
ollama is one way of connecting to a model
and also you probably don't care about RAM but VRAM (and what gpu it is specifically)
thanks, and I have an integrated GPU, is it fyn?
ahh okay
thanks
Intel (R) UHD graphics
then you don't want to run locally unless you can withstand like 1 token per second
most likely less
come on, we can easily run instruct models nowadays
...with a dGPU.
is your ram ddr4 or ddr5
if it's the latter then it'll be slightly more bearable
DDR4
Hmm okay
yeah... you're likely gonna be counting in seconds per token
yea, was thinking of mostly buying
if you want to try still, I suggest running a quantized Gemma 2b, gguf format
(gguf with llama.cpp is the usual route for non-gpu inference)
https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct
how to request to get access?
Scroll down once you're logged in and fill this out.
if you're just looking for inference, just download the quantized file from someone else instead of the full precision model
someone else?
but this are gated models right?
yea
there are multiple people making quants
you gotta request it, its located on the url link. click expand
yeah I was finding this
ohh thanks
lol I was searching this somewhere else
its easy to miss, i figured it might have been it because I ran into that once before
you don't want to run a full prec model anyways
a Q8 quant is basically the same quality, but cuts vram requirements in half
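Rough arithmetic behind that, as a sketch only: memory for the weights is about parameter count times bits per weight. The 1e9 parameter count is a round-number assumption for a "1B" model and `weight_memory_gb` is a made-up helper. Real quant files also store scales, and the KV cache and activations come on top, so treat these as lower bounds.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


n_params = 1e9  # assumption: round number standing in for a "1B" model

# Nominal bit widths only; actual quant formats use slightly more per weight.
for name, bits in [("bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{name:>5}: ~{weight_memory_gb(n_params, bits):.2f} GB")
```

So going from bf16 to 8-bit halves the weight memory, and 4-bit halves it again.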
ohh so
https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct
is this full precision?

yes
see the model card, tensor type
bf16, that's usually what full prec models nowadays will be in
okay understood
which is good for 8GB ram?
without even GPU
4-bit ??
I just wanna try them
nothing
but generally Q4_K_M is a good balance*
*(but smaller models seem to be getting so packed with info that even q4 has noticeable quality degradation)
*(but large models, i.e. 70b range, seem to run fine even at 2 bits)
@jaunty helm are you a ML/AI dev?
no
I do run small LLMs for fun tho
okay got it
any suggestions for fine-tuning?
not with that setup
people I see doing tunes have at least 4090 or even stuff like H100
oh i see
come on how can I fine-tune them on CPU
I will probably use sage-maker ( they have some free tier I guess )
check this for a rough guideline on hardware reqs for tuning
https://github.com/hiyouga/LLaMA-Factory
ahh thanks
just don't
try even training a small CNN on CPU only
then imagine how much slower it will be to do things on LLMs, which are 1000s of times larger than that
yeah I know that
usually I do all stuff on T4 GPU of aws
hey ik its a trivial question but, is DSA important for ML?
it depends
on what?
don't consider DSA as DSA
always consider it as a way of filtering candidates
think about it
all candidates have projects ready for the interview, all have knowledge about basic things
but only few will get selected so they filter us by "DSA"
although they ask simple logical questions and sometimes they just ask you to explain some concepts
so it depends
so like its good to have knowledge about it
in which year you are in?
graduated, started learning ML after college
are u working now?
nope, im trying to get into another uni for masters
so you can probably grind at intermediate level at your masters time
but nowadays people are also getting jobs by contributing
yeah thats the plan at least
well lets see whats the outcome of the application
btw wym by contributions?
i would say it's fairly important, especially for ML people. if only because you need to be able to recognize problems and how to solve them. many DSA problems are things with good/reasonable solutions that you don't need ML for, so being familiar with it helps you decide whether ML is worth it or not
(ML is usually the wrong answer for many problems)
there are opensource AI startups sir
i see thanks
@jaunty helm is llama.cpp important for inference after downloading model file
or is there any different ways to do so?
yes (assuming gguf)
I run koboldcpp tho
what about this?
I like this one https://github.com/oobabooga/text-generation-webui
"ML people" is very vague in 2024. Most of ML is turning into a commodity and unpopular opinion, if you're sitting at the level where you're just calling LLMs and maybe training some models with sklearn/torch then not knowing DSA will not significantly hurt you
also one thing
I have mobile application where I can download easily this models and run the inference
but the main thing to notice I don't have to use llama.cpp files then
just chatbot like interface and can start inference easily
so how this is easily done?
thats the thing, im just starting out in ML, and i have already started with the most basic things, dsa sql and stuff, i just wanted to know like lets say to enter into job market, how much companies emphasize on it, and even after that getting in whats the scope for it
No one has ever asked me DSA in a job interview but your mileage may vary
ohh nice, for which role ?
data scientist, ML engineer, AI engineer, etc etc etc
llama_perf_context_print: load time = 1581.69 ms
llama_perf_context_print: prompt eval time = 206.91 ms / 8 tokens ( 25.86 ms per token, 38.66 tokens per second)
llama_perf_context_print: eval time = 19642.90 ms / 215 runs ( 91.36 ms per token, 10.95 tokens per second)
llama_perf_context_print: total time = 19902.65 ms / 223 tokens
eval is inference right?
so 38 tokens per second is good I guess
after reviewing the other benchmark 38.66 seems way too fast.. I think its got to be the 10.95 ?
unless your pc is jet powered.
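For what it's worth, both numbers can be recomputed from the log itself. A small sketch that parses the lines pasted above (the regex and the `throughput` helper are my own, not part of llama.cpp): the "prompt eval" line is the speed of reading the 8-token prompt, and the "eval" line is the actual generation speed.

```python
import re

LOG = """\
llama_perf_context_print: load time = 1581.69 ms
llama_perf_context_print: prompt eval time = 206.91 ms / 8 tokens ( 25.86 ms per token, 38.66 tokens per second)
llama_perf_context_print: eval time = 19642.90 ms / 215 runs ( 91.36 ms per token, 10.95 tokens per second)
llama_perf_context_print: total time = 19902.65 ms / 223 tokens
"""

# Matches "<stage> time = <ms> ms / <count> tokens|runs" for the two eval stages.
PATTERN = re.compile(r"(prompt eval|eval) time\s*=\s*([\d.]+) ms\s*/\s*(\d+) (?:tokens|runs)")


def throughput(log: str) -> dict:
    """Return tokens per second for each stage found in the log text."""
    rates = {}
    for stage, ms, count in PATTERN.findall(log):
        rates[stage] = int(count) / (float(ms) / 1000.0)
    return rates


for stage, rate in throughput(LOG).items():
    print(f"{stage}: {rate:.2f} tokens/s")
```

Generation speed is the "eval" figure, so roughly 11 tokens per second here.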
Whats your CPU @unkempt apex
you don't need llama cpp to do inference
please explain.
other backends exist
like vllm, aphrodite, tabby, etc
cool thanks.
yeah it's good for ryzen 3 3200g
just saw the vllm code, the level of understanding needed to even read that is way high
this guys are cracked
i am extremely passionate for AI and ML and mathematics
is there a way to check if a matrix is diagonalizable? After research and brute force i reached this:
import numpy as np
from scipy.linalg import eig
A = np.array([[1, 2, 1],
              [-1, 3, 1],
              [0, 2, 2]])
eigenvalues, eigenvectors = eig(A)
if np.linalg.matrix_rank(eigenvectors) == A.shape[0]:
    print("Matrix A is diagonalizable.")
    P = eigenvectors
    P_inv = np.linalg.inv(P)
    diagonal_matrix = P_inv @ A @ P
    print(diagonal_matrix)
else:
    print("Matrix A is not diagonalizable.")
but it doesnt seem just right
Hello, do any of u know how to create a virtual env in VS Code, on windows?
How do you install virtualenv correctly on windows?
I downloaded virtualenv1.9.1 from here and tried installing it with:
python virtualenv.py install
but it does not appear in MyPythonPath/Scrip...
aight thanks
I am assuming its not possible to do that in google colab
colab manages with sessions
yes
if I have a degree in mathematics can i skip past the maths part of learning ML?
even tho i've forgotten most of it
guys when using regression i can use the same feature to calculate a cutome_score variable and then use it as Y, that wont affect the model right?
i only worked with classification but it turned out the model accuracy is 76% because i always try to predict car model, which turned out to be a bad idea because i have over 1000 classes
Masters or bachelors and what country?
i thought it was programming
So is driving a car. 😂
Again, depends on what you want to do
Nowadays there's people using off-the-shelf stuff, others training models using high level interfaces like sklearn/torch and finally doing novel stuff, in R&D
For most of it the math you need isn't a lot
For instance, I had "standard" university linear algebra and calculus
this is just about right. diagonalizability requires you to check whether you have enough linearly independent eigenvectors to span the whole space, i.e. n of them for an n x n matrix
you could make the check exit faster if all the eigenvalues are distinct to machine precision
nvm, this function is actually bad, it doesnt work properly but i already fixed it
the basic idea is right
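A sketch of that idea with the early exit added, under a couple of assumptions: `is_diagonalizable` and the `tol` value are my own choices, not from the code above, and any floating-point check like this is only approximate (a nearly defective matrix can land on either side of the tolerance).

```python
import numpy as np


def is_diagonalizable(A, tol=1e-8):
    """Numerically check whether a square matrix has a full set of eigenvectors."""
    A = np.asarray(A, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eig(A)

    # Fast exit: pairwise-distinct eigenvalues always give independent eigenvectors.
    gaps = np.abs(eigenvalues[:, None] - eigenvalues[None, :])
    np.fill_diagonal(gaps, np.inf)
    if gaps.min() > tol:
        return True

    # Repeated eigenvalues: fall back to the rank of the eigenvector matrix,
    # with an explicit tolerance (assumed value) instead of the default.
    return bool(np.linalg.matrix_rank(eigenvectors, tol=tol) == A.shape[0])


A = np.array([[1, 2, 1], [-1, 3, 1], [0, 2, 2]])  # the matrix from the question
jordan = np.array([[1, 1], [0, 1]])               # classic non-diagonalizable example
print(is_diagonalizable(A), is_diagonalizable(jordan))
```

The matrix from the question has characteristic polynomial (λ-1)(λ-2)(λ-3), so it takes the fast path.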
hey im trying to use the openai API for the first time, im trying to get it to fill out a question automatically when someone puts in a certain topic, has anyone worked with it before?
using python and its for my history class 😭
i have a question on this
is it really that necessary to first learn the math of it
or learn math while learning ML
I've watched a lot of videos about ML
and they all mention that math is fundamental knowledge
especially statistics
linear algebra i heard when working with neural networks
and probability
like normal distribution
calculus
@weary timber first link here is good
if it explains it all thats good
only 99 pages
I think it does, idk all the math required for ML
for just starting out 3b1b is a great resource
his series on neural networks goes over the intuitions behind nn's and gradient descent, then goes into the math, he also has entire series for learning linear algebra, derivative calculus, and (I think) statistics
so what do i do
funnily enough i wouldn't recommend 3b1b to anyone starting out
its not 99 pages i looked at the first page of the contents
its 400 pages 😭
the videos make most sense in the context of someone that has already seen the topics at least once, to get a different perspective on them
so what do i do
yall know
there's no shortcut. if you wanna make the networks yourself while understanding what's going on, you have to do what everyone else that works that way did. eat a handful of books
👍🏿
if all you care about is using off-the-shelf models, just cover what you need to develop intuition
this might be a dumb question but have you watched his neural network series? and any of the episodes from his linalg or calculus series?
if you wanna make novel models and do research, you have to read as many books and papers as the experts working on that
i have
I think I'd agree with what you say about the math part in the nn video
but I used the linalg ones to get a headstart on linalg in college I found it good as a beginner who had no prior knowledge
although khan academy might even have been a bigger contributor
is linalg itself enough tho
pretty sure the same guy does content for khan academy
khan academy is like diet book reading
no
Enough for what? lol
Is $120,000 USD a lot?
could someone guide me to a # channel where i could ask a pyspark question?
i have the word "job" and i want to get the 10-15 most related words from a 400 word list of words related to "job", is there a way to do this ?
i cant seem to figure out how i would define this "relation" between words
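One common way to define that "relation" is cosine similarity between word embeddings. A minimal sketch, assuming you already have one vector per word from some pretrained embedding model; the 3-d vectors and the `most_related` helper below are made up purely for illustration.

```python
import numpy as np


def most_related(query_vec, word_vecs, words, top_n=10):
    """Rank words by cosine similarity between their vectors and the query vector."""
    M = np.asarray(word_vecs, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    sims = (M @ q) / (np.linalg.norm(M, axis=1) * np.linalg.norm(q))
    order = np.argsort(-sims)[:top_n]  # indices of the highest similarities first
    return [(words[i], float(sims[i])) for i in order]


# Hypothetical toy embeddings; real ones would have hundreds of dimensions.
words = ["career", "salary", "banana"]
word_vecs = [[0.9, 1.0, 0.1], [1.0, 0.5, 0.0], [0.0, 0.1, 1.0]]
job_vec = [1.0, 1.0, 0.0]

for word, score in most_related(job_vec, word_vecs, words, top_n=2):
    print(word, round(score, 3))
```

With a real 400-word list you would pass `top_n=10` or `top_n=15` and swap in actual embedding vectors.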
That Sosa album is amazing. I’ve listened to it a nonsense amount of times.
fr
True Religion Fein, just classic. Every track is just immaculate.
Winnin with King Louie 🥶
It’s just too good
It depends
Ok, what about living alone with no little paws scratching at me to feed them?
Again, it depends, where are you located?
US as I said. Alabama
Super basic question I think, but why is it against best practice to have a sigmoid activation in the final layer of a neural network?
Like specifically the output layer, just to be clear
For binary targets
Like I'm assuming the neural network is a binary classifier
nah, ML is glorified statistics
hey guys
i am trying to make a classification model for a more advanced task and i wanna get it published, but my friend says that I should probably integrate p-values and other inferential stats into it. i know a little bit of stats because i used to do data analysis with r, but other than the basics, idk
how do i do this?
published, in an academic publication?
mb, I meant it in the sense of a science fair
but overall, I was recommended to integrate more stats in addition to loss(categoricalcrossentropy ofc)
in either case, you need to demonstrate that the performance of your model is statistically significant. which means that the performance can't just be attributed to random chance.
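One simple way to put a number on "not just random chance" is an exact one-sided binomial test of test-set accuracy against a chance baseline. This is a sketch under assumptions: test samples are independent, the chance rate is whatever a trivial guesser would get (for example the majority-class frequency), and `binomial_p_value` plus the counts below are hypothetical.

```python
from math import comb


def binomial_p_value(n_correct: int, n_total: int, chance_rate: float) -> float:
    """P(X >= n_correct) when X ~ Binomial(n_total, chance_rate)."""
    return sum(
        comb(n_total, k) * chance_rate**k * (1 - chance_rate) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Hypothetical numbers: 10 classes guessed uniformly, 100 test images, 30 correct.
p = binomial_p_value(n_correct=30, n_total=100, chance_rate=0.1)
print(f"p-value vs. chance: {p:.3g}")
```

A small p-value says the accuracy is unlikely under pure guessing; it says nothing about whether the model beats a stronger baseline.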
that may be a bit of an issue
but i can integrate more stats
such as acc and maybe auc
however, the judges for scifair come from many backgrounds
and my project is more niche
so i somehow need a way to explain it very well
btw thanks
Has anyone tried making a data set on themselves
Do you recall what I told you about trying to create a dataset at your stage?
Yes. I'm just curious if someone has made a model based on their life
I'm curious because the human brain is more of a mathematical equation and if you train it on a computer on so much data does that become the person that that data represents just the same equation but not fully sorry
I am sorry
the human brain is more of a mathematical equation
source @unkempt wigeon?
I know but in essence it's just a mathematical formula that changes who's to say that it more than a mathematical formula
I know I sound like a broken record but truly just because it's a piece of flesh does it mean it's a mathematical equation that can be predisposed to wait devices are a pre-programmed through genes
What is this? Is this nn or just basic ml?
If you’re doing binary classification the final layer should probably be sigmoid yes
Why do you need zeros In ml
It would make more sense that the universe is this mathematical equation and your brain is just a part of that.
Okay then I’m super confused about why it may not be best practice to have a final layer sigmoid for binary classification although maybe I just misunderstood and the point was that you shouldn’t do that for other problems besides binary classification
In what context did you see that?
True
I’m cramming a machine learning/ai course so it’s very possible I misheard as I’ve been cranking through this
This pretty much answers your question:
https://pytorch.org/docs/stable/generated/torch.nn.BCEWithLogitsLoss.html
Ohhh!
This loss combines a Sigmoid layer and the BCELoss in one single class. This version is more numerically stable than using a plain Sigmoid followed by a BCELoss as, by combining the operations into one layer, we take advantage of the log-sum-exp trick for numerical stability.
Same goes for (non-binary) cross entropy https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html / softmax
That’s exactly it thank you!!! Not sure I understand it yet but that’s definitely what it was
The numerical stability related to loss
Just don't forget to sigmoid/softmax the logits during inference
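To see the stability issue concretely, here is a small NumPy sketch (not PyTorch's actual implementation; the function names are mine). The naive version applies sigmoid and then takes logs, which blows up once the sigmoid saturates to exactly 1.0; the fused version works on the logits directly using the identity loss = max(x, 0) - x*y + log(1 + exp(-|x|)).

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bce_naive(logits, targets):
    """Sigmoid first, then binary cross-entropy on the probabilities."""
    p = sigmoid(logits)
    with np.errstate(divide="ignore", invalid="ignore"):
        return -(targets * np.log(p) + (1 - targets) * np.log(1 - p))


def bce_with_logits(logits, targets):
    """Same loss computed from logits, rearranged to avoid log(0)."""
    return np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))


logits = np.array([0.5, 100.0])   # second logit is confidently wrong for target 0
targets = np.array([1.0, 0.0])

naive = bce_naive(logits, targets)
stable = bce_with_logits(logits, targets)
probs = sigmoid(logits)           # at inference you still apply sigmoid yourself
print(naive, stable, probs)
```

The two agree for moderate logits, but for the saturated one the naive loss is infinite while the fused one stays finite, which is what keeps gradients usable.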
Your brain is a physical thing, math is an abstract thing (multiple abstract things, it's an umbrella term). Also, why "equation" (a single one) specifically? As for it becoming that person, that depends on what you mean by that. If you mean exactly that person, then no, that is not physically possible. So you probably mean loosely in some sense, but how loose is still that person (and how do you measure the similarity)?
I finally think I got this working. Keep your fingers crossed for me! It's still got hours of parameter searching but the results coming in so far seem promising. I changed the optuna setup around a bit so it first does an initial wide parameter search, then creates a focused objective and uses those best parameters to narrow things down.
How do tools like Argil and Heygen work, and how are they able to clone your video so well? I have researched and known about Audio cloning and there are many open source models for the same so it's easier for me to study. But I couldn't find any open source model which has the same output level as Argil and Heygen.
Does anyone know any open source model like Heygen or Argil?
Heygen: https://www.heygen.com/
Argil: https://www.argil.ai/
PS: Ping me while replying
Is this what I really need for machine learning then to make a deep neural network I only got through a few videos
Have you considered something more efficient than Numpy?
Py torch
oh I see.
I know I'm a little too young to try and make my own data set but if there's not a library that I can find whining specifically I have to make it and for custom data you would need numpy to create the data set
HA! Get out of that mindset. You're not too young! I'd argue being young is an advantage. Like babies developing skills faster. Not saying you're a baby, but you shouldn't let your age be a defining factor in learning. It's apparent you've got the ability to make your own. What's stopping you?
What I mean is my age in learning data science and AI as a whole sorry
A lot of things
Welcome to the Club. I'm just a ups driver trying to fit in and learn about my passions. I don't belong here, plus Im 44.. Maybe I'm too old to learn? I don't let those thoughts stop me. Don't ever put limits on yourself. You got this!~
Funny, its like looking at two opposite sides of a coin
?
It was a metaphor, two passionate people learning about ML, despite our perceived limitations, age or our backgrounds.
Just opposite sides of a coin.. nevermind lol
I get carried away a bit. Anyways, we are here to help guide you.
damn
Hey guys, I’m in a bit of trouble. I’m a first-year master’s student, and I have four exams where I need to create "simple" ML models for image segmentation. One of the exams involves analyzing drone images, segmenting them, and then creating an ML model that gives me the best position to land the drone. The other exam is a ship detection challenge. For both of these, the professors provided the datasets. For the first one, I have to use Keras, and for the second, I have to use PyTorch.
The problem is that in my bachelor’s, I was a web developer, and I’m having some trouble adjusting to data science and ML models because I’m not fast enough. The teaching from the professors hasn’t been very helpful either. I was wondering if you guys have any advice or resources you would recommend to help me learn how to build these ML models from scratch, like I’m a 5-year-old—covering everything, from preprocessing to building the models themselves.
Have ~2YOE and couldnt agree more with this video https://www.youtube.com/watch?v=espQDESe07w
15 Machine Learning Lessons I Wish I Knew Earlier
In this video, I will tell you 15 lessons I learned over the years that could have made my Machine Learning journey easier to save you some time.
Also Watch:
How to Learn Machine Learning in 2024 (7 step roadmap) https://youtu.be/jwTaBztqTZ0
All Machine Learning algorithms explained in 17 m...
can someone recommend a good cnn repo for img classification
how do i build a model architecture for image recognition or should i go with premade ones like vgg16?
hi everyone
disclaimer: not a data scientist. just a pure math student
I've collected a lot of data on my personal spending (stored in Actual Budget, can export to any format), and would like to extract any useful patterns out of it and learn some data science along the way
Could you recommend any blog posts / articles / tutorials etc?
Assume 4 years of hobby-experience with python and c++
Pre-Made, resnet
Or Xception
Hi everyone,
I'm currently studying data science at university, but I feel that the course material alone isn’t sufficient for me to fully grasp some of the concepts. I’m really passionate about this subject and want to improve my understanding, especially in the following areas:
Probability basics and laws of probability
Difference between empirical expectation/variability and actual mean/variance (μ and σ²)
Naive Bayes
Logistic regression
k-Nearest Neighbors (k-NN) algorithm
I’d love to hear your recommendations for resources to help me dig deeper into these topics. I prefer books (textbooks or more accessible reads), but I’m open to high-quality video resources as well if they are particularly effective.
Any suggestions, from beginner-friendly to more advanced, would be greatly appreciated!
Thanks in advance!
Hello. ive discovered a software called pinokio where u can download mainstream models. However, do u know any site where i could download them on their raw version and import/use them with keras/tensorflow?
I just found these guys today. https://www.youtube.com/watch?v=espQDESe07w, their channel has some good videos
Thank you.
I like it, except the very beginning of the video about a "bad job" part. The only bad job is the one you don't enjoy and doesn't pay the bills. Like the fact Dr. Pol made an amazing career out of sticking his arm where the sun dont shine.
hahahah me too
here
The only time you're too old to learn is where conditions which impair cognition are present. Dementia, death, etc.
Mind you, I feel my mental acuity slowly ebbing.
should i use it with the pretrained weights and fine tune it or just the pure model and then train it with my own dataset?
Pretrained weights and finetune
I'm not sure death is a condition that impairs cognitive ability since death is a state where the concept of cognition simply ceases to exist
why tho?
Because a lot of what CNNs do is feature learning and you don’t need to repeat this step for each new use case, features are features
What definitely differs is how you discriminate between them
So I’d nearly always benchmark three setups: training just the fully connected layers, finetuning everything, and training the FC layers alone first, then lowering the learning rate and finetuning everything
But remember, you’re likely trying to solve a problem in the real world, no? The margins between all approaches are likely so small that it’s not worth doing anything more than training the FC layers and freezing the rest, with respect to your task in the real world. “Will 2-3% accuracy matter?” That’s the question
Hi I am pretty new to python and I want to learn. Any advice on where I can go to learn? I am a finance major looking to transition into data science in the future
!res
try a byte of python and automate the boring stuff
after you get down python basics, you can then move to more specialized fields (like ds as you mentioned)
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
Cool, thanks!
I am sure I will be back in here. I am going to start working toward this tomorrow
I have taken a class or two in the past, but it was a while ago lol. not enough to know what im doing at all
for a peek into the future, you'll likely be working with at least the following 3rd party libraries:
- numpy: for number crunching, literally the backbone of python's entire ds ecosystem
- pandas/polars: specialized libraries for manipulating tabular data
- a plotting library. there are a lot of them, like matplotlib, hvplot, plotly, etc... with more being created as we speak
- others depending on the specific task at hand
Sounds great! I have done a little bit of work in numpy and pandas but I am very new. I need to strengthen my basics before I do that stuff
gl
Thank you, I appreciate your help!
also I'm totally not biased but try polars
I never want to look back to pandas again after having done so
I actually just started a computational finance club at my school with some friends. Someone said that polars was a lot better. Is it speed?
They are helping me get my python up to speed. Thats my huge weakness
I figured I would ask here as well
tbh I haven't really compared the speed myself on large datasets
what I can say is that using polars feels a lot nicer
Great to know, I will have to take a look!
reading through the things you said to look at. exactly what i was looking for 😄
Does any1 have any tips for transformer model training?
Im just about to start training my transformer based model. anything except the obvious will help
e.g. off the top of my head
imagine that you want to replace values in B if A > 5000 with the values in C
pandas makes it annoying , you either have to duplicate code or make it multiple lines
```py
df.loc[df['A'] > 5000, 'B'] = df.loc[df['A'] > 5000, 'C']
```
compare that with polars, which makes complex chaining of operations so nice
```py
df.with_columns(
    pl.when(pl.col('A') > 5000)
    .then(pl.col('C'))
    .otherwise(pl.col('B'))
    .alias('B')
)
```
doesn't matter a lot here, but the difference is very noticeable if it gets more complicated
interesting, so it just does it all back to back?
mmm i see how that can be better with large data
efficiency too right? i do know thats important
basically I find myself very often wanting to do queries on queries on filter on ...
just a chain of operations depending on conditions and updating, etc.
with pandas, you just have to split over a lot of lines, or write lots of duplicate code to generate the correct mask
polars feels more natural
that does sound a lot better to work with
people say it's faster
there's the "Lazy API," which means it doesn't compute each operation immediately
then when you need the result, you do df_lazy.collect() which executes the entire chain of queries
the optimizer can then take its chance at making it faster
interesting, i didnt know that
ill have to read more on it to see how it could apply to what i want to use it for
i guess to let you in on why i want to learn this. not only for DS but I would like to work towards quant one day. I have some connections at school that could help me get there possibly.
it will probably come down to how well i can learn all of this stuff over time. i dont expect it to happen overnight. i dont plan on trying to break into that field for 5-10 years
but i would love to get into DS before that if it even happens
good luck on that
it's a marathon so don't be too worried about having to rush
i appreciate it, youre right. i am trying really hard not to put pressure on myself. its tough because i want to get there very badly. but i will keep my head up and work hard
also thank you for this conversation as well, helps me feel better lol
Greetings, I am new here. Can anyone here help me with an opencv problem?
I have got this error: ```---------------------------------------------------------------------------
error Traceback (most recent call last)
Cell In[11], line 4
2 im = "Test Images/Taylor Swift/TST.JPG"
3 cropped_image = crop(im,face_model)
----> 4 pred = m.predict(cropped_image)
error: Unknown C++ exception from OpenCV code```
Here are the installed packages: https://paste.pythondiscord.com/ZRNQ
Here is my code: https://paste.pythondiscord.com/DQAA
but the pretrained resnet50 is for real life object detection my dataset is nowhere close
What is your data
i want to make a model that predicts the name of the given app's icon
Even so, I'd still just go with fine-tuning a model
The features that are detected in lower layers are just edges, corners etc.
That being said, CNNs are texture biased https://arxiv.org/abs/1811.12231
Convolutional Neural Networks (CNNs) are commonly thought to recognise objects by learning increasingly complex representations of object shapes. Some recent studies suggest a more important role of image textures. We here put these conflicting hypotheses to a quantitative test by evaluating CNNs and human observers on images with a texture-shap...
I'd advise you to just finetune the entire thing then, not just the FC layers (or to do both and compare)
For fun im implementing neural networks from scratch in numpy. Right now im working on a CNN, but the performance is terrible, even for small 32x32 images. Do you know of any numpy hacks to perform the 2d convolution faster?
you could try jit'ing it with numba.
you might also consider using jax instead of numpy. jax is basically numpy with autograd and jit that can also run on gpu.
How are you implementing the convolution currently? I don't think numpy's or scipy's implementation is bad
i was hoping to stay away from numba. Yeah in the future i was going to switch to JAX
This is my current forward implementation:
def conv_forward(self, x, w, b, stride=1, pad=0):
    """
    Forward pass for convolutional layer using im2col
    """
    N, C, H, W = x.shape
    F, C, HH, WW = w.shape
    # Pad input
    x_pad = np.pad(x, ((0,0), (0,0), (pad,pad), (pad,pad)), mode='constant')
    # Output dimensions
    H_out = 1 + (H + 2*pad - HH) // stride
    W_out = 1 + (W + 2*pad - WW) // stride
    # Im2col transformation
    x_col = np.zeros((C * HH * WW, N * H_out * W_out))
    for c in range(C):
        for h_idx in range(HH):
            for w_idx in range(WW):
                row = c * HH * WW + h_idx * WW + w_idx
                for n in range(N):
                    for h_out in range(H_out):
                        for w_out in range(W_out):
                            col = n * H_out * W_out + h_out * W_out + w_out
                            h_pad = h_out * stride + h_idx
                            w_pad = w_out * stride + w_idx
                            x_col[row, col] = x_pad[n, c, h_pad, w_pad]
    # Reshape weights and compute output
    w_reshape = w.reshape(F, -1)
    out = w_reshape.dot(x_col) + b.reshape(-1, 1)
    out = out.reshape(F, N, H_out, W_out).transpose(1, 0, 2, 3)
    cache = (x, w, b, stride, pad, x_col)
    return out, cache
i found the im2col optimization and it helped but its still a bit too slow for my taste
Either use the builtins like confused reptile says or use numba like stelercus says
Using Jax would be even better yes because doing neural nets without autograd isn't worth it
what builtin do you mean? afaik numpy only has a 1d convolve?
yeah just for the sake of learning i wanted to do the grad myself once
how do i load a dataset folder with images within subfolders in pytorch?
PLEASE SUGGEST GOOD (preferred free) certifications for AI and ML CV. AZURE GCP AWS? which are best to learn? also postgres?
there are no free certs for AI/ML that anyone takes seriously. There might be paid ones that are taken seriously, but only if you already have a scientific university degree.
I DO
OH NOW NEW PFP
What is your degree in?
Master's in AI
I don't know enough to give career advice to people in India.
i understand
but what would be your general advice?
stride_tricks
however by its nature (and as mentioned in one of its pages), it's less efficient than specialized solutions
Or numba on top of numpy
It’s very finicky
But once you get a hang of it, it’s actually quite nice
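A rough sketch of the `stride_tricks` suggestion, assuming NumPy 1.20 or newer (for `sliding_window_view`). The name `conv_forward_strided` is made up for illustration, the shapes follow the `conv_forward` posted above, and it only does the forward pass (no `cache` for backward).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_forward_strided(x, w, b, stride=1, pad=0):
    """Forward conv without Python loops (hypothetical helper, forward pass only)."""
    F, _, HH, WW = w.shape
    x_pad = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode="constant")
    # View of every HH x WW patch: shape (N, C, H_pad-HH+1, W_pad-WW+1, HH, WW), no copy.
    windows = sliding_window_view(x_pad, (HH, WW), axis=(2, 3))
    # Keep only the patches that start on the stride grid.
    windows = windows[:, :, ::stride, ::stride]
    # Contract over channels and kernel positions, then add the per-filter bias.
    out = np.einsum("nchwij,fcij->nfhw", windows, w)
    return out + b.reshape(1, F, 1, 1)

x = np.random.randn(2, 3, 8, 8)
w = np.random.randn(4, 3, 3, 3)
b = np.random.randn(4)
out = conv_forward_strided(x, w, b, stride=1, pad=1)
print(out.shape)  # (2, 4, 8, 8)
```

The six nested loops are replaced by a strided view plus one `einsum`, so the patch extraction itself copies nothing. Whether it beats a numba-compiled loop likely depends on the shapes, so it is worth timing both.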
Is there any way I can get real time help over a voice call? I'm really bad at coding and asking for help about coding (providing files etc) and I need some help for a work related problem.
Hello. ive discovered a software called pinokio where u can download mainstream models. However, do u know any site where i could download them on their raw version and import/use them with keras/tensorflow?
Hadn't heard of it, but I'll take a look now 👀
I'd like to use word embeddings as a tool for predicting awareness of foreign-language words. I'm not sure which tools to use or how to generate any kind of sample. Do you have any pointers?
!pypi taichi
I have a kinda weird question, I want to make a prototype chatbot thing with AI, how do I go about it, and what are some good sources/data of natural speech in text? Is it ok if I ask this but with Java script? Or is this server only python?
this server is only Python. why do you want to use JS?
How do I pick the right K value? clearly I cant choose 1 cause its overfitting. (Ik my accuracy is awful, Im working on it)
This is from K-NN
Yes, it's better than Numba.
It also has its own GUI stuff, so it could qualify as a game engine too, which comes in handy to quickly visualize some stuff that is not just plotting (and needs to run fast).
class neuralnetpy(object):
    def __init__(self, sizes):
        # w[i][y][x]: weight from neuron y of layer i to neuron x of layer i+1
        self.w = [[[0.5 for x in range(sizes[i+1])] for y in range(sizes[i])] for i in range(len(sizes)-1)]
        self.b = [[0.5 for x in range(sizes[i+1])] for i in range(len(sizes)-1)]
        self.lrn_rate = 0.01
        self.sizes = sizes
        self.e = 2.71828182846
    def sgmd(self, x):
        return 1/(1+self.e**(-x))
    def sgmd_drv(self, a):
        # sigmoid derivative written in terms of the activation a = sgmd(x)
        return a*(1-a)
    def T(self, m):
        return [[m[j][i] for j in range(len(m))] for i in range(len(m[0]))]
    def fprop(self, inp):
        self.actvs = [inp]
        for w, b in zip(self.w, self.b):
            # each layer reads the previous layer's output, not the raw input
            prev = self.actvs[-1][0]
            self.actvs.append([[self.sgmd(x+y) for x, y in zip([sum([x*j for x, j in zip(prev, y)]) for y in self.T(w)], b)]])
        return self.actvs
    def bprop(self, targ):
        dts = [[(self.actvs[-1][0][i]-targ[i])*self.sgmd_drv(self.actvs[-1][0][i]) for i in range(len(self.actvs[-1][0]))]]
        for i in reversed(range(len(self.w)-1)):
            # deltas of layer i's output use the weights leaving it (w[i+1]) and its own activations (actvs[i+1])
            dts.append([sum([dts[-1][j]*y[j] for j in range(len(dts[-1]))])*self.sgmd_drv(self.actvs[i+1][0][k]) for k, y in enumerate(self.w[i+1])])
        dts.reverse()
        return dts
    def update(self, dts):
        for i in range(len(self.w)):
            changelist = [[x*y*self.lrn_rate for y in dts[i]] for x in self.actvs[i][0]]
            for ind in range(len(self.w[i])):
                for j in range(len(self.w[i][ind])):
                    self.w[i][ind][j] -= changelist[ind][j]
            for ind in range(len(self.b[i])):
                self.b[i][ind] -= (dts[i][ind]*self.lrn_rate)
    def train(self, inp, targ, epoch=0):
        self.fprop(inp)
        dts = self.bprop(targ)
        self.update(dts)
    def predict(self, inp):
        return self.fprop(inp)[-1]
net = neuralnetpy([10,5,2])
for x in range(100000):
    inp = [[x%2 for j in range(10)]]
    targ = [x%2, (x+1)%2]
    print(net.predict(inp))
    net.train(inp, targ, x)
heres a working neural network ive created with example
Vercel has a lot of docs on this I think
The thing with JS and LLMs is that you definitely need a backend somewhere because you can’t just put all those API keys client side
And if you’re going to make a backend, just make it in Python 😄
Can som1 tell me if an estimated 3hrs for training a transformer based model a lot? or is it average? just curious if i can optimize the training loop further or no. Im using a RTX3060Ti , dataset size = 65000, tokenizer vocab size = 250002 , with my batch size around 10 bc of OOM issues.
which model
and you mean training from scratch or transfer learning?
scratch
just started training with a batch size of 10
should take 3.69 hours (approx)
the larger your batch size the faster your epochs will be
OOM issues...
10 is the max
im coding the part that will load the model rn
trying to optimize
yeah its 'transformers'
ahh nice, can you share some more info ( can also share in DM ) so that I can also learn!
is there anyway i can train the yolo model without the stupid yaml and txt files and just a dataset folder?
should it be expected that a model with an embedding layer, layer norm, linear layer and cross entropy for loss performs significantly worse at predicting tokens than if you were to take the layer norm out?
significantly worse as in the loss basically stays still with the layer norm
who has experience with web scraping?
please help me
this channel isn't for web scraping. take a look at #❓|how-to-get-help. be sure to open a thread where you ask your whole question--don't wait for a commitment.
nvm on this for anyone wondering, i was mixing things up
guys how to open excel files/csv in python
im trying to open a file on google colab , its appearing but the data says NaN
i am trying to build a chatbot and while loading the requirement.txt i am getting this error in a dependency
Collecting murmurhash==1.0.2 (from -r requirements.txt (line 35))
Using cached murmurhash-1.0.2.tar.gz (35 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
i think the main issue is not to build the wheel does any one know how to solve this i need help urgently thank you inadvance
Easy answer is pandas read_csv or read_excel
ok
i got tht part but in google colab and my download it cant read the data
the data is saying NaN
even tho thers data in it
anyone have experience with displaying a bidirectional graph like this?
How do you do that
nvm found solution
Did you try copy pasting the file path
And whenever there is a backslash add another if you are on windows
yea ive tried
it didnt work
on google colab it gave error saying unable to find file. i checked on command prompt and it does exist
add 2 slash?
just 1 more so there is 2 slashes where there was originally 1
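A small sketch of the slash advice, using only the standard library. The path is made up for illustration. In a normal Python string a single backslash starts an escape sequence, which is why Windows paths need doubled backslashes, a raw string, or forward slashes.

```python
from pathlib import PureWindowsPath

# Hypothetical path, only for illustration.
escaped = "C:\\Users\\me\\data\\table.csv"   # every backslash doubled
raw = r"C:\Users\me\data\table.csv"          # raw string, no doubling needed
forward = "C:/Users/me/data/table.csv"       # forward slashes also work on Windows

print(escaped == raw)  # True
print(PureWindowsPath(forward) == PureWindowsPath(raw))  # True
```

Note that none of this helps on Google Colab if the file only exists on your own disk: the notebook runs on a remote machine, so the file has to be uploaded (or read from a mounted Drive) before any path can find it.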
am i supposed to get a lot lot of 'nan' values in my model data checkpoint? i think i might have set the lr too high...
No
oh god
There is no god
You'll want to walk through each step and see where the nans start appearing
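One way to "walk through each step" is to record the intermediate arrays and report the first one that stops being finite. This is a minimal NumPy sketch with a made-up toy computation; `first_nonfinite` is a hypothetical helper, and with a real framework you would feed it the layer outputs or gradients instead.

```python
import numpy as np

def first_nonfinite(named_arrays):
    """Return the name of the first array containing NaN or inf, else None."""
    for name, arr in named_arrays:
        if not np.all(np.isfinite(arr)):
            return name
    return None

# Toy forward pass with made-up steps: exp overflows to inf, then inf - inf gives NaN.
x = np.array([1.0, 800.0])
with np.errstate(over="ignore", invalid="ignore"):
    steps = [("input", x)]
    h = np.exp(x)
    steps.append(("exp", h))
    y = h - h
    steps.append(("diff", y))

print(first_nonfinite(steps))  # exp
```

The NaN only shows up in the last step, but the helper points at `exp`, where the values first blew up. That is usually the step to fix (for example by lowering the learning rate or rescaling the inputs).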
i used a smaller test dataset and this didnt happen tho
i should adjust lr and weight init
@serene scaffold so this model is beyond saving. right?
thx
no it's not
i changed the model parameters a bit too much so im not going to bother optimizing it so that it will run, because i added better stuff like xavier model weight loading
how do i get 1 item from the dataloader?
is it normal to end up with hundreds or even thousands of columns when modeling a time series?
for context, I have multiple time series as inputs, and I'm doing multi-step forecasting on one of those series
then I basically just end up with have hundreds of columns that's time_series_x__lag-y
so is there a better way of doing things or is this fine
sounds about right, each column is a "sliding window" of the full series, right?
having more overlap between the windows and considering a long total time duration generally gives several columns, and sometimes you can't avoid it
I mean they're lagged, I'm not sure what you mean by sliding window
the lag-y column were created by (in polars)
```py
lookback_window_length: int
time_series_column_name: str
df.with_columns(pl.col(time_series_column_name).shift(i).name.suffix(f"_lag{i}") for i in range(1, lookback_window_length + 1))
```
or the (should be) equivalent pandas
```py
for i in range(1, lookback_window_length + 1):
    df[f'{time_series_column_name}_lag{i}'] = df[time_series_column_name].shift(i)
```
ah you're shifting the entire series
yeah...
I'm pretty new to time series so idk what else there is
then the window is of the same length as the whole series
implicitly padded with 0s or None, NaN, null, or something of the sort at the edges
I know I'll be forecasting each day exactly from starting_time forecast_steps ahead, so I've made it so the table looks like
```
| datetime | time_series_now | time_series_lag1 | time_series_lag2 | ...
| 2024-1-1 | ...
| 2024-1-2 | ...
| 2024-1-3 | ...
...
```
then treat it as a "normal" tabular regression task
you don't always have to use the full series length for each column is all i mean. but yeah, that's about right
looks fine, this often happens when doing linear correlation or convolutions
since i don't use pandas/polars, i'd call this a "toeplitz matrix". this is pretty standard
maybe something like this
!e
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
c = np.array([1,2,3,4,5])
r = np.array([1,0,0,0,0])
A = sp.linalg.toeplitz(c, r)
print(A)
plt.imshow(A)
plt.savefig("biggest_oof.png")
ig the problem now is I've thousands of columns (since I have multiple time series and I've made lags for each) and everything's easily overfitting
rn the best I've gotten is dropping everything else except the 1 series I'll be forecasting for
you're doing autoregressive predictions, yeah?
using past lags to make a linear prediction of the future
I don't think so?
I use the table above and forecast forecast_steps numbers by putting the estimator into a MultiOutputRegressor
autoregressive would be
forecast 1 step -> put that as lag_1 -> forecast 1 step again -> ...
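To make the two strategies concrete, here is a minimal NumPy sketch on a made-up sine series, with plain least squares standing in for the real estimator. `make_lagged` is a hypothetical helper. "Direct" fits one model per step ahead (the per-target idea behind `MultiOutputRegressor`), while "recursive" is the loop described above.

```python
import numpy as np

def make_lagged(series, n_lags, horizon):
    """Rows of [y_t, y_{t-1}, ...] with targets [y_{t+1}, ..., y_{t+horizon}]."""
    X, Y = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        X.append(series[t - n_lags + 1:t + 1][::-1])   # newest value first
        Y.append(series[t + 1:t + 1 + horizon])
    return np.array(X), np.array(Y)

series = np.sin(np.arange(200) * 0.3)   # toy data, not a real dataset
n_lags, horizon = 4, 3
X, Y = make_lagged(series, n_lags, horizon)

# Direct: one linear model per step ahead, all reading the same lag row.
W_direct, *_ = np.linalg.lstsq(X, Y, rcond=None)      # shape (n_lags, horizon)
direct_pred = X[-1] @ W_direct

# Recursive: a single one-step model, fed its own predictions as the newest lag.
w_one, *_ = np.linalg.lstsq(X, Y[:, 0], rcond=None)   # shape (n_lags,)
window = X[-1].copy()
recursive_pred = []
for _ in range(horizon):
    nxt = window @ w_one
    recursive_pred.append(nxt)
    window = np.concatenate(([nxt], window[:-1]))      # shift lags, newest first
recursive_pred = np.array(recursive_pred)

print(direct_pred.shape, recursive_pred.shape)  # (3,) (3,)
```

On this noise-free toy series both approaches reproduce the targets. On real data they tend to differ: the recursive loop compounds its own errors, while the direct setup needs one model per horizon.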
right
not necessarily
hm
what module is multioutputregressor from?
sklearn
my X table is this, and the y table looks like
```
| predict_1 | predict_2 | predict_3 | predict_4 | ...
| 1 | 1.22 | 3.13 | ...
...
```
from what I read, the `MultiOutputRegressor` should basically train `forecast_steps` estimators, each estimating `predict_i`
and what estimator are you using
ah forecast steps
or? i cant find that on sklearn
sorry I think I made you misunderstand
the task I've been given tells me I need to forecast forecast_steps ahead
so I made my targets (y table) have forecast_steps columns
right now I'm using lightgbm
aight. and what makes you say you have overfitting?
I've tried other linear models that directly support multioutput, like Ridge and MultiTaskLasso (both from sklearn)
with 2 approaches
1. keep every column (that means besides the series `a` I'm supposed to predict, I also include series `b`, `c` and `d` which should impact `a` in some way)
2. keep only `a`
the best result for now was using 2. with lgbm
I plotted 4 graphs, 2 show the actual value of the series and what the model predicts on the training set, and the other 2 for the testing set
```py
import matplotlib.pyplot as plt
from sklearn.metrics import (
    PredictionErrorDisplay,
    mean_absolute_error as MAE,
    root_mean_squared_error as RMSE,
)
display = PredictionErrorDisplay(y_true=y_true, y_pred=y_pred)
print(f"MAE = {MAE(y_true, y_pred):.2f}")
print(f"RMSE = {RMSE(y_true, y_pred):.2f}")
display.plot(kind="actual_vs_predicted", scatter_kwargs={"alpha": 0.1})
display.plot(kind="residual_vs_predicted", scatter_kwargs={"alpha": 0.1})
plt.show()
```
it's pretty much a perfect fit for the training set, and the testing set is bad comparatively
this trips me up because it's hard to tell if the estimators act on the data from the left or from the right
wdym by this?
without the lags, you only have 1 column?
there are a few others, like the month and the day (made into sin/cos)
but the majority is a lagged version of a time series
the columns look something like
['month_sin',
'month_cos',
'day_sin',
'day_cos',
'feature1',
'feature2',
'time_series_a',
'time_series_a_lag1',
'time_series_a_lag2',
'time_series_a_lag3',
'time_series_a_lag4',
'time_series_a_lag5',
'time_series_a_lag6',
...
'time_series_b',
'time_series_b_lag1',
'time_series_b_lag2',
...
]
let's say the time interval is 1 minute
then time_series_a is data right now, time_series_a_lag1 is data from 1 minute ago, time_series_a_lag2 is data from 2 minutes ago, etc.
and I predict forecast_steps ahead, a.k.a data 1 minute in the future, 2 minutes in the future ..., forecast_steps in the future
so in total I'll predict forecast_steps many numbers for each row
and with MultiOutputRegressor, that means for each step i minutes into the future, it'll train a estimator_i that takes all of these features and predict 1 number, that should correspond to the data that's i minutes in the future
and were you getting outputs of the expected shape? i'm just trying to make sure the data is in the right shape because multioutputregressor seems to expect tables of size n_samples x n_features, which is the other way around
the shape is correct
each row here is basically a complete description of the entire lookback window I'll have access to for a prediction
and the output of feeding 1 row into the estimator is 1 row with forecast_steps columns, the ith column corresponding to the data that's i minutes into the future
eh i guess i don't understand what polars' shift function does then
i'll just let someone more familiar with that help you out, i can't do this without looking at the math 😛
should be the same as pandas shift
i've never used pandas either
ah
Hi, I’m Pranix. I just finished my 12th grade, and I want to learn Python for AI and Machine Learning to become an expert.
I really need guidance and a mentor who can help and motivate me. I’d really appreciate the support. My goal is to create a private server where I can add a chatbot (like a mentor) to store and manage my problems, solutions, and everything else.
Is there anyone here who’s experienced in AI/ML? Please let me know!
well it should do something like this
column: [1, 2, 3, 4, 5, 6]
column.shift(1): [null, 1, 2, 3, 4, 5]
column.shift(2): [null, null, 1, 2, 3, 4]
not sure if you'll find a long term mentor here
but if you have specific questions you can ask
do you know python then?
can you help me for my road map
and how much time it can take/.?
!res then we have a list of resources
check out Automate the Boring Stuff or A Byte of Python
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
? @jaunty helm
and you're applying this to the column you showed above?
to this
roadmap.sh
I guess?
i donot understand that
vc?
ok @jaunty helm
1st i wanna clear u smtng
.
which bachlor degree is best for AI/ML?
can we sit on vc and talk for few minutes/.?
this really depends on what you hope to get out of the degree
i just want to become expert and later make a comp that produce ai and ml for companies like that
if you just want something practical that covers the basics, there are programs called something along the lines of "data science" or "machine learning". if you want to apply ML to a particular, field, you should probably study that field in your bsc and then ML in a masters
which degreee is better?
this is still too wishy washy
but in any case you should keep in mind you'll probably need at least a masters if you want a fair shot in a big company
i want practicle more that other , only focusing on coding stuff
ok
yes, the _lag{i} features are all created this way
in effect, I'll end up with a table like this
```
| datetime | time_series_a | time_series_a_lag1 | time_series_a_lag2 | ...
| day 1 minute 1 | 1.22 | null | null
| day 1 minute 2 | 13.4 | 1.22 | null
| day 1 minute 3 | 123.0 | 13.4 | 1.22
...
| day 2 minute 1 | x | y | z
...
```
let's just say for simplicity that the `starting_time` of my forecast is `minute 1`. I then remove rows that aren't `minute 1` to make the table look like this
```
| datetime | time_series_a | time_series_a_lag1 | time_series_a_lag2 | ...
| day 1 minute 1 | 1.22 | null | null
| day 2 minute 1 | x | y | z
...
```
then for each of these rows, I should predict forecast_steps many numbers, which corresponds to the prediction for that day
(only day 1 would have missings because the lookback window is a day, so I drop it from training)
so which degree is best? to master ML? and has more practical thing
@jaunty helm @wooden sail
AI n ml same yes?
i want to be very gooood at ML
i'm doing a phd in ML-adjacent topics and would say i'm ok at ML
mostly do maths and writing rather than programming though
I dunno tbh
maybe #career-advice could help out as well?
i think this is not a good way of handling it because you have concatenated several features in each column
consider shifting each group of features separately instead
wdym?
or maybe that's what you did and i just didn't catch it
say we have
day 1 min 1 1
day 1 min 2 2
day 1 min 3 3
day 2 min 1 a
day 2 min 2 b
day 2 min 3 c
after a shift we'd want
day 1 min 1 null
day 1 min 2 1
day 1 min 3 2
day 2 min 1 null
day 2 min 2 a
day 2 min 3 b
is that how you're doing it? or you shift the whole thing together?
i.e.
day 1 min 1 null
day 1 min 2 1
day 1 min 3 2
day 2 min 1 3
day 2 min 2 a
day 2 min 3 b
this is the column time_series_a_lag1
yeah that shouldn't work because you're mixing up "features"
notice that depending on the lag, you now get values from day 1 in day 2
and generally, variables with the wrong meaning
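The two shifting behaviours being compared can be sketched in plain Python on the same toy column. `shift` is a hypothetical stand-in for the dataframe method, padding with `None`.

```python
def shift(values, n=1):
    """Shift a list down by n, padding the start with None (like a dataframe shift)."""
    return [None] * n + values[:len(values) - n]

days   = [1, 1, 1, 2, 2, 2]
values = [1, 2, 3, "a", "b", "c"]

# Shift the whole column at once: day 1's last value lands in day 2's first row.
whole = shift(values)

# Shift inside each day separately: every day starts with None.
grouped = []
for day in sorted(set(days)):
    grouped += shift([v for d, v in zip(days, values) if d == day])

print(whole)    # [None, 1, 2, 3, 'a', 'b']
print(grouped)  # [None, 1, 2, None, 'a', 'b']
```

Which one is right depends on what the lag is supposed to mean. If yesterday's values are legitimately available when forecasting today, the whole-column shift is fine. If the groups are independent, the per-group version is the one to use (in pandas that would be a `groupby(...).shift(...)`, in polars a `shift(...).over(...)`).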
that is expected
that's ok with the days, but
for day 2, I should have access to all the data from day 1, and use that to predict the stuff for day 2
what if we have
day_cos a
feature 1 b
and shift into
day_cos null
feature 1 a
for day 3, I should have access to all the data from day 2 (but not necessarily day 1), and use that to predict day 3
I don't shift the day nor feature columns cause those aren't time series data
hm
more data inspection it is
AI, ML, and data analytics are not the same thing, but there seems to be much confusion about this, both in and outside of universities, so you may find it under various different names and as a loose collection of computer science, statistics, and maybe AI. You will need to look more into the details of what each one available covers, and if that is what you care about.
Since it's relatively new (as a thing you can get a degree in) there is no great answer for this.
(I have also seen ML under applied mathematics degrees (you can come at it from different places as a specialization))
Whatever you want to study
The vast majority of people apply ML in a specific domain
I have done two ML/AI projects in BioTech and my lack of domain knowledge is an issue. I think someone with a background in bio-informatics would've been a better fit.
Haven't followed the entire convo chain but also consider not taking every lag especially if they're not adding information, maybe every n-th is enough
that ties in to the windowing as well: maybe the temporal correlation extends only over small time durations and the other values just make it worse
How do you feel about using the ACF/PACF for this?
Only captures linear relationships doesn't it (as it's correlation based)
that would usually be my starting point
same
that's why i had asked if purplys was using AR
you mean this? calculated using statsmodels.graphics.tsa.plot_(p)acf
(I'm not sure how it plays with irregular data, rn I'm seeing some jank with one of the series)
Do you have seasonality?
And exogenous variables?
Actually, can you just show me a plot of the data 👀
If you have clear trend and seasonality you should do a decomposition first and then do the ACF/PACF on the residuals, idk if @wooden sail agrees
Alternatively, if you have clear trend and/or seasonality you can just make auto ARIMA go brrr. It's likely going to be a SARIMA model if you have both
(because ACF/PACF mostly make sense in the context of stationary series)
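For picking lags, the sample autocorrelation can also be computed by hand to see which lags stand out. A minimal NumPy sketch on a made-up series with a 24-step cycle. `sample_acf` is a hypothetical helper using the usual biased estimator, and the band is the rough white-noise approximation, so treat it as a first look rather than a proper test.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
t = np.arange(480)
# Made-up series: a cycle of period 24 plus a little noise.
series = np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(t.size)

acf = sample_acf(series, max_lag=48)
bound = 1.96 / np.sqrt(series.size)                       # rough 95% band for white noise
useful_lags = np.flatnonzero(np.abs(acf[1:]) > bound) + 1  # lags outside the band

print(acf[24] > 0.8, acf[12] < -0.8)
```

Lags whose autocorrelation stays inside the band are candidates for dropping. As noted above, this only captures linear dependence, and with strong seasonality like this toy series nearly every lag sticks out, which is the argument for decomposing first.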
what img size does resnet accept?
how to crate a ia generating text
Default was 224 x 224 x 3 but people just resize
here's data from 2 of the stations plotted against time of day (I stacked data from every day into 1 plot)
I found that there's some... not-so-nice ones (like with 8 here)
In Pytorch you add preprocessing on the level of the dataset and keras you add preprocessing layers, you can resize there
wdym by people just resize?
oh
by preprocessing you mean torchvision.transforms?
exactly
and it's not that for station 8 there's only data for 9-15 (eye balling it)
it's that the data for the times outside of that range genuinely fall to near 0
You need exogenous variables
@jaunty helm What Data Science Tool do you use?
those are seaborn / matplotlib
Ok
I think your regression is dependent on the time of day
I always think of it from the markov property
"Can I predict the next point from just the past few?"
rn I just have month and day sin/cos, and which station it is (one hot encoded if using linear models)
and then a multi output regressor, whether it's using MultiOutputRegressor or something like Ridge that supports it natively
Yeah, there's definitely multiple ways to do this kind of thing
Basically treating it like multiple regression like you're doing
Or using time series models
Specifically SARIMAX at this point lol
Seasonal Autoregressive Integrated Moving Average + exogenous variables
that is a lot of words
down the rabbit hole I go
It's "simple" if you learn all the letters in isolation
do AR first, then MA, then I then S then X
And if you can relate them to concepts like ACF, PACF, stationarity, differencing you're gucci 👌
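As a tiny illustration of the differencing idea behind the "I" (and its seasonal counterpart), here is a NumPy sketch on a made-up series with a linear trend and a cycle of period 4. All the numbers are invented.

```python
import numpy as np

t = np.arange(96)
trend = 0.5 * t
season = np.tile([0.0, 2.0, 0.0, -2.0], 24)   # made-up cycle of period 4
series = trend + season

# First difference: removes the linear trend, the cycle is still there.
d1 = np.diff(series)

# Seasonal difference: subtract the value one full period earlier.
period = 4
ds = series[period:] - series[:-period]

print(np.allclose(ds, 0.5 * period))  # True: only the constant trend step is left
```

After differencing, what remains is much closer to stationary, which is the setting where ACF/PACF plots are easiest to read.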
(As a business / economics alumnus I used to be able to, but alas, I forgot the details)
I don't use SARIMAX as much as I'd like to in the real world because I find specifically the Python implementations to be lacking. Either they don't adhere to the sklearn interface, they're slow, I read the source code and don't trust it, ... that's a disclaimer 😄
is there any differences between transforms.ToTensor and transforms.PILToTensor
I think I vaguely remember avoiding AR/MA/crazy things first cause I looked it up and they don't do well with incomplete time series or something
for example say I have data for each day, but only for 10am-2pm, and I need to predict say 2pm-4pm; do they work in this scenario?
Depends on the implementation but in principle you should
how will i know the function for computing an output based on input is working properly??
idk if its neccessary for me to tell this but its a neural network
and i havent trained it yet so the weights, inputs and biases are just some random values
im learning PyTorch and making an autoencoder with the MNIST dataset. a MSE loss of 0.06 is low right? i thought it was fine but for some reason when i plot the images with matplotlib, the output image is just complete noise even though my loss is low
What's important is that the loss decreases with training
it does, i fixed some of the code and now it goes down to about 0.005. my problem lies more with matplotlib, as i dont know why its drawing just noise. im guessing im feeding it the wrong info, but i already checked that the output data is in the correct shape and its values range from 0 to 1 because of the sigmoid in the final layer
num_epochs = 10
outputs = []
#Training
for epoch in range(num_epochs):
    for (imges, _) in data_loader:
        imges = imges.reshape(-1, 28 * 28)
        optimizer.zero_grad() # Zero the gradient, = RESET
        recon = model(imges)
        loss = criterion(recon, imges)
        print(loss)
        loss.backward()
        optimizer.step()
    outputs.append((epoch, imges, recon))
# Drawing
for k in range(num_epochs):
    plt.figure(figsize=(9, 2))
    plt.gray()
    imgs = outputs[k][1].detach().numpy() # full batch btw, 64 images of size 784
    recon = outputs[k][2].detach().numpy()
    for i, item in enumerate(imgs):
        if i >= 9: break
        plt.subplot(2, 9, i + 1)
        item = item.reshape(-1, 28,28) # -> use for Autoencoder_Linear
        # item: 1, 28, 28
        plt.imshow(item[0])
    for i, item in enumerate(recon):
        if i >= 9: break
        plt.subplot(2, 9, 9 + i + 1) # row_length + i + 1
        item = item.reshape(-1, 28,28) # -> use for Autoencoder_Linear
        # item: 1, 28, 28
        plt.imshow(item[0])
plt.show()
the fact that all the noises have the same pattern might be a hint, but im at a loss idk
nvm i figured it out somehow
Is https://labelstud.io/ a viable and trusted option when it comes to image labeling? Or are there better alternatives?
what was the solution?
i just rewrote the plotting function, it was copy-paste so i guess something was messed up
is it necessary for me to divide in the loss/cost function?
divide by number of examples
it doesn't change the result, but some expressions turn out "nicer" in that the scalars in front cancel out after differentiating
also if the number of examples is large, you might have overflow issues (though dividing only helps there if it's done before adding, and it could instead result in zeroing out small quantities)
i will train with batches so
doesnt that keep the example count stable?
idk what to call it like fixed
how do you mean? like when working with batches of different sizes?
or training e.g. a classifier with different numbers of samples per class?
i will divide the data set to batches
since the batches will have the same amounts of examples in them
the example count will be fixed
by stable i meant that
sorry for the horrible english
then dividing by the number of examples makes no difference
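A quick numeric check of what dividing by the number of examples does, as a NumPy sketch with made-up data and a squared-error loss on a linear model. The gradient of the mean loss is the gradient of the summed loss divided by the batch size, so with a fixed batch size the two differ only by a constant factor.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 3))      # one batch of 32 made-up examples
y = rng.standard_normal(32)
w = np.zeros(3)

residual = X @ w - y
grad_sum = 2 * X.T @ residual             # gradient of sum((Xw - y)^2)
grad_mean = 2 * X.T @ residual / len(y)   # gradient of mean((Xw - y)^2)

print(np.allclose(grad_sum, len(y) * grad_mean))  # True
```

So the minimizer is the same either way, but the step size is not: with a summed loss each update is `len(y)` times larger, which behaves like a larger learning rate. Dividing keeps the learning rate meaningful if the batch size ever changes.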
hi,can someone help me in dms or vc understand what Coefficient in mutiple regression or know videos or links that explain it better than ai
Hello
Can someone teach me the calculation and code of backpropogation for multiple layers in the simplest way possible?
do you know how to take the derivative of a function with many variables, but for only one variable?
Karpathy tutorial series on yt
Well, I'm starting to get interested in this, and man, it's complicated. I've tried using Llama 3 with voice but can't seem to get it working, any tips? I want to create my own assistant.
Gang, help. Why is this happening haha
Issue 1: The x-ticks (pie pieces) aren't ordered from 0 to 24 (oopsie should be 1)
Issue 2: all y-ticks (rings) aren't showing q.q
Issue 3: Someone seems to have eaten a part of the polar chart.
I found this 'Discouraged' hint, followed instructions but the problem persists.
What is a good laptop for DS? I just accepted an internship offer with Pepsi!!!!!
I want to get something for my own use, i usually just code on my desktop but I am going to need something more portable
if by DS you mean deep learning, there isn't one.
otherwise, any non-chromebook should do.
congratulations on the internship. I would ask them how it's going to work. if you're going to be doing all your development on a remote VM that they operate, than the specs of your computer don't really matter.
they probably won't let you store any of their proprietary data on a machine that you own.
how can i tell if my model is overfitting?
it does very well on the training data and very poorly on the test data
that is assuming that you don't also have a data leak
what is a dataleak?
when data from the testing set was included one way or another during training
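One simple leak that is easy to check for is test rows that also appear verbatim in the training set. A plain-Python sketch with made-up rows; `leaked_rows` is a hypothetical helper.

```python
def leaked_rows(train_rows, test_rows):
    """Return test rows that also appear verbatim in the training rows."""
    seen = {tuple(row) for row in train_rows}
    return [row for row in test_rows if tuple(row) in seen]

train = [[5.1, 3.5, 0], [4.9, 3.0, 0], [6.2, 2.9, 1]]   # made-up rows
test = [[4.9, 3.0, 0], [5.9, 3.0, 1]]

print(leaked_rows(train, test))  # [[4.9, 3.0, 0]]
```

This only catches exact duplicates. Other leaks, such as fitting a scaler on the full dataset before splitting, or lag features that peek into the test period, need to be checked by looking at how the features were built.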
is the structure of
784 input neurons
16 hidden layer neurons
16 hidden layer neurons
10 output neurons
good for digit recgonizer?
try it and compare it to other structures
👍🏿
how do i train my network if i have done it only with numpy?
what to do if train loss isnt decreasing?
Adding more neurons/layers, removing batchnorm and dropout should decrease your train loss
But the question is, do you want that? (specifically: overfitting)
wdym?
it will cause overfitting?
Decreasing training loss isn't hard, but if it's paired with an increase in validation loss or if it's just stagnant it just means you're overfitting
people who've used fastdup, is there a way to get the image filename list from html galleries
how would it be overfitting if it didnt even finish the first epoch
any one knows about hyperspectral image analysis?
i dont understand how to use transforms.Normalize, how do i know the correct value of mean and std to set?
also is it nessesarry to use transforms.Normalize after transforms.ToTensor in a compose?
How can i get around efficientnet b0 error: TypeError: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.
I can't save my model
Is it forbidden to upload files? if so how can I share a jupyter notebook? I have some questions and I don't know if there's a service like pastebin for jupyter notebooks
at the terminal, do python -m jupyter nbconvert --to script --stdout the_notebook.ipynb and copy/paste the result into the pastebin
ipynb files are not human readable--someone would have to run jupyter on their own computer to read the notebook, and people probably won't want to do that.
Yeah, that's the problem. I wanted to preserve the output so that the person looking at what've tried would have an easier time inspecting it. I guess that's not possible?
try exporting the notebook as a pdf, in addition to what I said.
Thanks!
It doesn't accept pdfs either 😅 . It can't be helped I guess
Has anyone here read "The hundred page machine learning book" by andriy burkov? Im on MAP, and i dont understand this "one x at a time" approach
class neuralnetpy(object):
    def __init__(self, sizes):
        # w[i][y][x]: weight from neuron y of layer i to neuron x of layer i+1
        self.w = [[[0.5 for x in range(sizes[i+1])] for y in range(sizes[i])] for i in range(len(sizes)-1)]
        self.b = [[0.5 for x in range(sizes[i+1])] for i in range(len(sizes)-1)]
        self.lrn_rate = 0.01
        self.sizes = sizes
        self.e = 2.71828182846
    def sgmd(self, x):
        return 1/(1+self.e**(-x))
    def sgmd_drv(self, a):
        # sigmoid derivative written in terms of the activation a = sgmd(x)
        return a*(1-a)
    def T(self, m):
        return [[m[j][i] for j in range(len(m))] for i in range(len(m[0]))]
    def fprop(self, inp):
        self.actvs = [inp]
        for w, b in zip(self.w, self.b):
            # each layer reads the previous layer's output, not the raw input
            self.actvs.append([self.sgmd(x2+y2) for x2, y2 in zip([sum([x*j for x, j in zip(self.actvs[-1], y)]) for y in self.T(w)], b)])
        return self.actvs
    def bprop(self, targ):
        dts = [[(self.actvs[-1][i]-targ[i])*self.sgmd_drv(self.actvs[-1][i]) for i in range(len(self.actvs[-1]))]]
        for i in reversed(range(len(self.w)-1)):
            # deltas of layer i's output use the weights leaving it (w[i+1]) and its own activations (actvs[i+1])
            dts.append([sum([dts[-1][j]*y[j] for j in range(len(dts[-1]))])*self.sgmd_drv(self.actvs[i+1][k]) for k, y in enumerate(self.w[i+1])])
        dts.reverse()
        return dts
    def update(self, dts):
        for i in range(len(self.w)):
            changelist = [[x*y*self.lrn_rate for y in dts[i]] for x in self.actvs[i]]
            for ind in range(len(self.w[i])):
                for j in range(len(self.w[i][ind])):
                    self.w[i][ind][j] -= changelist[ind][j]
            for ind in range(len(self.b[i])):
                self.b[i][ind] -= (dts[i][ind]*self.lrn_rate)
    def train(self, inp, targ):
        self.fprop(inp)
        dts = self.bprop(targ)
        self.update(dts)
    def predict(self, inp):
        return self.fprop(inp)[-1]
net = neuralnetpy([10,5,2])
for x in range(10000):
    inp = [x%2 for j in range(10)]
    targ = [x%2, (x+1)%2]
    print(net.predict(inp))
    net.train(inp, targ)
a fully working neural network with example and no imports
hey all, I am trying to build a model to predict if a price has increased by a percentage or not, I've been instructed to use Random forest, but I've not learned about it in my ML class so I dunno how and what I should do in order to optimize it, obviously feature selection , hyperparamaters tuning , also i believe scaling is not important in RF so , yea well tips appreciated
p.s the data is already quite clean, and not too big, around 6.7k lines
im trying to
find a new vey to make crypto currency with ai
not possible in uttar pradesg
need beoioke
beople*
Anyone know why I might be getting this error when trying to train a Transformer using HuggingFace's Trainer API in PyTorch?
```
state_dict, save_function, push_to_hub, max_shard_size, safe_serialization, variant, token, save_peft_format, **kwargs)
3015 shard_state_dict = {name: "" for name in shard}
3016 for module_name in shard:
-> 3017 module = module_map[module_name]
3018 # update state dict with onloaded parameters
3019 shard_state_dict = get_state_dict_from_offload(module, module_name, shard_state_dict)
KeyError: 'query_tokens'
```
you should not try to do this.
more information would be useful, like the whole traceback. why did you expect query_tokens to be a key in module_map?
im not sure, this is code in the transformers library
I'll paste the full code and traceback, 1 second
from transformers import TrainingArguments, Trainer
from transformers import DataCollatorForSeq2Seq

# Create the data collator
data_collator = DataCollatorForSeq2Seq(
    tokenizer=processor.tokenizer,  # Use your processor's tokenizer here
    model=model,  # Optional but recommended for better padding behavior
)

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=1,  # TODO: Make 16
    per_device_eval_batch_size=1,  # TODO: Make 16
    num_train_epochs=3,
    logging_dir="./logs",
    logging_steps=10,
    save_steps=100,
    evaluation_strategy="steps",
    save_strategy="no",
    eval_steps=100,
    fp16=True,
    gradient_accumulation_steps=1,
    max_steps=4125,  # TODO: Make 4000
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=validation_dataset,
    data_collator=data_collator,
)

trainer.train()
*not the full one 1 sec
Traceback: https://paste.pythondiscord.com/CRDA
@gleaming osprey this person got a similar error https://stackoverflow.com/questions/64911499/huggingface-transformer-models-keyerror-input-ids-message-at-beginning-of-be
the structure of your two *_dataset values is probably not right.
Looking at this post...
what do you mean by this?
can you show print(train_dataset)?
alrighty
<__main__.VQADataset object at 0x7b8d312dc7f0>
And I've implemented VQADataset as an IterableDataset like this:
from collections import Counter

import torch


# Create a Dataset class
class VQADataset(torch.utils.data.IterableDataset):  # Change to IterableDataset
    def __init__(self, data, processor):
        self.processor = processor
        self.data = data

    def __iter__(self):
        # Iterate through the streaming dataset
        for element in self.data:
            # Get the most common answer
            answers = Counter(i['answer'] for i in element['answers'])
            answer = answers.most_common(1)[0][0]
            question = element['question']
            image = element['image']
            # Use the processor to tokenize and prepare inputs
            inputs = self.processor(images=image, text=question, return_tensors="pt", padding=True)
            labels = self.processor.tokenizer(answer, return_tensors="pt", padding=True, truncation=True)
            item = {
                "input_ids": inputs["input_ids"].squeeze(0),
                "pixel_values": inputs["pixel_values"].squeeze(0),
                "labels": labels["input_ids"].squeeze(0),
            }
            yield item  # Yield the item for iteration
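Since print(train_dataset) only shows the object repr, a quicker way to check the structure is to pull the first item and list its keys and shapes. A minimal sketch: `describe_first_item` is a hypothetical helper, and `FakeTensor` and `ToyDataset` are stand-ins so the snippet runs without torch. With the real dataset you would pass train_dataset instead.

```python
def describe_first_item(dataset):
    """Return {key: shape or type name} for the first item the dataset yields."""
    first = next(iter(dataset))
    summary = {}
    for key, value in first.items():
        shape = getattr(value, "shape", None)
        # Tensors and arrays expose .shape; anything else is reported by type name.
        summary[key] = tuple(shape) if shape is not None else type(value).__name__
    return summary


class FakeTensor:
    """Hypothetical stand-in for a torch.Tensor; it only carries a shape."""

    def __init__(self, *shape):
        self.shape = shape


class ToyDataset:
    """Hypothetical iterable dataset yielding one dict per example."""

    def __iter__(self):
        yield {
            "input_ids": FakeTensor(12),
            "pixel_values": FakeTensor(3, 224, 224),
            "labels": FakeTensor(4),
        }


print(describe_first_item(ToyDataset()))
# {'input_ids': (12,), 'pixel_values': (3, 224, 224), 'labels': (4,)}
```

Comparing these keys against the argument names the model's forward method expects is usually enough to confirm or rule out a dataset-structure problem.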
thanks for giving the class implementation also--I was hoping I'd see query_tokens in here somewhere. I assume validation_dataset is the same type?
Yeah initialized the very same:
```py
# Create Datasets and DataLoaders
from torch.utils.data import DataLoader

train_dataset = VQADataset(vqa_train_data, processor)
train_dataloader = DataLoader(train_dataset, batch_size=16, num_workers=2)
validation_dataset = VQADataset(vqa_validation_data, processor)
validation_dataloader = DataLoader(validation_dataset, batch_size=16, num_workers=2)
test_dataset = VQADataset(vqa_test_data, processor)
test_dataloader = DataLoader(test_dataset, batch_size=16, num_workers=2)
```
I'd also like to note that in the output of model.state_dict().keys(), query_tokens is present.