#data-science-and-ml
so is deep seek better
Hey guys, any sites or Youtube channels where i can learn about Evolutionary AI ?
What is evolutionary ai btw? Related to bioinformatics?
yeah what is it
it's just optimization algorithms
Thanks
wat should i learn after pandas if i wanna go into ml
Do you have a university degree that's related to ML?
im 15
🙃
i ve got gcses this year
but i wanna excel
You should plan to get a degree that's related to ML. But in the meantime, a good next step would be to train basic classifiers with scikit-learn, so that you learn about concepts like x and y data, train and test sets, and evaluation metrics
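A minimal sketch of those three concepts, written in plain Python so every step is visible. The dataset and the nearest-centroid classifier are made up for illustration; scikit-learn offers `train_test_split` and `accuracy_score` for the same steps.

```python
import random

# Toy dataset (made up for illustration): X holds feature rows, y holds labels.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(50)] + \
    [[random.gauss(4, 1), random.gauss(4, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

# Train/test split: shuffle the indices and hold out 20% for testing.
indices = list(range(len(X)))
random.shuffle(indices)
split = int(0.8 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]

def centroid(rows):
    """Mean of each feature column."""
    return [sum(col) / len(rows) for col in zip(*rows)]

# "Training": compute one centroid per class, using the training rows only.
centroids = {
    label: centroid([X[i] for i in train_idx if y[i] == label])
    for label in set(y)
}

def predict(row):
    """Predict the class whose centroid is closest to the row."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Evaluation metric: accuracy on the held-out test set.
predictions = [predict(X[i]) for i in test_idx]
accuracy = sum(p == y[i] for p, i in zip(predictions, test_idx)) / len(test_idx)
print(f"test accuracy: {accuracy:.2f}")
```

The point is that the model never sees the test rows while it is being fitted, so the accuracy says something about unseen data.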
greatly in cs
sounds interesting
so which packages should i learn after pandas
Don't focus on learning libraries
Repeat after me
Don't focus on learning libraries
(I can't hear you)
no as in i mean to take that route of learning sci kit learn
im not gonna learn some random libraries
im js asking bc idk if i should learn matplotlib
is it related?
or is it more data science
@smoky basalt I just gave some suggestions for what concepts to learn next and what library to use to do it. So that's my answer for now.
ok so i after pandas i move to scikit learn?
would i need to learn numpy or smthn like that?
heard numpy is important
I'm not going to give an answer that frames what you should learn in terms of libraries, because that's the wrong way to think of it.
uhm im confused
You need to learn concepts. You use libraries to apply those concepts
so wat concepts to learn first
I already told you
Here
why do I need to place everything into a class?
you don't.
python is not java.
I was looking at the pytorch resources and it had the code in a class so i was just wondering
sounds like you're talking about something more specific than "everything".
Is there any discord server specific to pytorch?
The example in Creating Models does need to be a class because it uses inheritance.
If you don't subclass Module you won't be able to call its inherited to() function to send it to your device.
Besides, you may want to parameterize your NeuralNetwork to try different instances of parameters, it's useful to have it as a class.
Module is used to make things convenient and acts as a composition tool. It's a pattern they want you to use, but it's not required.
A standard OOP library pattern is to have a fundamental building block that can be inherited from. For example, in a game engine this would typically be the GameObject type.
Thank you I have a hard time understanding classes I know what they are and their subtypes but to me that part is alien to me
If you look at their example custom Module you can see that it has a Flatten, and if you look into how Flatten is defined you will find that it too is a Module: https://github.com/pytorch/pytorch/blob/v2.6.0/torch/nn/modules/flatten.py#L13
torch/nn/modules/flatten.py line 13
```python
class Flatten(Module):
```
The idea is to build new Modules by composing other Modules.
This forms a tree structure, where each Module has some list of children, and when you then call something like to, it can navigate this tree and call to on all the children, and all of their children (etc), resulting in everything getting sent.
You could build your own system like this, or do it manually for each one.
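A toy sketch of that tree idea, assuming nothing about PyTorch's internals: the `Module` class below is a hypothetical stand-in, and `add` is an invented helper (PyTorch registers sub-modules automatically when you assign them as attributes).

```python
class Module:
    """Toy stand-in for a composable building block (not torch.nn.Module)."""

    def __init__(self):
        self.children = []   # sub-modules registered on this module
        self.device = "cpu"  # where this module's (imaginary) weights live

    def add(self, child):
        # Hypothetical helper: register a child and hand it back.
        self.children.append(child)
        return child

    def to(self, device):
        # Move this module, then recurse so the whole tree follows.
        self.device = device
        for child in self.children:
            child.to(device)
        return self


class Flatten(Module):
    pass


class Linear(Module):
    pass


class NeuralNetwork(Module):
    def __init__(self):
        super().__init__()
        self.flatten = self.add(Flatten())
        self.hidden = self.add(Linear())
        self.output = self.add(Linear())


# "cuda" is only a string label here, so no GPU is needed to run this.
model = NeuralNetwork().to("cuda")
print(model.device, [child.device for child in model.children])
```

One call on the root reaches every child, which is what you would otherwise have to do by hand for each layer.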
Can anyone help on a beginner's project "Spotify Recommendation System" ? I am really stuck on how I should Cluster and find data to make the recommendation system model
I figure I could help,
Sir you helped me a lot yesterday. So I figured I shouldn't bother you . Sorry 😔.
Check dms,
How is fine tuning expensive? It seems very cheap to fine tune while a lot of people claim its expensive
Fine-tuning seems better for most use cases
I feel like i am missing something
better than what?
Than rag
finetuning openai models is mostly about having the models follow the system prompt and not necessarily about "knowledge"
if 3-7 batches is enough for the model to acquire knowledge and follow certain prompts, then it would be really worth it. however i am still not sure why its not nearly as common as rag
I'm a little new to this concept, but I was looking at this video where they simulated a Deep Reinforcement Learning AI. The guy said that it was trained for 5 years. Do they actually train and simulate it for 5 years in human time? Or do they mean the 5 years in game?
The game they simulated was pokemon red
I'm dipping my feet in this area, and the fact that it takes years to train seems a little daunting, but I highly doubt they do that for so long.
It means 5 years of in-game time
As far as I know, they often train these in parallel, so they might have, let's say, 200 versions of the game being played at the same time
5 years divided by 200 would be around 9 days (of real human time)
And the more and better hardware you have, the faster you can do it
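The arithmetic behind that estimate, as a small sketch. The 200 parallel games is the hypothetical figure from the message above, and it assumes each game instance runs at real-time speed (emulators can often run faster).

```python
# Numbers from the discussion above; 200 parallel games is a hypothetical figure.
in_game_years = 5
parallel_games = 200

in_game_days = in_game_years * 365
wall_clock_days = in_game_days / parallel_games
print(f"{wall_clock_days:.1f} days of real time")
```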
ohhhhh
That makes more sense
Because it's not about knowledge it's about following instructions
finetuning an openai model should be seen as a substitute to shortening a prompt
Is this correct for a resource?
I answered that for you a while ago.
Thank you I couldn't remember
Sorry
how important is differential equations for machine learning
it's not.
so i just need linear algebra and multivariable calculus
and then statistics
those are more relevant, yes
and it's specifically multivariate calculus for derivatives. I don't know of any application for integrals in ML.
many cost functions tend to be formulated as integrals, in their so-called "variational form"
This depends more so on what you are trying to have it learn. If you are trying to use machine learning to approximate (learn) physics for example (being done a bunch in computer graphics right now), then yes.
Differential equations are what physics currently uses to describe the behavior of physical systems (and will be for the foreseeable future).
If you want to do ML research, then it can be good to know too, as it opens the door to physics for you, and ML is no stranger to taking ideas from there (a recent Nobel Prize in physics was given to ML researchers for work with links to physics).
(If you want to do (broad) research you want all the math so you can take ideas from other fields (go wide, not narrow))
I think I like answering no to these questions
If someone asks if they need to know diff eq for ML/AI the answer is likely just no. If they’re interested in theory and not practice, they’d likely just want to pick it up themselves
Or you learn the parts of diff eq you need to when you need to (it’s how I approach it)
```python
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# test_data_kmeans is a DataFrame loaded earlier (not shown).
# Keep only songs with at least 3 user interactions.
song_interaction_count = test_data_kmeans.groupby('name')['user'].count()
popular_songs = song_interaction_count[song_interaction_count >= 3].index
test_data_kmeans = test_data_kmeans[test_data_kmeans['name'].isin(popular_songs)]

# Create a utility matrix (user-item matrix)
utility_matrix = test_data_kmeans.pivot_table(index='user', columns='name', values='user_rating', fill_value=0)

# Convert the utility matrix to a sparse matrix
sparse_matrix = csr_matrix(utility_matrix)

# Define a batched KNN function for incremental computation
def recommend_songs_batched(song_name, utility_matrix, sparse_matrix, batch_size=1000, num_recommendations=5):
    if song_name not in utility_matrix.columns:
        print(f"Song '{song_name}' not found in the dataset!")
        return []
    song_idx = utility_matrix.columns.get_loc(song_name)
    knn = NearestNeighbors(metric='cosine', algorithm='brute', n_neighbors=num_recommendations + 1)
    recommendations = []
    for start in range(0, sparse_matrix.shape[1], batch_size):
        end = min(start + batch_size, sparse_matrix.shape[1])
        # Fit KNN only on the batch
        knn.fit(sparse_matrix.T[start:end])
        # Get indices within the current batch
        indices_in_batch = list(range(start, end))
        # Compute neighbors for the target song
        distances, indices = knn.kneighbors(
            sparse_matrix.T[song_idx].reshape(1, -1),
            n_neighbors=num_recommendations + 1
        )
        for i, idx in enumerate(indices.flatten()):
            # Skip the first neighbor (it is the song itself)
            if i == 0:
                continue
            global_idx = indices_in_batch[idx]
            recommendations.append((utility_matrix.columns[global_idx], 1 - distances.flatten()[i]))
    return sorted(recommendations, key=lambda x: -x[1])[:num_recommendations]

# Example usage
song_to_recommend = "Camby Bolongo"  # Replace with a valid song name
recommendations = recommend_songs_batched(song_to_recommend, utility_matrix, sparse_matrix, batch_size=5000, num_recommendations=5)
print(f"Recommendations for '{song_to_recommend}':")
for rec, score in recommendations:
    print(f"{rec} (Similarity Score: {score:.2f})")
```
Can anyone tell me why I can't find songs in my dataset while I am searching from songs in my dataset ?
```python
if song_name not in utility_matrix.columns:
```
you sure that the column names are where you want to check for a song?
like I expect utility_matrix['some_column_that_has_song_names']
My utility matrix is like this: it has the song names as its columns
interesting choice
still tho, add a print or a breakpoint there to make sure it's actually in the columns to debug
I did, and some names do match. The problem is likely caused by the batching
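Two things in the posted code might be worth checking, though the thread does not confirm either as the cause: the `>= 3` interaction filter removes songs before the lookup, and "skip the first neighbor" only holds in the batch that actually contains the query song. A minimal unbatched sketch in plain Python, with hypothetical ratings and a hypothetical `recommend` helper, that looks the song up by a normalised name and excludes it by name instead of by position:

```python
from math import sqrt

# Hypothetical ratings: song name -> {user: rating}; missing users count as 0.
ratings = {
    "Song A": {"u1": 5, "u2": 3, "u3": 4},
    "Song B": {"u1": 4, "u2": 3, "u3": 5},
    "Song C": {"u1": 1, "u4": 5},
    "Song D": {"u4": 4, "u5": 2},
}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(song_name, ratings, num_recommendations=2):
    # Normalise the query the same way as the keys, so stray whitespace
    # or capitalisation cannot cause a "not found".
    lookup = {name.strip().lower(): name for name in ratings}
    key = song_name.strip().lower()
    if key not in lookup:
        return []
    target = lookup[key]
    # Exclude the query by name rather than assuming it is the first neighbour.
    scores = [
        (other, cosine(ratings[target], ratings[other]))
        for other in ratings
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:num_recommendations]

result = recommend("  song a ", ratings)
print(result)
```

If the dataset fits in memory as a sparse matrix, comparing against all songs at once avoids the per-batch bookkeeping entirely.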
Hello guys. I am a newbie in python and a data science enthusiast.
Quick question, how do I delete a NaN row in python?
pandas? df = df.dropna()
Yes. Okay.
Thanks a lot
I want to make a model which will generate a 2d image from text input from user and then it will make the 3d model of the 2d image which was created and the 3d model will be used in blender to view, so which model should I use or the best resources which can help??
Also the model must run locally in my system
see https://stability.ai/stable-3d + look up alternatives to it
Stable Zero123 is an advanced AI model specialized in generating 3D objects. It stands out due to its capability to accurately interpret how objects should appear from various perspectives, which is a significant advancement in the realm of 3D visualization.
idk if this is related but
am i the only one getting this error with google gemini?
EPROTO 04110000:error:0A000119:SSL routines:ssl3_get_record:decryption failed or bad record mac:c:\\ws\\deps\\openssl\\openssl\\ssl\\record\\ssl3_record.c:623:
same error happened even when i use python, nodejs, or curl
one needs to see the code that causes the error as well, but this is probably an API question rather than an AI question.
it also happen when i use the official library pip install google-genai
please only share text as actual text. not as a screenshot.
```
requests.exceptions.SSLError: HTTPSConnectionPool(host='generativelanguage.googleapis.com', port=443): Max retries exceeded with url: /v1beta/models/gemini-1.5-flash:generateContent (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:1018)')))
```
```python
from google import genai

client = genai.Client(api_key="bla")
response = client.models.generate_content(
    model="gemini-1.5-flash", contents="Explain how AI works"
)
print(response.text)
```
I think it's deprecated.
You may use,
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Explain how AI works")
print(response.text)
```
>>> import google.generativeai as genai
>>> genai.configure(api_key='dskodkskdsooas')
>>> model = genai.GenerativeModel('gemini-1.5-flash')
>>> resp = model.generate_content('hello')
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1738599352.419306 11664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599352.538987 11664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599352.661139 20156 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599352.791475 20156 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599352.899497 9664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.024943 9664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.155409 11664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.287683 9664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.415192 9664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.545329 20156 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
thank you for your reply, i have fixed this with genai.configure(api_key=GOOGLE_API_KEY, transport='rest')
Is it possible to combine activation functions?
😭
```
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='generativelanguage.googleapis.com', port=443): Max retries exceeded with url: /v1beta/models/gemini-1.5-flash:generateContent?%24alt=json%3Benum-encoding%3Dint (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:1018)')))
```
can you try with ur api key so i know if it on google or on me
it's used to work not that long ago
like few weeks
I will check wait
Probably issues with your dev setup. I tried both on my env and on Kaggle. It works.
Sending you nb link over dm.
i tried on my phone and it worked now i think it was just my laptop
hey, I'm trying to train a model to read a word from an image. I'm exploring architectures like attention-based encoder-decoder models or CNN-RNN-CTC architecture. What do you guys think would be better?
oh i see, i'm going to uni next year and am asking this question mainly to decide whether to do it in community college over the summer or at uni next year where it will be much more rigorous but also i think that rigor can be more beneficial to me, so i just wanted to know how important it will be for my future career
An existing OCR tool fits best in this type of situation
because when you train your own models, accuracy becomes a concern
Can any one suggest platforms to learn and brushup my coding skills
codewars
leetcode
Hey guys I always struggled with Neural Networks, so I made a simple, yet powerful Neural Network file for beginners 😄
Here it is:
please give me reviews since this is my first release in... anything, so please give feedback!
did you use chatgpt

nop spent a month writing this thing
they just said that because it has a lot of comments
Hello guys. Me and a friend are beginning a project of building an LLM. If you want to join us or hear more send me a DM.
building an LLM takes an absurd amount of compute power. are you sure you don't mean "fine-tune"? this is a critical distinction.
No, I know the difference. Give an example of the minimum computing power you think is required.
what kind of LLM do you want to make?
It’s a transformer architecture LLM
what's your budget?
Someone who speaks Spanish
for a fairly serious model like llama3.1, with 405B parameters, estimates are around 50 million dollars (not counting capital expenditures). A tiny one you could probably train for less than a million by renting a datacenter
deepseek v3 was reportedly trained for "only" about 5.6 million, but it's a massive outlier in cheapness that required a lot of low-level optimization.
Hi guys I want to prepare for data science internship but I don't have that much knowledge but my interview is very near how can I prepare to ace the interview?
#career-advice plz
guys whats the best BERT edition i just picked a random one (its doing horribly during fine-tuning)
there isn't a best one. it depends on what you're trying to do and what kind of text you're trying to process
basically a paragraph, question, and answer separated by [SEP] tokens. Classify whether the answers are correct or not
what are the paragraphs and questions about, topically? cooking? politics? sports?
its mostly political
but like
one is on US's history
one is on New Zealand airline
ok nvm its kinda varied
but the sources are all from CNN
so which models can understand CNN articles
@void crescent look to see if there's a BERT model on huggingface that's trained on news articles.
In object detection, I have an image and the coordinates of the item in question. But some images don't have the item, and the code returns an error if I give empty coordinates in the training. What should I do?
but how to port it to tensorflow
I don't know. I recommend using pytorch, since that's what industry is using.
which is better to learn, tensorflow, pytorch, or scikit-learn
torch & tf are similar, but the former is way more popular at this point
scikit has more "traditional" stuff if you will, like the classic lin reg, decision trees, svms, just to name a few
oh do the first two not have these?
nope, pytorch and tensorflow are specifically focused on neural networks
so they have different uses, in that case should i learn scikit and then one of these two?
with a focus on torch or tensor cause computer vision is heavily focused on cnns right
sure
computer vision is focused on CNNs
while I'm not that familiar with cv, I wouldn't say that it's all CNNs, like vision transformers (ViT) are now a thing (maybe partly due to transformers worked great for NLP => people trying to shove it everywhere, but still)
oh i see
i don't really know much, not in college yet
but hopefully i can learn some cv before college, i need to first learn math up until differential equations though
is there someone who knows about ai
i want some advice like...
what should i learn
i know i should learn python, but do i need a lot of experience with the language
does someone know about fast rcnn?
be sure to always ask your actual question. don't ask for an expert.
hey guys, how is it going? has anyone used the DeepSeek distilled Llama model on AWS Bedrock? I want to know about the pricing and how the process of importing such a model works. My API currently uses the llama model on AWS, but i want to migrate to the DeepSeek model.
i dont think he minds waiting a thousand years
for it to return a response
can someone show me complex code for smthn in ml
i wanna see how complex it gets
stats: basic descriptive + tests (t-test, ANOVA, chi-squared, correlation)
SQL: entity-relationship modeling, joins (obviously), views, common table expressions, being good at queries
python: pandas, numpy, matplotlib/seaborn
Excel basics, Power BI basics, Tableau basics
added value: R, SAS
is this a good way to get the basics for a data analyst
they're asking about creating one from scratch. if you tried to do that on a "normal" computer, it would probably take longer than the universe has existed.
and that's before you can actually start using it
💀
woah
any projects u made?
Im trying to get into data analyst
Claude AI seems to be the best for coding
At least for me
Then it’s DeepSeek and then gpt is just the worst
What about the llamas?
From what I've seen llamas better at accuracy and efficiency in coding tasks
Chatgpt though is better for assistance with creative tasks ex. writing code comments or generating documentation
but claude is the best for coding in my opinion
dont take these benchmarks too seriously
theyre good to establish some sort of baseline but anything within a few percentage points shouldn't be taken to be significant
use the models on your own and formulate your own opinion
Goodhart's law is an adage often stated as, "When a measure becomes a target, it ceases to be a good measure". It is named after British economist Charles Goodhart, who is credited with expressing the core idea of the adage in a 1975 article on monetary policy in the United Kingdom:
Any observed statistical regularity will tend to collapse once...
Does anybody know about daily dataset updates?
I'm wondering about data for a trading bot (ML), i saw a dataset in kaggle but i found is not accurate, it had price spread
you may need an API that updates frequently
claude is very helpful at coding problems. it’s been able to answer all of my class questions
thanks, do you know more about?
because i want daily trading or weekly trading, but 1st of all i need the data haha
what type of trading are you doing?
stocks?
```python
import pandas as pd
from tqdm import tqdm
import yfinance as yf
import os
import contextlib
import shutil
from os.path import join


def read_symbols_data():
    data = pd.read_csv("http://www.nasdaqtrader.com/dynamic/SymDir/nasdaqtraded.txt", sep='|')
    data_clean = data[data['Test Issue'] == 'N']
    symbols = data_clean['NASDAQ Symbol'].tolist()
    return symbols, data_clean


def download_specific_symbols_data(symbols, period):
    os.makedirs('hist', exist_ok=True)
    is_valid = {}
    # Silence anything printed to stdout during the downloads
    with open(os.devnull, 'w') as devnull:
        with contextlib.redirect_stdout(devnull):
            for symbol in tqdm(symbols, desc="Collecting"):
                data = yf.download(symbol, period=period)
                if len(data.index) == 0:
                    continue
                is_valid[symbol] = True
                data.to_csv(f'hist/{symbol}.csv')
    valid_symbols = [symbol for symbol in symbols if symbol in is_valid]
    return valid_symbols


def move_symbols_to_directory(symbols, source, dest):
    os.makedirs(dest, exist_ok=True)
    for symbol in symbols:
        filename = f'{symbol}.csv'
        shutil.move(join(source, filename), join(dest, filename))


def check_if_empty(input_path: str):
    # If the directory is not empty, delete its files and then the directory
    if len(os.listdir(input_path)) != 0:
        for file in os.listdir(input_path):
            os.remove(os.path.join(input_path, file))
        os.rmdir(input_path)


def optimize_code_specific_symbols(symbols, period='max'):
    os.makedirs('hist', exist_ok=True)
    data_path = "data"
    if os.path.exists(data_path):
        check_if_empty(data_path)
    valid_symbols = download_specific_symbols_data(symbols, period)
    _, data_clean = read_symbols_data()
    valid_data = data_clean[data_clean['NASDAQ Symbol'].isin(valid_symbols)]
    os.makedirs('data', exist_ok=True)
    stocks = valid_data[valid_data['ETF'] == 'N']['NASDAQ Symbol'].tolist()
    move_symbols_to_directory(stocks, "hist", "data")
    os.rmdir('hist')


if __name__ == "__main__":
    index_symbols = [
        "^GSPC",  # s&p 500
        "^DJI",   # dow jones industrial average
        "^IXIC",  # nasdaq composite
        "^RUT",   # russell 2000
        "^VIX",   # volatility index
        "^FTSE",  # ftse 100
        "^N225"   # nikkei 225
    ]
    specific_symbols = ['^GSPC']  # S&P 500 should appear in a folder under "{stock_name}.csv"
    optimize_code_specific_symbols(specific_symbols)
```
this is some old code that i made when I did a stock prediction project
I added the index ticker codes in the main function
your dataset should look like this
is tqdm the library that makes the console look nice
its the loading bars
I added it so when you download more than one stock it would show a loading bar of how many stocks are done
oh, yeah i got it mixed up with another library. a while ago i watched a video about some random python libraries and tqdm was mentioned along with another library that organizes error messages and such
Hello, got a data science interview for a Two Sigma internship but I've legit never done a data science interview before. How should I prep?
Here's what I know about the interview: "The permitted languages for the first question are: C, C++, Java, and Python. The permitted languages for the second and third questions are: Java, Octave (Matlab), Python, and R."
What should I be grinding? So far I've just been spamming Pandas syntax
```python
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # imported for its side effect: registers ops the preprocessing model needs

# tfhub_handle_preprocess and tfhub_handle_encoder are defined elsewhere (not shown)
preprocess = hub.KerasLayer(tfhub_handle_preprocess)
encoder = hub.KerasLayer(tfhub_handle_encoder)

inputs = tf.keras.layers.Input(shape=(1,), dtype=tf.string)
x = preprocess(inputs)
x = encoder(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x["pooled_output"])
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
```
weird im getting
ValueError: Exception encountered when calling layer 'keras_layer_4' (type KerasLayer).
A KerasTensor is symbolic: it's a placeholder for a shape an a dtype. It doesn't have any actual numerical value. You cannot convert it to a NumPy array.
Call arguments received by layer 'keras_layer_4' (type KerasLayer):
• inputs=<KerasTensor shape=(None, 1), dtype=string, sparse=False, name=keras_tensor_364>
• training=None
what is a good pandas book to read cover to cover?
I would treat this more as a reference, but if you read and learn it cover to cover you'll be well off. https://wesmckinney.com/book/
questions and answers in the pandas tag of stackoverflow
the official User Guides are a must-read
I want to learn python where can i learn from
Hello
!resources
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
Can anyone recommend me how to learn AI development?
I haven't found any good explanations of how AI works
"how AI works" depends entirely on what kind of AI you're talking about.
y=mx+b, at scale 🐒
Lesson 1: How to break Copilot.
Is there any of topic channel? I have a survey to ask and idk where…
So true brother
have you watched 3blue1browns explanations
What are the neurons, why are there layers, and what is the math underlying it?
Written/interactive form of this series: https://www.3blue1brown.com/topics/neural-networks
I have a basic question about tree ensembles. Like I get well enough how through various methodologies we can generate 100 trees or whatever. After that is it just like… the most common answer? So if 51 trees say it’s a cat and 49 say it’s a dog then it’s a cat?
is "tree ensemble" different than "random forest"?
it's one tree ensemble, there's also regular bagged, regular boosted, extra trees, xgboost, ...
yes
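A small sketch of the 51-versus-49 case, with made-up numbers. Counting labels ("hard" voting) is one way to combine trees; averaging each tree's class probabilities ("soft" voting) is another, and scikit-learn's random forest classifier is documented as doing the latter, so the two can disagree.

```python
from collections import Counter

# Hypothetical per-tree outputs for one image: 51 trees say "cat", 49 say "dog".
tree_votes = ["cat"] * 51 + ["dog"] * 49

# Hard voting: the most common label wins, however narrow the margin.
hard_winner, hard_count = Counter(tree_votes).most_common(1)[0]
print(hard_winner, hard_count)

# Soft voting: average each tree's class probabilities, then pick the largest.
# Here the "dog" trees are more confident than the "cat" trees (made-up values).
tree_probas = [{"cat": 0.55, "dog": 0.45}] * 51 + [{"cat": 0.10, "dog": 0.90}] * 49
mean_proba = {
    label: sum(p[label] for p in tree_probas) / len(tree_probas)
    for label in ("cat", "dog")
}
soft_winner = max(mean_proba, key=mean_proba.get)
print(soft_winner, mean_proba)
```

Boosted ensembles work differently again: their trees are added up with weights rather than polled as equals.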
Guys Which is/are the best Machine Learning resource(s) for a strong academic and practical foundation? ISLP or Andrew Ng (2018 - YouTube Version) or some other resource? Which one to pick up first?
I am looking forward to build a strong academic/theoretical and practical foundation in ML. So that it would help me out during advanced courses in my masters and also in building projects
Please suggest
Linear algebra book of your choice, linear algebra done right is what I would recommend. Some stupid YouTube ML course does not provide a single drop of theoretical foundation if that is truly what u want
linear algebra done right is my favorite linear algebra book but it’s a tough follow if someone isn’t comfortable with proof based math
And linear algebra foundations really
Then they shouldn't be learning ML bc all they really want is to copy paste code that has been 100 ways
Done*
anyone available?
Always ask your actual question right away. Don't ask if anyone will answer your question before you ask it
Even if someone is available, they can't know if they can help until they know the question.
Red-pill me on image segmentation
Ok thank you
image segmentation is best done using linear programming. adobe actually originally used LP as their solver for patching together images. combine that with a skillful algorithm for managing nxm images with variable colors and relevant pixelations and you have the most modern image segmentation that exists
enjoy ur red-pill
I am 25 years old...and don't know much about DATA SCIENCE.... but i am really interested to pursuing it...
and also wanna earn something from this profession
is it too late for me ?
I am really interested to meet with a data scientist...
is there any?
That's like saying I'm 250 lbs can I ever just walk to the gym and start losing weight? God I mean ig if you don't want to then you never will
Thank you.
No, I started a bit later even with practically zero knowledge on the field and only a scientific background. How easy or difficult it will be to find a job I cant tell you, since that depends a lot on where you live as well. And also on what you and/or the companies define as datascience (most of it is not machine learning in my experience).
Agree mostly but I moreso meant that linear algebra done right isn’t really an intro text and I thought OP was asking for intro recs
https://youtu.be/nqfWhZJVPyU?si=pGquX4-ncEVhFKTO
I've been really interested in this new genetic algorithm, but I've been finding it really difficult to describe to people, is anyone interested in talking about it?
Matthew Andrews talks about his work with M-E-GA
need help with face recognition anyone ?
just ask your question
Read like a superhero: Turn PDFs into Bionic Reading format https://github.com/SanshruthR/Bionic_Reading_Hub
i like tht
guys how the fk does chatgpt handle oov when i type in the prompt "hello chat gpt kjakfdask"
how does it work, can you do a brief explanation?
So, it basically converts the pdf to doc and then it extracts all of the tables and images and renders that as a html with the font embedded inside the html file. I am using html because of consistency and almost every thing supports html.
You can also use those fonts locally by getting the ttf files from the repo and installing the font files https://github.com/SanshruthR/Bionic_Reading_Hub/blob/main/Fast_Serif.ttf
there are some other font files in there too
I made it cuz, I hate corporate greed lol
sorry for not being clear, i actually wanna know how the bionic reading part happens and its assocation between ml
So, the font creates artificial points, the brain doesn't reads the full thing it just focuses on those points and fills in the gaps. So, instead of reading the full word the brain only focuses on the main points and knows what the word is. ML is being used to identify which letters should be selected for better reading.
Hi there. I'm new to ai. I am currently researching this topic for a few personal projects. Can someone guide me on the topic of fine tune language models? I have a general idea of this process from the hugging face manuals, but if I can ask someone about it in more detail (preferably in private messages, as I don't want to reveal details of projects to an audience), please let me know. I'd be very grateful
I will only help you in the server.
Can you describe more about what you want the fine-tuned language model to do? There are many kinds of language models, including ones that aren't what you consider to be language models.
The main idea is that I write some literary texts and as an experiment and study of the topic I would like to train a model on them (for example gpt2-large or any other, if you can suggest better options). in particular, I am interested in how to properly compose a dataset based on my texts, how to correctly mark them up, so that the whole text is used for training and not just a part of it or 2 texts are glued together, how to add any third-party labels that will be important to the tokenizer, as well as how to train the model so that the model is approximately as good as it should be.
I'm also wondering if it is possible to train the model so that it generates text with similar structure
I write some literary texts and as an experiment and study of the topic I would like to train a model on them so that the model can do ...
can you finish this sentence?
and as an experiment and study of the topic
also I don't understand this part.
that part's not important in general, I meant I'm doing it for me
I mean I literally don't understand what you wrote. I don't know what it means.
The main idea is that I write some literary texts and I would like to train a model on them
is this the salient part?
I'm sorry for my English
In continuation of the phrase you asked: That the model would generate similar text structure
by "similar text structure", do you mean the syntax of individual sentences, or a more conceptual ordering of information?
By structure I mean ordering of information, for example each text has a label [TITLE] and [MAIN PART]. The goal is to generate a text with a structure where there will be a title and a main part
when it comes time to generate a new text, what do you want the workflow to be?
If I understood the question correctly, I envision it like this: prompt: “[TITLE] Once upon a time...” or prompt: “[MAIN PART] A long time ago...”
so you give it the title, and then it generates the main part?
Any1 know where I can learn to build a collaborative filtering model? Im trying to make one for my project
that's not quite right. I write a label and the beginning of the text that goes inside that part. The prompt can be either the beginning of the text in the [TITLE] or the beginning of the text in the [MAIN PART]. The result of the generation is a text with the structure [TITLE] and [MAIN PART]. That is, the structure of the final text is always the same, but the starting point of generation can be any of its parts. I would also be satisfied if just a phrase is used as the prompt, without specifying which part it refers to
hello peeps, can anyone point me in the direction of stock market focused trading communities? im a novice python programmer & funded day trader 🙂
anyone who's worked with layoutLM or floorplans or both?
you'd need to train it to generate the whole thing, including the [TITLE] and [MAIN PART] tags. Then when you go to use it, you prompt it with "[TITLE] This is the first part of the title" and have it keep generating until it generates [MAIN PART]; then you add the human-written beginning of the main part and have it keep generating from there.
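A minimal sketch of that formatting, assuming each text is stored as a title/body pair. The [TITLE] and [MAIN PART] tags come from the conversation; the `<|endoftext|>` terminator is GPT-2's end-of-text token and only applies if a GPT-2 style model is used; `format_example` and `build_prompt` are hypothetical helper names.

```python
# Tags are taken from the discussion; EOS assumes a GPT-2 style tokenizer.
TITLE_TAG = "[TITLE]"
MAIN_TAG = "[MAIN PART]"
EOS = "<|endoftext|>"


def format_example(title: str, body: str) -> str:
    """Turn one text into one training string: both tags plus a terminator."""
    return f"{TITLE_TAG} {title.strip()}\n{MAIN_TAG} {body.strip()}{EOS}"


def build_prompt(start: str, part: str = TITLE_TAG) -> str:
    """Prompt that begins inside the chosen part so the model continues from there."""
    return f"{part} {start.strip()}"


# Made-up placeholder texts, one training example per text.
texts = [
    ("The Lighthouse", "A long time ago, a keeper lived alone..."),
    ("Winter Road", "Snow had covered the tracks by morning..."),
]
training_examples = [format_example(title, body) for title, body in texts]

print(training_examples[0])
print(build_prompt("Once upon a time"))
print(build_prompt("A long time ago", part=MAIN_TAG))
```

Keeping every text as its own example that ends in the terminator is one way to stop two texts from running together during training; whether examples are later packed into fixed-length blocks depends on the training script.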
when doing backpropagation do I use the output that is after Softmax or the raw output? (neural networks)
the very last output.
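To make that concrete, here is a small sketch, assuming the loss is cross-entropy on the softmax output (the question doesn't say which loss is used). Under that assumption the loss is computed from the post-softmax output, and the gradient with respect to the raw outputs works out to softmax(z) minus the one-hot target. All function names are illustrative; the numeric check is only there to confirm the formula.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw outputs."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Loss computed on the post-softmax probabilities."""
    return -math.log(softmax(logits)[target_index])


def grad_wrt_logits(logits, target_index):
    """Analytic gradient of the loss with respect to the raw outputs."""
    probs = softmax(logits)
    return [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]


def numeric_grad(logits, target_index, eps=1e-6):
    """Central finite differences, used to sanity-check the analytic gradient."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += eps
        down[i] -= eps
        diff = cross_entropy(up, target_index) - cross_entropy(down, target_index)
        grads.append(diff / (2 * eps))
    return grads


logits = [2.0, 1.0, 0.1]
print(grad_wrt_logits(logits, target_index=0))
print(numeric_grad(logits, target_index=0))
```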
you also seem to have forgotten a very important step
@dapper dune are you generating your own training data?
i see you're trying to train/fine-tune on literary texts you will write...
@dapper dune are you interested in creating data-sets of an application's usage?
not really, I already have about 100 texts written by me.
I'm more interested in how to properly prepare these texts for fine-tune.
oh so you haven't gotten to the point of fine-tuning yet....
quite possibly. As I said, I only have a rough understanding of AI
What very important step?
I'm abit new to neural networks
Watch Andrew Ng
good morning, i have a quick question. before i study videos regarding data science with python, should i first familiarize myself with ML or can i learn data science first and then ML? thank you for your advice
A.I suggested that I start with Data Science first for the following reason. If you were to start directly with ML, you would constantly run into problems because, for example, you don't know how to prepare or analyze data.
You’ll want a solid math/stats foundation to do ML yes
How much experience do you have with math, stats, and programming
im new to python and programming. right now i'm studying python basics etc. After that i want to move on to Data Science
How about math/stats
Is there anyone experimenting with a CLIP model ?
Maybe check out the pins for some resources
low experience close 0
Are there any good papers that cover embedding code repository into vector database?
thx for helping dude
Are you in school
nope im adult. i have just basic knowledge in math/stats....like median,algebra, probability etc
if you are talking about advance math/stats skills i would say im a noob tbh
Def get on the math as well on top of the programming. Maybe prioritize it even more
ok thx mate
https://youtube.com/shorts/w_Gqn61h0QI?si=iTOEEHLOAxwtppa1 how to achieve this thing
Using python and AI to auto select a song based on emotion #shorts #python #computerscience #artificialintelligence
just find the code
didn't get
https://www.quantamagazine.org/undergraduate-upends-a-40-year-old-data-science-conjecture-20250210/ Hm.
I have basic knowledge of stats and prob and intermediate knowledge of python. Can I build a career in data science or even data analysis? I have 6-9 months of time. And if yes, please tell me what I can do for it 🙏
Hi guys, has anyone ever had issues with the DBSCAN algorithm? I'm using it in a research project with simple code on images, but it's crashing my machine. I've been coding for four years, and this is the first time I've encountered a real bottleneck in it.
I am a statistician, so I tested it in R too. While searching for 'why does this work here but not in Python?', I discovered that the implementation in R is more efficient (AKA C++ imp), running smoothly. However, for real-world applications, Python would be a better choice. So if anyone has experienced these issues, a faster solution would be great! 🙂
are you using this? https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html
Yes, I also tried HDBSCAN, but with the exponential increase in parameters for images, the same error appears. Tonight, I will run it on Colab to test the actual limits of the bottleneck.
This implementation has a worst case memory complexity of O(n^2), which can occur when the eps param is large and min_samples is low, while the original DBSCAN only uses linear memory. For further details, see the Notes below.
like 200 min_samples or 2.5 eps (i know it's too much, but i don't work with the alg much), and it crashes.
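A back-of-the-envelope sketch of why this can crash, assuming every pixel is treated as one sample and the worst case quoted above (every point ends up as a neighbor of every other point). The 8 bytes per stored neighbor index and the image sizes are assumptions for illustration, not measurements of scikit-learn.

```python
def worst_case_neighbor_bytes(n_samples: int, bytes_per_index: int = 8) -> int:
    """Upper bound: each of n samples stores up to n neighbor indices."""
    return n_samples * n_samples * bytes_per_index


def human(n_bytes: float) -> str:
    """Format a byte count with decimal units."""
    for unit in ("B", "kB", "MB", "GB", "TB"):
        if n_bytes < 1000:
            return f"{n_bytes:.1f} {unit}"
        n_bytes /= 1000
    return f"{n_bytes:.1f} PB"


# Hypothetical square image tiles, one sample per pixel.
for side in (100, 500, 1000):
    n = side * side
    print(f"{side}x{side} px -> {n} samples -> {human(worst_case_neighbor_bytes(n))}")
```

Because the bound grows with the square of the sample count, halving each side of a tile cuts the worst case by a factor of 16, which is why downscaling or clustering only a subset of pixels tends to matter more than small changes to the parameters.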
Yes, it's heavy (like other unsupervised algorithms) 😭
You can try CURE, it’s a single pass cluster algorithm
That being said, I rarely cluster and when I do, I just use k-means
I assume your issue is that you don’t want to specify the number of clusters ahead of time
Yes, that's it. I want to check whether the algorithm fails to recognize the clusters or merges them incorrectly. I'm conducting research on solar panel recognition, and using DBSCAN is mandatory for my thesis (TCC).
How many samples do you have
A lot, like 250+ images, and possibly more. I just split some satellite images provided by my professor from my country for the beginning of the research. But I’m processing them one by one. The full image is around 500MB, but each split is about 2.5MB
250 images cover my entire state
In the test case I'm working on right now, only with partitioned images, with a specific area.
I assume these are very high dimensional images
Even if your program didn’t crash I’d worry about the quality of clustering in this high dimensional space
One common issue with DBSCAN is its runtime, which grows quadratically with the dataset's size in the worst case (and working with a satellite image simply means even more runtime)
If you wanna optimize for speed without sacrificing much utility, you might wanna explore DBSCAN++. It's a faster and more scalable alternative to DBSCAN, reportedly up to 20x faster.
what are some good linear and logistic regression projects I could do thatd be good to increase my programming skills and as a good portfolio project? nothing too complicated but challenging enough. I want to get into ML, but its too fucking complicated so Ill stick with the simpler aspects
guys does anyone have Zillow dataset
do you guys think i could learn pytorch while learning calc 3 or is it better to wait until after i finish calc 3 and linear algebra
you can learn pytorch with basically 0 math knowledge, learn it whenever you like.
I want to process a dataset but it's too large for my disk space, so I'm using streaming mode to iterate over it. Is there a way to free up memory after each batch of iteration since the data in memory builds up: (and this is a really large dataset)
import itertools

from datasets import load_dataset

dataset = load_dataset("calabi-yau-data/ws-5d", name="reflexive", split="full", streaming=True)
# Convert dataset to an iterator
dataset_iter = iter(dataset)
# Iterate through first 1000 rows, in chunks of 100
batch_size = 100
total_rows = 1000
for i in range(0, total_rows, batch_size):
    # Rebinding `batch` drops the reference to the previous chunk
    batch = list(itertools.islice(dataset_iter, batch_size))
    if not batch:  # Stop if there are no more rows
        break
    print(f"Batch {i // batch_size + 1}:")
    print(batch)  # Process the batch as needed
It's a shame datasets aren't indexable otherwise I would've run the code to return the specified range of rows .
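On the indexing point, a minimal stdlib sketch, assuming the data can only be iterated. `itertools.islice` can jump to a row range without storing the skipped rows (they still have to be read from the stream), and yielding one batch at a time means the previous batch can be garbage-collected once nothing references it. `rows_in_range` and `batches` are hypothetical helpers, and `range(...)` stands in for the real stream.

```python
import itertools
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def rows_in_range(rows: Iterable[T], start: int, stop: int) -> Iterator[T]:
    """Yield rows[start:stop] from any iterable without storing skipped rows."""
    return itertools.islice(rows, start, stop)


def batches(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of up to batch_size rows; only one batch is alive at a time."""
    it = iter(rows)
    while True:
        batch = list(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch


stream = range(1_000_000)  # stand-in for a streamed dataset
for batch in batches(rows_in_range(stream, 200, 500), batch_size=100):
    print(len(batch), batch[0], batch[-1])
```

If memory still grows with a loop like this, the usual suspect is something outside the loop holding on to rows, for example appending every batch to a list.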
How to install torchdirect-ml because i use pip but it cannot find the package even though i already installed the packages from pypi
Guys do you know how can i convert this data into a table? I think it should be printed in a table right?
you must specify the delimiter/separator as ; when reading
it assumes , by default
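A small sketch of the difference, using the standard library's `csv` module and made-up sample data so it runs without a file. With pandas the equivalent is passing `sep=";"` to `pd.read_csv`. `read_rows` is a hypothetical helper.

```python
import csv
import io

# Made-up sample data using semicolons as the separator.
raw = "name;age;city\nAda;36;London\nLinus;54;Helsinki\n"


def read_rows(text: str, delimiter: str = ","):
    """Parse delimited text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


wrong = read_rows(raw)                 # default comma: every line stays one column
right = read_rows(raw, delimiter=";")  # semicolon: three columns per row

print(wrong[0])
print(right[0])
```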
Okayy, now it works, thanks bro
hey guys, i have heard about andrew ng's ml course. I wanna know whether it is good? and also if it is free (lol)?
for context im kinda new to ml and would like to go in depth into how different models work
Yes it's good (not a guarantee you'll love it though)
Yes it's free
but math knowledge is necessary to actually do stuff with it right
not bs stuff, actual good projects
i'm learning multivariable calculus and linear algebra anyways, just wanted to know if i could go on ahead and learn ml frameworks/libraries as well
ohh
can you please share a link, i was only able to find one on coursera and i think it was paid
you can audit it for free on Coursera (select the individual sub-courses, not the main one)
the full course including exercises is not free though
and yes it is good
just keep in mind it's about the foundations you need to understand how models work, it will not go into specifics about different models, it'll just give you the knowledge to understand what is going on for the general case
so it's all theory and not much application?
the exercises include training simple models, no transformers or other huge architectures iirc
if by "application" you were thinking about training your own ChatGPT, you ain't finding that there
-# (if you were actually thinking about that, take a look at Andrej Karpathy resources like https://github.com/karpathy/nanoGPT and his youtube series though)
Anyone knows how to download tensorflow for GPU and what other things I should download?
No, one of the main points of libraries like pytorch and tensorflow is that they enable you to use machine learning without needing intense math knowledge. Certainly a foundation of knowledge will help you understand what's going on under the hood, and it's required to actually perform novel research, but as far as the math knowledge required to build and train a model with one of these frameworks goes, it's about as minimal as you can get.
https://www.tensorflow.org/install/pip, note the system-specific info
i have watched his videos and i found them super helpful
and i want something similar for other types of models
like svm, random forest etc
Data science role has a different meaning for different companies. some says go for data analyst and some say other. I also don't know any data scientist.
Can anyone help me deciding which job role should I go for?
what's the best approach i can do to make myself learn ai faster using machine learning using python
what's the rush?
I have to get a job
are you pursuing an AI or ML related degree?
Nope. An IT degree. I'm just curious about what path I should take to learn AIML fast but not too fast that I wouldn't know the concepts and my time would be wasted.
are you in the US?
No I'm from India
I don't know how it works in India, but you will probably need to go back to uni to get an AIML job
I still am in uni
I'm a 3rd year student
can you switch to an AIML related major?
Nope not now
can you stay in uni and get a masters?
Bachelors is fine for me. I'm thinking of getting a job after I finish doing my bachelors and gain some work experience. Then I can think of getting higher education.
you almost certainly won't be able to get an AI/ML job as your first job
Yes I won't. But I would like to learn more about it.
!resources data science
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
Thank you.
Any reviews on Kerashub?
It totally depends on what you’re more inclined towards and what you expect from the role. If you enjoy working with data viz, reporting, and business insights, a Data Analyst role might be a good fit. If you're interested in building predictive models or genai and working with machine learning, then a Data Scientist/AI engineer/researcher role would be more suitable.
Can someone familiar with rerankers explain the differences between AnswerDotAI's rerankers and FlagRerankers?
I have only used answer ai's rerankers library for most tasks, but as per my understanding these Flagrerankers are basically using different embedding method called Flag embedding, which claims to produce better embedding features.
If I could recall correctly, rerankers library already consists of BAAI BGE models so you dont really need to use any other library for flagrerankers.
import pandas as pd
@round rapids you're voice banned for voice gate spam btw
whats voice gate spam?
oh

did you pip install pandas?
in the cmd?
yes
ah anyways, can you unban me?
all the text in the terminal here. please copy and paste it into the chat as text.
what?
there's text in the terminal in the screenshot that you posted. I need to copy and paste it, so please put it in this chat as text.
C:\Users\bilal>pip install pandas
'pip' is not recognized as an internal or external command,
operable program or batch file.
C:\Users\bilal>
this screenshot.
yes, I copied it and sent it.
or am I missing something
you did not.
that's literally what it says in the screenshot
or do you want another text, I don't get it
PS C:\Users\bilal\Desktop\coding> & C:/Users/bilal/AppData/Local/Programs/Python/Python313/python.exe "c:/Users/bilal/Desktop/coding/python with pandas/python.py"
Traceback (most recent call last):
  File "c:\Users\bilal\Desktop\coding\python with pandas\python.py", line 1, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
PS C:\Users\bilal\Desktop\coding>
@round rapids do this
C:/Users/bilal/AppData/Local/Programs/Python/Python313/python.exe -m pip install pandas
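Why that helps: `python -m pip` runs pip under the exact interpreter named on the command line, so the package is installed where that interpreter looks for it, even when `pip` on its own isn't recognized. A small sketch for checking which interpreter a script runs under and whether it can see a module; `can_import` is a hypothetical helper.

```python
import importlib.util
import sys


def can_import(module_name: str) -> bool:
    """True if this interpreter can find the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


print("Interpreter:", sys.executable)
for name in ("json", "pandas"):
    print(name, "->", "available" if can_import(name) else "missing")

if not can_import("pandas"):
    # Install with the same interpreter that will run the script.
    print(f'Try: "{sys.executable}" -m pip install pandas')
```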
thanks
and, could you please unban me from voice chat?
no
aight
@serene scaffold you're a saint of patience, helpfulness,and restraint
actually I'm a bastard
@round rapids did that command work?
also what are you trying to do with pandas?
yup
I'm trying to build along some projects on yt with pandas
I recommend doing the kaggle pandas tutorial
https://www.kaggle.com/learn this one?
Cool, thanks. I really like AnswerDotAI's approach on the output of the reranker's score, so sticking to that is great.
Hello
Can someone help me with my data science homework?
We're tasked with making the same output as this picture, but i can't get the position of the individual insets just right
That graph alone looks very hard (though i am not a maths student, i admire those math geeks)
For me, i'm just having trouble figuring out how to fit all of it together, each in its own place
Keep trying, that's what i say to those around me who have a problem, and currently you're a server-mate (idk if that term is correct or not)
I personally have been doing a f tokenizer over and over again for almost 3 months now... hope this is the last version I will ever make because it's a true headache... it's still interesting though, but well, also tiring
You mean you are trying to put all those graphs in one singularity.
Im trying to MAKE those graphs
And that's what im having trouble with
Oh dear devil..
I remembered one incident where we had to take anhydrous for experiment and it was just 1± mg differencing the hell out of me, either 0.5 or 1.5 mg went more..
Could anyone explain to me the mathematical theory behind a tokenizer?
I've got a system that counts chars and an arbitrary threshold (so far) that takes out too-frequent characters. Then I've got a compiler that looks up all possible combinations of a character with the surrounding characters and looks up the frequency of those combinations too. Then I tried normalizing their frequency by dividing it by the overall frequency of all elements with that length
Maybe I'm doing it wrong from the start. The thing I am aiming for is a dynamic threshold and tokenizer (in the long run)
Hi everyone,
I’m working on developing a web scraping API using Django to collect financial data and save it to a database that can be automatically downloaded. I’m looking for any guidance or step-by-step tutorials on:
• Setting up Django for web scraping
• Creating API endpoints to expose the scraped data
• Automating database downloads
If anyone has experience with this or knows of a comprehensive tutorial, I’d greatly appreciate your help!
Thanks in advance!
Seems like a good question for #web-development 🙂
absolutely
what asked ?
kaprekar numbers
so you just have to mess around with equations until you make that graph?
Question: How do you learn about and get into AI?
hi there
AI is a really broad term
sorry to say this but it encompasses tons of processes and sub-processes
it goes from tokenization to neural network, its a really vast field, what do you need to know about in particular?
i think he wants to know how people learned and got into the subject programming wise
Yes
hey guys, what are some good beginner projects to do with just numpy
do you want to finetune an already existing model or instead want to do it all by yourself
All myself
start what with a tokenizer?
if you're interested to learn about AI, tokenizers aren't a good place to start.
Dang... nvm then
what is "an AI" according to your understanding?
Text analysis and answer processing
that's an incredibly narrow subset of AI.
Whats yours then?
Programs that emulate the application of knowledge.
Thats also not correct
It's correct.
Because it sees the rules of text, not reality
Do you not consider self-driving cars to be AI?
It formulates rules that are derived from tons of text not from visual or other perceptions
Got a point...
formulates rules that are derived from tons of data
this is approaching a correct definition of machine learning
machine learning is where you have a computation graph whose state is determined by data
First you look up too frequent characters right?
As they have a meaning by themselves
you're still only thinking about NLP
Im too new to this to understand it as a whole
it's fine. NLP is the best one 😄
What do you mean by that?
the steps of the algorithm (the computation graph) are decided upon by humans, but the algorithm depends on values that are not set manually--they're "learned" from data.
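A tiny sketch of that idea: the algorithm below (a straight line fitted by gradient descent on squared error) is written by a human, but the values of `w` and `b` are never set by hand; they are adjusted to fit the data. The data, learning rate, and step count are made up for illustration.

```python
def predict(w: float, b: float, x: float) -> float:
    """The human-designed computation: a straight line."""
    return w * x + b


def fit(xs, ys, lr: float = 0.05, steps: int = 2000):
    """Learn w and b from data by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up data generated from y = 2x + 1; the program is never told the 2 or the 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit(xs, ys)
print(round(w, 3), round(b, 3))
```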
So far I made code that reads a text and gets rid of too-frequent characters by storing them. Here I already run into a problem, because I either look up the neighbor characters of the target character or build combinations and look up their frequency
I also have trouble deciding how to make the threshold that decides whether something is too frequent dynamic
The idea with machine learning is that the machine decides all that
So far I got the overall frequency and normalized group and individual frequencies
A video about neural networks, function approximation, machine learning, and mathematical building blocks. Dennis Nedry did nothing wrong. This is a submission for #SoME3
Original vid: https://www.youtube.com/watch?v=0QczhVg5HaI
I get it but there has to be a code that enables it to do so
You're writing your own tokenizer for a network to learn language?
Ye I'm attempting to make a tokenizer from scratch
tokenizers often do actually require that you set the rules manually
I'm already on version 3.2 of my project. I have been doing this for almost 4 months and I'm getting nowhere lol
What formula should I use to make a dynamic threshold?
for what?
How do I start off?
If I want a job in the industry
!resources data science
To choose whether a token is too frequent
a bachelors degree that's related to AI is a hard requirement for almost all AI jobs, just so you know.
What's the purpose of it?
You asking me?
Is there a self learning path I can take?
or a specific use case
to get a job? no
What’s the self learning path I can take to different industries within tech?
I want it to identify tokens, and well, get somewhere at the very least. As long as I can get somewhere I will keep working on it
I could get into for example software engineering or cyber security by self learning or no
Just make things that are useful and interesting, build a profile on GitHub
Or like what is something that’s really growing that’s not niche that I can self learn and make lots of money off of
try asking in #career-advice. but I can tell you that it's nearly impossible to get into AI without a related degree.
Agentic systems probably.
I don't have a related degree but know devops, done gigs in data science that way. 'cause someone has to actually deploy the stuff if it's for industry
What formula should I use to make a dynamic threshold to choose whether a token is too frequent?
Too frequent for what?
too frequent to be combined into a bigger token, i.e. meaningful by itself
Are you looking to make something novel, something that is better than what we have?
Hello everyone!
Does anyone have this book in PDF or ePub format?
Thank you in advance, have a good day.
i want to understand the current to improve it if possible
we won't help you get a pirated copy of a book that someone wrote. maybe check with your library
if you help me understand how it works right now I may get a better idea
if its too much text dm me
dont get auto banned for text wall
Well consider that you can take a word and break it up into ["w", "o", "r", "d", "wo", "or", "rd", "wor", "ord", "word"]. Do that for all the words you see in some text that you care about understanding, count how many times each one shows up. Sort by the score. Choose a maximum list length and cut off at that point.
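That recipe can be sketched in a few lines, assuming the score is just the raw count and words are split on whitespace. `substrings` and `build_vocab` are hypothetical helper names, and the tie-breaking rule (longer pieces first, then alphabetical) is an arbitrary choice to keep the output deterministic.

```python
from collections import Counter


def substrings(word: str):
    """All contiguous pieces of a word, e.g. 'word' -> w, wo, wor, word, o, ..."""
    return [word[i:j] for i in range(len(word)) for j in range(i + 1, len(word) + 1)]


def build_vocab(text: str, max_size: int):
    """Count every piece over all words, sort by count, cut at max_size."""
    counts = Counter()
    for word in text.lower().split():
        counts.update(substrings(word))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], -len(kv[0]), kv[0]))
    return [piece for piece, _ in ranked[:max_size]]


print(sorted(substrings("word"), key=len))
print(build_vocab("the token the tokenizer the tokens", max_size=8))
```

One design choice not covered in the chat: keeping every single character in the list regardless of the cut-off, so that any word can still be encoded.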
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
Hmm.
I think that's essentially what OpenAI did, with some tricks to keep the size down. But the score doesn't just have to be based on frequency. Ideally, IMO, you'd also use input from dictionaries to score the list. And you might even benefit from making some character sequences have the same token ID
like "ize" and "ise" for example
how do you tell structures of consecutive tokens apart from subparts that have to be fused into one token
also, you could look at the average position where sentences containing a token end up in an LLMs latent space, take the distances between them, and use that to help order the list - maybe this would improve training times if you're making a model from scratch, because the token ids actually contain semantic data
you check length-first when you tokenize
or I guess "tokenizing priority" first, whatever that turns out to be. Which is also an interesting problem
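A minimal sketch of length-first tokenizing, assuming the vocabulary is already built and "priority" simply means "longest piece that matches". The toy vocabulary and the fallback of emitting unknown characters as-is are assumptions for illustration.

```python
def tokenize(word: str, vocab: set):
    """Greedy longest-match: at each position take the longest piece in vocab."""
    max_len = max((len(piece) for piece in vocab), default=1)
    tokens = []
    i = 0
    while i < len(word):
        for length in range(min(max_len, len(word) - i), 0, -1):
            piece = word[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # Nothing in vocab matches here: fall back to the single character.
            tokens.append(word[i])
            i += 1
    return tokens


# Toy vocabulary, made up for the example.
vocab = {"token", "iz", "ize", "er", "s", "t", "o", "k", "e", "n", "i", "z", "r"}
print(tokenize("tokenizers", vocab))
```

Greedy matching picks "ize" before it ever considers "iz" + "er", which shows why the priority rule matters: a different rule would split the same word differently.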
guess that works
although I still believe that differentiating structures from tokens is a huge challenge
as you have to know when to check surrounding tokens to make structures and when to unify tokens based on frequency patterns... I'm not even sure if I should just take the overall frequency of consecutive characters and fuse the ones that likely appear together, or go for a more roundabout approach and divide the text on a percentage basis
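The "fuse consecutive characters that appear together" idea is close to byte-pair encoding (BPE) style merging, a known technique named here for reference rather than something anyone in the chat built. A minimal sketch, assuming words are pre-split and the number of merges is chosen up front, so a merge budget takes the place of a frequency threshold. Helper names and the tie-breaking rule are illustrative.

```python
from collections import Counter


def most_frequent_pair(sequences):
    """Most common adjacent pair across all sequences (ties broken alphabetically)."""
    pairs = Counter()
    for seq in sequences:
        pairs.update(zip(seq, seq[1:]))
    if not pairs:
        return None
    return min(pairs, key=lambda p: (-pairs[p], p))


def merge_pair(seq, pair):
    """Replace every occurrence of the adjacent pair with one fused token."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def learn_merges(words, num_merges: int):
    """Start from characters and repeatedly fuse the most frequent adjacent pair."""
    sequences = [list(word) for word in words]
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(sequences)
        if pair is None:
            break
        merges.append(pair)
        sequences = [merge_pair(seq, pair) for seq in sequences]
    return merges, sequences


merges, sequences = learn_merges(["low", "lower", "lowest"], num_merges=2)
print(merges)
print(sequences)
```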
I find it a fascinating topic... anyway, too many hard questions that have been making my head hurt for quite a while. It's already horribly complicated altogether, at least for me. I have been trying to apply some normalizing and threshold-finding formulas that are already in use, and the results weren't that great... maybe because there isn't enough input text. Maybe that's why I'm trying to read as many studies as I can from people who actually got somewhere
Anybody have any experience in fine tuning LLMs? Wondering how good a model would be if I finetuned it on a codebase to fully understand it/I could ask questions about it
wanna try to make a new one from scratch instead?
Na, it needs that higher level of reasoning and writing that you get with some of the top models
I don't really want it to just be repeating existing code
But to actually understand it
ye guess you got a point there
which model are you going to finetune?
Llama?
I'm not sure, I'm thinking llama or deepseek
But I'm not even sure how feasible finetuning deepseek is
deepseek is too new. if you want more feedback you should go with llama, as it's a classic
there is a ton more information on the net about Llama I mean
Right but isn't r1 also opensource?
So.. the reasoning capabilities would be a lot higher
Might be worth the effort
Yes in some simple programs
how did you handle too-frequent characters or character combinations? how did you separate combinations of characters (fusing) from structures of those?
I know there is entire github projects that make it easy to finetune
And that tokenises your input
What do you mean by how did I handle too frequent characters? The data set, if its good, won't have too frequent characters
I mean, have you ever built a tokenizer from nothing?
Yes, character level based tho, very simple
you know the datasets that Llama models read are real-life books, right? they read immense amounts of data
Basically string to int lol
yes you get the frequencies
but in tokenization you have to choose
whether a token is meaningful by itself or if it has to be combined
and that requires a threshold
wanted to ask how you did that because I cant figure it out
Llama trains on huge amounts of data Yes but finetuning requires minimal data
You can start with your wanted token count, say 10million
Then for each part/character only take the top N amount that fits into your required count
Obviously 10million is insanely high
how do you choose what to take and what to not
there are many combinations that may be useless
Frequency analysis
If the combination appears many times, that is what your data is showing
and on what basis do you calculate the threshold that the analyzer uses
is there a formula?
The amount of time you're willing to spend on compute and required time to train N amount of tokens
Obviously you can do it at character level and it will take 100x longer, or use larger strings
It purely depends on how granular you want it
There isn't really a set margin I don't think, maybe somebody else can help ya there
I think from a practical standpoint, you want your tokenizer to use a simple algorithm so it can be fed data easily. There are probably good reasons why OpenAI made theirs the way they did, rather than working on a word level first and then having letters as supplementary chars
I'm trying to get subword-level tokenization so that afterwards I can get sentence boundaries in place, e.g. "Dr." isn't a sentence end most of the time
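For the "Dr." case specifically, a rule-based sketch works without any tokenizer at all, assuming whitespace-separated words and a hand-written abbreviation list (the list below is a made-up starting point, not a complete one).

```python
# Assumed abbreviation list; extend it for the texts you actually have.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "prof.", "e.g.", "i.e."}


def split_sentences(text: str):
    """Split on ., ! or ? at the end of a word, unless the word is a known abbreviation."""
    sentences, current = [], []
    for word in text.split():
        current.append(word)
        ends_sentence = word.endswith((".", "!", "?"))
        if ends_sentence and word.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing words without closing punctuation
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("Dr. Smith arrived. He was late! Was Prof. Lee there"))
```

The obvious limitation is an abbreviation that really does end a sentence; handling that needs more context than a word list provides.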
btw, srry for the ping, what does numpy do?
numpy is a Python library for working with arrays of numbers; it's data scientists making Python ugly, but it's fast
thats a concise answer
What are you gonna use this tokenizer for btw? Like, break text into parts... and then what?
Because the design decisions all have trade-offs, and it's really common for people to aim for something they don't need at the expense of something they need later
hey guys is it possible to get the coordinates of detected objects in yolov5?
Anyone familiar with Streamlit?
Yeah go ahead ask the real question
Trying to deploy my ML model but I keep on getting the 'ModuleNotFoundError', I have installed and provided all modules in requirements.txt, any idea how to debug it?
Go to dev side (you should find the icon somewhere on the Streamlit app) to see the exact module that's missing.
Don't ask question to ask question. Ask questions and provide complete details with the idea that someone will answer your questions
Hey so I want to learn data analytics.
And from YouTube I have learned about this field, and it seems to be mostly about making attractive dashboards with Power BI or Tableau.
So any expert would guide as to why is python or sql used?
Is it necessary to learn these 2?
with SQL you can pull data into Power BI and do things such as join datasets and filter/select data, which would normally take a lot longer and be a lot harder to do without it
for example you could join a dataset in a way that part of the data is kept and only certain parts of another set are kept while the rest of the data is deleted. This is helpful in scenarios such as if maybe you are filtering out all data that has a duplicate, etc., or maybe you want to get rid of all blank accounts with nothing in them. im not sure if you would use that in a real scenario since i am just a college student, but this is what was relayed to me basically
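A small sketch of that kind of join-and-filter, run from Python with the built-in `sqlite3` module so it needs no database server. The tables, column names, and rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(
    """
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT);
    CREATE TABLE payments (account_id INTEGER, amount REAL);
    INSERT INTO accounts VALUES (1, 'Ana'), (2, 'Ben'), (3, NULL);
    INSERT INTO payments VALUES (1, 50.0), (1, 25.0), (2, 10.0), (3, 5.0);
    """
)

# Join the two tables, drop the blank account, and total the payments per owner.
query = """
    SELECT a.owner, SUM(p.amount) AS total
    FROM accounts AS a
    JOIN payments AS p ON p.account_id = a.id
    WHERE a.owner IS NOT NULL
    GROUP BY a.owner
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
conn.close()

for owner, total in rows:
    print(owner, total)
```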
I use Tableau... So is it possible to work with it too?
python can also automate some processes and is just a good general language overall because its compatible with a lot of things i believe. tl;dr: they make doing stuff faster
not sure on Tableau, since i honestly havent learned it yet 😅
nope im a college accounting student with an MIS minor looking to pursue a Data Science master's, i currently just work in tax
Hmm would you like to give any opinion or advice since going into data related field?
Like how much SQL should I know or is python really necessary to learn? Since I can start early and learn it later on
so it really depends on what type of "data" you want to do. there are a lot of different things that are labelled under those positions that are all a little different. data analyst is commonly used interchangeably with some terms. for example you might see it mixed in with business and occasionally financial analyst positions, basically if it lists using Python etc. in the job description it is most likely what you are looking for. im not sure too much on the big differences between data scientists and data engineers. it could be a good idea to do research online on youtube most likely of people in those professions documenting what its like and requirements to break into the field. data analysts though, i dont believe you usually need a degree but they might ask for one in a field related to it like statistics, mathematics, etc.
if you just want to be an analyst of some sort, so, a lot of those positions are basically working with datasets and communicating the results to teams or managers. so depending on the field what they ask you to know could be different. for example a financial analyst would be asked to make budget forecasts etc., i think these positions usually operate in Excel/PowerBI. like i said probably a good idea to look at some youtubers who are in the field most likely!
and yeah python/sql would probably be honestly the most basic things to learn ^^; i believe R and some other things are good later on that might be a bit more advanced too
Well the most I have seen is either data cleaning or making fancy data dashboards.( I think this is where they might use all of the required knowledge into this part only)
And I have also visited freelance sites and noticed that making these "dashboards" is the easy part but there are more advanced ones as well and I have no idea 💀
True both should be learned!
If you want to self-learn, I really like the Geeks4Geeks website projects, you could try taking notes over the basics, and then try recreating projects without looking at them. PyCharm Community Edition is great for practicing. I also watch a lot of Python Programmer's youtube channel and he has a lot of project/video basics as well. I honestly haven't been practicing SQL too hard but I probably should because I have an exam this week over some of it 😭 but yeah hope this helps!!
maybe more remarks to ur initial question.
Data Analytics can differ as Redd stated already; there are many different job roles where those analytics methods come into play.
Business -> BI-Reports (mainly PowerBI)
R&D -> ML/NN (Python, Azure, SQL)
Production -> SCADA
and so on
so id say bare minimum is SQL
Yea good resources actually since you have reminded me of them.
But anyways thanks for the advice!
as you get more experienced you will be confronted with cloud, kubernetes, new frameworks etc.
Rest with chatgpt? 💀
disagree on that one i dislike geeks4geeks and wouldnt suggest that site to a newbie
!resources
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
so you become a 6 figure prompt engineer?
oh i didnt know my bad! ill check the resources out, i honestly only just started learning python a few weeks ago for one of my courses
and sql honestly, but im familiar with powerbi. i mean its not hard to learn powerbi tbh
if u are a under-/graduate check if you can sneak o'reilly python books
Cuz I have no idea about it and SQL sounds too advanced..
Seen too many SQL courses specific to data analytics and it's like 4 hours long on average
yeh its the new excel
SQL is an easy language, even easier than python
oh i think one of my graduate courses i might take at one of my target schools actually uses one of those books since the course name is similar to the book name
u can get SQL basics in like 1 day and have 60% of the whole lang. already learned
I can't say prompt engineering since I'm not sure what to do next after DA.
(I'm a freshman in computer engineering and I'm looking into something that would align data and hardware)
what u mean by align data and hardware?
But there's too many versions of SQL.
Each with different style
yeah we had 1 day in my AIS class where we discussed different SQL commands you can use that i think ill be tested on D: and then make a project later i believe. just 1 day lmao. i wanted to take a full course on it later on anyways though. my professor was previously a software engineer for tax software services in SoCal so thats probably why, he likely thinks we wont need to know too much more than what he showed us
But either ways how much SQL should I know?
Like in terms of context
the basics would be how SQL works, e.g. what is a table, how to aggregate data, applying filters
understanding the concepts of relational db vs non-relational
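To make those basics concrete, here is a minimal sketch using Python's built-in sqlite3 module, so nothing needs installing. The `accounts` table and its columns are made up for illustration; the point is only to show one table, one filter, and one aggregation.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (owner TEXT, region TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        ("ann", "north", 120.0),
        ("bob", "north", 0.0),
        ("cy", "south", 75.5),
        ("dee", "south", 24.5),
    ],
)

# Filter: keep only non-empty accounts.
non_empty = conn.execute(
    "SELECT owner FROM accounts WHERE balance > 0 ORDER BY owner"
).fetchall()

# Aggregate: total balance per region.
totals = conn.execute(
    "SELECT region, SUM(balance) FROM accounts GROUP BY region ORDER BY region"
).fetchall()

print(non_empty)
print(totals)
```

SELECT, WHERE, and GROUP BY look much the same in SQLite, PostgreSQL, MySQL and SQL Server, so the dialect differences matter less at this level.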
Hmm to do something special yk.
Like making a server and clouds.
I see Data analytics as a way to understand SQL , python and machine in a beginner way.
I still haven't figured it out but I have major interest in this field
So what are some resources that you would suggest to learn SQL?
so u wanna be a sys admin?
System admin?
just setup a local SQL instance and play with it honestly
o'reilly Learning SQL, 3rd Edition
Idk that but like server related or database engineer
a book i accidentally stole from the airport and read the entire thing on flight good read 10/10
Wdym that you "stole accidentally" 💀🙏
the reader does not steal
and the thief does not read
but i definitely put it in my bag and forgot to return it so
If the reader does not steal then you stole it from the guy, thus you're contradicting your own statement
I mean sure ig 💀
What's your house address so that I can steal it?
55 bucks i just searched up the price 😭
clustering benchmarks for those of you that do unsupervised
https://cs.joensuu.fi/sipu/datasets/
up to 1000 dimensions for some of the sets.
These are nice, clean sets with ground truths that somewhat remove the dataset hassle from algorithm development
What is the latest version of Python and CUDA that TensorFlow and PyTorch both support? I have an RTX 3060 and I need it to run on the GPU
- don't worry about tensorflow support. just don't use tensorflow.
- you usually want to stay at least one python version behind. looks like pytorch doesn't work on 3.13 yet.
Okay so I will work with 3.12 but I do need the tensorflow support tho since I gotta implement an existing code which is written on tensorflow. I could switch it to torch later but now I kinda need tensorflow.
Check the Install instructions on the official website then
If you are using Windows, you will need to use WSL to run Tensorflow on a GPU
It says u need WSL only from 2.11 onward, so I can use 2.10 for native Windows support. So I just need to get the highest CUDA possible which is compatible with both TensorFlow 2.10 and PyTorch
Have you encountered any WSL-specific bugs within CUDA?
I recall running into some strange issues a few years back, some of the examples they provided wouldn't compile. But I haven't really looked into things since then
I have never used WSL
The only time I used linux was in college when I had to use Ubuntu. I thought I could get away without touching linux but here we are
Linux is superior.
this is sort of tangential to the channel topic, but I really dislike Windows and the complete mess it represents. Sorry - won't talk about this again 😄
I have been using it for a while, and it works fairly well
I did run into some weirdness and had to purge all nvidia dependencies once though, but it has been working well since too so 
I do agree windows is shit I ain't debating lol. Linux is the best but there is a learning curve that I yet to take on and I game and stuff so it is kinda easier to do so on windows though it being shit.
Well I guess it is a good time to start learning then
btw, my gaming laptop has an RTX 3070 laptop GPU. I think it is enough for practice, but not production
no mods I think, but I am not even sure about what mods would be in this case? overclocking?
the first time I installed it, it worked without any problems
later on after I installed more things was when things got weird, not sure if I broke apt at some point
I am not sure either. I added that question out of my own personal ignorance. Maybe mods are possible, but not sure
What book is it?
@opaque condor
Thank you very much
Hello guys, I am trying to get into data science by making projects. Can anyone tell me what projects I can create, I know the usual matplotlib, pandas, numpy and plotly. I am bored of creating graphs and all, is there any other thing I can do using these?
hellooo pandas question D:
i cant figure out how to merge two datasets on the index if they have different row counts (patients), one of them has significantly more than the other but im trying to keep the rows to match the smaller set to be more conservative
I am looking for an experienced developer in Python openCV and .NET programming.
If you have the ability, you should work on a project related to image processing.
If you have the ability, please contact me.
df_small.join(df_big, how="left")
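A minimal sketch of what that one-liner does, assuming both frames are indexed by a patient ID (the column names and values here are made up):

```python
import pandas as pd

# Hypothetical data: the index is a patient ID.
df_small = pd.DataFrame({"age": [34, 51]}, index=["p1", "p3"])
df_big = pd.DataFrame({"weight": [70, 82, 64, 90]}, index=["p1", "p2", "p3", "p4"])

# Left join on the index: every row of df_small is kept, and rows that
# exist only in df_big are dropped.
merged = df_small.join(df_big, how="left")

# how="inner" would additionally drop patients missing from df_big.
strict = df_small.join(df_big, how="inner")

print(merged)
```

With `how="left"`, a patient present in the small set but absent from the big one would still appear, with NaN in the big set's columns; `how="inner"` is the stricter choice if those rows should go too.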
yo anyone familiar with RL?i think i fumbled but im not sure what could be the reasons
kinda feels like its trying random stuff instead of exploiting a possible strat
Hi, @undone ridge
I am a full stack developer with rich experience of developing in python, opencv and asp.net.
Looking forward to work on exciting and challenging projects.
By combining exquisite design with the latest tech, my paramount goal is to deliver the best of the best that the world has seen.
Please feel free to contact me.
I need help installing tensorflow GPU for windows I installed wsl, cuda toolkit and Nvidia drivers, what else should i do, I need detailed explanation please help
ok
bro can someone help me navigate im currently learning python as my first language my goal is to get into AI/Ml what should i do after learning python can someone explain me?
you have to specify first what are you trying to do!
its a recommender system using reinforcement learning with a gridworld environment
i used bibit algo for reducing state space by making states biclusters of user -item matrix instead of items
but the performance varies wildly by user
and im not exactly sure why that is since the qlearning and gridworld is the same(same as in the same for every user)
it should start low and plateau at 60 but it starts going crazy for some users
Is cudnn 9.7.1 only available for windows 10? is there one for windows 11
try using the windows 10 download, in general windows 11 is fairly compatible with 10
or just use WSL (I'd recommend this tbh)
see also: https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html
Are pretty much all of the common models out there based on data that only goes up to Oct '23 at the latest? It feels weird to me
The only one I found was "Meta Llama 3.3 70B" which goes up to December '23
a lot of the platforms big companies sourced their data from started closing up after they realized how much money they could get from selling their data, e.g. Reddit and Twitter killing their free APIs, newsletters suing big AI companies, whatever was going on with LAION datasets
Ah true, I didn't think about that
most models from companies with $$$ to spend include some premium training data, like the partnerships OpenAI has been doing, or otherwise include content scraped from sources they probably had better not admit they are using
With the current administration, who knows if it'll matter, though
inb4 OpenAI is given access to NSA data
well... at least as far as US goes I wouldn't be surprised if they damaged themselves just to try and fail to harm Deepseek
Weell, you see, Chat GPT is good. But Deepseek is bad.
Because uh....
Well, let me ask Chat GPT to answer for you, one sec
Cause deepseek can't talk about CCP!!! (only on the website)
You should ask this question in dedicated RL server ( search for it)
Because this is unique project I have heard
So in that server you can get different people dedicated to RL
Hey
Is there a guide to install ollama with different backend?
In llama.cpp docs they have mentioned metallium backend I want to install the same backend on ollama
Are there any guide for that?
Hi, is there any workaround for this?
https://stackoverflow.com/questions/77131746/how-to-download-punkt-tokenizer-in-nltk
the solutions in this didnt work
Hey everyone! 👋
I’m working on a Python package that needs to automatically track transformations applied to pandas, NumPy, and scikit-learn. The goal is to detect when a dataset is modified without requiring the user to write extra code or manually call tracking functions.
The main challenge is finding a method that works seamlessly while ensuring all meaningful changes are detected.
🔹 What I Want to Achieve
- Automatically track modifications when a user applies transformations like df.fillna(), df.drop_duplicates(), or sklearn_pipeline.fit(X).
- Ensure minimal code changes for the user: ideally, they just import the package and work with pandas/NumPy as usual.
- Detect in-memory modifications, including df.iloc[0, 1] = 5 or array[2] = 100, without requiring the user to explicitly log them.
- Avoid major performance overhead: the tracking system should be lightweight and not slow down computations.
🔹 Approaches I’ve Considered
Proxy Wrapping (Overriding pandas, NumPy, and scikit-learn Methods)
- Override common transformation functions (fillna(), drop_duplicates(), apply(), fit(), transform()).
- Pros: Works transparently, no user interaction needed.
- Cons: Override all the functionalities!
🔹 What I Need Help With
- What other approaches would you suggest for tracking pandas/NumPy transformations **(almost) without user interaction**?
- How would you track inline modifications (df.iloc[...] = 5) without modifying user code too much?
- What’s the most efficient way to track changes while avoiding performance overhead?
Would love to hear your thoughts on how you’d approach this! 🚀 Thanks in advance for any insights! 🙌
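For reference, the proxy-wrapping idea from that message could be sketched roughly like this. It is only an illustration, not a recommended design: `transform_log` and `track_method` are hypothetical names, and only two DataFrame methods are patched.

```python
import functools

import pandas as pd

# Hypothetical in-memory log; a real package would likely store more detail.
transform_log = []


def track_method(cls, name):
    """Replace cls.<name> with a wrapper that records each call."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        # inplace=True calls return None, so the output shape may be missing.
        out_shape = getattr(result, "shape", None)
        transform_log.append((name, self.shape, out_shape))
        return result

    setattr(cls, name, wrapper)


for method in ("fillna", "drop_duplicates"):
    track_method(pd.DataFrame, method)

df = pd.DataFrame({"a": [1.0, None, 1.0], "b": [2.0, 2.0, 2.0]})
cleaned = df.fillna(0).drop_duplicates()
print(transform_log)
```

This also shows the stated drawback: every method has to be wrapped one by one, and inline assignments such as `df.iloc[0, 1] = 5` go through indexer objects rather than these methods, so they would need separate handling.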
hey guys how do I use this?
where should i put it?
I want to know the coordinates using detect.py
I'm new to python so I really don't know what most of these do
import torch
import torchvision
import torch.nn as nn
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
#device configuration
device = torch.device('cuda' if torch.cuda.is_available() else "cpu")
#hyper paramiters
input_size = 784 # 28X28=784
hidden_size = 100
num_classes = 10
num_epochs = 2
Batch_size = 100
learn_rate = 0.0001
trainging_datasets = torchvision.datasets.MNIST(root='./data',train=True,
    transform=transforms.ToTensor(), download=True)
tests_datasets = torchvision.datasets.MNIST(root='./data',train=False,
    transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(dataset=trainging_datasets, batch_size=Batch_size,
    shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=tests_datasets, batch_size=Batch_size,
    shuffle=False)
exampels = iter(train_loader)
samples, labels = exampels.next(exampels)
print(samples.shape,labels.shape)
how come there is an attribute error
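The full traceback was not posted, so this is only a guess at the cause: the snippet calls `.next(...)` as a method on the iterator, whereas Python 3 iterators are advanced with the built-in `next()`. A small sketch with a plain list iterator standing in for `iter(train_loader)`:

```python
# A plain list iterator stands in for iter(train_loader).
batches = iter([("samples_0", "labels_0"), ("samples_1", "labels_1")])

try:
    batches.next()  # Python 2 style method call; not available here
except AttributeError as err:
    message = str(err)
    print("AttributeError:", message)

# The built-in next() works on any iterator.
samples, labels = next(batches)
print(samples, labels)
```

If that is the problem, the line in the snippet would become `samples, labels = next(exampels)`.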
Hi
hello
as we've discussed a few times: always always show the whole entire error message no matter what. never say that you got an error message without showing the whole entire error message.
sorry, i was trying to fix an unfinished line i typed; now i've run into another error
its in the paste bin
you wrote -1.28*28, which is -35.84, which is a float. you probably meant to write -1, 28*28, which is tuple[int, int]
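A quick way to see the difference in plain Python, with no tensors involved:

```python
# A dot where a comma was intended turns two arguments into one float.
wrong = -1.28 * 28
right = (-1, 28 * 28)

print(type(wrong).__name__, wrong)
print(type(right).__name__, right)
```

In a reshape call, the `-1` in the second form asks the library to infer that dimension (here the batch size), while 784 is the flattened 28x28 image.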
thank you
n_correct = (predictions == labels).sum().item()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'bool' object has no attribute 'sum'
can you explain what this error message is telling you?
it is not finding a sum even though i put .sum, following the tutorial
do print(type(predictions), type(labels))
and tell me what it says
printed output:
<class 'torch.Tensor'> <class 'torch.Tensor'>
<class 'torch.Tensor'> <class 'torch.Tensor'>
(predictions == labels) is going to return a Boolean tensor where True (a.k.a 1) indicates a correct prediction and False (a.k.a 0) indicates an incorrect one.
use torch.sum((predictions == labels)) instead
epoch -1 / 2, step 200/600, loss = 1.3214
epoch -1 / 2, step 300/600, loss = 0.8638
epoch -1 / 2, step 400/600, loss = 0.8123
epoch -1 / 2, step 500/600, loss = 0.6779
epoch -1 / 2, step 600/600, loss = 0.4998
epoch 0 / 2, step 100/600, loss = 0.5865
epoch 0 / 2, step 200/600, loss = 0.4572
epoch 0 / 2, step 300/600, loss = 0.3157
epoch 0 / 2, step 400/600, loss = 0.5043
epoch 0 / 2, step 500/600, loss = 0.4080
epoch 0 / 2, step 600/600, loss = 0.3515
it does not update the epoch and it starts with -1 instead of 1
did what Emyrs said work?
yes but it is not updating the epoch
AttributeError: 'bool' object has no attribute 'sum' indicates that (predictions == labels) itself is a single bool, which I find surprising.
I just checked your code and there are a few issues I spotted. The logging should be epoch + 1 not epoch -1
print(f'Epoch {epoch+1}/{num_epochs}, Step {i+1}/{n_total_steps}, Loss = {loss.item():.4f}')
print(f'epoch {epoch+1} / {num_epochs}, step {i+1}/{n_total_steps}, loss = {loss.item():.4f}')
Yeah, I initially suspected it's probably a case where Wendigo might have mistakenly made predictions and labels a rank 0 tensor instead of a rank 1 tensor. But then i also noticed in his code that s/he used torch.max(outputs, 1) instead of torch.argmax(outputs, dim=1)
torch.max(outputs, 1) returns a (values, indices) tuple rather than just the predicted labels. I usually prefer using torch.argmax() in this case because it's a safer option.
Im just following a tutorial
Is everything working fine now?
epoch wont update
Even after fixing the logging print out? Can I see an example of what it's returning
epoch 1 / 2, step 100/600, loss = 1.7161
epoch 1 / 2, step 200/600, loss = 1.2840
epoch 1 / 2, step 300/600, loss = 0.8888
epoch 1 / 2, step 400/600, loss = 0.7217
epoch 1 / 2, step 500/600, loss = 0.6352
epoch 1 / 2, step 600/600, loss = 0.4856
epoch 2 / 2, step 100/600, loss = 0.5465
epoch 2 / 2, step 200/600, loss = 0.3712
epoch 2 / 2, step 300/600, loss = 0.4181
epoch 2 / 2, step 400/600, loss = 0.3767
epoch 2 / 2, step 500/600, loss = 0.4154
epoch 2 / 2, step 600/600, loss = 0.4518
accurecy = 0.8600000143051147
New Tutorial series about Deep Learning with PyTorch!
In this part we will implement our first multilayer neural network that can do digit classification based on the famous MN...
But here the epoch is updating, if we go by this log. It moved from -1 to 0. Fix the logging part and share the new update
The epoch is updating. You trained for 2 epochs; it trained for 1st and 2nd epoch as shown here.
its updating now correctly???
Yes.
how can i test my forward network?
thank you
The last part in your code that's using a context manager has already handled the evaluation of your MLP (feed forward NN) on your test-set; hence, the reason you got accuracy= 0.86
with torch.no_grad():
    n_correct = 0
    n_samples = 0
    for images, labels in test_loader:
        images = torch.flatten(images, start_dim=1).to(device)  # <-- flatten each image to a 1D vector
        labels = labels.to(device)
        logits = model(images)  # <-- your feed-forward / forward propagation
        predicted_labels = torch.argmax(logits, dim=1)  # <-- I used 'argmax' here instead of 'max' to get predicted labels
        labels = labels.view_as(predicted_labels)  # <-- ensure labels match shape of predicted_labels
        # Compute accuracy per batch
        n_samples += len(labels)
        n_correct += torch.sum(predicted_labels == labels).item()
    # Compute the average accuracy over all batches
    final_acc = 100.0 * n_correct / n_samples
    print(f'Accuracy = {final_acc:.2f}%')
new to python? , then learn first basics of it, because it will really help you to understand this code
and regarding the question, ultralytics is a library to load and fine-tune Yolo and other models
What I mean is, is there a way of graphing each epoch to see how much learning it's done? And if I had a Tkinter module that lets me draw handwritten digits, would it let me see whether the model recognizes what has been written?
Aside from seeing the loss go down via the training log, yes.
You can modify the code to accommodate for storing the loss per epoch, then once the training is over, create and invoke a function that plots the learning curve in each epoch.
About using Tkinter to draw a digit and have your trained model have a go at it, yeah, I believe it's possible as well. I haven't done something like that myself but yeah it's possible.
It's getting better and better that's what I'm saying from all this data
My model is 97% correct yay
Next step would be, implementing EarlyStopping and applying it to your code to help prevent overfitting.
What does earlystopping do?
Prevents your model from overfitting. After a certain tolerance level has been reached, EarlyStopping will force your model to stop training.
Where would I implement it at the end inside of the loop or in the main body?
inside the training loop
What earlystop should I go for in training on average
To avoid overfitting?
hello, I am trying to create a neural network from scratch using numpy, but i am kinda lost in building a optimiser, I don't know how i should implement that... if anyone can give me any info, that would be super helpful! thanks!
What is the limit that I should set to avoid overfitting?
There is no one-size-fits-all answer. It depends on multiple factors such as dataset size, model complexity, and the problem you're solving. I could have a tolerance level of 3 while yours could be 9. In my case, if validation loss doesn't improve after 3 epochs, EarlyStopping will be triggered which ultimately will terminate the model from training further.
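A minimal, framework-free sketch of that patience logic. The `EarlyStopping` class below is a hand-rolled illustration (not a PyTorch API), and the validation losses are made up to stand in for a real training loop:

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses standing in for a real training loop.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.50]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(val_losses, start=1):
    # ... train for one epoch, then compute val_loss on held-out data ...
    if stopper.step(val_loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_loss)
```

The check goes inside the epoch loop, after the validation pass. Note that the loss used here should come from a validation split kept separate from the final test set.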
To be honest I forgot and hit retrain, and now I'm up to 98% accuracy, which I'm surprised actually worked. That's why I was wondering: I didn't realize it, and instead of disrupting the network I just let it keep going. Do you think it's overfitting a little now?
Which tool can make a node-link graph for big data? I tried Pyvis and Sigma.js but there were too many connections, which formed white blobs
Maybe you could coarse grain the graph, and represent that instead?
as things stand, visualizing the whole thing in one go is just not a good idea
You can perform a type of renormalization wherein you could, for instance, only represent nodes with a certain connectivity greater than N
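As a rough sketch of that kind of filtering, using only the standard library (the edge list and the threshold are made up; a real graph would be loaded from data):

```python
from collections import Counter

# Made-up edge list; a real graph would be loaded from data.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e"), ("f", "a")]
threshold = 1  # hypothetical N: keep nodes with more than N connections

# Count how many edges touch each node.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Keep well-connected nodes and only the edges between kept nodes.
kept_nodes = {n for n, d in degree.items() if d > threshold}
kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]

print(sorted(kept_nodes))
print(kept_edges)
```

The reduced edge list can then be handed to whichever plotting tool is in use; raising the threshold thins the picture further.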
I actually thought that these two images were some sort of a burlap cloth.
Or maybe represent only neighborhoods? there are a number of interesting things you could do that would convey information more efficiently than this data dump
If I got 50%, would that be ok?
Once it has gone past what I can easily do with networkx library, I gently abort mission 👀 . However, I've seen someone use Gephi and Graphistry for this sort of thing.
The white are the Links / connections between nodes
Can you elucidate? I'm not sure I understand your question very well
Thanks, I'll try Graphistry
To stop the training when it gets to 50% so that it avoids overfitting
Instead of stopping at 50% accuracy, use early stopping with validation loss to decide when to stop training dynamically.
Could also use patience, e.g. stop if there's no improvement in validation loss after x epochs
what should i make next?
This is also an idea.
Maybe I can do something like NASA, where I make many small things and then combine them. NASA does this for images from space; can I do this with pyvis?
So instead of generating one graph.html, I create many graph.html files (clusters) and then program a graph.html that combines all the clusters into one graph.
Since you've implemented EarlyStopping in your MLP, you might wanna move to CNN next.
Well, before moving to CNN, pick two different datasets (one tabular dataset and one image data) and practise what you just learned by training a NN with MLP.
Once you've trained an MLP on a new image dataset - - preferably images with 3 color channels, then try to train a CNN model on the same dataset. Hopefully, this will enable you to see and understand why CNNs tend to outperform MLPs.
MLP?
MLP?
what would it be in an if statement?
Multi-Layer Perceptron
What happens if the accuracy is a full 1.0 instead of 0.900000
Would it be possible to make a multi-model by training the network on the data and then changing the data afterwards?
Bruh i made a GPT-like transformer but it sucks it learns so slowly......
Could i get any tips? Like i tried everything: changing learning rate, optimizers, model parameters, datasets, vocab... and my model is just stuck
Remember, learning takes time. Especially in this case, the computer has to translate what humans mean in language and convert that into vectors that it can draw lines between, to find the words that are appropriate for what is on the graph
So I just found out that xscale and yscale exist in matplotlib.pyplot.
Basically, is this a function that you guys use often? Because this seems like a function that could be a big game changer, yet I've never seen it used before today.
please at me if you have anything.
Not sure I understand your question properly. Can you add more clarity to it?
The model is 'standing on business.' More like it's saying "there's no gimmicking around here" in a Commando voice.
But ideally, while the model gets all prediction right with 100% accuracy, it's good to confirm the model isn't overfitting.
it's impossible in practice because there is noise
why not use yolo11?
game changer
it does the thing it's supposed to do, which is fine, but it's not like its existence changes everything
mpl code is still a pain with/without it
Basically, is this a function that you guys use often?
I use logarithmic scales often, so yes
hello, I am looking for a partner to learn DSA with me using python and on intermediate level
What I mean is: since the neural network should remember the MNIST data set that I gave it without downloading it again, if I change the data set that is being fed into it, would that change what it understands, or can it not learn two things at once?
Yeah but my problem is that i tried to change my model complexity and just let it run but it didnt help my llm cant just cross that 4.6 loss line
I train on a good dataset called FineWeb 10BT (10 billion tokens) and my character level model (with fewer parameters) did better (had loss 0.8) than my new model that cant cross loss 4.6.
My model graph isnt even close to this
I just dont know what to do anymore
from my limited knowledge, I think llms usually only train for a few (< 5) epochs, on a very big (trillion tokens) dataset
also, maybe dynamic learning rates as training goes on?
I have dynamic learning rate
I also tried turning it off but it didnt help
When i do more than one epoch my model learns the training data not how to generalize (i tried it on smaller datasets)
I could give u my code if u want to look for some errors
well, I'm not exactly an llm training expert...
if I were you, I'd look for people that do something similar, say asking those who release finetunes on what they do
Ok and thx for the help..
What's an alternative library to sentence-transformers for creating embeddings
That supports python 3.8
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
So in general, how do you know that scaling the graph will give you a better idea of what you want to illustrate?
Because my problem is I heard of it while doing a tutorial where they were like "Oh geez. You should really scale it to see what I mean", and I'm wondering "How the heck would I know to do that by myself?"
Sorry this is such an open ended question: I typed the function in on YouTube and expected to find several videos on when and when not to use it and found very little.
if you know your data has a certain distribution, you could scale it to make it more obvious or show up better. a long-tailed distribution is hard to visualize without a log scale because everything will be clumped on the left. but with scaling, it will look normal
or if the data follows a power law, you could use a log log scale to show that
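The power-law case can be checked numerically without plotting anything. This sketch uses made-up data following y = 3 * x**2 and shows that, after taking logs of both axes, the slope between neighbouring points is constant and equal to the exponent:

```python
import math

# Made-up power-law data: y = 3 * x**2.
xs = [1, 10, 100, 1000]
ys = [3 * x**2 for x in xs]

# On raw axes the values span six orders of magnitude...
print(ys)

# ...but log10(y) against log10(x) is a straight line whose slope is the exponent.
log_xs = [math.log10(x) for x in xs]
log_ys = [math.log10(y) for y in ys]
slopes = [
    (log_ys[i + 1] - log_ys[i]) / (log_xs[i + 1] - log_xs[i])
    for i in range(len(xs) - 1)
]
print(slopes)
```

In matplotlib the same transformation is applied to the axes with `plt.xscale("log")` and `plt.yscale("log")`, so a straight line on a log-log plot is a hint that the data follows a power law.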
okay, so that's just a matter of doing some statistics on the data to find out if it has a distribution: I can re-teach myself how to do that.
It's just an experience thing, I'd say. Like, if you look at the plot and you see that most of the points merge together near 0 and are hard to see, probably that plot will be more enlightening as a log-y one. That sort of thing.
interactive plots are also very useful for exploring data
Interactive?
Hold the phone: doesn't matplotlib's pyplot produce a static output? Are you saying to use something else to get a feel for the data then using pyplot?
matplotlib can have interactive plots. Something I have been getting into lately is PowerBI, which I may want to use for some initial data inspection when we get data from clients.
for i (images, labels) in enumerate(train_loader):
^^^^^^^^^^^^^^^^^^
SyntaxError: cannot assign to function call
can you see how that's a syntactically invalid for loop?
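For comparison, a working form of that loop header, with a plain list standing in for `train_loader`. The only change is the comma after `i`; without it, `i (images, labels)` reads as a call to a function named `i`.

```python
# A list of (images, labels) pairs stands in for train_loader.
train_loader = [("images_0", "labels_0"), ("images_1", "labels_1")]

seen = []
# enumerate yields (index, item); the comma separates the index from the
# parenthesised pair that unpacks the item.
for i, (images, labels) in enumerate(train_loader):
    seen.append((i, images, labels))

print(seen)
```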
lads what is your opinion on data camp as a learning resource?
I feel some of the concepts are slightly rushed and not explained as well
I'm really just following the tutorial, and it used enumerate
are you Wendigo on a different account?
No my brain is just confused sometimes I am feeling what people might be thinking very tired
so you are not the same person as Wendigo?
Yep
chatroom paranoia has remained the same for around 4 decades. Are you, or are you not a sockpuppet, sir?
Mechanical Fox is replying to a message that very clearly was a response to Wendigo, but Mechanical Fox spoke as though they were Wendigo.
sure, but regardless of those details, the point remains. Autoconfusion is a giveaway, but also writing style.
?
for the pd.read_csv() function there is an argument called na_values? anyone know what this is?
pandas is very well documented
!docs pandas.read_csv
pandas.read_csv(filepath_or_buffer, *, sep=<no_default>, delimiter=None, header='infer', names=<no_default>, index_col=None, usecols=None, dtype=None, engine=None, ...)
Read a comma-separated values (csv) file into DataFrame.
Also supports optionally iterating or breaking of the file into chunks.
Additional help can be found in the online docs for [IO Tools](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html).
take a look and you'll see what that argument is for.
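In short, `na_values` lists extra strings that should be read as missing (NaN) in addition to the ones pandas already recognises. A small sketch with a made-up CSV held in memory:

```python
import io

import pandas as pd

# Made-up CSV where missing readings were entered as "missing" or "?".
csv_text = "patient,temp\np1,36.6\np2,missing\np3,?\n"

# Treat both placeholder strings as NaN while parsing.
df = pd.read_csv(io.StringIO(csv_text), na_values=["missing", "?"])

print(df)
print(df["temp"].isna().sum())
```

Without `na_values`, the `temp` column would be read as text because of the placeholder strings; with it, the column parses as floats with two missing entries.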
ahhhh ok thank you! where can I learn more about the bot commands here in this server
go to #bot-commands and do !help
thanks lad appreciate it
I'm a lad now?
if ur girl thanks lass
I don't care if people think I'm a girl as long as they think I'm a pretty girl.
only pretty girls out of the female category know data science and ai
other plotting libs exist, like plotly
personally I've been liking hvplot + bokeh
do y'all have any recommended books for newbies?
pls ping me if y'all know, i will genuinely forget
You can check the pinned messages
I need help with this question https://imgur.com/a/ysbFJcz
We have a channel for DSA. You might wanna check #algos-and-data-structs
nope
New Tutorial series about Deep Learning with PyTorch!
In this part we will implement our first convolutional neural network (CNN) that can do image classification based on the ...
You should be able to see what about this is wrong. If you can't, what you're trying to do is probably above your skill level
so basically for our project we will be having to use some models and we decided on LSTM/ RNN as its time series
tmrw we have a meeting with our mentor and we r supposed to have done some research or have a demo model or anything
so basically main PS is
like aiml powered smart energy management system
be it for home or office anything
so like we r supposed to collect data through IoT devices
and there will be 3 models:
consumption prediction
anomaly detection
generation prediction
original idea was to make it for gated communities then thermostat, AC use case for offices came later we just combined and made it general
idk what exactly to prepare for this. Like my teammate is going through a github project that has like LSTM for some finland electricity consumption thingy
im trying to go through research papers but any inputs ideas etc? what models can be used and metrics to be kept in mind. we talked to our professor and she suggested the RNNs / LSTMs
Matplotlib supports interactive plots (switch backend to an interactive one, e.g. matplotlib.use("TkAgg"), or if you're in a jupyter notebook use %matplotlib widget). I also sometimes use other libraries than mpl, for example plotly can do very cool interactive plots which support, for instance, showing all the info about a point in a tooltip when you hover over it.
Hello, I am starting in ML, I would like to work in a project to improve, send me DM
hi does polars have tuple support? can't find anything on it
Your class definition is all messed up
class ConvNet(nn.Module):
    def __init__(self,):
        self.conv1 = nn.Conv2d(3, 6,5)#r channels
        self.pool = nn.MaxPool2d(2,2)
        self.conv2= nn.Conv2d(6,16,5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2= nn.Linear(120, 84)
        self.fc3= nn.Linear(84, 10)
        x = conv1
    def forward(self,x):
        pass
Revise this part first
what part specifically?
train what LLM to do what?
there are many GPT models
from scratch
You didn't call super().__init__(), so the nn.Module base class never gets initialized and assigning layers in __init__ will raise an error
You don't seem to have a layer for flattening, so your Linear layer won't be able to take the outputs of the conv layer
x = conv1 should not be in your __init__()
You don't have any activation functions, so your layers can't learn any non-linear patterns
Your forward pass doesn't do anything, so your model won't process inputs
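Putting the points above together, one possible corrected version might look like the sketch below. It assumes 3-channel 32x32 inputs (CIFAR-10 style), which is what makes the 16*5*5 size of fc1 work out, and it picks ReLU as the activation; both are assumptions, since the original code doesn't say.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvNet(nn.Module):
    def __init__(self):
        super().__init__()  # initialize nn.Module before assigning any layers
        self.conv1 = nn.Conv2d(3, 6, 5)   # 3 input channels (RGB) -> 6 feature maps
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 32x32 -> 28x28 -> 14x14
        x = self.pool(F.relu(self.conv2(x)))  # 14x14 -> 10x10 -> 5x5
        x = torch.flatten(x, 1)               # keep the batch dim, flatten the rest
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                    # raw logits, no softmax here


# Quick shape check with a random batch of 4 fake images (assumed 3x32x32).
model = ConvNet()
logits = model(torch.randn(4, 3, 32, 32))
print(logits.shape)
```

The last layer returns raw logits because nn.CrossEntropyLoss expects unnormalized scores. If your images aren't 32x32, the 16*5*5 in fc1 has to change to match.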
It's astronomically expensive to train an LLM from scratch. Give up on this immediately.
so i cant test my own model :(?
you can create some models that aren't LLMs from scratch, and you can fine-tune existing LLMs. Only a handful of very wealthy organizations with tons of training data can create LLMs from scratch.
i want to make a small one couldnt i ?
I suppose, but it probably wouldn't be able to respond coherently to any prompt.
Sorry I don't have better news
There are still beginner ML projects that you can do. But they probably won't involve LLMs.
can you tell me one in nlp?
i made countless projects on image classification and stuff
you can fine-tune a BERT model (which was considered an LLM when it was created) to do some kind of classification task.
Image captioning
Stack a CNN on top of a RNN
okay thanks
do you see why the line with your cursor is wrong?
insert mode...
Why did you write "NeralNet" inside super() ?
how can i disable that cursor?
By indenting properly
in the white block
Press tab there, yeah
I made a (really low accuracy) implementation of an n-gram model to predict words. I didn't make it from scratch
Used an existing package that already had a tokenizer and tools for making document feature matrices
And then I implemented the prediction method just to get a feel for how ngrams work
Not LLM level stuff but it was doable and I learned a lot
Point being I think you could do something small like that
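To give a feel for the scale of that kind of project, here's a minimal bigram (n = 2) next-word predictor using only the standard library. This is not the package-based version described above, just an illustrative from-scratch sketch; the corpus is made up and `train_bigrams` / `predict_next` are hypothetical names.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Tiny made-up corpus, purely illustrative.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)

print(predict_next(model, "the"))   # "cat" follows "the" twice, "mat" once
print(predict_next(model, "dog"))   # never seen, so None
```

Low accuracy is expected with something this simple: it only ever looks one word back, and any word it hasn't seen gets no prediction at all.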
why is the loss function showing it has an error
thank you so much
What do you believe this part of your code does?
def forward(self,x):
    pass
Matter of fact, I mentioned changes in my previous reply when you asked
Because of the insert cursor I've been copying and pasting it from another bit of code. Do I just copy and paste the same forward method that I've been working with? It seems like it would mostly work out
have you seen https://github.com/karpathy/nanoGPT and https://github.com/jingyaogong/minimind/blob/master/README_en.md ?
they're not really able to have coherent conversations, but working at all at that scale is a bit impressive, and kind of does what they're looking for
that said... tbh not sure if recommending those would be of any use, considering they do not really understand what they're getting into
hello, I am looking for a partner to learn DSA with me using python and on intermediate level
You're looking for #algos-and-data-structs . Data science is something else
You can with a good graphics card or with the cloud.
They were asking where to train a foundation model for free.
When you make an LLM from scratch
I made one but its shitty
Like i tried to switch from character level vocab to subword and it's shit now
I don't think even gpt-2 could be taught to respond correctly to prompts
No but my model is shit even at training data maybe i need to do more epochs
training it rn
i already did that, youre right its pretty low accuracy when you do a broad one
np
so even gpt2 says some nonsense ?