serene scaffold Feb 1, 2025, 3:48 PM

#

what's the tldr?

smoky basalt Feb 1, 2025, 3:49 PM

#

so is deep seek better

untold shell Feb 1, 2025, 6:39 PM

#

Hey guys, any sites or Youtube channels where i can learn about Evolutionary AI ?

past meteor Feb 1, 2025, 7:45 PM

#

untold shell Hey guys, any sites or Youtube channels where i can learn about Evolutionary AI ...

I read this https://link.springer.com/book/10.1007/978-3-662-44874-8

SpringerLink

Introduction to Evolutionary Computing

fringe bluff Feb 1, 2025, 8:04 PM

#

untold shell Hey guys, any sites or Youtube channels where i can learn about Evolutionary AI ...

What is evolutionary ai btw? Related to bioinformatics?

floral tree Feb 1, 2025, 8:06 PM

#

yeah what is it

past meteor Feb 1, 2025, 8:06 PM

#

it's just optimization algorithms
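
To make "just optimization algorithms" concrete, here is a minimal sketch of an evolutionary loop: mutate a parent, keep the best candidate, repeat. The objective `fitness`, the function `evolve` and all its parameters are illustrative assumptions, not something taken from the book linked above.

```python
import random

def fitness(candidate):
    # Toy objective (an assumption for illustration): squared distance from the origin.
    # Lower is better, and the optimum is the all-zero vector.
    return sum(x * x for x in candidate)

def evolve(dim=5, offspring=20, generations=200, sigma=0.3, seed=0):
    rng = random.Random(seed)
    parent = [rng.uniform(-5, 5) for _ in range(dim)]
    for _ in range(generations):
        # Mutation: every child is the parent plus Gaussian noise
        children = [[x + rng.gauss(0, sigma) for x in parent] for _ in range(offspring)]
        # Selection: keep the best of parent and children (elitism)
        parent = min(children + [parent], key=fitness)
    return parent

best = evolve()
print(fitness(best))
```

Because the parent competes with its own children, the best fitness can never get worse from one generation to the next. Real evolutionary algorithms usually add a population, crossover and an adaptive mutation size on top of this.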

untold shell Feb 1, 2025, 8:51 PM

#

past meteor I read this https://link.springer.com/book/10.1007/978-3-662-44874-8

Thanks

smoky basalt Feb 1, 2025, 8:52 PM

#

wat should i learn after pandas if i wanna go into ml

serene scaffold Feb 1, 2025, 8:55 PM

#

smoky basalt wat should i learn after pandas if i wanna go into ml

Do you have a university degree that's related to ML?

smoky basalt Feb 1, 2025, 8:55 PM

#

serene scaffold Do you have a university degree that's related to ML?

im 15

#

🙃

#

i ve got gcses this year

#

but i wanna excel

serene scaffold Feb 1, 2025, 8:56 PM

#

smoky basalt im 15

You should plan to get a degree that's related to ML. But in the meantime, a good next step would be to train basic classifiers with scikit-learn, so that you learn about concepts like X and y data, train and test sets, and evaluation metrics.
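
A minimal sketch of what that next step could look like, assuming scikit-learn is installed. The iris dataset, the logistic regression model and the 25% test split are choices made for this example, not part of the advice above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# X holds the features (one row per flower), y holds the labels to predict
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows so the model is scored on data it never saw
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Evaluation metric: fraction of held-out flowers classified correctly
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

The same split / fit / predict / score pattern carries over to most other scikit-learn classifiers, which is why the concepts matter more than the specific library calls.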

smoky basalt Feb 1, 2025, 8:56 PM

#

greatly in cs

smoky basalt Feb 1, 2025, 8:56 PM

#

serene scaffold You should plan to get a degree that's related to ML. But in the meantime, a goo...

sounds interesting

#

so which packages should i learn after pandas

serene scaffold Feb 1, 2025, 8:57 PM

#

Don't focus on learning libraries

#

Repeat after me

#

Don't focus on learning libraries

#

(I can't hear you)

smoky basalt Feb 1, 2025, 8:58 PM

#

no as in i mean to take that route of learning sci kit learn

smoky basalt Feb 1, 2025, 8:59 PM

#

serene scaffold Don't focus on learning libraries

im not gonna learn some random libraries

#

im js asking bc idk if i should learn matplotlib

#

is it related?

#

or is it more data science

serene scaffold Feb 1, 2025, 9:00 PM

#

@smoky basalt I just gave some suggestions for what concepts to learn next and what library to use to do it. So that's my answer for now.

smoky basalt Feb 1, 2025, 9:00 PM

#

serene scaffold @smoky basalt I just gave some suggestions for what concepts to learn ne...

ok so i after pandas i move to scikit learn?

#

would i need to learn numpy or smthn like that?

#

heard numpy is important

serene scaffold Feb 1, 2025, 9:01 PM

#

smoky basalt ok so i after pandas i move to scikit learn?

I'm not going to give an answer that frames what you should learn in terms of libraries, because that's the wrong way to think of it.

smoky basalt Feb 1, 2025, 9:02 PM

#

uhm im confused

serene scaffold Feb 1, 2025, 9:02 PM

#

You need to learn concepts. You use libraries to apply those concepts

smoky basalt Feb 1, 2025, 9:02 PM

#

so wat concepts to learn first

serene scaffold Feb 1, 2025, 9:03 PM

#

I already told you

serene scaffold Feb 1, 2025, 9:03 PM

#

serene scaffold You should plan to get a degree that's related to ML. But in the meantime, a goo...

Here

unkempt wigeon Feb 2, 2025, 5:15 AM

#

why do I need to place everything into a class?

serene scaffold Feb 2, 2025, 5:18 AM

#

unkempt wigeon why do I need to place everything into a class?

you don't.

#

python is not java.

unkempt wigeon Feb 2, 2025, 5:20 AM

#

I was looking at the pytorch resources and it had the code in a class so i was just wondering

serene scaffold Feb 2, 2025, 5:21 AM

#

unkempt wigeon I was looking at the pytorch resources and it hade the code in a class so i was ...

sounds like you're talking about something more specific than "everything".

unkempt wigeon Feb 2, 2025, 5:22 AM

#

https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html

#

i got to the code with the class and the thought do i even need this in a class

vivid skiff Feb 2, 2025, 5:30 AM

#

Is there any discord server specific to pytorch?

flint sierra Feb 2, 2025, 5:30 AM

#

unkempt wigeon i got to the code with the class and the thought do i even need this in a class

The example in Creating Models does need to be a class because it uses inheritance.

neon island Feb 2, 2025, 5:32 AM

#

unkempt wigeon i got to the code with the class and the thought do i even need this in a class

If you don't subclass Module you won't be able to call its inherited to() function to send it to your device.

#

Besides, you may want to parameterize your NeuralNetwork to try different parameter settings, so it's useful to have it as a class.

iron basalt Feb 2, 2025, 5:36 AM

#

unkempt wigeon I was looking at the pytorch resources and it hade the code in a class so i was ...

Module is used to make things convenient and acts as a composition tool. It's a pattern they want you to use, but it's not required.

#

A standard OOP library pattern is to have a fundamental building block that can be inherited from. For example, in a game engine this would typically be the GameObject type.

unkempt wigeon Feb 2, 2025, 5:40 AM

#

iron basalt Module is used to make things convenient and acts as a composition tool. It's a ...

Thank you I have a hard time understanding classes I know what they are and their subtypes but to me that part is alien to me

iron basalt Feb 2, 2025, 5:44 AM

#

unkempt wigeon Thank you I have a hard time understanding classes I know what they are and thei...

If you look at their example custom Module you can see that it has a Flatten, and if you look into how Flatten is defined you will find that it too is a Module: https://github.com/pytorch/pytorch/blob/v2.6.0/torch/nn/modules/flatten.py#L13

arctic wedgeBOT Feb 2, 2025, 5:44 AM

#

torch/nn/modules/flatten.py line 13

```py
class Flatten(Module):
```

iron basalt Feb 2, 2025, 5:45 AM

#

The idea is to build new Modules by composition of other Modules.

#

This forms a tree structure, where each Module has some list of children, and when you then call something like to, it can navigate this tree and call to on all the children, and all of their children (etc), resulting in everything getting sent.

#

You could build your own system like this, or do it manually for each one.
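
A rough sketch of the "build your own system" idea, in plain Python with no PyTorch. `ToyModule`, `ToyLinear`, `ToyNetwork` and the `add` helper are hypothetical names for illustration only; this is not how `torch.nn.Module` is actually implemented, and the device is just a string here.

```python
class ToyModule:
    """Hypothetical stand-in for a Module base class (not PyTorch's actual code)."""

    def __init__(self):
        self.children = []
        self.device = "cpu"

    def add(self, child):
        # Register a child so that tree-wide operations can reach it
        self.children.append(child)
        return child

    def to(self, device):
        # Move this node, then recurse so every descendant is moved as well
        self.device = device
        for child in self.children:
            child.to(device)
        return self


class ToyLinear(ToyModule):
    pass


class ToyNetwork(ToyModule):
    def __init__(self):
        super().__init__()
        # Composition: the network is built from other modules
        self.first = self.add(ToyLinear())
        self.second = self.add(ToyLinear())


net = ToyNetwork().to("cuda")
print(net.device, net.first.device, net.second.device)
```

One call on the root reaches the whole tree, which is the convenience you get from inheriting from a shared base class instead of tracking every layer by hand.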

gilded sundial Feb 2, 2025, 6:39 AM

#

Can anyone help on a beginner's project "Spotify Recommendation System" ? I am really stuck on how I should Cluster and find data to make the recommendation system model

stuck tapir Feb 2, 2025, 8:34 AM

#

gilded sundial Can anyone help on a beginner's project "Spotify Recommendation System" ? I am r...

I figure I could help,

gilded sundial Feb 2, 2025, 8:35 AM

#

stuck tapir I figure I could help,

Sir you helped me a lot yesterday. So I figured I shouldn't bother you . Sorry 😔.

stuck tapir Feb 2, 2025, 8:45 AM

#

Check dms,

tawdry sundial Feb 2, 2025, 8:55 AM

#

How is fine-tuning expensive? It seems very cheap to fine-tune, while a lot of people claim it's expensive

#

Fine-tuning seems better for most use cases

#

I feel like i am missing something

past meteor Feb 2, 2025, 9:04 AM

#

tawdry sundial Fine-tuning seems better for most use cases

better than what?

tawdry sundial Feb 2, 2025, 9:55 AM

#

past meteor better than what?

Than rag

past meteor Feb 2, 2025, 10:10 AM

#

tawdry sundial Than rag

finetuning openai models is mostly about having the models follow the system prompt and not necessarily about "knowledge"

tawdry sundial Feb 2, 2025, 12:51 PM

#

if 3-7 batches is enough for the model to acquire knowledge and follow certain prompts, then it would be really worth it. however i am still not sure why its not nearly as common as rag

round tusk Feb 2, 2025, 5:08 PM

#

I'm a little new to this concept, but I was looking at this video where they simulated a Deep Reinforcement Learning AI. The guy said that it was trained for 5 years. Do they actually train and simulate it for 5 years in human time? Or do they mean the 5 years in game?

#

The game they simulated was pokemon red

#

I'm dipping my feet in this area, and the fact that it takes years to train seems a little daunting, but I highly doubt they do that for so long.

serene grail Feb 2, 2025, 5:22 PM

#

round tusk I'm a little new to this concept, but I was looking at this video where they sim...

It means 5 years of in-game time
As far as I know, they often train these in parallel, so they might have, let's say, 200 versions of the game being played at the same time
5 years divided by 200 would be around 10 days (of real human time)

And the more and better hardware you have, the faster you can do it
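
The arithmetic behind that estimate, as a small sketch. The 200 parallel games is the hypothetical figure from the message above, and the calculation assumes each game runs at real-time speed; emulators can often run faster than that, which would shorten it further.

```python
game_years = 5
parallel_games = 200  # hypothetical figure from the message above
days_per_year = 365

# Each emulator only has to cover its share of the total in-game time
wall_clock_days = game_years * days_per_year / parallel_games
print(wall_clock_days)  # 9.125
```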

round tusk Feb 2, 2025, 5:23 PM

#

serene grail It means 5 years of in-game time As far as I know, they often train these in par...

ohhhhh

#

That makes more sense

past meteor Feb 2, 2025, 6:13 PM

#

tawdry sundial if 3-7 batches is enough for the model to acquire knowledge and follow certain p...

Because it's not about knowledge it's about following instructions

#

finetuning an openai model should be seen as a substitute for a long prompt, i.e. a way to shorten it

unkempt wigeon Feb 2, 2025, 8:20 PM

#

Is this correct for a resource?

serene scaffold Feb 3, 2025, 1:09 AM

#

unkempt wigeon Is this correct for a resource?

I answered that for you a while ago.

unkempt wigeon Feb 3, 2025, 1:33 AM

#

serene scaffold I answered that for you a while ago.

Thank you I couldn't remember

#

Sorry

glacial root Feb 3, 2025, 2:46 AM

#

how important is differential equations for machine learning

serene scaffold Feb 3, 2025, 3:04 AM

#

glacial root how important is differential equations for machine learning

it's not.

glacial root Feb 3, 2025, 3:07 AM

#

serene scaffold it's not.

so i just need linear algebra and multivariable calculus

#

and then statistics

serene scaffold Feb 3, 2025, 3:07 AM

#

those are more relevant, yes

#

and it's specifically multivariate calculus for derivatives. I don't know of any application for integrals in ML.

wooden sail Feb 3, 2025, 4:35 AM

#

many cost functions tend to be formulated as integrals, in their so-called "variational form"
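
One example of a cost written as an integral is the KL divergence, the integral of p(x) * log(p(x) / q(x)) over x, which shows up in variational inference. That specific choice is an assumption for this sketch; no particular cost function was named above. The code approximates the integral numerically for two Gaussians and prints it next to the known closed form.

```python
import math

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def kl_divergence(p, q, lo=-12.0, hi=12.0, steps=20000):
    # Midpoint-rule approximation of the integral of p(x) * log(p(x) / q(x)) dx
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        px, qx = p(x), q(x)
        if px > 0.0 and qx > 0.0:
            total += px * math.log(px / qx) * dx
    return total

p = lambda x: gaussian_pdf(x, 0.0, 1.0)
q = lambda x: gaussian_pdf(x, 1.0, 1.0)

# For two unit-variance Gaussians the closed form is (mean difference)^2 / 2
closed_form = 0.5
print(kl_divergence(p, q), closed_form)
```

In practice such integrals are rarely computed on a grid like this; they are usually estimated from samples, which is where the derivative-based training loop takes over again.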

iron basalt Feb 3, 2025, 4:39 AM

#

glacial root how important is differential equations for machine learning

This depends more so on what you are trying to have it learn. If you are trying to use machine learning to approximate (learn) physics for example (being done a bunch in computer graphics right now), then yes.

#

Differential equations is what is currently used in physics to describe physical systems' behavior (and for the foreseeable future).

#

If you want to do ML research, then it can be good to know too, as it opens the door to physics for you, and ML is no stranger to taking ideas from there (recent Nobel Prize in physics was given to an ML researcher due to its link to physics).

#

(If you want to do (broad) research you want all the math so you can take ideas from other fields (go wide, not narrow))

past meteor Feb 3, 2025, 6:33 AM

#

I think I like answering no to these questions

#

If someone asks if they need to know diff eq for ML/AI the answer is likely just no. If they’re interested in theory and not practice, they’d likely just want to pick it up themselves

#

Or you learn the parts of diff eq you need to when you need to (it’s how I approach it)

gilded sundial Feb 3, 2025, 9:19 AM

#

from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Keep only songs with at least 3 user interactions
song_interaction_count = test_data_kmeans.groupby('name')['user'].count()
popular_songs = song_interaction_count[song_interaction_count >= 3].index
test_data_kmeans = test_data_kmeans[test_data_kmeans['name'].isin(popular_songs)]

# Create a utility matrix (user-item matrix)
utility_matrix = test_data_kmeans.pivot_table(index='user', columns='name', values='user_rating', fill_value=0)

# Convert the utility matrix to a sparse matrix
sparse_matrix = csr_matrix(utility_matrix)

# Define a batched KNN function for incremental computation
def recommend_songs_batched(song_name, utility_matrix, sparse_matrix, batch_size=1000, num_recommendations=5):
    if song_name not in utility_matrix.columns:
        print(f"Song '{song_name}' not found in the dataset!")
        return []

    song_idx = utility_matrix.columns.get_loc(song_name)

    # One row per song, so that neighbors are songs rather than users
    item_matrix = sparse_matrix.T.tocsr()
    target = item_matrix[song_idx]
    recommendations = []

    for start in range(0, item_matrix.shape[0], batch_size):
        end = min(start + batch_size, item_matrix.shape[0])

        # Fit KNN only on the batch; a short final batch may hold
        # fewer songs than the requested number of neighbors
        k = min(num_recommendations + 1, end - start)
        knn = NearestNeighbors(metric='cosine', algorithm='brute', n_neighbors=k)
        knn.fit(item_matrix[start:end])

        # Compute neighbors for the target song
        distances, indices = knn.kneighbors(target)

        for dist, idx in zip(distances.flatten(), indices.flatten()):
            # Map the position inside the batch back to the full column index
            global_idx = start + idx
            # Skip the target song itself; it only appears in one of the batches
            if global_idx == song_idx:
                continue
            recommendations.append((utility_matrix.columns[global_idx], 1 - dist))

    return sorted(recommendations, key=lambda x: -x[1])[:num_recommendations]

#

# Example usage
song_to_recommend = "Camby Bolongo"  # Replace with a valid song name
recommendations = recommend_songs_batched(song_to_recommend, utility_matrix, sparse_matrix, batch_size=5000, num_recommendations=5)

print(f"Recommendations for '{song_to_recommend}':")
for rec, score in recommendations:
    print(f"{rec} (Similarity Score: {score:.2f})")

#

Can anyone tell me why I can't find songs in my dataset while I am searching from songs in my dataset ?

#

jaunty helm Feb 3, 2025, 9:32 AM

#

gilded sundial Can anyone tell me why I can't find songs in my dataset while I am searching fro...

```py
if song_name not in utility_matrix.columns:
```
you sure that the column names are where you want to check for a song?

#

like I expect utility_matrix['some_column_that_has_song_names']

gilded sundial Feb 3, 2025, 9:35 AM

#

jaunty helm ```py if song_name not in utility_matrix.columns: ```you sure that the colum...

My utility matrix is like this: it has a song name for every column

jaunty helm Feb 3, 2025, 9:37 AM

#

gilded sundial My utility matrix is like this which takes song names in every columns

interesting choice
still tho, add a print or a breakpoint there to make sure it's actually in the columns to debug

gilded sundial Feb 3, 2025, 9:38 AM

#

I did, and some names do match. The problem is likely caused by the batching.

woeful pulsar Feb 3, 2025, 9:51 AM

#

Hello guys. I am a newbie in python and a data science enthusiast.
Quick question, how do I delete a NaN row in python?

jaunty helm Feb 3, 2025, 9:52 AM

#

woeful pulsar Hello guys. I am a newbie in python and a data science enthusiast. Quick questi...

pandas? df = df.dropna()
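
A small sketch of how `dropna` behaves, assuming pandas and NumPy are installed. The DataFrame and its column names are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "score": [1.0, np.nan, 3.0, np.nan],
    "city": ["x", "y", None, None],
})

any_nan_dropped = df.dropna()                     # drop rows with a missing value in any column
score_nan_dropped = df.dropna(subset=["score"])   # only look at the "score" column
all_nan_dropped = df.dropna(how="all")            # drop rows where every column is missing

print(len(any_nan_dropped), len(score_nan_dropped), len(all_nan_dropped))
```

`dropna` returns a new DataFrame instead of changing `df`, which is why the one-liner above assigns the result back with `df = df.dropna()`.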

woeful pulsar Feb 3, 2025, 9:55 AM

#

Yes. Okay.
Thanks a lot

wide bane Feb 3, 2025, 9:58 AM

#

I want to make a model which will generate a 2d image from text input from user and then it will make the 3d model of the 2d image which was created and the 3d model will be used in blender to view, so which model should I use or the best resources which can help??

Also the model must run locally in my system

agile cobalt Feb 3, 2025, 12:08 PM

#

see https://stability.ai/stable-3d + look up alternatives to it

Stability AI

Stable Zero123 — Stability AI

Stable Zero123 is an advanced AI model specialized in generating 3D objects. It stands out due to its capability to accurately interpret how objects should appear from various perspectives, which is a significant advancement in the realm of 3D visualization.

candid ridge Feb 3, 2025, 2:07 PM

#

idk if this is related but
am i the only one getting this error with google gemini?
EPROTO 04110000:error:0A000119:SSL routines:ssl3_get_record:decryption failed or bad record mac:c:\\ws\\deps\\openssl\\openssl\\ssl\\record\\ssl3_record.c:623:

same error happened even when i use python, nodejs, or curl

serene scaffold Feb 3, 2025, 2:11 PM

#

candid ridge idk if this is related but am i the only one getting this error with google gemi...

one needs to see the code that causes the error as well, but this is probably an API question rather than an AI question.

candid ridge Feb 3, 2025, 2:12 PM

#

it also happen when i use the official library pip install google-genai

serene scaffold Feb 3, 2025, 2:12 PM

#

candid ridge it also happen when i use the official library `pip install google-genai`

please only share text as actual text. not as a screenshot.

candid ridge Feb 3, 2025, 2:12 PM

#

```
requests.exceptions.SSLError: HTTPSConnectionPool(host='generativelanguage.googleapis.com', port=443): Max retries exceeded with url: /v1beta/models/gemini-1.5-flash:generateContent (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:1018)')))
```

#

from google import genai

client = genai.Client(api_key="bla")
response = client.models.generate_content(
    model="gemini-1.5-flash", contents="Explain how AI works"
)
print(response.text)

lean oriole Feb 3, 2025, 2:50 PM

#

hello guys

#

i'll be in your care 🙂

silent basin Feb 3, 2025, 3:45 PM

#

OpenAi: "hey, you stole our data"
Everone: "now you know how it feels"

#

WhatsApp_Image_2025-02-03_at_17.03.55_a1bcc84d.jpg

sullen herald Feb 3, 2025, 4:03 PM

#

candid ridge ```py from google import genai client = genai.Client(api_key="bla") response = ...

I think it's deprecated.
You may use,

```py
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Explain how AI works")
print(response.text)
```

candid ridge Feb 3, 2025, 4:17 PM

#

>>> import google.generativeai as genai
>>> genai.configure(api_key='dskodkskdsooas')
>>> model = genai.GenerativeModel('gemini-1.5-flash')
>>> resp = model.generate_content('hello')
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1738599352.419306   11664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599352.538987   11664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599352.661139   20156 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599352.791475   20156 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599352.899497    9664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.024943    9664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.155409   11664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.287683    9664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.415192    9664 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context
I0000 00:00:1738599353.545329   20156 ssl_transport_security.cc:1665] Handshake failed with error SSL_ERROR_SSL: error:1e000065:Cipher functions:OPENSSL_internal:BAD_DECRYPT: Invalid certificate verification context

sullen herald Feb 3, 2025, 4:19 PM

#

candid ridge ```ps >>> import google.generativeai as genai >>> genai.configure(api_key='dskod...

https://discuss.ai.google.dev/t/gemini-api-connection-ssl-error/335/8

Build with Google AI

Gemini API Connection SSL error

excuse me, have you solved this problem?

#

thank you for your reply, i have fixed this by, genai.configure(api_key=GOOGLE_API_KEY, transport='rest')

unkempt wigeon Feb 3, 2025, 4:19 PM

#

Is it possible to combine activation functions?

candid ridge Feb 3, 2025, 4:21 PM

#

sullen herald ```thank you for your reply, i have fixed this by, genai.configure(api_key=GOOGL...

😭

#

```
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='generativelanguage.googleapis.com', port=443): Max retries exceeded with url: /v1beta/models/gemini-1.5-flash:generateContent?%24alt=json%3Benum-encoding%3Dint (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:1018)')))
```

#

can you try with ur api key so i know if it on google or on me

#

it used to work not that long ago

#

like few weeks

sullen herald Feb 3, 2025, 4:26 PM

#

I will check wait

sullen herald Feb 3, 2025, 4:33 PM

#

candid ridge can you try with ur api key so i know if it on google or on me

Probably issues with your dev setup. I tried both on my env and on Kaggle. It works.

#

Sending you nb link over dm.

candid ridge Feb 3, 2025, 4:55 PM

#

sullen herald Probably issues with your dev setup. I tried both on my env and on Kaggle. It wo...

i tried on my phone and it worked now i think it was just my laptop

Screenshot_2025-02-03-23-53-47-18_a2cf8efcdd42a8e6f7906303f104fb67.jpg

#

eager hamlet Feb 3, 2025, 9:13 PM

#

hey, I'm trying to train a model to read a word from an image. I'm exploring architectures like attention-based encoder-decoder models or CNN-RNN-CTC architecture. What do you guys think would be better?

glacial root Feb 3, 2025, 10:56 PM

#

iron basalt Differential equations is what is currently used in physics to describe physical...

oh i see, i'm going to uni next year and am asking this question mainly to decide whether to do it in community college over the summer or at uni next year where it will be much more rigorous but also i think that rigor can be more beneficial to me, so i just wanted to know how important it will be for my future career

unkempt apex Feb 4, 2025, 5:28 AM

#

eager hamlet hey, I'm trying to train a model to read a word from an image. I'm exploring arc...

An off-the-shelf OCR tool fits best in this type of situation,
because when you train your own model there are concerns about accuracy

dawn current Feb 4, 2025, 10:26 AM

#

Can any one suggest platforms to learn and brushup my coding skills

hollow pagoda Feb 4, 2025, 1:08 PM

#

dawn current Can any one suggest platforms to learn and brushup my coding skills

codewars

#

leetcode

summer flame Feb 4, 2025, 3:17 PM

#

Hey guys I always struggled with Neural Networks, so I made a simple, yet powerful Neural Network file for beginners 😄

Here it is:

#

📎 BeginnerNN.py

#

please give me reviews since this is my first release in... anything, so please give feedback!

jade hatch Feb 4, 2025, 3:32 PM

#

summer flame

did you use chatgpt lemon_smug lemon_smirk

summer flame Feb 4, 2025, 3:32 PM

#

nop spent a month writing this thing

serene scaffold Feb 4, 2025, 3:33 PM

#

summer flame nop spent a month writing this thing

they just said that because it has a lot of comments

summer flame Feb 4, 2025, 3:33 PM

#

o yeah

#

since its an unofficial official package, i just wrote a bunch of comments

hollow cobalt Feb 4, 2025, 4:54 PM

#

Hello guys. Me and a friend are beginning a project of building an LLM. If you want to join us or hear more send me a DM.

serene scaffold Feb 4, 2025, 5:00 PM

#

hollow cobalt Hello guys. Me and a friend are beginning a project of building an LLM. If you w...

building an LLM takes an absurd amount of compute power. are you sure you don't mean "fine-tune"? this is a critical distinction.

hollow cobalt Feb 4, 2025, 5:04 PM

#

No, I know the difference. Give an example of the minimum computing power you think is required.

serene scaffold Feb 4, 2025, 5:08 PM

#

hollow cobalt No, I know the difference. Give an example of the minimum computing power you th...

what kind of LLM do you want to make?

hollow cobalt Feb 4, 2025, 5:22 PM

#

It’s a transformer architecture LLM

serene scaffold Feb 4, 2025, 5:30 PM

#

hollow cobalt It’s a transformer architecture LLM

what's your budget?

flat mirage Feb 4, 2025, 8:30 PM

#

Someone who speaks Spanish

tidal bough Feb 4, 2025, 11:09 PM

#

hollow cobalt No, I know the difference. Give an example of the minimum computing power you th...

for a fairly serious model like llama3.1, with 405B parameters, estimates are around 50 million dollars (not counting capital expenditures). A tiny one you could probably train for less than a million by renting a datacenter

#

deepseek v3 was trained for "only" 5 million, but it's a massive outlier in cheapness that required a lot of low-level optimization.
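
A back-of-the-envelope sketch of where numbers like these come from: GPU-hours times a rental price. The GPU-hour totals below are recalled from the public DeepSeek-V3 technical report and the Llama 3.1 model card and should be treated as approximate; the $2 per GPU-hour rate is an assumption, and `rental_cost_usd` is a hypothetical helper.

```python
def rental_cost_usd(gpu_hours, usd_per_gpu_hour=2.0):
    # Rented-compute cost only: no salaries, failed runs, data, or hardware purchases
    return gpu_hours * usd_per_gpu_hour

# GPU-hour totals as recalled from the public reports (treat as approximate)
deepseek_v3_gpu_hours = 2.788e6   # H800 hours, DeepSeek-V3 technical report
llama_405b_gpu_hours = 30.84e6    # H100 hours, Llama 3.1 405B model card

print(f"DeepSeek-V3:    ${rental_cost_usd(deepseek_v3_gpu_hours) / 1e6:.1f}M")
print(f"Llama 3.1 405B: ${rental_cost_usd(llama_405b_gpu_hours) / 1e6:.1f}M")
```

Under these assumptions the two estimates land in the same ballpark as the figures quoted above, roughly an order of magnitude apart.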

hollow pagoda Feb 5, 2025, 6:54 AM

#

flat mirage Someone who speaks Spanish

hola coma estas

#

ajajajaja

tropic bluff Feb 5, 2025, 8:00 AM

#

Hi guys I want to prepare for data science internship but I don't have that much knowledge but my interview is very near how can I prepare to ace the interview?

left tartan Feb 5, 2025, 2:18 PM

#

tropic bluff Hi guys I want to prepare for data science internship but I don't have that much...

#career-advice plz

void crescent Feb 5, 2025, 2:25 PM

#

guys whats the best BERT edition i just picked one random one (its doing horribly during fine-tuning)

📎 message.py

serene scaffold Feb 5, 2025, 2:28 PM

#

void crescent guys whats the best BERT edition i just picked one ranodm one (its doing horribl...

there isn't a best one. it depends on what you're trying to do and what kind of text you're trying to process

void crescent Feb 5, 2025, 2:30 PM

#

serene scaffold there isn't a best one. it depends on what you're trying to do and what kind of ...

basically a paragraph, question, and answer separated by [SEP] tokens. Classify the answers if they are correct or not

serene scaffold Feb 5, 2025, 2:31 PM

#

void crescent basically a paragraph, question, and answer seperated by [SEP] tokens. Classify ...

what are the paragraphs and questions about, topically? cooking? politics? sports?

void crescent Feb 5, 2025, 2:31 PM

#

its mostly political

#

but like

#

one is on US's history

#

one is on New Zealand airline

#

ok nvm its kinda varied

#

but the sources are all from CNN

#

so which models can understand CNN articles

serene scaffold Feb 5, 2025, 3:02 PM

#

@void crescent look to see if there's a BERT model on huggingface that's trained on news articles.

hard nest Feb 5, 2025, 4:14 PM

#

In object detection, I have an image and the coordinates of the item in question. But some images don't have the item, and the code returns an error if I give empty coordinates in the training. What should I do?

void crescent Feb 5, 2025, 4:16 PM

#

serene scaffold @void crescent look to see if there's a BERT model on huggingface that's...

https://huggingface.co/cssupport/bert-news-class

cssupport/bert-news-class · Hugging Face

#

but how to port it to tensorflow

serene scaffold Feb 5, 2025, 4:26 PM

#

void crescent but how to port it to tensorflow

I don't know. I recommend using pytorch, since that's what industry is using.

glacial root Feb 5, 2025, 4:29 PM

#

which is better to learn, tensorflow, pytorch, or scikit-learn

jaunty helm Feb 5, 2025, 4:31 PM

#

glacial root which is better to learn, tensorflow, pytorch, or scikit-learn

torch & tf are similar, but the former is way more popular at this point
scikit has more "traditional" stuff if you will, like the classic lin reg, decision trees, svms, just to name a few

glacial root Feb 5, 2025, 4:32 PM

#

jaunty helm torch & tf are similar, but the former is way more popular at this point scikit ...

oh do the first two not have these?

jaunty helm Feb 5, 2025, 4:33 PM

#

glacial root oh do the first two not have these?

nope, pytorch and tensorflow are specifically focused on neural networks

glacial root Feb 5, 2025, 4:34 PM

#

jaunty helm nope, pytorch and tensorflow is specifically focused on neural networks

so they have different uses, in that case should i learn scikit and then one of these two?

#

with a focus on torch or tensor cause computer vision is heavily focused on cnns right

jaunty helm Feb 5, 2025, 4:40 PM

#

glacial root so they have different uses, in that case should i learn scikit and then one of ...

sure

> computer vision is focused on CNNs
while I'm not that familiar with cv, I wouldn't say that it's all CNNs, like vision transformers (ViT) are now a thing (maybe partly due to transformers working great for NLP => people trying to shove them everywhere, but still)

glacial root Feb 5, 2025, 6:08 PM

#

jaunty helm sure > computer vision is focused on CNNs while I'm not that familiar with cv, I...

oh i see

#

i don't really know much, not in college yet

#

but hopefully i can learn some cv before college, i need to first learn math up until differential equations though

frosty pawn Feb 5, 2025, 6:20 PM

#

does someone here know about ai?

#

i want some advice, like...

#

what should i learn?

#

i know i should learn python, but do i need a lot of experience with the python language?

small arch Feb 5, 2025, 7:51 PM

#

does someone know about fast rcnn?

serene scaffold Feb 5, 2025, 7:58 PM

#

small arch does someone know about fast rcnn?

be sure to always ask your actual question. don't ask for an expert.

muted vine Feb 5, 2025, 8:24 PM

#

hey guys, how is it going? have any of you used the DeepSeek distill of Llama on AWS Bedrock? I want to know what it costs to use and how the process of importing such a model works. My API currently uses the Llama model on AWS, but I want to migrate to the DeepSeek model.

smoky basalt Feb 5, 2025, 10:17 PM

#

serene scaffold building an LLM takes an absurd amount of compute power. are you sure you don't ...

i dont think he minds waiting a thousand years

#

for it to return a response

#

can someone show me complex code for smthn in ml

#

i wanna see how compelx it gets

tepid tartan Feb 5, 2025, 10:24 PM

#

stats: basic descriptive + tests (t-test, ANOVA, chi2, correlation)
SQL: entity-relationship modeling, joins (obviously), views, common table expressions, being good at queries
python: pandas, numpy, matplotlib/seaborn
excel basics, Power BI basics, Tableau basics
added value: R, SAS

is this a good way to get the basics of data analysis?

serene scaffold Feb 5, 2025, 10:24 PM

#

smoky basalt for it to return a response

they're asking about creating one from scratch. if you tried to do that on a "normal" computer, it would probably take longer than the universe has existed.

#

and that's before you can actually start using it

smoky basalt Feb 5, 2025, 10:25 PM

#

💀

smoky basalt Feb 5, 2025, 10:25 PM

#

tepid tartan stats : basic descriptive + tests (t test anova chi2 correlation) SQL : modelisa...

woah

smoky basalt Feb 5, 2025, 10:25 PM

#

serene scaffold they're asking about creating one from scratch. if you tried to do that on a "no...

any projects u made?

tepid tartan Feb 5, 2025, 10:53 PM

#

smoky basalt woah

Im trying to get into data analyst

smoky basalt Feb 5, 2025, 11:33 PM

#

tepid tartan Im trying to get into data analyst

fairs

#

ml looks cool

spice ravine Feb 5, 2025, 11:59 PM

#

Claude AI seems to be the best for coding

#

At least for me

#

Then it’s DeepSeek and then gpt is just the worst

serene scaffold Feb 6, 2025, 2:11 AM

#

spice ravine Then it’s DeepSeek and then gpt is just the worst

What about the llamas?

spice ravine Feb 6, 2025, 2:46 AM

#

I've never tried the llamas

#

is it good?

obtuse yacht Feb 6, 2025, 4:10 AM

#

From what I've seen llamas better at accuracy and efficiency in coding tasks

#

Chatgpt though is better for assistance with creative tasks ex. writing code comments or generating documentation

#

but claude is the best for coding in my opinion

tender hearth Feb 6, 2025, 4:12 AM

#

obtuse yacht From what I've seen llamas better at accuracy and efficiency in coding tasks

dont take these benchmarks too seriously

#

theyre good to establish some sort of baseline but anything within a few percentage points shouldn't be taken to be significant

#

use the models on your own and formulate your own opinion

obtuse yacht Feb 6, 2025, 4:13 AM

#

thats true

#

companies often fund these "benchmarks" to demonstrate the new models

tender hearth Feb 6, 2025, 4:23 AM

#

https://en.wikipedia.org/wiki/Goodhart's_law

Goodhart's law

Goodhart's law is an adage often stated as, "When a measure becomes a target, it ceases to be a good measure". It is named after British economist Charles Goodhart, who is credited with expressing the core idea of the adage in a 1975 article on monetary policy in the United Kingdom:

Any observed statistical regularity will tend to collapse once...

woeful escarp Feb 6, 2025, 7:12 PM

#

Does anybody know about daily dataset updates?
I'm wondering about data for a trading bot (ML), i saw a dataset on kaggle but i found it's not accurate, it had price spread

unkempt apex Feb 6, 2025, 8:57 PM

#

woeful escarp Does anybody know about daily dataset updates? I'm wondering about data for a tr...

you may need API's which updates frequently

heavy canyon Feb 6, 2025, 9:35 PM

#

obtuse yacht From what I've seen llamas better at accuracy and efficiency in coding tasks

claude is very helpful at coding problems. it’s been able to answer all of my class questions

woeful escarp Feb 6, 2025, 10:38 PM

#

unkempt apex you may need API's which updates frequently

thanks, do you know more about?

#

because i want daily trading or weekly trading, but 1st of all i need the data haha

obtuse yacht Feb 7, 2025, 1:09 AM

#

woeful escarp thanks, do you know more about?

what type of trading are you doing?

#

stocks?

woeful escarp Feb 7, 2025, 1:10 AM

#

obtuse yacht stocks?

Index

#

Man index and currency

obtuse yacht Feb 7, 2025, 1:20 AM

#

woeful escarp Index

import pandas as pd
from tqdm import tqdm
import yfinance as yf
import os
import contextlib
import shutil
from os.path import join

def read_symbols_data():
    data = pd.read_csv("http://www.nasdaqtrader.com/dynamic/SymDir/nasdaqtraded.txt", sep='|')
    data_clean = data[data['Test Issue'] == 'N']
    symbols = data_clean['NASDAQ Symbol'].tolist()
    return symbols, data_clean

def download_specific_symbols_data(symbols, period):
    os.makedirs('hist', exist_ok=True)
    is_valid = {}

    with open(os.devnull, 'w') as devnull:
        with contextlib.redirect_stdout(devnull):
            for symbol in tqdm(symbols, desc="Collecting"):
                data = yf.download(symbol, period=period)
                if len(data.index) == 0:
                    continue
                is_valid[symbol] = True
                data.to_csv(f'hist/{symbol}.csv')

    valid_symbols = [symbol for symbol in symbols if symbol in is_valid]
    return valid_symbols

def move_symbols_to_directory(symbols, source, dest):
    os.makedirs(dest, exist_ok=True)
    for symbol in symbols:
        filename = f'{symbol}.csv'
        shutil.move(join(source, filename), join(dest, filename))

#

def check_if_empty(input_path:str):
    # Despite the name, this clears the directory: delete it and everything inside it
    shutil.rmtree(input_path)

def optimize_code_specific_symbols(symbols, period='max'):
    os.makedirs('hist', exist_ok=True)
    
    data_path = "data"
    if os.path.exists(data_path):
        check_if_empty(data_path)

    valid_symbols = download_specific_symbols_data(symbols, period)

    _, data_clean = read_symbols_data()
    valid_data = data_clean[data_clean['NASDAQ Symbol'].isin(valid_symbols)]

    os.makedirs('data', exist_ok=True)

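    stocks = valid_data[valid_data['ETF'] == 'N']['NASDAQ Symbol'].tolist()
    # Index tickers such as ^GSPC are not in the NASDAQ symbol directory, so keep them as well
    listed_symbols = set(data_clean['NASDAQ Symbol'])
    unlisted = [symbol for symbol in valid_symbols if symbol not in listed_symbols]

    move_symbols_to_directory(stocks + unlisted, "hist", "data")

    # hist can still contain ETF files, and os.rmdir fails on a non-empty directory
    shutil.rmtree('hist')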


if __name__ == "__main__":
    index_symbols = [ 
        "^GSPC", # s&p 500
        "^DJI", # dow jones industrial average
        "^IXIC", # nasdaq composite
        "^RUT", # russell 2000
        "^VIX", # volatility index
        "^FTSE", # ftse 100
        "^N225" # nikkei 225
    ]
    specific_symbols = ['^GSPC'] #S&P 500 should appear in a folder under "{stock_name}.csv"
    optimize_code_specific_symbols(specific_symbols)

#

this is some old code that i made when I did a stock prediction project

#

I added the index ticker codes in the main function

#

michael_photo_2025-02-06_at_8.21.50_PM.png

#

your dataset should look like this

glacial root Feb 7, 2025, 1:31 AM

#

obtuse yacht ``` import pandas as pd from tqdm import tqdm import yfinance as yf import os im...

is tqdm the library that makes the console look nice

obtuse yacht Feb 7, 2025, 1:32 AM

#

glacial root is tqdm the library that makes the console look nice

its the loading bars

#

I added it so when you download more than one stock it would show a loading bar of how many stocks are done

glacial root Feb 7, 2025, 1:52 AM

#

obtuse yacht its the loading bars

oh, yeah i got it mixed up with another library. a while ago i watched a video about some random python libraries and tqdm was mentioned along with another library that organizes error messages and such

ionic valley Feb 7, 2025, 7:48 AM

#

Hello, got a data science interview for a Two Sigma internship but I've legit never done a data science interview before. How should I prep?

Here's what I know about the interview: "The permitted languages for the first question are: C, C++, Java, and Python. The permitted languages for the second and third questions are: Java, Octave (Matlab), Python, and R."

What should I be grinding? So far I've just been spamming Pandas syntax

void crescent Feb 7, 2025, 11:43 AM

#

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # registers the ops the preprocessing model needs

preprocess = hub.KerasLayer(tfhub_handle_preprocess)
encoder = hub.KerasLayer(tfhub_handle_encoder)

inputs = tf.keras.layers.Input(shape=(1,), dtype=tf.string)
x = preprocess(inputs)
x = encoder(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x["pooled_output"])

model = tf.keras.Model(inputs, outputs)

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.summary()

weird im getting

#


ValueError: Exception encountered when calling layer 'keras_layer_4' (type KerasLayer).

A KerasTensor is symbolic: it's a placeholder for a shape an a dtype. It doesn't have any actual numerical value. You cannot convert it to a NumPy array.

Call arguments received by layer 'keras_layer_4' (type KerasLayer):
  • inputs=<KerasTensor shape=(None, 1), dtype=string, sparse=False, name=keras_tensor_364>
  • training=None

hollow silo Feb 7, 2025, 12:44 PM

#

what is a good pandas book to read cover to cover?

flint sierra Feb 7, 2025, 12:49 PM

#

hollow silo what is a good pandas book to read cover to cover?

I would treat this more as a reference, but if you read and learn it cover to cover you'll be well off. https://wesmckinney.com/book/

Python for Data Analysis, 3E

untold bloom Feb 7, 2025, 1:34 PM

#

questions and answers in the pandas tag of stackoverflow

agile cobalt Feb 7, 2025, 3:40 PM

#

hollow silo what is a good pandas book to read cover to cover?

the official User Guides are a must-read

gilded pebble Feb 7, 2025, 5:45 PM

#

I want to learn python where can i learn from

short barn Feb 7, 2025, 5:45 PM

#

Hello

serene scaffold Feb 7, 2025, 5:46 PM

#

gilded pebble I want to learn python where can i learn from

!resources

arctic wedgeBOT Feb 7, 2025, 5:46 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

short barn Feb 7, 2025, 5:47 PM

#

Can anyone recommend me how to learn AI development?

#

I haven't found any good explanations of how AI works

serene scaffold Feb 7, 2025, 5:57 PM

#

short barn I haven't found any good explanations of how AI works

"how AI works" depends entirely on what kind of AI you're talking about.

main fox Feb 7, 2025, 6:44 PM

#

y=mx+b, at scale 🐒

hollow cobalt Feb 7, 2025, 6:56 PM

#

Lesson 1: How to break Copilot.

limber belfry Feb 7, 2025, 7:36 PM

#

Is there any off-topic channel? I have a survey to ask and idk where…

fluid basalt Feb 7, 2025, 8:29 PM

#

main fox y=mx+b, at scale 🐒

So true brother

obtuse yacht Feb 7, 2025, 8:55 PM

#

short barn I haven't found any good explanations of how AI works

have you watched 3blue1browns explanations

#

https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi

YouTube

3Blue1Brown

But what is a neural network? | Deep learning chapter 1

What are the neurons, why are there layers, and what is the math underlying it?
Help fund future projects: https://www.patreon.com/3blue1brown
Written/interactive form of this series: https://www.3blue1brown.com/topics/neural-networks

Additional funding for this project was provided by Amplify Partners

Typo correction: At 14 minutes 45 seconds...

fading wigeon Feb 7, 2025, 11:48 PM

#

I have a basic question about tree ensembles. Like I get well enough how through various methodologies we can generate 100 trees or whatever. After that is it just like… the most common answer? So if 51 trees say it’s a cat and 49 say it’s a dog then it’s a cat?

serene scaffold Feb 8, 2025, 4:04 AM

#

fading wigeon I have a basic question about tree ensembles. Like I get well enough how throug...

is "tree ensemble" different than "random forest"?

past meteor Feb 8, 2025, 9:40 AM

#

serene scaffold is "tree ensemble" different than "random forest"?

it's one tree ensemble, there's also regular bagged, regular boosted, extra trees, xgboost, ...

past meteor Feb 8, 2025, 9:40 AM

#

fading wigeon I have a basic question about tree ensembles. Like I get well enough how throug...

yes
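
A minimal sketch of that majority vote, with hypothetical per-tree predictions hard-coded instead of coming from trained trees. This is plain hard voting as described in the question; some implementations (scikit-learn's random forest, for one) average each tree's class probabilities instead, and boosted ensembles add up tree outputs rather than voting, so treat it as an illustration of the bagging-style case only.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the most common label among the per-tree predictions."""
    label, _count = Counter(tree_predictions).most_common(1)[0]
    return label


# Hypothetical ensemble of 100 trees: 51 vote "cat", 49 vote "dog"
votes = ["cat"] * 51 + ["dog"] * 49
winner = majority_vote(votes)
print(winner)
```

With probability averaging, a few very confident trees can outweigh a narrow majority, so the two schemes do not always agree on close calls like 51 vs 49.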

steel hull Feb 8, 2025, 1:41 PM

#

Guys Which is/are the best Machine Learning resource(s) for a strong academic and practical foundation? ISLP or Andrew Ng (2018 - YouTube Version) or some other resource? Which one to pick up first?

I am looking forward to build a strong academic/theoretical and practical foundation in ML. So that it would help me out during advanced courses in my masters and also in building projects

Please suggest

flat token Feb 8, 2025, 5:47 PM

#

steel hull Guys Which is/are the best Machine Learning resource(s) for a strong academic an...

Linear algebra book of your choice, linear algebra done right is what I would recommend. Some stupid YouTube ML course does not provide a single drop of theoretical foundation if that is truly what u want

dense needle Feb 8, 2025, 7:12 PM

#

linear algebra done right is my favorite linear algebra book but it’s a tough follow if someone isn’t comfortable with proof based math

#

And linear algebra foundations really

flat token Feb 8, 2025, 9:09 PM

#

Then they shouldn't be learning ML bc all they really want is to copy paste code that has been 100 ways

#

Done*

tame blaze Feb 8, 2025, 9:25 PM

#

anyone available?

serene scaffold Feb 8, 2025, 9:33 PM

#

tame blaze anyone available?

Always ask your actual question right away. Don't ask if anyone will answer your question before you ask it

#

Even if someone is available, they can't know if they can help until they know the question.

lapis sequoia Feb 9, 2025, 1:49 AM

#

Red-pill me on image segmentation

steel hull Feb 9, 2025, 6:56 AM

#

flat token Linear algebra book of your choice, linear algebra done right is what I would re...

Ok thank you

flat token Feb 9, 2025, 7:22 AM

#

lapis sequoia Red-pill me on image segmentation

image segmentation is best done using linear programming. adobe actually originally used LP as their solver for patching together images. combine that with a skillful algorithm for managing nxm images with variable colors and relevant pixelations and you have the most modern image segmentation that exists

#

enjoy ur red-pill

lofty thorn Feb 9, 2025, 7:35 AM

#

I am 25 years old...and don't know much about DATA SCIENCE.... but i am really interested to pursuing it...
and also wanna earn something from this profession
is it too late for me ?

lofty thorn Feb 9, 2025, 7:52 AM

#

I am really interested to meet with a data scientist...
is there any?

flat token Feb 9, 2025, 8:18 AM

#

lofty thorn I am 25 years old...and don't know much about DATA SCIENCE.... but i am really i...

That's like saying I'm 250 lbs can I ever just walk to the gym and start losing weight? God I mean ig if you don't want to then you never will

lapis sequoia Feb 9, 2025, 11:02 AM

#

flat token image segmentation is best done using linear programming. adobe actually origina...

Thank you.

grave garnet Feb 9, 2025, 10:11 PM

#

lofty thorn I am 25 years old...and don't know much about DATA SCIENCE.... but i am really i...

No, I started a bit later even with practically zero knowledge on the field and only a scientific background. How easy or difficult it will be to find a job I cant tell you, since that depends a lot on where you live as well. And also on what you and/or the companies define as datascience (most of it is not machine learning in my experience).

dense needle Feb 9, 2025, 10:53 PM

#

flat token Then they shouldn't be learning ML bc all they really want is to copy paste code...

Agree mostly but I moreso meant that linear algebra done right isn’t really an intro text and I thought OP was asking for intro recs

delicate cargo Feb 10, 2025, 12:20 AM

#

https://youtu.be/nqfWhZJVPyU?si=pGquX4-ncEVhFKTO

I've been really interested in this new genetic algorithm, but I've been finding it really difficult to describe to people, is anyone interested in talking about it?

YouTube

Last Monkey Standing

Update Chat: Matthew Andrews [February 8, 2024]

Matthew Andrews talks about his work with M-E-GA

https://github.com/ML-flash/M-E-GA

vale matrix Feb 10, 2025, 2:40 AM

#

need help with face recognition anyone ?

bronze fossil Feb 10, 2025, 4:31 AM

#

just ask your question

fervent canopy Feb 10, 2025, 6:35 AM

#

Read like a superhero: Turn PDFs into Bionic Reading format https://github.com/SanshruthR/Bionic_Reading_Hub

GitHub

GitHub - SanshruthR/Bionic_Reading_Hub: Read like a superhero: Turn...

Read like a superhero: Turn PDFs into Bionic Reading format https://x.com/Benavent/status/1853508523638116689 - SanshruthR/Bionic_Reading_Hub

hollow pagoda Feb 10, 2025, 9:28 AM

#

fervent canopy Read like a superhero: Turn PDFs into Bionic Reading format https://github.com/...

i like tht

weary timber Feb 10, 2025, 3:11 PM

#

guys how the fk does chatgpt handle oov when i type in the prompt "hello chat gpt kjakfdask"

weary timber Feb 10, 2025, 3:12 PM

#

fervent canopy Read like a superhero: Turn PDFs into Bionic Reading format https://github.com/...

how does it work, can you do a brief explanation?

fervent canopy Feb 10, 2025, 6:14 PM

#

weary timber how does it work, can you do a brief explanation?

So, it basically converts the pdf to doc and then it extracts all of the tables and images and renders that as a html with the font embedded inside the html file. I am using html because of consistency and almost every thing supports html.

#

You can also use those fonts locally by getting the ttf files from the repo and installing the font files https://github.com/SanshruthR/Bionic_Reading_Hub/blob/main/Fast_Serif.ttf

GitHub

Bionic_Reading_Hub/Fast_Serif.ttf at main · SanshruthR/Bionic_Readi...

Read like a superhero: Turn PDFs into Bionic Reading format https://x.com/Benavent/status/1853508523638116689 - SanshruthR/Bionic_Reading_Hub

#

there are some other font files in there too

#

I made it cuz, I hate corporate greed lol

weary timber Feb 10, 2025, 6:28 PM

#

fervent canopy So, it basically converts the pdf to doc and then it extracts all of the tables ...

sorry for not being clear, i actually wanna know how the bionic reading part happens and its assocation between ml

fervent canopy Feb 10, 2025, 6:39 PM

#

weary timber sorry for not being clear, i actually wanna know how the bionic reading part hap...

So, the font creates artificial points, the brain doesn't read the full thing, it just focuses on those points and fills in the gaps. So, instead of reading the full word the brain only focuses on the main points and knows what the word is. ML is being used to identify which letters should be selected for better reading.

weary timber Feb 10, 2025, 6:39 PM

#

oh okay

#

thanks

dapper dune Feb 10, 2025, 7:45 PM

#

Hi there. I'm new to ai. I am currently researching this topic for a few personal projects. Can someone guide me on the topic of fine tune language models? I have a general idea of this process from the hugging face manuals, but if I can ask someone about it in more detail (preferably in private messages, as I don't want to reveal details of projects to an audience), please let me know. I'd be very grateful

serene scaffold Feb 10, 2025, 7:48 PM

#

dapper dune Hi there. I'm new to ai. I am currently researching this topic for a few persona...

I will only help you in the server.
Can you describe more about what you want the fine-tuned language model to do? There are many kinds of language models, including ones that aren't what you consider to be language models.

dapper dune Feb 10, 2025, 7:53 PM

#

serene scaffold I will only help you in the server. Can you describe more about what you want th...

The main idea is that I write some literary texts and as an experiment and study of the topic I would like to train a model on them (for example gpt2-large or any other, if you can suggest better options). in particular, I am interested in how to properly compose a dataset based on my texts, how to correctly mark them up, so that the whole text is used for training and not just a part of it or 2 texts are glued together, how to add any third-party labels that will be important to the tokenizer, as well as how to train the model so that the model is approximately as good as it should be.

dapper dune Feb 10, 2025, 7:56 PM

#

serene scaffold I will only help you in the server. Can you describe more about what you want th...

I'm also wondering if it is possible to train the model so that it generates text with similar structure

serene scaffold Feb 10, 2025, 8:12 PM

#

dapper dune The main idea is that I write some literary texts and as an experiment and study...

I write some literary texts and as an experiment and study of the topic I would like to train a model on them so that the model can do ...
can you finish this sentence?

#

and as an experiment and study of the topic
also I don't understand this part.

dapper dune Feb 10, 2025, 8:14 PM

#

serene scaffold > and as an experiment and study of the topic also I don't understand this part.

that part's not important in general, I meant I'm doing it for me

serene scaffold Feb 10, 2025, 8:14 PM

#

dapper dune that part's not important in general, I meant I'm doing it for me

I mean I literally don't understand what you wrote. I don't know what it means.

#

The main idea is that I write some literary texts and I would like to train a model on them
is this the salient part?

dapper dune Feb 10, 2025, 8:16 PM

#

serene scaffold I mean I literally don't understand what you wrote. I don't know what it means.

I'm sorry for my English
In continuation of the phrase you asked: That the model would generate similar text structure

serene scaffold Feb 10, 2025, 8:18 PM

#

dapper dune I'm sorry for my English In continuation of the phrase you asked: That the model...

by "similar text structure", do you mean the syntax of individual sentences, or a more conceptual ordering of information?

dapper dune Feb 10, 2025, 8:21 PM

#

By structure I mean ordering of information, for example each text has a label [TITLE] and [MAIN PART]. The goal is to generate a text with a structure where there will be a title and a main part

serene scaffold Feb 10, 2025, 8:24 PM

#

dapper dune By structure I mean ordering of information, for example each text has a label [...

when it comes time to generate a new text, what do you want the workflow to be?

dapper dune Feb 10, 2025, 8:27 PM

#

serene scaffold when it comes time to generate a new text, what do you want the workflow to be?

If I understood the question correctly, I envision it like this: promt: “[TITLE] Once upon a time...” or promt: “[MAIN PART] A long time ago...”

serene scaffold Feb 10, 2025, 8:50 PM

#

dapper dune If I understood the question correctly, I envision it like this: promt: “[TITLE]...

so you give it the title, and then it generates the main part?

tidal pebble Feb 10, 2025, 8:53 PM

#

Any1 know where I can learn to build a collaborative filtering model? Im trying to make one for my project

dapper dune Feb 10, 2025, 8:58 PM

#

serene scaffold so you give it the title, and then it generates the main part?

that's not quite right, I'm writing a label and the beginning of the text content inside this part. As a prompt, there can be both the beginning of the text in the [TITLE] and the beginning of the text in the [MAIN PART]. the result of the generation is a text with the structure [TITLE] and [MAIN PART]. that is, in fact, the structure of the final text is always the same, but the starting point of generation can be any of its parts. I will also be satisfied with the option if just a phrase is used as a prompt, without specifying which part it refers to

dire loom Feb 10, 2025, 10:36 PM

#

hello peeps, can anyone point me in the direction of stock market focused trading communities? im a novice python programmer & funded day trader 🙂

turbid viper Feb 11, 2025, 1:22 AM

#

anyone who's worked with layoutLM or floorplans or both?

serene scaffold Feb 11, 2025, 3:36 AM

#

dapper dune that's not quite right, I'm writing a label and the beginning of the text conten...

you'd need to train it to generate the whole thing, including the [TITLE] and [MAIN PART] tags. And then when you go to use it, you prompt it with [TITLE] This is the first part of the title and have it keep generating until it generates [MAIN PART], and then you add the human-written beginning of the main part, and then you have it keep generating from there.
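
A rough sketch of how the tagged training strings and the two prompts could be built, in plain Python with no training library involved. The helper names and the sample text are hypothetical, and using `<|endoftext|>` as a separator between texts is an assumption based on GPT-2's end-of-text token; adjust it to whatever tokenizer is actually used.

```python
TITLE_TAG = "[TITLE]"
MAIN_TAG = "[MAIN PART]"
# Assumption: GPT-2's end-of-text marker, so two texts never get glued together
END_TAG = "<|endoftext|>"


def format_example(title, main_part):
    """Turn one complete text into a single tagged training string."""
    return f"{TITLE_TAG} {title}\n{MAIN_TAG} {main_part}\n{END_TAG}"


def title_prompt(title_start):
    """First prompt at generation time: the model continues the title."""
    return f"{TITLE_TAG} {title_start}"


def main_part_prompt(generated_so_far, main_start):
    """Once the model has emitted [MAIN PART], append the human-written opening."""
    prefix = generated_so_far.split(MAIN_TAG)[0].rstrip()
    return f"{prefix}\n{MAIN_TAG} {main_start}"


# Hypothetical sample text standing in for one of the ~100 real ones
texts = [("The Lighthouse", "A long time ago, a keeper lived alone by the sea.")]
training_strings = [format_example(title, main) for title, main in texts]

print(training_strings[0])
print(title_prompt("Once upon a time"))
print(main_part_prompt("[TITLE] The Lighthouse at Dusk\n[MAIN PART]", "A long time ago"))
```

Each text becomes exactly one training string, so the model sees the full [TITLE] ... [MAIN PART] ... structure every time and learns where one text ends.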

barren veldt Feb 11, 2025, 4:23 AM

#

when doing backpropagation do I use the output that is after Softmax or the raw output? (neural networks)

serene scaffold Feb 11, 2025, 4:40 AM

#

barren veldt when doing backpropagation do I use the output that is after Softmax or the raw ...

the very last output.

#

you also seem to have forgotten a very important step

stable isle Feb 11, 2025, 4:54 AM

#

@dapper dune are you generating your own training data?

#

i see you're trying to train/fine-tune on literary texts you will write...

#

@dapper dune are you interested in creating data-sets of an application's usage?

dapper dune Feb 11, 2025, 4:57 AM

#

stable isle i see you're trying to train/fine-tune on literary texts you will write...

not really, I already have about 100 texts written by me.

#

I'm more interested in how to properly prepare these texts for fine-tune.

stable isle Feb 11, 2025, 4:59 AM

#

dapper dune I'm more interested in how to properly prepare these texts for fine-tune.

oh so you haven't gotten to the point of fine-tuning yet....

dapper dune Feb 11, 2025, 5:00 AM

#

stable isle oh so you haven't gotten to the point of fine-tuning yet....

quite possibly. As I said, I only have a rough understanding of AI

barren veldt Feb 11, 2025, 5:02 AM

#

serene scaffold you also seem to have forgotten a very important step

What very important step?

#

I'm abit new to neural networks

snow moat Feb 11, 2025, 7:34 AM

#

barren veldt I'm abit new to neural networks

Watch Andrew Ng
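
For the softmax question above, a minimal sketch, assuming the step being hinted at is the loss function (the thread never says which step was meant). Backpropagation starts from a loss computed on the final output, and for softmax followed by cross-entropy the gradient with respect to the raw outputs works out to `softmax(z) - y`, where `y` is the one-hot target. The logits and target below are made-up values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw outputs."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Loss for a single example whose true class is target_index."""
    return -math.log(probs[target_index])


def grad_wrt_logits(logits, target_index):
    """Gradient of cross_entropy(softmax(logits)) w.r.t. the raw logits."""
    probs = softmax(logits)
    return [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]


logits = [2.0, 1.0, 0.1]  # hypothetical raw network outputs
target = 0                # hypothetical true class
probs = softmax(logits)
loss = cross_entropy(probs, target)
grad = grad_wrt_logits(logits, target)
print(probs, loss, grad)
```

So the loss is computed on the post-softmax probabilities, but the gradient that flows back into the last layer is expressed in terms of the raw outputs.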

gleaming plinth Feb 11, 2025, 8:31 AM

#

good morning, i have a quick question. before i study videos regarding data science with python, should i first familiarize myself with ML or can i learn data science first and then ML? thank you for your advice

#

A.I suggested that I start with Data Science first for the following reason. If you were to start directly with ML, you would constantly run into problems because, for example, you don't know how to prepare or analyze data.

dense needle Feb 11, 2025, 11:40 AM

#

You’ll want a solid math/stats foundation to do ML yes

#

How much experience do you have with math, stats, and programming

gleaming plinth Feb 11, 2025, 11:51 AM

#

dense needle How much experience do you have with math, stats, and programming

im new to python and programming. actually i study python basics etc. After that i want to going forward with Data Science

dense needle Feb 11, 2025, 11:55 AM

#

How about math/stats

hybrid zodiac Feb 11, 2025, 11:55 AM

#

Is there anyone experimenting with a CLIP model ?

dense needle Feb 11, 2025, 11:55 AM

#

gleaming plinth im new to python and programming. actually i study python basics etc. After that...

Maybe check out the pins for some resources

gleaming plinth Feb 11, 2025, 11:55 AM

#

dense needle How about math/stats

low experience close 0

toxic mortar Feb 11, 2025, 11:56 AM

#

Are there any good papers that cover embedding code repository into vector database?

gleaming plinth Feb 11, 2025, 11:56 AM

#

dense needle Maybe check out the pins for some resources

thx for helping dude

dense needle Feb 11, 2025, 11:58 AM

#

gleaming plinth low experience close 0

Are you in school

gleaming plinth Feb 11, 2025, 12:00 PM

#

dense needle Are you in school

nope im adult. i have just basic knowledge in math/stats....like median,algebra, probability etc

#

if you are talking about advance math/stats skills i would say im a noob tbh

dense needle Feb 11, 2025, 12:02 PM

#

Def get on the math as well on top of the programming. Maybe prioritize it even more

gleaming plinth Feb 11, 2025, 12:03 PM

#

dense needle Def get on the math as well on top of the programming. Maybe prioritize it even ...

ok thx mate

untold fable Feb 11, 2025, 1:37 PM

#

https://youtube.com/shorts/w_Gqn61h0QI?si=iTOEEHLOAxwtppa1 how to achive this thing

YouTube

Michel Gomez

Using PYTHON and SPOTIFY to auto-select a song based on EMOTION

Using python and AI to auto select a song based on emotion #shorts #python #computerscience #artificialintelligence

hybrid zodiac Feb 11, 2025, 1:43 PM

#

untold fable https://youtube.com/shorts/w_Gqn61h0QI?si=iTOEEHLOAxwtppa1 how to achive this ...

just find the code

untold fable Feb 11, 2025, 1:44 PM

#

didn't get

sterile heath Feb 11, 2025, 1:50 PM

#

https://www.quantamagazine.org/undergraduate-upends-a-40-year-old-data-science-conjecture-20250210/ Hm.

Quanta Magazine

Steve Nadis

Undergraduate Upends a 40-Year-Old Data Science Conjecture | Quanta...

A young computer scientist and two colleagues show that searches within data structures called hash tables can be much faster than previously deemed possible.

pastel vessel Feb 11, 2025, 6:03 PM

#

dense needle Def get on the math as well on top of the programming. Maybe prioritize it even ...

I have basic knowledge of stats and prob and intermediate knowledge of python, can I build a career in data science or even data analysis.... I have 6-9 months of time. And if yes please tell me what can I do for it?🙏

dusty sentinel Feb 11, 2025, 7:13 PM

#

Hi guys, has anyone ever had issues with the DBSCAN algorithm? I'm using it in a research project with simple code on images, but it's crashing my machine. I've been coding for four years, and this is the first time I've encountered a real bottleneck in it.

#

I am a statistician, so I tested it in R too. While searching for 'why does this work here but not in Python?', I discovered that the implementation in R is more efficient (AKA C++ imp), running smoothly. However, for real-world applications, Python would be a better choice. So if anyone has experienced these issues, a faster solution would be great! 🙂

serene scaffold Feb 11, 2025, 7:17 PM

#

dusty sentinel Hi guys, has anyone ever had issues with the DBSCAN algorithm? I'm using it in a...

are you using this? https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html

scikit-learn

DBSCAN

Gallery examples: Comparing different clustering algorithms on toy datasets Demo of DBSCAN clustering algorithm Demo of HDBSCAN clustering algorithm

dusty sentinel Feb 11, 2025, 7:20 PM

#

Yes, I also tried HDBSCAN, but with the exponential increase in parameters for images, the same error appears. Tonight, I will run it on Colab to test the actual limits of the bottleneck.

serene scaffold Feb 11, 2025, 7:21 PM

#

This implementation has a worst case memory complexity of O(n^2), which can occur when the eps param is large and min_samples is low, while the original DBSCAN only uses linear memory. For further details, see the Notes below.
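
A back-of-the-envelope sketch of what that O(n^2) worst case means for images. It assumes every pixel is one sample and every stored neighbor index takes 8 bytes; both are assumptions for illustration, not measurements of scikit-learn, and the image sizes are hypothetical.

```python
def worst_case_neighbor_bytes(n_samples, bytes_per_index=8):
    """Upper bound if every sample lists every sample as a neighbor (n^2 indices)."""
    return n_samples * n_samples * bytes_per_index


def human(n_bytes):
    """Format a byte count with a binary unit suffix."""
    for unit in ("B", "KB", "MB", "GB", "TB", "PB"):
        if n_bytes < 1024 or unit == "PB":
            return f"{n_bytes:.1f} {unit}"
        n_bytes /= 1024


for side in (100, 500, 1000):  # hypothetical square image sizes, in pixels
    n = side * side
    print(f"{side}x{side} image, {n} samples: {human(worst_case_neighbor_bytes(n))}")
```

Under these assumptions a 100x100 tile already sits in the hundreds of megabytes in the worst case and a 1000x1000 tile lands in the terabytes, which is why a large eps on per-pixel data can take a machine down.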

dusty sentinel Feb 11, 2025, 7:21 PM

#

like 200 min samples or 2.5 eps (i know it's too much but i dont work with the alg much), it crashes.

dusty sentinel Feb 11, 2025, 7:22 PM

#

serene scaffold > This implementation has a worst case memory complexity of O(n^2), which can oc...

Yes, it's heavy (like other unsupervised algorithms) 😭

past meteor Feb 11, 2025, 7:24 PM

#

dusty sentinel Yes, it's heavy (like other unsupervised algorithms) 😭

You can try CURE, it’s a single pass cluster algorithm

#

That being said, I rarely cluster and when I do, I just use k-means

#

I assume your issue is that you don’t want to specify the number of clusters ahead of time

dusty sentinel Feb 11, 2025, 7:27 PM

#

past meteor I assume your issue is that you don’t want to specify the number of clusters ahe...

Yes, that's it. I want to specify whether the algorithm fails to recognize the clusters or merges them incorrectly. I'm conducting research on solar panel recognition, and using DBSCAN is mandatory for my thesis (TCC).

past meteor Feb 11, 2025, 7:28 PM

#

How many samples do you have

dusty sentinel Feb 11, 2025, 7:29 PM

#

A lot, like 250+ images, and possibly more. I just split some satellite images provided by my professor from my country for the beginning of the research. But I’m processing them one by one. The full image is around 500MB, but each split is about 2.5MB

#

250 images cover my entire state

#

In the test case I'm working on right now, only with partitioned images, with a specific area.

past meteor Feb 11, 2025, 7:50 PM

#

dusty sentinel A lot, like 250+ images, and possibly more. I just split some satellite images p...

I assume these are very high dimensional images

#

Even if your program didn’t crash I’d worry about the quality of clustering in this high dimensional space

odd meteor Feb 11, 2025, 10:44 PM

#

dusty sentinel A lot, like 250+ images, and possibly more. I just split some satellite images p...

One common issue with DBSCAN is its runtime, which grows quadratically with the dataset's size in the worst case (and working with satellite images means even more runtime).

If you want to optimize for speed without sacrificing utility, you might want to explore DBSCAN++. It's a faster and more scalable alternative to DBSCAN; the figure I've seen quoted is around 20x faster.
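
To put a rough number on the O(n^2) worst case quoted earlier, here is a back-of-the-envelope sketch. It assumes the worst case behaves like storing one 8-byte value per pair of points, which is an illustration only; what scikit-learn actually allocates depends on eps and on the data. `pairwise_bytes` is a made-up helper name.

```python
def pairwise_bytes(n_samples: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold one value for every ordered pair of samples."""
    return n_samples * n_samples * bytes_per_value

# Memory grows with the square of the sample count: 10x the points, 100x the bytes
for n in (1_000, 10_000, 100_000):
    gb = pairwise_bytes(n) / 1e9
    print(f"{n} samples: about {gb:.3f} GB")
```

If every pixel of an image tile becomes a sample, n gets large very quickly, which would fit the crashes described above.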

fallow coyote Feb 12, 2025, 12:56 AM

#

what are some good linear and logistic regression projects I could do that'd improve my programming skills and work as a portfolio project? nothing too complicated but challenging enough. I want to get into ML, but it's too fucking complicated so I'll stick with the simpler aspects

crude karma Feb 12, 2025, 3:52 AM

#

guys does anyone have Zillow dataset

fervent canopy Feb 12, 2025, 5:01 AM

#

crude karma guys does anyone have Zillow dataset

https://www.zillow.com/research/data/

Zillow

Housing Data

Note: We make occasional changes to CSV download paths and data is updated on the 12th of each month.

glacial root Feb 12, 2025, 5:47 AM

#

do you guys think i could learn pytorch while learning calc 3 or is it better to wait until after i finish calc 3 and linear algebra

small wedge Feb 12, 2025, 6:30 AM

#

glacial root do you guys think i could learn pytorch while learning calc 3 or is it better to...

you can learn pytorch with basically 0 math knowledge, learn it whenever you like.

rain path Feb 12, 2025, 10:18 AM

#

I want to process a dataset but it's too large for my disk space, so I'm using streaming mode to iterate over it. Is there a way to free up memory after each batch of iteration since the data in memory builds up: (and this is a really large dataset)

import itertools
from datasets import load_dataset

dataset = load_dataset("calabi-yau-data/ws-5d", name="reflexive", split="full", streaming=True)

# Convert dataset to an iterator
dataset_iter = iter(dataset)

# Iterate through first 1000 rows, in chunks of 100
batch_size = 100
total_rows = 1000

for i in range(0, total_rows, batch_size):
    batch = list(itertools.islice(dataset_iter, batch_size))
    if not batch:  # Stop if there are no more rows
        break
    print(f"Batch {i // batch_size + 1}:")
    print(batch)  # Process the batch as needed

It's a shame datasets aren't indexable, otherwise I would've run the code to return the specified range of rows.
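
For reference, a minimal stdlib-only sketch of the same batching pattern. `fake_stream` and `iter_batches` are hypothetical names, and the generator only stands in for the streaming dataset. The point it tries to show: each batch list is dropped as soon as the loop variable is rebound, so memory should only build up if something else (a growing list, a notebook output cell) keeps references to old batches.

```python
import itertools

def fake_stream(n_rows):
    """Stand-in for a streaming dataset: yields one row dict at a time."""
    for i in range(n_rows):
        yield {"row": i}

def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size rows until the iterator is exhausted."""
    it = iter(rows)
    while True:
        batch = list(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch

total = 0
first_1000 = itertools.islice(fake_stream(10_000), 1000)
for batch in iter_batches(first_1000, 100):
    # Process the batch and keep only the small result, not the rows themselves
    total += len(batch)
print(total)
```

I believe streaming datasets in the `datasets` library also expose `.skip(n)` and `.take(n)`, which may cover the "range of rows" case, but check the docs for your version.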

marble shoal Feb 12, 2025, 12:22 PM

#

How do I install torch-directml? I used pip but it can't find the package, even though I already installed the packages from PyPI

dark tangle Feb 12, 2025, 12:39 PM

#

Guys, do you know how I can convert this data into a table? I think it should be printed as a table, right?

agile cobalt Feb 12, 2025, 12:42 PM

#

dark tangle Guys do you know how can i convert this data into a table? I think it should be ...

you must specify the delimiter/separator as `;` when reading
it assumes `,` by default
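
A small sketch of the difference, using the standard library `csv` module on made-up sample data (the names and values are invented for illustration). If the file is being read with pandas, the equivalent would be `pd.read_csv(path, sep=";")`.

```python
import csv
import io

# Made-up semicolon-separated data standing in for the real file
raw = "name;age;city\nAna;31;Lisbon\nBen;27;Porto\n"

# Default delimiter is ",", so each line comes back as one single field
wrong = list(csv.reader(io.StringIO(raw)))

# Telling the reader to split on ";" gives proper columns
right = list(csv.reader(io.StringIO(raw), delimiter=";"))

print(wrong[0])
print(right[0])
```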

dark tangle Feb 12, 2025, 12:43 PM

#

agile cobalt you must specify the delimiter/separator as `;` when reading it assumes `,` by d...

Okayy, now it works, thanks bro

brave adder Feb 12, 2025, 2:50 PM

#

hey guys, i have heard about andrew ng's ml course. I wanna know whether it is good? and also if it is free (lol)?
for context im kinda new to ml and would like to go in depth into how different models work

odd meteor Feb 12, 2025, 3:40 PM

#

brave adder hey guys, i have heard about andrew ng's ml course. I wanna know whether it is g...

Yes it's good (not a guarantee you'll love it though)
Yes it's free

glacial root Feb 12, 2025, 3:54 PM

#

small wedge you can learn pytorch with basically 0 math knowledge, learn it whenever you lik...

but math knowledge is necessary to actually do stuff with it right

#

not bs stuff, actual good projects

#

i'm learning multivariable calculus and linear algebra anyways, just wanted to know if i could go on ahead and learn ml frameworks/libraries as well

brave adder Feb 12, 2025, 5:21 PM

#

odd meteor Yes it's good (not a guarantee you'll love it though) Yes it's free

ohh
can you please share a link, i was only able to find one on coursera and i think it was paid

agile cobalt Feb 12, 2025, 5:36 PM

#

brave adder ohh can you please share a link, i was only able to find one on coursera and i t...

you can audit it for free on Coursera (select the individual sub-courses, not the main one)
the full course including exercises is not free though

and yes it is good

#

just keep in mind it's about the foundations you need to understand how models work, it will not go into specifics about different models, it'll just give you the knowledge to understand what is going on for the general case

glacial root Feb 12, 2025, 5:40 PM

#

so it's all theory and not much application?

agile cobalt Feb 12, 2025, 5:44 PM

#

glacial root so it's all theory and not much application?

the exercises include training simple models, no transformers or other huge architectures iirc

if by "application" you were thinking about training your own ChatGPT, you ain't finding that there

-# (if you were actually thinking about that, take a look at Andrej Karpathy resources like https://github.com/karpathy/nanoGPT and his youtube series though)

keen perch Feb 12, 2025, 8:19 PM

#

Anyone knows how to download tensorflow for GPU and what other things I should download?

small wedge Feb 12, 2025, 8:22 PM

#

glacial root but math knowledge is necessary to actually do stuff with it right

No, one of the main points of libraries like pytorch and tensorflow is that they enable you to use machine learning without needing intense math knowledge. Certainly a foundation of knowledge will help you understand what's going on under the hood, and it's required to actually perform novel research, but as far as the math needed to build and train a model with one of these frameworks goes, it's about as minimal as you can get.

tidal bough Feb 12, 2025, 8:23 PM

#

keen perch Anyone knows how to download tensorflow for GPU and what other things I should d...

https://www.tensorflow.org/install/pip, note the system-specific info

brave adder Feb 13, 2025, 8:35 AM

#

agile cobalt the exercises include training simple models, no transformers or other huge arch...

i have watched his videos and i found them super helpful

#

and i want something similar for other types of models
like svm, random forest etc

lofty thorn Feb 13, 2025, 11:01 AM

#

The data science role means different things at different companies. Some say go for data analyst and some say something else. I also don't know any data scientists.
Can anyone help me decide which job role I should go for?

timid veldt Feb 13, 2025, 3:52 PM

#

what's the best approach I can take to learn AI and machine learning faster with python

serene scaffold Feb 13, 2025, 4:28 PM

#

timid veldt what's the best approach i can do to make myself learn ai faster using machine l...

what's the rush?

timid veldt Feb 13, 2025, 4:28 PM

#

serene scaffold what's the rush?

I have to get a job

serene scaffold Feb 13, 2025, 4:28 PM

#

timid veldt I have to get a job

are you pursuing an AI or ML related degree?

timid veldt Feb 13, 2025, 4:30 PM

#

serene scaffold are you pursuing an AI or ML related degree?

Nope. An IT degree. I'm just curious about what path I should take to learn AIML fast but not too fast that I wouldn't know the concepts and my time would be wasted.

serene scaffold Feb 13, 2025, 4:30 PM

#

timid veldt Nope. An IT degree. I'm just curious about what path I should take to learn AIML...

are you in the US?

timid veldt Feb 13, 2025, 4:30 PM

#

serene scaffold are you in the US?

No I'm from India

serene scaffold Feb 13, 2025, 4:31 PM

#

timid veldt No I'm from India

I don't know how it works in India, but you will probably need to go back to uni to get an AIML job

timid veldt Feb 13, 2025, 4:31 PM

#

serene scaffold I don't know how it works in India, but you will probably need to go back to uni...

I still am in uni

#

I'm a 3rd year student

serene scaffold Feb 13, 2025, 4:31 PM

#

can you switch to an AIML related major?

timid veldt Feb 13, 2025, 4:31 PM

#

serene scaffold can you switch to an AIML related major?

Nope not now

serene scaffold Feb 13, 2025, 4:32 PM

#

timid veldt Nope not now

can you stay in uni and get a masters?

timid veldt Feb 13, 2025, 4:33 PM

#

serene scaffold can you stay in uni and get a masters?

Bachelors is fine for me. I'm thinking of getting a job after I finish doing my bachelors and gain some work experience. Then I can think of getting higher education.

serene scaffold Feb 13, 2025, 4:33 PM

#

timid veldt Bachelors is fine for me. I'm thinking of getting a job after I finish doing my ...

you almost certainly won't be able to get an AI/ML job as your first job

timid veldt Feb 13, 2025, 4:33 PM

#

serene scaffold you almost certainly won't be able to get an AI/ML job as your first job

Yes I won't. But I would like to learn more about it.

serene scaffold Feb 13, 2025, 4:34 PM

#

!resources data science

arctic wedgeBOT Feb 13, 2025, 4:34 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

timid veldt Feb 13, 2025, 4:35 PM

#

Thank you.

sullen herald Feb 13, 2025, 5:04 PM

#

Any reviews on Kerashub?

sullen herald Feb 13, 2025, 5:07 PM

#

lofty thorn Data science role has a different meaning for different companies. some says go ...

It totally depends on what you’re more inclined towards and what you expect from the role. If you enjoy working with data viz, reporting, and business insights, a Data Analyst role might be a good fit. If you're interested in building predictive models or genai and working with machine learning, then a Data Scientist/AI engineer/researcher role would be more suitable.

empty furnace Feb 13, 2025, 5:08 PM

#

Can someone familiar with rerankers explain the differences between AnswerDotAI's rerankers and FlagRerankers?

sullen herald Feb 13, 2025, 5:19 PM

#

empty furnace Can someone familiar with rerankers explain the differences between AnswerDotAI'...

I have only used answer ai's rerankers library for most tasks, but as per my understanding these Flagrerankers are basically using different embedding method called Flag embedding, which claims to produce better embedding features.

If I could recall correctly, rerankers library already consists of BAAI BGE models so you dont really need to use any other library for flagrerankers.

round rapids Feb 13, 2025, 6:43 PM

#

how

#

do I import pandas

serene scaffold Feb 13, 2025, 6:48 PM

#

round rapids do I import pandas

import pandas as pd

#

@round rapids you're voice banned for voice gate spam btw

round rapids Feb 13, 2025, 6:50 PM

#

whats voice gate spam?

serene scaffold Feb 13, 2025, 6:50 PM

#

see #voice-verification

round rapids Feb 13, 2025, 6:50 PM

#

oh

round rapids Feb 13, 2025, 6:50 PM

#

serene scaffold `import pandas as pd`

sad

serene scaffold Feb 13, 2025, 6:51 PM

#

round rapids :sad:

did you pip install pandas?

round rapids Feb 13, 2025, 6:51 PM

#

in the cmd?

serene scaffold Feb 13, 2025, 6:51 PM

#

round rapids in the cmd?

yes

round rapids Feb 13, 2025, 6:51 PM

#

#

I tried

round rapids Feb 13, 2025, 6:51 PM

#

serene scaffold @round rapids you're voice banned for voice gate spam btw

ah anyways, can you unban me?

serene scaffold Feb 13, 2025, 6:51 PM

#

round rapids :sad:

all the text in the terminal here. please copy and paste it into the chat as text.

round rapids Feb 13, 2025, 6:55 PM

#

serene scaffold all the text in the terminal here. please copy and paste it into the chat as tex...

what?

serene scaffold Feb 13, 2025, 6:55 PM

#

round rapids what?

there's text in the terminal in the screenshot that you posted. I need to copy and paste it, so please put it in this chat as text.

round rapids Feb 13, 2025, 6:56 PM

#

C:\Users\bilal>pip install pandas
'pip' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\bilal>

serene scaffold Feb 13, 2025, 6:56 PM

#

round rapids :sad:

this screenshot.

round rapids Feb 13, 2025, 6:58 PM

#

serene scaffold this screenshot.

yes, I copied it and sent it.

#

or am I missing something

serene scaffold Feb 13, 2025, 6:58 PM

#

round rapids yes, I copied it and sent it.

you did not.

round rapids Feb 13, 2025, 6:59 PM

#

round rapids C:\Users\bilal>pip install pandas 'pip' is not recognized as an internal or exte...

that's literally what it says in the screenshot

#

or do you want another text, I don't get it

serene scaffold Feb 13, 2025, 6:59 PM

#

round rapids :sad:

this one

#

round rapids Feb 13, 2025, 6:59 PM

#

PS C:\Users\bilal\Desktop\coding> & C:/Users/bilal/AppData/Local/Programs/Python/Python313/python.exe "c:/Users/bilal/Desktop/coding/python with pandas/python.py"
Traceback (most recent call last):
  File "c:\Users\bilal\Desktop\coding\python with pandas\python.py", line 1, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
PS C:\Users\bilal\Desktop\coding>

serene scaffold Feb 13, 2025, 6:59 PM

#

@round rapids do this

C:/Users/bilal/AppData/Local/Programs/Python/Python313/python.exe -m pip install pandas
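
The reason this form works when plain `pip` is not recognized: it asks one specific interpreter to run its own pip, so nothing needs to be on PATH and the package lands where that interpreter looks for it. A small sketch for finding the right path from inside Python (pandas is just the example package here):

```python
import sys

# Path of the interpreter that is running this script
print(sys.executable)

# The matching install command for exactly that interpreter
cmd = [sys.executable, "-m", "pip", "install", "pandas"]
print(" ".join(cmd))
```

Running the printed command in the terminal installs into the same Python that runs the script.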

round rapids Feb 13, 2025, 7:00 PM

#

thanks

round rapids Feb 13, 2025, 7:00 PM

#

serene scaffold @round rapids do this ``` C:/Users/bilal/AppData/Local/Programs/Python/P...

and, could you please unban me from voice chat?

serene scaffold Feb 13, 2025, 7:01 PM

#

round rapids and, could you please unban me from voice chat?

no

round rapids Feb 13, 2025, 7:01 PM

#

aight

main fox Feb 13, 2025, 7:04 PM

#

@serene scaffold you're a saint of patience, helpfulness,and restraint

serene scaffold Feb 13, 2025, 7:05 PM

#

main fox @serene scaffold you're a saint of patience, helpfulness,and restraint

actually I'm a bastard

#

@round rapids did that command work?

#

also what are you trying to do with pandas?

round rapids Feb 13, 2025, 7:28 PM

#

serene scaffold @round rapids did that command work?

yup

round rapids Feb 13, 2025, 7:29 PM

#

serene scaffold also what are you trying to do with pandas?

I'm trying to build along some projects on yt with pandas

serene scaffold Feb 13, 2025, 7:29 PM

#

round rapids I'm trying to build along some projects on yt with pandas

I recommend doing the kaggle pandas tutorial

round rapids Feb 13, 2025, 7:30 PM

#

serene scaffold I recommend doing the kaggle pandas tutorial

https://www.kaggle.com/learn this one?

Learn Python, Data Viz, Pandas & More | Tutorials | Kaggle

Practical data skills you can apply immediately: that's what you'll learn in these no-cost courses. They're the fastest (and most fun) way to become a data scientist or improve your current skills.

serene scaffold Feb 13, 2025, 7:30 PM

#

round rapids https://www.kaggle.com/learn this one?

https://www.kaggle.com/learn/pandas

Learn Pandas Tutorials

Solve short hands-on challenges to perfect your data manipulation skills.

empty furnace Feb 13, 2025, 11:46 PM

#

sullen herald I have only used answer ai's rerankers library for most tasks, but as per my und...

Cool, thanks. I really like AnswerDotAI's approach on the output of the reranker's score, so sticking to that is great.

opal frigate Feb 14, 2025, 11:51 AM

#

Hello

#

Can someone help me with my data science homework?

#

We're tasked with making the same output as this picture, but I can't get the position of each individual inset just right

lapis sequoia Feb 14, 2025, 1:23 PM

#

opal frigate We're task with making the same output as this picture, but i cant get the posit...

That graph alone looks very hard (though I am not a maths student, I admire those math geeks)

opal frigate Feb 14, 2025, 1:38 PM

#

lapis sequoia That graph alone looks vers hard(though i am not a maths student, i admire those...

For me, I'm just having trouble figuring out how to get each of them to fit in its own place

lapis sequoia Feb 14, 2025, 1:40 PM

#

Keep trying, that's what I say to anyone around me who has a problem, and right now that's you, a server-mate (idk if that term is correct or not)

silver hill Feb 14, 2025, 1:41 PM

#

I personally have been redoing a f tokenizer over and over again for almost 3 months now... hope this is the last version I will ever make because it's a true headache... it's still interesting though, but well, also tiring

lapis sequoia Feb 14, 2025, 1:41 PM

#

opal frigate For me im just having trouble with finding how all of it fit together on they're...

You mean you are trying to put all those graphs in one singularity.

opal frigate Feb 14, 2025, 1:45 PM

#

lapis sequoia You mean you are trying to put all those graphs in one singularity.

Im trying to MAKE those graphs

opal frigate Feb 14, 2025, 1:45 PM

#

opal frigate Im trying to MAKE those graphs

And that's what im having trouble with

lapis sequoia Feb 14, 2025, 1:46 PM

#

Oh dear devil..

#

I remembered one incident where we had to take anhydrous for experiment and it was just 1± mg differencing the hell out of me, either 0.5 or 1.5 mg went more..

silver hill Feb 14, 2025, 5:08 PM

#

Could anyone explain to me the mathematical theory behind a tokenizer?

#

I've got a system that counts chars and a (so far arbitrary) threshold that takes out too-frequent characters. Then I've got a compiler that looks up all possible combinations of a character with its surrounding characters, plus the frequency of those combinations. Then I tried normalizing each frequency by dividing it by the overall frequency of all elements with that length

#

Maybe I'm doing it wrong from the start. The thing I'm aiming for is a dynamic threshold and tokenizer (in the long run)

mighty radish Feb 14, 2025, 5:17 PM

#

Hi everyone,

I’m working on developing a web scraping API using Django to collect financial data and save it to a database that can be automatically downloaded. I’m looking for any guidance or step-by-step tutorials on:
• Setting up Django for web scraping
• Creating API endpoints to expose the scraped data
• Automating database downloads

If anyone has experience with this or knows of a comprehensive tutorial, I’d greatly appreciate your help!

Thanks in advance!

past meteor Feb 14, 2025, 5:22 PM

#

mighty radish Hi everyone, I’m working on developing a web scraping API using Django to colle...

Seems like a good question for #web-development 🙂

open slate Feb 14, 2025, 5:42 PM

#

past meteor Seems like a good question for #web-development 🙂

absolutely

#

what asked ?

unkempt apex Feb 14, 2025, 5:53 PM

#

ohh sorry wrong pin

#

my bad

tawdry plover Feb 14, 2025, 6:45 PM

#

kaprekar numbers

glacial root Feb 14, 2025, 8:10 PM

#

opal frigate Im trying to MAKE those graphs

so you just have to mess around with equations until you make that graph?

lapis sequoia Feb 14, 2025, 10:19 PM

#

Question: How do you learn about and get into AI?

silver hill Feb 14, 2025, 10:52 PM

#

lapis sequoia Question: How do you learn about and get into AI?

hi there

#

AI is a really broad term

#

sorry to say this but it encompasses tons of processes and sub processes

#

it goes from tokenization to neural networks, it's a really vast field, what do you need to know about in particular?

hollow pagoda Feb 15, 2025, 2:08 AM

#

i think he wants to know how people learned and got into the subject programming wise

lapis sequoia Feb 15, 2025, 6:06 AM

#

hollow pagoda i think he wants to know how people learned and got into the subject programming...

Yes

glacial root Feb 15, 2025, 7:14 AM

#

hey guys, what are some good beginner projects to do with just numpy

silver hill Feb 15, 2025, 2:16 PM

#

lapis sequoia Yes

do you want to finetune an already existing model or instead want to do it all by yourself

lapis sequoia Feb 15, 2025, 4:33 PM

#

silver hill do you want to finetune an already existing model or instead want to do it all b...

All myself

silver hill Feb 15, 2025, 4:38 PM

#

lapis sequoia All myself

Sure

#

Start by a tokenizer

serene scaffold Feb 15, 2025, 4:40 PM

#

start what with a tokenizer?

silver hill Feb 15, 2025, 4:40 PM

#

An AI

#

He wants to make an AI from scratch

serene scaffold Feb 15, 2025, 4:40 PM

#

if you're interested in learning about AI, tokenizers aren't a good place to start.

silver hill Feb 15, 2025, 4:40 PM

#

Dang... nvm then

serene scaffold Feb 15, 2025, 4:40 PM

#

silver hill He wants to make an AI from scratch

what is "an AI" according to your understanding?

silver hill Feb 15, 2025, 4:41 PM

#

Text analysis and answer processing

serene scaffold Feb 15, 2025, 4:41 PM

#

that's an incredibly narrow subset of AI.

silver hill Feb 15, 2025, 4:41 PM

#

Whats yours then?

serene scaffold Feb 15, 2025, 4:41 PM

#

Programs that emulate the application of knowledge.

silver hill Feb 15, 2025, 4:42 PM

#

Thats also not correct

serene scaffold Feb 15, 2025, 4:42 PM

#

It's correct.

silver hill Feb 15, 2025, 4:42 PM

#

Because it sees the rules of text, not reality

serene scaffold Feb 15, 2025, 4:42 PM

#

Do you not consider self-driving cars to be AI?

silver hill Feb 15, 2025, 4:43 PM

#

It formulates rules that are derived from tons of text not from visual or other perceptions

silver hill Feb 15, 2025, 4:43 PM

#

serene scaffold Do you not consider self-driving cars to be AI?

Got a point...

serene scaffold Feb 15, 2025, 4:43 PM

#

> formulates rules that are derived from tons of data
this is approaching a correct definition of machine learning

silver hill Feb 15, 2025, 4:44 PM

#

Where does machine learning start?

#

Whats the first phase

serene scaffold Feb 15, 2025, 4:45 PM

#

machine learning is where you have a computation graph whose state is determined by data

silver hill Feb 15, 2025, 4:46 PM

#

First you look up too frequent characters right?

#

As they have a meaning by themselves

serene scaffold Feb 15, 2025, 4:46 PM

#

you're still only thinking about NLP

silver hill Feb 15, 2025, 4:46 PM

#

Im too new to this to understand it as a whole

serene scaffold Feb 15, 2025, 4:46 PM

#

it's fine. NLP is the best one 😄

silver hill Feb 15, 2025, 4:46 PM

#

serene scaffold machine learning is where you have a computation graph whose state is determined...

What do you mean by that?

serene scaffold Feb 15, 2025, 4:48 PM

#

silver hill What do you mean by that?

the steps of the algorithm (the computation graph) are decided upon by humans, but the algorithm depends on values that are not set manually--they're "learned" from data.
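
A minimal sketch of that distinction, in plain Python with made-up toy data. The function `predict` is the fixed, human-written computation; `w` and `b` are the values that end up being set by the data through gradient descent on mean squared error. The learning rate and step count are arbitrary choices for this toy case.

```python
# Toy data generated from y = 2x + 1 (the relationship the model has to discover)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

def predict(x, w, b):
    # The computation graph: chosen by a human, fixed for the whole run
    return w * x + b

w, b = 0.0, 0.0  # the values that get "learned"
lr = 0.05        # step size, picked by hand for this toy problem
for _ in range(2000):
    # Gradients of the mean squared error with respect to w and b
    grad_w = sum(2 * (predict(x, w, b) - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (predict(x, w, b) - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

# Should land very close to the 2 and 1 used to generate the data
print(round(w, 3), round(b, 3))
```

Nobody typed 2 or 1 into the model; they came out of the data. Bigger models work on the same principle with many more values.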

silver hill Feb 15, 2025, 4:48 PM

#

So far I made code that reads a text and gets rid of too-frequent characters by storing them. Here I already run into a problem, because I either look up the neighbor characters of the target character or build combinations and look up their frequency

#

I also have trouble deciding how to make the threshold that decides whether something is too frequent dynamic

gray slate Feb 15, 2025, 4:49 PM

#

silver hill I also have trouble deciding wether how to make the threshold that decides wethe...

The idea with machine learning is that the machine decides all that

silver hill Feb 15, 2025, 4:49 PM

#

So far I got the overall frequency and normalized group and individual frequencies

gray slate Feb 15, 2025, 4:50 PM

#

https://youtu.be/TkwXa7Cvfr8

YouTube

Emergent Garden

Watching Neural Networks Learn

A video about neural networks, function approximation, machine learning, and mathematical building blocks. Dennis Nedry did nothing wrong. This is a submission for #SoME3

Original vid: https://www.youtube.com/watch?v=0QczhVg5HaI

My Links
Patreon: https://www.patreon.com/emergentgarden
Discord: https://discord.gg/ZsrAAByEnr

Links and Content:
...

▶ Play video

silver hill Feb 15, 2025, 4:50 PM

#

gray slate The idea with machine learning is that the machine decides all that

I get it but there has to be a code that enables it to do so

gray slate Feb 15, 2025, 4:50 PM

#

silver hill I get it but there has to be a code that enables it to do so

You're writing your own tokenizer for a network to learn language?

silver hill Feb 15, 2025, 4:51 PM

#

Ye I'm attempting to make a tokenizer from scratch

serene scaffold Feb 15, 2025, 4:51 PM

#

tokenizers often do actually require that you set the rules manually

silver hill Feb 15, 2025, 4:51 PM

#

I'm already on version 3.2 of my project, I have been doing this for almost 4 months and I'm getting nowhere lol

silver hill Feb 15, 2025, 4:52 PM

#

serene scaffold tokenizers often do actually require that you set the rules manually

What formula should I use to make a dynamic threshold?

serene scaffold Feb 15, 2025, 4:53 PM

#

silver hill What formula should I use to make a dynamic threshold?

for what?

lapis sequoia Feb 15, 2025, 4:53 PM

#

serene scaffold it's fine. NLP is the best one 😄

How do I start off?

#

If I want a job in the industry

serene scaffold Feb 15, 2025, 4:53 PM

#

lapis sequoia How do I start off?

!resources data science

arctic wedgeBOT Feb 15, 2025, 4:53 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

silver hill Feb 15, 2025, 4:53 PM

#

serene scaffold for what?

To choose whether a token is too frequent

serene scaffold Feb 15, 2025, 4:53 PM

#

lapis sequoia If I want a job in the industry

a bachelors degree that's related to AI is a hard requirement for almost all AI jobs, just so you know.

gray slate Feb 15, 2025, 4:53 PM

#

What's the purpose of it?

silver hill Feb 15, 2025, 4:54 PM

#

gray slate What's the purpose of it?

You asking me?

gray slate Feb 15, 2025, 4:54 PM

#

Yeah I mean what's your goal

#

experiment, produce something novel, just to learn?

lapis sequoia Feb 15, 2025, 4:54 PM

#

serene scaffold a bachelors degree that's related to AI is a hard requirement for almost all AI ...

Is there a self learning path I can take?

gray slate Feb 15, 2025, 4:54 PM

#

or a specific use case

serene scaffold Feb 15, 2025, 4:54 PM

#

lapis sequoia Is there a self learning path I can take?

to get a job? no

lapis sequoia Feb 15, 2025, 4:55 PM

#

serene scaffold to get a job? no

What’s the self learning path I can take to different industries within tech?

silver hill Feb 15, 2025, 4:55 PM

#

I want it to identify tokens, and well, get somewhere at the very least. As long as I can get somewhere I will keep working on it

lapis sequoia Feb 15, 2025, 4:55 PM

#

Could I get into, for example, software engineering or cyber security by self learning, or no?

gray slate Feb 15, 2025, 4:55 PM

#

Just make things that are useful and interesting, build a profile on GitHub

lapis sequoia Feb 15, 2025, 4:55 PM

#

Or like what is something that’s really growing that’s not niche that I can self learn and make lots of money off of

serene scaffold Feb 15, 2025, 4:56 PM

#

lapis sequoia I could get into for example software engineering or cyber security by self lear...

try asking in #career-advice. but I can tell you that it's nearly impossible to get into AI without a related degree.

gray slate Feb 15, 2025, 4:57 PM

#

Agentic systems probably.
I don't have a related degree but know devops, done gigs in data science that way. 'cause someone has to actually deploy the stuff if it's for industry

silver hill Feb 15, 2025, 5:11 PM

#

What formula should I use to make a dynamic threshold to choose whether a token is too frequent?

gray slate Feb 15, 2025, 5:12 PM

#

Too frequent for what?

silver hill Feb 15, 2025, 5:15 PM

#

gray slate Too frequent for what?

too frequent to be combined into a bigger token, meaningful by itself

gray slate Feb 15, 2025, 5:17 PM

#

Are you looking to make something novel, something that is better than what we have?

gritty rover Feb 15, 2025, 5:22 PM

#

Hello everyone!
Does anyone have this book in PDF or ePub format?
Thank you in advance, have a good day.

https://www.amazon.com.br/Spark-Definitive-Processing-Simple-English-ebook/dp/B079P71JHY/ref=sr_1_1?crid=TXKT97ZZA2FW&dib=eyJ2IjoiMSJ9.vLTrSZ78ZGRBQm3kyZKlIQ.VRs4BiLpxw9V0SInTKmKAWpyICHxODQNYiSAP48R5i0&dib_tag=se&keywords=spark+o+guia+definitivo&qid=1739639863&sprefix=spark+guia+%2Caps%2C217&sr=8-1

Spark: The Definitive Guide: Big Data Processing Made Simple (Engli...

silver hill Feb 15, 2025, 5:23 PM

#

gray slate Are you looking to make something novel, something that is better than what we h...

i want to understand the current approach to improve it if possible

serene scaffold Feb 15, 2025, 5:23 PM

#

gritty rover Hello everyone! Does anyone have this book in PDF or ePub format? Thank you in a...

we won't help you get a pirated copy of a book that someone wrote. maybe check with your library

silver hill Feb 15, 2025, 5:25 PM

#

silver hill i want to understand the current to improve it if possible

if you help me understand how it works right now I may get a better idea

#

if its too much text dm me

#

dont get auto banned for text wall

gray slate Feb 15, 2025, 5:30 PM

#

Well consider that you can take a word and break it up into ["w", "o", "r", "d", "wo", "or", "rd", "wor", "ord", "word"]. Do that for all the words you see in some text that you care about understanding, count how many times each one shows up. Sort by the score. Choose a maximum list length and cut off at that point.
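
A minimal sketch of that recipe, using only the standard library. `substrings` and `build_vocab` are hypothetical helper names, and the sample text is made up. It takes every contiguous piece of every word, counts them, sorts by count, and cuts the list off at a chosen size.

```python
from collections import Counter

def substrings(word):
    """Every contiguous piece of the word, shortest first."""
    n = len(word)
    return [word[i:i + size] for size in range(1, n + 1) for i in range(n - size + 1)]

def build_vocab(text, max_size):
    """Count all substrings of all words and keep the max_size most frequent."""
    counts = Counter()
    for word in text.split():
        counts.update(substrings(word))
    # most_common sorts by count, highest first
    return [piece for piece, _ in counts.most_common(max_size)]

print(substrings("word"))
print(build_vocab("word words wording sword", 5))
```

The number of substrings grows roughly with the square of the word length, so on real text you would probably cap the piece length or count over a sample.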

serene scaffold Feb 15, 2025, 5:30 PM

#

!paste

arctic wedgeBOT Feb 15, 2025, 5:30 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

lapis sequoia Feb 15, 2025, 5:30 PM

#

serene scaffold try asking in #career-advice. but I can tell you that it's nearly impossi...

Hmm.

gray slate Feb 15, 2025, 5:33 PM

#

gray slate Well consider that you can take a word and break it up into ["w", "o", "r", "d",...

I think that's essentially what OpenAI did, with some tricks to keep the size down. But the score doesn't just have to be based on frequency. Ideally, IMO, you'd also use input from dictionaries to score the list. And you might even benefit from making some character sequences have the same token ID

#

like "ize" and "ise" for example
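
For comparison, a rough sketch of byte-pair encoding, which as far as I know is the family of method the GPT tokenizers use. Note this is a different procedure from counting all substrings: it starts from single characters and repeatedly fuses the most frequent adjacent pair. There is no frequency threshold to tune, you only pick how many merges to do. All function names are hypothetical, and real implementations work on bytes and handle word boundaries more carefully.

```python
from collections import Counter

def most_frequent_pair(words):
    """words: list of token lists. Returns the most common adjacent pair, or None."""
    pairs = Counter()
    for tokens in words:
        pairs.update(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(tokens, pair):
    """Replace every occurrence of the adjacent pair with one fused token."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def learn_merges(text, num_merges):
    """Start from characters and apply num_merges rounds of pair fusion."""
    words = [list(w) for w in text.split()]
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = [merge_pair(tokens, pair) for tokens in words]
    return merges, words

merges, words = learn_merges("low low low lower", 2)
print(merges)
print(words)
```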

silver hill Feb 15, 2025, 5:38 PM

#

how do you differentiate structures of consecutive tokens from subparts that have to be fused into a token

gray slate Feb 15, 2025, 5:38 PM

#

also, you could look at the average position where sentences containing a token end up in an LLM's latent space, take the distances between them, and use that to help order the list - maybe this would improve training times if you're making a model from scratch, because the token ids would actually carry some semantic information

gray slate Feb 15, 2025, 5:38 PM

#

silver hill how do you diffrence structures of consecutive tokens from subparts that have to...

you check length-first when you tokenize

#

or I guess "tokenizing priority" first, whatever that turns out to be. Which is also an interesting problem
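
A sketch of what length-first could look like: at each position, try the longest vocabulary entry that matches and only fall back to shorter ones. The vocabulary and the function name are made up for the example, and the single-character fallback for unknown text is my own assumption.

```python
def tokenize_longest_first(text, vocab):
    """Greedy tokenization: at each position take the longest vocab entry that matches."""
    max_len = max(len(piece) for piece in vocab)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # Nothing in the vocab matches here: fall back to the single character
            tokens.append(text[i])
            i += 1
    return tokens

vocab = {"w", "o", "r", "d", "wor", "word", "s", "ing"}
print(tokenize_longest_first("wordings", vocab))  # ['word', 'ing', 's']
```

Swapping the length ordering for some other "tokenizing priority" would only change the order in which candidate pieces are tried.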

silver hill Feb 15, 2025, 6:11 PM

#

guess that works

#

although I still believe that differentiating structures from tokens is a huge challenge

silver hill Feb 15, 2025, 6:13 PM

#

gray slate you check length-first when you tokenize

as you have to know when to check surrounding tokens to make structures and when to unify tokens based on frequency patterns... I'm not even sure if I should just get the general frequency of consecutive characters to fuse tokens that likely appear together, or go for a more roundabout approach, dividing the text on a percentage basis

#

I find it a fascinating topic... anyway, too many hard questions that have been making my head hurt for quite a while. It's already horribly complicated altogether, at least for me. I have been trying to apply some normalizing and threshold-finding formulas that are already in use, and their results weren't that great... maybe because it lacks input text. Maybe that's why I'm trying to read as many studies as I can from people who actually got somewhere

uneven pawn Feb 15, 2025, 7:19 PM

#

Anybody have any experience in fine tuning LLMs? Wondering how good a model would be if I finetuned it on a codebase to fully understand it/I could ask questions about it

silver hill Feb 15, 2025, 7:21 PM

#

uneven pawn Anybody have any experience in fine tuning LLMs? Wondering how good a model woul...

wanna try to make a new one from scratch instead?

uneven pawn Feb 15, 2025, 7:22 PM

#

Na, it needs that higher level of reasoning and writing that you get with some of the top models

#

I don't really want it to just be repeating existing code

#

But to actually understand it

silver hill Feb 15, 2025, 7:22 PM

#

ye guess you got a point there

silver hill Feb 15, 2025, 7:26 PM

#

uneven pawn Na, it needs that higher level of reasoning and writing that you get with some o...

which model are you going to finetune?

#

Llama?

uneven pawn Feb 15, 2025, 7:26 PM

#

I'm not sure, I'm thinking llama or deepseek

#

But I'm not even sure how feasible finetuning deepseek is

silver hill Feb 15, 2025, 7:26 PM

#

deepseek is too new; if you want more feedback you should get llama as it's a classic

#

there is a ton more information on the net about Llama I mean

uneven pawn Feb 15, 2025, 7:27 PM

#

Right but isn't r1 also opensource?

#

So.. the reasoning capabilities would be a lot higher

#

Might be worth the effort

silver hill Feb 15, 2025, 7:27 PM

#

guess that settles it then

#

btw, have you ever made a tokenizer?

uneven pawn Feb 15, 2025, 7:28 PM

#

Yes in some simple programs

silver hill Feb 15, 2025, 7:28 PM

#

how did you handle too-frequent characters or character combinations? how did you separate combinations of characters (fusing) from structures of those?

uneven pawn Feb 15, 2025, 7:29 PM

#

I know there are entire GitHub projects that make it easy to finetune

#

And that tokenises your input

#

What do you mean by how did I handle too frequent characters? The data set, if its good, won't have too frequent characters

silver hill Feb 15, 2025, 7:30 PM

#

I mean, have you ever built a tokenizer from nothing?

uneven pawn Feb 15, 2025, 7:31 PM

#

Yes, character level based tho, very simple

silver hill Feb 15, 2025, 7:31 PM

#

you know the datasets that LLMA programs read are real life books right? they read immense amounts of data

uneven pawn Feb 15, 2025, 7:31 PM

#

Basically string to int lol

silver hill Feb 15, 2025, 7:31 PM

#

yes you get the frequencies

#

but in tokenization you have to choose

#

whether a token is meaningful by itself or if it has to be combined

#

and that requires a threshold

#

wanted to ask how you did that because I cant figure it out

uneven pawn Feb 15, 2025, 7:31 PM

#

Llama trains on huge amounts of data, yes, but finetuning requires minimal data

silver hill Feb 15, 2025, 7:32 PM

#

silver hill I mean, have you ever built a tokenizer from nothing?

.

#

from nothing

uneven pawn Feb 15, 2025, 7:33 PM

#

You can start with your wanted token count, say 10 million

#

Then for each part/character only take the top N amount that fits into your required count

#

Obviously 10 million is insanely high

silver hill Feb 15, 2025, 7:34 PM

#

how do you choose what to take and what to not

#

there are many combinations that may be useless

uneven pawn Feb 15, 2025, 7:34 PM

#

Frequency analysis

#

If the combination appears many times, that is what your data is showing
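
A rough sketch of that frequency-analysis idea: count every character n-gram up to some length and keep only the most frequent ones that fit the token budget. The function name `top_ngrams` and the `max_n` and `budget` parameters are hypothetical, and this is not how any particular production tokenizer is built.

```python
from collections import Counter


def top_ngrams(corpus, max_n, budget):
    """Return the `budget` most frequent character n-grams (n = 1..max_n)."""
    counts = Counter()
    for line in corpus:
        for n in range(1, max_n + 1):
            for i in range(len(line) - n + 1):
                counts[line[i:i + n]] += 1
    # most_common sorts by count, highest first, and cuts at the budget.
    return [gram for gram, _ in counts.most_common(budget)]


# Tiny made-up corpus, just to show the shape of the output.
corpus = ["abab", "ab"]
print(top_ngrams(corpus, max_n=2, budget=3))
```

Here the budget plays the role of the threshold: there is no separate cutoff formula, the cut simply falls wherever the wanted vocabulary size runs out.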

silver hill Feb 15, 2025, 7:35 PM

#

and on what basis do you calculate the threshold that the analyzer uses

#

is there a formula?

uneven pawn Feb 15, 2025, 7:36 PM

#

The amount of time you're willing to spend on compute and required time to train N amount of tokens

#

Obviously you can do it at character level and it will take 100x longer, or you can use large strings

#

It purely depends on how granular you want it

#

There isn't really a set margin I don't think, maybe somebody else can help ya there

silver hill Feb 15, 2025, 7:37 PM

#

got it

#

thanks for the advice

#

I'm trying to make it as granular as possible haha...

gray slate Feb 15, 2025, 7:38 PM

#

silver hill I find it a fascinating topic...anyway, too many hard questions that have been m...

I think from a practical standpoint, you want your tokenizer to use a simple algorithm so it can be fed data easily. There's probably good reasons why OpenAI made theirs the way they did, rather than work on a word level first then have letters as supplementary chars

silver hill Feb 15, 2025, 7:40 PM

#

I'm trying to get subword-level tokenization so that afterwards I can get sentence boundaries in place, e.g. "Dr." isn't a sentence end most of the time
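
To make the "Dr." case concrete, here is a small sketch of sentence splitting with an abbreviation list. The `ABBREVIATIONS` set and the `split_sentences` name are assumptions for illustration; a hand-written list like this will miss plenty of real cases.

```python
# Illustrative, far from exhaustive.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "e.g.", "etc."}


def split_sentences(text):
    """Split on ., ! or ? unless the word is a known abbreviation."""
    sentences, current = [], []
    for word in text.split():
        current.append(word)
        ends_sentence = word[-1] in ".!?" and word.lower() not in ABBREVIATIONS
        if ends_sentence:
            sentences.append(" ".join(current))
            current = []
    if current:
        # Keep any trailing words that had no closing punctuation.
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("Dr. Smith arrived. He was late!"))
```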

silver hill Feb 15, 2025, 7:43 PM

#

gray slate I think from a practical standpoint, you want your tokenizer to use a simple alg...

btw, sorry for the ping, what does numpy do?

gray slate Feb 15, 2025, 7:44 PM

#

numpy is python for working with numbers, it's data scientists making Python ugly but it's fast

silver hill Feb 15, 2025, 7:47 PM

#

thats a concise answer

gray slate Feb 15, 2025, 7:49 PM

#

What are you gonna use this tokenizer for btw? Like, break text into parts... and then what?

#

Because the design decisions all have trade-offs, and it's really common for people to aim for something they don't need at the expense of something they need later

harsh lagoon Feb 15, 2025, 11:50 PM

#

hey guys is it possible to get the coordinates of detected objects in yolov5?

median nexus Feb 16, 2025, 7:53 AM

#

Anyone familiar with Streamlit?

unkempt apex Feb 16, 2025, 8:00 AM

#

median nexus Anyone familiar with Streamlit?

Yeah go ahead ask the real question

median nexus Feb 16, 2025, 8:04 AM

#

Trying to deploy my ML model but I keep on getting the 'ModuleNotFoundError', I have installed and provided all modules in requirements.txt, any idea how to debug it?

abstract basin Feb 16, 2025, 8:12 AM

#

I want help with my Regression Project !

#

Can Anyone help ?

odd meteor Feb 16, 2025, 9:47 AM

#

median nexus Trying to deploy my ML model but I keep on getting the 'ModuleNotFoundError', I ...

Go to dev side (you should find the icon somewhere on the Streamlit app) to see the exact module that's missing.

odd meteor Feb 16, 2025, 9:54 AM

#

abstract basin Can Anyone help ?

Don't ask to ask a question. Ask your question and provide complete details, so that someone can answer it

ashen blaze Feb 16, 2025, 11:42 AM

#

Hey so I want to learn data analytics.
From YouTube, I've noticed that this field is mostly about making attractive dashboards in Power BI or Tableau.

#

So could any expert explain why Python or SQL is used?
Is it necessary to learn these 2?

normal grove Feb 16, 2025, 11:44 AM

#

ashen blaze So any expert would guide as to why is python or sql used? Is it necessary to le...

with SQL you can put it into PowerBI and do things such as join datasets and filter/select out data which would normally take a lot longer and be a lot harder to do without it

#

for example you could join a dataset in a way that part of the data is kept and only certain parts of another set are kept while the rest of the data is deleted. This is helpful in scenarios such as if maybe you are filtering out all data that has a duplicate, etc., or maybe you want to get rid of all blank accounts with nothing in them. im not sure if you would use that in a real scenario since i am just a college student, but this is what was relayed to me basically
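
A small sketch of that join-and-filter idea, run through Python's built-in `sqlite3` module so it can be tried without a database server. The `accounts` and `payments` tables and their contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE payments (account_id INTEGER, amount REAL);
    INSERT INTO accounts VALUES (1, 'Ada'), (2, 'Ben'), (3, '');
    INSERT INTO payments VALUES (1, 20.0), (1, 5.0), (2, 12.5), (3, 7.0);
""")

# Join the two tables and drop the account with a blank name.
rows = conn.execute("""
    SELECT a.name, p.amount
    FROM accounts AS a
    JOIN payments AS p ON p.account_id = a.id
    WHERE a.name <> ''
    ORDER BY a.name, p.amount
""").fetchall()
print(rows)
```

The query only filters what is returned; nothing is deleted from the underlying tables.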

ashen blaze Feb 16, 2025, 11:48 AM

#

I use Tableau... So is it possible to work with it too?

normal grove Feb 16, 2025, 11:48 AM

#

python can also automate some processes and is just a good general language overall because its compatible with a lot of things i believe. tl;dr: they make doing stuff faster

#

not sure on Tableau, since i honestly havent learned it yet 😅

ashen blaze Feb 16, 2025, 11:48 AM

#

I see

#

Are you a data analyst?

normal grove Feb 16, 2025, 11:50 AM

#

nope im a college accounting student with an MIS minor looking to pursue a Data Science master's, i currently just work in tax

ashen blaze Feb 16, 2025, 11:52 AM

#

Hmm, would you like to give any opinion or advice about going into a data-related field?

Like how much SQL should I know or is python really necessary to learn? Since I can start early and learn it later on

normal grove Feb 16, 2025, 11:56 AM

#

so it really depends on what type of "data" you want to do. there are a lot of different things that are labelled under those positions that are all a little different. data analyst is commonly used interchangeably with some terms. for example you might see it mixed in with business and occasionally financial analyst positions, basically if it lists using Python etc. in the job description it is most likely what you are looking for. im not sure too much on the big differences between data scientists and data engineers. it could be a good idea to do research online on youtube most likely of people in those professions documenting what its like and requirements to break into the field. data analysts though, i dont believe you usually need a degree but they might ask for one in a field related to it like statistics, mathematics, etc.

#

if you just want to be an analyst of some sort, so, a lot of those positions are basically working with datasets and communicating the results to teams or managers. so depending on the field what they ask you to know could be different. for example a financial analyst would be asked to make budget forecasts etc., i think these positions usually operate in Excel/PowerBI. like i said probably a good idea to look at some youtubers who are in the field most likely!

#

and yeah python/sql would probably be honestly the most basic things to learn ^^; i believe R and some other things are good later on that might be a bit more advanced too

ashen blaze Feb 16, 2025, 12:02 PM

#

normal grove if you *just* want to be an analyst of some sort, so, a lot of those positions a...

Well the most I have seen is either data cleaning or making fancy data dashboards.( I think this is where they might use all of the required knowledge into this part only)

And I have also visited freelance sites and noticed that making these "dashboards" is the easy part but there are more advanced ones as well and I have no idea 💀

ashen blaze Feb 16, 2025, 12:03 PM

#

normal grove and yeah python/sql would probably be honestly the most basic things to learn ^^...

True both should be learned!

normal grove Feb 16, 2025, 12:05 PM

#

ashen blaze True both should be learned!

If you want to self-learn, I really like the Geeks4Geeks website projects, you could try taking notes over the basics, and then try recreating projects without looking at them. PyCharm Community Edition is great for practicing. I also watch a lot of Python Programmer's youtube channel and he has a lot of project/video basics as well. I honestly haven't been practicing SQL too hard but I probably should because I have an exam this week over some of it 😭 but yeah hope this helps!!

young granite Feb 16, 2025, 12:06 PM

#

maybe more remarks to ur initial question.
Data analytics can differ, as Redd stated already; there are many different job roles where those analytics methods come into play.
Business -> BI-Reports (mainly PowerBI)
R&D -> ML/NN (Python, Azure, SQL)
Production -> SCADA

and so on

#

so id say bare minimum is SQL

ashen blaze Feb 16, 2025, 12:08 PM

#

normal grove If you want to self-learn, I really like the Geeks4Geeks website projects, you c...

Yea good resources actually since you have reminded me of them.
But anyways thanks for the advice!

young granite Feb 16, 2025, 12:08 PM

#

as you get more experienced you will be confronted with cloud, kubernetes, new frameworks etc.

ashen blaze Feb 16, 2025, 12:08 PM

#

young granite so id say bare minimum is SQL

Rest with chatgpt? 💀

young granite Feb 16, 2025, 12:08 PM

#

normal grove If you want to self-learn, I really like the Geeks4Geeks website projects, you c...

disagree on that one i dislike geeks4geeks and wouldnt suggest that site to a newbie

#

!resources

arctic wedgeBOT Feb 16, 2025, 12:08 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

young granite Feb 16, 2025, 12:09 PM

#

ashen blaze Rest with chatgpt? 💀

so you become a 6 figure prompt engineer?

normal grove Feb 16, 2025, 12:09 PM

#

young granite disagree on that one i dislike geeks4geeks and wouldnt suggest that site to a ne...

oh i didnt know my bad! ill check the resources out, i honestly only just started learning python a few weeks ago for one of my courses

#

and sql honestly, but im familiar with powerbi. i mean its not hard to learn powerbi tbh

young granite Feb 16, 2025, 12:10 PM

#

normal grove oh i didnt know my bad! ill check the resources out, i honestly only just starte...

if u are a under-/graduate check if you can sneak o'reilly python books

ashen blaze Feb 16, 2025, 12:10 PM

#

ashen blaze Rest with chatgpt? 💀

Cuz I have no idea about it and SQL sounds too advanced..
Seen too many SQL courses specific to data analytics and it's like 4 hours long on average

young granite Feb 16, 2025, 12:10 PM

#

normal grove and sql honestly, but im familiar with powerbi. i mean its not hard to learn pow...

yeh its the new excel

#

SQL is an easy language, even easier than python

normal grove Feb 16, 2025, 12:11 PM

#

oh i think one of my graduate courses i might take at one of my target schools actually uses one of those books since the course name is similar to the book name

young granite Feb 16, 2025, 12:11 PM

#

u can get SQL basics in like 1 day and have 60% of the whole lang. already learned

ashen blaze Feb 16, 2025, 12:14 PM

#

young granite so you become a 6 figure prompt engineer?

I can't say prompt engineering since I'm not sure what to do next after DA.
(I'm a freshman in computer engineering and I'm looking into something that would align data and hardware)

young granite Feb 16, 2025, 12:15 PM

#

what u mean by align data and hardware?

ashen blaze Feb 16, 2025, 12:15 PM

#

young granite u can get SQL basics in like 1 day and have 60% of the whole lang. already learn...

But there's too many versions of SQL.
Each with different style

normal grove Feb 16, 2025, 12:15 PM

#

yeah we had 1 day in my AIS class where we discussed different SQL commands you can use that i think ill be tested on D: and then make a project later i believe. just 1 day lmao. i wanted to take a full course on it later on anyways though. my professor was previously a software engineer for tax software services in SoCal so thats probably why, he likely thinks we wont need to know too much more than what he showed us

ashen blaze Feb 16, 2025, 12:16 PM

#

But either ways how much SQL should I know?
Like in terms of context

young granite Feb 16, 2025, 12:16 PM

#

the basics would be how SQL works, e.g. what is a table, how to aggregate data, applying filters

#

understanding the concepts of relational db vs non-relational
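
Those basics (a table, a filter, an aggregate) fit in a few lines. This sketch uses Python's built-in `sqlite3` module as a stand-in for a local SQL instance; the `sales` table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A table is just named columns plus rows.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("north", 30.0), ("south", 5.0), ("south", 15.0), ("east", 50.0)],
)

# WHERE filters rows first, then GROUP BY aggregates what is left.
totals = conn.execute("""
    SELECT region, COUNT(*) AS n, SUM(amount) AS total
    FROM sales
    WHERE amount >= 10
    GROUP BY region
    ORDER BY region
""").fetchall()
print(totals)
```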

ashen blaze Feb 16, 2025, 12:20 PM

#

young granite what u mean by align data and hardware?

Hmm to do something special yk.
Like making a server and clouds.
I see Data analytics as a way to understand SQL , python and machine in a beginner way.

I still haven't figured it out but I have major interest in this field

ashen blaze Feb 16, 2025, 12:20 PM

#

young granite the basics would be how does SQl work, e.g. what is a table, how to aggregate da...

So what are some resources that you would suggest to learn SQL?

young granite Feb 16, 2025, 12:20 PM

#

so u wanna be a sys admin?

ashen blaze Feb 16, 2025, 12:20 PM

#

System admin?

young granite Feb 16, 2025, 12:21 PM

#

ashen blaze So what are some resources that you would suggest to learn SQL?

just set up a local SQL instance and play with it honestly

#

o'reilly Learning SQL, 3rd Edition

ashen blaze Feb 16, 2025, 12:21 PM

#

Idk that but like server related or database engineer

lapis sequoia Feb 16, 2025, 4:15 PM

#

a book i accidentally stole from the airport and read the entire thing on flight good read 10/10

rn_image_picker_lib_temp_9c7e5557-79e8-4f3b-a08a-76bf8820ecc7.jpg

ashen blaze Feb 16, 2025, 4:22 PM

#

lapis sequoia a book i accidently stole from the airport and read the entire thing on flight g...

Wdym that you "stole accidentally" 💀🙏

lapis sequoia Feb 16, 2025, 4:23 PM

#

ashen blaze Wdym that you "stole accidentally" 💀🙏

the reader does not steal

#

and the thief does not read

#

but i definitely put it in my bag and forgot to return it so

ashen blaze Feb 16, 2025, 4:29 PM

#

💀💀💀 bruv

#

THATS CLEARLY STEALING

ashen blaze Feb 16, 2025, 4:30 PM

#

lapis sequoia and the thief does not read

If the reader does not steal then you stole it from the guy, thus you're contradicting your own statement

lapis sequoia Feb 16, 2025, 4:38 PM

#

but ay the book taught me sum good shit

#

tested it out at home

ashen blaze Feb 16, 2025, 4:45 PM

#

lapis sequoia but ay the book taught me sum good shit

I mean sure ig 💀

#

What's your house address so that I can steal it?

lapis sequoia Feb 16, 2025, 4:46 PM

#

55 bucks i just searched up the price 😭

lime grove Feb 16, 2025, 5:23 PM

#

clustering benchmarks for those of you that do unsupervised
https://cs.joensuu.fi/sipu/datasets/

#

up to 1000 dimensions for some of the sets.

#

These are nice, clean sets with ground truths that somewhat remove the dataset hassle from algorithm development

granite agate Feb 16, 2025, 6:20 PM

#

What is the latest version of Python and CUDA that tensorflow and pytorch both support? I have an RTX 3060 and I need it to run on the GPU

serene scaffold Feb 16, 2025, 6:23 PM

#

granite agate What is the lastest version of python and CUDA does tensorflow and pytorch both ...

don't worry about tensorflow support. just don't use tensorflow.
you usually want to stay at least one python version behind. looks like pytorch doesn't work on 3.13 yet.

granite agate Feb 16, 2025, 6:26 PM

#

serene scaffold - don't worry about tensorflow support. just don't use tensorflow. - you usually...

Okay so I will work with 3.12 but I do need the tensorflow support tho since I gotta implement existing code which is written in tensorflow. I could switch it to torch later but now I kinda need tensorflow.

agile cobalt Feb 16, 2025, 6:29 PM

#

granite agate Okay so I will work with 3.12 but I do need the tensorflow support tho since I g...

Check the Install instructions on the official website then

If you are using Windows, you will need to use WSL to run Tensorflow on a GPU

granite agate Feb 16, 2025, 6:31 PM

#

It says u need WSL only if u use 2.11 so I can use 2.10 for native windows support. So I just need to get the highest CUDA possible which is compatible with both TensorFlow 2.10 and pytorch

lime grove Feb 16, 2025, 6:36 PM

#

Have you encountered any WSL-specific bugs within CUDA?

#

I recall running into some strange issues a few years back, some of the examples they provided wouldn't compile. But I haven't really looked into things since then

granite agate Feb 16, 2025, 6:38 PM

#

lime grove Have you encountered any WSL-specific bugs within CUDA?

I have never used WSL

#

The only time I used linux was in college when I had to use Ubuntu. I thought I could get away without touching linux but here we are

lime grove Feb 16, 2025, 6:39 PM

#

Linux is superior.

#

this is sort of tangential to the channel topic, but I really dislike Windows and the complete mess it represents. Sorry - won't talk about this again 😄

agile cobalt Feb 16, 2025, 6:42 PM

#

lime grove Have you encountered any WSL-specific bugs within CUDA?

I have been using it for a while, and it works fairly well

I did run into some weirdness and had to purge all nvidia dependencies once, but it has been working well since

granite agate Feb 16, 2025, 6:45 PM

#

lime grove this is sort of tangential to the channel topic, but I really dislike Windows an...

I do agree windows is shit I ain't debating lol. Linux is the best but there is a learning curve that I have yet to take on, and I game and stuff so it is kinda easier to do so on windows despite it being shit.

#

Well I guess it is a good time to start learning then

lime grove Feb 16, 2025, 6:50 PM

#

btw, my gaming laptop has an RTX 3070 laptop GPU. I think it is enough for practice, but not production

agile cobalt Feb 16, 2025, 6:52 PM

#

no mods I think, but I am not even sure about what mods would be in this case? overclocking?

#

the first time I installed it, it worked without any problems

later on after I installed more things was when things got weird, not sure if I broke apt at some point

lime grove Feb 16, 2025, 6:57 PM

#

I am not sure either. I added that question out of my own personal ignorance. Maybe mods are possible, but not sure

opaque condor Feb 17, 2025, 1:14 AM

#

lapis sequoia but ay the book taught me sum good shit

What book is it?

lapis sequoia Feb 17, 2025, 1:14 AM

#

lapis sequoia a book i accidently stole from the airport and read the entire thing on flight g...

@opaque condor

opaque condor Feb 17, 2025, 1:16 AM

#

lapis sequoia @opaque condor

Thank you very much

karmic void Feb 17, 2025, 5:16 AM

#

Hello guys, I am trying to get into data science by making projects. Can anyone tell me what projects I can create, I know the usual matplotlib, pandas, numpy and plotly. I am bored of creating graphs and all, is there any other thing I can do using these?

echo yacht Feb 17, 2025, 5:34 AM

#

hellooo pandas question D:

#

i cant figure out how to merge two datasets on the index if they have different row counts (patients), one of them has significantly more than the other but im trying to keep the rows to match the smaller set to be more conservative

undone ridge Feb 17, 2025, 5:53 AM

#

I am looking for an experienced developer in Python openCV and .NET programming.
If you have the ability, you should work on a project related to image processing.
If you have the ability, please contact me.

jaunty helm Feb 17, 2025, 5:59 AM

#

echo yacht i cant figure out how to merge two datasets on the index if they have different ...

df_small.join(df_big, how="left")
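
A tiny worked example of that join, with made-up patient IDs as the index. The frames and column names (`age`, `dose`) are placeholders, not the asker's data.

```python
import pandas as pd

# Smaller set: two patients. Bigger set: four patients.
df_small = pd.DataFrame({"age": [34, 58]}, index=["p1", "p3"])
df_big = pd.DataFrame({"dose": [1.0, 2.0, 3.0, 4.0]}, index=["p1", "p2", "p3", "p4"])

# how="left" keeps every row of df_small and pulls in matching rows from df_big.
merged = df_small.join(df_big, how="left")
print(merged)
```

Patients that only exist in the bigger frame are dropped. If some patients in the smaller frame might be missing from the bigger one, `how="inner"` would drop those too instead of leaving NaN values.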

serene trellis Feb 17, 2025, 7:56 AM

#

yo anyone familiar with RL? i think i fumbled but im not sure what the reasons could be

#

kinda feels like its trying random stuff instead of exploiting a possible strat

noble arch Feb 17, 2025, 8:13 AM

#

undone ridge I am looking for an experienced developer in Python openCV and .NET programming....

Hi, @undone ridge
I am a full stack developer with rich experience of developing in python, opencv and asp.net.
Looking forward to work on exciting and challenging projects.
By combining exquisite design with the latest tech, my paramount goal is to deliver the best of the best that the world has seen.
Please feel free to contact me.

keen perch Feb 17, 2025, 12:53 PM

#

I need help installing tensorflow GPU for windows. I installed wsl, cuda toolkit and Nvidia drivers; what else should i do? I need a detailed explanation, please help

valid swift Feb 17, 2025, 1:33 PM

#

ok

fleet marlin Feb 17, 2025, 1:57 PM

#

bro can someone help me navigate? im currently learning python as my first language and my goal is to get into AI/ML. what should i do after learning python? can someone explain?

unkempt apex Feb 17, 2025, 5:33 PM

#

serene trellis

you have to specify first what are you trying to do!

serene trellis Feb 17, 2025, 6:57 PM

#

unkempt apex you have to specify first what are you trying to do!

its a recommender system using reinforcement learning with a gridworld environment

#

i used the bibit algo for reducing the state space by making the states biclusters of the user-item matrix instead of items

#

but the performance varies wildly by user

#

and im not exactly sure why that is, since the qlearning and gridworld are the same (as in the same for every user)

#

it should start low and plateau at 60 but it starts going crazy for some users
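
No code was shared here, so this is only a guess at one thing worth checking: whether exploration ever winds down. Below is a sketch of epsilon-greedy action selection with a decaying epsilon. The function names and the `start`, `end`, and `decay` values are hypothetical defaults, not taken from the project.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the best-known one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def decayed_epsilon(step, start=1.0, end=0.05, decay=0.995):
    """Shrink epsilon geometrically per step, but never below `end`."""
    return max(end, start * decay ** step)


q_values = [0.1, 0.9, 0.3]
print(epsilon_greedy(q_values, epsilon=0.0))  # epsilon 0 means pure exploitation
print(decayed_epsilon(0), decayed_epsilon(10_000))
```

If epsilon stays high for the whole run, the agent keeps "trying random stuff" even after the Q-table has something useful in it, which could look like the behaviour described. Other causes, such as per-user differences in the biclusters, are just as plausible.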

granite agate Feb 17, 2025, 7:40 PM

#

Is cudnn 9.7.1 only available for windows 10? is there one for windows 11

agile cobalt Feb 17, 2025, 8:17 PM

#

granite agate Is cudnn 9.7.1 only available for windows 10? is there one for windows 11

try using the windows 10 download, in general windows 11 is fairly compatible with 10
or just use WSL (I'd recommend this tbh)

lucid hornet Feb 17, 2025, 8:19 PM

#

Are pretty much all of the common models out there based on data that only goes up to Oct '23 at the latest? It feels weird to me

#

The only one I found was "Meta Llama 3.3 70B" which goes up to December '23

agile cobalt Feb 17, 2025, 8:23 PM

#

lucid hornet Are pretty much all of the common models out there based on data that only goes ...

a lot of the platforms big companies sourced their data from started closing up after they realized how much money they could get from selling their data, e.g. Reddit and Twitter killing their free APIs, newsletters suing big AI companies, whatever was going on with LAION datasets

lucid hornet Feb 17, 2025, 8:23 PM

#

Ah true, I didn't think about that

agile cobalt Feb 17, 2025, 8:24 PM

#

most models from companies with $$$ to spend include some premium training data, like the partnerships OpenAI has been doing, or otherwise include content scraped from sources they probably had better not admit they are using

lucid hornet Feb 17, 2025, 8:24 PM

#

With the current administration, who knows if it'll matter, though

#

inb4 OpenAI is given access to NSA data

agile cobalt Feb 17, 2025, 8:25 PM

#

well... at least as far as US goes I wouldn't be surprised if they damaged themselves just to try and fail to harm Deepseek

lucid hornet Feb 17, 2025, 8:25 PM

#

That still cracks me up

#

The hypocrisy is so strong

fading wigeon Feb 17, 2025, 8:51 PM

#

Weell, you see, Chat GPT is good. But Deepseek is bad.

#

Because uh....

#

Well, let me ask Chat GPT to answer for you, one sec

craggy agate Feb 18, 2025, 1:00 AM

#

fading wigeon Weell, you see, Chat GPT is good. But Deepseek is bad.

Cause deepseek can't talk about CCP!!! (only on the website)

unkempt apex Feb 18, 2025, 5:02 AM

#

serene trellis it should start low and plateau at 60 but it starts going crazy for some users

You should ask this question in a dedicated RL server (search for it)
Because this is a unique project, from what I have heard
So in that server you can find different people dedicated to RL

gritty vessel Feb 18, 2025, 6:52 AM

#

Hey

gritty vessel Feb 18, 2025, 7:35 AM

#

Is there a guide to install ollama with a different backend?

#

In llama.cpp docs they have mentioned metallium backend I want to install the same backend on ollama

#

Is there any guide for that?

scarlet anchor Feb 18, 2025, 7:45 AM

#

Hi, is there any workaround for this?
https://stackoverflow.com/questions/77131746/how-to-download-punkt-tokenizer-in-nltk

the solutions in this didnt work

Stack Overflow

How to download punkt tokenizer in nltk?

I installed the NLTK library using
pip install nltk

and while using the lib
from nltk.tokenize import sent_tokenize
sent_tokenize(text)

I am getting this error
LookupError:
********************...

tropic sphinx Feb 18, 2025, 9:39 AM

#

Hey everyone! 👋

I’m working on a Python package that needs to automatically track transformations applied to pandas, NumPy, and scikit-learn. The goal is to detect when a dataset is modified without requiring the user to write extra code or manually call tracking functions.

The main challenge is finding a method that works seamlessly while ensuring all meaningful changes are detected.

🔹 What I Want to Achieve

Automatically track modifications when a user applies transformations like df.fillna(), df.drop_duplicates(), or sklearn_pipeline.fit(X).
Ensure minimal code changes for the user—ideally, they just import the package and work with pandas/NumPy as usual.
Detect in-memory modifications, including df.iloc[0, 1] = 5 or array[2] = 100, without requiring the user to explicitly log them.
Avoid major performance overhead—the tracking system should be lightweight and not slow down computations.

🔹 Approaches I’ve Considered

Proxy Wrapping (Overriding pandas, NumPy, and scikit-learn Methods)

Override common transformation functions (fillna(), drop_duplicates(), apply(), fit(), transform()).
Pros: Works transparently, no user interaction needed.
Cons: Override all the functionalities!

🔹 What I Need Help With

What other approaches would you suggest for tracking pandas/NumPy transformations **(almost) without user interaction**?
How would you track inline modifications (df.iloc[...] = 5) without modifying user code too much?
What’s the most efficient way to track changes while avoiding performance overhead?

Would love to hear your thoughts on how you’d approach this! 🚀 Thanks in advance for any insights! 🙌
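
To make the proxy-wrapping idea concrete, here is a library-agnostic sketch. `TrackedProxy` is a hypothetical class that wraps a plain Python list; it is not a pandas or NumPy API, and it leaves out most of what a real tracker would need.

```python
class TrackedProxy:
    """Wraps any object and records method calls and item assignments."""

    def __init__(self, target):
        self._target = target
        self._log = []

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def wrapper(*args, **kwargs):
            self._log.append(("call", name))
            return attr(*args, **kwargs)

        return wrapper

    def __setitem__(self, key, value):
        self._log.append(("setitem", key))
        self._target[key] = value

    def __getitem__(self, key):
        return self._target[key]


data = TrackedProxy([3, 1, 2])
data.sort()      # logged as a method call
data[0] = 100    # logged as an in-place assignment
print(data._log)
```

One caveat for the pandas case: `df.iloc[0, 1] = 5` assigns through the indexer object that `.iloc` returns, so a proxy like this would also have to wrap whatever attribute access returns. Calls that return a new object (for example `fillna`) would need their results re-wrapped too.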

keen perch Feb 18, 2025, 10:21 AM

#

I need help installing tensorflow GPU for windows. I installed wsl, cuda toolkit and Nvidia drivers; what else should i do? I need a detailed explanation, please help

harsh lagoon Feb 18, 2025, 1:16 PM

#

hey guys how do I use this?

#

where should i put it?

#

I want to know the coordinates using detect.py

#

I'm new to python so I really don't know what most of these do

unkempt wigeon Feb 18, 2025, 3:25 PM

#

import torch
import torchvision
import torch.nn as nn
import torchvision.transforms as transforms

import matplotlib.pyplot as plt

#device configuration
device = torch.device('cuda' if  torch.cuda.is_available() else "cpu")


#hyper paramiters
input_size = 784 # 28X28=784
hidden_size = 100
num_classes = 10
num_epochs = 2
Batch_size = 100

learn_rate = 0.0001

trainging_datasets = torchvision.datasets.MNIST(root='./data',train=True,
                                               transform=transforms.ToTensor(), download=True)


tests_datasets = torchvision.datasets.MNIST(root='./data',train=False,
                                               transform=transforms.ToTensor())

train_loader = torch.utils.data.DataLoader(dataset=trainging_datasets, batch_size=Batch_size,
                                           shuffle=True)


test_loader = torch.utils.data.DataLoader(dataset=tests_datasets, batch_size=Batch_size,
                                           shuffle=False)


exampels = iter(train_loader)
samples, labels = exampels.next(exampels)
print(samples.shape,labels.shape)

#

how come there is an attribute error
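
One likely source of an AttributeError in that snippet, judging only from the code shown and not from a traceback, is the `.next(...)` call: Python 3 iterators are advanced with the built-in `next()`, not with a `.next` method. A plain-Python illustration, using a list of made-up tuples in place of the DataLoader:

```python
# Stand-in for iter(train_loader): any Python 3 iterator behaves this way.
batches = iter([("samples-0", "labels-0"), ("samples-1", "labels-1")])

try:
    batches.next()  # Python 2 style, no such attribute in Python 3
except AttributeError as exc:
    print("AttributeError:", exc)

# The built-in next() is the way to pull one item from an iterator.
samples, labels = next(batches)
print(samples, labels)
```

Under that assumption, the equivalent change in the snippet above would be `samples, labels = next(exampels)`.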

true flicker Feb 18, 2025, 3:35 PM

#

Hi

unkempt wigeon Feb 18, 2025, 3:43 PM

#

hello

serene scaffold Feb 18, 2025, 3:57 PM

#

unkempt wigeon ```py import torch import torchvision import torch.nn as nn import torchvision.t...

as we've discussed a few times: always always show the whole entire error message no matter what. never say that you got an error message without showing the whole entire error message.

unkempt wigeon Feb 18, 2025, 4:13 PM

#

https://paste.pythondiscord.com/5F4Q

unkempt wigeon Feb 18, 2025, 4:18 PM

#

serene scaffold as we've discussed a few times: always always show the whole entire error messag...

sorry, i was trying to fix a mistyped, unfinished line; now I've run into another error

#

its in the paste bin

serene scaffold Feb 18, 2025, 4:21 PM

#

unkempt wigeon sorry i was trying to fix a typed line an unfinished line now Ive run int a anot...

you wrote -1.28*28, which is -35.84, which is a float. you probably meant to write -1, 28*28, which is tuple[int, int]
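
The difference between the two spellings is easy to see in plain Python. The variable names here are just for illustration.

```python
as_typed = -1.28 * 28      # "." instead of ",": one multiplication, one float
intended = (-1, 28 * 28)   # "," makes a tuple of two ints

print(type(as_typed).__name__, as_typed)
print(type(intended).__name__, intended)
```

Reshape-style calls expect integer dimensions, with -1 meaning "infer this one", so the float version is rejected while the tuple `(-1, 784)` is a valid shape.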

unkempt wigeon Feb 18, 2025, 4:21 PM

#

thank you

#

n_correct = (predictions == labels).sum().item()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'bool' object has no attribute 'sum'

serene scaffold Feb 18, 2025, 4:30 PM

#

unkempt wigeon n_correct = (predictions == labels).sum().item() ^^^^^^^^^^^^^^^...

can you explain what this error message is telling you?

unkempt wigeon Feb 18, 2025, 4:32 PM

#

it is not finding a sum even though I put .sum, as in the tutorial

serene scaffold Feb 18, 2025, 4:34 PM

#

unkempt wigeon it is not finding a sum even though I put `.sum`, as in the tutorial

do print(type(predictions), type(labels))

#

and tell me what it says

unkempt wigeon Feb 18, 2025, 4:36 PM

#

printed output:
<class 'torch.Tensor'> <class 'torch.Tensor'>

unkempt wigeon Feb 18, 2025, 4:38 PM

#

serene scaffold and tell me what it says

odd meteor Feb 18, 2025, 4:39 PM

#

unkempt wigeon n_correct = (predictions == labels).sum().item() ^^^^^^^^^^^^^^^...

(predictions == labels) is going to return a Boolean tensor where True (a.k.a 1) indicates a correct prediction and False (a.k.a 0) indicates an incorrect one.

use torch.sum((predictions == labels)) instead

unkempt wigeon Feb 18, 2025, 4:43 PM

#

epoch -1 / 2, step 200/600, loss = 1.3214
epoch -1 / 2, step 300/600, loss = 0.8638
epoch -1 / 2, step 400/600, loss = 0.8123
epoch -1 / 2, step 500/600, loss = 0.6779
epoch -1 / 2, step 600/600, loss = 0.4998
epoch 0 / 2, step 100/600, loss = 0.5865
epoch 0 / 2, step 200/600, loss = 0.4572
epoch 0 / 2, step 300/600, loss = 0.3157
epoch 0 / 2, step 400/600, loss = 0.5043
epoch 0 / 2, step 500/600, loss = 0.4080
epoch 0 / 2, step 600/600, loss = 0.3515

it does not update the epoch and it starts with -1 instead of 1

serene scaffold Feb 18, 2025, 4:43 PM

#

unkempt wigeon ```epoch -1 / 2, step 100/600, loss = 1.7525 epoch -1 / 2, step 200/600, loss = ...

did what Emyrs said work?

unkempt wigeon Feb 18, 2025, 4:44 PM

#

yes but it is not updating the epoch

serene scaffold Feb 18, 2025, 4:44 PM

#

odd meteor (predictions == labels) is going to return a Boolean tensor where True (a.k.a 1)...

AttributeError: 'bool' object has no attribute 'sum' indicates that (predictions == labels) itself is a single bool, which I find surprising.

unkempt wigeon Feb 18, 2025, 4:46 PM

#

what do i need to do to update the epoch?

#

it does update in the positive

odd meteor Feb 18, 2025, 4:53 PM

#

unkempt wigeon what do i need to do to update the epoch?

I just checked your code and there are a few issues I spotted. The logging should be epoch + 1 not epoch -1

print(f'Epoch {epoch+1}/{num_epochs}, Step {i+1}/{n_total_steps}, Loss = {loss.item():.4f}')
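
The reason for the `+ 1` is that `range` counts from zero. A small sketch with the training step left out, assuming the same `num_epochs = 2` and 600 steps per epoch that show up in the log (`log_lines` is a made-up name used only to collect the output):

```python
num_epochs = 2
n_total_steps = 600
log_lines = []

for epoch in range(num_epochs):        # epoch takes the values 0, 1
    for i in range(n_total_steps):     # i takes the values 0 .. 599
        # (forward pass, loss and optimizer step would go here)
        if (i + 1) % 100 == 0:
            line = f'Epoch {epoch+1}/{num_epochs}, Step {i+1}/{n_total_steps}'
            log_lines.append(line)
            print(line)
```

Writing `epoch - 1` instead shifts the printed numbers to -1 and 0. The loop itself still runs both epochs either way.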

unkempt wigeon Feb 18, 2025, 4:59 PM

#

print(f'epoch {epooch+1} / {num_epochs}, step {i+1}/{n_total_steps}, loss = {loss.item():.4f}')

odd meteor Feb 18, 2025, 5:06 PM

#

serene scaffold `AttributeError: 'bool' object has no attribute 'sum'` indicates that ` (predict...

Yeah, I initially suspected it's probably a case where Wendigo might have mistakenly made predictions and labels a rank 0 tensor instead of a rank 1 tensor. But then I also noticed in his code that s/he used torch.max(outputs, 1) instead of torch.argmax(outputs, dim=1)

torch.max(outputs, 1) returns a (values, indices) tuple instead of just the predicted labels. I usually prefer using torch.argmax() in this case because it's the safer option.

unkempt wigeon Feb 18, 2025, 5:07 PM

#

Im just following a tutorial

odd meteor Feb 18, 2025, 5:08 PM

#

unkempt wigeon Im just following a tutorial

Is everything working fine now?

unkempt wigeon Feb 18, 2025, 5:10 PM

#

epoch wont update

odd meteor Feb 18, 2025, 5:12 PM

#

unkempt wigeon epoch wont update

Even after fixing the logging print out? Can I see an example of what it's returning

unkempt wigeon Feb 18, 2025, 5:12 PM

#

epoch 1 / 2, step 100/600, loss = 1.7161
epoch 1 / 2, step 200/600, loss = 1.2840
epoch 1 / 2, step 300/600, loss = 0.8888
epoch 1 / 2, step 400/600, loss = 0.7217
epoch 1 / 2, step 500/600, loss = 0.6352
epoch 1 / 2, step 600/600, loss = 0.4856
epoch 2 / 2, step 100/600, loss = 0.5465
epoch 2 / 2, step 200/600, loss = 0.3712
epoch 2 / 2, step 300/600, loss = 0.4181
epoch 2 / 2, step 400/600, loss = 0.3767
epoch 2 / 2, step 500/600, loss = 0.4154
epoch 2 / 2, step 600/600, loss = 0.4518
accurecy = 0.8600000143051147

#

video im following

YouTube

Patrick Loeber

PyTorch Tutorial 13 - Feed-Forward Neural Network

New Tutorial series about Deep Learning with PyTorch!

In this part we will implement our first multilayer neural network that can do digit classification based on the famous MN...


odd meteor Feb 18, 2025, 5:14 PM

#

unkempt wigeon ```epoch -1 / 2, step 100/600, loss = 1.7525 epoch -1 / 2, step 200/600, loss = ...

But here the epoch is updating, if we go by this log. It moved from -1 to 0. Fix the logging part and share the new update

odd meteor Feb 18, 2025, 5:16 PM

#

unkempt wigeon epoch 1 / 2, step 100/600, loss = 1.7161 epoch 1 / 2, step 200/600, loss = 1.284...

The epoch is updating. You trained for 2 epochs; it trained for 1st and 2nd epoch as shown here.

unkempt wigeon Feb 18, 2025, 5:17 PM

#

its updating now corectly???

odd meteor Feb 18, 2025, 5:19 PM

#

unkempt wigeon its updating now corectly???

Yes.

unkempt wigeon Feb 18, 2025, 5:20 PM

#

how can i test my forward network?

unkempt wigeon Feb 18, 2025, 5:21 PM

#

odd meteor Yes.

thank you

odd meteor Feb 18, 2025, 5:50 PM

#

unkempt wigeon how can i test my forward network?

The last part in your code that's using a context manager has already handled the evaluation of your MLP (feed forward NN) on your test-set; hence, the reason you got accuracy= 0.86

with torch.no_grad():
    n_correct = 0
    n_samples = 0

    for images, labels in test_loader:
        images = torch.flatten(images, start_dim=1).to(device) #<--flatten image to a 1D vector
        labels = labels.to(device)

        logits = model(images) #<---Your feed-forward / forward propagation 

        predicted_labels = torch.argmax(logits, dim=1) #<--- I used 'argmax' here instead of 'max' to get predicted labels

        labels = labels.view_as(predicted_labels)  #<--- ensure labels match shape of predicted_labels

        # Accumulate the sample count and the correct-prediction count per batch
        n_samples += len(labels)
        n_correct += torch.sum(predicted_labels == labels).item()
    # Compute the average accuracy over all batches

    final_acc = 100.0 * n_correct / n_samples
    print(f'Accuracy = {final_acc:.2f}%')

unkempt apex Feb 18, 2025, 5:51 PM

#

harsh lagoon I'm new to python so I really don't know what most of these do

new to python? , then learn first basics of it, because it will really help you to understand this code

and regarding the question, ultralytics is a library to load and fine-tune Yolo and other models

unkempt wigeon Feb 18, 2025, 5:52 PM

#

odd meteor The last part in your code that's using a context manager has already handled th...

What I mean is: is there a way of graphing each epoch to see how much it has learned? And if I had a Tkinter module that lets me draw handwritten digits, could I see whether the model recognizes what has been written?

odd meteor Feb 18, 2025, 6:01 PM

#

unkempt wigeon What I mean is there a way of graphing each epoch to know how much learning it's...

Aside from seeing the loss go down via the training log, yes.

You can modify the code to accommodate for storing the loss per epoch, then once the training is over, create and invoke a function that plots the learning curve in each epoch.
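
A rough sketch of the bookkeeping, in plain Python. The numbers are the loss values from the training log posted earlier in this thread (one value every 100 steps); the names `logged` and `epoch_losses` are made up for illustration:

```python
# Loss values copied from the training log above, grouped by epoch.
logged = {
    1: [1.7161, 1.2840, 0.8888, 0.7217, 0.6352, 0.4856],
    2: [0.5465, 0.3712, 0.4181, 0.3767, 0.4154, 0.4518],
}

epoch_losses = []  # one average per epoch; this is the list you would plot
for epoch, losses in logged.items():
    epoch_losses.append(sum(losses) / len(losses))

for epoch, avg in enumerate(epoch_losses, start=1):
    print(f'epoch {epoch}: mean logged loss = {avg:.4f}')
```

In a real training loop you would append `loss.item()` inside the batch loop instead of copying values from a log. Passing `epoch_losses` to `matplotlib.pyplot.plot` would then draw the learning curve.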

#

About using Tkinter to draw a digit and have your trained model have a go at it, yeah, I believe it's possible as well. I haven't done something like that myself but yeah it's possible.

unkempt wigeon Feb 18, 2025, 6:15 PM

#

odd meteor Aside from seeing the loss go down via the training log, yes. You can modify t...

It's getting better and better, that's what I'm seeing from all this data

unkempt wigeon Feb 18, 2025, 6:19 PM

#

odd meteor The last part in your code that's using a context manager has already handled th...

My model is 97% correct yay

odd meteor Feb 18, 2025, 6:21 PM

#

unkempt wigeon My model is 97% correct yay

Next step would be, implementing EarlyStopping and applying it to your code to help prevent overfitting.

unkempt wigeon Feb 18, 2025, 6:22 PM

#

odd meteor Next step would be, implementing `EarlyStopping` and applying it to your code to...

What does earlystopping do?

odd meteor Feb 18, 2025, 6:24 PM

#

unkempt wigeon What does ``earlystopping`` do?

Prevents your model from overfitting. After a certain tolerance level has been reached, EarlyStopping will force your model to stop training.

unkempt wigeon Feb 18, 2025, 6:25 PM

#

Where would I implement it at the end inside of the loop or in the main body?

odd meteor Feb 18, 2025, 6:25 PM

#

unkempt wigeon Where would I implement it at the end inside of the loop or in the main body?

inside the training loop

unkempt wigeon Feb 18, 2025, 6:26 PM

#

What earlystop should I go for in training on average
To avoid overfitting?

cerulean rain Feb 18, 2025, 6:30 PM

#

hello, I am trying to create a neural network from scratch using numpy, but i am kinda lost in building a optimiser, I don't know how i should implement that... if anyone can give me any info, that would be super helpful! thanks!

unkempt wigeon Feb 18, 2025, 6:34 PM

#

odd meteor inside the training loop

What is the limit that I should set to avoid overfitting?

odd meteor Feb 18, 2025, 6:43 PM

#

unkempt wigeon What is the limit that I should set to avoid overfitting?

There is no one-size-fits-all answer. It depends on multiple factors such as dataset size, model complexity, and the problem you're solving. I could have a tolerance level of 3 while yours could be 9. In my case, if validation loss doesn't improve after 3 epochs, EarlyStopping will be triggered which ultimately will terminate the model from training further.
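
As a sketch of that tolerance idea: as far as I know plain PyTorch does not ship an `EarlyStopping` class, so the helper below is hypothetical, and the validation losses are made-up numbers used only to exercise the logic:

```python
class EarlyStopping:
    """Hypothetical helper: stop once validation loss has not improved
    for `patience` epochs in a row."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # how many bad epochs to tolerate
        self.min_delta = min_delta    # smallest change that counts as improvement
        self.best_loss = float('inf')
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(val_losses, start=1):
    if stopper.step(val_loss):        # this is the `if` inside the training loop
        stopped_at = epoch
        print(f'stopping after epoch {epoch}, best val loss {stopper.best_loss}')
        break
```

The check goes at the end of each epoch, after the validation loss for that epoch has been computed. Note that it watches validation loss, not training accuracy.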

unkempt wigeon Feb 18, 2025, 6:45 PM

#

To be honest, I forgot and hit retrain, and now I'm up to 98% accuracy, which I'm surprised actually worked. That's why I was wondering: I didn't realize it, and instead of interrupting the network I let it keep going. Do you think it's overfitting a little now?

tough ingot Feb 18, 2025, 6:50 PM

#

Which tool can make a node-link graph for big data? I tried Pyvis and Sigma.js, but there were too many connections, which built white blobs

#

lime grove Feb 18, 2025, 6:53 PM

#

Maybe you could coarse grain the graph, and represent that instead?

#

as things stand, visualizing the whole thing in one go is just not a good idea

#

You can perform a type of renormalization wherein you could, for instance, only represent nodes with a certain connectivity greater than N

#

I actually thought that these two images were some sort of a burlap cloth.

#

Or maybe represent only neighborhoods? there are a number of interesting things you could do that would convey information more efficiently than this data dump

unkempt wigeon Feb 18, 2025, 6:57 PM

#

odd meteor There is no one-size-fits-all answer. It depends on multiple factors such as dat...

If I got 50% be ok?

odd meteor Feb 18, 2025, 7:04 PM

#

tough ingot Which tool can make an node-link-graph for big data? I tried Pyvis and Sigma.js ...

Once it has gone past what I can easily do with networkx library, I gently abort mission 👀 . However, I've seen someone use Gephi and Graphistry for this sort of thing.

tough ingot Feb 18, 2025, 7:06 PM

#

The white are the Links / connections between nodes

odd meteor Feb 18, 2025, 7:07 PM

#

unkempt wigeon If I got 50% be ok?

Can you elucidate? I'm not sure I understand your question very well

tough ingot Feb 18, 2025, 7:07 PM

#

odd meteor Once it has gone past what I can easily do with networkx library, I gently abort...

Thanks, I try Graphistry

unkempt wigeon Feb 18, 2025, 7:08 PM

#

odd meteor Can you elucidate? I'm not sure I understand your question very well

To stop the training when it gets to 50% so that avoids overfitting

odd meteor Feb 18, 2025, 7:11 PM

#

unkempt wigeon To stop the training when it gets to 50% so that avoids overfitting

Instead of stopping at 50% accuracy, use early stopping with validation loss to decide when to stop training dynamically.

main fox Feb 18, 2025, 7:14 PM

#

Could also use Patience, e.g. if no improvement in test loss after x amount of epochs

unkempt wigeon Feb 18, 2025, 7:23 PM

#

odd meteor Instead of stopping at 50% accuracy, use early stopping with validation loss to ...

what should i make next?

tough ingot Feb 18, 2025, 7:48 PM

#

lime grove as things stand, visualizing the whole thing in one go is just not a good idea

This is also an idea.
Maybe I can do something like NASA does: make many small pieces and then combine them. NASA does this for images from space. Can I do this with pyvis?

#

So instead of generating one graph.html, I create many graph.html files (one per cluster) and then write a graph.html that combines all the clusters into one graph.

odd meteor Feb 18, 2025, 7:57 PM

#

unkempt wigeon what should i make next?

Since you've implemented EarlyStopping in your MLP, you might wanna move to CNN next.

Well, before moving to CNN, pick two different datasets (one tabular dataset and one image data) and practise what you just learned by training a NN with MLP.

#

Once you've trained an MLP on a new image dataset, preferably one with 3 color channels, try to train a CNN model on the same dataset. Hopefully, this will help you see and understand why CNNs tend to outperform MLPs.

unkempt wigeon Feb 18, 2025, 8:35 PM

#

MLP?

unkempt wigeon Feb 18, 2025, 8:39 PM

#

odd meteor Since you've implemented EarlyStopping in your MLP, you might wanna move to CNN ...

MLP?

unkempt wigeon Feb 18, 2025, 8:42 PM

#

odd meteor Prevents your model from overfitting. After a certain tolerance level has been r...

what would it be in an if statement?

odd meteor Feb 18, 2025, 9:41 PM

#

unkempt wigeon MLP?

Multi-Layer Perceptron

unkempt wigeon Feb 18, 2025, 10:32 PM

#

odd meteor Multi-Layer Perceptron

What happens if the accuracy is a full 1.0 instead of 0.900000

opaque condor Feb 18, 2025, 11:58 PM

#

odd meteor Once you've trained a MLP on a new image dataset - - preferably an image with 3 ...

Would it be possible to make a multi-purpose model by training the network on one dataset and then changing the data afterwards?

orchid light Feb 19, 2025, 12:26 AM

#

Bruh I made a GPT-like transformer but it sucks, it learns so slowly......

#

Could I get any tips? I tried everything: changing learning rate, optimizers, model parameters, datasets, vocab... and my model is just stuck

orchid light Feb 19, 2025, 12:46 AM

#

I just switched from character-level vocab to subword and it just learns so slowly

#

opaque condor Feb 19, 2025, 2:38 AM

#

orchid light

Remember, learning takes time, especially in this case: the computer has to translate what humans mean in language and then convert that into vectors that it can draw lines between to find the words that are appropriate for what is on the graph

cerulean kayak Feb 19, 2025, 3:56 AM

#

So I just found out that xscale and yscale exist in matplotlib.pyplot.

Basically, is this a function that you guys use often? Because this seems like a function that could be a big game changer, yet I've never seen it used before today.

please at me if you have anything.

odd meteor Feb 19, 2025, 4:28 AM

#

opaque condor Would it be possible to combine make a multi- model by training the data on the...

Not sure I understand your question properly. Can you add more clarity to it?

odd meteor Feb 19, 2025, 4:37 AM

#

unkempt wigeon What happens if the accuracy is a full 1.0 instead of 0.900000

The model is 'standing on business.' More like it's saying "there's no gimmicking around here" in a Commando voice.

But ideally, while the model gets all prediction right with 100% accuracy, it's good to confirm the model isn't overfitting.

calm thicket Feb 19, 2025, 6:57 AM

#

it's impossible in practice because there is noise

limpid bronze Feb 19, 2025, 7:29 AM

#

i want to build a Mobile app for real time detection, should i use yolo8n.pt or yolo8s.pt??? or any other yolo8 model?

fervent canopy Feb 19, 2025, 8:08 AM

#

limpid bronze i want to build a Mobile app for real time detection, should i use yolo8n.pt or ...

why not use yolo11?

jaunty helm Feb 19, 2025, 8:57 AM

#

cerulean kayak So I just found out that `xscale` and `yscale` exist in `matplotlib.pyplot`. B...

game changer
it does the thing it's supposed to do, which is fine, but it's not like its existence changes everything
mpl code is still a pain with/without it

tidal bough Feb 19, 2025, 9:39 AM

#

Basically, is this a function that you guys use often?
I use logarithmic scales often, so yes

sour kelp Feb 19, 2025, 9:42 AM

#

hello, I am looking for a partner to learn DSA with me using python and on intermediate level

opaque condor Feb 19, 2025, 12:06 PM

#

odd meteor Not sure I understand your question properly. Can you add more clarity to it?

What I mean is: since the neural network should remember the MNIST dataset that I gave it without downloading it again, if I change the dataset that is fed into it, would that change what it understands? Or can it not learn two things at once?

orchid light Feb 19, 2025, 1:15 PM

#

opaque condor Remember learning takes time especially in this case the computer has to transla...

Yeah, but my problem is that I tried changing my model complexity and just letting it run, but it didn't help. My LLM just can't cross that 4.6 loss line

#

I train on a good dataset called FineWeb 10BT (10 billion tokens), and my character-level model (with fewer parameters) did better (had loss 0.8) than my new model, which can't get below loss 4.6.

#

My model graph isnt even close to this

#

I just dont know what to do anymore

jaunty helm Feb 19, 2025, 1:32 PM

#

orchid light

from my limited knowledge, I think llms usually only train for a few (< 5) epochs, on a very big (trillion tokens) dataset

#

also, maybe dynamic learning rates as training goes on?

orchid light Feb 19, 2025, 1:38 PM

#

jaunty helm also, maybe dynamic learning rates as training goes on?

I have dynamic learning rate

#

I also tried turning it off but it didnt help

orchid light Feb 19, 2025, 1:40 PM

#

jaunty helm from my limited knowledge, I think llms usually only train for a few (< 5) epoch...

When i do more than one epoch my model learns the training data not how to generalize (i tried it on smaller datasets)

#

I could give u my code if u want too look for some errors

jaunty helm Feb 19, 2025, 1:51 PM

#

orchid light I could give u my code if u want too look for some errors

well, I'm not exactly an llm training expert...
if I were you, I'd look for people that do something similar, say asking those who release finetunes on what they do

orchid light Feb 19, 2025, 1:52 PM

#

jaunty helm well, I'm not exactly an llm training expert... if I were you, I'd look for peop...

Ok and thx for the help..

thorn flame Feb 19, 2025, 6:29 PM

#

What's an alternative library to sentence-transformers for creating embeddings

#

That supports python 3.8

rich moth Feb 19, 2025, 6:44 PM

#

!paste

arctic wedgeBOT Feb 19, 2025, 6:44 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

cerulean kayak Feb 19, 2025, 7:04 PM

#

tidal bough > Basically, is this a function that you guys use often? I use logarithmic scale...

So in general, how do you know that scaling the graph will give you a better idea of what you want to illustrate?
Because my problem is I heard of it while doing a tutorial where they were like "Oh geez. You should really scale it to see what I mean", and I'm wondering "How the heck would I know to do that by myself?"

Sorry this is such an open ended question: I typed the function in on YouTube and expected to find several videos on when and when not to use it and found very little.

calm thicket Feb 19, 2025, 7:07 PM

#

if you know your data has a certain distribution, you could scale it to make it more obvious or show up better. a long-tailed distribution is hard to visualize without a log scale because everything will be clumped on the left. but with scaling, it will look normal

#

or if the data follows a power law, you could use a log log scale to show that
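
A quick way to see why, sketched in plain Python with a made-up power law `y = 3 * x**2` (the data is hypothetical, chosen only so the arithmetic is easy to follow):

```python
import math

# Hypothetical power-law data: y = 3 * x**2
xs = [1, 10, 100, 1000]
ys = [3 * x**2 for x in xs]

# On raw axes the values run from 3 to 3,000,000, so the small points
# get squashed near zero. A log-log plot shows log10(x) against log10(y).
log_xs = [math.log10(x) for x in xs]
log_ys = [math.log10(y) for y in ys]

# Slope between each pair of neighbouring points in log-log space.
slopes = [
    (log_ys[i + 1] - log_ys[i]) / (log_xs[i + 1] - log_xs[i])
    for i in range(len(xs) - 1)
]
print(slopes)  # every slope is approximately the exponent, 2
```

Because the slope is the same everywhere, the points fall on a straight line. In matplotlib, `plt.xscale('log')` together with `plt.yscale('log')` applies this transformation to the axes for you.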

cerulean kayak Feb 19, 2025, 7:50 PM

#

calm thicket if you know your data has a certain distribution, you could scale it to make it ...

okay, so that's just a matter of doing some statistics on the data to find out if it has a distribution: I can re-teach myself how to do that.

tidal bough Feb 19, 2025, 8:47 PM

#

cerulean kayak So in general, how do you know that scaling the graph will give you a better ide...

It's just an experience thing, I'd say. Like, if you look at the plot and you see that most of the points merge together near 0 and are hard to see, probably that plot will be more enlightening as a log-y one. That sort of thing.

#

interactive plots are also very useful for exploring data

cerulean kayak Feb 19, 2025, 10:44 PM

#

tidal bough interactive plots are also very useful for exploring data

Interactive?
Hold the phone: doesn't matplotlib's pyplot produce a static output? Are you saying to use something else to get a feel for the data then using pyplot?

mild dirge Feb 19, 2025, 10:57 PM

#

cerulean kayak Interactive? Hold the phone: doesn't matplotlib's pyplot produce a static outpu...

matplotlib can have interactive plots. Something I have been getting into lately is PowerBI, which I may want to use for some initial data inspection when we get data from clients.

unkempt wigeon Feb 19, 2025, 11:15 PM

#

for i (images, labels) in enumerate(train_loader):
^^^^^^^^^^^^^^^^^^
SyntaxError: cannot assign to function call'

#

https://paste.pythondiscord.com/EQYQ

serene scaffold Feb 19, 2025, 11:37 PM

#

unkempt wigeon for i (images, labels) in enumerate(train_loader): ^^^^^^^^^^^^^^^^^^ Sy...

can you see how that's a syntactically invalid for loop?

gray shard Feb 19, 2025, 11:39 PM

#

lads what is your opinion on data camp as a learning resource?

#

I feel some of the concepts are slightly rushed and not explained as well

opaque condor Feb 19, 2025, 11:45 PM

#

serene scaffold can you see how that's a syntactically invalid for loop?

I'm just following the tutorial, and it used enumerate

serene scaffold Feb 19, 2025, 11:45 PM

#

opaque condor I'm just following the tutorial, and it used enumerate

are you Wendigo on a different account?

opaque condor Feb 19, 2025, 11:46 PM

#

No my brain is just confused sometimes I am feeling what people might be thinking very tired

serene scaffold Feb 19, 2025, 11:47 PM

#

opaque condor No my brain is just confused sometimes I am feeling what people might be thinkin...

so you are not the same person as Wendigo?

opaque condor Feb 19, 2025, 11:47 PM

#

Yep

lime grove Feb 19, 2025, 11:47 PM

#

chatroom paranoia has remained the same for around 4 decades. Are you, or are you not a sockpuppet, sir?

serene scaffold Feb 19, 2025, 11:48 PM

#

lime grove chatroom paranoia has remained the same for around 4 decades. Are you, or are yo...

Mechanical Fox is replying to a message that very clearly was a response to Wendigo, but Mechanical Fox spoke as though they were Wendigo.

lime grove Feb 19, 2025, 11:48 PM

#

sure, but regardless of those details, the point remains. Autoconfusion is a giveaway, but also writing style.

opaque condor Feb 19, 2025, 11:51 PM

#

lime grove chatroom paranoia has remained the same for around 4 decades. Are you, or are yo...

?

gray shard Feb 20, 2025, 12:00 AM

#

for the pd.read_csv() function there is an argument called na_values? anyone know what this is?

serene scaffold Feb 20, 2025, 12:11 AM

#

gray shard for the pd.read_csv() function there is an argument called na_values? anyone kno...

pandas is very well documented

#

!docs pandas.read_csv

arctic wedgeBOT Feb 20, 2025, 12:11 AM

#

pandas.read_csv

pandas.read_csv(filepath_or_buffer, *, sep=<no_default>, delimiter=None, header='infer', names=<no_default>, index_col=None, usecols=None, dtype=None, engine=None, ...)
Read a comma-separated values (csv) file into DataFrame.

Also supports optionally iterating or breaking of the file into chunks.

Additional help can be found in the online docs for [IO Tools](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html).

serene scaffold Feb 20, 2025, 12:11 AM

#

take a look and you'll see what that argument is for.

gray shard Feb 20, 2025, 12:13 AM

#

ahhhh ok thank you! where can I learn more about the bot commands here in this server

serene scaffold Feb 20, 2025, 12:13 AM

#

gray shard ahhhh ok thank you! where can I learn more about the bot commands here in this s...

go to #bot-commands and do !help

gray shard Feb 20, 2025, 12:14 AM

#

serene scaffold go to #bot-commands and do `!help`

thanks lad appreciate it

serene scaffold Feb 20, 2025, 12:14 AM

#

I'm a lad now?

gray shard Feb 20, 2025, 12:14 AM

#

if ur girl thanks lass

serene scaffold Feb 20, 2025, 12:15 AM

#

I don't care if people think I'm a girl as long as they think I'm a pretty girl.

gray shard Feb 20, 2025, 12:15 AM

#

only pretty girls out of the female category know data science and ai

jaunty helm Feb 20, 2025, 12:25 AM

#

cerulean kayak Interactive? Hold the phone: doesn't matplotlib's pyplot produce a static outpu...

other plotting libs exist, like plotly
personally I've been liking hvplot + bokeh

neat bloom Feb 20, 2025, 2:05 AM

#

do y'all have any recommended books for newbies?

#

pls ping me if y'all know, i will genuinely forget

odd meteor Feb 20, 2025, 2:33 AM

#

neat bloom do y'all have any recommended books for newbies?

You can check the pinned messages

hearty rampart Feb 20, 2025, 7:47 AM

#

I need help with this question https://imgur.com/a/ysbFJcz

Imgur

Untitled Album

odd meteor Feb 20, 2025, 9:56 AM

#

hearty rampart I need help with this question https://imgur.com/a/ysbFJcz

We have a channel for DSA. You might wanna check #algos-and-data-structs

unkempt wigeon Feb 20, 2025, 11:53 AM

#

serene scaffold can you see how that's a syntactically invalid for loop?

nope

#

https://youtu.be/pDdP0TFzsoQ?si=8E6dglBBT-axZ9VJ

YouTube

Patrick Loeber

PyTorch Tutorial 14 - Convolutional Neural Network (CNN)

New Tutorial series about Deep Learning with PyTorch!

In this part we will implement our first convolutional neural network (CNN) that can do image classification based on the ...


serene scaffold Feb 20, 2025, 1:55 PM

#

unkempt wigeon for i (images, labels) in enumerate(train_loader): ^^^^^^^^^^^^^^^^^^ Sy...

You should be able to see what about this is wrong. If you can't, what you're trying to do is probably above your skill level
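
For reference, the usual shape of that loop, sketched with a plain list standing in for the DataLoader (the string placeholders are made up; a real loader yields tensors):

```python
# Stand-in for a DataLoader: any iterable of (images, labels) pairs
# unpacks the same way.
train_loader = [
    ('batch0_images', 'batch0_labels'),
    ('batch1_images', 'batch1_labels'),
]

seen = []
# Note the comma after `i`: enumerate yields (index, item) pairs, and each
# item is itself unpacked into (images, labels).
for i, (images, labels) in enumerate(train_loader):
    seen.append((i, images, labels))
    print(i, images, labels)
```

Without the comma, `i (images, labels)` is parsed as calling `i` like a function, which is why Python reports "cannot assign to function call".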

lapis sequoia Feb 20, 2025, 2:38 PM

#

so basically for our project we will be having to use some models and we decided on LSTM/ RNN as its time series

tmrw we have a meeting with our mentor and we r supposed to have done some research or have a demo model or anything

so basically main PS is
like aiml powered smart energy management system
be it for home or office anything
so like we r supposed to collect data through IoT devices
and there will be 3 models:
consumption prediction
anomaly detection
generation prediction

original idea was to make it for gated communities then thermostat, AC use case for offices came later we just combined and made it general

idk what exactly to prepare for this. Like my teammate is going through a github project that has like LSTM for some finland electricity consumption thingy

im trying to go through research papers but any inputs ideas etc? what models can be used and metrics to be kept in mind. we talked to our professor and she suggested the RNNs / LSTMs

tidal bough Feb 20, 2025, 4:29 PM

#

cerulean kayak Interactive? Hold the phone: doesn't matplotlib's pyplot produce a static outpu...

Matplotlib supports interactive plots (switch backend to an interactive one, e.g. matplotlib.use("TkAgg"), or if you're in a jupyter notebook use %matplotlib widget). I also sometimes use other libraries than mpl, for example plotly can do very cool interactive plots which support, for instance, showing all the info about a point in a tooltip when you hover over it.

woeful escarp Feb 20, 2025, 4:38 PM

#

Hello, I am starting in ML, I would like to work in a project to improve, send me DM

worldly wagon Feb 20, 2025, 7:37 PM

#

hi does polars have tuple support? can't find anything on it

main fox Feb 20, 2025, 8:12 PM

#

unkempt wigeon nope

Your class definition is all messed up

class ConvNet(nn.Module):
    def __init__(self,):
        self.conv1 = nn.Conv2d(3, 6,5)#r channels
        self.pool = nn.MaxPool2d(2,2)
        self.conv2= nn.Conv2d(6,16,5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2= nn.Linear(120, 84)
        self.fc3= nn.Linear(84, 10)
        x = conv1

    def forward(self,x):
        pass

Revise this part first

unkempt wigeon Feb 20, 2025, 8:47 PM

#

main fox Your class definition is all messed up ```py class ConvNet(nn.Module): def ...

what part specifically?

weary timber Feb 20, 2025, 8:54 PM

#

is there a decent free website i can train my llm's

#

?

serene scaffold Feb 20, 2025, 8:56 PM

#

weary timber is there a decent free website i can train my llm's

train what LLM to do what?

weary timber Feb 20, 2025, 8:56 PM

#

serene scaffold train what LLM to do what?

gpt

#

to be exact

serene scaffold Feb 20, 2025, 8:56 PM

#

weary timber gpt

there are many GPT models

weary timber Feb 20, 2025, 8:56 PM

#

from scratch

main fox Feb 20, 2025, 8:57 PM

#

You didn't call super().__init__(), so nn.Module's own setup never runs before you assign layers

You don't seem to have a layer for flattening, so your Linear layer won't be able to take the outputs of the conv layer

x = conv1 should not be in your def __init__()

You don't have any activation functions so your Linear layers don't learn any non linear patterns

Your forward pass doesn't do anything, so your model won't process inputs
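
The `super().__init__()` point can be shown without PyTorch. This is a plain-Python analogy, not the real `nn.Module`: the classes `Base`, `Broken` and `Fixed` are made up, and `Base` stands in for a parent class that sets up bookkeeping in its own `__init__`:

```python
class Base:
    def __init__(self):
        self.layers = {}              # state the parent expects to exist

    def register(self, name, layer):
        self.layers[name] = layer


class Broken(Base):
    def __init__(self):
        # The parent's __init__ never ran, so self.layers does not exist yet.
        self.register('fc1', 'linear')


class Fixed(Base):
    def __init__(self):
        super().__init__()            # run the parent's setup first
        self.register('fc1', 'linear')


try:
    Broken()
    broken_error = None
except AttributeError as exc:
    broken_error = exc
    print('Broken failed:', exc)

fixed = Fixed()
print('Fixed layers:', fixed.layers)
```

`nn.Module` does similar bookkeeping for layers and parameters in its own `__init__`, which is why PyTorch raises an error if a layer is assigned before that call.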

serene scaffold Feb 20, 2025, 8:57 PM

#

weary timber from scratch

It's astronomically expensive to train an LLM from scratch. Give up on this immediately.

weary timber Feb 20, 2025, 8:58 PM

#

serene scaffold It's astronomically expensive to train an LLM from scratch. Give up on this imme...

so i cant test my own model :(?

serene scaffold Feb 20, 2025, 8:58 PM

#

weary timber so i cant test my own model :(?

you can create some models that aren't LLMs from scratch, and you can fine-tune existing LLMs. Only a handful of very wealthy organizations with tons of training data can create LLMs from scratch.

weary timber Feb 20, 2025, 8:59 PM

#

i want to make a small one couldnt i ?

serene scaffold Feb 20, 2025, 8:59 PM

#

I suppose, but it probably wouldn't be able to respond coherently to any prompt.

weary timber Feb 20, 2025, 9:00 PM

#

okay then ty

#

i was ready to work on this for a full week

#

to waste all my time

serene scaffold Feb 20, 2025, 9:00 PM

#

Sorry I don't have better news

#

There are still beginner ML projects that you can do. But they probably won't involve LLMs.

weary timber Feb 20, 2025, 9:01 PM

#

can you tell me one in nlp?

#

i made countless project on image classification and stuff

serene scaffold Feb 20, 2025, 9:02 PM

#

weary timber can you tell me one in nlp?

you can fine-tune a BERT model (which was considered an LLM when it was created) to do some kind of classification task.

main fox Feb 20, 2025, 9:02 PM

#

Image captioning
Stack a CNN on top of an RNN

weary timber Feb 20, 2025, 9:02 PM

#

okay thanks

unkempt wigeon Feb 20, 2025, 9:03 PM

#

serene scaffold Feb 20, 2025, 9:03 PM

#

unkempt wigeon

do you see why the line with your cursor is wrong?

weary timber Feb 20, 2025, 9:03 PM

#

unkempt wigeon

insert mode...

main fox Feb 20, 2025, 9:04 PM

#

unkempt wigeon

Why did you write "NeralNet" inside super() ?

unkempt wigeon Feb 20, 2025, 9:06 PM

#

how can i disable that cursor?

main fox Feb 20, 2025, 9:06 PM

#

By indenting properly

unkempt wigeon Feb 20, 2025, 9:10 PM

#

in the white block

main fox Feb 20, 2025, 9:10 PM

#

Press tab there, yeah

weary timber Feb 20, 2025, 9:14 PM

#

unkempt wigeon in the white block

press ins

#

on your keyboard

unkempt wigeon Feb 20, 2025, 9:15 PM

#

thank you

#

thank you

unkempt wigeon Feb 20, 2025, 9:49 PM

#

https://paste.pythondiscord.com/NJIA

dense needle Feb 20, 2025, 9:58 PM

#

weary timber can you tell me one in nlp?

I made a—really low accuracy— implementation of an ngram model to predict words. I didn’t make it from scratch

#

Used an existing package that already had a tokenizer and tools for making document feature matrices

#

And then I implemented the prediction method just to get a feel for how ngrams work

#

Not LLM level stuff but it was doable and I learned a lot

#

Point being I think you could do something small like that
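
A minimal sketch of that kind of project, for anyone curious. It is not the implementation described above: it swaps the existing package for plain Python counting, uses a naive whitespace tokenizer, and only handles bigrams. The function names and the toy corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict:
    """Count, for every word, which words follow it and how often."""
    tokens = text.lower().split()  # naive whitespace tokenizer (an assumption)
    following = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following: dict, word: str):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigrams(corpus)

print(predict_next(model, "the"))  # cat
print(predict_next(model, "sat"))  # on
print(predict_next(model, "dog"))  # None
```

Accuracy on broad text will be low, as noted above, because a bigram model only ever looks at the single previous word.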

unkempt wigeon Feb 20, 2025, 10:02 PM

#

unkempt wigeon https://paste.pythondiscord.com/NJIA

why is the loss function showing it has an error

unkempt wigeon Feb 20, 2025, 11:11 PM

#

weary timber press ins

thank you so much

main fox Feb 21, 2025, 12:51 AM

#

unkempt wigeon thank you so much

What do you believe this part of your code does?

    def forward(self,x):
        pass

#

Matter of fact, I mentioned changes in my previous reply when you asked

unkempt wigeon Feb 21, 2025, 1:03 AM

#

main fox Why did you write "NeralNet" inside super() ?

Because of the insert cursor, I've been copying and pasting it from another bit of code. Do I just copy and paste the same forward method that I've been working with? It seems like it would mostly work out.

unkempt wigeon Feb 21, 2025, 1:43 AM

#

https://paste.pythondiscord.com/A2ZA

agile cobalt Feb 21, 2025, 3:36 AM

#

serene scaffold Sorry I don't have better news

have you seen https://github.com/karpathy/nanoGPT and https://github.com/jingyaogong/minimind/blob/master/README_en.md ?

they're not really able to have coherent conversations, but working at all at that scale is a bit impressive, and kind of does what they're looking for

that said... tbh not sure if recommending those would be of any use, considering they do not really understand what they're getting into

sour kelp Feb 21, 2025, 4:34 AM

#

hello, I am looking for a partner to learn DSA with me using python and on intermediate level

serene scaffold Feb 21, 2025, 4:41 AM

#

sour kelp hello, I am looking for a partner to learn DSA with me using python and on inter...

You're looking for #algos-and-data-structs. Data science is something else.

orchid light Feb 21, 2025, 4:43 AM

#

weary timber so i cant test my own model :(?

U can with some good graphic card or with cloud.

serene scaffold Feb 21, 2025, 4:44 AM

#

orchid light U can with some good graphic card or with cloud.

They were asking where to train a foundation model for free.

orchid light Feb 21, 2025, 4:44 AM

#

U cant....

#

U mean pre train?

orchid light Feb 21, 2025, 4:44 AM

#

serene scaffold They were asking where to train a foundation model for free.

.

serene scaffold Feb 21, 2025, 4:44 AM

#

When you make an LLM from scratch

orchid light Feb 21, 2025, 4:45 AM

#

I made one but its shitty

#

Like i tried to switch from character-level vocab to subword and it's shit now

serene scaffold Feb 21, 2025, 4:45 AM

#

I don't think even gpt-2 could be taught to respond correctly to prompts

orchid light Feb 21, 2025, 4:47 AM

#

serene scaffold I don't think even gpt-2 could be taught to respond correctly to prompts

No, but my model is shit even on the training data. Maybe i need to do more epochs

#

training it rn

weary timber Feb 21, 2025, 6:44 AM

#

dense needle I made a—really low accuracy— implementation of an ngram model to predict words....

i already did that, you're right, it's pretty low accuracy when you do a broad one

weary timber Feb 21, 2025, 6:51 AM

#

unkempt wigeon thank you so much

np

weary timber Feb 21, 2025, 6:54 AM

#

serene scaffold I don't think even gpt-2 could be taught to respond correctly to prompts

so even gpt2 says some nonsense ?

#data-science-and-ml

Create a utility matrix (user-item matrix)

Convert the utility matrix to a sparse matrix

Define a batched KNN function for incremental computation

Example usage

🔹 What I Want to Achieve

🔹 Approaches I’ve Considered

🔹 What I Need Help With