#data-science-and-ml
college uni thesis
sorry i dont have enough knowledge yet
so im trying to grasp what to expect
did you propose it yet?
I don’t think it’s novel enough to be accepted
at least, based on my experience ha
where can I talk about web scraping?
my research back in high school was using a face recognition model in a video compression algorithm to keep faces high quality
Can be used in video conferencing
You’d need to go much deeper than just “classifying types of vehicles”, since there’s hundreds of models that already do that online
Filipino?
yeah
nope, not yet
English only in the server pls, it’s a rule
because that's what our civil eng prof recommended to us
But we can take it to dms
^
Because it’s a very basic problem and very solved
But if you propose it and they accept it, just go for it
But if I were you I would try finding another problem or make it more specific
even with a raspberry pi implementation to count cars and classify them? also, we'll log it in a web-based database
yeah, we still have to ask about it
our drowsiness detection for car implementation got rejected tho
it will be used for a traffic survey for civil eng
If you’re gonna use it as part of another research, then that’s fine
As long as it’s not the main part of your research
What was the premise for the rejection
Also which uni is this?
they said it's too hard to get datasets
Okay, valid tbh
apparently there were previous researchers who worked on almost the same concept, but they gave up
OH REALLY? ASHUDHUASHUASHASDHASHDAH
What will it be about
Secret
^
If you do decide to go forward with the vehicle classification, there’s lots of resources online for that
civil eng students in our uni manually count cars for a research survey
idk what the survey is called
but they were counting from 9am to 8pm?
so we consulted the research adviser, who told us to automate it
Yepyep
yeah im pretty sure
for libraries im using cv2 and mediapipe
i managed to get a real time hand tracker that counts how many fingers you have up
and then using that i wanna develop it to do sign language as well
for the model im using pytorch
and thats it
wbu
Hello everyone
(Note: I am a newbie in this field)
K-Fold Cross-Validation, f1 score, and train-Test Split, statistical tests such as McNemar's.
Despite all this, I don't think these alone are exactly the right way to understand it. The real picture comes out after you try the model on a real-life problem. The model can perform very well in testing. But what about outliers? Values that change over time? So I think the best way is to test it with different data: test data that is statistically different from the training data.
Things like train test splits only increase your confidence in the model's reliability. They can be misleading. I've experienced this.
In fact, increasing the amount of data seems like the only real solution. 😅 👀
right now for the number classifier i'm doing it from scratch (just numpy), but when i do the sign language one i'm thinking of doing a real time one, not sure if i'll do it from scratch or with cv libraries though
typically how do mle interviews run, as in what's the structure
A friend and I are attempting to create an open source project that automatically identifies unattractive clauses in end user license agreements/privacy policies. I mostly come from a background of reverse engineering and security research and not data science, but I am very passionate about this.
I figure NLP is likely our best bet?
My understanding is that transformer models are somewhat unwieldy and kind of like a black box.
NLP is "the subset of AI that deals with natural language", so saying "NLP is the best way to automatically extract certain clauses from a legal document" isn't really saying anything.
most of AI these days is a "black box".
I think what I mean to say is that techniques such as NER or text classification have more definite output shapes...?
any program you can possibly think of to solve this problem will, by definition, be NLP.
I guess that's not your point of confusion.
Right.
can you tell me what you mean by "definite output shapes"?
i.e. the model has a finite set of outputs, as opposed to the output being a stream of tokens with indeterminate structure and length
The ideal output might be spans of text with "low-level" details such as: [BINDING, ARBITRATION]
For classification tasks, yes. The model classifies data into categories it was trained on (zero shot is an exception).
NER is a type of classification where each word is classified into one of the entities specified in training
My guess is that for this purpose, an NER model might be ideal. But instead of entities like "PERSON" or "ORG", we use low-level legal primitives, like "LIMITATION", "BINDING", "WAIVER", (more research is needed on my part for the ideal collection of these)
Sounds like it, yes. You could go the NER route or test specific prompting of an LLM.
Is it possible to get an LLM to determinately emit structured output?
if the interactive LLM is instruction-tuned, it will almost always adhere to instructions like "return the output as a JSON with this format"
Interesting, I'll need to investigate how performant something like that is
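Whichever model ends up producing the JSON, it is probably safer to validate the reply before using it than to trust it. A minimal sketch, assuming a hypothetical label set (borrowed from the examples mentioned above) and a hypothetical reply format of a JSON list of `{"text", "labels"}` objects; the LLM call itself is left out:

```python
import json

# Hypothetical label set, based on the examples mentioned in the chat.
ALLOWED_LABELS = {"BINDING", "ARBITRATION", "LIMITATION", "WAIVER"}


def parse_clause_spans(raw_reply):
    """Parse a reply that is supposed to be a JSON list of
    {"text": str, "labels": [str, ...]} objects.

    Raises ValueError if the reply is not valid JSON or has the wrong shape.
    """
    try:
        spans = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc

    if not isinstance(spans, list):
        raise ValueError("expected a JSON list of spans")

    cleaned = []
    for span in spans:
        if not isinstance(span, dict):
            raise ValueError("each span must be a JSON object")
        text = span.get("text")
        labels = span.get("labels")
        if not isinstance(text, str) or not isinstance(labels, list):
            raise ValueError("each span needs a 'text' string and a 'labels' list")
        unknown = [
            label for label in labels
            if not isinstance(label, str) or label not in ALLOWED_LABELS
        ]
        if unknown:
            raise ValueError(f"unknown labels: {unknown}")
        cleaned.append({"text": text, "labels": labels})
    return cleaned


reply = '[{"text": "You agree to binding arbitration.", "labels": ["BINDING", "ARBITRATION"]}]'
print(parse_clause_spans(reply))
```

Rejecting anything outside the fixed label set gives the "finite set of outputs" property discussed above, even when the underlying model emits free-form tokens.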
would a real time sign language identifier be a good resume project
Yes
that would be incredibly ambitious.
Hi!
Stuck on not being able to get llama-cpp-python to use the gpu.
- Nvidia-SMI shows that cuda 12.5 is installed
- I installed it with:
pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu125
- Here is the script I'm using:
from llama_cpp import Llama
llm = Llama(
    model_path="../../models/DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf",
    n_gpu_layers=-1,  # Offload all layers to the GPU
    # seed=1337,  # Uncomment to set a specific seed
    # n_ctx=2048,  # Uncomment to increase the context window
)
output = llm(
    "Q: Name the planets in the solar system? A: ",  # Prompt
    max_tokens=32,  # Generate up to 32 tokens, set to None to generate up to the end of the context window
    stop=["Q:", "\n"],  # Stop generating just before the model would generate a new question
    echo=True,  # Echo the prompt back in the output
)  # Generate a completion, can also call create_completion
print(output)
After running, this keeps popping up:
print_info: EOG token = 128008 '<|eom_id|>'
print_info: EOG token = 128009 '<|eot_id|>'
print_info: max token length = 256
load_tensors: layer 0 assigned to device CPU
load_tensors: layer 1 assigned to device CPU
load_tensors: layer 2 assigned to device CPU
The logs have literally zero mentions of GPU (Ctrl+F does not show any results for GPU)
Any help appreciated
it would?
i thought the complexity of it is typical of resume projects
uh, yes? are you talking about just drawing bounding boxes around signs, or translating whole communications into a written language?
just signs
not whole communications
what will the input be? images that are cropped around exactly one sign?
no i mean if i had a camera in the application and it could classify what i'm "saying"
just single letters though
you know that sign language isn't just letters, right?
though maybe i could have a system where it records the letters to be able to "type" messages
yeah but a lot of the full words are dynamic signs right
those i would have to figure out how to do
and that would be incredibly ambitious.
yeah that's why i thought of just letters
or maybe i should just try an autonomous drone (probably would be tougher but it's what i want to work with in the future)
how would you distinguish between signs that are letters and signs that are not?
well it would only pick up the sign if it's able to classify it into one of the letters right
Ok got it working but what gives ?
{'id': 'cmpl-b1422a3e-f391-4f18-813b-7d7466d574bc', 'object': 'text_completion', 'created': 1741033924, 'model': '../../models/DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf', 'choices': [{'text': "Q: Name the planets in the solar system? A: 1. Mercury 2. Venus . . . Hmm, wait, that's it? Only two? No, that can't be right. I must", 'index': 0, 'logprobs': None, 'finish_reason': 'length'}], 'usage': {'prompt_tokens': 14, 'completion_tokens': 32, 'total_tokens': 46}}
32 max_tokens and the question prompt were also default
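For what it's worth, `'finish_reason': 'length'` together with `'completion_tokens': 32` in the response above means the generation stopped because it hit the `max_tokens=32` cap, not because the model was done. A small sketch for checking that programmatically; `was_truncated` is a hypothetical helper, and the dict is a trimmed copy of the response shown above:

```python
def was_truncated(completion):
    """Return True if any choice stopped because it hit the token limit."""
    return any(
        choice.get("finish_reason") == "length"
        for choice in completion["choices"]
    )


# Trimmed copy of the response printed above.
completion = {
    "object": "text_completion",
    "choices": [
        {
            "text": "Q: Name the planets in the solar system? A: 1. Mercury 2. Venus",
            "index": 0,
            "logprobs": None,
            "finish_reason": "length",
        }
    ],
    "usage": {"prompt_tokens": 14, "completion_tokens": 32, "total_tokens": 46},
}

if was_truncated(completion):
    print("Output was cut off; consider a larger max_tokens.")
```

Raising `max_tokens` (or passing `None`, as the script's own comment describes) is probably the first thing to try, since a reasoning-style model like this one tends to spend tokens thinking out loud before it answers.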
any tips to reduce state space for RL QLearn? currently doing a biclustering algo but since its for a recommender system i need to guarantee that there's at least an instance of each row in at least one bicluster
since if a user isnt in a bicluster i cant generate recommendations
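This doesn't shrink the state space by itself, but the coverage constraint can at least be checked after clustering. A rough sketch, assuming (hypothetically) that each bicluster is stored as a `(row_indices, column_indices)` pair of sets:

```python
def uncovered_rows(n_rows, biclusters):
    """Return the row (user) indices that appear in no bicluster."""
    covered = set()
    for rows, _cols in biclusters:
        covered.update(rows)
    return sorted(set(range(n_rows)) - covered)


# Hypothetical example: 5 users, 2 biclusters.
biclusters = [({0, 1}, {0, 2}), ({1, 3}, {1})]
print(uncovered_rows(5, biclusters))  # [2, 4]
```

Any rows it reports would need a fallback, for example assigning them to the most similar existing bicluster or putting them in a catch-all one.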
yo this might be a dumb question but i genuinely don't know, if i'm implementing a neural network from scratch and use np.gradient, does that still count as done from scratch?
I don't think from scratch has a strong definition, but you probably are cutting out the hardest part of doing it from scratch by doing that
In order to truly do something from scratch, you must create the universe
so for the sake of learning it's better not to use it
Depends on how far you want to go then. Implementing the autograd yourself is common in school exercises
can somebody help me with a virtual environment issue via anaconda prompt. I am basically trying to import keras and tensorflow into my jupyter notebook but I keep getting errors that tensorflow keras cannot be found.
did u activate it?
see #1346312369277636650 if you want to help this person.
yes I activated my virtual env, I conda installed tensorflow and had to do it with keras too but still doesnt work
im using a mix of chatgpt and stackoverflow and im going insane lmao
r u using vscode or like the jupyter itself?
jupyter
can u try running conda list
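A common cause of this kind of error (not confirmed for this case) is the notebook kernel running a different Python than the environment the packages were installed into. A quick check that can be pasted into a notebook cell; `describe_environment` is a hypothetical helper name:

```python
import importlib.util
import sys


def describe_environment(package_names):
    """Report which interpreter is running and which packages it can import."""
    report = {"executable": sys.executable, "prefix": sys.prefix}
    for name in package_names:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, value in describe_environment(["tensorflow", "keras"]).items():
    print(f"{key}: {value}")
```

If `executable` does not point inside the activated environment, the kernel is using another interpreter, which would explain why packages that show up in `conda list` cannot be imported.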
yo guys do i need a virtual environment if i have anaconda
cause anaconda already comes with all the needed packages, so what else would i need a virtual environment for
my recommendation is to fully remove anaconda from your system and forget it exists.
@bright thorn I recommend closing your thread if you're not going to post the whole error message and you're receiving help here instead.
🤓
my internet is being weird with jupyter so im waiting for the error message to load up so i can ss it
you should always post text as text, not as screenshots. people often need to google parts of the error message, and it's unreasonable to ask them to retype it from a screenshot.
ok
wait when u say ur using jupyter r u using the desktop app or the browser one
that's odd tho
browser as in when I load it up, it appears on my google chrome
i'll say tho if u cant figure it out soon just make normal venv or use something like uv/poetry
wym north venv and also its for my class so my teacher is kinda insisting that i do it in a virtual env
anyway, anaconda was created when installing dependencies like numpy was nontrivial. but that's no longer the case. so in the vast majority of circumstances, anaconda just creates a barrier between data scientists and the rest of the python community, for absolutely no benefit.
I work for an AI company with over 10 thousand employees, and anaconda is banned at my company.
type shi man
like, they literally deleted anaconda from everyone's laptop one day. (except it wasn't on my laptop)
nice
akshually conda isnt just limited to python 🤓
@serene scaffold did you go to a prestigious school before getting your current job
im just curious bc im deciding on colleges rn
I went to an average state school.
gotcha
I wanna go to a school that has a mix of good student life but also good education obviously but I just dont know how I should view it honestly
if you're a hard-working and skillful person, and you go to an "average" university, it will be easier for you to get the special opportunities that are only available to the hardest-working and most skillful students.
fair
but I guess I just worry that even with hard work i won't be able to benefit from any special opportunities if i dont go to the right school
come to my alma mater, we good and in city environment
whats the alma mater
georgia tech
ok so, im chopped and i could not get in 💔
this year for college apps was so fucked
i could have prolly had a chance if i was born a diff year but like i barely got into schools and places where I got rejected/waitlisted I most likely would have had better chances
that's a legitimate concern. but it's not a secret what the special opportunities are. if you know what you want to specialize in, you should just email the admissions counselors and ask them what opportunities there are for your preferred specialty.
u can always transfer
ngl tho if u still in hs and u lock in rn and grind coding
u will become mbappe when u graduate regardless of where u at
well I mean my concern roots from the fact i wanna do computer science and or cyber security
im a senior gang
ot
oh wait*
i didnt read the entire message mb
u good bro
one of my biggest regrets was not grinding harder in covid cuz i had all the free time in the world to get skills up
yea I understand experience / practice is what helps but rn I kinda just have my hands full with my current class bc im learning development for ai models
if u learning development for ai models in hs bro u ahead of the curve fr
like i shit u not there were ppl in my graduating class that couldnt do i/o in any language u doing fine
idk I mean my teacher says shi like that but atp in this current time of computer science where imposter syndrome is common i just feel behind
i appreciate that, I guess I just doubted myself a lot in my capability due to the fact i didnt really get into my nearend reach schools
like umd was my dream school but i got rejected, which was crazy from the start since they only accepted 100 comp sci applicants but I was just too cocky id get in
u'll be fine bro, i spent the first two years of college just drinking and partying and ended up fine w multiple offers paying 200k+
ah u from md?
i appreciate that too, im trying to lay back a lot more and realize I have the grind mentality but i dont wanna drive myself into the ground
nah sadly i live in nj
u dont need to even grind hard bro, just be consistent and u will be fine
like main reason why ppl cant code or get a job coming out of college is that they dont code on a consistent basis
good point, I just like being ahead so I guess thats why i thought it was necessary
i mean u already r ahead, like im assuming u going to some place like rutgers but at least at georgia tech a ton of kids entered w/o ever coding a single line of code
true, its like scary hearing all the statistics but i never hear how the same mfs arent utilizing their degree so i just feel misled with how the market currently is
market isnt great but most ppl that didnt get job didnt really grind or apply themselves outside of classroom
i got into a diff branch of rutgers not main campus but Im considering west virginia u, penn state, loyola md, u delaware or rochester institute
u can just write rutgers university if u go there, if cost isnt an issue id prob go to penn state or rochester
but yea like i said this year every school practically had a crazy surplus of applicants which even led to rutgers lowering their acceptance rate to like 30
u'll be fine bro, the school does matter to an extent im not gonna lie to u esp for stuff like getting vc funding or very early stage startups but tech recruiting is mostly meritocratic so if u r qualified u will get a decent job
well preference wise the branch of rutgers i got into isnt hella safe + not much aid, and rochester i just cant seem to understand why i dont like it but why do u say that school
better engineering/cs program than the others except maybe penn state
like i wouldnt really consider places like west virginia and loyola md as like places that r super strong at cs or where research is a priority
penn state altho its lowk a party school it gets a lot of funding for research and also a lot of kids from my hs that went there didnt really have trouble finding a decent job given they were diligent
UMD's CS department is very selective--I live in DC, and I only know two people who have graduated from it.
(Maryland, DC)
but I also now have the same job as both of them 
damn u from the dmv too?
why
yeah im doing the real time sign language. from scratch would be pretty tough i cant lie so im gonna use cv libraries
my friend recently bought a pi and a camera module so we're gonna try make the sign language in real time work with the pi and stuff
Good day, I would be grateful for your help and advice. I tried to use YOLO models found on the internet to detect a dog in real time on the camera image. I failed because the dog and the surroundings have very similar colors. Is it a good idea to create my own model and train it on photos of my dog (Siberian husky)? I only care about the script detecting my dog in the selected square. Is training the dog model on a set of 300+ photos of my dog a better idea that might work? 🙂 I plan to use YOLOv8l. Thank you for the advice! best regards
umm any recommendation for which gemini-openai proxy to use
That is very general statement. It takes over 200 applications even when applying to the job to even get a job or interview. I know this since even after getting a 3.89 GPA for my Master's degree and research to back me up I had a lot of issues, and it does not help with current layoffs from government grants drying up for public and private.
😅 That is equally a very general statement
That's pretty expensive though, bro
pretty sure you can run yolo type models in real time on an rpi
oh, really?
rpi with hailo kit?
yeah I did it for my high school research
Well not exactly, but I found out you could while doing my research
rpi 4?
no, just an rpi
I couldn’t tell you, I didn’t do it, I just discovered you could while writing the RRL, because it was important for our research that you could run our program without expensive accelerator devices like the hailo kit and jetson nano etc
There’s an article out there from tensorflow (or maybe PyTorch) on this topic, im sure you can find it
ohhhh
because we might change it
OCR for license plates, vehicle classification, and a car counter
no idea how heavy the computation would be
3 in 1
Hi guys,
I'm making this pie chart but, as you can see, not all of the label names are visible. How can I show all the names using matplotlib?
Might need to change the orientation of the labels, or add a legend
Another option is to lump all of the very small ones into a single category
For that many categories, consider using a bar chart instead
use a bar chart
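The "lump the very small ones together" suggestion could look roughly like this. A minimal sketch in plain Python; the 3% threshold, the `"Other"` label, and the sample counts are all made-up assumptions:

```python
def lump_small_categories(counts, min_share=0.03, other_label="Other"):
    """Merge every category below min_share of the total into one bucket."""
    total = sum(counts.values())
    if total == 0:
        return {}
    lumped = {}
    other = 0
    for label, value in counts.items():
        if value / total < min_share:
            other += value
        else:
            lumped[label] = value
    if other:
        lumped[other_label] = other
    return lumped


# Made-up data: three large categories and three tiny ones.
counts = {"A": 50, "B": 30, "C": 15, "D": 2, "E": 2, "F": 1}
print(lump_small_categories(counts))
```

The result can then go to matplotlib as usual, for example `plt.pie(list(lumped.values()), labels=list(lumped.keys()))`, or `plt.barh(list(lumped.keys()), list(lumped.values()))` if the bar chart route is taken.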
dang, banned is wild
well then i think i'll uninstall it and set up a venv
wait how does it create a barrier though
because it creates a different workflow for environment management for, in almost all circumstances, no added benefit.
nice
oh i see
yeah i think i'm gonna use a framework as well after doing this one implementation from scratch just for learning purposes
i don't know if anaconda was the problem, but after getting rid of it my issue with git lfs was fixed
i had git lfs installed and initialized but it just wouldn't work and continued to give the large file error
now it's fixed though
nevermind i got my hopes up too soon
i really thought
that push was taking an outrageously long time compared to normal so i thought it was working, but after waiting literally almost 10 minutes it finished loading and gave me the same large file detected error
unfortunate
Hey everyone this is my first big project in AI:
https://github.com/sashaheadshot/12-lead-ECG-classification
just wanted to share it here. 🥲
yeah fairs
we managed to get it running on the pi
but my neural network did not work lmaoo
it was so inaccurate icl i think it was j randomly guessing
hi guys
i hope this is the right channel
when i create a .txt doc, enter the code and rename it to .py, the file opens a black window, then it shuts down. here is the code:
from openai import OpenAI
# Step 1: Initialize the OpenAI client
client = OpenAI(api_key="your-openai-api-key")
# Step 2: Create a chat completion
response = client.chat.completions.create(
    model="gpt-4",  # Use "gpt-4" or "gpt-3.5-turbo"
    messages=[
        {"role": "user", "content": "write a haiku about AI"}
    ]
)
# Step 3: Print the response
print(response.choices[0].message.content)
i've already installed openai with cmd
if someone can help...
does it give any error messages or just the doc is empty and shuts down?
a black cmd window pops up and disappears
and it does that a lot, also with other py files
I think the issue is with the api_key declaration
That's not the command i dont think
i followed the steps on the openai website and they gave me the key, i just replaced "your-openai-api-key" with it
idk then
do you think there could be a link with azure cloud shell?? cuz i've seen some malicious stuff on my laptop for a while now idk
this situation doesnt look ideal for a pie chart imo
use a diff form of visualization cuz a pie chart is not apt for it
ahh good one, I think you can also write a blog on this so that it will be more informative
Thank you! I actually published a paper. Got accepted by iccae, will be online at the end of this month i think.
ohh nice, share that as soon as its published will love to read that
Sure thing!
lol
literally just a perceptron at that point, no backpropagation
does tqdm work in jupyter notebook
Yes
Congrats
import tqdm.notebook.auto instead of just tqdm
not that just tqdm doesn't work, but it might give you a warning
getting inference fast enough for real time seems hard af from scratch or more effort than needed
yeah that's why i'm only doing it for my first one which is just classifying with the mnist dataset
nothing in real time
oh ye thats p doable
i get an error message saying no module named tqdm.notebook.auto
and it doesn't work when i import tqdm.notebook
it's probably fine though cause when i just import from tqdm everything works fine
nevermind the tqdm bar that is working is kinda janky, first time it seemed fine but now it prints each number separately instead of replacing
haha yeah
currently retraining it tho!
on like 6000/220 000 but hopefully by morning it will be done
holy
damn that's crazy
it takes that long to train?
well im definitely not an expert im pretty new to nn's
so icl my code is probably pretty inefficient
my code is definitely inefficient asl lol
it does around 100 images/second
it takes like 20-30 seconds to set up a perceptron with the mnist dataset
for a macbook i cant complain that much
and it's just 2 hidden layers
but yeah my friend overclocked his cpu with a 6700xt or something like that so we're gonna brute force it tmrw anyway
30 seconds!?
how?
i honestly don't know
can i see ur code lol
wait let me check again, i might be exaggerating a bit
hah okay
ok yeah it was 20
damn mines like 5 im pretty sure
here's my code by the way
what libs are you using
numpy and pandas
pandas is only to parse the data though
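```
import numpy as np

# `data` is the MNIST table parsed with pandas: column 0 is the digit label,
# the remaining 784 columns are the pixel values.
nums = data.iloc[:50000, 0].to_numpy()
input_data = data.iloc[:50000, 1:].to_numpy()
output = np.zeros((50000, 10))
expected_output = np.zeros((50000, 10))
for i in range(50000):
    # Feed one sample at a time through both hidden layers and the output layer
    hidden_layer_one = hl_one_feed_forward(input_data[i])
    hidden_layer_two = hl_two_feed_forward(hidden_layer_one)
    single_output = output_feed_forward(hidden_layer_two)
    output[i] = single_output
    expected_output[i][nums[i]] = 1  # one-hot encode the expected digit
```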
```
def hl_one_feed_forward(activation):
    # Note: the weights and biases are re-drawn at random on every call
    weight = np.random.randint(low=-5, high=5, size=(15, 784))
    bias = np.random.randint(low=-50, high=50, size=(15,))
    # np.float128 is not available on every platform
    neuron = (np.matmul(weight, activation) + bias).astype(np.float128)
    sigmoid_neuron = 1 / (1 + np.exp(-1 * neuron))
    return sigmoid_neuron
```
oh wow thats more low-level than me to be fair
and then i have a method for hidden layer two and the output layer as well, pretty much the same as this one
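For comparison, here is a sketch of the same forward pass done on the whole batch at once, which is probably where most of the 20 seconds would go away. It is not the code from the chat: the layers are created once instead of on every call, the second hidden layer size (15) is an assumption, and random stand-in data replaces the MNIST pixels and labels:

```python
import numpy as np

rng = np.random.default_rng(0)


def init_layer(n_out, n_in):
    """Create one layer's parameters once, so they can be reused and trained."""
    weight = rng.standard_normal((n_out, n_in))
    bias = rng.standard_normal(n_out)
    return weight, bias


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def feed_forward(batch, layers):
    """Push a whole (n_samples, n_features) batch through every layer at once."""
    activation = batch
    for weight, bias in layers:
        activation = sigmoid(activation @ weight.T + bias)
    return activation


# Assumed sizes: 784 inputs, two hidden layers of 15 neurons, 10 outputs.
layers = [init_layer(15, 784), init_layer(15, 15), init_layer(10, 15)]

# Stand-in data; in the chat these would be the parsed MNIST pixels and labels.
input_data = rng.random((1000, 784))
nums = rng.integers(0, 10, size=1000)

output = feed_forward(input_data, layers)   # shape (1000, 10)
expected_output = np.eye(10)[nums]          # one-hot rows, shape (1000, 10)
print(output.shape, expected_output.shape)
```

Indexing `np.eye(10)` with the label array builds all the one-hot rows in one step, which replaces the per-sample assignment in the loop.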
yeah fair enough
yeah this is my first time working with neural networks so
starting with just basics
to be honest same lol
ive only done the number recognising one before and thats basically a tutorial
how did you choose the number 50,000 btw?
there's 59,999 training examples
and the book i'm going through says i should use 50k and then save the rest 10k for hyperparameter tuning later
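The 50k / 10k split described here can be sketched as below. This is only an illustration with small stand-in arrays (600 rows instead of the full dataset), and the shuffle is an added assumption; the book's exact procedure isn't shown in the chat:

```python
import numpy as np


def train_validation_split(features, labels, n_train, seed=0):
    """Shuffle once, then keep the first n_train rows for training."""
    order = np.random.default_rng(seed).permutation(len(features))
    train_idx, val_idx = order[:n_train], order[n_train:]
    return (features[train_idx], labels[train_idx]), (features[val_idx], labels[val_idx])


# Stand-in arrays with MNIST-like columns, scaled down to 600 rows.
features = np.zeros((600, 784))
labels = np.arange(600) % 10
(train_x, train_y), (val_x, val_y) = train_validation_split(features, labels, n_train=500)
print(train_x.shape, val_x.shape)  # (500, 784) (100, 784)
```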
i'm trying to do it with as little guidance as possible and only learning theory from this book
not gonna even peek at the tutorial part lol
the one i'm doing is the numbers one
not the sign language one
yeah fair enough
haha i love it
ohhh sorry my bad
nah all good
i did that one
i'm not sure what's with the number though
and i made a little twist which you could try
59,999
all it took was one more training example lol
haha
oh what's the twist
basically i made a little gui and a really basic pen tool using pygame (dont judge) and then you can draw a number in a square it will resize it to 28 by 28 (for the 784 parameters) and then guess the number
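The resize step could be done with plain numpy block averaging, something like the sketch below. It assumes a square grayscale canvas with values 0 to 255 whose side is a multiple of 28 (for example 280x280); how the pixels get out of pygame is not covered here:

```python
import numpy as np


def canvas_to_mnist_input(canvas, size=28):
    """Downsample a square grayscale canvas to size x size by block averaging,
    then flatten it to a vector of size * size values in [0, 1]."""
    canvas = np.asarray(canvas, dtype=np.float64)
    side = canvas.shape[0]
    if canvas.shape != (side, side) or side % size != 0:
        raise ValueError("canvas must be square with a side divisible by size")
    block = side // size
    # Group pixels into (size x size) blocks and average inside each block
    small = canvas.reshape(size, block, size, block).mean(axis=(1, 3))
    return (small / 255.0).reshape(-1)


canvas = np.zeros((280, 280))
canvas[100:180, 130:150] = 255  # a crude vertical stroke
vector = canvas_to_mnist_input(canvas)
print(vector.shape)  # (784,)
```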
it makes it much more fun to test
oh dang that's super cool
i was deciding to make it into a webpage where i could input an image which would be resized to 28x28 and then it would guess
i honestly have never even heard of what pygame is lol
there's a lot of things i should know but don't
kinda scary how far behind i am
its just a library for making games lol
haha you're not dw
python has so many different libraries
i struggle with git lol
for example ive been coding for probably over half my life and ive never used pandas before
and setting up venvs
tbf who doesnt initially
when it comes to doing stuff in ubuntu i always have claude open telling me what to do lol
haha
maybe the best way to create ai is to use it to create itself!?
hopefully i'm not too cooked when i go to uni
how old are you if you dont mind me asking
or you can reply in dms if thats easier
17
ah cool, UK?
oh no im from uK
lol america seems so much cooler
nah lol
actually i don't know what it's like after high school yet
but high schools here suck
suck how, are they like easier?
that's part of it, but the bigger thing is that it wastes a lot of time
yeah but an education is an education, no?
do you have any other options as an american
every system has its flaws
true
not just that though, just kinda hard to find people who are serious
seems like everyone at school is always on some form of tomfoolery
I am trying to make a chart similar to this but as an added bonus, I would like to modify the bars to modulate their width kind of like a violin plot. the data itself is the first and last time someone talked in a server and the modulating width would be probably a rolling average of chat activity month over month. I am trying to do it in matplotlib but I am really having a hard time making a violin chart with the kind of modulation that I want to begin with, let alone with multiple horizontal bars. @fleet latch for the cross channel ping
Try from tqdm.autonotebook import tqdm
same
thats true everywhere you go
although tbf in uk highschool you narrow down subjects a lot more earlier which is good i guess
it still breaks up the progress bar
though no errors though
is this just how it is with jupyter notebook?
oh that's cool
Hello chat. I'm planning to learn Generative AI. I am a freshman undergraduate at a university.
I don't know if I should learn machine learning and Deep learning before starting Generative AI
The problem I have with machine learning and Deep learning, I don't know where to start. If i start at basics I always have "learn X before Y" and "learn Y before Z"
Generative AI is an application of deep learning.
You should start with basics and work your way up
Of course, first you need to learn the basics. Deep learning is a basic you need before learning GAI
learn stats and linear algebra b4 u even think about going deep into gen ai otherwise u gonna struggle
why'd you if it doesn't align with your work. i had to learn it cuz of school lol didn't think it would be useful in the near future
anyone installed conda in WSL2 for tensorflow setup?
you can pretty much learn what you need to along the way though right
like if you just get started on a basic neural network
that's what i'm doing, outside of just basic matrix and vector operations and partial derivatives i knew nothing when i started
Hi everyone,
Has anyone managed to set up OAuth2 with Casdoor in Dify? I'd really appreciate it if you could share your experience or any guidance. Thanks in advance!
Is it not possible to run threads in streamlit?
Sure but I’d still dedicate time to learn it well
Send an inbox for assistance @compact merlin
please don't try to divert help to DMs. if you want to help this person with authentication, please go to #cybersecurity
Hi. I do not know anything about llm and ai development. Right now i am building a chess game so I can learn ai or llm in the future. My main goal is to learn ai by building a chess ai. Can you share some topics I'll possibly need to learn on this journey?
yo guys don't tell me the actual best way as i need to figure it out on my own, but if i'm setting up a neural network from scratch, once i have set up the perceptron and am setting up the gradient vector, should i get a np array of every single weight matrix and bias vector
50k samples and 2 hidden layers before the output layer so both of these np arrays would be of size 150k which is kinda wild
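For sizing only (not a hint about how to do the backprop): the gradient has one entry per trainable weight and bias, so its size depends on the layer sizes and not on the number of training samples. A quick count, assuming layer sizes of 784, 15, 15 and 10 (the second hidden layer size is a guess, since it isn't shown in the chat):

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network."""
    weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights, biases


weights, biases = count_parameters([784, 15, 15, 10])
print(weights, biases, weights + biases)  # 12135 40 12175
```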
are u not using autograd, also u shouldnt have that many arrays
they're intentionally not using autograd
i mean its troll af to not implement backprop using autograd imo
why not saying purpose of search?
is it really needed? as i know conda can install linux packages for windows without problems
yeah tbf if there's no need for it then there's no need for it lol
Wise words 💀🙏🏻
isn't it better to implement my own gradient method though
to better understand how it works
i mean autograd is the algo that is used for most modern deep learning and will always be relevant
i mean just for now though
after this implementation i'm probably gonna be using frameworks for everything
well autograd isnt a framework tho its just an extension of chain rule and viewing the gradients with respect to the computation graph of loss
yeah but that's all of backpropagation in one method right
which would defeat the purpose of implementing it from "scratch"
i'd disagree, and in terms of like theory it allows for a greater degree of complexity and understanding modern deep learning
!d numpy.gradient
```
numpy.gradient(f, *varargs, axis=None, edge_order=1)
```
Return the gradient of an N-dimensional array.
The gradient is computed using second order accurate central differences in the interior points and either first or second order accurate one-sides (forward or backwards) differences at the boundaries. The returned gradient hence has the same shape as the input array.
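In other words, `np.gradient` takes finite differences of values that are already sampled in an array; it does not differentiate a loss with respect to weights. A short illustration, plus a hand-rolled central-difference checker (`numerical_gradient` is a hypothetical helper, not a numpy function) of the kind that can be used to sanity-check a hand-written backprop:

```python
import numpy as np

# np.gradient differences neighbouring samples of an array.
samples = np.array([1.0, 4.0, 9.0, 16.0])  # x**2 sampled at x = 1, 2, 3, 4
print(np.gradient(samples))  # [3. 4. 6. 7.]


def numerical_gradient(f, params, eps=1e-6):
    """Approximate d f / d params one parameter at a time (slow, for checking only)."""
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step.flat[i] = eps
        grad.flat[i] = (f(params + step) - f(params - step)) / (2 * eps)
    return grad


def loss(w):
    return np.sum(w ** 2)


w = np.array([1.0, -2.0, 3.0])
print(numerical_gradient(loss, w))  # close to 2 * w
```

The checker needs two loss evaluations per parameter, so it is only practical on tiny networks, but it gives an independent number to compare an analytic gradient against.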
a wise individual here says to not use conda
better to set up a venv and install packages using pip
but from tensorflow 2.11 on, the gpu version on windows is only supported through wsl2
can a venv have two cuda versions?
I think I wrote it right?
U could always just build from source if that becomes an issue
i tried to code an image captioner but it seems to overfit, like it outputs the same sentence for all inputs, the output sentence is a grammatically correct but completely out of context sentence, i tried everything but i couldnt get to fix it someone help me pls
btw when i train it it first starts with gibberish, then transforms to the output sentence that i get all the time
wait yo how would this even work
cause how would autograd know what variables to take the partial derivatives with respect to
cause there's chain rule
u do it based off the loss node, basically the idea is given the loss function u take the reversed topological sorting and use that to determine the order to find gradients
this repo covers it p well in terms of code imo:
https://github.com/karpathy/micrograd
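To make the "reversed topological sorting" idea concrete, here is a heavily cut-down sketch in the spirit of that repo. It is not micrograd's actual code: it only supports `+` and `*` on scalars, and the class and attribute names are just illustrative:

```python
class Value:
    """A scalar that remembers which values it was computed from."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topological order: every node comes after the nodes it depends on.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # d(loss)/d(loss)
        # Walking the order in reverse applies the chain rule from the loss
        # back to the inputs.
        for node in reversed(order):
            node._backward()


a = Value(2.0)
b = Value(3.0)
loss = a * b + a
loss.backward()
print(a.grad, b.grad)  # 4.0 2.0
```

This also answers the "how would it know which variables" question: it doesn't need to be told, because every value that contributed to the loss is reachable by following `_parents` from the loss node.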
topological sorting?
damn i haven't gotten that far in dsa
that's a part of dsa right
also the approach i mentioned earlier, would it work?
while it would use some pretty large arrays, the method itself probably isn't too hard since everything is gonna be plugged into one of two derivatives (chain rule of derivative with respect to either weight or bias)
this is my training function:
```
# Create function to fit and score models
def fit_and_score(models, x, x_val, y, y_val):
    np.random.seed(69)
    model_scores = {}
    for name, model in models.items():
        model.fit(x, x_val)
        y_predict_proba = model.predict_proba(y)
    return roc_auc_score(y_val, y_predict_proba)
```
I'm getting a value error when both x & y, and x_h1n1, y_h1n1 have the same shape
nvm i'm dumb
need to pass y_predict_proba[:,1] into roc_auc_score
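For anyone hitting the same error, here is a sketch of the function with the `[:, 1]` fix applied. The argument names (`X_train`, `y_train`, `X_val`, `y_val`), the per-model score dictionary, and the toy data are my own assumptions, not the original poster's setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def fit_and_score(models, X_train, y_train, X_val, y_val):
    """Fit each model and return {name: validation ROC AUC}."""
    model_scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # predict_proba returns one column per class; keep the positive class
        proba_pos = model.predict_proba(X_val)[:, 1]
        model_scores[name] = roc_auc_score(y_val, proba_pos)
    return model_scores


# Toy data (made up for illustration): the label depends on the first feature
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
scores = fit_and_score(
    {"logreg": LogisticRegression()}, X[:150], y[:150], X[150:], y[150:]
)
```

Returning after the loop (instead of inside it) means every model in the dict gets a score, not just the first one.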
pls help me with this im desperate
Can you explain exactly what you imagine what will happen when you "connect AI to visual studio"?
If you're asking how to use a code completion assistant--which is very incredibly more specific than "AI"--that would be a question for #editors-ides
yo i'm sorry but i sometimes find it funny when people say stuff like that
like "connect an ai with visual studio"
i know they probably are just unaware and have yet to learn more but
my bad for this 💀
i really shouldn't laugh, there's a ton of stuff i don't know either
The way I said all of that probably came off meaner than I intended. But I legitimately didn't know what they were asking at first, and I'm irritated by how I'm increasingly expected to know people are talking about generative AI when they just say "AI"
I haven't heard of the kind of thing that you're talking about.
nah i get that, it's definitely irritating, but also that's why i find it funny
unless someone's asking something like that to me directly, then yeah it irritates me too
i think the dude genuinely just didn't know how vast "AI" is, just like the majority of the population
the term has become more of a marketing buzzword than a field of technology
but with the way they were speaking, i presume he was just referring to copilot lol
I gave a guest lecture in an MBA recently and I noticed all of the students believed all of AI was GenAI. I think that’s not a battle we can win
Did you ask them what they thought self driving cars were?
I did actually
Also gratz on being a guest lecturer
But MBA 🤮
I asked them what they believed AI was, all of them said gen ai related stuff and then I showed all interactions I had with ml/ai that morning, most of them obvs not gen ai
My colleague has a Tesla so I brought it up haha
And does he regret buying it given present circumstances?
I haven’t asked but all my management has anarcho-capitalist vibes so probably not 🫠
Welcome to the club where some of us are still on the classic definition of "rational agents," which at this point is long forgotten.
That’s a fight hardly worth fighting haha
At some point a bunch of people will be on "only generative AI is AI," and will be left behind too.
Maybe it will cycle (like fashion).
so when will we be able to create companies using ai?
what did they say
for real
i mean mba i guess is chill to an extent, depending on where you're going into
but bba 🤮
mba is just an extension of that but the degree can be useful for some cases, i would rather just get a masters in my field of interest though
most business majors don't know anything but yapping
i need personal help on a project if interested email me sstalion67@gmail.com
you know what battle we can win though? making companies believe they can replace business "professions" with "AI"
yo chat, can I like scrape data with the Twitter API (free tier), reading the tweets under specific "target words", and use the data to fine-tune an llm? and then if a user inputs a similar "target word" the model should fetch like the last 1 week's tweets with the keywords and provide insights based on that, is it possible?
what do you want the LLM to be able to do more effectively after the fine-tuning? Be as specific as you ever possibly can, with examples.
if you just want the LLM to be able to give you a summary of a set of tweets, fine-tuning it on those tweets won't help you.
Let's say new movie releases, I want the llm to go read like past 1 or 2 days tweet on "movie title", and say it is worth watching in theatre or not
you should not do any fine-tuning for this.
you just need to have all the tweets that the LLM can consider in a database that you query when the user asks a question, and then pass those tweets to the LLM along with the user's prompt.
so we like give a live-feel to it? like past 1 or 2 days with APIs? I mean if we storing in database, it'll not be constantly updating weekly with new tweets right.
you periodically update the database. you do not give the LLM a live feed. you only give anything to the LLM when the user prompts it.
periodically update in the sense like manually update on database?
whenever there's more tweets that you want the LLM to be able to use, you put them in the database.
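A minimal sketch of the "database plus prompt" flow described above, using only the standard library. The table layout, the helper names `recent_tweets` and `build_prompt`, and the sample rows are all made up for illustration. Fetching tweets from the API and calling the LLM are left out on purpose:

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def recent_tweets(conn, keyword, now, days=2):
    """Return stored tweets mentioning `keyword` from the last `days` days."""
    cutoff = (now - timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT text FROM tweets "
        "WHERE created_at >= ? AND text LIKE ? ORDER BY created_at",
        (cutoff, f"%{keyword}%"),
    )
    return [text for (text,) in rows]


def build_prompt(question, tweets):
    """Put the retrieved tweets in front of the user's question."""
    context = "\n".join(f"- {t}" for t in tweets)
    return f"Tweets:\n{context}\n\nQuestion: {question}"


# In-memory database with made-up rows, standing in for a store that a
# separate job updates periodically
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (created_at TEXT, text TEXT)")
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?)",
    [
        ((now - timedelta(days=1)).isoformat(), "Some Movie was great in theatres"),
        ((now - timedelta(days=9)).isoformat(), "Some Movie trailer looks meh"),
        ((now - timedelta(hours=3)).isoformat(), "unrelated tweet"),
    ],
)

tweets = recent_tweets(conn, "Some Movie", now)
prompt = build_prompt("Is Some Movie worth watching in the theatre?", tweets)
```

The string in `prompt` is what would be sent to whichever LLM is used. Nothing here requires fine-tuning.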
lemme give you another example (just trying to understand if it's possible to do). Let's say a new product releases, and I make a model which tells what to improve based on tweets from users over a 1 week timeline, like if the tweets say the user is experiencing lags, the model should suggest something like "you need to fix server lag to boost user retention". for this I need to fine-tune an llm, right?
and it'll be cool if the model can retrieve tweets in a given time frame after the user enters the product name or company name, like "Zoom" or something like that. just curious if it's possible to make such models
the thing that an LLM does is generate text that comes after initial text. For interactive LLMs, the thing that the user says is the initial text.
If you want it to start generating answers to questions differently than it does currently, you need to train it on examples of user interactions where it answers those kinds of questions. you need to have tons and tons of training data to do this.
YES, that's what I mean
to be clear: you need so many tons of training data to do this, that it's almost always better to embed more information in the prompt.
usually how many viewmodels does an app have?
Let's say I'm able to train such model. can I do the Twitter posts info retrieval in real time?
can you be more specific? viewmodels? app?
viewmodels ie the thing after repositories
if the shadow president hasn't shut off the twitter API, you can add new tweets to your database in real time.
and app application
I still don't know what you're talking about. is there a library you're using?
hi, so i am a beginner to all this and i thought of making a simple chapter-wise question paper classifier. it takes in a pdf made out of images of pyqs of a particular subject and classifies the questions chapter-wise, given the list of chapters in that subject. easy, right? NO, it isn't.
While trying to improve it, I encountered several challenges:
- My initial prototype worked because the physics paper was text-based without diagrams, but...
- The past year question (PYQ) papers are only available as scanned images.
- Converting these images to text is complex because they contain diagrams, chemical structures, and scientific notation that regular OCR cannot process.
- I considered using MathPix, but it wouldn't be effective for chemistry papers.
- Another challenge is handling bilingual content—questions appear in both Hindi and English. Even if I separate the English text, processing mathematical notation in the options remains problematic.
This is just the data preparation stage. After that:
- The text needs to be processed through an LLM for classification.
- Scientific notation needs conversion from MathPix's LaTeX format to readable text.
- Finally, the text needs formatting and PDF conversion.
So at this point i feel like it isn't worth it, as the outcome isn't that valuable compared to the effort it's taking. should i quit this project, and how do i build something useful that is actually worth it?
also im new to all this just learned python last year im ok with pandas and numpy too dont know dbms and flask will work on it after exam.
wait so does it automatically store tweets at set intervals, or does it only update when the user requests a "target word", grabbing new tweets with that term? (and thanks for being patient with my endless questions about this tweet-y business! 😅 you're a legend⚡ )
up to you.
thankusomuchhh
pls
can someone pls help me
nobody can help you until they know what the question is
@serene scaffold here it is
I'm not sure how to help with that, but hopefully someone else does
@serene scaffold yea thanks btw are you the person who made the pokemon card game neural network?
no
@serene scaffold oh ok
this is what chatgpt made 😭
4 knobs, let's start with 2 values each:
A: 1, 2
B: 64, 256
C: 0, 1
D: 8, 1024
So, 16 input combinations in total, each corresponding to a scalar output value. I want to see what effect changing D from 8 to 1024 has. How to visualize? My intention was to pair up each input combination that has identical ABC inputs, compare the output of those two, and plot the resulting 8 values. Is there any plot that can help me do this grouping? I have the data in a pandas frame currently and just started using seaborn today. Catplot seems interesting, but can't figure how to utilize it for this. Any input?
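One possible way to do the pairing described above, as a sketch: pivot so that each (A, B, C) combination is a row and the two D values are columns, then subtract. The column name `out` and the formula that generates it are placeholders I made up. Only the knob values come from the message:

```python
from itertools import product

import pandas as pd

# Made-up outputs for the 16 combinations; replace with the real frame
rows = [
    {"A": a, "B": b, "C": c, "D": d,
     "out": a + b / 64 + c + (2.0 * a if d == 1024 else 0.0)}
    for a, b, c, d in product([1, 2], [64, 256], [0, 1], [8, 1024])
]
df = pd.DataFrame(rows)

# One row per (A, B, C) combination, one column per D value
wide = df.pivot_table(index=["A", "B", "C"], columns="D", values="out")

# Effect of changing D from 8 to 1024, with A, B, C held fixed
wide["delta"] = wide[1024] - wide[8]
```

`wide["delta"]` then holds the 8 paired differences. Something like a bar plot of that column, labelled by the (A, B, C) index, is one way to look at them.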
That's not bad for a model that fundamentally does not understand geometry at all
It's going to take most jobs because most jobs are bullshit jobs. And that means it'll drive down wages - it'll hollow out the middle class completely
hold on lemme check the version as well
Train it on the outputs of everyone and it'll be better than 50% of the people
GPT-4o
Ah. This place looks fun
which AI is best for math and physics
these kinds of messages need to be restricted
as in they gotta be against the rules of this channel
Can you explain what you think "AI" is?
when doing backpropagation for a simple feed forward neural network, i should set up a vector such that each term is the partial derivative of cost with respect to either a weight or bias, then once i have the vector, i should add the terms to their corresponding weights and biases and then check if it increases the cost or decreases it, then if it increases i subtract it instead and if it decreases then i go with adding it right (in either case i would multiply the vector by a learning rate that i choose depending on how far the vector is from 0)
honestly i'm kinda lost with backpropagation, not even the goat 3blue1brown could save me
and if i'm having trouble understanding 3blue1brown then i'm definitely fried
yo chat, where to find metal scrap image data, as in like the image should have minimal rust in it, which should be recyclable, i want to know how or to where to find source (paid or free)
what you calculate is the positive gradient, it will always tell you the direction to go to increase the cost. We then multiply that by -1 to get the negative gradient which tells us which way to go to decrease the cost. No need for conditionally increasing or decreasing.
so then what's the learning rate
also yeah i realized how messed up my approach was earlier lol
now i realized though, i need to calculate the gradient by backpropagating to get the partial derivative with respect to each weight and bias
how much you multiply the gradient by when you subtract it from the weights/biases
and that learning rate is like the step we're taking right
yes, it's one of the factors included in how big a step you take
whatever it starts out as, from there we decrease it as we get closer to our minimum
i'm still failing to understand how to actually set up backpropagation
if you're just learning about implementing backprop don't worry about decaying the learning rate, just consider it a constant step size
yeah the learning rate part is pretty straight forward anyways, but my goal is to implement a full neural network
a pretty basic one though
it's for the mnist dataset
so essentially when we calculate the partial derivatives (find derivative and plug in the relevant values from what we have), that gives us the gradient right
but our gradient is what has the steepest slope
i don't see how we get the steepest slope from this everytime
a single partial derivative will tell you how the function changes when you tune a single parameter, if your partial is positive you know you need to go down to decrease the cost and vice versa. The gradient is just what we call all of those partials together in a tensor (or set of tensors)
i see, so it's not necessarily the direction that has the steepest slope
it's the slope of the cost function at a single point with respect to every trainable parameter in the model
so pretty much what we do is we compute that gradient, then check to see if it increases or decreases our cost when we add it right
the positive gradient (what you calculate using the chain rule) tells you the direction you need to go to increase the cost
so you just subtract those parameters instead and decrease the cost
no need to consider what the gradient actually says
oh so that gradient will add to the cost
no matter what
and so we always just multiply it by the learning rate and then subtract it from our vector of weights and biases
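That update rule as a small numpy sketch. The quadratic cost and its `target` are a toy stand-in I picked so the gradient can be written by hand. In a real network the gradient would come from backprop instead:

```python
import numpy as np

target = np.array([1.0, -2.0, 0.5])  # hypothetical minimum of the toy cost


def cost(params):
    return float(np.sum((params - target) ** 2))


def gradient(params):
    # Partial derivative of the cost with respect to each parameter
    return 2.0 * (params - target)


params = np.zeros(3)
learning_rate = 0.1  # constant step size, no decay
history = [cost(params)]
for _ in range(50):
    # Always subtract: the gradient points toward higher cost
    params = params - learning_rate * gradient(params)
    history.append(cost(params))
```

No "try adding, then check" step is needed. With a small enough learning rate, subtracting the scaled gradient lowers the cost at every step of this toy problem.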
Whats a good book to master python from knowing basics
i heard about Fluent Python by luciano is that a good one or is there better
Like to write efficient good code + basically know inside out of python
i'm thinking i should store all the weights and biases into one vector, and then have the gradient be in the same format so that they correspond to one another and i can just directly subtract, that's a good idea right?
i think the best way to master python after you learn the basics is to just practice
no book will replace doing practice problems and projects
should I fine tune CLIP on dataset or instead use cross attention to improve accuracy on image captioning task ?
optimizationally yes, as far as making it easier to work with that might add a bit of extra struggle in the implementation
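A sketch of the flat-vector idea and the bookkeeping it adds. The layer shapes are for a hypothetical 3-4-2 network, and `pack` / `unpack` are names I made up:

```python
import numpy as np

# W1, b1, W2, b2 for a hypothetical 3-4-2 network
shapes = [(4, 3), (4,), (2, 4), (2,)]
sizes = [int(np.prod(s)) for s in shapes]


def unpack(flat):
    """Split a flat parameter vector into arrays with the layer shapes."""
    parts = np.split(flat, np.cumsum(sizes)[:-1])
    return [p.reshape(s) for p, s in zip(parts, shapes)]


def pack(arrays):
    """Flatten per-layer arrays back into one vector, in the same order."""
    return np.concatenate([a.ravel() for a in arrays])


rng = np.random.default_rng(0)
flat_params = rng.normal(size=sum(sizes))
flat_grad = rng.normal(size=sum(sizes))  # stand-in for a real backprop gradient

# With both in the same layout, the update is a single subtraction
flat_params_new = flat_params - 0.01 * flat_grad
W1, b1, W2, b2 = unpack(flat_params_new)
```

The update itself gets simpler, but the forward and backward passes still need the per-layer shapes, which is where the extra struggle tends to show up.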
Hello guys I am new to ai can anyone recommend any sources from where I can start learning
The gradient is largest when applied to a vector in the same direction as the gradient
Or well the gradient is what it is but you don’t have to move in that direction
But the function changes by the most if you do
(So instead you move the opposite direction)
first make sure you're good in python, learn pandas and numpy, and then take courses on langchain on deeplearning.ai, that's what i did
Also just to clarify because don’t think I explicitly said it— the gradient is the direction with the steepest slope
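A quick numerical check of that claim, as a sketch. The function `f`, the point, and the 1-degree grid of directions are arbitrary choices of mine. The slope along a unit direction is the dot product of the gradient with that direction:

```python
import numpy as np


def f(p):
    return p[0] ** 2 + 3.0 * p[1] ** 2


def grad_f(p):
    return np.array([2.0 * p[0], 6.0 * p[1]])


point = np.array([1.0, 1.0])
g = grad_f(point)
g_unit = g / np.linalg.norm(g)

# Try unit directions all around the circle, one per degree
angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# Directional derivative of f at `point` along each direction
slopes = directions @ g
steepest_up = directions[np.argmax(slopes)]
steepest_down = directions[np.argmin(slopes)]
```

`steepest_up` lines up with the gradient and `steepest_down` with its negative (up to the grid resolution), which is why gradient descent steps along the negative gradient.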
Greetings everyone. I have a very simple question, yet so tricky. In the attached figure I created an end-to-end workflow of model selection and evaluation. In principle, there's nothing wrong with it, but the first split (to train/test sets) introduces bias although I bootstrap the test set in the final evaluation module. My question is: how can I make it robust? If I repeat the exact same workflow 1000 times, I'll get a new "best configuration" each time and that's not what I want. I work with tabular data of 835 samples for a binary classification task with 80 features and a 65-35 % class imbalance.
Someone could argue that the 1st split (train/test) basically corresponds to reality and whatever the result is this should be reported.
Others could argue that the 1st split (train/test) might lead to poor scoring for the best configuration hence split was done once and sets might not be ideal.
Hi, I'm trying to understand the use of numpy a bit better (in relation to ai/ml) and I often see something like: img = np.random.randint(<something>); img.transpose((2, 0, 1)); and I'm trying to understand what the transpose() call really does.
From my understanding, in this case, it converts the pixels which are stored in memory like [r0,c0, r0,c1 ....] (e.g. <r>=row, <c>=column, pixels are stored packed per row), into a planar format where the R, G and B values are stored as separate planes. Is that correct?
simply put, it's reordering the axes of your data. using your example, if the axes are originally ordered [rows, cols, channels], then img.transpose((2,0,1)) makes it so that they are now [channels, rows, cols]
whenever possible, this is done only by changing the stride and not by rearranging the entries in memory
Thanks! So the data is already planar? Instead of packed?
idk what you mean with "as separate planes"
since numpy falls back on c to handle memory and what not, what is done is that the full required memory (rows * cols * size of each element) is allocated as a contiguous chunk. you then have two options: store the data in row major, or column major order. numpy uses row major by default
maybe this helps
Ok I understand, I'll try to explain better.
Normally when you load/decode an image like a png or jpg, the pixels are stored in memory as r0,g0,b0,r1,g1,b1,r2,etc. However from what I've learned so far, the AI/ML world uses pixels where they are stored/extracted per channel, like: r0,r1,r2,... for red, then g0,g1,g2,... for green, and b0,b1,b2,... for blue.
how you index them and how they are stored in memory are two separate things, btw
you usually don't need to care about how they are stored in memory
In my case it is important how they are stored in memory because I'm loading them in C++ and feeding them into TensorRT.
ah in c++ you can choose this yourself
the first order you showed would be column major
When I feed packed R0,G0,B0,R1 instead of planar R0,R1,R2, G0,G1, etc. the inference results are completely wrong
why do you call it packed
if you look at the image i shared, it's exactly what i showed there
just reading the data in one order or the other
I call them packed/planar and not column/row major because I talk about how the bytes are stored in memory
i mean, row and column major refers exactly to how the bytes are stored in memory
that's precisely what the words are used for
ok
I also tried to focus on how the red, green and blue values are stored in memory. But ... from your first answer I think that they are already separated/planar.
and the .transpose() only flips the order like you said
transpose does not change the memory layout
it changes the stride only
the two things are completely separate. you can keep your memory layout as is and just change how you map the memory to a matrix
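A small numpy sketch of the stride point, using a made-up 2x2 "image" with 3 channels. `transpose` returns a view on the same buffer with different strides. `np.ascontiguousarray` is one way to get an actual copy whose bytes are laid out plane by plane, which may matter when a raw buffer is handed to something outside numpy:

```python
import numpy as np

# Tiny made-up "image": 2 rows, 2 columns, 3 channels, stored interleaved (HWC)
hwc = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# Same buffer, new shape (3, 2, 2) and new strides; nothing is moved
chw_view = hwc.transpose((2, 0, 1))

# New buffer in which each channel plane is contiguous
chw_copy = np.ascontiguousarray(chw_view)

interleaved_bytes = hwc.tobytes()    # 0, 1, 2, 3, ... (channel values interleaved)
planar_bytes = chw_copy.tobytes()    # 0, 3, 6, 9, 1, 4, ... (channel planes)
```

Indexing `chw_view[c, row, col]` and `chw_copy[c, row, col]` gives the same values. Only the underlying byte order differs.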
Ah ofc. The reason why I was asking, is because I'm curious why they use transpose here:
def convert_onnx(net, path_module, output, opset=11, simplify=False):
    assert isinstance(net, torch.nn.Module)
    img = np.random.randint(0, 255, size=(112, 112, 3), dtype=np.int32)
    img = img.astype(float)  # the np.float alias was removed in NumPy 1.24
    img = (img / 255. - 0.5) / 0.5  # torch style norm
    img = img.transpose((2, 0, 1))  # HWC -> CHW
    img = torch.from_numpy(img).unsqueeze(0).float()
because of what the math is doing
My guess was, that they want to show how to reprocess the input data.
not because of the memory layout
from the exact code you shared, it's not even needed. you could just directly create the array with the shape 3, 112, 112. but later on when the data is not randomly generated, it can make a difference
yeah that's why I thought they were using transpose() to clarify something ... like how the preprocessed input for inference should look like
sure
so that you can make sure that the memory layout in e.g. C/C++ is correct
still has nothing to do with memory layout
not in python maybe, but in C you have to make sure the input data is preprocessed in such a way that the pixels in memory are as expected.
also no
you're mixing up mathematical objects with programming objects
you can store the entries of a matrix however you like in memory as long as you apply the math to them in the correct way
If "no" please tell me why remapping a loaded image into separate R,G,B planes gives me correct results when I feed it into a face detection model and that it doesn't work when I keep the pixels as is?
yes of course 🙂
but it's about what the model expects
because you are treating the two things as if they were the same
it's because of how you wrote your code
yes ofc 🙂
you made the two things the same, when they don't have to be
you could keep the memory layout the same and change the stride instead
sorry, maybe I should have shared a bit more context
the thing is that I'm working with a .onnx model which I converted into a tensorrt engine. I use this model to learn about ai/ml and as a test I'm creating a face detection app.
so I load this model and want to feed images for inference into it.
what I found difficult is to figure out how to correctly preprocess the image data before feeding them into (any) model that expects images.
most repositories seem to share a python script which converts a pytorch model to onnx and I thought maybe those scripts share/explain how the data is supposed to look as they all look similar.
those scripts focus just on the "shape" of the data, e.g. that it's an m x n matrix
ok thanks for explaining
Ill give you a code that is basically a tester for your model when i get home
but if that's the case wouldn't we always be going in the direction with the steepest slope
or is the vector we find not always exactly the gradient
The loss landscape is a lumpy mess, filled with hills and valleys. Part of shaping the architecture and training data is to smooth this out so we can "roll the ball down the hill" - descend the gradient - without getting stuck in a pool that isn't at the bottom (a local minimum, rather than a global one)
Part of this is "feature engineering" and part of it is model design, the architecture
import json
import pandas as pd

with open('smallbatch.jsonl', 'r') as f:
    data = [json.loads(line) for line in f]
df = pd.json_normalize(data)
df.head()
Hi, pandas / json question. Does anyone know why the json_normalize() function is failing to flatten after it gets a few levels down? (response.body.choices)
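A possible explanation, sketched with made-up records: `json_normalize` flattens nested dicts, but a list (which `response.body.choices` appears to be) is left as a single cell. Passing `record_path` is one way to expand such a list into rows. The field names below other than `response.body.choices` are guesses for illustration:

```python
import pandas as pd

# Made-up record shaped like the description (choices is a list of dicts)
data = [
    {
        "id": "req-1",
        "response": {"body": {"model": "m", "choices": [
            {"index": 0, "message": {"role": "assistant", "content": "hi"}},
            {"index": 1, "message": {"role": "assistant", "content": "hello"}},
        ]}},
    },
]

# Dicts are flattened into dotted columns, the list stays as one cell
flat = pd.json_normalize(data)

# One row per element of the list, with "id" carried along as metadata
choices = pd.json_normalize(
    data,
    record_path=["response", "body", "choices"],
    meta=["id"],
)
```

`choices` ends up with columns like `message.content` and `id`, one row per choice.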
any idea on rocm support on wsl vs nvidia setup to train a deep learning model ?
i have 2 setups: one is an old laptop with a gtx 1060, the other has an amd 7900 xtx. but from what i have been reading online, a wsl setup on windows with an amd gpu is annoying ??
hi guys. I am deciding on ml hardware, but the pytorch and tensorflow requirements are not obvious. i am looking for cheap options, so i can't be sure a specific card can run the frameworks, since it may be obsolete. how do i check compatibility properly?
i see
so model design is the setup of our multilayer perceptron right
as in how many layers we have and how many nodes in each layer
Yeah, but also the fact that you use the perceptron pattern at all, which parts are plumbed into each other, and how information flows as gradient descent adjusts the parameters to fit the overall function to the training data
right now i'm just worried about getting the concepts down and trying a basic implementation from scratch, once i get backpropagation/gradient descent set up and have the algorithm which gives me a set of weights and biases that will give me a relative minimum cost, then i will look into all the ways i can improve the accuracy through hyperparameter tuning and improved model design
I highly recommend "The Ancient Secrets of Computer Vision" for a deep primer on feature engineering and the sort of cool hacks that are used to build vision models. It's long, but it's a full course by the guy who wrote YOLO:
https://www.youtube.com/watch?v=8jXIAWg_yHU&list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p&index=2
The Ancient Secrets of Computer Vision
https://pjreddie.com/courses/computer-vision/
An introductory course on computer vision originally held Spring 2018 at the University of Washington.
thank you, i'll look into it
Something a bit shorter and a bit more direct:
https://www.youtube.com/watch?v=TkwXa7Cvfr8
A video about neural networks, function approximation, machine learning, and mathematical building blocks. Dennis Nedry did nothing wrong. This is a submission for #SoME3
Original vid: https://www.youtube.com/watch?v=0QczhVg5HaI
Does locally training any LLMs or other ml models (like say diffusion models) make any sense?
Or is cloud the way to go?
Asking cause I wanna buy a new gpu. I am not a pro but much better than a total beginner. Would keep this in mind.
how much would i be able to do with say rtx 4070 (thats the max i can go). Or I should just stick to colab/kaggle?
I don't really train that many models so I'm not a very good source of an opinion here: I own a 3080 and I think it has the same amount of VRAM and more cores? It felt like a waste of time to me.
The amount of GPU power you might need varies wildly depending on what you're trying to do. Your decision about which GPU you get for personal use should only consider how you'd use it for gaming
If you can also use it for ML, yay
I see. and i dont think and gpus are any good for that right?
Good for what?
oh i meant AMD gpus.
it autocorrected 😅
Yeah, you'll want to stick to Nvidia.
anyone here been working with this mcp library? https://github.com/modelcontextprotocol/python-sdk?tab=readme-ov-file
I legit cannot get even the quickstart to work and I want to make sure its not just me
AMD works, but not all libraries support it, or support it well. It's the best option for price, but at the tradeoff of ecosystem / support.
Considering how difficult it is to get any Nvidia GPU at this point, it may be your only option.
(Other than using the cloud)
The key thing you would be looking for in a GPU is tensor core count, and VRAM (at least 16 GB, but you always want more).
Also not all generations are equal, they may have the same count, but tensor cores have improved a lot recently.
at least 16?
I was more in the range of a used 3080 or maybe new 4070tbh.
would they not be okay even for learning and like a proof of concept thing?
ig it won't be worth it considering the model training part for gpu at all then haha?
You can use them for learning machine learning on more simple tasks. Just don't expect too much, they will be toy projects.
I see. and if I go with cloud for getting similar experience, is that expensive?
That also depends on how much you are expecting. If your expectation is anything in the ballpark of ChatGPT or self driving cars, then I regret to inform you that this has gone outside the range of what an individual (usually (are you a billionaire?)) can afford.
But it will probably be cheaper and get you further. However, this depends a bit, since there is a curve here. If you have enough funding to buy enough of them and basically become your own mini cloud it can be worth it again.
This also depends on if the cloud provider you pick is currently willing to lose (go negative) to attract users, in which case it would be a good deal for you and better than doing it yourself (and many cloud providers are currently doing this, waiting until everyone is locked into their services to increase the price).
Overall, probably go with cloud. It will probably be cheaper and they have the biggest, most modern GPUs (that are not even consumer GPUs, they are better than those).
na na lol. it's just personal level.
for doing to understand on my own.
and perhaps building something by fine tuning existing LLMs specially open source ones
If it's just for learning on your own, then a used 3080 is fine.
Main issue would be VRAM, since it's only 10GB.
(GDDR6X)
I see.. how much is good?
cause it only gets 12gb with 4070
For reference, the 5090 is 32GB (GDDR7). And the 3080 has about 272 tensor cores (older generation cores) while the 5090 has 680 modern ones.
The 4070 is bluntly a scam.
For the price.
Nvidia likes to mess with their consumers a lot, you have to navigate their tricks.
If you are a university student, one option is to just ask your professors if they have compute power laying around and maybe turn it into a summer research project
Also the 4070 is not what you want, it's targeted at gamers.
Less tensor cores. It just has more VRAM for textures and models.
My department has a fair bit of resources that are mostly a polite request from a good student away from being made available to them
If your university has like an Nvidia workstation that is great (multiple GPUs all interlinked).
(4 V100s interlinked, each 32GB VRAM, and 640 tensor cores)
nah. I just graduated lol. been thinking to dive into ML again.
Hey, anybody online? I am self taught and don't know much about Python. I had to use a chromebook and webbrowser programming things like CodeHS to learn 😭 I just got an actual computer and I want to make JARVIS.
What do you mean by "JARVIS"? Because the Jarvis in the marvel films is more advanced than the most advanced AI that currently exists. By a lot.
Yeah, I just mean I want something with a similar concept
I know it's not gonna be nearly as good
I want something that I can talk to and get a verbal response. I would (preferably) like to have it voice activated but that's a big what if
Basically Glorified ChatGPT
I think it's called machine learning? Idk. I want it to be able to remember some stuff if that makes sense
Glorified ChatGPT. You won't be able to create an interactive language model that's anywhere near as good as ChatGPT. Those models train for months with terabytes of training data on enterprise systems.
Again, just using these as examples lol
You're currently trying to operate at level 100. You need to bring it way down to level 1.
fair enough
I had something that was working but it had limited commands
and I had to push a button before it would start listening
A beginner project you could do is make a spam detector
Or something that decides if a picture is of a dog or a cat
Also, what can I use for these things? I have VSCode installed and I think I have the python extensions for it installed too, but idk how to tell if they work or not
Vscode is just a code editor. You can use whatever code editor you want.
It has no bearing on how your code will run.
Just don't install anaconda
hahaha, why?
They've been gaslighting ml people into thinking it's still relevant
oh
I'll probably be back in like nine hours
alright
you seem to be in the same boat as me
trust me, we are nowhere close to recreating jarvis lol
not just us, mankind as a whole
what i'm doing right now and what i think you should do as well, is implementing a basic feed forward neural network using just numpy
and pandas but only to parse the data and get it into a numpy array, nothing else
before that probably learn some basics of linear algebra and multivar calc though
and then the rest you could probably learn as you go
If anyone is looking to get some open-source experience with AI Engineering and LangChain, I have a repo which has some issues to tackle, probably more intermediate than beginner-level stuff.
https://github.com/GGyll/condo_gpt
In the readme there is a video that explains the project in case you like visual explanations 🙂
does anyone know a fix for PIL UnidentifiedImageError?
basically lots of images in my dataset are corrupted and I can't find a way to ignore corrupted images in the dataset during the testing loop
Did you join this server just to advertise your project? 😅
Isn't this what exception handling can take care of? Or perhaps, it might be better to find all corrupted images and remove them from your data.
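A sketch of the exception-handling route with Pillow. The helper name `is_readable_image` is made up, and it works on in-memory bytes only to keep the example self-contained. `Image.open` also accepts file paths, so the same try/except could wrap the loading step in a dataset loop or a one-off cleanup pass:

```python
import io

from PIL import Image, UnidentifiedImageError


def is_readable_image(data: bytes) -> bool:
    """Return True if Pillow can identify and verify the image data."""
    try:
        with Image.open(io.BytesIO(data)) as img:
            img.verify()  # checks file integrity without decoding all pixels
        return True
    except (UnidentifiedImageError, OSError, SyntaxError):
        # Corrupted or non-image data: the caller can skip this item
        return False


# A valid PNG built in memory, and some bytes that are not an image
buffer = io.BytesIO()
Image.new("RGB", (2, 2)).save(buffer, format="PNG")
good_bytes = buffer.getvalue()
bad_bytes = b"not an image"

results = [is_readable_image(good_bytes), is_readable_image(bad_bytes)]
```

Note that `verify()` may not catch every kind of damage (for example truncated pixel data), so catching errors again at actual load time is probably still worth it.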
Guys anyone interested in doing some deep learning projects with me?
I am learning so yeah if anyone would like to join in guiding the workings of DL architectures & other stuff.
I just joined and I posted it, but not really tbh.
How much data science should you learn before learning machine learning?
Hello, I am starting in ML, I would like to work in a project to improve, send me DM
i think the question should be how much math should you learn before ml
Hi, I'm working on a small project to approximate polynomial function using tensorflow.
I use 30000 datapoints from x = [-1, 1] and use the polynomial to create the corresponding y dataset
I then create a model using this general format
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(1,)))
model.add(Dense(12, activation= 'relu'))
model.add(Dense(1, activation='relu'))
This is just as an example
I train it at 20 epochs with batch size 12 with validation data (data is split in 30% training, 20% validation and 50% testing)
It takes about a minute to train the model, is this normal?
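For a rough sense of scale, here's the arithmetic from the numbers above, assuming the 30% training split is taken from the 30000 points and that every batch is one optimizer step. The variable names are only for illustration.

```python
import math

n_points = 30_000
train_percent = 30
batch_size = 12
epochs = 20

n_train = n_points * train_percent // 100          # 9000 training points
steps_per_epoch = math.ceil(n_train / batch_size)  # 750 steps per epoch
total_steps = steps_per_epoch * epochs             # 15000 steps overall

print(n_train, steps_per_epoch, total_steps)
```

That is a lot of very small steps, so per-batch overhead can plausibly dominate the runtime. A minute works out to roughly 4 ms per step, which doesn't look obviously wrong, though it depends on the hardware. A larger batch size would mean far fewer steps.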
https://paste.pythondiscord.com/AQRA
Is this convolution net OK? I do see an issue, but I don't know how to solve it
this isn't a convnet
?
i dont see any convolution layers, this looks like a standard mlp
I was trying to get it to work
yo do you think it's worth it to work on an implementation from scratch for each of the most important neural networks
like convolutional and recurrent
also by the way i got an implementation of the gradient function working, i'm still not sure how the numpy method for gradient works
cause it doesn't give us the individual changes for each weight and bias that we need to make
Don’t use the numpy gradient method, u gotta write ur own for ur operators in the tensor class
tensor class?
also yeah i wrote my own
UnboundLocalError: cannot access local variable 'hl2_to_output_weight' where it is not associated with a value
does this mean out of bounds
remember to always always show the code also.
are you getting this in a lambda or what?
I need help
whenever i use import cv2 or whatever i am importing it says module not found
you can have more than one python on your computer. if you're sure you installed something, and you're getting an import error, you probably installed it to a different python.
how am fix
import sys
print(sys.executable)
add that to your code to see which python your code is using
please always copy and paste text directly--don't ask people to read screenshots of text
oh sorry
what did you do to install cv2?
you get an import error when you try to import cv2, right?
what did you do to install cv2?
you have to install cv2. it doesn't come with python.
please give the text in this screenshot. I need it for something.
no i'm not using a lambda, is it fine if i sent the notebook file
No, it needs to be flat text (not a json, which is what a notebook is).
you can do python -m jupyter nbconvert --to script --stdout your_notebook.ipynb
without them knowing that it's running??
😊
they can know whats running idc
but i need to test my webcam and part of my script is screwed
please do this if you want to continue.
with your keyboard
this
ok
@glacial root still there?
yeah the command wasn't working so i just exported it as a python file
Click here to see this code in our pastebin.
should i click the check
It said please
or is this fine
which line causes the error?
are u using
venv?
visual studio
one sec i'll send a screenshot of the entire error
PS C:\Users\pcname\Downloads> & 'c:\Program Files\Python313\python.exe' 'c:\Users\pcname\.vscode\extensions\ms-python.debugpy-2025.4.0-win32-x64\bundled\libs\debugpy\launcher' '59458' '--' 'C:\Users\pcname\Downloads\import sys.py'
Please do not post screenshots of text.
u can still use venv in visual studio
can u do pip list?
in terminal?
yh
oh then i'll copy and paste it
UnboundLocalError Traceback (most recent call last)
Cell In[11], line 8
6 if array_max > max:
7 max = array_max
----> 8 gradient_descent(cost_gradient, max)
Cell In[9], line 3, in gradient_descent(cost_gradient, max)
1 def gradient_descent(cost_gradient, max):
2 if max > 1:
----> 3 gradient_descent_step(cost_gradient, 0.01)
4 elif max > 0.0001:
5 gradient_descent_step(cost_gradient, 0.001)
Cell In[8], line 2, in gradient_descent_step(gradient_vector, learning_rate)
1 def gradient_descent_step(gradient_vector, learning_rate):
----> 2 hl2_to_output_weight = hl2_to_output_weight - (learning_rate * gradient_vector[0])
3 output_bias = output_bias - (learning_rate * gradient_vector[1])
4 hl1_to_hl2_weight = hl1_to_hl2_weight - (learning_rate * gradient_vector[2])
UnboundLocalError: cannot access local variable 'hl2_to_output_weight' where it is not associated with a value
@viral wadi okay, do this command.
& 'c:\Program Files\Python313\python.exe' -m pip install opencv-python
also @viral wadi
i recommend using wsl in the future, it will make life easier
you run it in the powershell terminal. not as python code.
ok good
also can you tell me just enough to give me a slight hint
I think that's too advanced for them for the moment
I'm working on it.
The filename, directory name, or volume label syntax is incorrect.
sorry this is what i meant
the way i said it earlier may have come off as a bit rude
WTF
@glacial root
def gradient_descent_step(gradient_vector, learning_rate):
    hl2_to_output_weight = hl2_to_output_weight - (learning_rate * gradient_vector[0])
    output_bias = output_bias - (learning_rate * gradient_vector[1])
    hl1_to_hl2_weight = hl1_to_hl2_weight - (learning_rate * gradient_vector[2])
    hl2_bias = hl2_bias - (learning_rate * gradient_vector[3])
    input_to_hl1_weight = input_to_hl1_weight - (learning_rate * gradient_vector[4])
    hl1_bias = hl1_bias - (learning_rate * gradient_vector[5])
It looks like you're trying to change global variables in a function, but that's not how it works.
I need to know what you did and the whole error message.
oh so even if they're global variables i have to put them as arguments
the error is "The filename, directory name, or volume label syntax is incorrect."
dude
do you know the difference between variables and objects?
you should probably send him the whole thing
I followed the tutorial what can I say
i closed it so ill redo it
yeah
but there are no objects here
@viral wadi if you run code and it causes an error, I need to see the whole code and the whole error.
if you run a powershell command and it causes an error, I need to see the command and the whole error.
always send both parts in the same message.
C:\Windows\System32>-m pip install opencv-python
'-m' is not recognized as an internal or external command,
operable program or batch file.
C:\Windows\System32>
i'm not gonna lie i kind of forgot about objects, it's been a while since i've worked with java and in python i haven't really worked with classes
but i see the issue now
you're missing the first half.
& 'c:\Program Files\Python313\python.exe' -m pip install opencv-python
i'd prob learn the math behind ml first then blindly following tutorial
it'll help u more in the long run too
I tried that too
C:\Windows\System32>'c:\Program Files\Python313\python.exe' -m pip install opencv-python
The filename, directory name, or volume label syntax is incorrect.
try making the C upper-case
ok
C:\Windows\System32>'C:\Program Files\Python313\python.exe' -m pip install opencv-python
The filename, directory name, or volume label syntax is incorrect.
C:\Windows\System32>
@serene scaffold
download and install this. https://cmder.app/
cmder is software package that provides great console experience even on Windows
ok
I don't know enough about powershell to keep helping you.
@viral wadi if u just need to script and dont want to deal w the windows environment, the other option is to just use colab
everything in python is an object. there is no escape.
RAT malware - remote access trojan
🤓 blud about to spill some generational yap about pyobjects
# creates a numpy array object and assigns it to that variable
hl2_to_output_weight = np.random.randint(-5, 5, (10, 15))
# creates a new object and rebinds the name, but inside a function that name is local, so the global is never updated
hl2_to_output_weight = hl2_to_output_weight - (learning_rate * gradient_vector[0])
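Here's a small sketch of that error and one way around it, using a plain float instead of the numpy arrays from the notebook. The names `weight`, `broken_step` and `step` are placeholders, not the original code. Assigning to a name anywhere in a function makes it local for the whole function, so reading it on the right-hand side fails. Passing the value in and returning the updated one avoids touching globals at all.

```python
weight = 2.0  # module-level ("global") variable


def broken_step(gradient, learning_rate):
    # The assignment below makes `weight` local to this function, so the
    # read on the right-hand side raises UnboundLocalError.
    weight = weight - learning_rate * gradient
    return weight


def step(weight, gradient, learning_rate):
    # Take the current value as an argument and return the new one.
    return weight - learning_rate * gradient


try:
    broken_step(0.5, 0.1)
except UnboundLocalError as err:
    print("broken_step failed:", err)

weight = step(weight, 0.5, 0.1)  # rebind the global at the call site
print(weight)
```

A `global weight` statement inside the function would also work, but passing and returning tends to be easier to follow once there are six parameters.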
idk what could have caused that. you could try using google colab instead.
this is for you, @glacial root. sorry for not making that clear.
great.
open cmder and type which python
is it supposed to be transparent
doesn't matter
oh wait yeah i forgot about that
λ which python
/c/Program Files/Python313/python
i see
stelercus what happen
so should i just not use a function
NO
YES
ok
even functions are objects right
okay, do python -m pip install opencv-python
i find that wild
I know somebody already I commented but I still understand the equations it's just my brain is it comprehending what's the error
i had a stroke reading this
please be quiet mechanical fox
you're not more entitled to help than anyone else
oh
...
bro...
this
that's for human languages. but it's not even good for that.
:C
@viral wadi in cmder, do python -m pip install opencv-python
i did
okay, what were the last few lines of the output?
Successfully installed numpy-2.2.3 opencv-python-4.11.0.86
[notice] A new release of pip is available: 24.3.1 -> 25.0.1
[notice] To update, run: python.exe -m pip install --upgrade pip
okay. where in your computer is the python program you want to run?
oh sorry
but the reason i was asking
was that i just want to let you know
you gotta be more patient
and you gotta understand that others are people as well
yes
whatever folder it's in, you need to use the cd command to go to it
so it might look like cd C:\Program Files\my_stuff\example
and then once you're there, you do python the_code.py
where the_code.py is your python file.
I'm going to have dinner, so I'll be back later.
```
condition = True
while(condition):
    cost_gradient = gradient_cost()
    max = 0
    for parameter_array in cost_gradient:
        array_max = np.max(np.absolute(parameter_array))
        if array_max > max:
            max = array_max
    if max > 1:
        hl2_to_output_weight = hl2_to_output_weight - (0.01 * cost_gradient[0])
        output_bias = output_bias - (0.01 * cost_gradient[1])
        hl1_to_hl2_weight = hl1_to_hl2_weight - (0.01 * cost_gradient[2])
        hl2_bias = hl2_bias - (0.01 * cost_gradient[3])
        input_to_hl1_weight = input_to_hl1_weight - (0.01 * cost_gradient[4])
        hl1_bias = hl1_bias - (0.01 * cost_gradient[5])
    elif max > 0.0001:
        hl2_to_output_weight = hl2_to_output_weight - (0.001 * cost_gradient[0])
        output_bias = output_bias - (0.001 * cost_gradient[1])
        hl1_to_hl2_weight = hl1_to_hl2_weight - (0.001 * cost_gradient[2])
        hl2_bias = hl2_bias - (0.001 * cost_gradient[3])
        input_to_hl1_weight = input_to_hl1_weight - (0.001 * cost_gradient[4])
        hl1_bias = hl1_bias - (0.001 * cost_gradient[5])
    elif max > 0.00000001:
        hl2_to_output_weight = hl2_to_output_weight - (0.0001 * cost_gradient[0])
        output_bias = output_bias - (0.0001 * cost_gradient[1])
        hl1_to_hl2_weight = hl1_to_hl2_weight - (0.0001 * cost_gradient[2])
        hl2_bias = hl2_bias - (0.0001 * cost_gradient[3])
        input_to_hl1_weight = input_to_hl1_weight - (0.0001 * cost_gradient[4])
        hl1_bias = hl1_bias - (0.0001 * cost_gradient[5])
    else:
        condition = False
```
@serene scaffold this is my function now
how long can i usually expect this to take to run
so far it's been like at least 2 minutes and my computer sounds like a fan lol
it really depends. you usually want to write it in such a way that it periodically tells you the current loss.
wouldn't that be even more computationally heavy though
cause it would have to recalculate the loss every time right
though that takes like 5 seconds
you're always calculating the loss. you just need to print it sometimes
oh i didn't have it calculating the cost every time
but now i changed that and am having it print every time
printing it every time will be annoying for the human
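A sketch of printing the loss only every so often, on a toy one-parameter problem (fitting `w` so that `w * x` matches `y`). The data, `train`, and `report_every` are all invented for the example. The only point is the `step % report_every == 0` check.

```python
def mse_and_gradient(w, xs, ys):
    """Mean squared error of w*x against y, and its derivative with respect to w."""
    n = len(xs)
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
    return loss, grad


def train(xs, ys, learning_rate=0.05, steps=200, report_every=50):
    w = 0.0
    for step in range(steps):
        loss, grad = mse_and_gradient(w, xs, ys)  # the loss is computed anyway
        if step % report_every == 0:
            print(f"step {step:4d}  loss {loss:.6f}")
        w -= learning_rate * grad
    return w


xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 3.0, 6.0, 9.0]  # generated with w = 3
w = train(xs, ys)
```

With a `while` loop like the one above you'd keep your own step counter and do the same check.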
why not just use the logging module so u can have that when u r debugging
it's just a for loop though
one sec i'll send the code
oh wait it goes over the limit
i'll use that github thing to send code, i forgot what it's called
what's logging module
also i'm doing this on jupyter notebook
I'm winding down for the day, so I probably won't be answering questions in the immediate future.
https://docs.python.org/3/library/logging.html
it's kinda like print() but u can have it trigger based on certain conditions like whether u r debugging or for some other reason
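A minimal sketch of what that looks like, with a fake training loop standing in for the real one. The logger name `"training"` and the `run` function are assumptions for the example. Per-step messages go out at DEBUG level and stay hidden until you turn them on.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("training")


def run(steps=5):
    loss = 1.0
    for step in range(steps):
        loss *= 0.5  # stand-in for a real training step
        logger.debug("step %d: loss=%.4f", step, loss)  # only emitted at DEBUG level
    logger.info("finished after %d steps, final loss=%.4f", steps, loss)
    return loss


final_loss = run()
# To see the per-step messages, switch to: logger.setLevel(logging.DEBUG)
```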
i see
oh wait i remember what it's called now
github gist
this is the code for it
only the first element in the cost vector has been going down
and very slowly, except for one time when it just shot down and then went back up to relatively where it was before
is there a reason why u arent summing up mse?
i am
i'm taking an average though
average across all training examples
also perhaps the reason why it's so slow is cause i'm using a cpu
u arent summing up the components tho, u should get a scalar
do you guys usually pay for a gpu
oh that's what you mean
it doesn't really make too much of a difference though right
it does tho
ur supposed to have the loss for mse as a scalar, and not doing this impacts
backprop
so it should be done before i take partial derivatives?
yes, u should get the partial after u get mse
in fact a lot of modern ml libraries dont let u call backward() unless the tensor is a scalar
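To make the "scalar first, partials second" point concrete, here's a sketch in plain Python with nested lists instead of numpy arrays. It averages the squared error over every example and every output component (summing over components instead would only change a constant factor), then takes the derivative of that single number with respect to each prediction. `mse` and `mse_gradient` are illustrative names, not the code from the gist.

```python
def mse(predictions, targets):
    """Scalar MSE: average the squared error over all examples and all components."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(predictions, targets):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


def mse_gradient(predictions, targets):
    """d(mse)/d(prediction) for each entry; backprop starts from these values."""
    count = sum(len(row) for row in predictions)
    return [[2 * (p - t) / count for p, t in zip(pred_row, target_row)]
            for pred_row, target_row in zip(predictions, targets)]


predictions = [[1.0, 2.0], [3.0, 5.0]]  # 2 examples, 2 outputs each
targets = [[1.0, 0.0], [3.0, 3.0]]
loss = mse(predictions, targets)           # one number
grad = mse_gradient(predictions, targets)  # same shape as predictions
print(loss, grad)
```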
i see
is this also why it's so slow
Run it under cProfile to benchmark bottlenecks
i'm sorry but i have no idea what that means
it's a profiler, a way to figure out what the bottleneck in your code is
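A small sketch of using `cProfile` from inside a script or notebook cell. `training_step` and `slow_sum_of_squares` are stand-ins for whatever your real code does. The report lists functions sorted by how much time was spent in them, so the slow part shows up near the top.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    total = 0
    for i in range(n):  # a plain Python loop, deliberately slow
        total += i * i
    return total


def training_step():
    return slow_sum_of_squares(200_000)


profiler = cProfile.Profile()
profiler.enable()
result = training_step()
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In a Jupyter notebook, the `%prun` magic gives a similar report for a single statement.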
You are using Python (without numpy), also use a smaller batch size or just SGD.
i have no idea what a profiler or a bottleneck are
i'm using numpy
setting up a neural network without numpy would be diabolical
Also you don't really compute these things in separate passes always. It's like if you add 2 to every value in an array and then multiply it by 3. You could do this in a single loop. But what you are doing is like having a loop that adds 2 to each value, and then another after that multiplies them by 3.
and yeah i cut down the batch size from 50k to 5k training examples
i see
See if you can combine loops into one.
Do things all at once when you have the info right there.
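The add-2-then-multiply-by-3 example, written both ways in plain Python so the difference is visible. The function names are made up for the illustration.

```python
def two_passes(values):
    added = []
    for v in values:  # first loop: add 2
        added.append(v + 2)
    result = []
    for v in added:  # second loop: multiply by 3
        result.append(v * 3)
    return result


def one_pass(values):
    # Same result, one loop, no intermediate list.
    return [(v + 2) * 3 for v in values]


data = [1, 2, 3]
print(two_passes(data))
print(one_pass(data))
```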
Note that this separate loops stuff still happens when using numpy, just it's hidden.
yeah that's what i did
And it's why it's slower than doing it in plain C still.
like with setting up the layers and the cost function
i did all of that in one loop
same for backpropagation wherever i could
i see
makes sense
The real issue is going to be large loops in Python.
Even though numpy is not optimal it's so much faster than pure Python that it does not really matter.
