#data-science-and-ml
no i mean like unrelated to prompt
If its trained bad then yeah or in early stages of training
and btw u said u made an llm?
@weary timber
what gpu?
gtx 3050
gtx?
yes
gtx 3050 doesn't exist, only rtx 3050
yeah u should be able to train it
i dont know shit about hardware
what error does appear
ohhhh did u try with smaller dataset?
because i think u are trying to load the whole dataset into ram
and thats 40gb
i tried with a chatgpt generated 50k sentence dataset
german
and?
it was generating meaningful sentences but often unrelated to the topic
i think because of how small it is
(the dataset)
U know there are two stages of training pre training and fine tuning
yes i think
so smaller dataset works?
ye it works
im loading the dataset from the "datasets" lib and setting streaming True
i think it loads as i request
Your pc just starts freezing?
I had that problem too
and that was that i was trying to load the whole dataset into ram
so now i load it in chunks
and it works
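A minimal sketch of the "load it in chunks" idea from the messages above, in plain Python. The `iter_chunks` helper and the toy `fake_stream` generator are made up for illustration; with the `datasets` library the iterable would instead be whatever `load_dataset(..., streaming=True)` returns, which is consumed one record at a time rather than loaded into RAM.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield lists of at most chunk_size items without materializing the whole iterable."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def fake_stream(n: int) -> Iterator[str]:
    """Hypothetical stand-in for a streamed dataset: produces one record at a time."""
    for i in range(n):
        yield f"sentence {i}"


if __name__ == "__main__":
    # Only one chunk is held in memory at any moment.
    for chunk in iter_chunks(fake_stream(10), chunk_size=4):
        print(len(chunk), chunk[0])
```

Only `chunk_size` records are alive at once, so memory use stays flat no matter how large the underlying dataset is.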
it happens when i start to train
youre using the openwebtext too?
the problem could be with my model architecture too
it uses same as gpt2
Was earlier but now fine web
U mean transformer?
yep
gpt2 pretrained
ohhh
since i was using the same architecture might as well use its tokenizer
so u are loading the gpt and trying to fine tune it?
no
or from scratch?
If u want
i know mine has 301 lines of code
oh wait it is a notebook how do i send it
copy and paste?
i have the generate response function only
no need to put there since i havent been able to use it 😭
where are model parameters defined?
like n_embd = 768
n_head = 8
n_layer = 8?
Do u have cuda installed and torch for cuda?
yes
yeah u will have better access
there are 2 n_head's . the first one is n_block
Your model is quite small
the embedding_dim is pretty high tho
256?
yes isnt it?
its small
okay then
n_embd = 256
n_head = 8
n_layer = 5 so these are parameters of your model?
yep
your model has 30,470,784 parameters
30 million
the smallest gpt2 (small) has 140 million
sorry 117 million
okay
so its like 4 times larger
and its the smallest gpt 2 model
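For the size comparison above, here is a rough sketch of how a parameter count for a GPT-2-style decoder can be worked out by hand. The function name and its defaults (vocabulary of 50257 tokens, context of 1024, output head tied to the token embedding) are assumptions matching the public GPT-2 small configuration; the poster's model may differ in block size, weight tying, or biases, so it will not necessarily reproduce the 30,470,784 figure quoted in the chat.

```python
def gpt2_param_count(n_embd: int, n_layer: int, vocab_size: int = 50257,
                     block_size: int = 1024, tied_head: bool = True) -> int:
    """Count parameters of a GPT-2-style decoder (biases and LayerNorms included)."""
    token_emb = vocab_size * n_embd
    pos_emb = block_size * n_embd
    # Attention: fused q/k/v projection plus the output projection.
    attn = (3 * n_embd * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back down.
    mlp = (4 * n_embd * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    layer_norms = 2 * (2 * n_embd)  # two LayerNorms per block, weight + bias each
    per_layer = attn + mlp + layer_norms
    final_norm = 2 * n_embd
    head = 0 if tied_head else vocab_size * n_embd
    return token_emb + pos_emb + n_layer * per_layer + final_norm + head


if __name__ == "__main__":
    print(f"GPT-2 small: {gpt2_param_count(768, 12):,}")
    print(f"n_embd=256, n_layer=5, untied head: {gpt2_param_count(256, 5, tied_head=False):,}")
```

With these assumptions GPT-2 small comes out at roughly 124 million parameters; the 117 million figure in the chat is the number that was originally published for that model. Note also that n_head does not appear in the formula: splitting the same width into more heads does not change the parameter count.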
lr = 0.0004
your learning rate is low change it to 0.003
or 0.001
@weary timber u want my old code?
i havent run it
btw whats up with the self.proj in multiheadattention, what is it for?:
and can you check and tell if my multiheadattention is proper
yeah ok and btw in my code there isnt implemented flash attention so if u want to use it u need to add it
self.proj in multiheadattention is a linear layer that comes after combining the outputs of all attention heads.
so it just normalizes
so it isnt for reshaping the result
so its just combining and refining the information from all the individual attention heads into a single output
so it just takes the result from multiple heads
and then it mixes it
ive learnt it as splitting the multihead attention's input and then running it through individual attention units
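import torch
from torch import nn

class MultiheadAttention(nn.Module):
    # AttentionUnit is the poster's own single-head attention module (not shown here).
    def __init__(self, embedding_dim, n_head, mask=None, batch=False):
        super().__init__()
        self.chunk_size = embedding_dim // n_head
        self.n_chunk = embedding_dim // self.chunk_size
        # Width of the leftover chunk that torch.split produces, if any.
        self.remain = embedding_dim % self.chunk_size
        self.heads = nn.ModuleList([AttentionUnit(self.chunk_size, mask, batch) for _ in range(self.n_chunk)])
        if self.remain:
            self.heads.append(AttentionUnit(self.remain, mask, batch))

    def forward(self, x):
        # Split the embedding into per-head chunks, attend within each chunk, then concatenate.
        x = [head(chunk, chunk, chunk) for chunk, head in zip(torch.split(x, self.chunk_size, dim=-1), self.heads)]
        return torch.cat(x, -1)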
like this
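To make the `self.proj` discussion concrete, here is a small NumPy sketch of the more common formulation (the one GPT-2 uses), written under a few simplifying assumptions: no causal mask, no biases, a single sequence, and made-up names (`multi_head_attention`, `w_qkv`, `w_proj`). The projection is an ordinary learned linear layer applied after the heads are concatenated; it does not normalize anything. Without it, each head's output stays in its own slice of the embedding, which is what the class posted above does.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def multi_head_attention(x: np.ndarray, w_qkv: np.ndarray,
                         w_proj: np.ndarray, n_head: int) -> np.ndarray:
    """x: (seq, d), w_qkv: (d, 3d), w_proj: (d, d). Returns (seq, d)."""
    seq_len, d = x.shape
    if d % n_head:
        raise ValueError("embedding size must be divisible by n_head")
    head_dim = d // n_head
    q, k, v = np.split(x @ w_qkv, 3, axis=-1)  # each (seq, d)

    def to_heads(t: np.ndarray) -> np.ndarray:
        # (seq, d) -> (n_head, seq, head_dim)
        return t.reshape(seq_len, n_head, head_dim).transpose(1, 0, 2)

    q, k, v = to_heads(q), to_heads(k), to_heads(v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (n_head, seq, seq)
    out = softmax(scores) @ v                               # (n_head, seq, head_dim)
    concat = out.transpose(1, 0, 2).reshape(seq_len, d)     # heads laid side by side
    return concat @ w_proj                                  # mixes information across heads


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 8))
    w_qkv = rng.normal(size=(8, 24))
    w_proj = rng.normal(size=(8, 8))
    print(multi_head_attention(x, w_qkv, w_proj, n_head=2).shape)
```

Passing an identity matrix as `w_proj` gives the plain concatenation, which is a quick way to see what the projection adds.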
Hmmm idk i just was learning from youtube so idk the specifics like i can give u some tutorials
https://www.youtube.com/watch?v=l8pRSuU81PU&t=6585s here is a video i used to optimize my model
This guy has pretty much everything u need to know on his youtube
@weary timber and does my code work?
im downloading some dependencies for it
What exactly?
bitsandbytes
oh for AdamW8bit, its a faster and more efficient AdamW
danke
i cant run it :/
im running this code on VSC but when I run it, it says ModuleNotFoundError: No module named 'torch'
py -m pip install torch
why aren't you just using ultralytics?
why the heck did that work?
ultralytics?
why does py works?
idk vscode requires to put py in front
atleast for me
lol
is it possible to reference my cam here?
or http streams?
does it only support images?
I need to get coordinates of bounding box with the use of live feed how do I do that with yolov5?
made it work wth
it was as simple as this
from ultralytics import YOLO

# Load the model
model = YOLO("yolov8n.pt")

# Run inference with show=True to display the video feed
for result in model(source=0, stream=True, show=True):  # show parameter moved here
    boxes = result.boxes
    # Process each box
    for box in boxes:
        coords = box.xyxy[0]
        x1, y1, x2, y2 = coords
        # Calculate center
        center_x = (x1 + x2) / 2
        center_y = (y1 + y2) / 2
        print(f"Center: ({center_x:.0f}, {center_y:.0f})", flush=True)
I want to make a legit site for check templates: it basically makes real checks based on certain templates, ones ppl can cash at the bank (like i said, it's a real check). I wanna make this so small businesses, big businesses, or even bank workers can use this ai tool. But if the ai messes up even once, that could produce a fraudulent check, the user's bank acc could get closed, and I could get sued. I'm thinking of using APIs cuz I don't know anything about AI, I only know web development (still learning it). Should I learn to make my own ai or use APIs for now?
lmao yep
that's what i was talking about
you don't really need pytorch while working with yolo
and use venv or docker
why not conda?
what error
unless you're sure that what you're trying to do can't be done without conda, I recommend not using it.
Yeah if u dont want to install like a couple gb of files then dont use conda it takes so much space
Just use pip
I see I'll uninstall it then
im using vscode, so when I open it do I have to enable the environment manually?
also where can I read about this environments and their purposes?
Visual Studio or Visual Studio Code
is what your code runs on?
Visual Studio Code the blue one.
Also I tried running my YOLOv8 without enabling environment and it still works
does it auto enables?
it should say what python/other things your program uses rn
I think this is a good overview for venvs: https://realpython.com/python-virtual-environments-a-primer/
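For the "does it auto enable?" question, one way to check from inside the program itself is sketched below. The helper name `in_virtual_environment` is made up for this example. It relies on the standard `sys.prefix` and `sys.base_prefix` attributes, which differ when the interpreter belongs to a venv; a conda environment is not a venv in this sense, so this check reports False there, and `sys.executable` is the more telling value in that case.

```python
import sys


def in_virtual_environment() -> bool:
    """True when the running interpreter belongs to a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # The interpreter path shows which Python (system, venv, or conda) is running the script.
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtual_environment())
```

Running this from the VS Code "Run" button and again from a plain terminal shows whether both are using the same interpreter.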
yep conda automatically enables
?
opened a workspace and it says python 3.11.11 conda
yeah with pip it would do the same
also this shows up everytime I run my code
when I run my code it automatically sets the environment to venv
that's cool
U run your code out of onedrive?
@bold pumice hello, your message was removed for violating rule 6 regarding unapproved advertisements
People are more likely to respond if you pose a question or discussion topic.
Okay, what does it mean for an ai to be decentralized?
it means that its computation, decision-making, and data processing are distributed across multiple nodes instead of being controlled by a single centralized entity.
does anyone know any good resources for getting better at python specifically for ml
guys why do i get this Blank screen on opening a file in jupyter notebook someome please please please help me
What would the benefit of this be?
security, privacy, scalability, and democratization of AI technology.
I see the democratization part, but how does it help with security privacy and scalability?
if you haven't restarted the PC, try that
also see https://stackoverflow.com/questions/55152948/juypter-notebook-shows-blank-page for some other useful notes
https://github.com/jupyter/notebook/issues/1627 also some notes
i have been scrolling both these links for a few hours
what would restarting the PC do?
have you tried it?
No i am going to rn
Does jupyter have no log file
How can I see what is happening every time i perform an action
So I can see what is happening when I open a notebook
crazy thing is that jupyter lab works
but jupyter notebook keep giving me blank page
restart your PC.
oof, ok
tbh, I do not use Anaconda - are you using it because it's part of a requirement?
i have been working on it
but
i tried to upgrade it because it kept giving me warnings about the version
now its like this this
its something regarding menuinst, idek, something regarding that
idk what to do man
in general, Anaconda is only something I find useful if you are working with other people who use it, and you all depend on its behaviors to get things done. you don't need Anaconda to work with Jupyter, and I honestly find it to be more trouble than it's worth
if you're really stuck you could try uninstalling it entirely and reinstalling it fresh
should i delete it? can u help me configure
but if you don't really need to be using Anaconda as such, I'd move away from it
i dont want my code files to be deleted
your code is not going to be deleted, it's separate from Anaconda
im scared
then back up your files somewhere
and maybe don't do this if you're not able to think clearly
i will back them up
please help me delete anaconda and configure it again
it would also delete all the environments and its files right?
whatever stuff you've created in your own directories is not going to be deleted. if you have venvs, they may need to be recreated since the Python installation they point to may not exist anymore, but if you have requirements.txt or pyproject.toml files to go with the projects, you can recreate the venvs easily
to remove anaconda - https://docs.anaconda.com/anaconda/uninstall/
i need everything gone so i can start from scratch
wont deleting anaconda delete its venvs too?
i want no more conflicts delete everything and install it from scratch again
it will delete what the venvs point to. you can remove the venv directories themselves manually for each project, that's generally not hard
they're typically in a subdir of the project with a name like .venv
you don't have to use a particular venv
if i create a new venv then its isolated right
just because the directory is there doesn't mean it's going to get used; you can ignore it (or delete it) and create a new venv
as long as you don't have the venv and your code intermingled it's fine
e.g., don't write code in the venv directory itself
and restart after uninstalling conda
i hate it so much so many conflicts i dont deserve it
what would that do
you don't have to use conda to work with any of these things.
it just ensures all the changes to your environment overall will take and be consistent
should i uninstall jupyter too?
if it was installed with conda it'll be removed with it
should i just delete all of these?
oh i just wanted to uninstall anaconda so i searched anaconda and
opened its file location
did you follow the directions I linked for how to remove Anaconda?
oh my bad i didnt see the link
nope not yet i havent
don't.
i just uninstalled it from the url u gave me
i did that its uninstalling
once done, restart, and then if there are any Anaconda directories left over, don't delete them but rename them - put a leading underscore in the name or something like that - so if anything useful remains in them they can be extracted. then reinstall Anaconda anew
this renaming trick is a useful way to "delete without deleting"
once you get Anaconda up and running again you can probably delete the renamed directories safely
its taking so much just to delete again
it takes a while
its gonna take so much to install it again
what does anaconda even do
cant i run my jupyter notebook without it
yes, you can
you do not need Anaconda to run Jupyter, it's just a way it's often packaged and delivered to people
its too bloated
you don't need to use it if you don't want it, it's just for the sake of having a lot of common data science stuff for Python in one place
there's nothing in Anaconda you can't install manually
what do u work on
it's just, again, a way to package and deliver that stuff in a way that some people find easier to use because Anaconda simplifies it
okok
Learn how to run Jupyter Notebook without Anaconda for your AI project notebooks, streamlining your development process. | Restackio
it has a lot of work to do . be patient
it makes sense for installing but
ok sorry
It's done
I clicked on finish and
It restarted the system on its own
These applications are getting a bit cocky with the administration on my system
I didn't even give the permission to restart
Ok it's started again
@bitter kayak can I make a virtual environment with jupyter alone?
Jupyter isn't used to make venvs
you just use the Python runtime, so you'd install a copy of Python from python.org and use that
Maybe I need to reinstall anaconda
if you're not familiar with working with Python outside of Anaconda, it would be worth getting to know it directly
I would but I don't have time I need this working immediately
then I guess you can use Anaconda and cross your fingers
hey i just uninstalled anaconda but why are these files still here
remember what I said earlier about renaming the anaconda directory?
which one do i rename here
c:\users\aryan\anaconda3
I've got to go, I need to make dinner and take care of other things
oh
goodnight
one last question renamed .anaconda and anaconda3 but what about these probably related files from the c:\users\aryan\ @bitter kayak
anyone here got projects for data science on their github?
@bitter kayak i reinstalled it and everything and still get blank page?
You reinstalled anaconda?
I did
Hmm you tried via pip?
What with pip?
I don't know anaconda too well to be honest, always found it caused me more problems than it solved.
This one
Maybe I should delete this anaconda and download a specific version of jupyter notebook and use it on its own?
I pretty much do what is described in the link
Is ur interface like this? I didn't have this interface before it's a new interface
Haven't used it in a couple of weeks, but looks familiar
I upgraded anaconda and ever since it's giving me this interface and causing that problem
So, it goes blank when you go to a file and then try to open it?
Yep
It shows nothing
And it's only for the notebook not the jupyter lab
But the jupyter lab is shitty to work with
Is it? I actually prefer it
It's shit I hate it
What do you hate about it?
Let's not talk about this that's not important
I need to fix this
I am uninstalling anaconda again
I will download only jupyter
I mean it's pretty much the same interface...yeah, I am curious whether the pip approach will work for ya
does anyone know of any libraries for easily creating and managing flags/scores? My use case is checking statistics of users and determining a final score based on various things, e.g. when their most recent login was.
Image somewhat explains it
@bitter kayak hey, i uninstalled anaconda and installed simple python and then installed notebook
i still get the same problem
clicking on any python file even redirects me to the home tree
redirects me please help
somebody please help i feel like i am messing this up more
guys i finally did it
the mad lad did it
i installed an older stabler version
i hope it doesnt cause any conflicts with the libraries please god help me
If you haven’t tried Spyder IDE I recommend Spyder. The GUI is so crisp imo. It comes with IPython console that functions similar to terminal eg Bash shell. I haven’t tried the terminal feature but I think it has terminal functionality for Bash (Linux). And it’s fast
Spyder 6 is the latest release iirc
i will try afterwards someday
Hey mates I am 19 , i just bought m3 16/256 recently I know basics of programming languages like python , GOLANG.
I want to go in the field of aiml but I'm not going to any college , instead of this I am dedicated to learn by my self so my question is are there any prerequisites like is it necessary to know web dev to learn ml and last question is is my m3 16/256 enough to learn and get an internship or job in ml ai.
Sorry for my English 😅
what is a basic requirement of pc, for machine learning.
like not that high end one
Build with Visual Studio Code, anywhere, anytime, entirely in your browser.
😱
you dont really need a powerful pc and yours is good enough.
ML can range from stuff that can be done on a low-end laptop to things that require a multi-million dollar super computer.
?
Ohh gotcha but most people are saying that without degree i can't get into aiml jobs
U can't they r correct
yep, why do you ask
true
im shocked of online vscode
anyone ever used catboost?
i am having trouble trying to install it
it says i need a rust package? wtf
That's strange. Can you show the code you used and the error message?
Agree. Usually you can run catboost in Python no problems
I had python version 3.13.2
It's still under construction for that version
So I downgraded
It's working now, yeah?
for anyone that has worked on a poker bot or solver work does anyone know if there are many more modern techniques other than MCCFR?
and if you do know any can u link the study to me
you need to show it with matplotlib.pyplot.show
!docs pandas.Series.plot.line
Series.plot.line(x=None, y=None, **kwargs)
Plot Series or DataFrame as lines.
This function is useful to plot lines using DataFrame’s values as coordinates.
you need to be using that if you want a line graph
the issue might also be that the index is discrete values
it doesn't make sense to have a line graph where the x axis is countries, because countries aren't sequential values.
@bold rapids are you sure you don't want a bar chart?
i mean to learn ml and ai on it
in that case, literally any computer that can run a web browser will work.
hey guys, does this backpropagation formula derivation look correct?
here a is activation, w is weight, b is bias, z is just a function of the relation between weights, activations, and biases, and y is the expected output
also L is the current level, with j being indexes through the components of the vector of neurons at the current level and k being the indexes through the one at the previous level
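The derivation being asked about was posted as an image and is not in this log, so it cannot be checked here. For comparison, the usual chain-rule result in the notation described above is written out below, assuming an activation function \sigma and a squared-error cost for one training example (both are assumptions, since the image may have used something else).

```latex
z_j^{(L)} = \sum_k w_{jk}^{(L)} a_k^{(L-1)} + b_j^{(L)}, \qquad
a_j^{(L)} = \sigma\!\left(z_j^{(L)}\right), \qquad
C_0 = \sum_j \left(a_j^{(L)} - y_j\right)^2

\frac{\partial C_0}{\partial w_{jk}^{(L)}}
  = \frac{\partial z_j^{(L)}}{\partial w_{jk}^{(L)}}
    \frac{\partial a_j^{(L)}}{\partial z_j^{(L)}}
    \frac{\partial C_0}{\partial a_j^{(L)}}
  = a_k^{(L-1)} \, \sigma'\!\left(z_j^{(L)}\right) \, 2\left(a_j^{(L)} - y_j\right)

\frac{\partial C_0}{\partial b_j^{(L)}}
  = \sigma'\!\left(z_j^{(L)}\right) \, 2\left(a_j^{(L)} - y_j\right)

\frac{\partial C_0}{\partial a_k^{(L-1)}}
  = \sum_j w_{jk}^{(L)} \, \sigma'\!\left(z_j^{(L)}\right) \, 2\left(a_j^{(L)} - y_j\right)
```

The last line sums over j because a neuron in the previous layer influences the cost through every neuron it feeds into.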
any mf familiar with kaggle
is it possible to download part of a dataset?
please ping me on reply
So I'd like to get started working with machine learning, but first I'd like to have a good math background before even stepping into it. I'm about to finish my schools Calc 3 course and I'm gonna take statistics soon. Do I need anything else in order to understand how these algorithms actually work?
Should I wait until I take linear algebra?
if you want to start with a good foundation then yes, linalg is integral to NNs
but it wouldn't hurt to start learning now and just accumulate more knowledge as you take the class and work on your own stuff
Alright ty for the advice
i keep getting this error in kaggle even when i send all stuff to gpu
okay fixed it turns out i didnt send all devices to gpu
where can i learn how to finetune models?
It's a rule of thumb kind of because each neural network is different so it requires different types of data etc
ok but i know nothing about finetuning almost
so i should start somewhere and where do i start?
What is your model supposed to be doing?
i didnt think of what it should be doing but my goal is to finetune gpt2 to do something
it could give title recommendation
for video idea
What if you're trying to make it into a prediction wizard?
wdym
You give it data on what videos you watch for names and it depends on what year you're making for a video or you watch then you can ask it can you take all these names and try to figure out one name for a video that encompasses all this topic and it would predict what people may like for a name
my english is bad can you explain simpler pls
What I mean is you take titles that you like and whenever you're trying to make on YouTube and you can give it data to also predict outskirt events in the world for recommendation you may ask it could you make a video using data to tell when the next great calamity may happen?.
I'm sorry for being out there
what
Being crazy on the prediction idea
so youre crazy?
No I'm sorry if I sounded like I was crazy but if you can predict using current events outside it can generate a title so if there's something to be concerned of the network will generate that into a title when that can be used in a more complex prediction later down online
yay! what does it do?
it belongs to the category called supervised learning. in it i found there is this concept called decision trees.
using the decision trees i made this - a telecom company who has been getting customers dropping out on a yearly basis. using the ml model with decision trees. i was able to pinpoint the top 5 reasons the telecom company has been losing customers.
accuracy level i got was 98%
felt good doing it . so thought i'll come brag about my hard work in here 🙂
what was the reason?
reason for?
the customers dropping out
the telecom company had been losing customers
here: https://www.kaggle.com/code/ruforavishnu/project3-supervised-learning-decision-trees-churn
the project in jupyter notebook format in my kaggle
its there in this notebook.
thanks for the like. feels good to brag after all the hard work 🙂
Good work, however be careful when interpreting feature importance. With decision trees, you only get back the magnitude of how important a feature is for distinguishing between classes, not how important a feature is "for predicting label 1".
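A small illustration of the point about feature importance, in plain Python. Tree importances are built from impurity decreases like the one computed here, and that quantity is the same whether a feature pushes toward label 1 or away from it. The data and the names `raises_churn` and `lowers_churn` are invented for the example and are not taken from the linked notebook.

```python
from typing import Sequence


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def impurity_decrease(feature: Sequence[int], labels: Sequence[int]) -> float:
    """Impurity reduction from splitting on a binary feature."""
    left = [y for f, y in zip(feature, labels) if f == 0]
    right = [y for f, y in zip(feature, labels) if f == 1]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


labels = [1, 1, 1, 0, 0, 0, 1, 0]
raises_churn = [1, 1, 1, 0, 0, 0, 0, 1]        # feature=1 mostly goes with label 1
lowers_churn = [1 - f for f in raises_churn]   # same feature with its meaning flipped

if __name__ == "__main__":
    print(impurity_decrease(raises_churn, labels))
    print(impurity_decrease(lowers_churn, labels))
```

Both calls give the same value, so the score alone says a feature matters for separating the classes, not which way it points.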
I want to extract names of technologies from array how can I do it?
["like", "amazon", "adobe"]
to
["amazon", "adobe"]
stopwords not enough
its more of words in list
maybe detect which are nouns?
oh they are companies not technologies
["like", "git", "asana"]
to
["git", "asana"]
list contains 15k of words
some having noise
sample
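One simple approach to the filtering problem above is to match each word against a curated list of known technology names, since part-of-speech tagging may not separate "git" from "like" reliably. The sketch below assumes such a list exists; the `KNOWN_TERMS` set and the `keep_known_terms` helper are hypothetical placeholders, not an existing library.

```python
from typing import Iterable, List

# Hypothetical vocabulary; in practice this would come from a maintained list of tools.
KNOWN_TERMS = {"git", "asana", "amazon", "adobe"}


def keep_known_terms(words: Iterable[str], known_terms: Iterable[str]) -> List[str]:
    """Return the words found in known_terms, lowercased, in input order, without repeats."""
    known = {t.lower() for t in known_terms}
    seen = set()
    kept = []
    for word in words:
        key = word.strip().lower()
        if key in known and key not in seen:
            seen.add(key)
            kept.append(key)
    return kept


if __name__ == "__main__":
    print(keep_known_terms(["like", "git", "asana"], KNOWN_TERMS))
```

With 15k words, set lookups keep this fast; the quality of the result depends entirely on how complete the vocabulary is.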
How do labels work for machine learning?
For supervised ML, they are the outcome of a set of features. A model makes a prediction, error is measured based on the difference between the prediction and the label, and the model is adjusted to minimize error.
What about deep learning?
Same thing
How do I label the data image, video, sound?
Depends on what you're trying to train the model to do
If it's classification, a common set up is to have a folder for each class
e.g: An "apple" folder that contains images of apples and an "orange" folder that contains images of oranges
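The folder-per-class layout described above can be turned into (file, label) pairs with the standard library alone. This is only a sketch: the `labeled_files` helper, the suffix list, and the temporary demo folders are assumptions made for the example.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def labeled_files(root: Path) -> List[Tuple[Path, str]]:
    """Pair every image under root/<class_name>/ with the name of its folder as the label."""
    pairs = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for file in sorted(class_dir.iterdir()):
            if file.suffix.lower() in IMAGE_SUFFIXES:
                pairs.append((file, class_dir.name))
    return pairs


if __name__ == "__main__":
    # Build a throwaway "apple"/"orange" layout just to show the output shape.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("apple", "orange"):
            (root / name).mkdir()
            (root / name / f"{name}_1.png").write_bytes(b"")
        for path, label in labeled_files(root):
            print(path.name, "->", label)
```

The label never lives inside the image file; it comes purely from where the file sits.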
Is there anything specific that needs to be done to the image or data?
With images, you typically resize them all to the same size
How do I do that sorry I try to learn how much about computers and I still don't understand most of the things that happen with them sorry
The libraries cv2 and pillow both have functionality for resizing images
yo i've been trying to understand this one part of back propagation, so for the cost/error function how do we set up the gradient
here C_0 is the cost of one training sample right?
so then would the gradient be a vector with the partial derivative of cost for many training examples, each as a component of the gradient vector?
essentially here what i have noted as cost is the error/sum of squares of differences from the expected outputs (i have it as cost cause 3blue1brown has it denoted that way lol)
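On the question above: with C_0 as the cost of one training example, the gradient has one component per weight and bias, not one per training example. The per-example gradients are then averaged over the examples. A tiny sketch with a single neuron, using a linear activation to keep the arithmetic visible (that simplification and the function names are assumptions of this example):

```python
from typing import List, Sequence, Tuple


def sample_gradient(w: float, b: float, x: float, y: float) -> List[float]:
    """Gradient of C_0 = (w*x + b - y)**2 for ONE example: one entry per parameter."""
    error = w * x + b - y
    return [2 * error * x, 2 * error]  # [dC_0/dw, dC_0/db]


def mean_gradient(w: float, b: float, data: Sequence[Tuple[float, float]]) -> List[float]:
    """Average the per-example gradients; the result still has one entry per parameter."""
    grads = [sample_gradient(w, b, x, y) for x, y in data]
    n = len(grads)
    return [sum(g[i] for g in grads) / n for i in range(2)]


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    print(mean_gradient(0.0, 0.0, data))  # two numbers, however many examples there are
```

Adding more training examples changes the values of the two components but never the length of the gradient vector.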
Which one is better?
Cv2 is better than pillow?
!d pandas.io.formats.style.Styler.map
Styler.map(func, subset=None, **kwargs)
Apply a CSS-styling function elementwise.
Updates the HTML representation with the result.
that documentation includes the following example (without imports):
import numpy as np
import pandas as pd

def color_negative(v, color):
    return f"color: {color};" if v < 0 else None

df = pd.DataFrame(np.random.randn(5, 2), columns=["A", "B"])
df.style.map(color_negative, color="red")
Pylance, however, does not like this example, reporting that style.map does not exist. (i.e. ⏬ ) Is there a way to fix this?
Cannot access attribute "map" for class "Styler"
Attribute "map" is unknown
I’m making an AI and to speed up the process I’m going to speed up the game, I take 100+ every second so OpenCV needs to output 100+ FPS but it can’t handle it. I mainly need to use edge detection and calculate the distance to the edge. I have 4060TI and a I9-12900K. OpenCV also doesn’t use CUDA.
Is CV2 better than pil for machine learning?
is that so? I didnt know that. When re-reading your message I noticed the fact that you mentioned 'with decision trees you only get back the magnitude...'
When you said that? did you mean I should have used another concept in supervised learning rather than using Decision Trees?
You could have changed the wording of your conclusion, e.g. "these features are important". Or, if you wanted to explain which features are important for a particular class, you could have used logistic regression.
i am asking, isnt supervised learning just a method of deep learning?
Supervised learning doesn't have to be deep learning
oo okay i understand it now
OutOfMemoryError Traceback (most recent call last)
<ipython-input-32-a6bf242be06b> in <cell line: 0>()
14 for questions,attn_mask,answers in dataloader:
15 questions,attn_mask,answers = questions.to(device),attn_mask.to(device),answers.to(device)
---> 16 loss = gpt2(questions,attention_mask=attn_mask,labels=answers).loss
17
18 loss.backward()
14 frames
/usr/local/lib/python3.11/dist-packages/transformers/activations.py in forward(self, input)
54
55 def forward(self, input: Tensor) -> Tensor:
---> 56 return 0.5 * input * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (input + 0.044715 * torch.pow(input, 3.0))))
57
58
OutOfMemoryError: CUDA out of memory. Tried to allocate 96.00 MiB. GPU 0 has a total capacity of 14.74 GiB of which 78.12 MiB is free. Process 2698 has 14.66 GiB memory in use. Of the allocated memory 14.25 GiB is allocated by PyTorch, and 287.99 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
i tried everything but didnt fix
how do i fix this?
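The traceback above shows 14.25 GiB already allocated by PyTorch on a GPU with 14.74 GiB, so the forward pass simply does not fit. Common ways to lower peak memory are a smaller batch size, shorter (truncated) sequences, mixed precision, and gradient accumulation. The sketch below shows only the arithmetic behind gradient accumulation on a toy one-parameter model in plain Python; the function names are made up. In PyTorch the matching pattern is to divide each micro-batch loss by the number of accumulation steps, call `loss.backward()` per micro-batch, and call `optimizer.step()` and `optimizer.zero_grad()` once per group.

```python
from typing import List, Tuple

Batch = List[Tuple[float, float]]


def batch_gradient(w: float, batch: Batch) -> float:
    """Gradient of the mean squared error of y ~ w*x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(w: float, batch: Batch, micro_batch_size: int) -> float:
    """Same gradient, computed from micro-batches that are processed one at a time."""
    if len(batch) % micro_batch_size:
        raise ValueError("this sketch assumes equal-sized micro-batches")
    steps = len(batch) // micro_batch_size
    total = 0.0
    for i in range(steps):
        micro = batch[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Scale each micro-batch so the running sum equals the full-batch mean.
        total += batch_gradient(w, micro) / steps
    return total


if __name__ == "__main__":
    data = [(float(i), 3.0 * i) for i in range(1, 9)]
    print(batch_gradient(0.5, data), accumulated_gradient(0.5, data, micro_batch_size=2))
```

The two numbers agree, which is why a batch of 8 can be replaced by four micro-batches of 2 without changing the update, only the peak memory.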
historical words that are similar to each other according to the google news data
How can I make it so that I have an agent I need the game that I'm trying to make?
Anyone here used GCP (for databricks) I have some doubts regarding billing.
I was practicing Databricks (on GCP) last Saturday. After finishing, I terminated the cluster and deleted the workspace, along with the associated storage buckets. When I checked the billing on Sunday morning, the expected credits were consumed. However, today I noticed that the credits consumed have unexpectedly increased, despite no further usage. Can anyone help me understand why this might be happening?
what classifies deep learning is that there are multiple layers of processing rather than just one layer right
ayep
well, I wouldn't say 2 hidden layers is that deep
tho I don't think there's an agreed upon number such that an nn is deep if its layers exceed it
so it's really just if there are enough layers for decently advanced processing
something like that
I’m making an AI and to speed up the process I’m going to speed up the game, I take 100+ every second so OpenCV needs to output 100+ FPS but it can’t handle it. I mainly need to use edge detection and calculate the distance to the edge. I have 4060TI and a I9-12900K. OpenCV also doesn’t use CUDA.
I get around 140 FPS when I only take screenshots using d3dshot which stores it in ram.
So if I want to have an image seen by my network I have to put it in a folder which it would be my label?
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[1], line 48
44 test_set = DataLoader(test_set,shuffle=True)
46 cats_dogs_model = ConvNet().to(device)
---> 48 cats_dogs_model.load_state_dict(torch.load(r"C:\Users\mehme\cats_and_dogs_model.pth"))
50 cats_dogs_model.eval()
File c:\Users\mehme\AppData\Local\Programs\Python\Python312\Lib\site-packages\torch\serialization.py:1360, in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
1358 except pickle.UnpicklingError as e:
1359 raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
-> 1360 return _load(
1361 opened_zipfile,
1362 map_location,
1363 pickle_module,
1364 overall_storage=overall_storage,
1365 **pickle_load_args,
1366 )
1367 if mmap:
1368 f_name = "" if not isinstance(f, str) else f"{f}, "
File c:\Users\mehme\AppData\Local\Programs\Python\Python312\Lib\site-packages\torch\serialization.py:1848, in _load(zip_file, map_location, pickle_module, pickle_file, overall_storage, **pickle_load_args)
1846 global _serialization_tls
1847 _serialization_tls.map_location = map_location
-> 1848 result = unpickler.load()
...
514 )
515 if hasattr(device_module, "device_count"):
516 device_count = device_module.device_count()
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
im about to crash out
i cant run torch on my pc
started to happen randomly
Hello, I am starting in ML, I would like to work in a project to improve, send me DM
imo you dont have to learn and memorize activation functions at the start, since it doesnt really matter when you are using a lib like pytorch and have functions ready
can someone explain to me how in image captioners the rnn's can be used to turn a vector to a sequence?
im genuinely lost
which one is best to go with if i'm implementing a neural network from scratch
when you say "from scratch", how "scratch" are we talking?
pytorch?
numpy / jax?
C?
CUDA?
assembly?
creating a computer from scratch?
creating the universe from scratch?
if you just mean training your own as opposed to using one trained by someone else, just use pytorch
prob numpy
ngl assembly neural network would go crazy
good portfolio project
hey guys im using pandas and my dataframe is opening funny, idk what to do to fix it, its opening like shifted to the left one so the col1 returns col2 values and the col1 values are getting used as ig the index...
df = pd.read_csv('data/src/games.csv')
print(df is not None)
df.head(1)
Returning
AppID Name Release date Estimated owners Peak CCU Required age Price
20200 Galactic Bowling Oct 21, 2008 0 - 20000 0 0 19.99 0
pretty much only numpy, i'm only using pandas to parse data
what's an assembly neural network
You might have hanging delimiters, try index_col=False
df = pd.read_csv('data/src/games.csv', index_col=False)
Hi, so yeah this worked i forgot to mention that i found the solution, i had been misunderstanding and putting index_col=0 instead of False... lol
the result of pd.read_csv is never going to be None. print(df.empty) will tell you if the dataframe is empty.
How do I set a data label?
If the data are images, and you have them all in a folder, you can also have a CSV that has the name of each file and its label.
The label is not an inherent part of the image.
So all I need is a folder sorry because I'm still working on my convolutional neural network I just want to know so I can add pictures of my own idea
If you add your own images, and each image is a file, you'll need to make a text file to say which image has which label.
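A minimal sketch of the "text file that says which image has which label" idea, using a CSV and the standard library. The column names `filename` and `label`, the example rows, and the `read_labels` helper are assumptions for illustration. Labels are lowercased and stripped so that "Fox" and "fox" end up as the same class.

```python
import csv
import io
from typing import Dict, TextIO


def read_labels(csv_file: TextIO) -> Dict[str, str]:
    """Map each image filename to its label from a CSV with 'filename' and 'label' columns."""
    reader = csv.DictReader(csv_file)
    return {row["filename"]: row["label"].strip().lower() for row in reader}


# Hypothetical file contents; a real project would open labels.csv from disk instead.
EXAMPLE_CSV = """filename,label
img_001.jpg,fox
img_002.jpg,Fox
img_003.jpg,cat
"""

labels = read_labels(io.StringIO(EXAMPLE_CSV))

if __name__ == "__main__":
    for filename, label in labels.items():
        print(filename, "->", label)
```

To use a real file, pass `open("labels.csv", newline="")` in place of the `io.StringIO` object.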
Do I have to use TKinter?
where did you get the idea that tkinter would be involved in any way?
MNIST digit data set
the mnist digit dataset has nothing to do with tkinter.
If you open the data file it shows Tkinter label
idk where you're getting that idea.
this is the copy of the mnist dataset from kaggle
I saw the data file when I was using the code Studio
you can see that there are images and labels in different files.
Is there any tool for filtering and parsing before data creation? I was trying to parse the texts.
what is the data? like literally. are you saying that it's currently plain text?
What description do I have to type and is it for the network or just for the human?
for the network, so anything that needs to have the same label must have exactly the same label.
So if I gave it an image of a fox or multiple photos do I have to make a separate label for the same type of image of a fox?
each image that you want to label as a fox needs to be labeled as such.
not sure what you mean by "type of image of a fox"
And the training is the same?
first we should unpack what you mean by "type of image of a fox".
there should be no "type" that an image can have other than what label it has.
cat images. dog images. fox images. that's it.
if there is in your case, you're doing something too advanced.
What I mean is there are different types of foxes: arctic fox, red fox, fennec fox, etc.
All are scientific articles in pdf. I wonder should I change to xml to proceed these steps.
And are you trying to extract tables from them, or what?
Not tables. A specific set of words like “special” ‘feed’ ‘insect’ ‘1928’ ‘1972’ ‘virus’ usage in the pdf files
Please give the most specific example you possibly can of an example input and what you want to extract from it
if a one or more PDF contains the sentence: ‘The special feed given to insects in 1928 led to a virus outbreak in 1972,’ I want to extract instances where the words ‘special,’ ‘feed,’ ‘insect,’ ‘1928,’ ‘1972,’ and ‘virus’ appear together in the same context. The goal is to identify patterns of co-occurrence within sentences or paragraphs. Ideally, I’d like the output to highlight such sentences along with their source in all of my PDF files.
Do you want to consider "virus" and "viral" as the same word?
What about "insect" and "insects"? Or "feed" and "fed"?
I have a list variations, exclusions of my words that to be searched in the files.
When a sentence or piece of information is referenced, rephrasing it may result in different wording.
So i keep adding them in my list. depending on the meaning and context it will change like virus- viral
@deft oriole sorry, I was making dinner.
I recommend using spacy to split the document into whatever chunks you have in mind. Then converting each word to its lemma, and counting the co occurrences of those.
There are different tools for converting pdf to plain text. Mileage will vary.
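A rough sketch of the co-occurrence idea, assuming the PDFs have already been converted to plain text. It does not use spaCy: the lemmatizer is swapped for a hand-written variant map (the VARIANTS dictionary is hypothetical, standing in for the list of variations mentioned above), and the sentence splitter is a naive regex. The file names are made up.

```python
import re

# Hypothetical variant map: each surface form points at its canonical term.
VARIANTS = {
    "special": "special",
    "feed": "feed", "fed": "feed",
    "insect": "insect", "insects": "insect",
    "virus": "virus", "viral": "virus",
    "1928": "1928", "1972": "1972",
}
REQUIRED = set(VARIANTS.values())

def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def matching_sentences(documents):
    """Yield (source, sentence) pairs in which every required term appears."""
    for source, text in documents.items():
        for sentence in split_sentences(text):
            tokens = re.findall(r"[a-z0-9]+", sentence.lower())
            found = {VARIANTS[t] for t in tokens if t in VARIANTS}
            if REQUIRED <= found:
                yield source, sentence

docs = {
    "paper_a.txt": "The special feed given to insects in 1928 led to a virus outbreak in 1972. Nothing else happened.",
    "paper_b.txt": "Insects were fed in 1928. A viral outbreak followed.",
}
hits = list(matching_sentences(docs))
for source, sentence in hits:
    print(f"{source}: {sentence}")
```

Swapping the variant map for real lemmas and the regex for a proper sentence segmenter would be the next step. Matching on paragraphs instead of sentences only means changing the splitter.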
Hope you cooked the delicious dinner. I am planning to convert it to XML. Will try those spaCy and try variations.
Ohh thankyou
I made a chicken bowl like the kind you can get at chipotle.
you just have to get the marinade
^
guys when we ssh into a server, how do we run an ml model in gpus?
how do we know how many gpus is required for our LLM model training or inferencing?
how many times does a gpu speedup the process as opposed to using a collab or cpu?
how do we run an ml training on a server on gpu overnight, connected to ssh but close out local system?
ping me on reply
is there a place where i can take paid help ?
m sure this isnt right place to ask.
I am trying to detect bottlenecks or critical points in an entailment graph that correspond to independence results. One approach could be using graph-theoretic measures like betweenness centrality to find nodes that many inference paths pass through. Would that effectively highlight points where logical dependencies concentrate? Another possibility is looking at articulation points and bridges—would their removal reveal independent statements? Low-degree nodes might also be worth investigating, especially those with few incoming edges, since they could indicate statements not easily derivable. Topologically, would persistent homology or cycles in the graph provide insight into independence? Finally, are there any other logical or computational techniques that could make this more precise? I would appreciate any thoughts on better ways to approach this.
a neural network coded in assembly language
assembly is the lowest-level (i think) programming language before machine code
machine code is the 1's and 0's
I would treat it as a red flag. it shows that the applicant wasted time learning the wrong thing in a misguided effort to look skillful.
there's so much to learn about AI that it's a waste of time to go that deep into assembly.
bro sounds like me back when i was spending all my time just implementing data structures from scratch
then some dude came in and said i don't need data structures much for ml
i thought damn i really wasted all this time for nothing
it's worth implementing data structures from scratch so that the DSA course that you'll have to take for a CS degree is easy for you.
yeah i guess
i did it in java, the one i'm taking next year will be in c++
lame
i haven't learned c++ yet though so i need to learn it
only language i know decently is java
python too but i haven't used it much, i need to start using it more
C++ is marginally more useful for ML than Java, which is effectively useless.
yeah that's what i've heard
cause of c++ extensions right
i just did it in java cause it's what i knew best
i'm probably never touching java again though lol
I think models for like self-driving cars are trained in a python simulation, and then the model is "compiled" to C++ for deployment in the vehicle.
you can write extensions in C++, but you probably won't need to given platforms like numba.
it's becoming increasingly popular to write python extensions in rust.
polars polars polars
I want to make an application where I can picture and it extract the text from photo how can I achieve this thing
the application youre saying is named ocr (optical character recognition)
and if you know nothing about aiml and start with implementing your ocr, dont do that.
but if you know some stuff and wanna do it, try and do it on your own by researching or/and if that doesnt work out, follow a tutorial
im pretty sure theres one out there.
im wanting to setup vscode with a coding assistant but i see a ton of options. like copilot, codeium, aider, cursor, and others. im wondering what you all like to use and why. mostly as a coding assistant, autocomplete and helping spot errors. or write small blocks of code. occasionally asking to write larger blocks to see what i get.
ive been using python for almost 10 years so im already really familiar with the language as it is.
can someone help with this please:
i tried to learn python
i ended up with severe brain damage
please someone help me unsee this awful language
@tame scarab were you trying to learn python for data science?
in either case, if you decide you want to learn python, there are plenty of people here who would be happy to help you. but if you're just here to rage bait, kindly stay away until you change your mind.
Hi can I ask a question if ur available?
idk what the question is or if it's something I know anything about.
Thank you. I am trying to make an image recognition model for Go board scanning and I don't know how to train a model for that task.
me either.
it will probably have something to do with a convolutional neural network
I don't. I think I know how it works, sort of.
the board is always a grid, is it not?
yeah it is
then you probably don't need AI for this. you can impose a grid structure on the image and see which tiles are mostly black or mostly white.
What library can I use for that?
PIL might help
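A sketch of the "impose a grid and check brightness" idea, under some loud assumptions: the photo is already cropped and squared so that each intersection sits in the middle of one of 19x19 equal cells, it is grayscale, and the two thresholds are made-up values that would need tuning. The example runs on a synthetic array. With a real photo one would load it first, e.g. with PIL's `Image.open(path).convert("L")` and `np.asarray`.

```python
import numpy as np

def read_board(gray, size=19, dark=80, light=180):
    """Classify each cell as 'B', 'W' or '.' from its mean brightness.

    gray: 2-D uint8 array already cropped to the board (an assumption).
    dark/light: hypothetical brightness thresholds; tune them on real photos.
    """
    h, w = gray.shape
    cell_h, cell_w = h / size, w / size
    board = []
    for r in range(size):
        row = []
        for c in range(size):
            # Sample the central half of each cell, where a stone would sit.
            y0, y1 = int((r + 0.25) * cell_h), int((r + 0.75) * cell_h)
            x0, x1 = int((c + 0.25) * cell_w), int((c + 0.75) * cell_w)
            mean = gray[y0:y1, x0:x1].mean()
            row.append("B" if mean < dark else "W" if mean > light else ".")
        board.append("".join(row))
    return board

# Synthetic 190x190 "photo": mid-grey board with one black and one white stone.
img = np.full((190, 190), 130, dtype=np.uint8)
img[30:40, 30:40] = 10       # black stone at row 3, column 3
img[150:160, 90:100] = 245   # white stone at row 15, column 9
board = read_board(img)
print("\n".join(board))
```

Real photos add perspective distortion, glare and shadows, so finding the board corners and warping the image flat would have to come before this step.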
Lemme study on it. Thx for help
It's not a good portfolio project to write a neural network in some language just because it's perceived to be "low level" and "difficult" (neither of which is true). If you could show that this somehow had an advantage over writing them in Python or C or something, then I would be interested (but there is no advantage). It's important to remember the goal, which is to make a machine learn, and do so better than before, not write it in X language. It's like the goal is to get to the moon and you are showing off how you did all the math using a slide rule. Sure you can do that, and it will work, but will that convince me that you are somehow better than other candidates?
so implementing a neural network with just numpy wouldn't work as a portfolio project
it would just be to learn
why do some people genuinely hate python just because it's an easier language
i never understood that
“There are only two kinds of languages: the ones people complain about and the ones nobody uses.” - Stroustrup.
When working with langchain, and the create_react_agent() function from lang graph, does the short-term memory automatically update itself after each message session?
E.g.
memory = MemorySaver()
agent = create_react_agent(llm, tools=tools, checkpointer=memory)
Hi, can you recommend me a book or free course on machine learning that has a strong emphasis on mathematics and explains things like regression classification or svm in detail (these are topics that I know exist, but apart from basic linear and logistic regression I know nothing about)
Check the pins
"ai will break" write my words somewhere
What do you mean?
?
i think so much ai shit gonna come up that we will start feeding ais with ai data and that will break it
mark it somewhere
That won't "break" it, but it will cause new problems
ai do not break bro
not working properly = break
That's a very expansive definition of "broken". Models, by their nature, can't be guaranteed to do a certain thing perfectly every time
So in that sense, they're all broken.
But if you mean "model feedback loops will cause LLMs to start generating grammatically incorrect or incoherent text", I think you're wrong
yep that
i think that
That's a much more specific claim than "AI will break"
well i said feeding ai with ai, would give that what u said
Generative language models are an astronomically small subset of "AI"
FOlks
Please I need a response
Is the memory automatically managed?
I think people will select and clean data for LLMs better in the future, instead of just feeding everything they can to them
So I think this is unlikely to happen
There isn't a surefire or even a good way to tell if text is LLM generated or not, and training LLMs requires more text than can be manually curated
True, but I don't think the goal here should be "filter out ALL LLM text", the goal should be "filter out the most garbage LLM text" so it doesn't affect the final model as much (at least not enough to "break" it)
And I don't think that'd be nearly as hard of a task (not to say it would be easy, but it wouldn't be impossible)
p sure most dataset makers all at the very least filter for the crap sentences (unless there's a reason not to, e.g. specifically trying to mimic internet talk)
for a while now people are going one step beyond, also removing correct but overused phrases llms tend to generate to varying success; but at worst, including this type of ai slop won't make it incomprehensible, just bland
So I have to make a separate folder for each image with a label or do I make an image with in an area with a bunch of labels
If you make a folder for each label and put all the images with that label in the folder, that will work.
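A small sketch of the folder-per-label layout, using only the standard library. The function name and the file names are made up for illustration, and the example builds a throwaway directory with empty placeholder files just to show the structure. The label of each image is simply the name of the folder it sits in.

```python
import tempfile
from pathlib import Path

def list_labeled_images(root, extensions=(".png", ".jpg", ".jpeg")):
    """Return sorted (file path, label) pairs; the label is the parent folder name."""
    root = Path(root)
    pairs = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        for image in sorted(folder.iterdir()):
            if image.suffix.lower() in extensions:
                pairs.append((image, folder.name))
    return pairs

# Build a throwaway dataset with empty placeholder files to show the layout.
with tempfile.TemporaryDirectory() as tmp:
    for label, names in {"fox": ["a.jpg", "b.jpg"], "cat": ["c.png"]}.items():
        (Path(tmp) / label).mkdir()
        for name in names:
            (Path(tmp) / label / name).touch()
    labels = [label for _, label in list_labeled_images(tmp)]

print(labels)
```

Every fox photo goes in the same "fox" folder, whatever kind of fox it is, unless the network is supposed to tell the kinds apart, in which case each kind gets its own folder.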
Hiii i have a doubt while doing a prediction project in kaggle I should do feature selection always. Is this approach correct?
Guys, i have a problem. I use Spearman to calculate the correlation. And i think the data set is fine, but why did i got p values: 0.0? Like, is that even possible?
no. sometimes all features are helpful, for example
it means that p is so small it rounded down to 0
so basically you can pretty confidently reject the null hypothesis
Does that go with any type of data sound data video data sorry
it would work as a way for you to understand how they work
i did that when i first started learning aiml
it made me so frusturated but made me learn so much about them
so unless it starts taking too much time
try to do it
I'm trying to use tensorflow for a little project im doing. for some reason vs code never recognises the imports. im using mac and vs code and its definitely installed via pip3
Does the code run? I don’t always match my IDE kernel to the one I run the program with
It runs and then spits out like a million errors
hey guys how you learn ML? like any sources?
just to get more information and resources
if you're interested in neural networks, i found this course really fascinating:
https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
What are the neurons, why are there layers, and what is the math underlying it?
I watched it I know
I meant like how you guys bulid this stuff? where you got theory?
did you do the number image recognising program?
well with guide yes...
but I don't think I understood it fully
try making another neural network with something else if you want
just try find a data set and then see if u can do it
hm alright, perhaps I should try
and you should probably watch more videos.
I know, that's why I'm asking this question
is this jupyter notebook?
if yes that happens to me too, it looks like its not recognizing but it imports normally
I have had very little success with neural networks, but when I do I usually only understand stuff after using multiple sources, because they're all so complicated.
for real
StatQuest is the best resource
he simplifies them while making everything so clear
if you want you can try it with the quick,draw! dataset
and make a clone of it even
mhm, thanks for advice
So I'm doing a Data Science course online that discusses the topic of random walks vs a sequence of random events.
Is a "Random walk" considered a formal concept in Data Science?
~~I only ask because I want to ask a question in #1035199133436354600 .~~edit: I think I got it.
Please at me if you have anything.
yes, "walk" has a specific meaning in graph theory.
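For the statistics sense of the term, a tiny sketch of how a random walk differs from a plain sequence of random events: the steps are independent draws, and the walk is their running total, so each position depends on all the steps before it. The seed and step choices here are arbitrary.

```python
import random
from itertools import accumulate

rng = random.Random(0)  # fixed seed so the run is repeatable
steps = [rng.choice((-1, 1)) for _ in range(10)]  # independent random events
walk = list(accumulate(steps))                     # position after each step

print(steps)
print(walk)
```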
Survey things ML/AI can solve
Find a topic you're interested in
Gather resources (books, videos, papers) that cover theory and implementation
Build something using resources
papers!!!
yo how do i make it so that when i change the name of a class in 1 file it changes the name in all the other files?
I don't even know where I can find his channel
StatQuest on YouTube
if youre on vscode, select the name then right click, you should see a "rename" option
no it's not
oh then idk
have you restarted your vscode, im pretty sure youre smart enough to do that but incase you havent
i have yeah lol maybe ill try again and just hope for some luck
Yeah, I tried searching for him on YouTube, but it doesn't look like his account.. Do you mind pasting his link?
https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw
it was the first google result for "statquest youtube"
Statistics, Machine Learning and Data Science can sometimes seem like very scary topics, but since each technique is really just a combination of small and simple steps, they are actually quite simple. My goal with StatQuest is to break down the major methodologies into easy to understand pieces. That said, I don't dumb down the material. Instea...
Thank you 🙂
hell yeah bro
i used this too
3blue1brown is insane
the 3blue1brown playlist should've put you in the direction you need to be for learning theory
from there you know what to learn and dive deeper into
he's so good i love him
did u select
the right environment? also recommend creating a venv if u havent already
or if you get anaconda i believe all of those packages are installed with it
but i'm not 100% sure
does anyone have a good resource that teaches you enough theory on neural networks to implement them from scratch yourself, but doesn't show any of the coding
Check the pins
like this?
?
I wasn't talking to you when I said that.
I know I just was wondering about graph theory
How on earth am I supposed to know that when all you wrote was "?"
I forgot to add things and just it's been a long year
You can read this if you want to know what graph theory is. https://en.m.wikipedia.org/wiki/Graph_theory
In mathematics and computer science, graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices (also called nodes or points) which are connected by edges (also called arcs, links or lines). A distinction is made between undirected graphs...
thank you
also is this correct
I can't really tell what I'm looking at in that picture.
You've been trying to train a neural network to classify images for several months. It looks to me like the help people have tried to give you here hasn't gotten you any closer to that goal. I don't really know what to do.
Is English your native language?
yes
Are you in high school or what?
No I am not in any type of schooling as of currently
is there somewhere that you could audit an in-person ML course?
An image of a fox and it's label in text format
I believe so a local academic maybe
hi
which model is more suitable for finding out most similar sentences? the dataset is a software - bugs dataset
basic embedding model maybe
theres a ton of embedding models you can run locally on ollama
Thanks for the recommendation, he really does explain everything nicely
And the videos are organized into playlists too, which I love
how about sentence transformers?
are there any simpler models?
was thinking of KNN approach but did not yield good result
what do you guys think of kaggle?
it's good.
just that? do you use it yourself?
I recommend their pandas tutorial.
have already done pandas just wanna become better with predicting things and maybe building powerbi dashboard since a lot of companies seem to love those
glad i could help you and get that man a well deserved new follower :)
Is it possible overfitting can still occur if our validation set is exactly what we optimize for? Do I need ( and why do I need ) to plot train val loss and train val acc during training
why can't it? I'm not sure if I get you
the model of your choosing trains on the training dataset and tests if it's good on the validation set; very good scores on train with very bad scores on validation is a pretty clear indicator that it's overfitting
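A toy illustration of that train/validation gap, not tied to any particular model from the chat. It assumes made-up data (a straight line plus noise) and fits two polynomials with numpy: a degree-1 fit and a degree-9 fit that can pass through all 10 training points. Comparing the two errors per model is the same thing the train/val loss plots show over time.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_line(x):
    # Hypothetical ground truth: y = 2x plus Gaussian noise.
    return 2 * x + rng.normal(0, 0.3, x.shape)

x_train = np.linspace(-1, 1, 10)
y_train = noisy_line(x_train)
x_val = rng.uniform(-1, 1, 200)
y_val = noisy_line(x_val)

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    # Store (training error, validation error) for this degree.
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
    print(degree, results[degree])
```

The degree-9 fit drives the training error to nearly zero by memorising the noise, and its validation error is typically much worse than its training error. That gap is the overfitting signal. Tuning hyperparameters against the validation set over and over can also overfit to it, which is why a separate test set is usually held back.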
im pretty sure i selected the right environment. ive never created a venv before, does it change that much?
ill try that, thanks
no, however i found a massive set of training and testing data for images of sign language and im currently trying to make a program that will detect signlanguage in real time using a pi with my friend. i can link the resources if you'd like
also if u manage to finish it before me please help because im stuck rn 😭
tensorflow as tf 😂 tf like the fuck
no way.....
Which project they are working on ?
Check datacamp, they have powerbi courses that are pretty decent
oh that sounds cool, yeah can you send it here
currently i'm working on mnist classification, this is gonna be my first neural network and i'm working on implementing it from scratch
also for anyone looking for a good understanding of neural networks, this is a really good resource that 3blue1brown recommended, so far i've only read the first part which is just about perceptrons, and it's laid it very well and makes it super easy and intuitive to understand
@serene scaffold i think this would be a good pin
I'm at my wits end, and I'm hoping someone here can give me some insight. I am attempting to implement an "anonymization" software that uses InsightFace, ONNX, PyTorch, and OpenCV2. I did not write this software, it was provided to us by the client. (I'm primarily a .NET developer, and only know the basics of Python). I am able to run it (albeit slowly) on my local machine using WSL. We have a GPU-oriented VM in Azure that I'm attempting to get the software running on. However, when it tries to call the InsightFace scrfd_10g_bnkps detect() function, it spits out "Invalid handle. Cannot load symbol cublasLtCreate"
I have verified that my PyTorch, ONNX, and CUDA versions all match (12.6)
Hi, can I ask you about datacamp courses? I wanna try their ML Engineer course, what do you think about that?
I haven't done any of their ML courses, I have done some on PowerBI, SQL, Microsoft Azure and they're all pretty decent, consisting of some videos, multiple choice questions, and code exercises. The quality kinda depends on the person who made it.
But the courses I've done are more about learning the syntax, and some basics of a language or framework. ML can be very in-depth, so I'm not sure how their format is fit for that.
Maybe try codecademy, see if you like that format, and then maybe pay for datacamp if you prefer to use datacamp @plucky crane
Free?
I have a subscription from work, I think datacamp is paid, but not 100% sure
Please who has experience with langchain here?
My agent gets stuck in an infinite loop where it keeps calling tools repeatedly. Wasn't like this before and just started happening all of a sudden.
Well if the train accuracy is bad there's prob little/no chance the model is good. But the train accuracy could be good and the model still not be good
not sure, but if you open your notebook in vs code it might work
Make a folder called "fox" and put your fox images in there
Wendigo, before 2025 is over, we will have helped you build that CNN
I believe in you
i'm not sure if bro has done this yet or not, but a good textbook that makes the theory intuitive can be a game changer
neural network theory is so damn interesting
thank god i forced myself into this field, cause now i can't see myself doing anything but this
Do you have book recommendations for Wendigo?
i'm not too sure about cnns cause i too am just getting started, but if he hasn't fully learned the theory behind neural networks in general then the 3blue1brown nn playlist is good, and an online textbook he recommends and one that i too find really nice is this
haven't gotten too far into it, but the perceptrons section is very beautifully written
take derivatives do matrix mult
take more derivatives and do another
repeat till its over
xD
lol
i mean there's more to it than just that
but yeah in terms of math that's pretty much it lol
but i've started to like that too, ml has made me like math
Hello
Can anyone teach me how to apply regression in python....?
I know the mathematical concepts of it
you want the package version or by hand?
Just some basics
Okay I will try it
Well I am new to python
oh
Yeah package version is a few lines only
can say that
Didn't know u needed raw. if u need raw then it could be good amount of work maybe
Start with package version.
I just want to know that how to apply regression on any data I have
scikit-learn is the package name if u need to look it up
Okay
Yes then use scikit learn
import numpy as np
x = np.asarray(data_x, dtype=float)  # your features, shape (n,) or (n, k)
y = np.asarray(data_y, dtype=float)  # your targets, shape (n,)
if x.ndim == 1:  # a single feature: turn it into one column
    x = x.reshape(-1, 1)
# Add a column of ones to x for the intercept term (if not already included)
if not np.all(x[:, 0] == 1):  # Check if first column is all ones
    x = np.column_stack((np.ones(x.shape[0]), x))
# Calculate beta coefficients from the OLS normal equations
x_transpose = x.T  # Transpose of x
x_tx = x_transpose @ x  # X^T X
x_ty = x_transpose @ y  # X^T y
# Solve (X^T X) beta = X^T y directly; more stable than forming the inverse
beta_hat = np.linalg.solve(x_tx, x_ty)  # Final coefficients
# Predictions
y_hat = x @ beta_hat
# Print results
print("Estimated coefficients (beta_hat):", beta_hat)
print("Predicted values (y_hat):", y_hat)
is basic implementation of ordinary least squares
@ is matrix multiplication
if you do normal * then its elementwise
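A quick side-by-side of the two operators on small made-up matrices, to make the difference concrete:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

elementwise = a * b   # multiplies matching entries
matmul = a @ b        # rows of a dotted with columns of b

print(elementwise)
print(matmul)
```

For 2-D arrays, `a @ b` gives the same result as `np.matmul(a, b)`.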
Thankyou soo much...!
Got it
Hello, I’m running evolutionary algorithms to find optimal features in a dataset for a DNN. I swapped to parallel processing to speed things up (otherwise it would take days to execute) and the results are slightly different to sequential operation. I’ve try to make it as deterministic as possible, but exact reproducibility is hard to find. I’m not that experienced with parallel programming - is it silly to hunt for exact reproducibility ? Is there differences I simply have to accept ?
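Two common sources of that kind of drift, sketched with the standard library only. First, floating-point addition is not associative, so combining partial results in a different order can change the last bits. Second, if workers share one random generator, the draws each task sees depend on scheduling. The `evaluate` function and its seeding scheme are hypothetical stand-ins for a real fitness function.

```python
import random

# Floating-point addition is not associative, so the order in which
# partial results are combined can change the last few bits.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)

def evaluate(candidate_id, base_seed=1234):
    # Hypothetical fitness function: each candidate gets its own generator,
    # seeded from its id, so the result does not depend on which worker
    # runs it or in what order.
    rng = random.Random(base_seed + candidate_id)
    return sum(rng.random() for _ in range(5))

in_order = [evaluate(i) for i in range(4)]
shuffled = {i: evaluate(i) for i in (3, 1, 0, 2)}
print(all(in_order[i] == shuffled[i] for i in range(4)))
```

Per-task seeding removes the scheduling dependence. Bit-level differences from reduction order, or from GPU kernels inside the DNN training itself, are often something to tolerate and check with a tolerance rather than chase down.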
Where to start and learn ai/ml from?
import numpy as np
arr_f = np.array([1.0, 2.0, 3.0], dtype=np.float32)
result = arr_f * 1.5 # 1.5 is a Python float
print(result.dtype) # Output: float32
Is this expected in numpy? I thought it would treat python floats as np.float64 and would promote the array.
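As far as I know this is expected: a plain Python float does not force the array to a wider dtype, so the float32 array stays float32. Promotion does happen when the other operand is itself a float64 array. The sketch below shows both cases. (How a NumPy scalar such as np.float64(1.5) behaves changed between NumPy 1.x and 2.x, so that case is left out here.)

```python
import numpy as np

arr_f = np.array([1.0, 2.0, 3.0], dtype=np.float32)

with_python_float = arr_f * 1.5                      # Python scalar: array dtype wins
with_f64_array = arr_f * np.array([1.5, 1.5, 1.5])   # float64 array: result is promoted

print(with_python_float.dtype)
print(with_f64_array.dtype)
```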
Depends strongly on your background
You'll definitely want to have a pretty solid foundation of the relevant math
FOUND IT FINALLY
Im in the beginning stages of utilizing python for ML/DS - does anyone have a code that I can use (API free) to have a chatbot in my linux (fedora 41) terminal?
Hello. I don't recommend doing a chat bot as a beginner project.
I don't think he wants to code a chatbot
think he just wants a chatbot in the ide with access to the files and codebase
who are we to say what he wants? we don't know his life.
Linear algebra, stats, probability and calculus it is right?
Was looking for some books that cover these from scratch to a good lvl
you're right
Yeah, these are the most important
There are more to it?
ML is a big field, it uses math very extensively
Depending on your current math level, I'd probably jump into methods like linear regression using gradient descent, k means clustering and others right away
And you can learn math as you go
Where can I from?
Idk, have you ever taken a lin alg or calc course? What's your background
I personally just learn through Wikipedia most of the time
I am engineering student, but havent really studied well for maths. Just only to pass the tests with good marks and so I dont recall anything
Although I might recall once I getinto it again
Wikipedia for maths…
Yeah, a lot of their pages are very high quality
not all though, especially for more obscure stuff books can't be replaced :p
Hm, okay
Well, I'd say try your hands on making a linear regression solver from scratch with numpy or something and see how it goes
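One way such a solver could look, as a minimal sketch using gradient descent. The data is made up and noise-free (y = 3x + 2) so the right answer is known, and the learning rate and step count are assumed values that happen to work for this data.

```python
import numpy as np

# Hypothetical noise-free data from y = 3x + 2, so the right answer is known.
x = np.linspace(0.0, 1.0, 50)
y = 3.0 * x + 2.0

w, b = 0.0, 0.0          # slope and intercept, starting from zero
learning_rate = 0.5      # assumed value; too large a rate makes the loop diverge

for _ in range(2000):
    error = (w * x + b) - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(float(w), 4), round(float(b), 4))
```

Working out where the two gradient lines come from (differentiating the mean squared error) is a good first bit of calculus to relearn along the way.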
Sure but thought could learn some maths
Btw taking a udemy course would be a good idea for ml? Or the andrew ng? Or something else
I wouldn't buy any course
Andrew NG I've heard good things about, you could also look for university lectures
or just get a book
3blue1brown on YouTube has good videos on calculus and linear algebra, he explains them intuitively and with visual examples
statquest on YouTube explains statistics well, if you just search "statquest playlist" on YouTube you'll get results both for his Statistics Fundamentals and Machine Learning playlists, I like his explanations a lot so far
Whot book
Have used them a bit for some clg courses, will check the playlists
I can't recommend anything specific unfortunately
i second this
3blue1brown's craft is absolutely impeccable
Best educational channel on YouTube in my opinion
agreed
when using sentence transformers what text pre-processing steps are not required ? like lower-casing, removing stop words, stemming/lemmatization -- correct ?
Yes
is there a way to label 70k images automatically, manually it will take forever recommend what can i do to automate it its ok if its paid
^^ even i wanna know
Train deep learning model on a sample
If performance is good, use it to label the rest
You would have to manually label this sample
ohhhhh
may i ask, i want to know how feasible is our thesis. it is a implementation of computer vision to detect 13 vehicle classification and count them going inside the premise and will be log to database. also will raspberry pi with hailo kit enough or do we need jetson orin nano for it?
This is a well worked problem that could be run on a simple laptop
for datalabeling you can use aws groundtruth, it is not cheap though
U don't need special hardware
Ideally there are zero shot models that can already do this
I'm speaking from a language model point of view. You will have to see if such a thing exists for deep learning
Using OpenAI's GPT-4 , which has Vision capability is also a good and cheap option
we will be implementing it so it requires microprocessor, but i dont have an idea if it will work in rpi with hailo kit (easy to find in our country) or we need jetson orin nano (harder to find/quite rare)
yo guys, currently i'm working on implementing a basic feed forward neural network from scratch (just using numpy) to classify digits from the mnist dataset
i'm getting this error message when taking the exponential of a vector, should i just disregard the error message since the array outputs just fine
(as you can probably tell, this is just the set up of the multilayer perceptron, haven't started backpropagation yet)
You can literally run detectron locally and this is detectron....so no u wouldn't need hardware that is specialized
That's a warning, not an error per se.
If you're having an overflow, that might mean you've exploded the gradient.
oh i see
exploded the gradient?
wait what does that mean
also is changing the datatype to float128 in order to increase capacity necessary
Exp is overflowing. Consider computing the sigmoid piecewise so it handles exceedingly large/small inputs:
- 1 / (1 + exp(-z)) for z >= 0
- exp(z) / (1 + exp(z)) for z < 0
Both branches are the same sigmoid function, just written so that exp never gets a large positive argument. Now instead of blowing up to infinity, the activations approach their asymptotes (0 and 1)
yes
just like bert
oh i see
yeah i've seen the second form
it's just what you get when you multiply e^z to the top and bottom
but how would this work though, cause it's an array
would i have to index through
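No explicit loop is needed: boolean masks let numpy apply each form only to the entries it is safe for. A minimal sketch, with `stable_sigmoid` as a made-up name. It uses masks rather than `np.where`, because `np.where` would still evaluate both formulas on every entry and trigger the same overflow warning.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid for an array, computed without ever calling exp on a large positive number."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # Non-negative entries: exp(-z) is at most 1, so it cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # Negative entries: exp(z) is below 1, so it cannot overflow either.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

with np.errstate(over="raise"):  # turn any overflow into an error to prove there is none
    values = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(values)
```

With this in place float128 should not be needed. The overflow came from the formula, not from float64 being too small for the answer, which always lies between 0 and 1.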
What exactly do they do in artificial intelligence? I coded with sklearn and keras. But it seemed very basic to me. 🤔
I can understand that these are based on algorithms like gradient descent and that they are difficult to code.
Am I missing the point?
I can solve problems with high accuracy on Kaggle.
Actually, I had a hard time understanding the transformer model. Also, it seems quite difficult to translate it into code. Maybe this is the real problem. 👀
AI professionals are solving harder problems than the ones that can easily be solved with a few lines of sklearn or keras
And sklearn and keras are hiding lots of complexity
Like autonomous vehicles?
Sure
Solved
oh lmao i didnt even realise
tf as in tf am i doing 😭 😭
yeah i got it from mnist too
Can anyone suggest me some really awesome projects on data science as I am new to this domain , want some impressing projects on my resume
live stock market value scraper and analyzer
it contains:
sql
web scraping
frontend
backend
stats
It's already being done by my classmate,so I can't take the same project as per the college rules
But thanks for the suggestion
How do I learn Python for data science? I am new
Is there any way to stream live data to streamlit from a database and update metrics without having to reload the entire streamlit application?
yup its possible
use sleep function while you are fetching data and then use st.rerun()
ah i see, thank you
Hello and welcome to our wonderful python server. This is the data science channel. maybe you were looking for #python-discussion
Hey everyone
I have a question. Many of the developers make Jarvis and other chatbots but why we can't call them AI whereas we called chat gpt and deepseek as a AI
oh i'll probably do this right after i finish the number classifier
are you doing it from scratch or with any frameworks?
also were any of you able to get git lfs working? i'm having some issues with it because of the mnist numbers csv file being higher than the limit but git lfs for some reason isn't working for me
i'm periodically pushing all my programming files to a private github repo for when i switch devices
my bad if this is the wrong channel to ask this question, i just thought that here there would be others who have done this as well since csv files containing training data are typically huge
The Jarvis in the Marvel films is an artificial general intelligence. Nothing that currently exists--including ChatGPT and deepseek--is anywhere near that.
AI has imo become more of a marketing term now that doesn't really have meaning
all mainstream AI you see right now, including ChatGPT, Deepseek, Claude, Gemini, Mistral, Llama, and way more, are LLMs (Large Language Models) which excel at generating convincing text
currently companies are trying to make them good at other tasks beyond "simply" writing human-like text, like reasoning, coding, etc. to... let's say less than ideal results
the results are on par with human ability for some tasks.
that is true; though there's still a pretty far distance between current llms and the likes of Jarvis as seen in the movies
i think he was talking about this jarvis lol https://jarvis.cx/
Boost productivity with Jarvis-Best AI Assistant Powered by ChatGPT: Instantly translate message, improve readability, shorten it and ask anything directly on the input text area
cause he said jarvis and other chatbots
iron man's jarvis in real life would be insanely cool though
my interpretation is that he's referring to the bjillionth "I made jarvis" videos on youtube
You can't try to hire people in this server. Be sure not to ask again.
okay
@serene scaffold
!ban 582700648152432640 soliciting employment in DMs after being told not to use this server for that.
:incoming_envelope: :ok_hand: applied ban to @frosty finch permanently.
Im not sure if this is exactly the right place to ask but ill try to articulate this carefully
so, long story short im helping my friend's girlfriend with her masters research capstone thingy and she's like bad at tech
her masters is like in gender studies and shes gonna cite my tech contribution so its not like a academic dishonesty type thing
but basically im making a python web scraper for the chinese site douyin that will search videos under some hashtag, take a sample then record every comment under each video.
ive already created a tree like data structure for the comments that is nice and searchable
so that shouldnt be a problem
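For readers wondering what such a searchable comment tree might look like, here is a minimal sketch. The `Comment` class, its fields, and the `search` helper are assumptions for illustration, not the poster's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    """One comment plus its nested replies (hypothetical field names)."""
    comment_id: str
    author: str
    text: str
    replies: list["Comment"] = field(default_factory=list)

    def walk(self):
        """Yield this comment and every reply below it, depth-first."""
        yield self
        for reply in self.replies:
            yield from reply.walk()


def search(roots, keyword):
    """Return every comment under the given roots whose text contains keyword."""
    needle = keyword.lower()
    return [c for root in roots for c in root.walk() if needle in c.text.lower()]
```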
however, when i began trying to figure out user data collection (just public profile data) i ran into a problem: there's a captcha pretty frequently. i did figure out a way to solve their captcha using image interpretation, but i thought that would be a pain in the ass, so i told her to see if she could get access to douyin's open platform sdk
we got approved for it, however, it's only accessible via java, go, etc., not python or c++ (which i know how to call from python)
so yeah i guess my question is: should i just code a separate java application that takes a list of users, gets their data, and outputs it as a csv, and then have python parse that, or figure out a way to handle the java calls within python?
thanks in advance
also in case anyone has used the sdk and has any tips please let me know as all the documentation is in chinese and i cant read it lol. i doubt it though as it seems pretty unknown outside of china
Java first, python after. That's meta
ok sounds good, btw what do you mean by "that's meta" in this context
As in how it's done in enterprise
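The "Java first, Python after" split could be wired up roughly like this. It is only a sketch: the jar name, its command-line arguments, and the CSV columns are hypothetical, since the SDK's real interface isn't shown in this conversation.

```python
import csv
import subprocess


def run_java_exporter(jar_path, users_file, out_csv):
    """Run the (hypothetical) Java exporter; raises CalledProcessError if it fails."""
    subprocess.run(
        ["java", "-jar", jar_path, users_file, out_csv],
        check=True,
    )


def load_profiles(csv_path):
    """Read the exported CSV into a list of dicts keyed by the header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Usage would be something like `run_java_exporter("exporter.jar", "users.txt", "profiles.csv")` followed by `load_profiles("profiles.csv")`. Keeping the CSV file as the only contact point means the Java side and the Python side can be tested separately.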
Hello.
Has anyone here used CrewAI to build agents? I need some help regarding using tools
hello guys, i've been trying hard to code a tiny AI that has to reach a low "final loss" using only ChatGPT
if you're interested, here's what it gave me after 1 hour:
Click here to see this code in our pastebin.
Here are the main steps of the algorithm:
Initialization: The population of neurons is initialized with random subsets of features from the input data.
Training and elite selection: The neurons are trained on the data, and the best-performing neurons are selected to form the elite group.
Mutation and crossover: The elite neurons undergo slight modifications (mutation) and exchange information with other neurons (crossover) to generate new neurons.
Extinction of the worst neurons: The worst-performing neurons are eliminated, and the population is renewed.
Parameter adjustment: Based on the elite’s performance, mutation rates and other parameters are adjusted to guide the evolution of the population.
Convergence: The algorithm continues until the population of neurons converges to a solution with a low prediction error.
basically i asked it to mimic the evolution of a group: keep the smartest neurons and kill off the worst ones, the smart neurons create fresh new neurons, then i take the answers of all the smart ones and compute a weighted average of the result
but i dont know how to evaluate if its good or not
the lowest error rate it reached was 7 percent and the highest was 20 percent, sooo not stable, but i dont save the training to a json / db, sooo maybe i will try to do that
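A stripped-down sketch of the loop described above, to make the steps concrete. It is not the pastebin code: here each "neuron" is assumed to be a plain linear model over all features (no random feature subsets, no per-neuron training), the mutation rate just decays on a fixed schedule, and all names are made up for illustration.

```python
import random


def predict(weights, row):
    """A 'neuron' here is just a linear model: a weighted sum of the features."""
    return sum(w * x for w, x in zip(weights, row))


def error(weights, rows, targets):
    """Mean squared error of one neuron on a dataset."""
    return sum((predict(weights, r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def evolve(rows, targets, pop_size=30, elite_size=6, generations=200,
           mutation_scale=0.3, seed=0):
    """Return the elite neurons after evolving a random population."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    # Initialization: random weights for every neuron.
    population = [[rng.uniform(-1, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elite selection: keep the best neurons, the rest go extinct.
        population.sort(key=lambda w: error(w, rows, targets))
        elite = population[:elite_size]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]           # crossover
            child = [w + rng.gauss(0, mutation_scale) for w in child]  # mutation
            children.append(child)
        population = elite + children
        mutation_scale *= 0.99  # parameter adjustment: fixed decay (an assumption)
    population.sort(key=lambda w: error(w, rows, targets))
    return population[:elite_size]


def ensemble_predict(elite, train_errors, row):
    """Weighted average of the elite's answers; lower error means more weight."""
    weights = [1.0 / (e + 1e-9) for e in train_errors]
    total = sum(weights)
    return sum(w * predict(n, row) for w, n in zip(weights, elite)) / total
```

On the "how do i evaluate it" question, one common approach is to hold back part of the data, evolve only on the rest, and measure `error` on the held-back rows. Running `evolve` with several different `seed` values and comparing those held-out errors gives a rough picture of how stable the result is.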
hey, does anyone know any good resources/websites/videos/books to learn content from https://ioai-official.org/wp-content/uploads/2025/01/Syllabus-2025-Final.pdf
its about Classical Machine Learning, Neural Networks & Deep Learning, Computer Vision and Natural Language Processing
theres a lot more detail of the subtopics in the pdf
just to make sure, for the first column in the training data file, the numbers correspond to letters with 0 being A and 25 being Z right
and how come there's only 9 pixels rather than 784 like in the mnist numbers dataset
Are there any good tutorials on how to train AI, specifically a chatbot?
chatbot to do what?
respond to small talk and make conversation.
"chatbots" that are interesting enough to count as "AI" require enterprise hardware to train and absurd amounts of training data.
Oh okay well is there a way to just make a super dynamic small ai?
like maybe 50mb of training data?
You can't make anything dynamic with 50mb. let alone super dynamic.
if you're interested to learn about AI in general, and you're willing to start with projects that are unremarkable, I will help you.
well the issue is im coding on a not very good laptop.
I would love to learn more about ai, right now mine are all rule based
But I would love to if 4gb of ram on a windows 10 is good enough lol
you can do all your learning code on google colab
Ive been very confused on how to use that in the past
do you understand how notebooks work?
like real life ones...???
code notebooks
colab is a notebook.
at least, it's a notebook platform.
where you can run segments of code individually as you're writing it
oh okay
how many images will i need for each class? it's for our research/thesis: detecting and classifying vehicles entering a premises, and counting them for a traffic survey
How can I fix this? I have torch cu126 installed and cuda 12.6 update 3 installed as well as the system variables set to the appropriate cuda version
installing triton for windows rn
There's a website called Kaggle where you can learn this stuff. And the first thing their course teaches you is how to work with notebooks, so might be worth checking out
