#data-science-and-ml
Oof why did my copy have it commented out
since when does pygame throw seg faults
Copy paste did me dirty
Well I dont think it chooses to
It is likely trying to either talk to a DLL which isn't there
or some UB internally
since always, they're a real treat too, but it might not be pygame that's throwing it, it might just be catching it from elsewhere
how do i fix it
Hopes and prayers
and a lot of commenting lines in and out
Welcome to UB related errors
should this be a bug report
as I said, they're a real treat 
UB?
Undefined Behaviour
can you share the segfault?
I.e. null pointer, out-of-bounds memory access, invalid DLL read, the list goes on
(also pygame-ce)
Segmentation fault (core dumped)
The main issue with them is just because one line runs into the segfault, it does not mean that line caused it
!trace
Please provide the full traceback for your exception in order to help us identify your issue.
While the last line of the error message tells us what kind of error you got,
the full traceback will tell us which line, and other critical information to solve your problem.
Please avoid screenshots so we can copy and paste parts of the message.
A full traceback could look like:
Traceback (most recent call last):
File "my_file.py", line 5, in <module>
add_three("6")
File "my_file.py", line 2, in add_three
a = num + 3
~~~~^~~
TypeError: can only concatenate str (not "int") to str
If the traceback is long, use our pastebin.
there is no stack trace
it can be the result of an earlier action that then created the undefined behaviour
what's the output then?
this
that can't be all of the output
```
pygame 2.5.2 (SDL 2.28.2, Python 3.11.9)
Hello from the pygame community. https://www.pygame.org/contribute.html
/home/sarati/Programming/rust/ven311v/lib64/python3.11/site-packages/torchrl/data/replay_buffers/samplers.py:37: UserWarning: Failed to import torchrl C++ binaries. Some modules (eg, prioritized replay buffers) may not work with your installation. If you installed TorchRL from PyPI, please report the bug on TorchRL github. If you installed TorchRL locally and/or in development mode, check that you have all the required compiling packages.
  warnings.warn(EXTENSION_WARNING)
device: cpu
Segmentation fault (core dumped)
```
that warning comes with and without the seg fault
can you switch to pygame-ce real quick, because the other pygame removed parachute
(and also because it's better anyway)
pip uninstall pygame
pip install pygame-ce
Unrelated but what is the backstory behind pygame and pygame-ce? Last time I used either, pygame was the only one around
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
Tbh if I am debugging seg faults, a lib saying it failed to load C/C++ DLLs would be a good place to start
Well that is a much better error
my environment is duct taped together, im afraid to touch it
That is a dream of a seg fault
wat
never seen one that long
That full dump gives way more info than just "Segfault xD good luck"
Are you on linux?
unfortunately
My guess
is torch is doing some threading related stuff
and pygame is not happy about making various calls from different threads
Does anything change if you set:
torch.set_num_threads(51
torch.set_num_interop_threads(1)
so i found integration bug?
Well, you have (maybe) found that not everything in Python is threadsafe
no
I swear i got this one time and it was related to the autograd in the c++ python files.
wait, how do you find the size of each word embedding?
well, the TL;DR is that the maintainer of that repo decided to ban all the core contributors due to disagreements regarding 3.11 release... so yeah, we just moved on to create a fork and continue development as pygame-ce pretty much, you can read more into detail here: https://www.reddit.com/r/pygame/comments/18xy7nf/what_was_the_disagreement_that_led_to_pygamece/
interesting
normally the embeddings are all the same size
Sorry I typo'd I meant to set each to 1 idk if you did that or not
either see what the relevant papers use or experiment
still error
only choice and only solution in all honesty
what?
Rip. Err, you could possibly try launching a debugger that can show the executing threads
I know pycharm supports it
idk of standalone ones or ones that aren't premium/paid for
@spring field mr pygame dev do you know how to fix it?
it's not an issue with pygame(-ce) as far as I can see
Other thing you can try is turn the pygame logic into a single threaded actor
and send updates via a queue
so all the pygame logic sticks to the same thread
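A minimal sketch of the single-threaded actor idea above, using only the standard library. The `render_worker` function and the dict-shaped updates are made up for illustration; in a real program the pygame calls would go where the comment is, and the consumer loop would probably run on the main thread rather than in a worker.

```python
import queue
import threading


def render_worker(updates, rendered):
    """The only thread that is allowed to touch the 'display'."""
    while True:
        item = updates.get()
        if item is None:  # sentinel value: shut down cleanly
            break
        # Real code would make its pygame calls here (blit, flip, ...).
        rendered.append((threading.current_thread().name, item))


def producer(updates, producer_id):
    """Stands in for training code: it only enqueues plain data."""
    for step in range(3):
        updates.put({"producer": producer_id, "step": step})


updates = queue.Queue()
rendered = []

worker = threading.Thread(
    target=render_worker, args=(updates, rendered), name="render"
)
worker.start()

producers = [
    threading.Thread(target=producer, args=(updates, n)) for n in range(2)
]
for thread in producers:
    thread.start()
for thread in producers:
    thread.join()

updates.put(None)  # every update is queued before the sentinel
worker.join()
```

Whatever thread produces an update, it is always the `render` thread that consumes it, so the display code never runs concurrently with itself.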
I need to go to sleep though so I can't help much more than that until tomorrow
i appreciate you trying
im on a bit of streak of getting random failures
even though its caused by a pygame line?
where i gotta put bug report?
Unless you get the error on a smaller file using only pygame and no pytorch
I suspect it is not going to be labelled as a fixable bug regardless
where do you see it being caused by pygame code?
It will be put under "Don't use this library across multiple threads"
Just to be sure, make another file that does the basic pygame logic without all the AI and pytorch stuff
#screen = pygame.display.set_mode((screen_width, screen_height)) this line commented out fixes it
then do the same for Pytorch
then that will probably tell you if it is the specific combo of PyTorch threading + Pygame causing the issue
Pytorch does a lot of magic that can cause thread safety issues and various UB, so I would probably put my money on the two libs not playing nicely together
but without doing weird thread duct taping i am screwed?
That is gives them more significance?
You tried? torch.set_num_threads(2)
torch.set_num_interop_threads(2)
i tried 1 and 1
can you put a bunch of prints around to see how far the code gets before segfaulting and send that code over as well
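Alongside prints, the standard library's `faulthandler` module can dump the Python traceback of every thread when the process gets a fatal signal such as SIGSEGV. A small sketch; the helper name and the log location are arbitrary choices for illustration, not something from this conversation.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path):
    """Dump the tracebacks of all threads to `path` on a fatal signal."""
    log = open(path, "w")
    faulthandler.enable(file=log, all_threads=True)
    return log  # keep the file open for as long as the handler is enabled


# Call this at the very top of the script, before importing pygame or torch.
crash_log_path = os.path.join(tempfile.gettempdir(), "segfault_trace.log")
crash_log = enable_crash_log(crash_log_path)
```

Running the script as `python -X faulthandler your_script.py` enables the same handler (writing to stderr) without changing any code.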
i can do you better
hold on
```
import pygame
import torch
from torch import nn
screen = pygame.display.set_mode((9, 9))
class BearBrain(nn.Module):
    def __init__(self):
        super().__init__()
brain = BearBrain()
optimizer = torch.optim.Adam([1], lr=1)
```
@spring field 9 lines of code
causes a seg fault
bad news, I can't reproduce a segfault, it just exits with 0
can you tell how far in that code it gets?
fails on the last line
this new code fails with a different error for me btw
what error
TypeError: optimizer can only optimize Tensors, but one of the params is int
yes thats expected
so I assume it segfault before getting that far for you
yes
```
Fatal Python error: pygame_parachute: (pygame parachute) Segmentation Fault
Python runtime state: initialized
Thread 0x00007f32c56006c0 (most recent call first):
  File "/usr/lib64/python3.11/threading.py", line 331 in
...
```
and it doesn't segfault w/o that set_mode line?
TypeError: optimizer can only optimize Tensors, but one of the params is int
correct
threading?
interesting
why it error for me but not you
well, for one, I'm on Windows, you're on Linux
I'll try it on a linux machine though rq
also I'm on Python 3.12
Well, at least we figured out it's probably your env and not your code
as usual
oh i got it to work in a new env
with just pygame-ce and torch
awesome!
on 3.12
What a pain in the ass, lol.
I guess I missed something. You needed 3.11 for this to work?
yes, most of the torch libraries aren't updated
I see, damn
do you think i should submit a bug report. I need this fixed
you're facing segfaults when using something more than just torch and pygame?
yeah do you want my pip freeze
Maybe inside your 3.12 instance, you can install the torch and pygame-ce for that compatible 3.11 version? Im just brainstorming here.
Are you using python virtual env or conda? I cant tell.
venv
mmm, not really no, well, I'm fairly certain it's not an issue (at least not directly) with pygame-ce
am honestly not sure what you can do at this point sigh
Maybe for shits and giggles build it in conda and try it?
but if you do decide to report it as a pygame-ce issue, you can do it here: https://github.com/pygame-community/pygame-ce/issues
I'm just fairly certain that it will be closed because we can't directly really do anything about it, I don't think
you can also ig report it to whatever else you suspect of being at fault, it seems that triton (from openai) figured a lot in that segfault trace, it's still in development it seems, but, perhaps, onnx is using it?
so ig onnx would be another project to report this to?
but even then it's probably gonna take a while to get it fixed, perhaps, try hacking something together with threads, lol
Hey, just curious, what CUDA version are you using?
Maybe different drivers can help too
mmm, you could also try moving development to a docker container
I have no idea what this is related to
Im just curious your version and his. Since you got it to work.
people always laugh at me for not wanting to use libraries and this is a prime example why
well be sure to document this for them then
i submitted a issue to pytorch but like all other issues i face. I am the only one who can ever recreate it
Did you try it with conda?
idek what that is
heh
I've been working on an ML lib in rust for a while
but I haven't touched it in a long time
```
bash Miniconda3-latest-Linux-x86_64.sh
conda activate test
```
```
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
conda install -c conda-forge pygame-ce
pip install torchrl onnxruntime
```
Run the code.
I made it easy for you
Maybe experiment with different cuda versions. It could really impact this
Two more hours till we get to see the results from the 1st epoch. I finally got XLM-RoBERTa large working for the caption creation. It's taking 6 hours just for the eval stage on my 4090.
I went with Roberta mainly for its multi lingual abilities. Now it can read and generate captions in many languages.
The caption creation size has a max length of 77 tokens, which is limited by the clip model.
what you training?
image captioning model. It has a vq-vae manifold with dynamic learning and attention mechanism. The idea is to create multilingual captions for images.
It's a model that understands the semantic relationship between words and images.
I am 15 right now and I want to learn ML. I have been learning Python for a few weeks. I want to know if I should learn DSA or not, and if yes, should I cover all the topics?
if you're a beginner, it's just as well that you practice graph traversal algorithms, since graphs are used to conceptualize things in ML.
Note that by graph, I am not talking about data visualizations. I'm talking about nodes and edges that model entities and their relationships.
Check out arrays and strings, linked lists, stacks, queues, trees and graphs, hash tables, and sorting and searching algos. Agents are a hot topic right now as well. But you might want to check out different graph algos like DFS and BFS too.
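For anyone who hasn't seen BFS and DFS before, here is a small sketch of both on a graph stored as a dict of adjacency lists. The graph and the function names are invented for illustration.

```python
from collections import deque


def bfs_order(graph, start):
    """Visit nodes level by level, nearest to `start` first."""
    seen = {start}
    order = []
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        order.append(node)
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append(neighbour)
    return order


def dfs_order(graph, start, seen=None):
    """Follow one path as deep as possible before backtracking."""
    if seen is None:
        seen = set()
    seen.add(start)
    order = [start]
    for neighbour in graph.get(start, []):
        if neighbour not in seen:
            order.extend(dfs_order(graph, neighbour, seen))
    return order


# Nodes are entities, edges are relationships between them.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
```

On this graph BFS visits `a, b, c, d`, while DFS goes `a, b, d` before coming back for `c`.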
Ok
except if you're going to do ML, it's important to remember that lists and arrays are different, and to never use those words interchangeably
what do you mean by agent?
What do you mean?
you said that agents are a hot topic right now. I'm interested to know what examples you have in mind.
can someone help me im getting this error
Hello, be sure to always post the code and the error message as text. Can you post the whole error message, starting from Traceback, and the code that caused this?
!code
basically an ai that has a specific role. the ai impersonates
ya
I asked you not to post text as screenshots.
oh my bad
this is the error
  File "/Users/cyrusvakil/Visual Studios/Python/Experimental/.venv/lib/python3.12/site-packages/autogen/agentchat/conversable_agent.py", line 159, in __init__
    self._validate_llm_config(llm_config)
  File "/Users/cyrusvakil/Visual Studios/Python/Experimental/.venv/lib/python3.12/site-packages/autogen/agentchat/conversable_agent.py", line 263, in _validate_llm_config
    self.client = None if self.llm_config is False else OpenAIWrapper(**self.llm_config)
                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cyrusvakil/Visual Studios/Python/Experimental/.venv/lib/python3.12/site-packages/autogen/oai/client.py", line 392, in __init__
    self._register_default_client(extra_kwargs, openai_config)
  File "/Users/cyrusvakil/Visual Studios/Python/Experimental/.venv/lib/python3.12/site-packages/autogen/oai/client.py", line 453, in _register_default_client
    client = OpenAI(**openai_config)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/cyrusvakil/Visual Studios/Python/Experimental/.venv/lib/python3.12/site-packages/openai/_client.py", line 104, in __init__
    raise OpenAIError(
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
(.venv) cyrusvakil@Cyruss-MBP Experimental %
the code is kinda long
import os
llm_config={"model": "gpt-3.5-turbo"}
from autogen import agentchat
from autogen.agentchat import ConversableAgent
onboarding_personal_information_agent = ConversableAgent(
    name="Onboarding Personal Information Agent",
    system_message='''You are a helpful customer onboarding agent,
    you are here to help new customers get started with our product.
    Your job is to gather customer's name and location.
    Do not ask for other information. Return 'TERMINATE'
    when you have gathered all the information.''',
    llm_config=llm_config,
    code_execution_config=False,
    human_input_mode="NEVER",
)
onboarding_topic_preference_agent = ConversableAgent(
    name="Onboarding Topic preference Agent",
    system_message='''You are a helpful customer onboarding agent,
    you are here to help new customers get started with our product.
    Your job is to gather customer's preferences on news topics.
    Do not ask for other information.
    Return 'TERMINATE' when you have gathered all the information.''',
    llm_config=llm_config,
    code_execution_config=False,
    human_input_mode="NEVER",
)
Please read this
@austere perch read this
Did you install openai via pip?
You should be able to solve a couple problems on those programming puzzle sites. You don't need to be able to do the most difficult ones. The fundamental problem solving techniques are the important part. For example, if the problem is too hard, try to solve a simplified version first, then use that to come up with a solution to the harder problem. These methods don't even apply to programming specifically. The other important part is just knowing a bunch of the common data structures and algorithms which you can then use to get a quick estimation for how fast or slow something will probably be. One of the main differences between a good ML algorithm and a bad one is just how fast it is on modern hardware / what DSA can it make use of.
i dont think so
sorry ill do this now
wait so i quote the line that has the error
put the whole code in the paste bin
just did that
same way you install the other packages install the openai package so you can import your API key
# Set the API key directly in your code
openai.api_key = 'your-api-key-here'
you might need to setup something else I forget, but check out openai importing api keys python in google
You also need to get an API key via the OpenAI website. I found it easier to get to from the playground.
i got a key
it wont let me cuz its too long
roger, try doing that for now then
how many lines is it?
98
Were you trying to paste it into this chat, or into the paste bin?
the pastebin is a website that is separate from Discord.
type !paste
!paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
@austere perch please read this whole message #data-science-and-ml message
it tells you how to get to the paste bin. I posted it a while ago.
but I guess you can skip that for now
this should be the solution.
this would probably also work
import os
os.environ['OPENAI_API_KEY'] = 'your_api_key'
@austere perch yes, but you've posted your API key. please go to the OpenAI website and change it as soon as possible.
Double ya. Change it for a new one, you're compromised.
okay i changed it
delete old one
i deleted it
good man
and made a new one
is this how to show the error as well
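One way to avoid pasting a key into code (or chat) again: keep it in the `OPENAI_API_KEY` environment variable, which is the name the error message above asks for, and fail early with a readable message if it is missing. A rough sketch; `require_api_key` is a made-up helper, not part of openai or autogen.

```python
import os


def require_api_key(env_var="OPENAI_API_KEY"):
    """Return the key from the environment, or fail with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell before running "
            "the script instead of hard-coding the key in the source."
        )
    return key


# Call require_api_key() once at startup, before building any agents.
# The config itself can then stay exactly as in the original code:
llm_config = {"model": "gpt-3.5-turbo"}
```

On macOS or Linux the variable can be set for the current shell session with `export OPENAI_API_KEY=...` before running the script.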
yes
be sure to always post code and error messages as text (not screenshots) because they're easier to read and can be copied/pasted.
either in the chat or in the paste bin--whichever is fastest
ohhh okay
well, I guess it's not about speed, but that you have to use the paste bin for text that's too long for discord.
yeah
Awesome!
yeah i realized
Try pip install shuttleai, it's free
what do i use
shuttleai.app
okay
Need any help dm me, it's free, has OpenAI models and more
And lib is identical to OpenAI but faster more optimized
sure
@lapis sequoia what is your experience with shuttleai?
Results are in for epoch. Not sure what to make of it .
looks like shuttleai is another generative AI platform that has a free tier. But it's not going to be identical to what you're getting from OpenAI.
So far it seems like the best result ive gotten so far, you can make out different hues of color
The alignment of the vector quantization loss and validation loss is interesting
The size of the torch model exploded though from the last runs, 5 epochs was like 2.5 gigs
Actually it was around 10, but it's a lot bigger than it's ever been, I'm pretty sure because it includes the newly created captions
I dunno..
just me messing around with basic neural networks
i guess a single neuron would have an array of inputs (neurons from last layer inputting an integer after ReLU so just 0-int limit), a bias and a weight as properties?
maybe a function to compute output etc
but thats mostly it..?
also i definitely need an explanation on how backpropagation works
```
qt.qpa.xcb: could not connect to display :8
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/opt/conda/lib/python3.10/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: xcb.
```
what's this now
go for 3blue1brown, he had explained like god level!
In practice, neurons are just the values that we pass through the network, your first layer of neurons is your input layer, you multiply that input layer with your weights and add your bias (and sometimes add an activation function) to calculate your next layer of neurons and so on as it propagates forward through the network.
What is your understanding of backprop right now, do you know what gradient descent is?
gradient descent is basically finding the minimum of a function.
idk much ab backprop, i only know it is related to the cost, and adjusts neuron weights and biases based on their importance (weights as the "meaning" of each input for the output)
i am pretty interested in how it actually works in code, like how does it know which value to change?
Right so in gradient descent we calculate the gradient, which tells us which direction to move each variable in order to lower the cost (output of the function)
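A toy version of that loop with a single variable, in case it helps to see it run. The cost function `(w - 3)^2` and the learning rate are arbitrary picks for illustration; the point is only that stepping against the gradient lowers the cost.

```python
def cost(w):
    """A bowl-shaped cost whose minimum sits at w = 3."""
    return (w - 3.0) ** 2


def cost_gradient(w):
    """Derivative of the cost: positive when w is too big, negative when too small."""
    return 2.0 * (w - 3.0)


def gradient_descent(w, learning_rate=0.1, steps=100):
    for _ in range(steps):
        # Move a small step in the direction that lowers the cost.
        w -= learning_rate * cost_gradient(w)
    return w


w_min = gradient_descent(w=-5.0)
```

Starting from `w = -5`, the value ends up very close to 3, which is where this cost is smallest.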
i watched like 5 videos from his introduction, i have a hard time remembering and understanding stuff if i can not see their base. his explanation is some sort of higher level and it all happens "magically"
Have you taken any calculus classes or have any knowledge of that?
how would an output look like in a gradient descent function
not really lol
well basic knowledge
nothing too crazy
a gradient descent basically returns the slope id need to go to minimize a value
iirc
Pretty much, so imagine your model as a function
the minimum value represents the minimum cost i suppose?
Say you have 3 layers, 2 sets of weights/biases. You can write it as second_layer(first_layer(input))
(Input is technically your first layer here)
yeah i figured
You can see that the function for second layer relies on the output of the first
i understood the structure
interested ab all the background processes and how they work
So we calculate the partial derivative of the weight and bias used in the first layer w.r.t cost, which tells us which direction to move
what does wrt mean again
With respect to
and partial derivative is just for one variable iirc
all others being treated as constants ig?
Yes and no, each variable has its own partial derivative but we can calculate them all at once for a given matrix (weight or bias) using the chain rule
how does the rule go?
i forgot lol
If h(x) = f(g(x)) then h'(x) = f'(g(x)) * g'(x)
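A quick numeric sanity check of that rule, comparing the chain-rule derivative with a finite-difference estimate. The particular `f` and `g` are arbitrary examples.

```python
import math


def g(x):
    return 3.0 * x + 1.0


def g_prime(x):
    return 3.0


def f(u):
    return math.sin(u)


def f_prime(u):
    return math.cos(u)


def h(x):
    return f(g(x))


def h_prime(x):
    # Chain rule: h'(x) = f'(g(x)) * g'(x)
    return f_prime(g(x)) * g_prime(x)


def numeric_derivative(fn, x, eps=1e-6):
    """Central finite difference, used only as an independent check."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)


analytic = h_prime(0.5)
estimate = numeric_derivative(h, 0.5)
```

The two numbers agree to several decimal places, which is the same check people use to debug hand-written backprop.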
weights are just numbers from 0-1 after all, wouldnt making a derivative of them just make them equal to 0..? sorry ab these questions the math is not mathing in my brain rn
oh that lol
So back to my example, we know what we need to adjust the first layers weights and biases but we need to calculate the second layers weights and biases, so we will use a partial derivative from the last layer and the chain rule to calculate the partial derivative of the next weight and bias
They can be any number not just 0 to 1, and derivatives have to do with functions and their variables, we don't treat anything here as a constant (which would have a 0 derivative)
This process of calculating all this (your gradient) is called back propagation
so all weights and biases are variables, and bec we have w*a + B i guess we consider WA as the inner function so when f(x) = b
g(x) = wa we can actually solve it
so for example say i got an answer, how do i forward that information to modify my current weights and biases
We would treat it as a multivariate function which is why we are using partials
using my estimation?
like how would the function look like
so make a derivative for each one of them, so the derivative for the entire function would be WX + 1
?
We aren't concerned with the derivative of the whole function, all we care about are the partial derivatives of the variables that we need to update, w and b
oh alright
Hold on I've written all the math out before as an example let me see if I can find one
oh wow that could be helpful, lmk
x = original input
y = labeled output
z1 = x.w1+b1
a1 = activation1(z1)
z2 = a1.w2+b2
a2 = activation2(z2)
c = cost(a2, y)
∂c/∂w2 = ∂z2/∂w2 * ∂a2/∂z2 * ∂c/∂a2
Note: ∂z2/∂w2 = transpose(a1)
Note: ∂a2/∂z2 = derivative_activation(z2)
Note: ∂c/∂a2 = derivative_cost(a2, y)
For the sake of simplicity let d1 = ∂a2/∂z2 * ∂c/∂a2
∂c/∂w1 = ∂z1/∂w1 * ∂a1/∂z1 * ∂z2/∂a1 * d1
Note: ∂z1/∂w1 = transpose(x)
Note: ∂z2/∂a1 = transpose(w2)
Note: ∂a1/∂z1 = derivative_activation(z1)
Breaking all these functions apart into their partial derivatives lets us apply the chain rule to any number of layers, each time we just remove the first partial ∂zi/∂wi and multiply that by the new partials. For example the next d for a 3 layer network would be d2 = ∂a1/∂z1 * ∂z2/∂a1 * d1 and so on
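The same derivation as runnable code, shrunk down to one number per layer so the transposes disappear. This is only a sketch: tanh for both activations, a squared-error cost and the starting values are all assumptions made here, not choices from the notes above. The hand-derived gradients are checked against finite differences, and one update step shows how the result is used to change the weights and biases.

```python
import math


def activation(z):
    return math.tanh(z)


def derivative_activation(z):
    return 1.0 - math.tanh(z) ** 2


def cost(a2, y):
    return 0.5 * (a2 - y) ** 2


def derivative_cost(a2, y):
    return a2 - y


def forward(x, params):
    w1, b1, w2, b2 = params
    z1 = x * w1 + b1
    a1 = activation(z1)
    z2 = a1 * w2 + b2
    a2 = activation(z2)
    return z1, a1, z2, a2


def gradients(x, y, params):
    """Partial derivatives of the cost w.r.t. w1, b1, w2, b2."""
    w1, b1, w2, b2 = params
    z1, a1, z2, a2 = forward(x, params)
    d1 = derivative_activation(z2) * derivative_cost(a2, y)  # ∂a2/∂z2 * ∂c/∂a2
    dw2 = a1 * d1                                # ∂z2/∂w2 = a1
    db2 = d1                                     # ∂z2/∂b2 = 1
    d2 = derivative_activation(z1) * w2 * d1     # ∂a1/∂z1 * ∂z2/∂a1 * d1
    dw1 = x * d2                                 # ∂z1/∂w1 = x
    db1 = d2                                     # ∂z1/∂b1 = 1
    return dw1, db1, dw2, db2


def numeric_gradients(x, y, params, eps=1e-6):
    """Finite-difference estimate, used only to check the math above."""
    estimates = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        diff = cost(forward(x, up)[3], y) - cost(forward(x, down)[3], y)
        estimates.append(diff / (2 * eps))
    return tuple(estimates)


x, y = 0.5, 0.2
params = (0.8, -0.1, -1.2, 0.3)  # w1, b1, w2, b2
analytic = gradients(x, y, params)
numeric = numeric_gradients(x, y, params)

# One gradient descent step: nudge every parameter against its gradient.
learning_rate = 0.1
updated = tuple(p - learning_rate * g for p, g in zip(params, analytic))
cost_before = cost(forward(x, params)[3], y)
cost_after = cost(forward(x, updated)[3], y)
```

With matrices instead of scalars the structure is the same; the plain multiplications become matrix products and the transposes from the notes come back.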
what does that greek character mean again
Partial derivative
i have no knowledge when it comes to notation lmao
oh alr
i suppose labeled is the correct answer
In supervised learning our labels are the correct answers, yes
also i guess z1 is the input neuron and z2 is the current one were handling? since we use z1's activation for z2's value
X is the input layer
Z1 is the output of that after multiplying by weights and adding bias, then a1 would be our next layer after we apply the activation function
Almost, we apply activation to that too so a2 would be the output layer
But if we didn't use activation functions there then that'd be right
If I have a 2x3 matrix and I transpose it I will have a 3x2 matrix
oh i see alr
i suppose X and W1 are all matrixes with values according to each neuron where B is just the next neuron's bias?
same for a1 and a2
Weights and biases exist between layers of neurons, so we wouldn't say that the weight or bias belongs to a layer
lets take the 28x28 number approach.
x would be 784 elements
so would w1
and b1?
How many neurons do you want in the next layer?
The dim of the weights will determine this
1x784 * 784xN + 1xN
Where N is the number of neurons in the next layer
i see
Multiplying 1x784 * 784xN results in a 1xN layer of neurons ( or at least the values of the neurons before activation is applied)
so 784 + 784N + N is equal to the size of the matrix?
1xN + 1xN results in 1xN so that's the size of the matrix representing the next layer of neurons
isnt that 2xN?
No, it is element wise addition
ah i see
https://arxiv.org/abs/1802.01528 this is my holy grail for this topic
ill check it out very soon, thank you so much for ur assistance
i suppose this is activation with the ReLU function?
ReLU(x) = max(0, x)
so thats just activation with relu
@umbral bison look into adversarial attacks on LLMs. I think you can get away with doing it on smaller models like gpt-2
For some adversarial attacks you need access to the gradients
You kind of do, for llama and co
But unless his uni has a cluster it'll be hard to do with truly large models
cv2.error: OpenCV(4.10.0) /io/opencv/modules/highgui/src/window.cpp:1301: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvShowImage'
now what is this thing?
why to install libgtk?
this error only occurs when I am running on docker container!~
Can you install those dependencies in your Dockerfile
I think docker containers don't get to open windows on the host machine at all by default; might need some configuration
yeah I guess
yup I installed libgtk2.0
but dunno about config?
yeah!!
I just said yeah!!
installed! now how to config that thing, I dunno about cmake
just reinstall cv2, building from source if necessary.
I did that actually and reinstalled without headers
--headless
(did you configure the container to allow access to the X server, though? without it, I don't think the container will ever be able to open windows on the host)
@final kiln ?
he configured my container actually
wdym by windows?
that pygame window? opencv window
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
yeah, I mean like the window that imshow opens. I don't think docker containers can normally open those.
yeah that's the issue
oh, vscode's dev container support can allow that? huh, that's pretty cool
and why I am running ubuntu commands on fedora?
is that my docker container running on ubuntu?
this is already installed
we already did this ?
I already have dockerfile
yeah
I created that
FROM python:3.12.3-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python"]
```
pytorch:
  image: pytorch/pytorch
  command: sleep infinity
  working_dir: /app
  volumes:
    - ./:/app
```
change this?
we build for pytorch
how to ?
and it will get that opencv too?
did that already now running
??
where to put this?
docker compose exec bash!
command?
docker compose exec pytorch?
but why we are already running that docker compose up --build
yeah it's running now?
what to do?
I am in cmd now
yeah working
and when running .py file again same error
hi there datasatanists, how does one write a simplified alternative for PoS diffusers?
I need to cmake that file which I downloaded
I gotta say that UNet++ has got to be the most annoying architecture I have encountered so far
just so many connections everywhere
also, w/o context (but it is a rather small dataset), an IoU score of 0.04, does that like make sense? given that it is increasing
this is how I calculated it
iou_numerator = torch.sum(y * y_prim, dim=(-1, -2)) + 1e-8
iou_denominator = torch.sum(y * y_prim + (1 - y) * y_prim + y * (1 - y_prim)) + 1e-8
iou = torch.mean(iou_numerator / iou_denominator)
y and y_prim have the same shape of (batch, channels, height, width) (or maybe width and height are the other way around, but they're the same value anyway)
and also, the test samples seem to quite closely resemble the expected images
(weeee, more context (I was not expecting to provide this much as one might infer based on the beginning of this message, but in the end, I have provided at least some context... anyway))
bruh
I forgot dim=(-1, -2) for the denominator, cool
I knew it was something silly like that cuz 0.04 just seemed like wayyyy too low
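For reference, a sketch of the corrected calculation next to the buggy one, written with NumPy rather than torch so it runs without a GPU stack (`axis=` plays the role of torch's `dim=`). The function names and the tiny masks are invented for the comparison. It uses the fact that `y * y_prim + (1 - y) * y_prim + y * (1 - y_prim)` simplifies to `y + y_prim - y * y_prim`.

```python
import numpy as np


def mean_iou(y, y_prim, eps=1e-8):
    """IoU per (batch, channel) mask, then averaged."""
    reduce_axes = (-1, -2)  # height and width
    intersection = np.sum(y * y_prim, axis=reduce_axes)
    union = np.sum(y + y_prim - y * y_prim, axis=reduce_axes)
    return float(np.mean((intersection + eps) / (union + eps)))


def mean_iou_global_denominator(y, y_prim, eps=1e-8):
    """The buggy variant: the union is summed over the whole batch."""
    intersection = np.sum(y * y_prim, axis=(-1, -2))
    union = np.sum(y + y_prim - y * y_prim)  # no axis: one number for everything
    return float(np.mean((intersection + eps) / (union + eps)))


# Two identical samples, one channel, 4x4 masks.
y = np.zeros((2, 1, 4, 4))
y_prim = np.zeros((2, 1, 4, 4))
y[:, :, :2, :] = 1.0        # rows 0-1 are foreground in the label
y_prim[:, :, 1:3, :] = 1.0  # rows 1-2 are foreground in the prediction

correct = mean_iou(y, y_prim)                    # 4 / 12 per mask
buggy = mean_iou_global_denominator(y, y_prim)   # 4 / 24 per mask
```

With the global denominator the score shrinks roughly in proportion to the batch size, which would explain a value as low as 0.04 even when the predictions look right.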
hmm, seems like nobody knows at the moment
does tokenization come before lemming/stemming?
Your question is a bit broad, because the whole process involves several steps. What is your current level of knowledge on such topics? Why do you want to write a simple alternative? Is it just for practice? Are you trying to learn more about neural nets in general?
Now, for (simple) audio generation specifically you could look into Recurrent Neural Networks. But for a more general approach to generative AI, you can look into Generative Adversarial Networks (I'm not entirely sure how diffusion works, so I can't tell you whether diffusion techniques also employ parts of GANs, but GANs are sort of the more general approach to generating stuff (though, perhaps, they're best suited for images...))
well, i'm not a datasatanist, and i want to enable not datasatanists to write scripts/apps for SD
of course, in an abstract, easy, way
without needing that ai/ml jargon
I found the solution
but where I can find now Opencv folder in docker container?
hey @final kiln
I think we need to make another image of open cv now
or else we can make one image which have both pytorch and opencv
In the dynamic landscape of modern software development, harnessing the potential of containerization has become indispensable for deploying applications efficiently and consistently. Docker, one of…
where did you find this?
and yeah another is that
which opencv should install?
headless or normal?
yeah done downloading now what?
```
Authorization required, but no authorization protocol specified
qt.qpa.xcb: could not connect to display :11
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/opt/conda/lib/python3.10/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: xcb.
```
new error that's nice
Reinstalling the application may fix this problem.
what does this mean now?
reinstalling what
ahh !!
what question?
wait
I am not able to understand this now!
custom environment for pygame game, where my agent will be trained
that's why I need a window
to create a env
then it's not good!!
I need window badly
nah, we need to search
for what?
we are creating with opencv then
should I switch to pygame window now?
there is another error for pygame also, lemme share that too
```
Authorization required, but no authorization protocol specified
error: XDG_RUNTIME_DIR not set in the environment.
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM default
```
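If the window is only there so the environment can step and draw while the agent trains, one possible workaround inside the container is to point SDL at its dummy video and audio drivers, so pygame neither opens a window nor looks for a sound card. A sketch, assuming nothing has to be visible on screen during training; the helper name is invented here.

```python
import os


def use_dummy_sdl_drivers(environ=None):
    """Tell SDL to skip the real display and sound card."""
    env = os.environ if environ is None else environ
    env["SDL_VIDEODRIVER"] = "dummy"  # no X server needed
    env["SDL_AUDIODRIVER"] = "dummy"  # no ALSA device needed
    return env


# This has to run before pygame is imported and initialised, e.g.:
#   use_dummy_sdl_drivers()
#   import pygame
#   pygame.init()
settings = use_dummy_sdl_drivers({})
```

Surfaces can still be drawn to and saved to disk in this mode, so frames can be inspected afterwards. It does not help when a live window really is required; that still needs the container to be given access to the host's X server.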
yeah!
not possible for now!
yeah
vega 8 which takes 2gb ram
if anyone is suddenly interested in my implementation of UNet++, here's the code: https://paste.pythondiscord.com/R27Q
(based on this paper: https://arxiv.org/pdf/1807.10165)
some plot from training, lol
what about this?
so we will download qt externally
I got new error wait
improvement is happening
so I have downloaded that file
libgtk something
now the question is how can I re-run that?
but how can I re-run?
that cmake
Do you guys think the YOLO object detection model will run smoothly enough on a rpi3 for a self centering camera?
I need to install open-cv along with all dependencies and then again build docker container
random question, opencv lets you do image analysis?
like with yo camera and shit
you have to do that!!
yeah
currently im trying to build my own neural network, is that a possibility to do from scratch?
what i would note is that if you are already comfortable with numpy, you might be interested in trying jax out
whats the contribution of numpy towards building a neural network? i suppose just the big ass arrays?
letting you do the math at a reasonable level of abstract
im not looking for an extremely useful network, going with the basic 28x28 hand drawn numbers and ill just plug a front end where you can draw shit on there
yeah ig
cuz otherwise you have to start by writing your own matrix operations like addition and multiplication
oh hell nawh
right. so from numpy up
yup i guess so
if you will do the derivatives yourself by hand, numpy will be enough
by scratch i mean building the classes for neurons and the network itself
if you want the derivatives to be computed automatically, you have to use pytorch, tf, jax, or something of the sort
i dont think doing derivatives would be either interesting or fun
then the starting point is pytorch, jax, or tf. pytorch is probably the most recommended. i like jax because it writes very much like numpy. tf... is kinda hard to recommend at the moment
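For a sense of what "doing the derivatives yourself by hand" looks like, here is a minimal sketch in plain Python: one sigmoid neuron trained on a toy OR dataset. The names `train_neuron`, `predict`, `OR_INPUTS` and `OR_TARGETS` are made up for this illustration, and the learning rate and epoch count are arbitrary choices. A framework like pytorch or jax would compute the gradient lines for you.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of a single sigmoid neuron for one input vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(inputs, targets, lr=1.0, epochs=2000):
    """Batch gradient descent with gradients derived by hand."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            # With cross-entropy loss and a sigmoid output,
            # the derivative w.r.t. the pre-activation is (pred - y).
            error = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi / n
            grad_b += error / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Toy data: logical OR (hypothetical example, not from the chat)
OR_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
OR_TARGETS = [0, 1, 1, 1]

weights, bias = train_neuron(OR_INPUTS, OR_TARGETS)
print([round(predict(weights, bias, x)) for x in OR_INPUTS])
```

Stacking more layers means deriving one such gradient per layer via the chain rule, which is the part the autodiff libraries take off your hands.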
doesnt pytorch make u create a neural network using it though? or can i still utilize some other functions
wdym?
i still don't get what you're asking
atp idk either im tired asf
what im asking
is if all functions are accessible
honestly i shd check out the package first
Guys why do we need keras? Why couldnt we just stick to tensorflow
When I use SGDRegressor it gives a rmse of over 53753742.24651016
It is mostly good for beginners getting started with things
I could say the same with TF tbh
Outside of google, I haven't worked on any AI related projects that use TF over PyTorch for anything that matters
While when I switch to linearregression gives an acceptable 10711.00334810241
I have heard that Keras is like an API for TF
yes
Thats so much
it is a more beginner friendly API wrapper over TF
that makes some operations simpler and lowers the barrier to entry a bit for new people
API in sense , it collects data from the TF framework and presents it to the user?
no
API in the sense it wraps the tensorflow api
API != webserver or what not here
Keras just basically creates some functions and classes for people to use and then internally makes the calls to tensorflow and sets up the configuration
In the same way requests is an API wrapper around urllib3 for example, you could call urllib3 directly, but requests exists to make your life easier
So the Dense() function is like a part of the keras?
Oh makes sense
Is there any case where multicollinearity in regression models can somewhat help its accuracy?
I believe no
it can only make it worse
what does scaling your input refer to?
Alr
Changing the range of a features values to be closer to every other features values
Applying a linear transformation to change the range of values (because some kinds of models are sensitive to that, and won't perform well if one of your features is from 0 to 1 while another is from 0 to a billion). See https://scikit-learn.org/stable/modules/preprocessing.html for an overview.
oh ok
will check it out
ohhh
apparently that is the problem
I have to scale the inputs for gradient descent
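A rough sketch of why the unscaled run blows up, in plain Python so it runs anywhere. The data, the learning rate and the helper names (`standardize`, `fit_line`, `rmse`) are all made up for illustration; with scikit-learn the usual route is to put `StandardScaler` in front of `SGDRegressor`.

```python
import math


def standardize(values):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) * (v - mean) for v in values) / len(values))
    return [(v - mean) / std for v in values]


def fit_line(xs, ys, lr=0.1, epochs=200):
    """Fit y ~ w*x + b by batch gradient descent on the mean squared error."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        w -= lr * 2 / n * sum(r * x for r, x in zip(residuals, xs))
        b -= lr * 2 / n * sum(residuals)
    return w, b


def rmse(xs, ys, w, b):
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical feature with a large range (1000 .. 50000)
xs = [1000.0 * i for i in range(1, 51)]
ys = [0.5 * x + 20.0 for x in xs]

# Same optimizer, same learning rate: only the input scale differs.
raw_rmse = rmse(xs, ys, *fit_line(xs, ys))
scaled_xs = standardize(xs)
scaled_rmse = rmse(scaled_xs, ys, *fit_line(scaled_xs, ys))
print("unscaled:", raw_rmse, "scaled:", scaled_rmse)
```

On the raw feature the gradient step is far too large for the chosen learning rate, so the weights overshoot further on every step and end up as inf/nan. After standardizing, the same settings converge. LinearRegression does not have this problem because it solves the least squares problem directly instead of taking gradient steps.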
Do I need to know the math behind it?
Looks drawn to me.
yeah unfortunately
What color represents what?
Who has an idea about ML?
The blue line is the training loss and orange line is the validation loss
in RL how does an agent train on dynamic environments? Currently i train it on a static environment but if i wanted to randomize the environment in the testing stage wouldnt i need to train it on thousands of random environments? that would multiply the training time by thousands
Why do you think you'd need thousands of environments? What do you imagine happening if you had only 2?
an image classifier cant learn off 2 images, i assumed this was similar
You can recreate something like this with tools like draw.io and lucid.app
Well same idea, I think what you're getting at is having a model that has a proper generalized function for any task you give it in the environment? You would encounter overfitting with 2 images in a classifier and the same would probably happen here. But your model is already overfit on one environment so as you increase the samples, the number of specific functions that it can overfit on decreases and it is forced to generalize more and more.
so i do need a bunch of environments just less than thousands
And that's really only the ideal, since RL environments can be incredibly finicky depending on how you give reward/punishment. If the model finds one environment where it can get 10x score it will steer it towards overfitting even if you have lots of other environments
It's more about having enough environments that the model can't overfit. You could generate random environments even as long as the reward/punishment is bounded the same for all of them
This could be done with 4, 5, 6 environments if your model is simple enough that it can't overfit all those
so 6 predefined environments or a random environment each time for training?
Just depends on your model architecture, policy, and environments
also is it expected that the training global average is less than the max score due to it having a minimum random action chance?
Global average as in the average of all agents over the sim?
like right now i can get the average score to the max score but if i set a minimum random it will sometimes kill itself causing not every iteration to be perfect
Yeah ofc, if even one agent gets a score below the max then it will be skewed, you wouldn't expect that number to match the max score
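The "minimum random action chance" here is the usual epsilon-greedy setup. A minimal sketch, assuming the agent picks from a list of Q-values (the function names and the decay numbers are placeholders, not taken from the code being discussed):

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def decay_epsilon(epsilon, decay=0.995, minimum=0.05):
    """Shrink epsilon each episode but never below the floor."""
    return max(minimum, epsilon * decay)


# Training keeps some exploration forever (epsilon >= minimum),
# so the training average sits a bit below the best achievable score.
epsilon = 1.0
for _ in range(2000):
    epsilon = decay_epsilon(epsilon)
print("training epsilon:", epsilon)

# Evaluation uses epsilon = 0.0, i.e. always the greedy action.
print("greedy action:", choose_action([0.1, 0.9, 0.3], epsilon=0.0))
```

With the floor at 5%, roughly one action in twenty stays random during training, which is enough to occasionally wreck an otherwise perfect episode.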
in my dataframe I have the column smokers with (yes and no) and I want to replace that to 1 and 0 to be able to deal with it how can I do that?
I have a dataset that looks like this, this is a "bad" dataset because of the periodicity occurring in the trough. I want to find a way to detect and quantify the periodicity that's occurring in this dataset
I've looked into Fourier transforms and dug around scipy but haven't had much success, Fourier transforms appear to be for filtering out digital signals and scipy I just haven't quite found what I'm looking for yet
Any recommendations?
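One way to read the Fourier transform here is as a detector rather than a filter: the strongest frequency bin gives the period, and its share of the total power says how periodic the data is. Below is a plain-Python sketch using a naive DFT. The function name `dominant_period` and the test signal are made up, and it assumes evenly spaced samples. For real data `numpy.fft.rfft` does the same job much faster.

```python
import cmath
import math


def dominant_period(samples):
    """Return (period_in_samples, share_of_variance) for the strongest DFT bin."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    total = sum(c * c for c in centered)
    if total == 0:
        raise ValueError("signal is constant, no periodicity to measure")
    best_k, best_power = 1, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    # For real input, bins k and n-k mirror each other, so bin k counts twice
    # (except the Nyquist bin n/2, which has no mirror). Parseval's theorem
    # then gives the fraction of the signal's variance carried by that bin.
    mirror = 1 if 2 * best_k == n else 2
    share = mirror * best_power / (n * total)
    return n / best_k, share


# Hypothetical signal: a period-20 oscillation plus a weaker period-7 one
signal = [math.sin(2 * math.pi * t / 20) + 0.2 * math.sin(2 * math.pi * t / 7)
          for t in range(200)]
period, share = dominant_period(signal)
print(f"period ~ {period:.1f} samples, {share:.0%} of the variance")
```

A share near 1 means the data is dominated by one oscillation; a share near 0 means no single period stands out. If the data has a trend, removing it first (not just the mean) usually helps.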
now I will try for last time, and if that doesn't happened , I will increse that storage of /tmpfs
hi
hello
post your error real quick i think i can fix it
now I am cloning whole opencv through github
what? of my docker?
werent you having an issue of pip saying it doenst have enough storage
@unkempt apex can I help you w something
yeah that error
are you running in visual studio code in linux?
no I didn't get I was totally bored with that
yup fedora 39
uninstall visual studio code and install it from dkp thingy. The flatpak version is bugged
import random
messages = [
    "Get lost, you useless moron.",
    "Your existence is a joke.",
    "Why do I have to deal with idiots like you?",
    "Go play in traffic, you imbecile.",
    "You bring new meaning to the word stupidity."
]
def generate_zapbot_message():
    return random.choice(messages)
print(generate_zapbot_message())
yeah but pip was not able install from venv
no one force you to help me!!
yeah make sense
#ot0-psvm's-eternal-disapproval message
this fixes it
okay okay can I share you exact error msg of that pip?
yeah I am also thinking this
linux legends are online , he has something for me
my pc is getting hot now!!
my all cores are running at 100 percent
because of that cloning process
wait I am muted because of that bot
I was directly pasting code
yeah freak!
Official Linux.Chat Discord community. Come chat about virtualization, security, networking, gaming, programming & more!
first of all why typed this?
are you going with bad day or what?
yeah they are very nice!
I need to install another fan now!!
also how can I delete all this docker containers and images now?
import random
def get_random_message():
    """
    Returns a random message from a predefined list.
    """
    messages = [
        "Get lost, you moron.",
        "Why are you wasting my time?",
        "You really are an idiot, aren't you?",
        "Stop bothering me.",
        "You're absolutely clueless.",
        "You couldn't do this if your life depended on it.",
        "Give up already.",
        "This is pointless and so are you."
    ]
    return random.choice(messages)
def main():
    while True:
        user_input = input("Do you want a random message? (yes/no): ").strip().lower()
        if user_input == 'yes':
            print(get_random_message())
        elif user_input == 'no':
            print("Good. Now go away.")
            break
        else:
            print("I didn't understand that. Try again, you moron.")
if __name__ == "__main__":
    main()
I agree.
hey that gpt docker file works
literralllyy
how can I run a different image on docker? tell me fast
def assess_day():
    mood = "absolutely terrible"
    message = f"He's definitely having a {mood} day. Just like every day he has to deal with oxygen-stealing idiots like you."
    return message
def extend_insult():
    additional = "You were a waste of resources from the moment you were born, and I sincerely pity anyone who has to tolerate your existence."
    return additional
def main():
    day_message = assess_day()
    insult_message = extend_insult()
    final_message = day_message + " " + insult_message
    print(final_message)
if __name__ == "__main__":
    main()```
I ran this with sudo docker compose up
but it is running that pytorch
not pong-game ( current )
I think I should destroy all the containers now , I am getting bored with those
@final kiln the solution was very simple
and I spent 2 days on this, but anyway docker is the thing which I learned
but hey now how can I write something in docker which I will upload on github and others can also use that?
yeah
now need to clean storage
because I have run that build docker multiple times
so only 85 gb remaining
although I have removed all the images
and containers
Total reclaimed space: 21.69GB
wait what is leaking memory into disk?
that dataset is taking 23 gb
yeah the question was what to write on docker file now?
now we are not using docker file for that pytorch
but I still want to make use of dockerfile ( it looks nice on github though)
so what to write on that?
basic stuff for installing packages with docker?
why the hell did you type all this
but the thing is I have already pushed code in which docker was there!!
it will disturb my code flow
anyways need to push again without docker then
shit bruhh!!
self.icon = cv2.imread("paddle.png") / 255.0
~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
TypeError: unsupported operand type(s) for /: 'NoneType' and 'float'
this was the same error but was on docker
so I solved that with absolute path of .png
but now we don't have docker
so which absolute path?
yeah it's working with that
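For context on that TypeError: `cv2.imread` returns `None` instead of raising when it cannot read the file, so a relative path like "paddle.png" breaks as soon as the working directory changes (inside Docker or outside it). A small sketch of building the path from the script's own location instead. `asset_path` is a made-up helper name, and the layout (image next to the script) is an assumption.

```python
from pathlib import Path


def asset_path(name, base_dir):
    """Build an absolute path to an asset and fail loudly if the file is missing."""
    path = Path(base_dir).resolve() / name
    if not path.is_file():
        raise FileNotFoundError(f"asset not found: {path}")
    return path


# Typical use inside the game module, assuming paddle.png sits next to the script:
#   icon_path = asset_path("paddle.png", Path(__file__).parent)
#   self.icon = cv2.imread(str(icon_path)) / 255.0
```

This way the same code works no matter where the program is started from, and a missing file gives a clear error instead of a `NoneType` division.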
This is taking forever now. I'm going to restart the training after this epoch, I just wanna see the results. But it was strange when I woke up: it was going, but really slow. It might be the dynamic learning and optimizations causing the slowdown. I also thought it might be combined with the roberta model's max length of 256, so I matched that with the clip model's 77. I moved the terminal screen and it started to get faster, dunno, it just seems abnormally long. I also figured lowering my batch size from 16 to 5 would be more efficient; just from watching, it seems to process the first 5 pretty quick. Evaluation: 61% | 355/582 [6:30:55<2:52:02, 45.47s/it] Oh I set up some metrics to evaluate the captions too.
Like when it slows down, it seems stuck but then it gets back around to that screen above and it will show like 140s/it
I also think all the context it's creating for the captions is blowing up the size of the model. One run was like 2.2 gigs
Wait that's not right, I think it was more than one.
Evaluation: 63% | 368/582 [6:51:21<11:43:55, 197.36s/it]
Thats what I mean.
can someone help me im making a neural net but idk how to visualize it? like should the weights be the size of the line, or should it be the color? and is the bias line width or color, or should it be the nodes? idk can someone help me please
watch 3b1b video on it
or you want from scratch
then go with sentdex!
sentdex has nailed that!
no i just want to know whether the line width should be weight or bias
ya
they are weights
the connected lines between the nodes
and each node has a bias which gets added to the weighted sum
@small wedge if the model has found a better option why isnt it doing it in testing. In testing I just run the model in pygame to visualize it with the randomness and learning removed.
Score: 73
Score: 73
Score: 73
Score: 73
Score: 73
Score: 73``` this is what its doing in testing
```All Time Average Score: 50.563
Average of last 100 bears: 53.33
Highest Score: 78
Total Bears: 7000
Time between epoch: 44.77771258354187 s``` but it has found a better option of 76 in training
so at the end of the day here's my PoC, likely to be broken as of right now https://github.com/aolko/StableDiffusionEvents/tree/main
Event Sheets/Macros for Stable Diffusion (based on diffusers) - aolko/StableDiffusionEvents
not the highest quality out there, but hey, at least i'm trying to work with d i f f u s e r s
(and the yaml is the intermediate format for the future webui)
Holy mother of mary Evaluation: 90% | 521/582 [8:46:03<37:49, 37.20s/it] just 37 more mins
Poor gpu is tired.
9210 54
9211 73
9212 54
9213 39
9214 70
9215 41
9216 41
9217 39
9218 71
9219 57
9220 37
9221 58
9222 44
9223 72``` i feel like this shouldnt be jumping around so much
iteration vs score
are you using ReLU or sigmoid as your activation function thing
sorry i do not know much i just know relu has a better learning curve
also its probably your rate of learning being inaccurate
yeah that dont look normal
what is this
im training a model that combines image reconstruction and multilingual caption generation. its suppose to create a deep understanding, semantic relationship between them, thats the idea anyways. So i built this VQVAE with manifold learning and combined multiheaded attention mechanisms, dynamic weight adjustments and integrated caption generation for the images .
out of the few words I understood, that sounds rlly cool
hopefully ill be able to do stuff like that eventually
what was the command to put ur code in the browser thing
i have an update issue
the !pip install one?
the one where u paste ur code in the browser
oh !paste
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
You might wanna change your API key again, you exposed it
snatches
Well you never know who lurks in the shadows
try running the openai migrate command
how
try typing it in the terminal
is this the command
pip install --upgrade openai
honestly, not sure, but from the warning yesterday it said to use % instead of !
look like its updating your code base
it seems it had quite a few errors ima just look into it
paste them lets see.
sometimes pip uninstalling packages and reinstalling them can work things out
remember what version works before you go dabbling with that though.
Please remember what we discussed about screenshots
try changing engine to model.
oh sorry, screenshots are only for errors right
Only for information that isn't text (so not code and not error messages)
If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.
Data visualizations obviously have to be screenshots
unless you convert them to ascii art 
!paste
in fact, you can bookmark that page
okay
this includes the error @rich moth
i replaced engine with model
it means they dont use that model anymore, pick an updated one. I believe.
gpt-3.5-turbo or something
ohh
Reinforcement learning is so hard
I'm really interested in learning what you're working on a bit. RL is something I'm also very interested in but haven't really dabbled in.
I mean I saw the code and looked over it, the reward system is interesting.
I just never know what the issue is
It's just randomly changing variable values until it works
Dude sometimes I feel like that's part of this ML, AI stuff. Sometimes when you get a solid foundation of something, it's more experimenting like an alchemist.
And every run takes 7 hours
ya that too
Hey what was the issue with your environment last night? You fixed it I imagine?
I fixed it by giving up
REAL
lol
I went back to windows and if I ever get to the point where I need a gpu I'll learn aws
My model was working perfectly
Then I upped the complexity just a tiny lil bit and now it doesn't work
And I've over compensated on the model size so it must just be a fundamental flaw
it do be like that
I can help you experiment with it later if you want. Im waiting for this damn training to finish though but im curious cause I'd like to learn as well.
not sure I've linked this, but here's a good book on RL http://incompleteideas.net/book/RLbook2020.pdf
I wish I had only 7 hours lol. Evaluation: 98% | 571/582 [11:11:58<4:04:14, 1332.21s/it]
how does yours have a set finish time?
i thought in ML you train it until its good enough not after a certain time period
what does it mean to add a key to an .env?
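In short, a `.env` file is just a text file of `KEY=VALUE` lines that stays out of git, and the program copies those lines into environment variables at startup. The sketch below shows the idea with a made-up helper, `load_env_file`; it is not the python-dotenv implementation, whose `load_dotenv()` is the maintained version of the same thing.

```python
import os


def load_env_file(path, environ=os.environ):
    """Copy KEY=VALUE lines from a .env-style file into `environ` (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Variables that are already set win, so real environment settings are kept
            environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))
    return environ


# Usage sketch, assuming a file named .env containing: OPENAI_API_KEY="My API Key"
#   load_env_file(".env")
#   api_key = os.environ["OPENAI_API_KEY"]
```

The point is that the key lives in a file listed in `.gitignore` rather than in the source code, so it never ends up in a screenshot, a paste or a repository.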
"While you can provide an api_key keyword argument, we recommend using python-dotenv to add OPENAI_API_KEY="My API Key" to your .env file so that your API Key is not stored in source control."
Thats my understanding also, this is only the 2nd epoch though
but how does it have a time limit
that's for a single epoch
well, you're doing RL
that is rather questionable
why does his not have variable epoch time
why would it? you're always processing the same number of inputs
dang 11 hours for one epoch that thing is gonna take all year
I made some changes hopefully reducing the batch size and caption creation size. It was 256 tokens now its 100.
u considered upping your hardware?
even in the case of RL though, you're pretty much always processing the same batch size, what could be variable is the length of a single episode, the agent could take variable number of steps every episode before the episode terminating
Im on a 11700k, 128gigs, 4090. I need something more commercial lol
ill check it out thanks, hopefully some of these changes i can avoid that though
goddamn
I had to add more metrics for the caption generation though
can anyone interpret this?
nope, I have literally no idea what I'm looking at
can you make it a line plot
ill need to retrain but yeah
and high fluctuations can be expected in RL, did you try using fixed Q targets
good one
the heck is that
a line plot
It tells me theres 4, 5 types of players. Some being better than others lol
thats a crazy line plot
why are there 160k whatever x is? (please label your plots)
episodes
that is a lot of episodes, have you considered making the whole plot window wider?
doesnt help much
IIRC
matplotlib.rcParams["figure.figsize"] = (width, height) # in inches, in your case you can try something like (100, 10) maybe, lmao
the graph isnt going to show anything different
I mean, what I'm seeing is great fluctuations which as I said, can be expected in RL
now, did you try using fixed Q targets?
well, you have, you haven't used fixed Q targets
is this statement wrong
ehhhh
ehhhhhhh
honestly, no clue, I mean, that is the expectation... but there are a ton of factors
somehow the test value is always constant though
I feel like you need a bigger range or less bears to measure it accurately.
the results are so condensed how do you measure the results?
wym measure
guess im confused.
im logging some metrics as well as testing the model in pygame to visualize if the model works or not
No I guess what I mean is, like, extending the range and maybe lessening the time? I just look at the chart and it's hard to tell what's going on in the middle.
this one shows it better
I mean obviously they're in there! but how much?
if it's of any consolation, some research papers have trained RL models for hundreds of millions of episodes
yeah but those are big complicated environments
and mine has stopped learning
not necessarily big but reasonably complicated, perhaps, yeah
the scores plummeted
Oh I see and understand now
are you plotting loss (TD error but inverse pretty much) and number of steps per episode as well?
number of steps directly affects score
I think, in RL having a complex model might not just mean it'll take longer to converge, it might mean it'll never converge because the agent simply doesn't get to explore the things that would move the network closer to convergence
mine has reached the max score though
at least that's my intuition
but only once, why cant it do it more than once
Couldn't you entice it via the reward system?
that's what RL already does
yeah, you could adjust rewards, ofc, that's another hyperparameter you can tune
right in here it hit the maximum score but never again
Can we see your working code again?
!paste
so, the way it's learning, right, it's randomly sampling past actions from memory, that will inevitably include a bunch of not so exciting ones, where there isn't much happening, so overall it'll not converge to the maximum
this is why something like Prioritized Experience Replay is another optimization, which simply makes it sample past experiences with greater priority assigned to those that have a greater deviation from the expected score
also why Fixed Q Targets are used, you have an anchor point that you explore around and then move the anchor to the best found new location instead of carrying the anchor constantly around to the first best position you find pretty much, this helps to reduce the fluctuations (I think it's also called variance in statistical terms, not sure)
then also stuff like deterministic and non-deterministic policies plays a role, if you have a deterministic policy, you can pretty much stop once you've reached the max score and that's it, it'll always take the same actions, but it will reach the max score every time (unless there is some randomness in the environment itself)
those are just my sort of intuitions on this topic and from stuff I've read
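A stripped-down sketch of both ideas, in plain Python so it does not depend on any particular RL library. `PrioritizedReplay` and `sync_target` are illustrative names, the parameters are placeholders, and this leaves out parts of the published prioritized replay method (the priority exponent and importance-sampling weights), so treat it as the intuition only.

```python
import copy
import random


class PrioritizedReplay:
    """Replay memory that samples surprising transitions more often."""

    def __init__(self, capacity, eps=1e-3, seed=None):
        self.capacity = capacity
        self.eps = eps  # keeps zero-error transitions sampleable
        self.items = []
        self.priorities = []
        self.rng = random.Random(seed)

    def add(self, transition, td_error):
        if len(self.items) >= self.capacity:
            # Drop the oldest transition when the buffer is full
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, batch_size):
        # Sampling probability is proportional to |TD error| + eps
        return self.rng.choices(self.items, weights=self.priorities, k=batch_size)


def sync_target(step, online_params, target_params, every=1000):
    """Fixed Q targets: refresh the frozen target copy only every `every` steps."""
    if step % every == 0:
        target_params.clear()
        target_params.update(copy.deepcopy(online_params))
    return target_params
```

In a DQN-style loop the TD target would be computed from `target_params` while only `online_params` gets gradient updates, so the thing being chased stays still between syncs. With pytorch modules the sync step is typically `target.load_state_dict(online.state_dict())`.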
hmmmm i will try to program those optimizations, but do you know why its failing to converge currently?
Fixed Q Targets, Prioritized Experience Replay
Have you tried getting any advice from AI about it or are you opposed to such ideas?
Sometimes they point out the obvious, we often overlook. Or get different insights and angles.
chat gpt speaks nonsense for anything as complicated as this
might as well be asking the dude on the street corner
Well check out these suggestions ```import torch
from torch import nn

# NOTE: berries, green, red and Square are defined elsewhere in the original program

def get_score(action, bear):
    reward = 1  # Default reward for taking an action
    match action:
        case 0 | 1 | 4 | 5:  # Left, right, up, down
            bear.hp -= 2  # Reduced penalty for movement
            # Calculate proximity reward based on distance to the nearest green berry
            distances = [bear.distance(berry) for berry in berries if berry.color == green]
            if distances:  # min() on an empty list would raise ValueError
                proximity_reward = 10 / (1 + min(distances))  # Reward decreases with distance
                reward += proximity_reward
        case 2:  # Wait
            reward = -1  # Small penalty for waiting
        case 3:  # Eat
            for berry in berries:
                if bear.borders(berry) and berry.color == green:
                    berry.color = red
                    reward = 20 * (100 - bear.hp) / 100  # Reward scales with hunger
                    if bear.hp < 100:
                        bear.hp = min(bear.hp + 50, 100)
                    return reward  # Exit early if a berry was eaten
                elif bear.distance(berry) < 3:  # Intermediate reward based on proximity
                    reward += 5
        case 6:  # Explore (new action)
            reward = 2  # Small reward for exploring new areas
        case 7:  # Kill self (adjusted action)
            bear.hp -= 20  # Reduced penalty for "kill self" action
            reward = -10
    return reward  # Return the calculated reward

# Update the BearBrain class to include the new actions
class BearBrain(nn.Module):
    def __init__(self, input_shape, actions):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(3, 128),
            nn.ReLU(),
            nn.Linear(128, len(actions))
        )
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.device = "cpu"  # Forces CPU, overriding the check above
        print("device:", self.device)
        self.to(self.device)

    def forward(self, x):
        return self.network(x)

# Update the Bear class to include the new actions
class Bear(Square):
    hp = 100
    actions = [0, 1, 2, 3, 4, 5, 6, 7]  # Updated actions
    states = []
    learning_rate = 0.0025
    discount_applied_to_future = 0.9
    random_behavior_chance = 1
    input_shape = (1, 3)
    batch_size = 128
    # Rest of the Bear class code remains the same

# Update the main loop to pass the bear object to the get_score function
while True:
    # ...
    if bear.hp > 0:
        total_action_count += 1
        action = bear.choose_action(state)
        score_per_action = get_score(action, bear)  # Pass the bear object
        total_score += score_per_action
        next_state = get_state(bear, berries)
        bear.remember(state, action, score_per_action, next_state, bear.hp <= 0)
        bear.learn()
        state = next_state
    # ...```
what changed
i see it decreased the hp loss from movement which i dont like because it just inflates the score
not really no, as I have mentioned a couple times, tons of factors and RL is not easy, lol
blindly tuning parameters is not for me
I felt it was too long to paste in here.
how did u even paste that one its massive
What do you mean?
usually python bot stops messages that big
Ah, maybe I was just a token few short. lol
well ima read that book matiss sent earlier
probably a good idea.
see if allows me to figure out why this doesnt work
@rich moth GL on your model, lmk how it turns out
How does one keep the val accuracy consistent with the training accuracy for a Tokenizer RNN LSTM thing. I don't know, this is hard. Should have started with cnn. Is a validation split better? Or is using the test data after NLP stuff better, or does that defeat the purpose of a testing set by using it to evaluate the training data?
wdym consistent with the training accuracy? validation accuracy is seldom as high as that of training
and wdym started with a CNN, that's a completely different network altogether
Im a bit confused. can you try to rephrase it again.
Man lowering the batch size from 16 to 8 was about a 20% increase in performance.
I'm having a hell of a time getting the Roberta model to generate captions. The left side padding error is driving me bonkers.
I dunno, I feel like its processing the captions I just cant correctly get it to show
!paste
here is the section related to it if anyone has any suggestions . https://paste.pythondiscord.com/352A
going to bed
is that code AI-generated?
It is also basically impossible to gain any useful info out of that code because we know none of the types
Why is the Roberta model supposed to be generating text? It is not particularly designed to do that...
Also idk what is going on with your tokenizer and attention mask but if this is a huggingface transformers type then that is not how you create the input IDs and attention mask
any RL people?
now the issues are solved mostly but some bugs in env!!
so if you have done using gym environment it will be easy for you to identify!
(small note, if you use dropout it's common that for certain settings your val loss is lower than your train loss)
Guys I have a question
In perceptron why do we need a perceptron update rule when we can just analyze the perceptron loss and adjust weights accordingly?
That's a really nice question
Any answers?
I guess a simpler way to say what I was asking is whether or not the training set could ever be run through the nn with the test set acting as the val set, but that would defeat the purpose of a testing set.
perceptron as in step activation function?
how do i start learning ML guys i have no idea where to start from any recommendations like videos and sites etc?
see pinned messages in this channel
ok ty
Can someone suggest some courses for NLP?
For more information about Stanford's Artificial Intelligence professional and graduate programs, visit: https://stanford.io/ai
Thank you so much
is there any logical error here? please check someone
because the current environment code works fine but it lags/freeze in train.py
Yes
I made a program so that I can download all of the transcripts from a YouTube channel. What should I do with that? 
Did you use the YouTube API?
yep
Guys
In huber loss, if the point isn't an outlier we use a formula similar to that of the MSE right?
But in my sources , it shows 1/2 of the MSE
Why is that
scaling of the loss doesn't have any effect
the MSE is often written with a 1/2 in front simply because differentiating gets rid of the 1/2 factor
same in the huber loss
nicer to write, but otherwise it has no effect
But if we divide the loss by half , isnt it a problem?
no, it literally makes no difference
how?
the value of the loss is never important. what matters is the value that achieves the minimum (i.e. the minimizer)
for the one dimensional case, you can easily convince yourself that x^2/2 and x^2 have the same minimizer: x = 0
the same is true of adding a scalar btw
Oh so we aren't concerned with the number, we just want that term to be as small as possible?
from the convex optimization perspective, you only need to find a point where the gradient equals 0. well, scalar multiplication factors out of differentiation. that is, if we have f(x) and its derivative f'(x), it turns out that the derivative of 2f(x) is 2f'(x). if you equate f'(x) = 0 and 2f'(x) = 0, you can just divide the latter by 2 and you get exactly the same thing again: f'(x) = 0
unless you have an explicit interpretation for the loss, the number usually doesn't matter at all
you more care about the minimizer and about the inference results you obtain, not the value the loss takes
Oh makes sense . Thanks just one more thing , why do we make the small errors to be penalized quadratically and not linearly?
that's a difficult question to answer. the choice of cost function always depends on the properties of your problem
sometimes quadratic terms make sense, other times they don't. there's no general rule
the common motivation behind MSE is that, under AWGN (additive white Gaussian noise), least squares minimization is the maximum likelihood estimator
under different statistical criteria, the absolute value is preferred
you could take the huber loss as a robust MSE minimizer that, for small errors, behaves like the usual least squares, and for large errors grows only linearly, so it's much less sensitive to outliers
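For reference, here is a minimal sketch of the standard Huber loss on a single residual, with threshold `delta` (the function name and the default `delta=1.0` are just choices for this example). It shows the quadratic branch for small errors and the linear branch for large ones.

```python
def huber(residual, delta=1.0):
    """Standard Huber loss for one residual (prediction minus target)."""
    a = abs(residual)
    if a <= delta:
        # Small error: same as the 1/2-scaled squared error.
        return 0.5 * residual ** 2
    # Large error: linear growth, continuous with the quadratic part at a == delta.
    return delta * (a - 0.5 * delta)

for r in (0.5, 1.0, 3.0):
    print(r, huber(r), 0.5 * r ** 2)  # Huber vs. plain 1/2 * squared error
```

For a residual of 3.0 the squared-error term would be 4.5, while the Huber value is only 2.5, which is why a single outlier pulls the fit less.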
Ohhh so it depends on the problem statement but overall we would want to use huber loss
So after the loss is calculated, we use gradient descent and then adjust the weights right
yeah
technically you never need to evaluate the loss, but this is often done as a sanity check to verify that it's decreasing
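To illustrate that last point, here is a toy gradient descent that only ever computes the gradient and never the loss itself. Everything here is an assumption for the sketch: the data follow y = 2x, the model is a single weight `w`, and the loss being minimized is 1/2 of the MSE.

```python
# Toy data generated from y = 2x (made up for this example).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def grad(w):
    # Derivative of 0.5 * mean((w*x - y)^2) with respect to w.
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0      # initial weight
lr = 0.1     # learning rate
for step in range(200):
    w -= lr * grad(w)   # the update only needs the gradient

print(w)
```

Printing the loss every few steps would still be a useful sanity check, it just isn't required by the update rule.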
Oh that clears up my whole doubt. Thanks mate!!
People say
Tensorflow is easy to deploy
Pytorch is not
Pytorch's syntax is better
Tensorflow has prebuilt libraries in it
Pytorch is used more in research field
What should i do?
I planned to start with tensorflow, but now I'm confused
What if i go with pytorch first, and then if needed i can shift to tensorflow
Thanks for pointing out how I was using the input ID attention mask, I got it fixed. Regarding RoBERTa, I was experimenting with different models. I switched to BERT base uncased for masked LM. It's working now. Feature shape: torch.Size([512]), Input IDs: tensor([[ 101, 101, 1996, 3746, 3065, 103, 102, 102]], device='cuda:0') Generated Caption: the image shows
it's not much, but it's a start.
the image does indeed show
has anyone made a custom environment with pettingzoo?
how hard is it?
is the hardest part the gui?
and what have you done so far?
yeah created environment, now training part
but yeah for beginners like me it's quite hard because new naming conventions are being introduced to me
no I mean you can do that with opencv, but how are you representing that as you play?
btw which game are you making?
You should look at others' code first, wait a sec, I'll send that link
read this
did you have to encode state space and action and stuff into the env?
or did you make a gui that represented pong
wdym by encode?
how does the training send data to ur custom env?
yeah so in simple terms I created a whole Pong-like game in the env
it's a long part, first read that blog
then you have to read about Q-Learning
i have the training done
??
which game?
DMs?
ratio: 0.1665920527192292
Epoch [1/5], Train Loss: 0.9839, Val Loss: 0.5658, Train PSNR: 14.2182, Val PSNR: 17.2875, Train SSIM: 0.1736, Val SSIM: 0.2983, BLEU Score: 0.0010, CIDEr Score: 0.0008, ROUGE Scores: {'rouge1': 0.0430025915208296, 'rouge2': 8.739937283566147e-05, 'rougeL': 0.04283252114215165}
what do you guys think? I changed my batch size from 16 to 8, but without adjusting the learning rate, which wasn't smart, so I think I'll go back to 16 for now or try a lower learning rate for 8.
why are you training on different types of images?
I mean the previous one was on another set of filtered images
now it seems like another one
I'm shuffling the data before the split now, which added a bit more randomness.
are you having issues with the amount of time the model is taking to train?
The dataset is a bit confusing: all 31k rows are saved under test instead of train, but the strings for train, test and eval all exist. I had to split up the data manually.
setting the batch size as low as 8 could be a problem even if you adjust the learning rate, since it might not be able to get a good enough gradient estimate to learn properly
yeah makes sense!
after experimenting, your words ring true
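A manual shuffle-then-split like the one described could look roughly like this. The 80/10/10 fractions, the seed, and the name `shuffled_split` are assumptions for the sketch, and the integer range just stands in for the 31k rows.

```python
import random

def shuffled_split(rows, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle once with a fixed seed, then cut into train/val/test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # seeded so the split is reproducible
    n_train = int(len(rows) * train_frac)
    n_val = int(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]       # whatever is left over
    return train, val, test

# Stand-in for the 31k rows that all arrived under "test".
train, val, test = shuffled_split(range(31000))
print(len(train), len(val), len(test))
```

Fixing the seed keeps the same rows in each split between runs, so metrics from different experiments stay comparable.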
if you are struggling with the time it takes for your model to converge this is always a fun paper https://arxiv.org/abs/1711.00489
You guys are seeing the results of batch size 8, I'll go back to 16 and test this out again.
I should experiment with different learning rates at 16 too. I'm using 1e-4, maybe increasing that instead might be a good idea.
but what is this green filtered image called, or have you given them a name?
The visual reconstructions? Why is it green?
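One common rule of thumb when changing the batch size is to scale the learning rate by the same factor (often called linear scaling). It's only a starting point for a search, not a guarantee, and `scale_lr` is a made-up helper name for this sketch.

```python
def scale_lr(base_lr, base_batch_size, new_batch_size):
    """Linear-scaling heuristic: keep lr / batch_size roughly constant."""
    return base_lr * new_batch_size / base_batch_size

base_lr = 1e-4   # learning rate used with batch size 16 in the message above

# Candidate starting points for other batch sizes; still worth tuning around them.
for batch_size in (8, 16, 32):
    print(batch_size, scale_lr(base_lr, 16, batch_size))
```

Trying a couple of values on either side of the scaled rate is usually worth it, since the heuristic doesn't hold for every model or optimizer.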
yeah
Updating DQN
current loss -> tensor(3.7179, grad_fn=<SmoothL1LossBackward0>)
Episode 91 : Total Reward = -10
Updating DQN
current loss -> tensor(5.3552, grad_fn=<SmoothL1LossBackward0>)
Episode 92 : Total Reward = -10
Updating DQN
current loss -> tensor(4.8030, grad_fn=<SmoothL1LossBackward0>)
Episode 93 : Total Reward = -10
Updating DQN
current loss -> tensor(3.1958, grad_fn=<SmoothL1LossBackward0>)
Episode 94 : Total Reward = -10
Updating DQN
current loss -> tensor(2.1072, grad_fn=<SmoothL1LossBackward0>)
Episode 95 : Total Reward = -10
Updating DQN
current loss -> tensor(2.2879, grad_fn=<SmoothL1LossBackward0>)
Episode 96 : Total Reward = -10
Updating DQN
current loss -> tensor(1.9174, grad_fn=<SmoothL1LossBackward0>)
Episode 97 : Total Reward = -10
Updating DQN
current loss -> tensor(2.5300, grad_fn=<SmoothL1LossBackward0>)
Episode 98 : Total Reward = -10
Updating DQN
current loss -> tensor(4.0256, grad_fn=<SmoothL1LossBackward0>)
Episode 99 : Total Reward = -10
Updating DQN
current loss -> tensor(4.2301, grad_fn=<SmoothL1LossBackward0>)
Episode 100 : Total Reward = -10
what are these loss values trying to tell me? I mean I know the lower the loss, the better the model performs
but hey how can I optimize my model now?
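Per-update losses like the ones in that log are noisy, so one option is to look at a running average and compare it with the episode rewards. A minimal sketch, using the ten loss values and rewards from episodes 91 to 100 above; `moving_average` and the window size are choices made for this example.

```python
def moving_average(values, window=5):
    """Average of the last `window` values at each position (shorter at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Values copied from the log above (episodes 91-100).
losses = [3.7179, 5.3552, 4.8030, 3.1958, 2.1072,
          2.2879, 1.9174, 2.5300, 4.0256, 4.2301]
rewards = [-10] * 10

smoothed_loss = moving_average(losses)
smoothed_reward = moving_average(rewards)

for ep, (l, r) in enumerate(zip(smoothed_loss, smoothed_reward), start=91):
    print(f"episode {ep}: avg loss {l:.3f}, avg reward {r:.1f}")
```

In this window the total reward stays at -10 no matter what the loss does, so the reward curve is probably the more informative thing to plot.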
ohh, send the docs
then I have to run all 100 episodes again?
I have .pth file created after training
and yeah after lot of bug solving my RL model is ready with custom env!!
what is this image?
can the reward function change every timestep?
NICE!!
and what are you trying to achieve with this
very advanced level project, I mean GOD level!
what can I do with the .pth?
I guess it contains parameters
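If the .pth was written with `torch.save(model.state_dict(), ...)` (an assumption, it could also hold a whole pickled model), it can be loaded back into the same architecture and used without retraining. A rough sketch, where `TinyQNet`, its layer sizes, and the file name are hypothetical stand-ins for the real network:

```python
import os
import tempfile

import torch
from torch import nn

class TinyQNet(nn.Module):
    """Hypothetical stand-in for the trained DQN architecture."""
    def __init__(self, n_obs=4, n_actions=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_obs, 16), nn.ReLU(), nn.Linear(16, n_actions)
        )

    def forward(self, x):
        return self.net(x)

# Pretend this is the network right after training, saved to a .pth file.
trained = TinyQNet()
path = os.path.join(tempfile.mkdtemp(), "dqn.pth")
torch.save(trained.state_dict(), path)

# Later: rebuild the same architecture and load the saved parameters.
restored = TinyQNet()
restored.load_state_dict(torch.load(path, map_location="cpu"))
restored.eval()  # inference mode

with torch.no_grad():
    obs = torch.zeros(1, 4)                       # dummy observation
    action = restored(obs).argmax(dim=1).item()   # greedy action from Q-values
print(action)
```

The class definition has to match the one used during training, otherwise `load_state_dict` will complain about missing or unexpected keys.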
is this like a hypothetical question or are you asking about someone's project?
will it be weird to ask HR to keep joining date as 2nd week of some month, because you want to resign after taking the annual bonus, so you can resign only in second week
also, is the annual bonus given in 12th salary? if cycle does not follow financial year
or is it 13th?
my project
just fill your work calendar with random fake timeslots until then
"I'm sorry I'm very busy"
this might be better to ask in #career-advice but I think it depends on the company, idk the standards
lol i thought this was career discussions
what kinda model are you using?
huh? i'm trying to find a MARL algorithm that can use a changing reward function
what to plot?
loss values?
oh interesting, haven't heard of that
is there any way to bypass that
to bypass what?