Neural Network and Deep Learning Code Explanation | Learn AI Together | Page 1

fiery parrot Mar 11, 2023, 1:38 AM

#

Hey everyone, new here.

I wanted to get started with AI and really learn the fundamentals of it, rather than just watching a video and coding something up. So I decided to go with the http://neuralnetworksanddeeplearning.com/ website, which many of you have probably already seen. Once I reached the part of the site where he explains the backpropagation code, I got a bit confused.

My first question was:

In the update_mini_batch block of code, when he updates the weights and biases for the network here:

self.weights = [w-(eta/len(mini_batch))*nw 
                        for w, nw in zip(self.weights, nabla_w)]
self.biases = [b-(eta/len(mini_batch))*nb 
                       for b, nb in zip(self.biases, nabla_b)]

Has he given the formula for how he got that in the explanations before it? If he has, which formula is it? I don't quite understand where he is getting those 2 lines from.

And also what are these 2 lines for:

nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]

Thank you

inner vine Mar 11, 2023, 8:26 AM

#

Yes @fiery parrot, he provided the explanation earlier.

The first set of lines is what actually performs the gradient update.

The second set of lines implements the mini-batch part of mini-batch gradient descent. Here nabla_b and nabla_w act as buffers that accumulate the gradients of every example in the mini-batch. Once all of them have been accumulated, the first set of lines uses the summed gradients to carry out the real update.

Make sense?
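The two steps described above (accumulate, then update) can be sketched as one standalone function. This is a simplified illustration rather than the book's exact code: `update_mini_batch` here takes the weights, biases, and a `backprop` callable as arguments instead of being a method, and `fake_backprop` is a hypothetical stand-in that returns made-up gradients so the arithmetic is easy to follow.

```python
import numpy as np

def update_mini_batch(weights, biases, mini_batch, eta, backprop):
    """Apply one gradient-descent step using gradients summed over a mini-batch."""
    # Buffers start at zero, one array per layer, matching each layer's shape
    nabla_b = [np.zeros(b.shape) for b in biases]
    nabla_w = [np.zeros(w.shape) for w in weights]
    for x, y in mini_batch:
        # Gradient of the cost for this single example
        delta_nabla_b, delta_nabla_w = backprop(x, y)
        # Add it to the running sum for the mini-batch
        nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    # One step with the averaged gradient: (eta / m) times the summed gradient
    step = eta / len(mini_batch)
    new_weights = [w - step * nw for w, nw in zip(weights, nabla_w)]
    new_biases = [b - step * nb for b, nb in zip(biases, nabla_b)]
    return new_weights, new_biases

def fake_backprop(x, y):
    """Hypothetical stand-in: pretends every gradient entry equals x."""
    return [x * np.ones((2, 1))], [x * np.ones((2, 3))]

weights = [np.zeros((2, 3))]
biases = [np.zeros((2, 1))]
mini_batch = [(1.0, None), (2.0, None), (3.0, None), (6.0, None)]
new_w, new_b = update_mini_batch(weights, biases, mini_batch, 0.5, fake_backprop)
print(new_w[0])  # every entry is 0 - (0.5 / 4) * (1 + 2 + 3 + 6) = -1.5
```

The sum is divided by the mini-batch size only once, at the end, which is why the buffers hold raw sums rather than averages.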

fiery parrot Mar 14, 2023, 2:10 AM

#

inner vine Yes @fiery parrot, he provided the explanation earlier. The first set o...

Ah ok, that makes sense. Do you know if he ever mentions a mathematical formula for the gradient update that looks the same as, or similar to, the code? I tried looking but couldn't find anything similar.

#

Is it these?

#

If it is, can you possibly explain how they went from the math equation above to the code?

inner vine Mar 14, 2023, 4:42 PM

#

fiery parrot Is it these?

Yes! These are the mathematical notations for these:

self.weights = [w-(eta/len(mini_batch))*nw for w, nw in zip(self.weights, nabla_w)]
self.biases = [b-(eta/len(mini_batch))*nb for b, nb in zip(self.biases, nabla_b)]

inner vine Mar 14, 2023, 4:49 PM

#

fiery parrot If it is, can you possible explain how they went from the math equation above to...

η/m is analogous to eta/len(mini_batch). nw is the gradient for one layer's weight matrix, summed over the mini-batch, while nabla_w is a list that holds one such array per layer. To apply the gradients to every layer, we iterate over nabla_w alongside self.weights, which is why you see a list comprehension in the code.
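The image with the equations is not preserved in this log. Assuming it showed the book's mini-batch update rules, they read as follows, where η is the learning rate, m is the number of examples in the mini-batch, and C_{X_j} is the cost on the j-th example:

```latex
w_k \rightarrow w_k' = w_k - \frac{\eta}{m} \sum_{j} \frac{\partial C_{X_j}}{\partial w_k},
\qquad
b_l \rightarrow b_l' = b_l - \frac{\eta}{m} \sum_{j} \frac{\partial C_{X_j}}{\partial b_l}
```

Read against the code, eta is η, len(mini_batch) is m, and nw and nb are the sums over j, already accumulated in the loop over the mini-batch.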

fiery parrot Mar 15, 2023, 2:18 AM

#

inner vine `n/m` is analogous to `eta/len(mini_batch)`. `nw` is the grad of a specific wei...

That makes so much more sense thank you!!

fiery parrot Mar 15, 2023, 2:19 AM

#

inner vine Yes @fiery parrot, he provided the explanation earlier. The first set o...

Another question, for the second set of lines:

nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]

What is the reasoning for summing the current gradient with the gradient given from the backprop algorithm?

inner vine Mar 16, 2023, 10:19 AM

#

These lines of code accumulate the gradients over the examples (x, y) in the mini-batch. As we calculate the gradients (delta_nabla_b and delta_nabla_w) for each (x, y) tuple in the loop, we add them to the buffers (nabla_b and nabla_w).

The buffers are updated via a comprehension because each gradient is an array with the same shape as the weight matrix or bias vector it was computed for, and these shapes differ from layer to layer. So we iterate over a zipped collection that pairs each buffer with its matching gradient.
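To make the shape argument concrete, here is a small sketch. The layer sizes [784, 30, 10] are an assumption chosen for illustration, and the per-example gradients are filled with ones rather than computed by backprop.

```python
import numpy as np

sizes = [784, 30, 10]  # assumed layer sizes, for illustration only
rng = np.random.default_rng(0)

# One weight matrix per pair of adjacent layers
weights = [rng.standard_normal((y, x)) for x, y in zip(sizes[:-1], sizes[1:])]
print([w.shape for w in weights])  # [(30, 784), (10, 30)]

# The buffers mirror those shapes, so they also have to live in a list
nabla_w = [np.zeros(w.shape) for w in weights]

for _ in range(2):  # pretend the mini-batch has two examples
    # Placeholder per-example gradients (all ones), same shapes as the weights
    delta_nabla_w = [np.ones(w.shape) for w in weights]
    # zip pairs each buffer with the gradient of the same layer
    nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]

print([nw.shape for nw in nabla_w])  # [(30, 784), (10, 30)]
```

Because a (30, 784) matrix and a (10, 30) matrix cannot be stacked into one regular array, a plain `nabla_w + delta_nabla_w` is not available, and the layer-by-layer comprehension does the job instead.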

fiery parrot Mar 21, 2023, 2:12 AM

#

Thanks a lot for the help. I think I understand it now!!

fiery parrot Mar 25, 2023, 9:57 PM

#

I also had another question, but this time about the code. As part of this learning project, I wanted to put the network on my personal website so people could test the AI by drawing digits themselves. I got as far as drawing the image in my Next.js application and sending a base64 string of the image to my Python functions using Flask.

This is where I am having an issue. I am not sure how to format the image so that I can send it to my neural network. I currently have:

import base64
import io

import numpy as np
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)
# `net` (the trained network) and `pixelate` (a helper) are defined elsewhere in my code

@app.route('/predict', methods=['POST'])
def predict():
    print("Predicting...")
    data = request.get_json()

    image_data = base64.b64decode(data["image"])

    # RGBA so the image can be used as its own transparency mask below
    img = Image.open(io.BytesIO(image_data)).convert("RGBA")

    # Set the background of the image to white, then convert to grayscale
    new_img = Image.new("RGB", img.size, "WHITE")
    new_img.paste(img, (0, 0), img)
    new_img = new_img.convert('L')

    pixelated = pixelate(new_img, 0.2)
    pixelated = pixelated.resize((28, 28))

    # Flatten to a (784, 1) column and scale to [0, 1]
    img_array = np.array(pixelated).reshape((784, 1))
    img_array = img_array / 255.0
    # Invert so the digit is bright on a dark background
    img_array = abs(img_array - 1)
    print(img_array.shape)

    # Make prediction
    prediction = net.feedforward(img_array)
    digit = int(np.argmax(prediction))
    print(digit)

    # Return prediction
    return jsonify({'prediction': digit})

And while this works in the sense that it sends the correct array/array shape to the network and gets a result, the result is wrong most of the time.

@inner vine

#

And I know the net is trained fairly well because if I run the network against the validation data from the MNIST dataset, it gets most of them correct.

#

For instance the net thinks:

#

Is a 5

#

I have a feeling it is because of the preprocessing I am doing on the image, but I am not sure how to fix it.

#

It can get some numbers right, like 4, 5, 7, and 9, but on others it completely fails.

fiery parrot Mar 25, 2023, 10:49 PM

#

The array of shape (784,1) seems to be good from looking at it. So I honestly can't figure out what could be wrong
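One possible cause worth checking, though nothing in this thread confirms it: the MNIST digits were size-normalized and then centered in the 28x28 frame by their center of mass, so a drawing that sits off-center can look unlike the training data even when the array shape is right. Below is a minimal NumPy sketch of the centering step only, assuming a 2-D array with a bright digit on a dark background. The helper name `center_by_mass` is hypothetical.

```python
import numpy as np

def center_by_mass(img):
    """Shift a 2-D image so its intensity center of mass sits at the image center."""
    total = img.sum()
    if total == 0:
        return img.copy()  # blank image: nothing to center
    h, w = img.shape
    rows, cols = np.indices(img.shape)
    cy = (rows * img).sum() / total  # center of mass, row coordinate
    cx = (cols * img).sum() / total  # center of mass, column coordinate
    dy = int(round((h - 1) / 2 - cy))  # rows to shift down (negative = up)
    dx = int(round((w - 1) / 2 - cx))  # columns to shift right (negative = left)
    shifted = np.zeros_like(img)
    # Copy the overlapping region; pixels pushed past the edge are dropped
    src_r, dst_r = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_c, dst_c = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    shifted[dst_r, dst_c] = img[src_r, src_c]
    return shifted

# A small square drawn in the top-left corner of an empty 28x28 canvas
canvas = np.zeros((28, 28))
canvas[2:6, 3:7] = 1.0
centered = center_by_mass(canvas)
print(np.argwhere(centered > 0).min(axis=0))  # top-left corner of the moved square
```

In a pipeline like the one above, this would run on the inverted 28x28 array before the reshape to (784, 1).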

fiery parrot Mar 26, 2023, 10:54 PM

#

From looking online for solutions, it may be that the neural net is overfitting to a clean dataset such as MNIST. I am not sure how to fix this so that it works on my handwritten digits.

https://youtu.be/hfMk-kjRv4c?t=2708

This person made the dataset noisier, but I am not sure how I can do that in Python, as it seems a bit complicated for my experience level.

YouTube

Sebastian Lague

How to Create a Neural Network (and Train it to Identify Doodles)

Exploring how neural networks learn by programming one from scratch in C#, and then attempting to teach it to recognize various doodles and images.

Source code: https://github.com/SebLague/Neural-Network-Experiments
Demo: https://sebastian.itch.io/neural-network-experiment

inner vine Mar 29, 2023, 4:44 PM

#

fiery parrot From looking online for solutions, it maybe that the neural net is overfitting t...

Just add Gaussian noise. Something like img + std * torch.randn_like(img).

fiery parrot Mar 31, 2023, 2:56 AM

#

inner vine Just add Gaussian noise. Something like `img + std * torch.randn_like(img)`.

I did try that (in a different way). The results got somewhat better, but they were still not consistent. The way I added noise was:

tr_d, va_d, te_d = load_data()

noise = np.random.normal(loc=0, scale=0.1, size=(784, 1))

# Variable to store the list of 784-dimensional numpy.ndarrays / Done by reshaping the original 28x28 matrix
training_inputs = [np.reshape(x, (784, 1)) for x in tr_d[0]]

# Add noise to each image in the training_inputs list
for i in range(len(training_inputs)):
    # Add the noise to the image
    noisy_image = training_inputs[i] + noise

    # Make sure the pixel values are between 0 and 1
    noisy_image = np.clip(noisy_image, 0, 1)

    # Update the training_inputs list with the noisy image
    training_inputs[i] = noisy_image

inner vine Mar 31, 2023, 5:08 PM

#

fiery parrot I did try to (in a different way) the results got sort of better but it was stil...

There is an issue here. The noise you add is constant. This makes it a constant feature the network can account for, since the same array is added to every image. I would generate the noise inside the loop, so each image gets a fresh sample.
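A minimal sketch of that change, assuming the inputs are NumPy arrays of shape (784, 1) with values in [0, 1]. The function name `add_noise` and the `seed` parameter are hypothetical. The scale of 0.1 is taken from the code above.

```python
import numpy as np

def add_noise(images, scale=0.1, seed=None):
    """Return copies of the images, each with its own Gaussian noise sample."""
    rng = np.random.default_rng(seed)
    noisy_images = []
    for image in images:
        # Draw a fresh noise array for every image, inside the loop
        noise = rng.normal(loc=0.0, scale=scale, size=image.shape)
        # Keep pixel values in the valid [0, 1] range
        noisy_images.append(np.clip(image + noise, 0.0, 1.0))
    return noisy_images

# Two identical mid-gray "images" end up different after adding noise
images = [np.full((784, 1), 0.5), np.full((784, 1), 0.5)]
noisy = add_noise(images, scale=0.1, seed=0)
print(np.allclose(noisy[0], noisy[1]))  # False
```

Because no two images share the same noise pattern, the network has no fixed offset to learn, which is the point of the suggestion above.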

fiery parrot Apr 1, 2023, 1:23 AM

#

inner vine There is an issue here. The noise you add is constant. This makes it a constant f...

woaahhh, you were right, it is wayyy better now. Still gets some stuff wrong, but I think if I use better activation functions and account for overfitting it will get even better
