Tuesday, November 20, 2018

How to do Dropout in TensorFlow with MNIST - Convolutional Neural Network for MNIST

https://wpovell.github.io/posts/cnn-mnist.html

Convolutional Neural Network for MNIST
Jun 23 2017
☀️ This post is a part of my Summer of Tensorflow series ☀️
In this post I’m going to walk through implementing a Convolutional Neural Network in TensorFlow to classify MNIST digits.
Code adapted from siddk's tensorflow workshop

The Math

$$
\begin{aligned}
X   &= \text{Image Matrix} \\
C_1 &= \text{ReLU}(\text{conv2d}(X, W_{c1}) + B_{c1}) \\
P_1 &= \text{maxPool}(C_1) \\
C_2 &= \text{ReLU}(\text{conv2d}(P_1, W_{c2}) + B_{c2}) \\
P_2 &= \text{maxPool}(C_2) \\
F   &= \text{Flattened form of } P_2 \\
H   &= \text{ReLU}(F \times W_{f1} + b_{f1}) \\
O   &= \text{Softmax}(H \times W_{f2} + b_{f2}) \\
L   &= \text{Loss}(O, Y)
\end{aligned}
$$

Setup

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf 

# Fetch MNIST Dataset using the supplied Tensorflow Utility Function
mnist = input_data.read_data_sets("data/MNIST_data/", one_hot=True)

# Setup the Model Parameters
INPUT_SIZE, OUTPUT_SIZE = 784, 10  
FILTER_SIZE, FILTER_ONE_DEPTH, FILTER_TWO_DEPTH = 5, 32, 64
FLAT_SIZE, HIDDEN_SIZE = 7 * 7 * 64, 1024

Convolution and Pooling

The two main transformations in a CNN that set it apart from a plain feed-forward network (FFNN) are convolution and pooling.

Convolution

Convolution, from which the CNN gets its name, is the process of applying filters to the image. In our case each filter is a 5x5 matrix that is positioned over every cell of the image, multiplying pairwise and summing. Our model uses 32 of these filters in the first convolution and 64 in the second. The filter values are trained by TensorFlow to minimize the loss.
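As a quick shape check (a minimal standalone sketch using the same tf.nn.conv2d call as the helpers below, with dummy zero tensors), a 5x5 filter applied with SAME padding and stride 1 keeps the 28x28 spatial size and only changes the depth to the number of filters:

import tensorflow as tf

# Dummy batch of one 28x28 grayscale image and a bank of 32 5x5 filters
fake_image = tf.zeros([1, 28, 28, 1])
filters = tf.zeros([5, 5, 1, 32])

# SAME padding + stride 1 preserves height/width; depth becomes 32
conv_out = tf.nn.conv2d(fake_image, filters, strides=[1, 1, 1, 1], padding='SAME')
print(conv_out.shape)  # (1, 28, 28, 32)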

Pooling

Pooling shrinks the feature maps output by the convolution, discarding less important information. In max pooling, each 2x2 area is condensed to its largest value, cutting each side length in half (28 to 14, and 14 to 7). Average pooling is also used, where the average of each pool is written into the new cell.
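And the matching shape check for max pooling (again a minimal standalone sketch with a dummy tensor), where a 2x2 window with stride 2 halves each spatial dimension:

import tensorflow as tf

# Dummy stack of 32 feature maps at 28x28
feature_maps = tf.zeros([1, 28, 28, 32])

# 2x2 window, stride 2: each side length is cut in half
pooled = tf.nn.max_pool(feature_maps, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
print(pooled.shape)  # (1, 14, 14, 32)
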
# Create Convolution/Pooling Helper Functions 
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

### Start Building the Computation Graph ###

# Initializer - initialize our variables from standard normal with stddev 0.1
initializer = tf.random_normal_initializer(stddev=0.1)

# Setup Placeholders => None argument in shape lets us pass in arbitrary sized batches
X = tf.placeholder(tf.float32, shape=[None, INPUT_SIZE])  
Y = tf.placeholder(tf.float32, shape=[None, OUTPUT_SIZE])
keep_prob = tf.placeholder(tf.float32) 

# Reshape input so it resembles an image (height x width x depth)
X_image = tf.reshape(X, [-1, 28, 28, 1])

# Conv Filter 1 Variables
Wconv_1 = tf.get_variable("WConv_1", shape=[FILTER_SIZE, FILTER_SIZE,
                                            1, FILTER_ONE_DEPTH], initializer=initializer)
bconv_1 = tf.get_variable("bConv_1", shape=[FILTER_ONE_DEPTH], initializer=initializer)

# First Convolutional + Pooling Transformation
h_conv1 = tf.nn.relu(conv2d(X_image, Wconv_1) + bconv_1)
h_pool1 = max_pool_2x2(h_conv1)

# Conv Filter 2 Variables
Wconv_2 = tf.get_variable("WConv_2", shape=[FILTER_SIZE, FILTER_SIZE,
                                            FILTER_ONE_DEPTH, FILTER_TWO_DEPTH],
                          initializer=initializer)
bconv_2 = tf.get_variable("bConv_2", shape=[FILTER_TWO_DEPTH], initializer=initializer)

# Second Convolutional + Pooling Transformation
h_conv2 = tf.nn.relu(conv2d(h_pool1, Wconv_2) + bconv_2)
h_pool2 = max_pool_2x2(h_conv2)

Feed-Forward & Dropout

Our model finishes by flattening the last pooling layer into a 3136-element vector (7 × 7 × 64) and running it through our FFNN model from the last notebook. However, there is a small but important addition.
Dropout keeps each node with probability keep_prob and disables the rest (i.e. each node is dropped with probability 1 - keep_prob), removing it and its connections from the graph for that training step, so it isn't updated during that particular iteration. This limits how much the model can rely on any individual node, preventing it from overfitting the training dataset.
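To see what tf.nn.dropout does to the activations themselves, here is a small standalone sketch (not part of the model graph): surviving entries are scaled up by 1/keep_prob so the expected sum is unchanged, and the dropped entries become zero.

import tensorflow as tf

x = tf.ones([1, 10])
# With keep_prob=0.5, each entry is zeroed with probability 0.5;
# survivors are scaled by 1/0.5 = 2.0 to keep the expected sum the same
dropped = tf.nn.dropout(x, keep_prob=0.5)

with tf.Session() as sess:
    print(sess.run(dropped))  # e.g. [[2. 0. 2. 2. 0. 0. 2. 0. 2. 2.]]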

# Flatten Convolved Image, into vector for remaining feed-forward transformations
h_pool2_flat = tf.reshape(h_pool2, [-1, FLAT_SIZE])

# Hidden Layer Variables
W_1 = tf.get_variable("W_1", shape=[FLAT_SIZE, HIDDEN_SIZE], initializer=initializer)
b_1 = tf.get_variable("b_1", shape=[HIDDEN_SIZE], initializer=initializer)

# Hidden Layer Transformation
hidden = tf.nn.relu(tf.matmul(h_pool2_flat, W_1) + b_1)

# DROPOUT - For regularization
hidden_drop = tf.nn.dropout(hidden, keep_prob)

# Output Layer Variables
W_2 = tf.get_variable("W_2", shape=[HIDDEN_SIZE, OUTPUT_SIZE], initializer=initializer)
b_2 = tf.get_variable("b_2", shape=[OUTPUT_SIZE], initializer=initializer)

# Output Layer Transformation
output = tf.matmul(hidden_drop, W_2) + b_2

# Compute Loss
loss = tf.losses.softmax_cross_entropy(Y, output)

Training & Results

# Compute Accuracy
correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(output, 1))
accuracy = 100 * tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Setup Optimizer
train_op = tf.train.AdamOptimizer().minimize(loss)

### Launch the Session, to Communicate with Computation Graph ###
BATCH_SIZE, NUM_TRAINING_STEPS = 100, 1000
with tf.Session() as sess:
    # Initialize all variables in the graph
    sess.run(tf.global_variables_initializer())

    # Training Loop
    for i in range(NUM_TRAINING_STEPS):
        batch_x, batch_y = mnist.train.next_batch(BATCH_SIZE)
        curr_acc, _ = sess.run([accuracy, train_op], feed_dict={X: batch_x,
                                                                Y: batch_y,
                                                                keep_prob: 0.5})
        if i % 100 == 0:
            print('Step {} Current Training Accuracy: {:.3f}'.format(i, curr_acc))
    
    # Evaluate on Test Data
    # keep_prob = 1.0 to disable dropout
    print('Test Accuracy: {:.3f}'.format(sess.run(accuracy, feed_dict={X: mnist.test.images, 
                                                                Y: mnist.test.labels,
                                                                keep_prob: 1.0}))) 
Step 0 Current Training Accuracy: 9.000
Step 100 Current Training Accuracy: 91.000
Step 200 Current Training Accuracy: 96.000
Step 300 Current Training Accuracy: 99.000
Step 400 Current Training Accuracy: 96.000
Step 500 Current Training Accuracy: 95.000
Step 600 Current Training Accuracy: 98.000
Step 700 Current Training Accuracy: 97.000
Step 800 Current Training Accuracy: 99.000
Step 900 Current Training Accuracy: 99.000
Test Accuracy: 98.730
