[Deep Learning for Everyone (모두의 딥러닝), Chapter 10]


JackYoon 2020. 1. 22. 18:11

[10-1 mnist_softmax]

Classifying the MNIST data with the softmax() function.

# Written against the TensorFlow 1.x graph API.
import random

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Load the MNIST data into the working environment.

X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])
# Each MNIST image is fed in whole: a 28 x 28 image flattened into 784 values.

W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.random_normal([10]))
# The weights take the 784 inputs and map them to 10 output classes, hence the shape [784, 10].

learning_rate = 0.001
batch_size = 100
num_epochs = 50
num_iterations = int(mnist.train.num_examples / batch_size)
# Set the hyperparameters.

hypothesis = tf.matmul(X, W) + b
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=hypothesis, labels=tf.stop_gradient(Y)))
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Define the model, the softmax cross-entropy loss, and the Adam optimizer.

correct_prediction = tf.equal(tf.argmax(hypothesis, axis=1), tf.argmax(Y, axis=1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Build the ops that measure accuracy.
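To make the loss and accuracy ops a bit more concrete, here is a minimal plain-Python sketch of what they compute for a single example. It does not use TensorFlow; the function names and the 3-class input are made up for illustration (MNIST would have 10 entries per example).

```python
import math


def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot):
    """Cross-entropy between softmax(logits) and a one-hot label."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical 3-class example.
logits = [2.0, 1.0, 0.1]
label = [1.0, 0.0, 0.0]

loss = cross_entropy(logits, label)
is_correct = argmax(logits) == argmax(label)
print(loss, is_correct)
```

The cost in the post is the mean of this per-example loss over a batch, and the accuracy is the mean of the per-example "is the argmax right" check.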

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for epoch in range(num_epochs):
        avg_cost = 0
        for iteration in range(num_iterations):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            _, cost_val = sess.run([train, cost], feed_dict={X: batch_xs, Y: batch_ys})
            avg_cost += cost_val / num_iterations
        print(f"Epoch : {(epoch + 1):04d}, Cost : {avg_cost:.9f}")
    # Train the model.

    r = random.randint(0, mnist.test.num_examples - 1)
    print("Label : ", sess.run(tf.argmax(mnist.test.labels[r : r + 1], axis=1)))
    print("Prediction : ", sess.run(tf.argmax(hypothesis, axis=1), feed_dict={X: mnist.test.images[r : r + 1]}))
    plt.imshow(
        mnist.test.images[r : r + 1].reshape(28, 28),
        cmap="Greys",
        interpolation="nearest",
    )
    plt.show()
    # Pick a random test image and use plt to check whether the prediction is correct.

[10-2 mnist_nn]

This time the model is made a little deeper.

W1 = tf.Variable(tf.random_normal([784, 256]))
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)

W2 = tf.Variable(tf.random_normal([256, 256]))
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)

W3 = tf.Variable(tf.random_normal([256, 10]))
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3
# Add two hidden layers to turn the model into a neural network.

The results improve considerably. The rest of the code is the same as above; only the layers of the network are stacked deeper.
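As a rough illustration of what stacking layers means, the sketch below runs one input through a hidden ReLU layer and an output layer in plain Python. The tiny sizes (3 inputs, 2 hidden units, 2 outputs) and all the weight values are made up for the example; the MNIST model uses 784, 256, 256 and 10.

```python
def relu(values):
    """Element-wise max(0, x), as tf.nn.relu does."""
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """Compute x @ weights + bias for one input vector.

    x: list of n inputs; weights: n rows of m columns; bias: m values.
    """
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


# Hypothetical tiny network: 3 inputs -> 2 hidden units -> 2 outputs.
x = [1.0, 2.0, 0.5]
W1 = [[0.2, -0.4], [0.1, 0.3], [-0.5, 0.6]]
b1 = [-0.3, 0.1]
W2 = [[1.0, -1.0], [0.5, 0.25]]
b2 = [0.0, 0.0]

hidden = relu(dense(x, W1, b1))  # the first hidden unit goes negative and is clipped to 0
logits = dense(hidden, W2, b2)   # the output layer has no activation, like hypothesis
print(hidden, logits)
```

Each extra hidden layer is just one more `relu(dense(...))` step between the input and the final logits.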

[10-3 mnist_nn_xavier]

This time the initial values of W are set explicitly.

How the weights are initialized has a large effect on model accuracy.

tf.get_variable()

W1 = tf.get_variable("W1", shape=[784, 256],
                     initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)

W2 = tf.get_variable("W2", shape=[256, 256],
                     initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)

W3 = tf.get_variable("W3", shape=[256, 10],
                     initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3
# Use tf.get_variable() with the Xavier initializer to give W well-scaled initial values.

Accuracy improves further.
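For intuition about why the initializer matters, here is a small plain-Python sketch of the uniform Xavier (Glorot) rule, which draws weights from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). This is a sketch of the general rule, not TensorFlow's implementation; the function name and the seed are illustrative.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Sample a fan_in x fan_out weight matrix with the uniform Xavier rule."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
W1 = xavier_uniform(784, 256, rng)
limit = math.sqrt(6.0 / (784 + 256))
print(f"every weight lies within +/- {limit:.4f}")
```

By contrast, tf.random_normal draws values with standard deviation 1.0 by default, so a sum over 784 inputs starts out far larger in magnitude than with the small, size-dependent range above.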

[10-4 mnist_nn_deep]

This time more depth is added to the network.

W1 = tf.get_variable("W1", shape=[784, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)

W2 = tf.get_variable("W2", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)

W3 = tf.get_variable("W3", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([512]))
L3 = tf.nn.relu(tf.matmul(L2, W3) + b3)

W4 = tf.get_variable("W4", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([512]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)

W5 = tf.get_variable("W5", shape=[512, 10],
                     initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
# Accuracy rose only slightly; making the model much deeper does not by itself raise accuracy.

If anything, it can simply make the computation take longer.
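One way to see the extra cost is to count the trainable parameters of each model in this chapter. The helper below is a small sketch (its name is made up) that adds up weights and biases for a fully connected stack, given the layer sizes used in the code above.

```python
def count_parameters(layer_sizes):
    """Total weights plus biases for a fully connected stack of the given sizes."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


softmax_only = count_parameters([784, 10])                        # 10-1: 7,850
two_hidden = count_parameters([784, 256, 256, 10])                # 10-2 / 10-3: 269,322
four_hidden = count_parameters([784, 512, 512, 512, 512, 10])     # 10-4: 1,195,018

print(softmax_only, two_hidden, four_hidden)
```

The deep model has roughly four times as many parameters as the two-hidden-layer one, which is consistent with it taking longer to train for only a small gain in accuracy.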

[10-5 mnist_nn_dropout]

Dropout

Dropout is a technique devised to prevent overfitting.

During training, a random subset of nodes is deactivated at each step; when measuring accuracy, all nodes are active.
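The train-versus-evaluate behaviour can be sketched in a few lines of plain Python. This is an illustration, not TensorFlow's code: each value is kept with probability keep_prob and the kept values are scaled by 1 / keep_prob so the expected output matches the input, which is also how tf.nn.dropout scales its output. The function name and the seed are illustrative.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and rescale the survivors."""
    if keep_prob == 1.0:
        return list(values)  # evaluation: every node stays active
    return [
        v / keep_prob if rng.random() < keep_prob else 0.0
        for v in values
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10000

train_out = dropout(activations, 0.7, rng)  # about 30% of the entries become 0
eval_out = dropout(activations, 1.0, rng)   # unchanged

print(sum(train_out) / len(train_out))  # close to 1.0 on average
```

Because of the rescaling, nothing needs to be adjusted at evaluation time other than setting keep_prob to 1.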

keep_prob = tf.placeholder(tf.float32)
# keep_prob is a placeholder so the dropout level can be fed at run time.
# It is the probability of keeping a node, normally a value between 0.0 and 1.0.

W1 = tf.get_variable("W1", shape=[784, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)

W2 = tf.get_variable("W2", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
L2 = tf.nn.dropout(L2, keep_prob=keep_prob)

W3 = tf.get_variable("W3", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([512]))
L3 = tf.nn.relu(tf.matmul(L2, W3) + b3)
L3 = tf.nn.dropout(L3, keep_prob=keep_prob)

W4 = tf.get_variable("W4", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([512]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
L4 = tf.nn.dropout(L4, keep_prob=keep_prob)

W5 = tf.get_variable("W5", shape=[512, 10],
                     initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
# Apply dropout to the output of each hidden layer before connecting it to the next.

r = random.randint(0, mnist.test.num_examples - 1)
print("Label: ", sess.run(tf.argmax(mnist.test.labels[r:r + 1], 1)))
print("Prediction: ", sess.run(
    tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1], keep_prob: 1}))
# When computing the final result, feed keep_prob = 1 so that every node is active.

With dropout applied, accuracy improves further.

[10-7 mnist_nn_higher_level_API]

This time the same MNIST model is built with a higher-level API.

learning_rate = 0.001
training_epochs = 15
batch_size = 100
keep_prob = 0.7
hidden_output_size = 512
final_output_size = 10
# Set the hyperparameters.

from tensorflow.contrib.framework import arg_scope
from tensorflow.contrib.layers import batch_norm, dropout, fully_connected

# train_mode is used below but not defined in this excerpt;
# a boolean placeholder (True while training) is assumed here.
train_mode = tf.placeholder(tf.bool, name="train_mode")

xavier_init = tf.contrib.layers.xavier_initializer()
bn_params = {
    "is_training": train_mode,
    "decay": 0.9,
    "updates_collections": None,
}
# Set up the weight initializer and the batch-normalization parameters.

with arg_scope([fully_connected],
               activation_fn=tf.nn.relu,
               weights_initializer=xavier_init,
               biases_initializer=None,
               normalizer_fn=batch_norm,
               normalizer_params=bn_params):
    hidden_layer1 = fully_connected(X, hidden_output_size, scope="h1")
    h1_drop = dropout(hidden_layer1, keep_prob, is_training=train_mode)
    hidden_layer2 = fully_connected(h1_drop, hidden_output_size, scope="h2")
    h2_drop = dropout(hidden_layer2, keep_prob, is_training=train_mode)
    hidden_layer3 = fully_connected(h2_drop, hidden_output_size, scope="h3")
    h3_drop = dropout(hidden_layer3, keep_prob, is_training=train_mode)
    hidden_layer4 = fully_connected(h3_drop, hidden_output_size, scope="h4")
    h4_drop = dropout(hidden_layer4, keep_prob, is_training=train_mode)
    hypothesis = fully_connected(h4_drop, final_output_size, activation_fn=None, scope="hypothesis")
# Unlike the earlier examples, the layers are built with the higher-level API.
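The 'is_training' and 'decay' entries control batch normalization, which behaves differently in training and at evaluation time. The plain-Python sketch below shows the general idea for a single feature: normalize with the batch statistics while training and keep a moving average (decay 0.9, as in bn_params) for later use. It is a simplified illustration, not the library's implementation: the function names and the eps value are assumptions, and the learned offset is left out.

```python
def batch_norm_train(batch, moving_mean, moving_var, decay=0.9, eps=1e-3):
    """Normalize one feature over a batch and update the moving statistics."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    normalized = [(x - mean) / (var + eps) ** 0.5 for x in batch]
    # Exponential moving averages, reused at evaluation time.
    new_mean = decay * moving_mean + (1.0 - decay) * mean
    new_var = decay * moving_var + (1.0 - decay) * var
    return normalized, new_mean, new_var


def batch_norm_eval(x, moving_mean, moving_var, eps=1e-3):
    """At evaluation time, normalize with the stored moving statistics."""
    return (x - moving_mean) / (moving_var + eps) ** 0.5


# Hypothetical batch of one feature's values, with the moving stats at 0 and 1.
normalized, moving_mean, moving_var = batch_norm_train([1.0, 2.0, 3.0, 4.0], 0.0, 1.0)
print(normalized, moving_mean, moving_var)
print(batch_norm_eval(2.5, moving_mean, moving_var))
```

This is why the graph needs a train_mode flag: the same layer has to switch between batch statistics and stored statistics.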

[10-8 mnist_nn_selu(wip)]

import numbers

import tensorflow as tf
from tensorflow.python.framework import ops, tensor_shape, tensor_util
from tensorflow.python.layers import utils
from tensorflow.python.ops import array_ops, math_ops, random_ops


def selu(x):
    with ops.name_scope('elu') as scope:
        alpha = 1.6732632423543772848170429916717
        scale = 1.0507009873554804934193349852946
        return scale * tf.where(x >= 0.0, x, alpha * tf.nn.elu(x))


def dropout_selu(x, keep_prob, alpha=-1.7580993408473766, fixedPointMean=0.0, fixedPointVar=1.0,
                 noise_shape=None, seed=None, name=None, training=False):
    """Dropout to a value with rescaling."""

    def dropout_selu_impl(x, rate, alpha, noise_shape, seed, name):
        keep_prob = 1.0 - rate
        x = ops.convert_to_tensor(x, name="x")
        if isinstance(keep_prob, numbers.Real) and not 0 < keep_prob <= 1:
            raise ValueError("keep_prob must be a scalar tensor or a float in the "
                             "range (0, 1], got %g" % keep_prob)
        keep_prob = ops.convert_to_tensor(keep_prob, dtype=x.dtype, name="keep_prob")
        keep_prob.get_shape().assert_is_compatible_with(tensor_shape.scalar())

        alpha = ops.convert_to_tensor(alpha, dtype=x.dtype, name="alpha")
        alpha.get_shape().assert_is_compatible_with(tensor_shape.scalar())

        if tensor_util.constant_value(keep_prob) == 1:
            return x

        noise_shape = noise_shape if noise_shape is not None else array_ops.shape(x)
        # floor(keep_prob + uniform[0, 1)) is 1 with probability keep_prob, else 0.
        random_tensor = keep_prob
        random_tensor += random_ops.random_uniform(noise_shape, seed=seed, dtype=x.dtype)
        binary_tensor = math_ops.floor(random_tensor)
        # Dropped units are set to alpha instead of 0.
        ret = x * binary_tensor + alpha * (1 - binary_tensor)

        # Affine correction that restores the target mean and variance.
        a = tf.sqrt(fixedPointVar / (keep_prob * ((1 - keep_prob) * tf.pow(alpha - fixedPointMean, 2) + fixedPointVar)))
        b = fixedPointMean - a * (keep_prob * fixedPointMean + (1 - keep_prob) * alpha)
        ret = a * ret + b
        ret.set_shape(x.get_shape())
        return ret

    with ops.name_scope(name, "dropout", [x]) as name:
        # The inner function expects a drop rate, so pass 1 - keep_prob.
        # (Passing keep_prob straight through would keep only 1 - keep_prob of the units.)
        return utils.smart_cond(training,
                                lambda: dropout_selu_impl(x, 1.0 - keep_prob, alpha, noise_shape, seed, name),
                                lambda: array_ops.identity(x))
# Defining a custom dropout function gives finer control over how dropout is applied.
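The SELU activation itself is easy to check in plain Python. The sketch below uses the same two constants as the code above (rounded to float precision) and shows where the default alpha of dropout_selu comes from: it is the value SELU approaches for very negative inputs, -scale * alpha.

```python
import math

ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit for a single float."""
    if x >= 0.0:
        return SCALE * x
    return SCALE * ALPHA * math.expm1(x)  # expm1(x) = exp(x) - 1


print(selu(1.0))     # equals SCALE
print(selu(0.0))     # 0.0
print(selu(-100.0))  # saturates near -SCALE * ALPHA
```

So when dropout_selu "drops" a unit, it sets it to SELU's negative saturation value rather than to zero, and then rescales to keep the mean and variance at the chosen fixed point.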

[10-X1 mnist_back_prop]

The last example applies the backpropagation algorithm to MNIST.
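The notebook's code is not reproduced in this post, so the following is only a generic illustration of the idea, not the notebook's method. For a softmax output with cross-entropy loss, backpropagation starts from a simple gradient with respect to the logits: softmax(logits) minus the one-hot label. The plain-Python sketch checks that formula against a numerical derivative; all names and values are illustrative.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss(logits, one_hot):
    """Cross-entropy between softmax(logits) and a one-hot label."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, softmax(logits)) if y > 0)


def analytic_grad(logits, one_hot):
    """Gradient of the loss with respect to each logit: p - y."""
    return [p - y for p, y in zip(softmax(logits), one_hot)]


def numeric_grad(logits, one_hot, eps=1e-6):
    """Central finite differences, used only to check the formula."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, one_hot) - loss(down, one_hot)) / (2 * eps))
    return grads


# Hypothetical 3-class example.
logits = [2.0, 1.0, 0.1]
label = [0.0, 1.0, 0.0]
print(analytic_grad(logits, label))
print(numeric_grad(logits, label))
```

From this starting gradient, the chain rule carries the error back through each layer's weights, which is what the optimizer in the earlier examples does automatically.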

Attached notebooks (0.01MB each):

10-1-mnist_softmax.ipynb
10-2-mnist_nn.ipynb
10-3-mnist_nn_xavier.ipynb
10-4-mnist_nn_deep.ipynb
10-5-mnist_nn_dropout.ipynb
10-7-mnist_nn_higher_level_API.ipynb
10-8-mnist_nn_selu(wip).ipynb
10-X1-mnist_back_prop.ipynb
