[Deep Learning for Everyone (모두의 딥러닝) Chapter12]


JackYoon 2020. 1. 28. 15:37

[12-0 rnn-basics]

A basic look at the RNN cell.

h = [1, 0, 0, 0]
e = [0, 1, 0, 0]
l = [0, 0, 1, 0]
o = [0, 0, 0, 1]

# One-hot encoding of the characters in "hello", used in the exercises below.

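# Setup assumed by the snippets in this section (TensorFlow 1.x); not shown in the original excerpt
import pprint
import numpy as np
import tensorflow as tf
pp = pprint.PrettyPrinter(indent=4)
sess = tf.InteractiveSession()

with tf.variable_scope('one_cell') as scope:
    # One-cell RNN: input_dim (4) -> output_dim (2)
    hidden_size = 2
    cell = tf.keras.layers.SimpleRNNCell(units=hidden_size)
    print(cell.output_size, cell.state_size)

    x_data = np.array([[h]], dtype=np.float32)  # x_data = [[[1, 0, 0, 0]]]
    pp.pprint(x_data)
    outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)

    sess.run(tf.global_variables_initializer())
    pp.pprint(outputs.eval())

The input shape follows the dimensionality of the input data, but the output dimension is something we choose ourselves through hidden_size.

After creating the RNN cell, we run it with dynamic_rnn.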
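To make the input_dim (4) -> output_dim (2) mapping concrete, here is a minimal pure-Python sketch of one step of a vanilla RNN cell, h_next = tanh(x·W + h_prev·U + b). The weight values are made up for illustration (TensorFlow initializes them randomly), and rnn_step is a hypothetical helper, not a library function.

```python
import math

def rnn_step(x, h_prev, W, U, b):
    """One step of a vanilla RNN cell: tanh(x @ W + h_prev @ U + b)."""
    hidden_size = len(b)
    h_next = []
    for j in range(hidden_size):
        total = b[j]
        total += sum(x[i] * W[i][j] for i in range(len(x)))
        total += sum(h_prev[k] * U[k][j] for k in range(hidden_size))
        h_next.append(math.tanh(total))
    return h_next

# Illustrative weights only (assumed values): input_dim = 4, hidden_size = 2
W = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6], [0.7, -0.8]]
U = [[0.05, 0.1], [-0.1, 0.05]]
b = [0.0, 0.0]

h = [1, 0, 0, 0]     # one-hot vector for 'h'
state = [0.0, 0.0]   # zero initial state
out = rnn_step(h, state, W, U, b)
print(len(h), "->", len(out))  # 4 -> 2
```

The output length is set only by the shapes of W, U, and b, which is why hidden_size can be chosen independently of the input dimension.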

with tf.variable_scope('two_sequences') as scope:
    # One-cell RNN: input_dim (4) -> output_dim (2), sequence: 5
    hidden_size = 2
    cell = tf.keras.layers.SimpleRNNCell(units=hidden_size)

    x_data = np.array([[h, e, l, l, o]], dtype=np.float32)
    print(x_data.shape)
    pp.pprint(x_data)

    outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
    sess.run(tf.global_variables_initializer())
    pp.pprint(outputs.eval())

# With a sequence added, the input has shape (1, 5, 4).

(5 is the sequence length, since "hello" has five characters, and 4 is the size of each one-hot vector for h, e, l, o.)

with tf.variable_scope('3_batches') as scope:
    # One-cell RNN: input_dim (4) -> output_dim (2), sequence: 5, batch: 3
    x_data = np.array([[h, e, l, l, o],
                       [e, o, l, l, l],
                       [l, l, e, e, l]], dtype=np.float32)
    pp.pprint(x_data)

    hidden_size = 2
    cell = tf.nn.rnn_cell.LSTMCell(num_units=hidden_size, state_is_tuple=True)
    outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)

    sess.run(tf.global_variables_initializer())
    pp.pprint(outputs.eval())

# Check the output sequence produced for each input in the batch.

with tf.variable_scope('initial_state') as scope:
    batch_size = 3
    x_data = np.array([[h, e, l, l, o],
                       [e, o, l, l, l],
                       [l, l, e, e, l]], dtype=np.float32)
    pp.pprint(x_data)

    # One-cell RNN: input_dim (4) -> output_dim (2), sequence: 5, batch: 3
    hidden_size = 2
    cell = tf.nn.rnn_cell.LSTMCell(num_units=hidden_size, state_is_tuple=True)
    initial_state = cell.zero_state(batch_size, tf.float32)
    outputs, _states = tf.nn.dynamic_rnn(cell, x_data,
                                         initial_state=initial_state, dtype=tf.float32)

    sess.run(tf.global_variables_initializer())
    pp.pprint(outputs.eval())

# The initial state is created to match the batch size.

So, for an input of shape (num1, num2, num3) fed to the cell:

num1 : batch size

num2 : sequence length

num3 : input dimension (the shape of one input vector)
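As a quick check of the (batch, sequence, input) reading above, the following sketch measures the shape of the three-batch input using plain nested lists instead of NumPy. shape_of is a hypothetical helper written for this illustration.

```python
def shape_of(nested):
    """Return the shape of a regular (non-ragged) nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

h = [1, 0, 0, 0]
e = [0, 1, 0, 0]
l = [0, 0, 1, 0]
o = [0, 0, 0, 1]

x_data = [[h, e, l, l, o],
          [e, o, l, l, l],
          [l, l, e, e, l]]

batch_size, sequence_length, input_dim = shape_of(x_data)
print(batch_size, sequence_length, input_dim)  # 3 5 4
```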

[12-1 hello-rnn]

Training an RNN on "hihello".

idx2char = ['h', 'i', 'e', 'l', 'o']

# Set up the characters we need.

x_data = [[0, 1, 0, 2, 3, 3]]    # hihell
x_one_hot = [[[1, 0, 0, 0, 0],   # h 0
              [0, 1, 0, 0, 0],   # i 1
              [1, 0, 0, 0, 0],   # h 0
              [0, 0, 1, 0, 0],   # e 2
              [0, 0, 0, 1, 0],   # l 3
              [0, 0, 0, 1, 0]]]  # l 3
y_data = [[1, 0, 2, 3, 3, 4]]    # ihello

# Prepare the data: x_data is fed in one-hot form, while y_data stays as class indices.
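The hand-written one-hot matrix can be derived from the index list, which is a useful sanity check. This is a small sketch; one_hot is a hypothetical helper, not the tf.one_hot function.

```python
def one_hot(index, depth):
    """Return a list of length `depth` with a 1 at position `index`."""
    return [1 if i == index else 0 for i in range(depth)]

idx2char = ['h', 'i', 'e', 'l', 'o']
x_data = [[0, 1, 0, 2, 3, 3]]
y_data = [[1, 0, 2, 3, 3, 4]]

x_one_hot = [[one_hot(i, len(idx2char)) for i in seq] for seq in x_data]
x_text = ''.join(idx2char[i] for i in x_data[0])
y_text = ''.join(idx2char[i] for i in y_data[0])
print(x_text, "->", y_text)  # hihell -> ihello
```

The target is simply the input shifted one character to the left, so the network learns to predict the next character at every step.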

num_classes = 5
input_dim = 5
batch_size = 1
hidden_size = 5
sequence_length = 6
learning_rate = 0.1

# Set the hyperparameters.

X = tf.placeholder(tf.float32, [None, sequence_length, input_dim])
Y = tf.placeholder(tf.int32, [None, sequence_length])

# The placeholders fix the batch size, sequence length, and input dimension: (batch, sequence, input) = (None, 6, 5).

cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(cell, X, initial_state=initial_state, dtype=tf.float32)

# Build the cell. Its output size is hidden_size (5 here).

x_for_fc = tf.reshape(outputs, [-1, hidden_size])
outputs = tf.contrib.layers.fully_connected(inputs=x_for_fc, num_outputs=num_classes, activation_fn=None)
outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])

# Flatten the outputs for the fully connected layer, then reshape the result back to (batch, sequence, classes).

weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

# As before, define the cost function and the optimizer.

Note that the cost is computed at every step of the sequence.
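To illustrate what a per-step cost means, here is a pure-Python sketch of a weighted average of softmax cross-entropy over all time steps. It reflects my understanding of what sequence_loss computes with its default averaging and is not the library's implementation; sequence_loss_sketch and the toy logits are assumptions made for this example.

```python
import math

def softmax_cross_entropy(logits, target):
    """Cross-entropy for one step: -log(softmax(logits)[target])."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[target]

def sequence_loss_sketch(logits, targets, weights):
    """Weighted mean of per-step cross-entropy over batch and time."""
    total, weight_sum = 0.0, 0.0
    for seq_logits, seq_targets, seq_weights in zip(logits, targets, weights):
        for step_logits, t, w in zip(seq_logits, seq_targets, seq_weights):
            total += w * softmax_cross_entropy(step_logits, t)
            weight_sum += w
    return total / weight_sum

# Toy example: batch of 1, two time steps, three classes
logits = [[[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
targets = [[0, 1]]
weights = [[1.0, 1.0]]
loss = sequence_loss_sketch(logits, targets, weights)
print(loss)
```

Setting a weight to 0 drops that step from the average, which is how padded positions are usually ignored; with weights of all ones, as in the notes above, every step counts equally.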

prediction = tf.argmax(outputs, axis=2)

# Take the most likely class at each step so the predictions can be checked against the targets.

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(50):
        l, _ = sess.run([loss, train], feed_dict={X: x_one_hot, Y: y_data})
        result = sess.run(prediction, feed_dict={X: x_one_hot})
        print(i, "loss:", l, "prediction: ", result, "true Y: ", y_data)

        # Map each predicted index back to its character
        result_str = [idx2char[c] for c in np.squeeze(result)]
        print("\tPrediction str: ", ''.join(result_str))

# Run the actual training, and at each iteration look up the character that matches each predicted index.

[12-2 char-seq-rnn]

The "if you want" example. Spaces must be included in the index as well.

sample = "if you want"
idx2char = list(set(sample))  # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)}  # char -> index

# Split the string into unique characters, then assign an index to each.
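The two lookup tables are inverses of each other, which the round trip below demonstrates. One deliberate difference from the notes: sorted() replaces list() so that the indices are the same on every run; that choice is mine, made only to keep the example reproducible.

```python
sample = "if you want"

# sorted() instead of list(): an illustrative choice for stable indices
idx2char = sorted(set(sample))
char2idx = {c: i for i, c in enumerate(idx2char)}

encoded = [char2idx[c] for c in sample]
decoded = ''.join(idx2char[i] for i in encoded)

print(len(idx2char))  # 10 unique characters, including the space
print(decoded)        # if you want
```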

dic_size = len(char2idx)
hidden_size = len(char2idx)
num_classes = len(char2idx)
batch_size = 1
sequence_length = len(sample) - 1
learning_rate = 0.1

# Set the hyperparameters.

sample_idx = [char2idx[c] for c in sample]
x_data = [sample_idx[:-1]]
y_data = [sample_idx[1:]]

# Set up x_data and y_data.

For the input "if you wan", the characters to predict are "f you want", which is the same string shifted by one position. The data is sliced accordingly.
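The slicing can be checked directly on the string, before any indices are involved. This short sketch prints each input character next to the character the network should predict for it.

```python
sample = "if you want"

x_str = sample[:-1]  # every character except the last
y_str = sample[1:]   # every character except the first

for x_char, y_char in zip(x_str, y_str):
    print(repr(x_char), "->", repr(y_char))
```

Both slices have length len(sample) - 1, which is where the sequence_length value comes from.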

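# Placeholders for the index sequences (not shown in the original excerpt)
X = tf.placeholder(tf.int32, [None, sequence_length])
Y = tf.placeholder(tf.int32, [None, sequence_length])

x_one_hot = tf.one_hot(X, num_classes)
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(cell, x_one_hot, initial_state=initial_state, dtype=tf.float32)

# Create the cell and run it with dynamic_rnn().

outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])

# Reshape the outputs to (batch, sequence, classes). This works here because hidden_size equals num_classes.

The per-class scores are later collapsed back into indices.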

weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

# Implement the cost and the optimizer.

prediction = tf.argmax(outputs, axis=2)

# Prepare the predictions used to judge accuracy.

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(50):
        l, _ = sess.run([loss, train], feed_dict={X: x_data, Y: y_data})
        result = sess.run(prediction, feed_dict={X: x_data})

        # Map each predicted index back to its character
        result_str = [idx2char[c] for c in np.squeeze(result)]
        print(i, "loss:", l, "Prediction:", ''.join(result_str))

# Run the actual training, matching each output index to its character as training proceeds.

As training progresses, the predicted continuation of "if you want" becomes more and more accurate.

[12-3 char-seq-softmax-only]

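Because each position is classified into a character, the data is flattened and passed straight to a softmax layer, with no RNN.

sample = "if you want you"
idx2char = list(set(sample))  # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)}  # char -> index

# Prepare the data as before.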

dic_size = len(char2idx)
rnn_hidden_size = len(char2idx)
num_classes = len(char2idx)
batch_size = 1
sequence_length = len(sample) - 1
learning_rate = 0.1

# Set the hyperparameters.

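# Placeholders for the index sequences (not shown in the original excerpt)
X = tf.placeholder(tf.int32, [None, sequence_length])
Y = tf.placeholder(tf.int32, [None, sequence_length])

X_one_hot = tf.one_hot(X, num_classes)
X_for_softmax = tf.reshape(X_one_hot, [-1, rnn_hidden_size])

# Flatten the data so it can be fed to the softmax layer.

softmax_w = tf.get_variable("softmax_w", [rnn_hidden_size, num_classes])
softmax_b = tf.get_variable("softmax_b", [num_classes])
outputs = tf.matmul(X_for_softmax, softmax_w) + softmax_b

# Implement the softmax layer (these are the logits; the softmax itself is applied inside the loss).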
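Since the input to this layer is one-hot, the matrix product just selects one row of the weight matrix, so the prediction depends only on the current character and not on anything before it. The sketch below shows this with made-up weights for a three-character vocabulary; matmul_row and softmax are hypothetical helpers written in plain Python.

```python
import math

def matmul_row(x, w):
    """Multiply a row vector x (length n) by an n x m matrix w."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed toy weights: 3 input characters -> 3 classes
softmax_w = [[0.1, 2.0, 0.3],
             [1.5, 0.2, 0.1],
             [0.0, 0.4, 3.0]]
softmax_b = [0.0, 0.0, 0.0]

x = [0, 1, 0]  # one-hot vector for character index 1
logits = [v + b for v, b in zip(matmul_row(x, softmax_w), softmax_b)]
probs = softmax(logits)

print(logits)  # matches row 1 of softmax_w
print(probs.index(max(probs)))  # 0
```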

outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])
weights = tf.ones([batch_size, sequence_length])

# Reshape the outputs into the RNN-style (batch, sequence, classes) layout.

sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)  # mean of all sequence losses
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

# Apply the cost function and the optimizer.

[12-4 rnn_long_char]

Applying an RNN to a long sentence.

sentence = ("if you want to build a ship, don't drum up people together to "
            "collect wood and don't assign them tasks and work, but rather "
            "teach them to long for the endless immensity of the sea.")

char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}

# Prepare the dataset.

data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
sequence_length = 10  # window size
learning_rate = 0.1

# Prepare the hyperparameters. sequence_length acts as the window size.

dataX = []
dataY = []
for i in range(0, len(sentence) - sequence_length):
    x_str = sentence[i:i + sequence_length]
    y_str = sentence[i + 1:i + sequence_length + 1]
    print(i, x_str, ' -> ', y_str)

    x = [char_dic[c] for c in x_str]  # x string to indices
    y = [char_dic[c] for c in y_str]  # y string to indices

    dataX.append(x)
    dataY.append(y)

batch_size = len(dataX)

# With the window size fixed, slide the window along the sentence to extract training patterns.
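The sliding window is easier to follow on a short string. The sketch below applies the same slicing to a 14-character prefix of the sentence; make_windows is a hypothetical helper that wraps the loop above.

```python
def make_windows(text, sequence_length):
    """Return (input, target) string pairs, one per window position."""
    pairs = []
    for i in range(0, len(text) - sequence_length):
        x_str = text[i:i + sequence_length]
        y_str = text[i + 1:i + sequence_length + 1]
        pairs.append((x_str, y_str))
    return pairs

pairs = make_windows("if you want to", 10)
for x_str, y_str in pairs:
    print(repr(x_str), "->", repr(y_str))
```

A text of length n yields n - sequence_length windows, and each target window is its input window shifted right by one character.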

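from tensorflow.contrib import rnn

# Placeholders and one-hot input (not shown in the original excerpt)
X = tf.placeholder(tf.int32, [None, sequence_length])
Y = tf.placeholder(tf.int32, [None, sequence_length])
X_one_hot = tf.one_hot(X, num_classes)

def lstm_cell():
    cell = rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)
    return cell

# Stack two LSTM cells
multi_cells = rnn.MultiRNNCell([lstm_cell() for _ in range(2)], state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(multi_cells, X_one_hot, dtype=tf.float32)

# Prepare the cells and run them with dynamic_rnn().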

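# Equal weight for every step (definition not shown in the original excerpt)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y, weights=weights)
mean_loss = tf.reduce_mean(sequence_loss)
train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(mean_loss)

# Prepare the cost function and the optimizer.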

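# Session setup (not shown in the original excerpt)
sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(500):
    _, l, results = sess.run(
        [train_op, mean_loss, outputs], feed_dict={X: dataX, Y: dataY})
    for j, result in enumerate(results):
        index = np.argmax(result, axis=1)
        print(i, j, ''.join([char_set[t] for t in index]), l)

# Train on the extracted patterns.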

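results = sess.run(outputs, feed_dict={X: dataX})
for j, result in enumerate(results):
    index = np.argmax(result, axis=1)
    if j == 0:  # print the whole first window to start the sentence
        print(''.join([char_set[t] for t in index]), end='')
    else:  # afterwards, only the last character of each window is new
        print(char_set[index[-1]], end='')

# After training, join the window predictions in order to rebuild the full sentence.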
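The stitching rule (the whole first window, then only the last character of every later window) can be verified without a trained model. In the sketch below the "predictions" are simply the true target windows, standing in for a perfect model; stitch is a hypothetical helper.

```python
def stitch(window_predictions):
    """Join overlapping windows: first window in full, then each last character."""
    pieces = []
    for j, window in enumerate(window_predictions):
        if j == 0:
            pieces.append(window)
        else:
            pieces.append(window[-1])
    return ''.join(pieces)

sentence = "if you want to build a ship"
sequence_length = 10

# Stand-in for perfect model output: the true target window at each position
perfect = [sentence[i + 1:i + sequence_length + 1]
           for i in range(len(sentence) - sequence_length)]

print(stitch(perfect))  # f you want to build a ship
```

The result is the sentence without its first character, because the first character is only ever an input and never a prediction target.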

[12-5 rnn_stock_prediction]

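Applying an RNN to time-series (stock) data, framed as a regression problem.

def MinMaxScalar(data):
    # Scale each column to the range [0, 1]
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    # Add a small constant to avoid division by zero
    return numerator / (denominator + 1e-7)

# Min-max scaling, so that all features fall in a comparable range.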
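Here is the same column-wise scaling written with plain lists, which makes the arithmetic visible. The rows are made-up numbers standing in for a price column and a volume column, and min_max_scale is a hypothetical pure-Python counterpart of the function above.

```python
def min_max_scale(rows):
    """Scale each column of a list of rows to roughly [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [[(v - lo) / (hi - lo + 1e-7) for v, lo, hi in zip(row, lows, highs)]
            for row in rows]

# Made-up data: [price, volume]
rows = [[100.0, 1000.0],
        [110.0, 3000.0],
        [120.0, 2000.0]]

scaled = min_max_scale(rows)
for row in scaled:
    print(row)
```

Without scaling, the volume column would dominate the loss simply because its values are larger; after scaling, both columns span about the same range.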

xy = np.loadtxt('data-02-stock_daily.csv', delimiter=',')
xy = xy[::-1]  # reverse the row order

# Load the data and get it ready.

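# seq_length (the window length) must already be defined at this point
train_size = int(len(xy) * 0.7)
train_set = xy[0:train_size]
test_set = xy[train_size - seq_length:]  # start seq_length rows early so the first test window is complete

train_set = MinMaxScalar(train_set)
test_set = MinMaxScalar(test_set)

# Split the data into a train set and a test set, then preprocess each with MinMaxScalar().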

def build_dataset(time_series, seq_length):
    dataX = []
    dataY = []
    for i in range(0, len(time_series) - seq_length):
        _x = time_series[i:i + seq_length, :]
        _y = time_series[i + seq_length, [-1]]  # next close price
        print(_x, "->", _y)
        dataX.append(_x)
        dataY.append(_y)
    return np.array(dataX), np.array(dataY)

# Cut the series into windows of seq_length rows, each paired with the value that follows, ready for training.
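The windowing can be traced on a tiny series. This sketch uses plain lists and made-up rows of [open, close]; build_windows is a hypothetical list-based counterpart of build_dataset, with the last column treated as the value to predict.

```python
def build_windows(time_series, seq_length):
    """Pair each window of seq_length rows with the next row's last column."""
    data_x, data_y = [], []
    for i in range(len(time_series) - seq_length):
        data_x.append(time_series[i:i + seq_length])
        data_y.append([time_series[i + seq_length][-1]])
    return data_x, data_y

# Made-up rows of [open, close]; real data would come from the CSV file
series = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]

data_x, data_y = build_windows(series, seq_length=3)
for x, y in zip(data_x, data_y):
    print(x, "->", y)
```

Five rows with a window of three give two samples, so the resulting arrays have shapes (2, 3, 2) for the inputs and (2, 1) for the targets.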

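# X, Y, hidden_dim, and output_dim are assumed to be defined earlier in the notebook
cell = tf.contrib.rnn.BasicLSTMCell(
    num_units=hidden_dim, state_is_tuple=True, activation=tf.tanh)
outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
# Use only the output of the last time step for the prediction
Y_pred = tf.contrib.layers.fully_connected(outputs[:, -1], output_dim, activation_fn=None)

# Prepare the cell and dynamic_rnn.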

loss = tf.reduce_sum(tf.square(Y_pred - Y))  # sum of squared errors
optimizer = tf.train.AdamOptimizer(learning_rate)
train = optimizer.minimize(loss)

# Prepare the cost function and the optimizer.

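# trainX/trainY and testX/testY are assumed to come from build_dataset() on the
# train and test sets; iterations, rmse, targets, and predictions are assumed to
# be defined earlier in the notebook.
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)

    # Training step
    for i in range(iterations):
        _, step_loss = sess.run([train, loss], feed_dict={
            X: trainX, Y: trainY})
        print("[step: {}] loss: {}".format(i, step_loss))

    # Test step
    test_predict = sess.run(Y_pred, feed_dict={X: testX})
    rmse_val = sess.run(rmse, feed_dict={
        targets: testY, predictions: test_predict})
    print("RMSE: {}".format(rmse_val))

# Train the model and produce the result.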
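The excerpt does not show how rmse is defined. Assuming it is the usual root-mean-square error between targets and predictions, the computation amounts to the following pure-Python sketch, using made-up numbers.

```python
import math

def rmse(targets, predictions):
    """Root-mean-square error between two equal-length lists."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up values: only the last prediction is off, by 2.0
value = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(value)
```

Because the inputs were min-max scaled, an RMSE computed this way is in scaled units, not in the original price units.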

12-0-rnn-basics.ipynb (0.08MB)

12-1-hello-rnn.ipynb (0.01MB)

12-2-char-seq-rnn.ipynb (0.01MB)

12-3-char-seq-softmax-only.ipynb (0.16MB)

12-4-rnn_long_char.ipynb (0.02MB)

12-5-rnn_stock_prediction.ipynb (0.35MB)
