【Pytorch 실습】 Mnist 데이터셋 MLP, CNN으로 학습 및 추론

21 Jul 2020 in Pytorch / Docker / Git

Mnist 데이터 셋을 다운받고 pytorch를 사용해서 학습 및 추론을 합니다. 아래의 코드는 Kaggle 및 Git의 공개된 코드를 적극 활용한, 과거의 공부한 내용을 정리한 내용입니다.

train, test 파일 다운받는 경로 :
https://www.kaggle.com/oddrationale/mnist-in-csv

PS.

jupyter 실행결과도 코드 박스로 출력되어 있습니다.
matplot을 이용해서 이미지를 출력하는 이미지는 첨부하지 않았습니다.

MLP 이용하기

1. 데이터 로드하기

# Load Dataset
import pandas as pd
train_dataset = pd.read_csv("./mnist_train.csv")
test_dataset = pd.read_csv("./mnist_test.csv")

# Check Train Set
train_dataset.head()

# Split to Image & Label
train_images = (train_dataset.iloc[:, 1:].values).astype("float32")
train_labels = train_dataset["label"].values
test_images = (test_dataset.iloc[:, 1:].values).astype("float32")

# Check Train Data's Image
print(train_images)

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]

# Check Train Data's Label
print(train_labels)

[5 0 4 ... 5 6 8]

# Check Test Data's Image
test_images

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

# Split into Train, Valid Dataset 분리 비율은 8:2
from sklearn.model_selection import train_test_split
train_images, valid_images, train_labels, valid_labels = train_test_split(train_images,
                                                                          train_labels,
                                                                          stratify = train_labels,
                                                                          random_state = 42,
                                                                          test_size = 0.2)

# Check Train, Valid, Test Image's Shape
print("The Shape of Train Images: ", train_images.shape)
print("The Shape of Valid Images: ", valid_images.shape)
print("The Shape of Test Images: ", test_images.shape)

The Shape of Train Images:  (48000, 784)
The Shape of Valid Images:  (12000, 784)
The Shape of Test Images:  (10000, 784)

# Check Train, Valid Label's Shape
print("The Shape of Train Labels: ", train_labels.shape)
print("The Shape of Valid Labels: ", valid_labels.shape)

The Shape of Train Labels:  (38400,)
The Shape of Valid Labels:  (9600,)

# Reshape image's size to check for ours
train_images = train_images.reshape(train_images.shape[0], 28, 28)
valid_images = valid_images.reshape(valid_images.shape[0], 28, 28)
test_images = test_images.reshape(test_images.shape[0], 28, 28)

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-18-f37841568d28> in <module>
      2 train_images = train_images.reshape(train_images.shape[0], 28, 28)
      3 valid_images = valid_images.reshape(valid_images.shape[0], 28, 28)
----> 4 test_images = test_images.reshape(test_images.shape[0], 28, 28)


ValueError: cannot reshape array of size 7850000 into shape (10000,28,28)

# Check Train, Valid, Test Image's Shape after reshape
print("The Shape of Train Images: ", train_images.shape)
print("The Shape of Valid Images: ", valid_images.shape)
print("The Shape of Test Images: ", test_images.shape)

The Shape of Train Images:  (38400, 28, 28)
The Shape of Valid Images:  (9600, 28, 28)
The Shape of Test Images:  (10000, 785)

# Visualize Train, Valid, Test's Images
import matplotlib.pyplot as plt
for idx in range(0, 5):
    plt.imshow(train_images[idx], cmap = plt.get_cmap('gray'))
    plt.title(train_labels[idx])
    plt.show()

for idx in range(0, 5):
    plt.imshow(valid_images[idx], cmap = plt.get_cmap('gray'))
    plt.title(valid_labels[idx])
    plt.show()

for idx in range(0, 5):
    plt.imshow(test_images[idx], cmap = plt.get_cmap('gray'))
    plt.show()

2. Torch를 이용한 데이터 로드, 모델 생성

지금까지 로드한 데이터를 torch.tensor형태로 데이터를 변환해준다.

# Make Dataloader to feed on Multi Layer Perceptron Model
# train과 test 셋을 batch 단위로 데이터를 처리하기 위해서. _loader를 정의 해준다.
import torch
from torch.utils.data import TensorDataset, DataLoader
train_images_tensor = torch.tensor(train_images)
train_labels_tensor = torch.tensor(train_labels)
train_tensor = TensorDataset(train_images_tensor, train_labels_tensor)
train_loader = DataLoader(train_tensor, batch_size = 64, num_workers = 0, shuffle = True)

valid_images_tensor = torch.tensor(valid_images)
valid_labels_tensor = torch.tensor(valid_labels)
valid_tensor = TensorDataset(valid_images_tensor, valid_labels_tensor)
valid_loader = DataLoader(valid_tensor, batch_size = 64, num_workers = 0, shuffle = True)

test_images_tensor = torch.tensor(test_images)

# Create Multi Layer Perceptron Model
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.input_layer = nn.Linear(28 * 28, 128)
        self.hidden_layer = nn.Linear(128, 128)
        self.output_layer = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = F.relu(self.input_layer(x))
        x = F.relu(self.hidden_layer(x))
        x = self.output_layer(x)
        x = F.log_softmax(x, dim = 1)
        return x

USE_CUDA = torch.cuda.is_available()
DEVICE = torch.device("cuda" if USE_CUDA else "cpu")

model = MLP().to(DEVICE)
optimizer = optim.Adam(model.parameters(), lr = 0.001)  # optimization 함수 설정

print("Model: ", model)
print("Device: ", DEVICE)

Model:  MLP(
  (input_layer): Linear(in_features=784, out_features=128, bias=True)
  (hidden_layer): Linear(in_features=128, out_features=128, bias=True)
  (output_layer): Linear(in_features=128, out_features=10, bias=True)
)
Device:  cpu

3. 여기까지 모델 설계. 이제 학습 시작

# Definite Train & Evaluate
# 이제 학습키자!
def train(model, train_loader, optimizer):
    model.train()  # "나 모델을 이용해서 학습시킬게! 라는 의미"
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(DEVICE), target.to(DEVICE)
        optimizer.zero_grad()
        output = model(data)
        loss = F.cross_entropy(output, target)  # criterion
        loss.backward()
        optimizer.step()

        if batch_idx % 100 == 0:
            print("Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}".format(epoch, batch_idx * len(data), len(train_loader.dataset), 100. * batch_idx / len(train_loader), loss.item()))

def evaluate(model, valid_loader):
    model.eval()  # "나 모델을 이용해서 학습시킬게! 라는 의미"
    valid_loss = 0  # just Loss를 확인하기 위해서.
    correct = 0

    with torch.no_grad():
        for data, target in valid_loader:
            data, target = data.to(DEVICE), target.to(DEVICE)
            output = model(data)
            valid_loss += F.cross_entropy(output, target, reduction = "sum").item()
            prediction = output.max(1, keepdim = True)[1]
            correct += prediction.eq(target.view_as(prediction)).sum().item()

    valid_loss /= len(valid_loader.dataset)
    valid_accuracy = 100. * correct / len(valid_loader.dataset)
    return valid_loss, valid_accuracy

''' Training'''
EPOCHS = 10
for epoch in range(1, EPOCHS + 1):
    train(model, train_loader, optimizer)
    valid_loss, valid_accuracy = evaluate(model, valid_loader)
    print("[EPOCH: {}], \tValidation Loss: {:.4f}, \tValidation Accuracy: {:.2f} %\n".format(epoch, valid_loss, valid_accuracy))

Train Epoch: 9 [0/38400 (0%)]	Loss: 0.005890
Train Epoch: 9 [6400/38400 (17%)]	Loss: 0.054453
Train Epoch: 9 [12800/38400 (33%)]	Loss: 0.017136
Train Epoch: 9 [19200/38400 (50%)]	Loss: 0.004437
Train Epoch: 9 [25600/38400 (67%)]	Loss: 0.150551
Train Epoch: 9 [32000/38400 (83%)]	Loss: 0.177758
[EPOCH: 9], 	Validation Loss: 0.2030, 	Validation Accuracy: 95.92 %

Train Epoch: 10 [0/38400 (0%)]	Loss: 0.075817
Train Epoch: 10 [6400/38400 (17%)]	Loss: 0.075776
Train Epoch: 10 [12800/38400 (33%)]	Loss: 0.010031
Train Epoch: 10 [19200/38400 (50%)]	Loss: 0.056201
Train Epoch: 10 [25600/38400 (67%)]	Loss: 0.056001
Train Epoch: 10 [32000/38400 (83%)]	Loss: 0.103519
[EPOCH: 10], 	Validation Loss: 0.2131, 	Validation Accuracy: 95.54 %

4. Test 하고 Test 결과를 출력해보자.

# Predict Test Dataset
# Validation 과 Test 같은 경우에 다음과 같이 torch.no_grad를 꼭 사용하니 참고하자.
def testset_prediction(model, test_images_tensor):
    model.eval()  # test모드 아니고, validation모드를 사용한다.
    result = []
    with torch.no_grad():
        for data in test_images_tensor:
            data = data.to(DEVICE)
            output = model(data)
            prediction = output.max(1, keepdim = True)[1]
            result.append(prediction.tolist())
    return result

test_predict_result = testset_prediction(model, test_images_tensor)
test_predict_result[:5]

[[[7]], [[2]], [[1]], [[0]], [[4]]]

import numpy as np
from collections import Counter
Counter(np.squeeze(test_predict_result)).most_common()

[(1, 1139),
 (3, 1115),
 (9, 1113),
 (7, 991),
 (0, 979),
 (8, 959),
 (6, 949),
 (4, 938),
 (2, 934),
 (5, 883)]

for idx in range(0, 10):
    plt.imshow(test_images[idx], cmap = plt.get_cmap('gray'))
    plt.title("Predict: " + str(test_predict_result[idx]))
    plt.show()

CNN 이용하기

1. Data Load 하기

# Load Dataset
import pandas as pd
train_dataset = pd.read_csv("./Mnist_train.csv")
test_dataset = pd.read_csv("./Mnist_test.csv")

# Check Train Set
train_dataset.head()

# Split to Image & Label
train_images = (train_dataset.iloc[:, 1:].values).astype("float32")
train_labels = train_dataset["label"].values
test_images = (test_dataset.values).astype("float32")

# Check Train Data's Image 
train_images

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

# Check Train Data's Label
train_labels

array([1, 0, 1, ..., 7, 6, 9], dtype=int64)

# Check Test Data's Image
test_images

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

# Split into Train, Valid Dataset
from sklearn.model_selection import train_test_split
train_images, valid_images, train_labels, valid_labels = train_test_split(train_images, 
                                                                          train_labels, 
                                                                          stratify = train_labels, 
                                                                          random_state = 42, 
                                                                          test_size = 0.2)

# Check Train, Valid, Test Image's Shape
print("The Shape of Train Images: ", train_images.shape)
print("The Shape of Valid Images: ", valid_images.shape)
print("The Shape of Test Images: ", test_images.shape)

The Shape of Train Images:  (33600, 784)
The Shape of Valid Images:  (8400, 784)
The Shape of Test Images:  (28000, 784)

# Check Train, Valid Label's Shape
print("The Shape of Train Labels: ", train_labels.shape)
print("The Shape of Valid Labels: ", valid_labels.shape)

The Shape of Train Labels:  (33600,)
The Shape of Valid Labels:  (8400,)

# Reshape image's size to check for ours
train_images = train_images.reshape(train_images.shape[0], 28, 28)
valid_images = valid_images.reshape(valid_images.shape[0], 28, 28)
test_images = test_images.reshape(test_images.shape[0], 28, 28)

# Check Train, Valid, Test Image's Shape after reshape
print("The Shape of Train Images: ", train_images.shape)
print("The Shape of Valid Images: ", valid_images.shape)
print("The Shape of Test Images: ", test_images.shape)

The Shape of Train Images:  (33600, 28, 28)
The Shape of Valid Images:  (8400, 28, 28)
The Shape of Test Images:  (28000, 28, 28)

# Visualize Train, Valid, Test's Images
import matplotlib.pyplot as plt
for idx in range(0, 5):
    plt.imshow(train_images[idx], cmap = plt.get_cmap('gray'))
    plt.title(train_labels[idx])
    plt.show()

for idx in range(0, 5):
    plt.imshow(valid_images[idx], cmap = plt.get_cmap('gray'))
    plt.title(valid_labels[idx])
    plt.show()

for idx in range(0, 5):
    plt.imshow(test_images[idx], cmap = plt.get_cmap('gray'))
    plt.show()

# Make Dataloader to feed on Multi Layer Perceptron Model
import torch
from torch.utils.data import TensorDataset, DataLoader
train_images_tensor = torch.tensor(train_images)
train_labels_tensor = torch.tensor(train_labels)
train_tensor = TensorDataset(train_images_tensor, train_labels_tensor)
train_loader = DataLoader(train_tensor, batch_size = 64, num_workers = 0, shuffle = True)

valid_images_tensor = torch.tensor(valid_images)
valid_labels_tensor = torch.tensor(valid_labels)
valid_tensor = TensorDataset(valid_images_tensor, valid_labels_tensor)
valid_loader = DataLoader(valid_tensor, batch_size = 64, num_workers = 0, shuffle = True)

test_images_tensor = torch.tensor(test_images)

2. 모델 생성

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size = 5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size = 5)
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = x.view(-1, 1, 28, 28)
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2(x), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        x = F.log_softmax(x)
        return x

USE_CUDA = torch.cuda.is_available()
DEVICE = torch.device("cuda" if USE_CUDA else "cpu")

model = CNN().to(DEVICE)
optimizer = optim.Adam(model.parameters(), lr = 0.001)

print("Model: ", model)
print("Device: ", DEVICE)

Model:  CNN(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=320, out_features=50, bias=True)
  (fc2): Linear(in_features=50, out_features=10, bias=True)
)
Device:  cuda

3. 학습

# Definite Train & Evaluate
def train(model, train_loader, optimizer):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(DEVICE), target.to(DEVICE)
        optimizer.zero_grad()
        output = model(data)
        loss = F.cross_entropy(output, target)
        loss.backward()
        optimizer.step()
        
        if batch_idx % 100 == 0:
            print("Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}".format(epoch, batch_idx * len(data), len(train_loader.dataset), 100. * batch_idx / len(train_loader), loss.item()))

def evaluate(model, valid_loader):
    model.eval()
    valid_loss = 0
    correct = 0

    with torch.no_grad():
        for data, target in valid_loader:
            data, target = data.to(DEVICE), target.to(DEVICE)
            output = model(data)
            valid_loss += F.cross_entropy(output, target, reduction = "sum").item()
            prediction = output.max(1, keepdim = True)[1]
            correct += prediction.eq(target.view_as(prediction)).sum().item()

    valid_loss /= len(valid_loader.dataset)
    valid_accuracy = 100. * correct / len(valid_loader.dataset)
    return valid_loss, valid_accuracy

''' Training'''
EPOCHS = 10
for epoch in range(1, EPOCHS + 1):
    train(model, train_loader, optimizer)
    valid_loss, valid_accuracy = evaluate(model, valid_loader)
    print("[EPOCH: {}], \tValidation Loss: {:.4f}, \tValidation Accuracy: {:.2f} %\n".format(epoch, valid_loss, valid_accuracy))

Train Epoch: 9 [0/33600 (0%)]	Loss: 0.039682
Train Epoch: 9 [6400/33600 (19%)]	Loss: 0.002007
Train Epoch: 9 [12800/33600 (38%)]	Loss: 0.041496
Train Epoch: 9 [19200/33600 (57%)]	Loss: 0.047618
Train Epoch: 9 [25600/33600 (76%)]	Loss: 0.001618
Train Epoch: 9 [32000/33600 (95%)]	Loss: 0.191500
[EPOCH: 9], 	Validation Loss: 0.0745, 	Validation Accuracy: 98.02 %

Train Epoch: 10 [0/33600 (0%)]	Loss: 0.007429
Train Epoch: 10 [6400/33600 (19%)]	Loss: 0.077467
Train Epoch: 10 [12800/33600 (38%)]	Loss: 0.007383
Train Epoch: 10 [19200/33600 (57%)]	Loss: 0.009296
Train Epoch: 10 [25600/33600 (76%)]	Loss: 0.054614
Train Epoch: 10 [32000/33600 (95%)]	Loss: 0.032034
[EPOCH: 10], 	Validation Loss: 0.0807, 	Validation Accuracy: 98.23 %

4. 추론

# Predict Test Dataset
def testset_prediction(model, test_images_tensor):
    model.eval()
    result = []
    with torch.no_grad():
        for data in test_images_tensor:
            data = data.to(DEVICE)
            output = model(data)
            prediction = output.max(1, keepdim = True)[1]
            result.append(prediction.tolist())
    return result

test_predict_result = testset_prediction(model, test_images_tensor)
test_predict_result[:5]

c:\users\justin\venv\lib\site-packages\ipykernel_launcher.py:21: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.

[[[2]], [[0]], [[9]], [[9]], [[3]]]

import numpy as np
from collections import Counter
Counter(np.squeeze(test_predict_result)).most_common()

[(1, 3176),
 (7, 2910),
 (3, 2872),
 (2, 2803),
 (9, 2801),
 (4, 2781),
 (0, 2725),
 (8, 2718),
 (6, 2716),
 (5, 2498)]

for idx in range(0, 10):
    plt.imshow(test_images[idx], cmap = plt.get_cmap('gray'))
    plt.title("Predict: " + str(test_predict_result[idx]))
    plt.show()

MLP 이용하기

1. 데이터 로드하기

2. Torch를 이용한 데이터 로드, 모델 생성

3. 여기까지 모델 설계. 이제 학습 시작

4. Test 하고 Test 결과를 출력해보자.

CNN 이용하기

1. Data Load 하기

2. 모델 생성

3. 학습

4. 추론

Templates (for web app):

Error