ComputerVision Jack

MobileNetV2: Inverted Residuals and Linear Bottlenecks

JackYoon 2021. 12. 24. 11:08

Abstract

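The paper presents a new mobile architecture, MobileNetV2.
It also describes efficient ways to apply these mobile models to object detection through a new framework called SSDLite, and to semantic segmentation through a reduced form of DeepLabv3 called Mobile DeepLabv3.
The architecture is built on an inverted residual structure, in which the shortcut connections run between the thin bottleneck layers.
The intermediate expansion layer uses lightweight depthwise convolutions to filter features and serves as the source of non-linearity.
The authors also find that removing non-linearities from the narrow layers is important for preserving the features.
Finally, the approach decouples the input/output domains from the expressiveness of the transformation, which gives a convenient framework for further analysis.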

Introduction

Improving the accuracy of a network usually comes at a cost: current state-of-the-art networks require more computation than many mobile and embedded applications can afford.
The paper therefore proposes a new neural network architecture tailored to mobile and resource-constrained environments.
The main contribution is a new layer module: the inverted residual block with a linear bottleneck.
The module takes a low-dimensional feature as input, expands it to a high dimension, and filters it with a depthwise convolution. The feature is then projected back to a low dimension with a linear convolution.
This convolution module is particularly well suited to mobile designs, because the large intermediate tensors never need to be fully materialized, which reduces the memory footprint during inference.

Preliminaries, Discussion and Intuition

Depthwise Separable Convolutions

Depthwise separable convolution blocks are used in many efficient neural network architectures, and the authors use them here as well.
The first layer is a depthwise convolution, which applies a single convolutional filter to each input channel.
The second layer is a 1 x 1 convolution, called a pointwise convolution. It builds new features by computing linear combinations of the input channels.
A depthwise separable convolution needs far less computation than a standard convolution, by a factor of almost k^2. MobileNetV2 uses k = 3 (3 x 3 depthwise separable convolutions).
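To make the saving concrete, the sketch below counts multiply-adds for a standard convolution and for its depthwise separable replacement. The function names and the example layer size (a 56 x 56 feature map with 64 input and output channels) are illustrative assumptions, not values from the post.

```python
def standard_conv_madds(h, w, d_in, d_out, k):
    """Multiply-adds of a standard k x k convolution (stride 1, 'same' padding)."""
    return h * w * d_in * d_out * k * k


def separable_conv_madds(h, w, d_in, d_out, k):
    """Multiply-adds of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = h * w * d_in * k * k      # one k x k filter per input channel
    pointwise = h * w * d_in * d_out      # 1 x 1 conv mixes the channels
    return depthwise + pointwise


# Hypothetical layer size chosen only for illustration.
h, w, d_in, d_out, k = 56, 56, 64, 64, 3
standard = standard_conv_madds(h, w, d_in, d_out, k)
separable = separable_conv_madds(h, w, d_in, d_out, k)
ratio = standard / separable
print(f"standard: {standard:,}  separable: {separable:,}  ratio: {ratio:.2f}x")
```

The ratio equals d_out * k^2 / (k^2 + d_out), so it approaches k^2 = 9 as the number of output channels grows.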

Linear Bottlenecks

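When a set of real images is fed through a network, the activations of each layer form a "manifold of interest". It has long been assumed that this manifold can be embedded in a low-dimensional subspace of the network.
At first glance, this could be exploited simply by reducing the dimensionality of a layer, and with it the dimensionality of the operating space.
Following the intuition of MobileNetV1, the width_multiplier approach lets one shrink the activation space until the manifold of interest spans the entire space. This intuition breaks down, however, once we recall that deep convolutional networks apply non-linear activation functions such as ReLU.
The reason is that a deep network only has the power of a linear classifier on the non-zero part of the output domain.

In other words, information is inevitably lost when a feature passes through ReLU. If there are many channels, however, the manifold information lost in one channel may still be preserved in the others.
In summary, the manifold of interest should lie in a low-dimensional subspace of a higher-dimensional activation space.

These two observations (ReLU discards information in a narrow layer, but preserves it when the layer is wide relative to the manifold) guide the design of the architecture.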
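The effect of channel count on information loss can be seen in a toy setting. The sketch below is not the paper's experiment: it places 2-D inputs on the unit circle, applies ReLU(Wx) with a hypothetical set of weight directions, and counts how many inputs keep at least two active channels, which is what a 2-D input needs to be solved for uniquely. The spread angle and the helper name `recoverable_fraction` are assumptions made for illustration.

```python
import math

SPREAD = math.radians(137.5)  # illustrative angle that spreads the weight directions


def recoverable_fraction(n_channels, n_points=360):
    """Fraction of unit-circle inputs x that stay recoverable after ReLU(Wx).

    A 2-D input can be solved for uniquely when at least two channels with
    linearly independent weights remain active (strictly positive). The weight
    directions generated here are pairwise independent for the channel counts used.
    """
    weights = [(math.cos(i * SPREAD), math.sin(i * SPREAD)) for i in range(n_channels)]
    recoverable = 0
    for p in range(n_points):
        phi = 2 * math.pi * p / n_points
        x = (math.cos(phi), math.sin(phi))
        active = sum(1 for wx, wy in weights if wx * x[0] + wy * x[1] > 0)
        if active >= 2:
            recoverable += 1
    return recoverable / n_points


fractions = {n: recoverable_fraction(n) for n in (2, 3, 8, 16)}
for n, frac in fractions.items():
    print(f"{n:2d} channels: {frac:.0%} of inputs recoverable")
```

With only two channels most inputs are collapsed by ReLU, while with eight or more channels every input remains recoverable, which matches the intuition that a wide layer can carry a low-dimensional manifold through the non-linearity.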

Inverted Residuals

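Bottleneck blocks look similar to residual blocks: each block contains an input followed by several bottlenecks and then an expansion.
Shortcuts are inserted for the same reason as classical residual connections: to improve the ability of gradients to propagate through the network. In the inverted design, the shortcuts connect the bottlenecks directly.
The inverted design is considerably more memory efficient and also works better.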

Inverted Residual - Running Time and Parameter Count for Bottleneck Convolution

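Total number of multiply-adds for a block of size $h \times w$ with expansion factor $t$, kernel size $k$, $d'$ input channels, and $d''$ output channels: $h \cdot w \cdot d' \cdot t \, (d' + k^2 + d'')$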
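The closed form can be checked by adding up the three layers of the block one by one. In the sketch below, `d_in` and `d_out` stand for $d'$ and $d''$; the function name and the example block size are hypothetical, and stride 1 is assumed.

```python
def bottleneck_madds(h, w, d_in, d_out, t, k=3):
    """Multiply-adds of one inverted residual block, split by layer (stride 1)."""
    expanded = t * d_in
    expand = h * w * d_in * expanded        # 1 x 1 expansion
    depthwise = h * w * expanded * k * k    # k x k depthwise convolution
    project = h * w * expanded * d_out      # 1 x 1 linear projection
    return expand, depthwise, project


# Hypothetical block: 56 x 56 input, 24 -> 24 channels, expansion factor 6.
h, w, d_in, d_out, t, k = 56, 56, 24, 24, 6, 3
parts = bottleneck_madds(h, w, d_in, d_out, t, k)
closed_form = h * w * d_in * t * (d_in + k * k + d_out)
print(parts, sum(parts), closed_form)
```

The sum of the three per-layer counts matches the formula term by term: $d'$ comes from the expansion, $k^2$ from the depthwise convolution, and $d''$ from the projection.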

Information Flow Interpretation

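An advantage of the architecture is that it naturally separates the input/output domains of the building blocks from the transformation applied inside them.
Unlike earlier designs, the expansion ratio is set greater than 1, which is where the block proves most useful.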

Model Architecture

Non-linearity → ReLU6 is used, because this activation function is robust under low-precision computation. In addition, every spatial convolution uses a 3 x 3 kernel, and batch normalization is added.
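One way to read the low-precision argument: because ReLU6 is bounded to [0, 6], a single fixed quantization step covers the whole output range. The unsigned 8-bit scheme below is an assumption made for illustration; neither the post nor the summary above specifies a quantization method.

```python
def relu6(x):
    """ReLU6 clamps activations to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


LEVELS = 255          # number of steps in an unsigned 8-bit code (assumed bit width)
STEP = 6.0 / LEVELS   # fixed step size, possible because the range is bounded


def quantize(x):
    """Map a ReLU6 output to an integer code in [0, 255]."""
    return round(relu6(x) / STEP)


def dequantize(code):
    """Map an integer code back to an approximate activation value."""
    return code * STEP


samples = [-3.0, 0.0, 0.7, 2.5, 5.99, 6.0, 42.0]
for x in samples:
    y = relu6(x)
    err = abs(dequantize(quantize(x)) - y)
    print(f"x={x:6.2f}  relu6={y:4.2f}  code={quantize(x):3d}  error={err:.4f}")
```

The rounding error never exceeds half a step, regardless of how large the pre-activation was, because the clamp keeps every value inside the quantized range.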

Trade-off Hyper Parameters

The authors explore the trade-off between accuracy and cost for input resolutions from 96 to 224 and width_multiplier values from 0.35 to 1.4.
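As a rough sketch of what a width multiplier does to the layer widths, the snippet below scales a list of per-stage output channels. Rounding to a multiple of 8 is a convention seen in public implementations and is an assumption here, as is the helper name `make_divisible`; the post itself does not describe the rounding rule.

```python
def make_divisible(value, divisor=8):
    """Round a channel count to a multiple of `divisor` (an assumed convention)."""
    rounded = max(divisor, int(value + divisor / 2) // divisor * divisor)
    # Avoid shrinking a layer by more than about 10% through rounding.
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


# Output channels of the stem and of each group of blocks at width 1.0.
BASE_CHANNELS = [32, 16, 24, 32, 64, 96, 160, 320]

for multiplier in (0.35, 0.5, 1.0, 1.4):
    scaled = [make_divisible(c * multiplier) for c in BASE_CHANNELS]
    print(f"width {multiplier:>4}: {scaled}")
```

Since the 1 x 1 convolutions dominate the cost and depend on both input and output widths, shrinking every layer by a factor of α reduces their multiply-adds by roughly α squared.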

Conclusions and future work

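The authors propose a highly efficient mobile model. It is also memory efficient at inference time.
Theoretical side: the inverted residual block has the property of separating the network's expressiveness (encoded by the expansion layers) from its capacity (encoded by the bottleneck inputs).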

from ..layers.conv_block import Conv2dBn, Conv2dBnAct, DepthwiseConvBnAct, DepthwiseConvBn
from ..initialize import weight_initialize

import torch
from torch import nn

class MobileNet_stem(nn.Module):
    """Stem: a single 3 x 3, stride-2 convolution block."""
    def __init__(self, in_channels, out_channels):
        super(MobileNet_stem, self).__init__()
        self.conv = Conv2dBnAct(in_channels=in_channels, out_channels=out_channels, kernel_size=3, stride=2)

    def forward(self, input):
        return self.conv(input)

class InvertedResidualBlock(nn.Module): # expansion ratio = t
    def __init__(self, in_channels, out_channels, exp, stride):
        super(InvertedResidualBlock, self).__init__()
        self.stride = stride
        if exp == 0:
            exp = 1
        self.exp = exp
        # 1 x 1 expansion: in_channels -> in_channels * t, followed by ReLU6
        self.conv1 = Conv2dBnAct(in_channels=in_channels, out_channels=in_channels * self.exp, kernel_size=1, stride=1, dilation=1,
                                    groups=1, padding_mode='zeros', act=nn.ReLU6())
        # 3 x 3 depthwise convolution on the expanded features, followed by ReLU6
        self.dconv = DepthwiseConvBnAct(in_channels=in_channels * self.exp, kernel_size=3, stride=self.stride,
                                    dilation=1, padding_mode='zeros', act=nn.ReLU6())
        # 1 x 1 linear projection back to a narrow bottleneck (no activation)
        self.conv2 = Conv2dBn(in_channels=in_channels * self.exp, out_channels=out_channels, kernel_size=1)
        # 1 x 1 projection so the shortcut matches out_channels.
        # Note: the paper's shortcut is a plain identity; the projection is this implementation's choice.
        self.identity = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=stride)

    def forward(self, input):
        output = self.conv1(input)
        output = self.dconv(output)
        output = self.conv2(output)
        # The shortcut is only added when the spatial size is unchanged (stride 1)
        if self.stride == 1:
            input = self.identity(input)
            output = input + output
        return output


class _MobileNetv2(nn.Module):
    def __init__(self, in_channels, classes):
        super(_MobileNetv2, self).__init__()
        self.stage_channels = []
        self.stem_block = MobileNet_stem(in_channels=in_channels, out_channels=32)

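        # config: [in_channels, out_channels, expansion_ratio, stride]
        # Note: the paper's Table 2 differs slightly: stride 1 for 64 -> 96,
        # stride 2 for 96 -> 160, and t = 6 for the final 160 -> 320 block.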
        layer1 = [ 
            [32, 16, 1, 1],
            [16, 24, 6, 2], [24, 24, 6, 1]
        ]
        layer2 = [
            [24, 32, 6, 2], [32, 32, 6, 1], [32, 32, 6, 1]
        ]
        layer3 = [ 
            [32, 64, 6, 2], [64, 64, 6, 1], [64, 64, 6, 1], [64, 64, 6, 1],
        ]
        layer4 = [
            [64, 96, 6, 2], [96, 96, 6, 1], [96, 96, 6, 1],
            [96, 160, 6, 1], [160, 160, 6, 1], [160, 160, 6, 1]
        ]
        layer5 = [
            [160, 320, 1, 1]
        ]
        self.layer1 = self.make_layers(layer1)
        self.layer2 = self.make_layers(layer2)
        self.layer3 = self.make_layers(layer3)
        self.layer4 = self.make_layers(layer4)
        self.layer5 = self.make_layers(layer5)
        self.classification = nn.Sequential(
            Conv2dBn(in_channels=320, out_channels=1280, kernel_size=1),
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(1280, classes, 1)
        )
    def forward(self, input):
        stem_out = self.stem_block(input)
        s1 = self.layer1(stem_out)
        s2 = self.layer2(s1)
        s3 = self.layer3(s2)
        s4 = self.layer4(s3)
        s5 = self.layer5(s4)
        pred = self.classification(s5)
        b, c, _, _ = pred.size()
        pred = pred.view(b, c)
        stages = [s1, s2, s3, s4, s5]
        return {'stage':stages, 'pred':pred}
    
    def make_layers(self, layers_configs):
        layers = []
        for i, o, exp, stride in layers_configs:
            layers.append(InvertedResidualBlock(in_channels=i, out_channels=o, exp=exp, stride=stride))
        return nn.Sequential(*layers)

def MobileNetv2(in_channels, classes=1000):
    model = _MobileNetv2(in_channels=in_channels, classes=classes)
    weight_initialize(model)
    return model


if __name__ == '__main__':
    model = MobileNetv2(in_channels=3, classes=1000)
    model(torch.rand(1, 3, 224, 224))
