VIT_CNN_CIFAR_10_FROM_SCRATCH

Implementation for the CIFAR-10 challenge with a Vision Transformer model, compared against CNN-based models, from scratch.

Folder Structure

```
.
├── model
│   ├── HCGNet
│   ├── densenet
│   ├── dla
│   ├── dpn
│   ├── efficientnet
│   ├── efficientnetV2
│   ├── mobilenetV3
│   ├── pyramidnet
│   ├── resnet
│   ├── resnext
│   ├── vgg
│   └── vit
├── notebooks
│   └── Pretrained_Vision_Transformer_w_o_PyTorch_Lightning.ipynb
├── utils
│   ├── autoaugment.py
│   ├── dataaug.py
│   └── utils.py
├── main.py
└── vit_saved_model.pth
```
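Each subdirectory under model implements one architecture, and main.py selects among them via --model-name. As an illustration only, a dispatcher along these lines could back that flag; the import paths and class names below are assumptions, not the repo's actual API.

```python
# Hypothetical sketch of a --model-name dispatcher; module and class
# names are assumptions, not this repo's actual API.
def get_model(model_name: str, num_classes: int = 10):
    if model_name == "vit":
        from model.vit import ViT          # assumed class name
        return ViT(num_classes=num_classes)
    if model_name == "resnet":
        from model.resnet import ResNet50  # assumed class name
        return ResNet50(num_classes=num_classes)
    raise ValueError(f"Unknown model name: {model_name}")
```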

[General Option] Train any model on the CIFAR-10 dataset:

```bash
CUDA_VISIBLE_DEVICES=0 python main.py --dataset c10 --label-smoothing --autoaugment --model-name [name of the model]
```

[Option ViT] Train the ViT model on the CIFAR-10 dataset (the default):

```bash
CUDA_VISIBLE_DEVICES=0 python main.py --dataset c10 --label-smoothing --autoaugment
```

[Option ResNet] Train the ResNet model on the CIFAR-10 dataset:

```bash
CUDA_VISIBLE_DEVICES=0 python main.py --dataset c10 --label-smoothing --autoaugment --model-name resnet
```
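For reference, the --label-smoothing and --autoaugment flags correspond to standard techniques. Below is a minimal sketch of how they are commonly wired up with PyTorch and torchvision; it is an illustration, not necessarily how main.py implements them, and the smoothing value and normalization statistics are assumptions.

```python
import torch
import torchvision
import torchvision.transforms as T

# AutoAugment with the CIFAR-10 policy, applied before tensor conversion.
train_transform = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.AutoAugment(T.AutoAugmentPolicy.CIFAR10),
    T.ToTensor(),
    # Commonly used CIFAR-10 channel statistics (an assumption here).
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=train_transform)

# Label smoothing is built into CrossEntropyLoss since PyTorch 1.10;
# the value 0.1 is a typical default, not necessarily the repo's setting.
criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)
```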

CUDA_VISIBLE_DEVICES=0 restricts the process to GPU 0, which becomes its CUDA device.
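To confirm which device PyTorch actually sees under this setting, a quick check:

```python
import torch

# With CUDA_VISIBLE_DEVICES=0, only GPU 0 is visible to the process,
# and it is exposed to PyTorch as cuda:0.
print(torch.cuda.is_available())   # True if the visible GPU is usable
print(torch.cuda.device_count())   # 1, because a single GPU was masked in
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```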

Pretrained Models

Training epochs: 200 for the CNN-based models, 1500 for ViT.

| Model              | Pretrained (.pth) | Notebook     | Paper | Accuracy |
|--------------------|-------------------|--------------|-------|----------|
| Vision Transformer | Google Drive      | Google Colab | arXiv | 90.61%   |
| ResNet-50          | Google Drive      | Google Colab | arXiv | 95.36%   |
| ResNeXt            | Google Drive      | Google Colab | arXiv | 94.91%   |
| HCGNet             | Google Drive      | Google Colab | arXiv | 94.76%   |
| DenseNet           | Google Drive      | Google Colab | arXiv | 94.59%   |
| VGG-19             | Google Drive      | Google Colab | arXiv | 94.37%   |
| VGG-11             | Google Drive      | Google Colab | arXiv | 92.72%   |
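To evaluate one of the pretrained checkpoints, loading it should look roughly like the sketch below. It assumes the .pth file stores a plain state_dict, which is an assumption about how the checkpoints were saved; adjust if the file wraps the weights differently.

```python
import torch

# get_model is the hypothetical dispatcher sketched under "Folder Structure".
model = get_model("vit")

# map_location="cpu" lets the checkpoint load on machines without a GPU.
state_dict = torch.load("vit_saved_model.pth", map_location="cpu")
model.load_state_dict(state_dict)  # assumes a plain state_dict was saved
model.eval()                       # inference mode before measuring accuracy
```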

From Scratch vs. PyTorch Lightning

Please visit my other GitHub repo for the PyTorch Lightning version of this code. The two versions behave essentially the same; they differ only in code organization.

References

Much of the code for the ViT model, training, and testing is from the ViT-CIFAR repo by omihub777.
Some of the ViT model code is from the vision_transformer repo by google-research.
EfficientNet is from the EfficientNet-PyTorch repo by lukemelas.
MobileNetV3 is from the mobilenetv3 repo by xiaolai-sqlai.
Most of the other models are from the pytorch-cifar repo by kuangliu.
