Collaborator:
Nicely written! Data efficiency and training difficulty are exactly why ResNet is still in use a decade later. Great work, 채우~~ 👍👍
Ans1)
Ans2)
(As of the final epoch)
Pretrained = True
Train Loss: 0.0965, Train Accuracy: 96.68%
Val Loss: 0.3630, Val Accuracy: 89.20%
Pretrained = False
Train Loss: 0.6932, Train Accuracy: 75.42%
Val Loss: 0.8317, Val Accuracy: 70.75%
Performance comparison
Because ViT lacks the inductive biases of CNNs, performance depends heavily on whether pretrained weights are used. With Pretrained=True, the model has already been trained sufficiently on Google's massive dataset, so it sustains roughly 90% validation accuracy. With Pretrained=False, the model must learn from the limited dataset alone, which is not enough to optimize its parameters, so it shows both higher loss and lower accuracy.
Stability of the training curves
Ironically, both the loss and accuracy curves were more stable with Pretrained=False. In that case the model never reaches the point of capturing complex patterns, so its performance fluctuates little, whereas with Pretrained=True the fine-tuning process of adapting to new data appears to have caused sharper swings.