
All Posts

Active Learning for Convolutional Neural Networks: A Core-Set Approach (ICLR 2018)
https://arxiv.org/abs/1708.00489
"Convolutional neural networks (CNNs) have been successfully applied to many recognition and learning tasks using a universal recipe; training a deep model on a very large dataset of supervised examples. However, this approach is rather restrictive in pract…"
1. Introduction: Active Lear…

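The core-set question is which unlabeled images to send for annotation so that the labeled pool covers the embedding space; the selection rule the paper builds on is k-Center-Greedy. Below is a minimal sketch of that rule over precomputed features; the feature dimensions, budget, and toy data are illustrative assumptions rather than the paper's exact pipeline.

```python
import numpy as np

def k_center_greedy(features, labeled_idx, budget):
    """Greedily pick the point whose distance to the nearest already-selected
    point is largest, repeated `budget` times (k-Center-Greedy)."""
    selected = list(labeled_idx)
    # distance of every point to its nearest currently-selected point
    min_dist = np.min(
        np.linalg.norm(features[:, None, :] - features[selected][None, :, :], axis=2),
        axis=1,
    )
    picked = []
    for _ in range(budget):
        idx = int(np.argmax(min_dist))            # farthest-from-cover point
        picked.append(idx)
        dist_to_new = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return picked

# toy usage: 1000 points in a 64-d embedding space, 10 already labeled
feats = np.random.randn(1000, 64)
queried = k_center_greedy(feats, labeled_idx=list(range(10)), budget=20)
```
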
DINO: Emerging Properties in Self-Supervised Vision Transformers (CVPR 2021)
https://arxiv.org/abs/2104.14294
"In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works partic…"
Asks whether self-supervised learning gives ViT…

MAE: Masked Autoencoders Are Scalable Vision Learners (CVPR 2022)
https://arxiv.org/abs/2111.06377
"This paper shows that masked autoencoders (MAE) are scalable self-supervised learners for computer vision. Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels. It is based on two core designs. First, we…"
One-line summary: mask random patches of the input image and…

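A minimal sketch of the two pieces that one-line summary points at: random patch masking and a reconstruction loss computed only on the masked patches. The tensor shapes and the random stand-in for the decoder output are illustrative; in the actual method only the visible patches go through a ViT encoder and a lightweight decoder reconstructs the rest.

```python
import torch

def random_masking(patches, mask_ratio=0.75):
    """Keep a random subset of patches; return the kept patches and a
    binary mask marking which patches were dropped (1 = masked)."""
    B, N, D = patches.shape
    num_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)                      # one random score per patch
    ids_keep = torch.argsort(noise, dim=1)[:, :num_keep]
    kept = torch.gather(patches, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    mask = torch.ones(B, N)
    mask.scatter_(1, ids_keep, 0.0)               # 0 = visible, 1 = masked
    return kept, mask

# toy usage: 2 images, 196 patches of dim 768; loss is taken on masked patches only
patches = torch.randn(2, 196, 768)
kept, mask = random_masking(patches)
pred = torch.randn(2, 196, 768)                   # stand-in for the decoder output
loss = (((pred - patches) ** 2).mean(dim=-1) * mask).sum() / mask.sum()
```
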
Hummingbird: Towards In-Context Scene Understanding
Introduction. The three properties the authors want from in-context learning for scene understanding tasks: generality, data efficiency, and fast adaptation. Using a nearest-neighbor (NN) retrieval method, they show improved performance on dense scene understanding tasks (where existing approaches have been weak). The resulting retrieval-based decoding mechanism needs no task-specific parameters and no finetuning, so applying it to a standard encoder takes no extra effort (it works with either a ResNet or a ViT). Two pretraining components are also proposed to further improve model performance…

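A minimal sketch of what such a retrieval-based decoder does at inference time, under my own simplification: each query patch labels itself by a softmax-weighted vote over its nearest neighbors in a memory bank of (patch feature, patch label) pairs, with nothing trained per task. The feature dimension, k, temperature, and class count are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def nn_retrieval_decode(query_feats, bank_feats, bank_labels, num_classes, k=10, temp=0.1):
    """Assign each query patch a soft label via a softmax-weighted vote over
    its k nearest memory-bank patches (cosine similarity); no decoder is trained."""
    q = F.normalize(query_feats, dim=-1)          # (Nq, D)
    b = F.normalize(bank_feats, dim=-1)           # (Nb, D)
    sim = q @ b.t()                               # (Nq, Nb) cosine similarities
    topk_sim, topk_idx = sim.topk(k, dim=-1)      # k nearest bank patches per query
    weights = F.softmax(topk_sim / temp, dim=-1)  # (Nq, k) vote weights
    neighbor_labels = F.one_hot(bank_labels[topk_idx], num_classes).float()  # (Nq, k, C)
    return (weights.unsqueeze(-1) * neighbor_labels).sum(dim=1)              # (Nq, C)

# toy usage: 196 query patches, a bank of 5000 labeled patches, 21 classes
pred = nn_retrieval_decode(torch.randn(196, 384), torch.randn(5000, 384),
                           torch.randint(0, 21, (5000,)), num_classes=21)
```
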
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
https://arxiv.org/abs/1810.04805
"We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train…"

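The bidirectional pre-training the abstract refers to comes down to the masked language modeling objective: hide a random subset of tokens and predict them from context on both sides. A minimal sketch, simplified to always substituting [MASK] (the paper also sometimes keeps the token or swaps in a random one); the mask id 103, vocabulary size, and dummy logits are illustrative.

```python
import torch
import torch.nn.functional as F

def mlm_targets(token_ids, mask_token_id, mask_prob=0.15):
    """Build masked-LM inputs and labels: replace a random subset of tokens
    with [MASK] and compute the loss only at those positions."""
    labels = token_ids.clone()
    masked = torch.rand(token_ids.shape) < mask_prob
    labels[~masked] = -100                        # ignored by cross_entropy
    inputs = token_ids.clone()
    inputs[masked] = mask_token_id
    return inputs, labels

# toy usage: a batch of 2 length-16 sequences over a 30522-token vocabulary
tokens = torch.randint(0, 30522, (2, 16))
inputs, labels = mlm_targets(tokens, mask_token_id=103)
logits = torch.randn(2, 16, 30522)                # stand-in for the encoder output
loss = F.cross_entropy(logits.view(-1, 30522), labels.view(-1), ignore_index=-100)
```
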
DDIM: Denoising Diffusion Implicit Models (ICLR 2021)
https://arxiv.org/abs/2010.02502
"Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion…"
1. Introduction: high qua…

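A minimal sketch of the generalized update DDIM uses to jump along a short subsequence of timesteps, written with the cumulative alphas; eta = 0 makes the step deterministic, which is what allows sampling in far fewer steps than the original Markov chain. The toy alpha values and the random stand-in for the noise-prediction network are illustrative.

```python
import torch

def ddim_step(x_t, eps_pred, alpha_t, alpha_prev, eta=0.0):
    """One DDIM update from timestep t to an earlier timestep, given the model's
    noise prediction eps_pred and the cumulative alphas; eta=0 is deterministic."""
    x0_pred = (x_t - (1 - alpha_t).sqrt() * eps_pred) / alpha_t.sqrt()
    sigma = eta * ((1 - alpha_prev) / (1 - alpha_t)).sqrt() * (1 - alpha_t / alpha_prev).sqrt()
    dir_xt = (1 - alpha_prev - sigma ** 2).sqrt() * eps_pred   # direction pointing to x_t
    return alpha_prev.sqrt() * x0_pred + dir_xt + sigma * torch.randn_like(x_t)

# toy usage: jump across a subsequence of timesteps instead of simulating all T steps
x_t = torch.randn(1, 3, 32, 32)
eps = torch.randn_like(x_t)                       # stand-in for the trained eps-model output
x_prev = ddim_step(x_t, eps, alpha_t=torch.tensor(0.5), alpha_prev=torch.tensor(0.8))
```
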
ISIM: Iterative Self-Improved Model for Weakly Supervised Segmentation
https://arxiv.org/abs/2211.12455
arXiv preprint, November 2022. CAM is widely used for the WSSS task but is not very effective for segmentation on its own, so the paper proposes an iterative approach built on a modified encoder-decoder segmentation model. No ground-truth segmentation labels are available, so pseudo-segmentation labels are generated instead. Motivation: two issues that arise when using CAM for WSSS applications (the parts this paper focuses on improving): the classification model…

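A minimal sketch of the kind of pseudo-label step such a pipeline revolves around: threshold the per-class activation maps, give confident pixels their arg-max class, and call everything else background; the resulting map then supervises the decoder in the next iteration. The threshold, map size, and random CAMs are illustrative, not the paper's actual procedure.

```python
import numpy as np

def cam_to_pseudo_label(cams, fg_threshold=0.3):
    """Turn class activation maps into a pseudo-segmentation map: pixels whose
    best CAM score is below the threshold become background (0), the rest take
    their arg-max foreground class (1..C)."""
    best_score = cams.max(axis=0)                 # (H, W)
    best_class = cams.argmax(axis=0) + 1          # reserve 0 for background
    return np.where(best_score >= fg_threshold, best_class, 0)

# toy usage: 20 foreground classes on a 32x32 map, CAM scores scaled to [0, 1]
cams = np.random.rand(20, 32, 32)
pseudo_label = cam_to_pseudo_label(cams)          # (32, 32) labels in {0, ..., 20}
# in the iterative scheme, this map would supervise the decoder for the next round
```
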
DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort (CVPR 2021)
DatasetGAN: a GAN-based pipeline that produces large amounts of high-quality semantically segmented images with minimal labeling effort, and the generated datasets are actually usable in practice. Only the decoder is trained, on a few labeled examples, after which the model serves as an annotated-dataset generator. Introduction: labeled datasets are scarce, and hand-labeling costs an enormous amount of time and resources; DatasetGAN is proposed to address this. A decoder is trained on the feature space of a trained GAN to produce pixel-level labels. DatasetGAN … pixel-wise annotation task (semanti…

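A minimal sketch of the "train only a small decoder on GAN features" idea: fit a per-pixel classifier on feature vectors taken from the generator's intermediate layers for a handful of annotated generated images, then reuse it to label every new image the GAN samples. The feature dimension, image count, class count, and training loop below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical shapes: per-pixel feature vectors collected from the generator's
# intermediate layers (dim 512 here) for 16 annotated generated images of 32x32.
pixel_feats = torch.randn(16 * 32 * 32, 512)
pixel_labels = torch.randint(0, 8, (16 * 32 * 32,))      # 8 semantic part classes

# small per-pixel classifier trained only on the few annotated images
decoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 8))
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    idx = torch.randint(0, pixel_feats.shape[0], (4096,))   # random pixel batch
    loss = loss_fn(decoder(pixel_feats[idx]), pixel_labels[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()

# sampling the GAN and running this decoder on its features then yields
# (image, pixel-label) pairs, i.e. a synthetic annotated dataset
```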