Prof. Gyeong-Moon Park's Artificial General Intelligence Lab Has Three Papers Accepted to CVPR 2024
Three papers from the Artificial General Intelligence Lab (advisor: Prof. Gyeong-Moon Park) have been accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (CVPR 2024), a top-tier conference in computer vision.
Paper title: "Generative Unlearning for Any Identity"
The paper "Generative Unlearning for Any Identity" addresses the privacy risks behind the recent remarkable progress of generative models. With inversion and image editing techniques, anyone can easily represent and manipulate a specific person within a generative model. This paper proposes GUIDE, a framework that prevents a generative model from generating a specific identity. GUIDE consists of two main components: Un-Identifying Face On Latent Space (UFO), which finds an alternative latent code to substitute for the target identity in the latent space, and Latent Target Unlearning (LTU), a set of loss functions that actually erase that identity from the model. With GUIDE, a specific identity can be removed from a generative model while preserving the performance of the original model as much as possible. An illustrative sketch of this two-part idea is included after the abstract below.
[Paper Information]
Generative Unlearning for Any Identity
Juwon Seo, Sung-Hoon Lee, Tae-Young Lee, Seungjun Moon, and Gyeong-Moon Park
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Abstract:
Recent advances in generative models trained on large-scale datasets have made it possible to synthesize high-quality samples across various domains. Moreover, the emergence of strong inversion networks enables not only a reconstruction of real-world images but also the modification of attributes through various editing methods. However, in certain domains related to privacy issues, e.g., human faces, advanced generative models along with strong inversion methods can lead to potential misuses. In this paper, we propose an essential yet underexplored task called generative identity unlearning, which steers the model not to generate an image of a specific identity. In generative identity unlearning, we target the following objectives: (i) preventing the generation of images with a certain identity, and (ii) preserving the overall quality of the generative model. To satisfy these goals, we propose a novel framework, Generative Unlearning for Any IDEntity (GUIDE), which prevents the reconstruction of a specific identity by unlearning the generator with only a single image. GUIDE consists of two parts: (i) finding a target point for optimization that un-identifies the source latent code and (ii) novel loss functions that facilitate the unlearning procedure while less affecting the learned distribution. Our extensive experiments demonstrate that our proposed method achieves state-of-the-art performance in the generative machine unlearning task.
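For illustration only, the following is a minimal PyTorch sketch of the two-step idea described above: first find an "un-identified" target latent, then fine-tune the generator against it while regularizing toward the original model. This is not the authors' implementation; the generator `G`, the source latent `w_src`, the average latent `w_mean`, and the simple L1 objectives are all assumptions made to keep the sketch short.

```python
# Hypothetical sketch of generative identity unlearning (not the GUIDE code).
# Assumes G is an nn.Module mapping a latent code of shape (B, D) to an image.
import copy
import torch
import torch.nn.functional as F


def unidentify_latent(w_src, w_mean, strength=1.0):
    """UFO-like step (assumption): move the source latent toward the
    average latent so the rendered face no longer matches the identity."""
    return w_src + strength * (w_mean - w_src)


def unlearn_identity(G, w_src, w_mean, steps=200, lr=1e-4, lam=1.0):
    """LTU-like step (assumption): make w_src render the un-identified
    target while keeping other latents close to the original generator."""
    G_orig = copy.deepcopy(G).eval()
    for p in G_orig.parameters():
        p.requires_grad_(False)

    w_tgt = unidentify_latent(w_src, w_mean).detach()
    x_tgt = G_orig(w_tgt).detach()                 # image without the identity
    opt = torch.optim.Adam(G.parameters(), lr=lr)

    for _ in range(steps):
        loss_forget = F.l1_loss(G(w_src), x_tgt)   # erase the identity
        w_rand = torch.randn_like(w_src)           # preserve the rest
        loss_preserve = F.l1_loss(G(w_rand), G_orig(w_rand).detach())
        loss = loss_forget + lam * loss_preserve
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G
```

The paper formulates both the target-latent search and the unlearning losses more carefully; the sketch only conveys the division of labor between the two components.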
Paper title: "Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners"
"Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners" 논문은 기존의 Few-Shot Class Incremental Learning (FSCIL)을 위한 작은 모델들을 활용하는 한계를 지적하며, 최근 컴퓨터 비전에서 널리 활용되는 Large model을 통한 학습 방식인 PriViLege를 제시합니다. Large model의 활용으로 FSCIL에서 발생하는 Overfitting 문제와 Catastrophic Forgetting 문제가 더욱 심화되는 것을 해결하기 위해, Pre-trained Knowledge Tuning (PKT), Entropy-based Divergence Loss, 그리고 Semantic Knowledge Distillation Loss를 제안합니다. 이를 통해 기존의 방법들에 비해 낮은 Forgetting과 뛰어난 Performance를 달성하였습니다.
[Paper Information]
Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners
Keon-Hee Park, Kyungwoo Song, and Gyeong-Moon Park
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Abstract:
Few-Shot Class Incremental Learning (FSCIL) is a task that requires a model to learn new classes incrementally without forgetting when only a few samples for each class are given. FSCIL encounters two significant challenges: catastrophic forgetting and overfitting, and these challenges have driven prior studies to primarily rely on shallow models, such as ResNet-18. Even though their limited capacity can mitigate both forgetting and overfitting issues, it leads to inadequate knowledge transfer during few-shot incremental sessions. In this paper, we argue that large models such as vision and language transformers pre-trained on large datasets can be excellent few-shot incremental learners. To this end, we propose a novel FSCIL framework called PriViLege, Pre-trained Vision and Language transformers with prompting functions and knowledge distillation. Our framework effectively addresses the challenges of catastrophic forgetting and overfitting in large models through new pre-trained knowledge tuning (PKT) and two losses: entropy-based divergence loss and semantic knowledge distillation loss. Experimental results show that the proposed PriViLege significantly outperforms the existing state-of-the-art methods by a large margin, e.g., +9.38% in CUB200, +20.58% in CIFAR-100, and +13.36% in miniImageNet.
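As a rough illustration of how the three objectives named above could be combined in a single training step, here is a hypothetical PyTorch sketch. The module `vit_with_prompts` (a frozen ViT plus learnable prompts), the `text_encoder`, the class-name lookup, the form of the entropy term, and the loss weights are assumptions for this example, not the paper's exact formulation.

```python
# Hypothetical combination of the three objectives (not the PriViLege code).
import torch
import torch.nn.functional as F


def privilege_style_loss(vit_with_prompts, text_encoder, images, labels,
                         class_names, lambda_ed=0.5, lambda_skd=0.5):
    # Visual features and logits from a frozen ViT with learnable prompts
    # (stands in for pre-trained knowledge tuning).
    feats, logits = vit_with_prompts(images)        # shapes (B, D), (B, C)

    # Standard classification loss on the few-shot session data.
    loss_ce = F.cross_entropy(logits, labels)

    # Entropy-based term (assumption): push predictions to be confident
    # and well separated, countering few-shot overfitting.
    probs = logits.softmax(dim=-1)
    loss_ed = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()

    # Semantic knowledge distillation (assumption): align visual features
    # with language embeddings of the ground-truth class names.
    with torch.no_grad():
        text_feats = text_encoder([class_names[y] for y in labels.tolist()])
    loss_skd = 1.0 - F.cosine_similarity(feats, text_feats, dim=-1).mean()

    return loss_ce + lambda_ed * loss_ed + lambda_skd * loss_skd
```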
논문 제목: "Open Set Domain Adaptation for Semantic Segmentation"
"Open Set Domain Adaptation for Semantic Segmentation" 은 기존 Unsupervised Domain Adaptation (UDA) for Semantic Segmentation 시나리오가 target domain 에서 알려지지않은(unknown) 클래스가 있을 때 이를 고려하지 못하기 때문에 실제 시나리오에 적용이 제한된다는 점을 지적합니다. 이에 대응하여 target domain에서 unknown 클래스가 발생하는 Open Set Domain Adaptation for Semantic Segmentation (OSDA-SS) 시나리오를 처음으로 제안합니다. 또한 OSDA-SS 에서 기존 UDA 기법을 적용했을 때 unknown 클래스들의 boundary 부분과 모양을 정확하게 예측하지 못한다는 점을 해결하기 위해 BUS를 제안합니다. BUS는 boundary 부분에서 unknown 클래스를 정확하게 구분하기 위한 팽창과 침식 모폴로지 연산 기법 기반 대비 손실 함수(DECON Loss)와 unknown 클래스의 모양을 예측하기 위해 모델이 도메인과 크기에 불변적인 특징을 학습하도록하는 새로운 도메인 혼합 증강 방법인 OpenReMix를 포함합니다. 이를 통해 OSDA-SS 시나리오에서도 기존 클래스의 성능 저하 없이 unknown 클래스를 정확하게 예측할 수 있습니다.
[Paper Information]
Open Set Domain Adaptation for Semantic Segmentation
Seun-An Choe, Ah-Hyung Shin, Keon-Hee Park, Jinwoo Choi, and Gyeong-Moon Park
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Abstract:
Unsupervised domain adaptation (UDA) for semantic segmentation aims to transfer the pixel-wise knowledge from the labeled source domain to an unlabeled target domain. However, current UDA methods typically assume a shared label space between source and target, limiting their applicability in real-world scenarios where novel categories may emerge in the target domain. In this paper, we introduce Open Set Domain Adaptation for Semantic Segmentation (OSDA-SS) for the first time, where the target domain includes unknown classes. We identify two major problems in the OSDA-SS scenario as follows: 1) the existing UDA methods struggle to predict the exact boundary of the unknown class, and 2) they fail to accurately predict the shape of the unknown class. To address these issues, we propose Boundary and Unknown Shape-Aware open set domain adaptation, coined BUS. Our BUS can accurately discern the boundaries between known and unknown classes using a novel dilation-erosion-based contrastive loss, which helps to discriminate the region for the unknown class in a contrastive manner. In addition, we propose OpenReMix, a new domain mixing augmentation method that guides our model to effectively learn domain and size-invariant features for improving the shape detection of the known and unknown classes. Through extensive experiments, we demonstrate that our proposed BUS effectively detects the unknown class in the challenging OSDA-SS scenario, outperforming the previous methods by a large margin.
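To make the dilation-erosion idea concrete, here is a small, hypothetical PyTorch sketch of how boundary bands around an unknown-class mask could be extracted before applying a contrastive loss. Approximating morphological dilation and erosion with max-pooling, as well as the function names, are assumptions for illustration, not the paper's exact DECON implementation.

```python
# Hypothetical dilation-erosion boundary extraction (not the BUS code).
# Dilation and erosion are approximated with max-pooling on a binary mask.
import torch
import torch.nn.functional as F


def dilate(mask, k=5):
    # Max-pooling enlarges the positive region of a binary mask.
    return F.max_pool2d(mask, kernel_size=k, stride=1, padding=k // 2)


def erode(mask, k=5):
    # Eroding the mask is equivalent to dilating its complement.
    return 1.0 - dilate(1.0 - mask, k)


def boundary_regions(unknown_mask, k=5):
    """Split the area around the unknown region into an outer band
    (dilated minus original) and an inner band (original minus eroded);
    a contrastive loss can then push the two bands apart in feature space."""
    dilated = dilate(unknown_mask, k)
    eroded = erode(unknown_mask, k)
    outer_band = (dilated - unknown_mask).clamp(0, 1)   # just outside the unknown region
    inner_band = (unknown_mask - eroded).clamp(0, 1)    # just inside the unknown region
    return outer_band, inner_band


# Example usage: unknown_mask has shape (B, 1, H, W) with values in {0, 1}.
mask = (torch.rand(2, 1, 64, 64) > 0.7).float()
outer, inner = boundary_regions(mask)
```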
2024.03.25