Few-Shot Class-Incremental Learning
Paper from: https://arxiv.org/pdf/2004.10956.pdf
Incrementally learning new classes
Few-shot class-incremental learning (FSCIL) problem
FSCIL requires CNN models to incrementally learn new classes from very few labelled samples.
Topology-preserving knowledge incrementer (TOPIC) framework
To mitigate forgetting, most class-incremental learning (CIL) works use knowledge distillation, which keeps the network's output logits corresponding to the old classes close to their previous values. They usually store a set of old-class exemplars and apply the distillation loss to the network's output.
Problems: 1) class imbalance; 2) performance trade-off between old and new classes
TOPIC uses a neural gas (NG) network to model the topology of the feature space. When learning new classes, the NG grows to adapt to the change in the feature space. On this basis, the paper formulates FSCIL as an optimization problem with two objectives:
1) to avoid catastrophic forgetting, TOPIC preserves old knowledge by stabilizing the topology of the NG, which is implemented with an anchor loss (AL) term;
2) to prevent overfitting to the few-shot new classes, TOPIC adapts the feature space by moving each new-class training sample towards the correct new NG node with the same label and moving new nodes with different labels away from each other.
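To make the two objectives more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the use of squared Euclidean distance, and the hinge with a `margin` parameter are all assumptions chosen for illustration. `anchor_loss` penalizes old NG nodes for drifting from their stored positions, `pull_loss` moves a sample's feature towards the nearest new node with the same label, and `push_loss` separates new nodes that carry different labels.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors (lists of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def anchor_loss(old_nodes_before, old_nodes_after):
    """Objective 1 (assumed form): penalize movement of old NG nodes
    away from the positions they had before learning the new classes."""
    return sum(sq_dist(m0, m1) for m0, m1 in zip(old_nodes_before, old_nodes_after))

def pull_loss(feature, label, new_nodes, new_labels):
    """Objective 2, first part (assumed form): squared distance from a
    sample's feature to the closest new node with the same label.
    Assumes at least one new node carries that label."""
    same = [m for m, y in zip(new_nodes, new_labels) if y == label]
    return min(sq_dist(feature, m) for m in same)

def push_loss(new_nodes, new_labels, margin=1.0):
    """Objective 2, second part (assumed form): hinge penalty on every
    pair of new nodes with different labels that are closer than `margin`."""
    loss = 0.0
    for i in range(len(new_nodes)):
        for j in range(i + 1, len(new_nodes)):
            if new_labels[i] != new_labels[j]:
                d = math.sqrt(sq_dist(new_nodes[i], new_nodes[j]))
                loss += max(0.0, margin - d)
    return loss
```

In a real training loop these terms would be computed on CNN features and added to the classification loss with weights; the weighting scheme is left out here because the notes do not specify it.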
Most CIL works adopt the knowledge distillation technique for mitigating forgetting. The loss function is defined as:
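$$\ell(D, P; \theta) = \ell_{CE}(D, P; \theta) + \gamma\, \ell_{DL}(D, P; \theta)$$
where $\ell_{CE}$ is the cross-entropy loss term, $\ell_{DL}$ is the distillation loss term, and $\gamma$ weights the distillation term. Here $D$ is the new-class training set, $P$ is the set of stored old-class exemplars, and $\theta$ denotes the network parameters.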
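As a rough illustration of this combined loss for a single sample, the following plain-Python sketch adds a cross-entropy term and a distillation term. The function names, the temperature `T`, and the default `gamma` (the weight on the distillation term) are assumptions for illustration. The distillation term uses the common softened-softmax formulation over the old-class logits, which may differ from the exact variant a given CIL method uses.

```python
import math

def softmax(logits, T=1.0):
    """Numerically stable softmax with temperature T."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy of one sample against its ground-truth class index."""
    return -math.log(softmax(logits)[label])

def distillation_loss(new_logits, old_logits, n_old, T=2.0):
    """Cross-entropy between the softened outputs of the old model and
    the new model, restricted to the first n_old (old-class) logits."""
    p_old = softmax(old_logits[:n_old], T)
    p_new = softmax(new_logits[:n_old], T)
    return -sum(po * math.log(pn) for po, pn in zip(p_old, p_new))

def total_loss(new_logits, old_logits, label, n_old, gamma=1.0, T=2.0):
    """Cross-entropy plus gamma times the distillation term."""
    ce = cross_entropy(new_logits, label)
    dl = distillation_loss(new_logits, old_logits, n_old, T)
    return ce + gamma * dl

# Hypothetical example: 3 old classes, 1 new class, sample from the new class.
old_logits = [2.0, 0.5, -1.0]        # old model's output
new_logits = [1.8, 0.4, -0.9, 3.0]   # updated model's output
loss = total_loss(new_logits, old_logits, label=3, n_old=3, gamma=0.5)
```

Setting `gamma` too high keeps the old-class outputs fixed at the expense of the new classes, and setting it too low does the opposite, which is the balancing difficulty that arises when this loss is applied to FSCIL.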
The distillation approach faces several critical issues when applied to FSCIL:
1) the bias problem caused by imbalanced old/new class training data (proposed solutions include using a cosine distance measure and learning a bias-correction model);
2) balancing the contributions of the cross-entropy and distillation loss terms.
Knowledge distillation methods typically store a set of exemplars randomly drawn from the old training set and compute the distillation loss using these exemplars.
TOPIC instead represents knowledge by preserving the topology of the feature space, using a neural gas (NG) network. NG maps the feature space F to a finite set of feature vectors V and preserves the topology of F through competitive Hebbian learning.
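The sketch below shows one update step of a classic neural gas with competitive Hebbian learning, in plain Python. It illustrates the general NG technique rather than TOPIC's specific variant: the function name `ng_step` and the parameters `eps` (step size), `lam` (rank decay), and `max_age` (edge lifetime) are assumptions. Each input moves every node towards it with a step that shrinks with the node's distance rank, and an edge is created between the two closest nodes so that the edge set comes to reflect the topology of the data.

```python
import math

def ng_step(nodes, edges, f, eps=0.1, lam=1.0, max_age=10):
    """One neural gas update with competitive Hebbian learning.

    nodes: list of centroid vectors (lists of floats), updated in place.
    edges: dict mapping frozenset({i, j}) -> age, updated in place.
    f:     one input feature vector.
    Assumes at least two nodes. Returns the index of the winning node.
    """
    # Rank nodes by squared distance to the input.
    order = sorted(
        range(len(nodes)),
        key=lambda i: sum((m - x) ** 2 for m, x in zip(nodes[i], f)),
    )
    # Move every node towards f; the step decays exponentially with rank.
    for rank, i in enumerate(order):
        step = eps * math.exp(-rank / lam)
        nodes[i] = [m + step * (x - m) for m, x in zip(nodes[i], f)]
    winner, second = order[0], order[1]
    # Age all edges attached to the winner.
    for e in list(edges):
        if winner in e:
            edges[e] += 1
    # Competitive Hebbian learning: connect (or refresh) winner and runner-up.
    edges[frozenset((winner, second))] = 0
    # Remove edges that have exceeded their lifetime.
    for e in [e for e, age in edges.items() if age > max_age]:
        del edges[e]
    return winner

# Hypothetical usage on 2-D features.
nodes = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
edges = {}
for feature in [[0.2, 0.0], [0.9, 0.1], [4.8, 5.1]]:
    ng_step(nodes, edges, feature)
```

Storing edges with an age means that connections between nodes that stop being neighbours eventually disappear, so the graph tracks the current shape of the feature space.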