Improving transfer learning under distribution shift
Defense Date:
Transfer learning is a popular tool for building machine learning models in fields such as natural language processing and computer vision. A simple example is using a model trained on one, typically large, dataset as a feature extractor for another model trained on a target dataset. A model built this way can achieve better results than one trained only on the target dataset. A common issue with this approach is the distribution shift between the two datasets, which negatively affects modeling results. The aim of this work is to introduce the reader to the topic of transfer learning and to compare various knowledge transfer strategies in terms of their robustness to problems arising from distribution shift. The concept of distribution shift is discussed in both mathematical and practical terms. Three experiments were conducted to compare the effectiveness of three model fine-tuning methods. In two of these experiments, distribution shift was modeled, while the third used three datasets with varying degrees of shift. The results of the experiments suggest that addressing the studied issue paves the way for more effective and efficient training of machine learning models.
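As an illustration of the feature-extraction strategy mentioned above, the following minimal sketch freezes a backbone pretrained on a large source dataset and trains only a new classification head on the target dataset. It assumes PyTorch and torchvision; the choice of ResNet-18 and the class count are placeholders for illustration, not the setup used in this work.

```python
# Minimal sketch of transfer learning via a frozen feature extractor.
# Assumptions: PyTorch + torchvision; ResNet-18 and num_target_classes
# are illustrative placeholders, not the thesis's actual configuration.
import torch
import torch.nn as nn
from torchvision import models

# Load a model pretrained on a large source dataset (ImageNet).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained parameters so they act purely as a feature extractor.
for param in backbone.parameters():
    param.requires_grad = False

# Replace the classification head with one sized for the target task.
num_target_classes = 10  # placeholder for the target dataset's label count
backbone.fc = nn.Linear(backbone.fc.in_features, num_target_classes)

# Only the new head's parameters are optimized.
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch from the target domain.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_target_classes, (8,))
logits = backbone(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```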
