The use of large language models in active learning methods
Defense Date:
In this thesis, I address the use of large language models (LLMs) as oracles in the active learning process, where an LLM labels the samples selected by a query strategy. To this end, I focus on currently available LLMs and on appropriate prompt-engineering techniques for obtaining the best possible labels. While researching the topic, I analyzed the available scientific literature for similar solutions and observed that current research may suffer from data leakage: benchmark datasets may already appear in an LLM's training data. To mitigate this issue, I prepared an entirely new dataset, minimizing the chance of such contamination. I confirmed that, on this new data, using an LLM in the active learning process yields good results. Analyzing the active learning literature, I selected the strongest and most diverse query strategies and compared how they perform under the label noise introduced by the LLM. The experiments show that, for the selected datasets, an uncertainty-based strategy performs best; the implementation complexity and computational overhead of the other strategies are therefore not justified. Additionally, this work consolidates the scattered and rapidly changing knowledge in this field, while conducting uniform experiments at a single point in time.
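The loop described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the `mock_llm_oracle` function, the nearest-centroid classifier, the 10% noise rate, and the query budget are all assumptions standing in for a real LLM labeling call, the actual models, and the actual strategies compared in the experiments. The uncertainty strategy shown is margin-based sampling: query the pool point whose two class probabilities are closest.

```python
import math
import random

random.seed(0)

def mock_llm_oracle(x):
    """Hypothetical stand-in for an LLM labeling call (assumption, not the
    thesis setup): returns the true label of a 2-D point with 10% noise,
    imitating the imperfect labels an LLM oracle produces."""
    true_label = int(x[0] + x[1] > 1.0)
    return 1 - true_label if random.random() < 0.1 else true_label

def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

def proba(cents, x):
    """Softmax over negative distances to class centroids -> pseudo-probabilities."""
    exps = [math.exp(-math.dist(c, x)) for c in cents]
    s = sum(exps)
    return [e / s for e in exps]

# Unlabeled pool of 2-D points in [0, 2] x [0, 2].
pool = [(random.random() * 2, random.random() * 2) for _ in range(200)]

# Seed set: keep querying the oracle until both classes have at least one example.
labeled = {0: [], 1: []}
while not labeled[0] or not labeled[1]:
    x = pool.pop(0)
    labeled[mock_llm_oracle(x)].append(x)

for _ in range(30):  # query budget (assumed)
    cents = [centroid(labeled[0]), centroid(labeled[1])]
    # Uncertainty (margin) sampling: pick the pool point where the two
    # class probabilities are closest together.
    idx = min(range(len(pool)),
              key=lambda i: abs(proba(cents, pool[i])[0] - proba(cents, pool[i])[1]))
    x = pool.pop(idx)
    labeled[mock_llm_oracle(x)].append(x)  # the LLM plays the oracle role

# Evaluate the retrained model on the remaining (unqueried) pool.
cents = [centroid(labeled[0]), centroid(labeled[1])]
acc = sum(int(proba(cents, x)[1] > 0.5) == int(x[0] + x[1] > 1.0)
          for x in pool) / len(pool)
print(f"accuracy on remaining pool: {acc:.2f}")
```

Swapping `mock_llm_oracle` for a real LLM prompt and the centroid model for the actual classifier recovers the pipeline studied in the thesis; the noisy oracle is exactly what makes the comparison of query strategies under LLM-induced label noise interesting.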
