Application of graph neural networks in the expansion of graph sets

Defense Date:

This study combines two topics: set expansion and the graph neural network (GNN), a relatively new type of artificial neural network. Set expansion starts by separating from a large data set a relatively small number of mutually similar elements, jointly referred to as the expanded set. The remaining data, referred to as the candidate set, is then searched for further elements of the expanded set, i.e. elements similar to those already separated. Two properties are essential to set expansion: the expanded set is much smaller than the candidate set, and the candidate set is contaminated with elements that belong to the expanded set. The other of the methods considered trains the network in a semi-supervised manner: elements of the expanded set are labeled 1, the candidate set is enlarged by a similar number of class 1 elements, and all elements of the candidate set are labeled 0. The network was trained on two data sets: the two-class Mutagenesis set and the six-class Enzymes set. For the Mutagenesis set, the expanded set contained chemical compounds of one type only, while the candidate set contained compounds of both types. For the Enzymes set, the expanded set contained enzymes of one of the six types, while the candidate set contained enzymes of all six types. The results indicate that the GNN is able to learn even though the candidate set is contaminated with elements of the expanded set and the expanded set is small compared to the candidate set.
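The labeling scheme described above can be illustrated with a minimal sketch. It rests on one reading of the abstract, not on the study's actual code: a few elements of the target class form the expanded set and receive label 1, a similar number of further target-class elements are mixed into the candidate set, and every candidate receives label 0. The function name `make_expansion_labels` and its parameters are hypothetical.

```python
import random


def make_expansion_labels(true_classes, target_class, n_seed, rng):
    """Split element indices into an expanded (seed) set and a candidate set.

    Assumption (hypothetical reading of the abstract): the candidate set is
    contaminated with about as many target-class elements as the seed holds.
    """
    positives = [i for i, c in enumerate(true_classes) if c == target_class]
    negatives = [i for i, c in enumerate(true_classes) if c != target_class]
    rng.shuffle(positives)

    seed = positives[:n_seed]                 # expanded set, labeled 1
    hidden = positives[n_seed:2 * n_seed]     # contamination of the candidate set
    candidate = negatives + hidden            # every candidate is labeled 0

    labels = {i: 1 for i in seed}
    labels.update({i: 0 for i in candidate})
    return seed, candidate, labels


# Toy two-class example: 20 elements of class 0 and 10 of class 1.
true_classes = [0] * 20 + [1] * 10
seed, candidate, labels = make_expansion_labels(
    true_classes, target_class=1, n_seed=3, rng=random.Random(0)
)
print(len(seed), len(candidate))
```

Note that the training labels of the hidden target-class elements are deliberately wrong (0 instead of 1); this is the contamination the network has to tolerate while it learns to rank candidates by similarity to the expanded set.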