Instance Selection for Geometric Semantic Genetic Programming

Abstract

Geometric Semantic Genetic Programming (GSGP) is a method that exploits the geometric properties describing the spatial relationship between possible solutions to a problem in an n-dimensional semantic space. In symbolic regression problems, n is equal to the number of training instances. Although GSGP is very effective, its semantic space can become excessively large in most real applications, where the value of n is high, which has a negative impact on the effectiveness of the search process. This paper tackles the problem by reducing the dimensionality of the GSGP semantic space in symbolic regression problems using instance selection methods. Our approach relies on weighting functions (to estimate the relative importance of each instance based on its position with respect to its nearest neighbours) and on dimensionality reduction techniques (to improve the notion of closeness between instances, generating datasets with simplified input spaces). Experiments were performed on a set of 15 datasets, and our experimental analysis shows that instance selection by instance weighting and dimensionality reduction does improve the effectiveness of the search with almost no impact on root mean square error results.
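
As a rough illustration of the idea, the sketch below weights each training instance by its mean distance to its k nearest neighbours and keeps only the highest-weighted ones, so that a program's semantic vector has fewer components. This is not the paper's method: the weighting rule, the choice to keep high-weight instances, the function names `knn_weights` and `select_instances`, and the toy data are all assumptions made for illustration, and the dimensionality reduction step is omitted.

```python
import math

def knn_weights(points, k):
    """Weight each point by its mean Euclidean distance to its k nearest neighbours.

    Assumed weighting rule for illustration only; the abstract does not specify one.
    """
    weights = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        weights.append(sum(dists[:k]) / k)
    return weights

def select_instances(X, y, keep, k=2):
    """Return the sorted indices of the `keep` highest-weight instances.

    Each instance is placed in the joint (input, target) space before
    measuring distances (an assumption of this sketch).
    """
    points = [tuple(x) + (t,) for x, t in zip(X, y)]
    w = knn_weights(points, k)
    ranked = sorted(range(len(points)), key=lambda i: w[i], reverse=True)
    return sorted(ranked[:keep])

# Toy symbolic regression data: n = 7 training instances, one input feature.
X = [[0.0], [0.1], [0.2], [5.0], [9.8], [9.9], [10.0]]
y = [0.0, 0.1, 0.2, 5.0, 9.8, 9.9, 10.0]

selected = select_instances(X, y, keep=5)

# The semantics of a candidate program is its output vector on the training
# instances, so selecting 5 of 7 instances shrinks the semantic space from
# 7 dimensions to 5.
program = lambda x: 2.0 * x[0]  # hypothetical candidate solution
semantics = [program(X[i]) for i in selected]
print(selected, len(semantics))
```

In this toy data the two instances sitting in the middle of dense clusters receive the lowest weights and are dropped, while the isolated instance and the cluster edges are kept. Whether sparse or dense regions should be favoured is a design choice that the abstract leaves open.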

Publication
2020 IEEE Congress on Evolutionary Computation (CEC)
