Deep learning applied to regression, classification and feature transformation problems
- José Ramón Dorronsoro Ibero (Supervisor)
Defended at: Universidad Autónoma de Madrid
Date of defense: 18 March 2022
- César Hervás Martínez (Committee chair)
- Ana González Marcos (Secretary)
- Javier Martínez Moguerza (Committee member)
Type: Thesis
Abstract
The decade from 2010 to 2020 has seen a series of impressive improvements in the performance of Machine Learning models, especially in problems such as image and video tagging, automatic translation, optical character recognition, speech-to-text and others, collectively known as Computer Perception. These improvements have been driven by the greater computational power made available by advances in computing hardware and software, and by the great amount of data available in the so-called era of Big Data, but those are not the only reasons. The development of Deep Learning, a term that refers to modern artificial neural networks employing a series of relatively recent techniques (initialization, activation, regularization, etc.), has probably been the most important factor of all. These connectionist Machine Learning models are not only universal approximators, but also have a flexible architecture that can be adapted to different types of data and loss functions. Examples of this are convolutional neural networks and recurrent neural networks, adapted to data with spatial structure and data with temporal structure, respectively.

This thesis, structured as a compendium of articles, presents three developments tightly related to Deep Learning that have resulted in the publications included.

In the first place, an application of convolutional neural networks to the problem of prediction in renewable energies is proposed, taking advantage of the spatial structure present in such data; it improves on previous results while keeping computational costs under control thanks to the efficiency of artificial neural networks. The related publications were among the first contributions at the time to make use of the new DNN frameworks for renewable energy prediction.

In the second place, the usefulness of Deep Learning models for feature transformation is shown. A correct feature transformation can be of paramount importance when confronting a Machine Learning problem. Such algorithms can be used as a first step in the modeling pipeline, prior to a classifier or regressor, for example, splitting the problem into two more manageable sub-problems: obtaining a good representation of the data and generating the actual prediction from it. In this case, the feature transformation technique for classification problems known as Fisher's Discriminant Analysis (FDA) will be studied. Once the theoretical framework is set up, the limitations and drawbacks of this tool can be analyzed in more depth. Such limitations include its linear nature, which implies limited expressive power. This is often addressed with the use of kernels, at the cost of a much higher computational burden that relegates the technique to small and medium-sized datasets. To overcome these limitations, a partial equivalence between the traditional technique and Least-Squares based models is exploited. This equivalence makes it possible to train linear transformations with an iterative algorithm and, by extension, to use Artificial Neural Networks as the underlying computational engine, obtaining non-linear transformations while keeping computational costs reasonable even for large datasets. Additionally, the use of neural networks in imbalanced classification problems, an application closely related to FDA, will be presented.
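As a purely illustrative aside (not part of the thesis itself), the classical FDA/least-squares connection mentioned above can be sketched in a few lines of NumPy: for a binary problem, a linear least-squares fit against suitably coded class targets yields a weight vector parallel (up to scale and sign) to the Fisher discriminant direction S_W^{-1}(m1 - m0). The synthetic data, the target coding and all variable names below are assumptions chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian classes in 5 dimensions (toy data for illustration only).
X0 = rng.normal(loc=0.0, scale=1.0, size=(200, 5))
X1 = rng.normal(loc=1.5, scale=1.0, size=(300, 5))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(len(X0)), np.ones(len(X1))])

# Classical binary FDA direction: w = S_W^{-1} (m1 - m0).
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
S_W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
w_fda = np.linalg.solve(S_W, m1 - m0)

# Least-squares fit with the classical target coding: N/N1 for class 1, -N/N0 for class 0.
N, N0, N1 = len(X), len(X0), len(X1)
t = np.where(y == 1, N / N1, -N / N0)
A = np.hstack([X, np.ones((N, 1))])            # design matrix with a bias column
w_ls = np.linalg.lstsq(A, t, rcond=None)[0][:5]  # drop the bias coefficient

# The two directions coincide up to scale and sign: |cos| should be close to 1.
cos = abs(w_fda @ w_ls) / (np.linalg.norm(w_fda) * np.linalg.norm(w_ls))
print(f"absolute cosine similarity between FDA and least-squares directions: {cos:.4f}")
```

Replacing the linear least-squares model in such a scheme with a neural network is what allows non-linear transformations to be learned with ordinary gradient-based training, which is the idea the thesis exploits.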
In the third place, the application of margin loss functions, such as those usually employed in support vector machines, to artificial neural networks will be studied. As will be shown, these loss functions can noticeably improve model performance in certain cases, even when sophisticated architectures such as LeNet or ResNet are used, and again the computational efficiency of neural networks can be an important advantage over more classical techniques. Finally, an adaptation of Deep Learning models to the simultaneous use of several loss functions, margin losses among them, will be detailed; it again produces noticeable changes in the quality of predictions.
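As another purely illustrative sketch (not the thesis implementation; the toy data, architecture and hyperparameters are assumptions chosen for the example), a margin loss of the kind used in support vector machines can be dropped into an ordinary PyTorch training loop by replacing cross-entropy with torch.nn.MultiMarginLoss applied to the raw class scores:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: three Gaussian blobs in 2-D, one per class.
centers = torch.tensor([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X = torch.randn(300, 2) + torch.repeat_interleave(centers, 100, dim=0)
y = torch.repeat_interleave(torch.arange(3), 100)

# Small feed-forward network producing raw class scores (no softmax).
model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 3))
criterion = nn.MultiMarginLoss(margin=1.0)   # multiclass hinge (margin) loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for epoch in range(200):
    optimizer.zero_grad()
    scores = model(X)             # raw scores for each class
    loss = criterion(scores, y)   # penalizes scores that violate the margin
    loss.backward()
    optimizer.step()

accuracy = (model(X).argmax(dim=1) == y).float().mean()
print(f"final margin loss: {loss.item():.4f}, train accuracy: {accuracy:.3f}")
```

The same loop structure also admits a weighted combination of several loss terms (for example, a margin loss plus cross-entropy), which is the kind of multi-loss setup the abstract refers to.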