Uso de patrones secuenciales multivariantes para clasificación y extracción de conocimiento temporal: estudio de supervivencia de pacientes en la Unidad de Quemados Críticos

  1. Casanova López, Isidoro Jesús
Supervised by:
  1. Manuel Campos Martínez Director
  2. Jose M. Juarez Director
  3. José Ángel Lorente Balanza Director

University of defense: Universidad de Murcia

Date of defense: 5 May 2023

Type: Thesis

Abstract

Our research is based on patients with severe burns who were admitted to the ICU and whose evolution was recorded daily. Some clinical parameters available on the patient's arrival, such as age or the extent of the burn, allow an initial assessment of severity and help to predict survival on admission. However, studying the evolution of other clinical parameters recorded during the first 5 days of the ICU stay (pH, diuresis, base excess, ...) can help to define objectives and to assess the patient's evolution and response to treatment. This thesis proposes generating potential knowledge from the temporal evolution of these variables, so that it becomes possible to predict a patient's survival or to suggest new ideas to physicians about the behavior of these variables.
METHODS. We first define a knowledge discovery process with four steps: 1) discretization of temporal attributes; 2) multivariate sequential pattern mining; 3) post-processing, which keeps as interesting only the discriminative patterns and then applies a compressed representation of them; and 4) classification of patient survival with interpretable models. Next, we compare how different discretization methods affect classification, and we reduce the number of sequential patterns used as predictors in the classifiers by evaluating their consistency. In addition, we propose a statistical indicator widely used in epidemiological studies, the Diagnostic Odds Ratio (DOR), as an interestingness measure alternative to frequency for selecting patterns. Finally, we present an original method for obtaining a reduced subset of novel sequential patterns that represent surprising temporal evolution of the patient's clinical status, which we call Jumping Diagnostic Odds Ratio Sequential Patterns (JDORSP).
We use the DOR to select sequential patterns that represent a dramatic change in the patient's evolution, that is, patterns that become a protective factor when we extend a pattern that was a risk factor, or vice versa.
RESULTS. The classification tests show that our approach outperforms the burn severity scores currently used by physicians in terms of Brier score and, to the best of our knowledge, this is the first work in which multivariate sequential patterns are used as mortality predictors in the ICU. Regarding the different discretization techniques, to our knowledge no previous study has made this comparison using sequential patterns specifically. The best classification performance was obtained with the automatic UCPD discretization. Expert discretization also gives acceptable results, outperforming many automatic discretization algorithms. By evaluating consistency, we further reduced the number of sequential patterns, finding patterns uniformly distributed within the patient database. Regarding the use of the DOR statistic to reduce the number of sequential patterns and select only the most discriminative ones, with expert discretization the highest specificity is achieved by using the DOR value directly to select patterns. This is, to our knowledge, the first time that some of these approaches have been proposed and compared in the scientific literature. Finally, regarding the proposed JDORSP patterns, to the best of our knowledge this is the first time that the DOR and sequential patterns have been used in this way. We highlight the drastic reduction in the number of sequential patterns with respect to the current state of the art, which allows medical experts to review manually the surprisingness and relevance of the patterns discovered.
Thus, the most interesting finding is the high surprisingness of sequential patterns that are initially a risk factor and whose extensions become a protective factor, that is, patients who recover after several days at high risk of dying.
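
The DOR-based selection described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Counts` layout, the function names, the zero-cell correction and the example counts are all assumptions made for illustration, and the jump test simply compares each DOR with 1 without the confidence intervals a real study would likely use. The sketch takes death as the outcome, so a DOR above 1 marks a pattern as a risk factor and a DOR below 1 as a protective factor.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """2x2 contingency counts for one sequential pattern (hypothetical layout)."""
    died_with: int         # non-survivors whose record contains the pattern
    died_without: int      # non-survivors whose record lacks the pattern
    survived_with: int     # survivors whose record contains the pattern
    survived_without: int  # survivors whose record lacks the pattern


def diagnostic_odds_ratio(c: Counts, correction: float = 0.5) -> float:
    """DOR = (TP * TN) / (FN * FP), with death as the positive outcome.

    If any cell is zero, `correction` is added to every cell
    (an assumed continuity correction, to avoid division by zero).
    """
    cells = [c.died_with, c.died_without, c.survived_with, c.survived_without]
    if 0 in cells:
        cells = [x + correction for x in cells]
    tp, fn, fp, tn = cells
    return (tp * tn) / (fn * fp)


def is_jumping(dor_base: float, dor_extension: float) -> bool:
    """True when extending a pattern flips it across DOR = 1,
    i.e. from risk factor to protective factor or vice versa."""
    risk_to_protective = dor_base > 1 and dor_extension < 1
    protective_to_risk = dor_base < 1 and dor_extension > 1
    return risk_to_protective or protective_to_risk


# Made-up counts for a base pattern and one extension of it
# (40 non-survivors and 60 survivors in both tables).
base = Counts(died_with=30, died_without=10, survived_with=20, survived_without=40)
extension = Counts(died_with=2, died_without=38, survived_with=15, survived_without=45)

dor_base = diagnostic_odds_ratio(base)
dor_extension = diagnostic_odds_ratio(extension)
print(dor_base, dor_extension, is_jumping(dor_base, dor_extension))
```

With these made-up counts the base pattern behaves as a risk factor and its extension as a protective factor, which is the kind of jump the abstract describes for patients who recover after several days at high risk.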