Design of a predictive care model for patients infected with COVID-19, using a supervised Machine Learning model based on hospital or outpatient referral criteria.
Date
2021-04
Publisher
Universidad de Guayaquil. Facultad de Ciencias Matemáticas y Físicas. Carrera de Ingeniería en Sistemas Computacionales.
Abstract
One problem arising from the COVID-19 pandemic is the lack of a digital tool that can predict the severity of a patient's condition. This project builds a predictive care model for patients infected with COVID-19, using supervised Machine Learning algorithms such as Naive Bayes and Random Forest to obtain a criterion for hospital or outpatient referral. The main specific objectives include extracting a data set with information from the medical histories of patients diagnosed with COVID-19, cleaning it to build a dataset of the relevant variables, and evaluating those variables to improve decision-making with a supervised algorithm model. The methodology used is Knowledge Discovery in Databases (KDD), which is carried out in 6 phases: data import and sampling, data quality, transformation, modeling, evaluation, and implementation. The last phase was not carried out; a prototype developed in Python was built in its place. The sklearn library for Python 3.5 was used to train the algorithms, and the Stat::Fit tool was used for the statistical distributions. Based on the patients' symptoms, the algorithms reached a precision of 93.5% (Random Forest) and 95% (Naive Bayes), leading to the conclusion that Naive Bayes is the better predictor. A Pearson correlation analysis also showed a relationship between the two algorithms with respect to hospital or outpatient referral, so the proposed hypotheses were confirmed. Because the prototype supports the decision on patient referral, the direct beneficiaries are doctors, who obtain a tool that speeds up the demanding task of making that decision.
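To illustrate how a Naive Bayes classifier can turn symptoms into a referral criterion, the following is a minimal sketch written from scratch in plain Python. It is not the thesis prototype, which was trained with sklearn. The three binary symptoms, the eight toy records, and the function names `fit_bernoulli_nb`, `predict`, and `accuracy` are all assumptions made up for this example; the real dataset and variables are not given in this record.

```python
import math
from collections import defaultdict

# Hypothetical toy records: (fever, dyspnea, low_oxygen_saturation) -> referral.
# These values are invented for illustration; they are NOT the thesis dataset.
TRAIN = [
    ((1, 1, 1), "hospital"),
    ((1, 1, 0), "hospital"),
    ((0, 1, 1), "hospital"),
    ((1, 0, 1), "hospital"),
    ((1, 0, 0), "outpatient"),
    ((0, 0, 0), "outpatient"),
    ((0, 1, 0), "outpatient"),
    ((1, 0, 0), "outpatient"),
]


def fit_bernoulli_nb(rows, alpha=1.0):
    """Estimate class priors and P(symptom=1 | class) with Laplace smoothing."""
    class_counts = defaultdict(int)
    feature_counts = {}
    for features, label in rows:
        class_counts[label] += 1
        counts = feature_counts.setdefault(label, [0] * len(features))
        for i, value in enumerate(features):
            counts[i] += value
    total = len(rows)
    model = {}
    for label, n in class_counts.items():
        prior = n / total
        # Smoothing keeps every probability strictly between 0 and 1.
        likelihoods = [(c + alpha) / (n + 2 * alpha) for c in feature_counts[label]]
        model[label] = (prior, likelihoods)
    return model


def predict(model, features):
    """Return the class with the highest log-posterior score."""
    best_label, best_score = None, -math.inf
    for label, (prior, likelihoods) in model.items():
        score = math.log(prior)
        for value, p in zip(features, likelihoods):
            # Naive assumption: symptoms are independent given the class.
            score += math.log(p if value else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the recorded label."""
    hits = sum(predict(model, features) == label for features, label in rows)
    return hits / len(rows)


model = fit_bernoulli_nb(TRAIN)
print(predict(model, (1, 1, 1)))  # patient with all three symptoms
print(predict(model, (0, 0, 0)))  # patient with none of them
print(accuracy(model, TRAIN))     # score on the training rows only
```

The score printed at the end is computed on the same rows used for fitting, so it says nothing about performance on unseen patients; a figure comparable to the reported 93.5% and 95% would require a held-out test set.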
Description
PDF
Keywords
Hospital referral, COVID-19, Machine learning, Random forest, Bayesian networks, Naive Bayes