Use of Text Analytics to Extract Dementia Patients’ Cognitive Assessment Test Results from Free Text Clinical Notes

Abstract Description
Abstract ID :
HAC1204
Authors (including presenting author) :
Lui CSM(1), Woo PPS(1), Man SP(2), Tse CM(2), Leung EKH(3), Au Yeung TW(2), Lam PKW(1), Tsui ELH(1)
Affiliation :
(1) Strategy and Planning Division, Hospital Authority Head Office, Hong Kong
(2) Department of Medicine and Geriatrics, Pok Oi Hospital, Hong Kong
(3) Department of Occupational Therapy, Pok Oi Hospital, Hong Kong
Introduction :
As part of the effort to construct a theme-based research dataset on dementia, cognitive assessment results, in addition to other clinical data, were identified as essential data items for analysing dementia patients. The cognitive assessment results, in particular the Mini-Mental State Examination (MMSE) scores and reference dates, are mostly captured in free-text clinical notes in the Clinical Management Systems (CMS) of the Hospital Authority (HA). Extracting this information manually from free-text clinical notes is time-consuming and impractical when large numbers of notes are involved. Text analytics can provide an automated, fast and consistent way to harness the information in clinical notes and transform unstructured data into structured data ready for further statistical analysis and research.
Objectives :
We used text analytics to extract cognitive assessment scores and reference dates from free-text clinical notes and transform them into structured data for further statistical analysis.
Methodology :
We developed a text analytics algorithm that combines rule-based text pre-processing with a Long Short-Term Memory (LSTM) machine learning model to extract the MMSE scores and reference dates from the clinical notes of 723 dementia patients sampled from the chronic disease virtual registry of the HA. Each clinical note was treated as a sequence of words, and each word was labeled as one of three categories: irrelevant, MMSE score, or MMSE reference date. Of the sample, 614 clinical notes were randomly selected to train the algorithm. The trained algorithm was then used to predict the category of each word in the remaining 109 clinical notes to evaluate its accuracy.
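The word-by-word labeling and the random 614/109 split described above can be illustrated with a minimal Python sketch. It is not the algorithm used in the study: the tokenizer pattern, the label names, the helper functions `tokenize` and `split_notes`, and the example note are all hypothetical, and the LSTM model itself is not shown.

```python
import random
import re

# The three word categories named in the methodology (label names are assumed).
LABELS = ("IRRELEVANT", "MMSE_SCORE", "MMSE_DATE")

# Hypothetical rule-based tokenizer: keeps dates such as 12/03/2015 and
# scores such as 18/30 as single tokens so each can receive one label.
TOKEN_RE = re.compile(r"\d{1,2}/\d{1,2}/\d{2,4}|\d{1,2}/30|\w+|[^\w\s]")


def tokenize(note):
    """Lower-case a note and split it into a sequence of tokens."""
    return TOKEN_RE.findall(note.lower())


def split_notes(notes, n_train, seed=0):
    """Randomly split notes into a training set of n_train items and a test set."""
    shuffled = list(notes)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Invented example note with a hand-assigned label for every token.
tokens = tokenize("MMSE 18/30 on 12/03/2015.")
labels = ["IRRELEVANT", "MMSE_SCORE", "IRRELEVANT", "MMSE_DATE", "IRRELEVANT"]
labeled_note = list(zip(tokens, labels))

# Note identifiers stand in for the 723 sampled clinical notes.
train_ids, test_ids = split_notes(range(723), n_train=614)
```

Sequences of `(token, label)` pairs such as `labeled_note` are the kind of input a sequence-labeling model could be trained on; how the study encoded words for the LSTM is not stated in the abstract.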
Result & Outcome :
The text analytics algorithm developed in this study achieved 98% accuracy on the 614 clinical notes in the training dataset and 95% on the 109 clinical notes in the test dataset. The data extracted by the algorithm, together with the other clinical data available in the structured fields of the CMS in the HA, can facilitate modeling of cognitive trajectories to identify factors contributing to the variability in the progression of cognitive impairment among dementia patients.
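As an illustration only, the sketch below shows one way accuracy could be computed per word and how predicted labels could be collected into a structured record. The abstract does not state how accuracy was defined, so word-level accuracy is an assumption here, and the function names, field names and example data are hypothetical.

```python
def token_accuracy(gold, predicted):
    """Fraction of words whose predicted category equals the annotated one."""
    if len(gold) != len(predicted):
        raise ValueError("label sequences must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def to_record(tokens, labels):
    """Keep the first token labeled as a score and the first labeled as a date."""
    record = {"mmse_score": None, "mmse_date": None}
    for token, label in zip(tokens, labels):
        if label == "MMSE_SCORE" and record["mmse_score"] is None:
            record["mmse_score"] = token
        elif label == "MMSE_DATE" and record["mmse_date"] is None:
            record["mmse_date"] = token
    return record


# Invented example: five words from one note, with one wrong prediction.
tokens = ["mmse", "18/30", "on", "12/03/2015", "."]
gold = ["IRRELEVANT", "MMSE_SCORE", "IRRELEVANT", "MMSE_DATE", "IRRELEVANT"]
predicted = ["IRRELEVANT", "MMSE_SCORE", "IRRELEVANT", "MMSE_DATE", "MMSE_DATE"]

accuracy = token_accuracy(gold, predicted)  # 4 of 5 words match
record = to_record(tokens, predicted)
```

A record of this shape, one row per note, could be joined with the structured CMS fields for statistical analysis.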
