Phase 1 overview

Developments and findings from 2021-22

This page gives a short summary of the phase 1 report. The full report can be downloaded here.

Phase 1 of the text mining pilot project was hosted at Nottinghamshire Healthcare NHS Foundation Trust and ran during 2021-22. It formed the basis of the current Patient Experience-QDC project. The pilot aimed to improve the use of Friends and Family Test (FFT) and patient experience survey qualitative data in selected trusts, working towards a national “support or guidance toolkit” to help drive service improvements. Its overarching goal was to create text mining software, originating in the NHS, that analyses qualitative patient experience data for theme and sentiment, displays the results on an online dashboard and is freely available to all NHS trusts.

Objectives of phase 1:

Improve the processing and analysis of FFT and patient experience survey qualitative data using text analytics (e.g. machine learning), through creating, developing and deploying text mining software.
Develop a process and software that are reproducible, sustainable and easy to implement in different NHS provider organisations and services, i.e. reusable across the system at low cost using open source components.
Establish data visualisation and/or reporting approaches that support the use of patient experience feedback for quality improvement.
Gain a better understanding of how needs vary across NHS trusts and service settings, so that the solution can be easily transferred and adapted.

The project produced three distinct outputs:

Text classification model. A machine learning (ML) model that predicts the themes and criticalities of unlabelled feedback text (release 0.3.4)
Text mining dashboard. An interactive dashboard that presents the results of the text classification model, along with further insights derived from the feedback text, including sentiment analysis and other text mining methods (release 0.4.4)
Model summary dashboard. A dashboard designed to allow non-technical users to assess the accuracy of the model in use, including summary metrics as well as examples of classified data to enable users to make their own judgments about accuracy

Machine learning model and dataset

The phase 1 models were trained on approximately 10,000 rows of data provided by three partner trusts and were implemented using the Python scikit-learn library. Two models were trained, each predicting a different target:

9 ‘themes’ or categorical labels. These were: Access, Care received, Communication, Couldn’t be improved, Dignity, Environment/facilities, Miscellaneous, Staff, and Transition/coordination. The model predicted one theme for each comment.
9 ‘criticality’ values, ranging from -4 to 4, with -4 being highly critical of the organisation and 4 being highly complimentary towards the organisation.

The winning model for the nine themes was a linear support vector machine with 71% accuracy and 52% class balance accuracy on the test set. This is available on GitHub, together with the code used to create the model and fine-tune the hyperparameters.
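The gap between the two figures is easier to read with a small worked example. The sketch below is illustrative only: this page does not define class balance accuracy, so the code assumes it means, for each class, the number of correct predictions divided by the larger of that class's true count and predicted count, averaged over all classes. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of comments whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def class_balance_accuracy(y_true, y_pred):
    """Mean over classes of correct / max(true count, predicted count).

    Assumed definition; see the lead-in above.
    """
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    classes = sorted(set(y_true) | set(y_pred))
    per_class = [
        correct[c] / max(true_counts[c], pred_counts[c]) for c in classes
    ]
    return sum(per_class) / len(classes)


# Hypothetical, imbalanced toy data: "Staff" dominates and the model
# predicts "Staff" for every comment.
y_true = ["Staff"] * 8 + ["Access", "Dignity"]
y_pred = ["Staff"] * 10

print(accuracy(y_true, y_pred))                # 0.8
print(class_balance_accuracy(y_true, y_pred))  # (0.8 + 0 + 0) / 3
```

Under this assumed definition, a model that ignores the rarer themes can still score well on plain accuracy while scoring poorly on class balance accuracy, which is one way a pair of figures such as 71% and 52% can arise.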

The winning model for the nine criticality values was a logistic regression with 59% accuracy and 44% class balance accuracy on the test set. This is also available on GitHub, together with the code used to create the model and fine-tune the hyperparameters.
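As a rough illustration of the two model types named above, the following scikit-learn sketch fits a linear support vector machine for themes and a logistic regression for criticality. It is not the project's pipeline: the TF-IDF features, the default hyperparameters, the example comments and all variable names are assumptions made for this sketch, and the published code remains the reference for what was actually built.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical training rows: (comment, theme, criticality).
rows = [
    ("The nurses were kind and helpful", "Staff", 3),
    ("Staff were rude to my mother", "Staff", -3),
    ("I waited months for an appointment", "Access", -2),
    ("Easy to book an appointment online", "Access", 2),
    ("The ward was dirty and noisy", "Environment/facilities", -3),
    ("Clean and comfortable waiting room", "Environment/facilities", 2),
]
comments = [r[0] for r in rows]
themes = [r[1] for r in rows]
criticality = [r[2] for r in rows]

# Theme model: TF-IDF features (an assumption) feeding a linear SVM.
theme_model = make_pipeline(TfidfVectorizer(), LinearSVC(random_state=0))
theme_model.fit(comments, themes)

# Criticality model: the same assumed features feeding a logistic regression.
criticality_model = make_pipeline(
    TfidfVectorizer(), LogisticRegression(max_iter=1000)
)
criticality_model.fit(comments, criticality)

# Each model predicts exactly one value per unlabelled comment.
new_comments = [
    "The receptionist was very helpful",
    "Could not get an appointment",
]
predicted_themes = theme_model.predict(new_comments)
predicted_criticality = criticality_model.predict(new_comments)
for text, theme, crit in zip(new_comments, predicted_themes, predicted_criticality):
    print(f"{text!r}: theme={theme}, criticality={crit}")
```

With only six training rows the predictions carry no weight; the point is the shape of the setup, in which each model returns a single label per comment.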

Dashboard

The two dashboards were created using Shiny/R. The model summary dashboard provided contextual information regarding the trained machine learning models. The other dashboard, experiencesdashboard, was one of the primary outputs of the project. It offered the following functionality:

Uploading of data (FFT comments, quantitative score, demographic data regarding respondents, and organisational data e.g. the relevant team/department) to the dashboard in CSV format
The uploaded data was then categorised/labelled using the machine learning model.
The uploaded data could be explored. Filtering by team/department and date was possible, as well as by the themes assigned by the machine learning model.
Sentiment analysis was conducted on the uploaded text at two levels: theme-level, to see the positivity and negativity of sentiment accorded to specific topics, and text-level, to see the positivity and negativity of individual comments.
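The filtering and the two levels of sentiment described in the list above can be sketched in a few lines. The dashboard itself was built in Shiny/R; this Python sketch only illustrates the logic. The column names, the sample CSV, the tiny word lexicon and the scoring rule are all assumptions, since this page does not say which sentiment method the dashboard used.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical upload; the column names are assumptions, not the dashboard's schema.
UPLOAD = """date,team,theme,comment
2022-01-10,Ward A,Staff,The staff were kind and helpful
2022-01-12,Ward A,Access,Long wait and poor communication
2022-02-03,Ward B,Staff,Rude staff but helpful doctor
2022-02-20,Ward A,Staff,Kind nurses and excellent care
"""

# Toy sentiment lexicon, for illustration only.
POSITIVE = {"kind", "helpful", "excellent"}
NEGATIVE = {"long", "poor", "rude"}


def text_sentiment(comment):
    """Text-level score: positive word count minus negative word count."""
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def filter_rows(rows, team=None, start=None, end=None, theme=None):
    """Keep rows matching the optional team, date range and theme filters."""
    kept = []
    for row in rows:
        day = date.fromisoformat(row["date"])
        if team is not None and row["team"] != team:
            continue
        if start is not None and day < start:
            continue
        if end is not None and day > end:
            continue
        if theme is not None and row["theme"] != theme:
            continue
        kept.append(row)
    return kept


def theme_sentiment(rows):
    """Theme-level score: mean text-level score of the comments in each theme."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["theme"]].append(text_sentiment(row["comment"]))
    return {theme: sum(vals) / len(vals) for theme, vals in scores.items()}


rows = list(csv.DictReader(io.StringIO(UPLOAD)))
ward_a = filter_rows(rows, team="Ward A", start=date(2022, 1, 1), end=date(2022, 12, 31))
print(theme_sentiment(ward_a))  # {'Staff': 2.0, 'Access': -2.0}
```

In this sketch the theme column stands in for the label assigned by the machine learning model, and the theme-level score is simply the average of the text-level scores within each theme.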

Learning and feedback from phase 1

To evaluate phase 1 of the project, semi-structured interviews with three of the partner trusts were conducted. Participants were asked to consider whether or not the project outputs met objectives 1 and 3, in particular. The key findings were:

Machine learning model

Overall, participants in the evaluation found the algorithm useful, and felt that it could potentially save time and lead to better use of qualitative patient feedback. It was also seen as more sophisticated than the existing commercial solution utilised by one trust. Suggestions for improvements included:

Revising the themes/categories to better align with the needs of different trust types, as the phase 1 data was largely skewed towards community trusts.
More granular thematic tagging (subcategories), as the categories in phase 1 were quite broad.
Support for multiple labels per comment. Phase 1 assigned only one label to each comment, whereas all respondents stated that, in their experience, many comments covered multiple topics. The ability to detect multiple topics in a comment was deemed important functionality for future development of the algorithm.
Performance of criticality/sentiment labelling in phase 1 of the project was not deemed sufficiently reliable to inform decision making.
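The difference between one label per comment and several labels per comment can be shown by how the prediction target is encoded. The sketch below is illustrative only: the theme names are taken from the phase 1 list, while the function name and the example label sets are hypothetical and do not describe how any later phase was implemented.

```python
# Theme names come from the phase 1 list; the example label sets are hypothetical.
THEMES = ["Access", "Communication", "Staff"]


def to_indicator(label_sets, themes=THEMES):
    """Encode each comment's set of themes as one 0/1 flag per theme."""
    return [[int(theme in labels) for theme in themes] for labels in label_sets]


# Single-label (phase 1 style): exactly one theme per comment.
single = [{"Staff"}, {"Access"}]
# Multi-label (the requested change): a comment may carry several themes.
multi = [{"Staff", "Communication"}, {"Access"}]

print(to_indicator(single))  # [[0, 0, 1], [1, 0, 0]]
print(to_indicator(multi))   # [[0, 1, 1], [1, 0, 0]]
```

In the single-label case every row contains exactly one 1, whereas the multi-label encoding allows any number of flags per row, so a comment about both staff and communication no longer has to be forced into one theme.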

Dashboard

The dashboard from phase 1 was easy to use and the searching and presentation of information were commended. Areas for improvement included:

More readable/understandable visualisations, through improvements to the design choices
Inclusion of trend data (showing changes over time) and other data visualisation
Alternative output options e.g. PDF, Power BI
A more usable/intuitive dashboard design

Project management

Interview participants felt that the project was successful overall in meeting its objectives, and expressed a desire to continue their involvement in further development of the solution. In particular, they praised:

The project structure, with its initiation and running from within the NHS
The knowledge and expertise of the project lead

Areas for improvement included:

Clarity around the role of NHSE and increased input to help coordinate expertise
More structure and management for the project, as it was all centred around one individual (the project lead)
Better opportunity for participating partner trusts to interact with each other and to interrogate decisions and processes in the project
Information sharing, with a better balance between detailed technical information and non-technical explanations.