Project development: Additional features

Training the model to take other information into account when processing text

When training machine learning models, a key concept to grasp is that of ‘features’ and ‘targets’. The features are the inputs to the model, and the targets are the outputs that the model is trained to predict. For example, in a model that uses today's atmospheric pressure, humidity, and temperature to predict the chance of rain tomorrow, the atmospheric pressure, humidity, and temperature would be the ‘features’, and the chance of rain tomorrow would be the ‘target’.

Example of features and target for a weather model
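The same split can be sketched in a few lines of Python. This is only an illustration of the weather example: the column names and the numbers are made up for the purpose, and no model is actually trained.

```python
# Illustrative only: the column names and values below are invented.
observations = [
    {"pressure_hpa": 1012.0, "humidity_pct": 80.0, "temperature_c": 14.0,
     "rain_chance_tomorrow": 0.7},
    {"pressure_hpa": 1025.0, "humidity_pct": 45.0, "temperature_c": 21.0,
     "rain_chance_tomorrow": 0.1},
]

# The features are the model inputs; the target is what the model should predict.
FEATURE_NAMES = ["pressure_hpa", "humidity_pct", "temperature_c"]
TARGET_NAME = "rain_chance_tomorrow"

# X holds one list of feature values per observation, y holds the matching targets.
X = [[row[name] for name in FEATURE_NAMES] for row in observations]
y = [row[TARGET_NAME] for row in observations]

print(X)
print(y)
```

By convention the features are collected in `X` and the targets in `y`, which is the shape most machine learning libraries expect.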

In the basic pxtextmining model, the ‘feature’ is the text of the patient feedback comments, whilst the ‘targets’ are the category labels for the text.

Basic pxtextmining model feature and target

However, the meaning of a patient feedback comment can vary depending on the question asked. Take the example below, where the answer “Nothing” has negative connotations when answering the question in Scenario A, but positive connotations in Scenario B.

Scenario A: 
Q: "What went well?" 
A: "Nothing"

Scenario B: 
Q: "What could be improved?" 
A: "Nothing"
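The two scenarios can be expressed as a small Python sketch. The question type labels `what_good` and `could_improve` and both helper functions are hypothetical names chosen for this illustration, not the identifiers used by pxtextmining. The point is only that a model given the text alone receives identical input in both scenarios, so it cannot tell them apart.

```python
# Hypothetical question type labels, used for illustration only.
scenario_a = {"question_type": "what_good", "comment": "Nothing"}
scenario_b = {"question_type": "could_improve", "comment": "Nothing"}


def text_only_features(record):
    """Features when the model sees only the comment text."""
    return (record["comment"].strip().lower(),)


def text_and_question_features(record):
    """Features when the question type is supplied alongside the text."""
    return (record["comment"].strip().lower(), record["question_type"])


# With text alone, both scenarios produce exactly the same input.
same_without_question = (
    text_only_features(scenario_a) == text_only_features(scenario_b)
)

# With the question type added, the two inputs become distinguishable.
distinct_with_question = (
    text_and_question_features(scenario_a)
    != text_and_question_features(scenario_b)
)

print(same_without_question, distinct_with_question)
```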

As a result, we have included “Question type” as one of the features of the pxtextmining model. From the data provided by participating trusts, we have found that the questions asked by NHS trusts in their Friends and Family Test surveys tend to fall into one of three categories:

Adding the question type as an additional feature improved the model's macro F1 score by 0.05. We have incorporated this into the model that is available via the API. When sending comments to be labelled by the model, users of the API must provide the following information in their request:

Final pxtextmining model features and target
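For readers unfamiliar with the macro F1 score mentioned above, the sketch below shows how it can be calculated: an F1 score is computed separately for each category and the scores are then averaged with equal weight, so rare categories count as much as common ones. This is a simplified single-label version written for illustration. The category names and predictions are made up, and it is not the evaluation code used for pxtextmining.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-category F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); treated as 0 when the category never occurs.
        denominator = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Made-up categories and predictions, for illustration only.
true_labels = ["staff", "staff", "access", "environment"]
predicted_labels = ["staff", "access", "access", "environment"]

score = macro_f1(true_labels, predicted_labels)
print(round(score, 3))
```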

Technical details on how we have achieved this are available in this blogpost.