Quick API

To make the models trained in this project easier to use, an API has been created using the FastAPI library. Users can send patient experience feedback comments to the model via the API and receive the predicted labels for those comments.

This API uses the Support Vector Classifier model, which is less performant than the transformer-based DistilBERT model but much quicker and simpler. Performance metrics for this model are available on our project documentation website.

The API is deployed on Posit Connect. The URL is available on request. Full documentation for the API, automatically generated by FastAPI, is available at [API URL]/docs.

How to make an API call

1. Prepare the data in JSON format. In Python, this is a list containing one dict per comment to be classified. Each dict has two compulsory keys:

  • comment_id: Unique ID associated with the comment, in str format. Every comment_id within a single API call must be unique.
  • comment_text: Text to be classified, in str format.
# In Python

text_data = [
    {
        'comment_id': '1',  # The comment_id values in each dict must be unique.
        'comment_text': 'This is the first comment. Nurse was great.',
    },
    {
        'comment_id': '2',
        'comment_text': 'This is the second comment. The ward was freezing.',
    },
    {
        'comment_id': '3',
        'comment_text': '',  # This comment is an empty string.
    },
]
# In R

library(jsonlite)

comment_id <- c("1", "2", "3")
comment_text <- c(
  "This is the first comment. Nurse was great.",
  "This is the second comment. The ward was freezing.",
  ""
)
df <- data.frame(comment_id, comment_text)
text_data <- toJSON(df)
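
Because every comment_id within a call must be unique, it can help to check this before sending anything. The following is a minimal Python sketch of such a check. `build_payload` is a hypothetical helper written for illustration and is not part of this project; it assumes the comments start out as (comment_id, comment_text) pairs and that the text is already a str.

```python
from collections import Counter


def build_payload(pairs):
    """Turn (comment_id, comment_text) pairs into the list-of-dicts payload.

    Hypothetical helper: coerces each ID to str and rejects duplicate IDs.
    """
    payload = [
        {"comment_id": str(comment_id), "comment_text": comment_text}
        for comment_id, comment_text in pairs
    ]
    # Count how often each ID occurs so duplicates can be reported by name.
    counts = Counter(item["comment_id"] for item in payload)
    duplicates = sorted(cid for cid, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"Duplicate comment_id values: {duplicates}")
    return payload


text_data = build_payload([
    (1, "This is the first comment. Nurse was great."),
    (2, "This is the second comment. The ward was freezing."),
    (3, ""),
])
```

The resulting `text_data` has the same shape as the hand-written list in step 1, so it can be passed to the request in the next step unchanged.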

2. Send the JSON containing the text data to the predict_multilabel endpoint. In Python, this can be done using the requests library.

# In Python

import requests

url = "API_URL_GOES_HERE"

response = requests.post(f"{url}/predict_multilabel",
                          json = text_data)
# In R

library(httr)

r <- POST(
  # Append the endpoint to the base URL, as in the Python example.
  url = paste0("API_URL_GOES_HERE", "/predict_multilabel"),
  body = text_data,
  encode = "json",
  add_headers(
    "Content-Type" = "application/json"
  )
)
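
The request in step 2 is a plain HTTP POST with a JSON body. Where the requests library is not available, the same call could be sketched with only the Python standard library, as below. The function names `build_predict_request` and `send_predict_request` are hypothetical, the 60-second timeout is an arbitrary assumption, and the base URL remains a placeholder.

```python
import json
import urllib.request


def build_predict_request(base_url, text_data):
    """Build (but do not send) a POST request for the predict_multilabel endpoint."""
    body = json.dumps(text_data).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict_multilabel",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(request, timeout=60):
    """Send the request and return the decoded JSON response.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (left commented out because the URL is a placeholder):
# request = build_predict_request("API_URL_GOES_HERE", text_data)
# predictions = send_predict_request(request)
```

Separating building from sending makes it possible to inspect the URL, headers and body before any network traffic occurs.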

3. Once the data has been processed and passed through the machine learning model, the predicted labels are returned in the response from the same endpoint, in the example format below. Note that the comment with blank text (comment_id 3) is assigned the label 'Labelling not possible' because it is stripped out during preprocessing.

# In Python

print(response.json())
# Output below
[
  { 'comment_id': '1',
    'labels': ['Non-specific praise for staff'] },
  { 'comment_id': '2',
    'labels': ['Sensory experience'] },
  { 'comment_id': '3',
    'labels': ['Labelling not possible'] }
]
# In R

r_parsed <- fromJSON(content(r, "text", encoding = "UTF-8"))
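
Once the predictions are back, it is often convenient to look up labels by comment_id and to pick out the comments that could not be labelled. The following Python sketch assumes the response has exactly the format shown in step 3; `labels_by_comment_id` is a hypothetical helper, and the `predictions` list is copied from the example output above rather than fetched from the API.

```python
def labels_by_comment_id(predictions):
    """Map each comment_id to its list of predicted labels."""
    return {item["comment_id"]: item["labels"] for item in predictions}


# Copied from the example output above; in practice this would be response.json().
predictions = [
    {"comment_id": "1", "labels": ["Non-specific praise for staff"]},
    {"comment_id": "2", "labels": ["Sensory experience"]},
    {"comment_id": "3", "labels": ["Labelling not possible"]},
]

lookup = labels_by_comment_id(predictions)

# Comments that were stripped out during preprocessing, such as empty strings.
unlabelled = [
    comment_id
    for comment_id, labels in lookup.items()
    if labels == ["Labelling not possible"]
]
print(unlabelled)  # ['3']
```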