HealthX Challenge Statement: Harnessing Speech-To-Text (STT) technology to translate clinical conversations into medical records.

HealthX Call-for-Innovation

Closing date: 25 Sep 2024 11.59pm (SGT)

Current State

Clinical documentation is an integral part of clinical practice. However, the documentation process is often burdensome and time-consuming for clinicians and staff. For instance, during a patient’s health assessment session, the therapist gathers information such as patient history, existing medication, lifestyle changes, exercises, diet, symptoms, and examination findings. This information is then manually entered into the electronic medical record (EMR) system before the therapist formulates an evaluation or finding, after which a treatment plan is recommended.

Another typical scenario is capturing conversations among the care team. Multi-disciplinary rounds are held in which doctors from different disciplines discuss the patients’ conditions and the clinical follow-ups required. Each doctor then completes their documentation in the EMR separately after the discussion.

A similar scenario involving a team of clinicians is resuscitation: a scribe nurse receives information from the rest of the team providing interventions to the patient, writes each intervention on the whiteboard, reads the information back to the team for confirmation, and then documents it in the EMR.

Front-line staff at the contact centre who receive phone calls from patients, next-of-kin or members of the public also need to summarise the content of each call. Depending on the nature of the call, staff enter the summary either into a post-call log in the CRM system or into the EMR.

Challenge Statement

How might we improve healthcare by harnessing Speech-to-Text (STT) technology to translate and summarise clinician-patient and/or clinician-clinician conversations and dictation accurately into medical records, so as to increase clinician efficiency, improve patient care and enhance the overall healthcare delivery process?

What are we looking for?
(to-be state)

The proposed STT solution for this Proof of Value (POV) trial should cover the following three types of use cases (refer to Table 1 below):

Type A will generally be in a clinical setting, in which the solution captures voice commands (primarily in English) from the user to navigate the system hands-free.
Type B will generally be in a clinical setting, in which the Ambient STT solution captures conversations (primarily in English or Mandarin) among clinicians and/or nurses, and creates a summary note in near real-time for review and editing.
Type C will generally be in a clinical or non-clinical environment (e.g. patient’s home, call centre), in which the Ambient STT solution transcribes and captures speech (ideally multilingual: English, Mandarin or Malay) involving clinicians/staff and non-clinical personnel (patient, next-of-kin, etc.), and creates a summary note for review and editing before it is updated into the EMR.

Technology	Voice Control AI & Speech to Text Dictation	Ambient Speech-to-Text AI
Use case types	Type A	Type B	Type C
Use case description	Voice control (navigation)	Clinician & clinician	Clinician & patient
Environment type	Clinical setting	Clinical setting	Clinical or non-clinical setting
Capability	Voice command to navigate and execute software functions; Speech-to-text dictation	Ambient speech recognition & transcription + summarisation; Convert audio conversation into text transcriptions
User type	Clinician(s) only		Clinician & patient
Language	Monolingual (English)		Multilingual (min. English, Mandarin or Malay)
Use cases	Rehab Centre Hand Occupational Therapy	Emergency Department resuscitation; Multi-disciplinary discussion	Rehab Centre Hand Occupational Therapy; Contact Centre

Table 1: Use Case Types. For details of the use cases, please refer to Annex A: Use Cases Summary.

1. The following capabilities of the proposed solutions are the key shortlisting criteria:

i. Ambience and Speech to Text Dictation –

Ability to convert clinician/patient voice conversations accurately and almost instantly into draft medical notes. In addition, the ability to support pure speech-to-text dictation.

ii. Clinical Summary –

Ability to summarise the captured text of healthcare worker/patient conversations and create a summary note in near real-time, with appropriate tagging to the clinical notes. Clinicians can then review and finalise the notes before they are updated into the hospital’s EMR system.

iii. Voice Control Navigation –

Ability to allow users to perform navigation and operational tasks in the Epic system, or in other EMR systems, via voice commands.

iv. Multi-Languages –

Ability to support multiple languages (both English and Mandarin at the minimum). It is an added advantage if the solution can support other mixed languages (e.g. Malay).

v. Medical Vocabularies –

Ability to support in-built medical vocabularies to capture medical terminologies.

vi. Epic Integration –

Solution providers need to have either a proven track record in Epic integration or an explicit commitment to establish Epic integration by the time the solution is actually implemented after the trial. In the latter case, it is the sole responsibility of the solution provider to work directly with Epic to get the integration endorsed by Epic.

vii. AI-Safety –

It is an added advantage if the solution includes AI safety services that detect and block sensitive health information and personally identifiable information.

viii. User Friendliness –

The solution is intuitive for users to use without elaborate setup or configurations.
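
As an illustration of the AI-Safety criterion (vii) above, the following is a minimal Python sketch of rule-based redaction applied to a transcript before it is stored or summarised. It is not part of the challenge requirements. The function name `redact_identifiers`, the placeholder tags and both regular expressions are hypothetical; the patterns assume Singapore-style NRIC/FIN numbers (a prefix letter, seven digits and a suffix letter) and eight-digit local phone numbers, and a real service would need far broader detection, including names, addresses and clinical details.

```python
import re

# Hypothetical patterns for illustration only. They assume:
# - NRIC/FIN-like tokens: one of S, T, F, G, M + 7 digits + 1 letter
# - local phone numbers: 8 digits starting with 6, 8 or 9,
#   optionally written as two groups of four
NRIC_PATTERN = re.compile(r"\b[STFGM]\d{7}[A-Z]\b", re.IGNORECASE)
PHONE_PATTERN = re.compile(r"\b[689]\d{3}\s?\d{4}\b")


def redact_identifiers(transcript: str) -> str:
    """Replace NRIC-like and phone-like tokens with placeholder tags."""
    redacted = NRIC_PATTERN.sub("[NRIC]", transcript)
    redacted = PHONE_PATTERN.sub("[PHONE]", redacted)
    return redacted


if __name__ == "__main__":
    sample = "Patient S1234567D called from 9123 4567 about wrist pain."
    print(redact_identifiers(sample))
```

Pattern matching of this kind only catches identifiers with a fixed format, so it would typically be one layer among several rather than a complete safeguard.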

2. The proposed STT solution for this Proof of Value (POV) trial shall fulfil the following Technical Requirements:

i. The architecture design pattern shall address cybersecurity, data sovereignty and data residency requirements. Solutions must comply with cybersecurity standards, especially where personal and clinical data are involved. (Refer to Annex B: Table of Security and Infrastructure Requirements and Controls for Trial & Pilot Environment).

ii. For STT solutions that use a proprietary Large Language Model (LLM) hosted at the provider’s native location for audio-to-text processing and summarisation, the solution provider may have to demonstrate that the audio and transcript text files do not reside at that location.

3. Overall collaboration requirements:

i. Scalable: The proposed solutions should consider a broader plan to scale across other healthcare institutions and settings, and consider the need to integrate and interface with the various relevant systems.

ii. Cost-effectiveness: The proposed solutions must be cost-effective and beneficial to public healthcare in Singapore.

iii. Implementation Timeline: This POV collaboration is scoped to address the challenge statement, and the trial must be completed within 3 months (including the setup of the trial).

iv. Cost: The proposed solution should include an indicative cost (if applicable) for the trial.
