HealthX Call-for-Innovation
Harnessing Speech-To-Text (STT) technology to translate clinical conversations into medical records. 
 
Closing date: 25 Sep 2024 11.59pm (SGT) 
 

Current State

Clinical documentation is an integral part of clinical practice. However, the documentation process is often burdensome and time-consuming for clinicians and staff. For instance, during a patient’s health assessment session, the therapist gathers information such as patient history, existing medication, lifestyle changes, exercises, diet, symptoms, and examination findings. This information is then manually entered into the electronic medical record (EMR) system before the therapist formulates an evaluation or finding and recommends a treatment plan.

Another typical scenario is capturing conversations within the care team. In multi-disciplinary rounds, doctors from different disciplines discuss the patients’ conditions and the clinical follow-ups required. Each doctor then completes their documentation in the EMR separately after the discussion.

A similar scenario involving a team of clinicians is resuscitation. A scribe nurse receives information from the rest of the team who are providing interventions to the patient, writes each intervention on the whiteboard, reads the information back to the team for confirmation, and then documents it in the EMR.

Front-line staff at the contact centre, who receive phone calls from patients, next-of-kin or members of the public, also need to summarise the content of each call. Depending on the nature of the call, staff enter the summary either into a post-call log in the customer relationship management (CRM) system or into the EMR.

 

Challenge Statement

How might we harness Speech-to-Text (STT) technology to translate and summarise clinician-patient and/or clinician-clinician conversations and dictation accurately into medical records, so as to increase clinician efficiency, improve patient care and enhance the overall healthcare delivery process?

 

What are we looking for?
(to-be state)

The proposed STT solution for this Proof of Value (POV) trial should cover the following 3 types of use cases (refer to Table 1 below):

  • Type A will generally be in a clinical setting, in which the solution captures voice commands (primarily in English) from the user to navigate the system hands-free.
  • Type B will generally be in a clinical setting, in which the Ambient STT solution captures the conversation (primarily in English or Mandarin) among clinicians and/or nurses, and creates a summary note in near real-time for review and editing.
  • Type C will generally be in a clinical or non-clinical environment (e.g. patient’s home, call centre), in which the Ambient STT solution captures and transcribes the speech content (ideally multilingual: English, Mandarin or Malay) involving clinicians/staff and non-clinical personnel (patient, next-of-kin, etc.), and creates a summary note for review and editing before it is updated into the EMR.
Technology
  • Type A: Voice Control AI and Speech-to-Text Dictation
  • Types B and C: Ambient Speech-to-Text AI
Use case description
  • Type A: Voice control (navigation)
  • Type B: Clinician and clinician
  • Type C: Clinician and patient
Environment type
  • Types A and B: Clinical setting
  • Type C: Clinical or non-clinical setting
Capability
  • Type A: Voice commands to navigate and execute software functions; speech-to-text dictation
  • Types B and C: Ambient speech recognition and transcription with summarisation; conversion of audio conversations into text transcriptions
User type
  • Types A and B: Clinician(s) only
  • Type C: Clinician and patient
Language
  • Type A: Monolingual (English)
  • Types B and C: Multilingual (min. English, Mandarin or Malay)
Use cases
  • Type A: Rehab Centre Hand Occupational Therapy
  • Type B: Emergency Department resuscitation; Multi-disciplinary discussion
  • Type C: Rehab Centre Hand Occupational Therapy; Contact Centre

Table 1: Use Case Types. For details of the use cases, please refer to Annex A - Use Cases Summary.

1. The following capabilities of the proposed solution are the key shortlisting criteria:

i. Ambient Capture and Speech-to-Text Dictation –

Ability to convert clinician-patient conversations into accurate draft medical notes almost instantly. In addition, ability to handle pure speech-to-text dictation.

ii. Clinical Summary –

Ability to summarise the captured text of conversations between healthcare workers and patients, and to create a summary note in near real-time with appropriate tagging to the clinical notes. Clinicians must be able to review and finalise the notes before they are updated into the hospital’s EMR system.

iii. Voice Control Navigation –

Ability to let users perform navigation and operational tasks in the Epic system, or in other EMR systems, via voice commands.

iv. Multiple Languages –

Ability to support multiple languages (English and Mandarin at the minimum). It is an added advantage if the solution can support other languages and mixed-language speech (e.g. Malay).

v. Medical Vocabularies –

Ability to capture medical terminology accurately through built-in medical vocabularies.

vi. Epic Integration –

Solution providers need to have either a proven track record in Epic integration or an explicit commitment to establish Epic integration by the time the solution is actually implemented after the trial. For the latter, it is the sole responsibility of the solution provider to work directly with Epic to have the integration endorsed by Epic.

vii. AI-Safety –

It is an added advantage if the solution includes AI safety services that detect and block sensitive health information and personally identifiable information.

viii. User Friendliness –

The solution should be intuitive to use, without elaborate setup or configuration.

 

2. The proposed STT solution for this Proof of Value (POV) trial shall fulfil the following Technical Requirements:

i. The architecture design pattern shall address cybersecurity, data sovereignty and data residency requirements. Solutions must comply with cybersecurity standards, especially where personal and clinical data are involved. (Refer to Annex B: Table of Security and Infrastructure Requirements and Controls for Trial & Pilot Environment).

ii. For STT solutions that use a proprietary Large Language Model (LLM) hosted at the provider’s native location for audio-to-text processing and summarisation, the solution provider may have to demonstrate that the audio and transcript text files do not reside at that native location.

3. Overall collaboration requirements:

i. Scalable: The proposed solution should include a broader plan to scale across other healthcare institutions and settings, and should account for the need to integrate and interface with the relevant systems.

ii. Cost-effectiveness: The proposed solution must be cost-effective and beneficial to public healthcare in Singapore.

iii. Implementation Timeline: This POV collaboration is scoped to address the challenge statement, and the trial must be completed within 3 months (including the setup of the trial).

iv. Cost: The proposal should include an indicative cost (if applicable) for the trial.

 
 
