ISV Training: Introduction to nVoq Dictation

HTTP & WebSocket Dictation

nVoq offers two types of dictation functionality: HTTP and WebSocket.

HTTP dictation works best for back-end transcription and for tightly constrained networking environments that require it. This is the common approach for asynchronous or "batch" dictation, where medical professionals record their audio and submit it for processing once the recording is complete. The transcribed text is then inserted into the medical record system when the audio is fully transcribed.
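
As an illustration, the sketch below submits a completed recording over HTTP in TypeScript. The endpoint path, authentication scheme, and response shape are assumptions made for this sketch, not the actual nVoq API; consult the Dictation APIs reference for the real contract.

```typescript
import { readFile } from "node:fs/promises";

// Batch dictation sketch: upload a finished recording, get the transcript back.
async function transcribeRecording(audioPath: string, apiToken: string): Promise<string> {
  // Read the completed recording from disk.
  const audio = await readFile(audioPath);

  // Hypothetical endpoint and auth scheme, for illustration only.
  const response = await fetch("https://healthcare.nvoq.com/dictation/v1/transcriptions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "audio/wav",
    },
    body: new Uint8Array(audio),
  });

  if (!response.ok) {
    throw new Error(`Dictation request failed: HTTP ${response.status}`);
  }

  // Assumed response shape: the finished transcript returned as JSON.
  const result = (await response.json()) as { text: string };
  return result.text;
}
```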

The WebSocket API provides low-latency access to the dictation service that works well for real-time dictation, where the user receives the corresponding transcription as they speak. The operations are similar to those of the HTTP API, but audio is uploaded in parts while text is sent back to the client concurrently. This provides a great user experience, though the application developer has a bit more to manage in this asynchronous implementation; a sketch follows the list below.

There are a few things to keep in mind when using WebSocket dictations:

  • If no dictation server is available for WebSockets, a user will see a message stating that "the dictation service is temporarily unavailable."
  • If WebSocket-specific whitelisting is needed, wss://healthcare.nvoq.com:443 is the best URL to whitelist.
  • WebSocket dictation does not work if the nVoq Wireless Microphone app is being used as the recording device. Dictations using the nVoq Wireless Microphone app automatically revert to HTTP.
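
To make the flow concrete, here is a minimal TypeScript sketch of a real-time WebSocket session. The URL path and JSON message names are illustrative assumptions; only the wss://healthcare.nvoq.com host comes from the whitelisting note above.

```typescript
// Open the streaming dictation session (hypothetical path).
const socket = new WebSocket("wss://healthcare.nvoq.com/dictation/v1/stream");

socket.addEventListener("open", () => {
  // Hypothetical session-start message; the real handshake is defined
  // by the nVoq WebSocket API.
  socket.send(JSON.stringify({ method: "STARTDICTATION", audioFormat: "audio/wav" }));
});

socket.addEventListener("message", (event) => {
  // Text arrives here concurrently, while audio is still being uploaded.
  console.log("transcript update:", event.data);
});

// Call this from your recorder callback as each audio chunk is captured.
function sendAudioChunk(chunk: ArrayBuffer): void {
  socket.send(chunk);
}

// Signal end of audio so the server can finalize the transcript
// (hypothetical message name).
function stopDictation(): void {
  socket.send(JSON.stringify({ method: "STOPDICTATION" }));
}
```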

Ways to Receive Text

There are also two types of text you can receive: HYPOTHESISTEXT and STABLETEXT.

HYPOTHESISTEXT returns text more quickly, but it may change as the user continues to speak, based on the context of the rest of the dictation.

STABLETEXT returns chunks of finalized text that do not change.
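
A client typically appends STABLETEXT as it arrives and treats the latest HYPOTHESISTEXT as provisional display text. The sketch below assumes a simple JSON message envelope; only the two text-type names come from this article.

```typescript
// Assumed message envelope; only the HYPOTHESISTEXT and STABLETEXT
// names come from the documentation above.
interface TextMessage {
  method: "HYPOTHESISTEXT" | "STABLETEXT";
  text: string;
}

let finalizedText = "";     // accumulated STABLETEXT chunks; never revised
let pendingHypothesis = ""; // latest HYPOTHESISTEXT; may still change

function handleTextMessage(raw: string): void {
  const msg = JSON.parse(raw) as TextMessage;
  if (msg.method === "STABLETEXT") {
    // Finalized text: append it and discard the hypothesis it supersedes.
    finalizedText += msg.text;
    pendingHypothesis = "";
  } else {
    // Provisional text: replace the previous hypothesis outright, since
    // it may be revised as the dictation continues.
    pendingHypothesis = msg.text;
  }
  render(finalizedText + pendingHypothesis);
}

// Render the combined view, e.g. into a text field in your application.
function render(display: string): void {
  console.log(display);
}
```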

Acoustic Adaptation

Regardless of whether you're using HTTP or WebSocket, or HYPOTHESISTEXT or STABLETEXT, we perform acoustic adaptation on each speaker's user profile. That means recognition is tailored to each specific user's voice to improve accuracy. For this reason, we do NOT recommend having multiple speakers dictate on a single account. If multiple speakers share an account, adaptation will adjust to the most recent speaker's voice but will be distorted by its earlier tuning to another speaker, which will degrade dictation accuracy.

Dictation APIs
