Speech-to-text: from cassette tape to AI

Today, a doctor can dictate a medical record and get a fully structured text in seconds, directly integrated with Norwegian EHR systems. But it hasn't always been this way. In the 1980s, Norwegian doctors dictated medical records onto cassette tapes, which were then sent to a typing pool for manual transcription. This was standard practice until digital dictation machines and electronic medical record systems became more widespread in the 1990s.

For those who want to read more about Norwegian hospital history and medical documentation, there are several good resources: Norwegian Museum of Science and Technology - Hospital History, Health History Forum and Store Norske Leksikon – The History of Public Health Services.

1980s: Dictaphone and typing pool

In practice, the process worked as follows: the doctor dictated the medical record onto a cassette, which was then sent to a centralized typing office within the hospital. A secretary would listen to the recording and transcribe the document before it was sent back to the doctor for approval. This method had several challenges: it could take several days for the note to be completed, unclear recordings or medical terms could be misunderstood, and the costs for dedicated staff were high. During busy periods, cassettes could pile up, creating bottlenecks.

1990s: Digital Dictation and Speech-to-Text 1.0

In the 1990s, digital dictation machines were introduced, and recordings could be saved as files, making distribution easier than with cassette tapes. Around 1997, the first commercial speech-to-text systems emerged, such as Dragon NaturallySpeaking, which promised that doctors could dictate directly into a PC without involving a typing pool.

The problem was that these systems required extensive training, often 20–30 minutes of dictation per user, and accuracy was often below 85 %. Medical terms were frequently transcribed incorrectly, so many doctors continued to rely on a secretary or to write their notes manually.
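Accuracy figures of this kind are usually derived from the word error rate (WER): the word-level edit distance between a reference transcript and the system's output, divided by the number of words in the reference. The sketch below is a minimal illustration in Python. The function name and the sample sentences are made up for this article and are not taken from any vendor's evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only: one compound word is split in two.
reference = "the patient has had a headache for two days"
hypothesis = "the patient has had a head ache for two days"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, word accuracy: {1 - wer:.1%}")
```

In this made-up example a single misrecognized word counts as two errors (one substitution, one insertion) against nine reference words, which shows how quickly a few mistakes in medical terms can pull accuracy down.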

2000s: Better recognition and EHR integration

During the 2000s, speech-to-text became increasingly reliable. Systems gained larger medical vocabularies, required less training, and integrated with electronic health record (EHR) systems. In Norway, doctors could now dictate directly into EHR systems such as Gerica, Infodoc Plenario, DIPS, and System X. The technology was still expensive and required manual proofreading, but it laid the groundwork for faster and more accurate documentation.

2010s: Cloud-based speech recognition

Around 2010, cloud-based speech recognition solutions became common. Doctors could use web-based services from a PC, mobile phone, or tablet, without the need for local installation. The systems were updated automatically, sold on a subscription basis, and accessible from multiple devices. Accuracy increased to around 90–95 % for general speech, but medical terminology remained a challenge. At the same time, privacy became a key issue for Norwegian stakeholders.

2020s: AI-powered transcription

Today, modern AI speech-to-text systems, such as Medivox, have made great strides. They are trained on Norwegian medical terminology, can interpret terms such as «type 2 diabetes» in context, recognize drug names correctly, and automatically structure the note into Current, Assessment, and Plan. The systems also offer GDPR-secure processing within Norway/EEA (check terms and conditions for details).

Example:

The doctor says: «The patient has had a headache for two days. No fever. I suspect migraines. Give Paracetamol 1 gram as needed.»

The AI produces a structured note:

Current: Headache for 2 days. No fever.
Assessment: Suspect migraine.
Plan:
- Paracetamol 1 g as needed

This saves the doctor manual formatting and ensures more consistent notes.
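The target format can be thought of as a small data structure with one field per section. The sketch below only illustrates that structure in Python. The class and field names are hypothetical and say nothing about how Medivox or any other product represents notes internally. The hard part, deciding which dictated sentence belongs in which section, is done by hand here.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredNote:
    """Hypothetical container for a Current/Assessment/Plan note."""
    current: str
    assessment: str
    plan: list[str] = field(default_factory=list)

    def render(self) -> str:
        # One labelled line per section, then one "- " line per plan item.
        lines = [
            f"Current: {self.current}",
            f"Assessment: {self.assessment}",
            "Plan:",
        ]
        lines += [f"- {item}" for item in self.plan]
        return "\n".join(lines)


# Sections filled in manually from the dictated example above.
note = StructuredNote(
    current="Headache for 2 days. No fever.",
    assessment="Suspect migraine.",
    plan=["Paracetamol 1 g as needed"],
)
print(note.render())
```

Keeping the sections as separate fields rather than one block of text is what makes consistent formatting possible: every note is rendered by the same template, regardless of how the doctor phrased the dictation.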

How Medivox stands out

Medivox has several strengths, including:

Norwegian medical terminology: diagnoses, medications, and abbreviations
GDPR-secure processing: no data is sent outside the EEA without a legal basis
Integration with Norwegian EHR systems
Offline mode (under development): the AI can run locally on the device

See how it works in practice in their webinar.

Effect on medical record quality

Studies and experience from Norwegian general practitioners indicate that AI-assisted dictation leads to longer and more structured medical records, reduces medication errors, and saves 3–5 minutes per consultation. Nevertheless, the doctor must always review the text and ensure its quality.

The future

Likely next steps include real-time transcription during the consultation, automatic diagnosis coding (ICPC-2/ICD-10), intelligent referral letters, multilingual support with translation into Norwegian, and voice authentication for security.

Conclusion

From cassette tapes in the 1980s to AI-based transcription today, medical speech-to-text has evolved tremendously. The goal is not to replace the doctor, but to free up time that can be spent on patients instead of documentation.