01.02.12
Recognising the future
Source: National Health Executive Jan/Feb 2012
Kate Ashley explores the concepts underlying digital dictation and voice recognition, and asks how far we are willing and able to go.
The newest wave of technology driving innovation and efficiency in the health sector lies with digital dictation and voice recognition systems. While the potential benefits for health service staff are considerable, there is a multitude of issues to consider before we dive in.
These range from the traditional challenges of transcription to teaching machines the rules of a language full of contradictions, one that we still do not fully understand.
The benefits of this digital information revolution are certainly alluring. Data stored on computers affords a higher level of security, reduces paperwork (saving both money and the environment), reduces the possibility of human error and saves copious amounts of time.
Time – that rare resource – means that work can be redistributed, allowing more to get done, more efficiently. Of course this can lead some staff to fear for their jobs: they will not want to ‘train’ their replacements only to be dismissed. And while there is a good argument that there is more than enough work to keep both humans and computers employed (if they learn to share), some of these concerns will undoubtedly become reality. But whenever new technology comes along with promises of making our lives easier, we direct our energies and aspirations into new channels, leaving us with as much work as we had before, if not more.
Some problems affect both forms of transcription: with any dictation, the clearer the speech, the better. This automatically discriminates against those with regional or foreign accents, those who have stammers or similar speech impediments, and even those who are simply shy.
While these are all differences that can be adjusted to, it is worth noting that this technology may not be applicable to the entire workforce. Even those with fairly standard speech will be required to modify their natural language, especially in the case of voice recognition, in order to meet the standards of that technology.
This is the case because very few people regularly speak in the same way as they write Standard English. The differences between the two mediums mean that transcription is much more than a verbatim copy. The process includes sorting relevant information from irrelevant, so that even with voice recognition secretaries turn from typists into editors, checking, rewriting and reformatting the text.
It is hard to imagine how a computer could completely replace this process, with its myriad rules and conventions, and without a sense of context to resolve semantic ambiguities. What would the program do when faced with a homophone? Or, for that matter, a newly coined medical term?
But evidence is growing to show that programs can conduct transcription to a high degree of accuracy. At the cutting edge of voice recognition is work on systems that react intelligently to the data we feed in, learning from our every word and using algorithms to predict which words would be most likely to follow in sequence, in order to check that the text is cohesive.
As impressive as this is, it should be acknowledged that complete accuracy is unattainable, at least using this method of learning.
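To make the word-prediction idea concrete, here is a minimal sketch in Python, not a description of how any commercial dictation product works. It assumes a very simple "bigram" approach: count which word follows which in some training text, then use those counts to choose between two homophones. The corpus, the function names `train_bigrams` and `pick_homophone`, and the example sentences are all invented for illustration.

```python
from collections import Counter, defaultdict

# Toy training text (invented for this sketch); a real system would
# learn from far larger collections of language.
CORPUS = (
    "the patient has no history of asthma . "
    "the patient did not know the dose . "
    "we know the patient has a history of asthma . "
    "there is no known allergy ."
)


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    follows = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows


def pick_homophone(follows, prev_word, candidates):
    """Return the candidate seen most often after prev_word.

    If no candidate was ever seen after prev_word, every count is zero
    and the first candidate is returned, which is an arbitrary guess.
    """
    counts = follows[prev_word]
    return max(candidates, key=lambda w: counts[w])


model = train_bigrams(CORPUS)

# "has" was followed by "no" in the training text, never by "know".
choice_after_has = pick_homophone(model, "has", ["know", "no"])

# "not" was followed by "know" in the training text, never by "no".
choice_after_not = pick_homophone(model, "not", ["know", "no"])

print(choice_after_has, choice_after_not)
```

With this toy corpus the sketch chooses "no" after "has" and "know" after "not". Given a preceding word it has never seen, it can only guess, which is the limitation the rest of this article turns on.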
Imitation can take you so far, but true fluency lies in the quirks of a language, the exceptions to every rule and exceptions to those exceptions in the few instances where they apply. Even if a system had access to huge collections of natural language, it still would not be enough. The program would be able to give an extraordinary performance, but language holds infinite possibilities. There are still so many sentences that have never been uttered, despite the length of time humans have had language and the vast amount of time we regularly engage in speech.
We also break and create new rules at an alarming rate. Developing new vocabulary and modifying meaning happens all over the world as language continues to change. This means that keeping up is beyond the limits of practicality, if not possibility.
Beyond these limits on what a machine can be taught, a larger obstacle to widespread implementation of voice recognition and super-smart computers may be good old-fashioned resistance. Although we can and do promote new practices, humans often dislike change. This lies at the heart of technophobia and explains why many staff may be reluctant to have a machine handle a large part of their work, with minimal intervention by a person.
But should we let our prejudices hold us back from the march of progress? It seems as if this is set to be the future, and while it may make some of us uncomfortable, our society continues to change, as quickly and surely as our language does.