TEXT-TO-SPEECH INPUT NORMALIZATION ALGORITHM FOR MEDICAL PRESCRIPTION
Keywords:
Natural Language Processing, Normalization, Medical Prescription, Text-To-Speech System, Corpus
Abstract
Medical practitioners routinely use noisy text such as abbreviations, alphanumeric strings, and acronyms when writing prescriptions for their patients. The meaning of these tokens usually depends on context and language, which makes them very difficult for a natural language processing system such as text-to-speech to handle. These texts therefore need to be normalized into their clean forms before processing. Existing normalization schemes cannot address this problem: no medical prescription corpus is available, and the schemes focus on social media content, which is not the only place such informal words appear. In this research, we therefore propose a medical prescription corpus and a hybrid text-to-speech input normalization algorithm. Both the benchmark and the proposed algorithm were implemented in Python using the Natural Language Toolkit (NLTK) and Tkinter. The proposed algorithm achieved a BLEU score of 0.8999 (89.99%) against a baseline BLEU score of 0.64 (64%). This is an improvement of 25.99 percentage points over the benchmark scheme, indicating that the proposed algorithm is better suited to normalizing medical prescriptions.
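To make the normalization task concrete, the following is a minimal Python sketch of what expanding prescription shorthand into speakable text might look like. It is not the hybrid algorithm proposed in this paper, whose details the abstract does not give; a plain dictionary lookup is used as a stand-in. The tables ABBREVIATIONS and UNITS and the functions normalize and bleu_gain are hypothetical names introduced for illustration, and the table entries are common prescription shorthand rather than entries from the proposed corpus. The helper bleu_gain only reproduces the arithmetic behind the reported score difference.

```python
import re

# Hypothetical lookup table (an assumption for illustration): common
# prescription shorthand, not entries taken from the paper's corpus.
ABBREVIATIONS = {
    "bid": "twice a day",
    "tid": "three times a day",
    "prn": "as needed",
    "po": "by mouth",
    "tab": "tablet",
}

# Hypothetical unit table for alphanumeric tokens such as "500mg".
UNITS = {"mg": "milligrams", "ml": "millilitres", "g": "grams"}

# Matches a number immediately followed by letters, e.g. "500mg" or "2.5ml".
ALNUM = re.compile(r"^(\d+(?:\.\d+)?)([a-z]+)$")


def normalize(text):
    """Expand known abbreviations and split number+unit tokens into words."""
    out = []
    for token in text.lower().split():
        token = token.strip(".,;")
        if not token:
            continue
        match = ALNUM.match(token)
        if token in ABBREVIATIONS:
            out.append(ABBREVIATIONS[token])
        elif match and match.group(2) in UNITS:
            out.append(f"{match.group(1)} {UNITS[match.group(2)]}")
        else:
            # Unknown tokens pass through unchanged.
            out.append(token)
    return " ".join(out)


def bleu_gain(proposed, baseline):
    """Return (gain in percentage points, gain relative to baseline in %)."""
    absolute = (proposed - baseline) * 100
    relative = (proposed - baseline) / baseline * 100
    return round(absolute, 2), round(relative, 2)


if __name__ == "__main__":
    print(normalize("Paracetamol 500mg tab PO BID"))
    print(bleu_gain(0.8999, 0.64))
```

A context-free lookup like this cannot resolve abbreviations whose meaning depends on context or language, which is the difficulty the abstract identifies. The second value returned by bleu_gain is the gain relative to the baseline score, which is a different quantity from the 25.99-point absolute difference.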