Spelling Error Pattern Analysis of Punjabi Typed Text
Abstract
Error pattern analysis of a language is useful in developing language technology
such as spell checkers and correctors, optical character recognition,
machine translation, and natural language interfaces. It covers the
types of errors (insertion, deletion, transposition, substitution, run-on, and
split-word errors), positional analysis, word-length effects, phonetic errors,
first-position error analysis, and keyboard effects. Though considerable work has been
done in this area for English and related languages, Indian languages present a
more complex and difficult task. In this thesis, I present a statistical error analysis
for Punjabi, the world’s 14th most widely spoken language. For this purpose I
collected about 20,000 misspelled words generated by typists. The application of the error
analysis to designing the suggestion list for a Punjabi spell checker is also discussed.
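
As an illustration of the error types listed in the abstract, the following is a minimal sketch of how a (correct word, typed word) pair might be assigned to one category. It is not the procedure used in the thesis: the function names `classify_error` and `tally` are hypothetical, the comparison works on Unicode code points (so a Gurmukhi vowel sign counts as its own character), and any pair that differs by more than one edit is simply labelled "other".

```python
from collections import Counter


def classify_error(correct: str, typed: str) -> str:
    """Label a single-edit misspelling relative to the intended text.

    Assumption: strings are compared code point by code point.
    """
    if correct == typed:
        return "none"
    # Split-word error: the typist inserted a space inside one word.
    if " " in typed and typed.replace(" ", "") == correct:
        return "split"
    # Run-on error: the typist dropped the space between two words.
    if " " in correct and correct.replace(" ", "") == typed:
        return "run-on"

    n, m = len(correct), len(typed)
    # i is the length of the common prefix, i.e. the first differing position.
    i = 0
    while i < min(n, m) and correct[i] == typed[i]:
        i += 1

    if m == n + 1 and correct[i:] == typed[i + 1:]:
        return "insertion"      # one extra character was typed
    if m == n - 1 and correct[i + 1:] == typed[i:]:
        return "deletion"       # one character was left out
    if m == n:
        if correct[i + 1:] == typed[i + 1:]:
            return "substitution"   # one character was replaced
        if (i + 1 < n
                and correct[i] == typed[i + 1]
                and correct[i + 1] == typed[i]
                and correct[i + 2:] == typed[i + 2:]):
            return "transposition"  # two adjacent characters were swapped
    return "other"


def tally(pairs):
    """Count error categories over an iterable of (correct, typed) pairs."""
    return Counter(classify_error(c, t) for c, t in pairs)
```

Calling `tally` on a list of such pairs gives per-category counts, the kind of frequency table a statistical error analysis starts from. The same first-difference index could also be recorded to study positional and first-position effects.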
