TAIParse Part-of-Speech (POS) Tagger (DOWNLOAD)
We are proud to announce the release of a standalone freeware executable of TAIParse featuring part-of-speech tagging.
A tagger is a necessary component of most text analysis systems, as it assigns a syntactic class (e.g., noun, verb, adjective, adverb) to every word in a sentence.
The tagger produces an output format almost identical to that of the Penn Treebank Project, including bracketing of noun phrases. The current version achieves 94% accuracy in a blind test that we use to assess progress.
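As a rough illustration of working with this kind of output, the Python sketch below reads text in the Penn Treebank tagged style, where each token is written as word/TAG and noun phrases are enclosed in square brackets. It also computes a per-token tag accuracy against a reference tagging. The function names parse_tagged and tag_accuracy and the sample sentence are hypothetical, the exact TAIParse output may differ in detail, and per-token scoring is an assumption about how the accuracy is measured.

```python
def parse_tagged(text):
    """Parse Penn Treebank-style tagged text (assumed format, not verified
    against actual TAIParse output).

    Returns (tokens, noun_phrases), where tokens is a list of (word, tag)
    pairs and noun_phrases is a list of word lists, one per bracketed span.
    """
    tokens = []
    noun_phrases = []
    current_np = None  # words of the noun phrase currently open, if any
    for item in text.split():
        if item == "[":
            current_np = []
        elif item == "]":
            if current_np is not None:
                noun_phrases.append(current_np)
            current_np = None
        else:
            # Split on the LAST slash so tokens such as ",/," or "1\/2/CD"
            # keep their word part intact.
            word, _, tag = item.rpartition("/")
            tokens.append((word, tag))
            if current_np is not None:
                current_np.append(word)
    return tokens, noun_phrases


def tag_accuracy(predicted, gold):
    """Fraction of tokens whose predicted tag equals the reference tag."""
    if len(predicted) != len(gold):
        raise ValueError("token sequences must have the same length")
    if not gold:
        return 0.0
    correct = sum(1 for (_, p), (_, g) in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Invented sample sentence for illustration only.
SAMPLE = ("[ The/DT tagger/NN ] assigns/VBZ [ a/DT class/NN ] "
          "to/TO [ every/DT word/NN ] ./.")

if __name__ == "__main__":
    tokens, noun_phrases = parse_tagged(SAMPLE)
    print(tokens)
    print(noun_phrases)
```

Comparing tagger output with a hand-corrected version of the same text through tag_accuracy yields a figure comparable in kind to the accuracy quoted above, provided both use the same tokenization.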
The tagger has been built manually with general rules and methods.
The entire analyzer definition, in our NLP++ language, is supplied with the download. In contrast to other taggers, which are overtrained for particular document sets and use overly specific rules, this tagger can readily be applied to unseen text types.
Editing, enhancing, and compiling the tagger requires Professional VisualText, which is available automatically with the DOWNLOAD.
We welcome your feedback, questions, and suggestions.
DOWNLOAD TAIParse 0.8 beta, focusing on POS tagging and shallow parsing.
Reference: Tagset used in Penn Treebank.