NLP++ is a new C++-like programming language designed specifically for building deep text analyzers. It features a simple syntax that integrates code, rules, parse tree, and knowledge base data types.
VisualText builds multi-pass and multi-strategy text analyzers, in which each pass is implemented as a pass file containing NLP++ code and rules.
Pass types include tokenization, pattern matching, recursive grammar, and more. Such analyzers are analogous to a set of cascaded Yacc or Bison systems, or to a set of cascaded finite-state automata.
NLP++ lets grammar rules apply only in particular contexts of a parse tree, which keeps analyzers fast and minimizes "leakage", the spurious matching of rules.
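The cascade of passes and the idea of context-restricted rules can be illustrated with a minimal Python sketch. This is not NLP++ syntax and not the VisualText API; the names Node, tokenize_pass, group_pass, and analyze, and the labels ALPHA, NUM, PUNCT, RANGE, and BOUND, are all hypothetical and chosen only for illustration. Each pass rewrites a shared parse tree, and a grouping rule fires only among the children of nodes carrying a given context label.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """A parse tree node: a label, the text of a leaf, and child nodes."""
    label: str
    text: str = ""
    children: list = field(default_factory=list)


def tokenize_pass(root):
    """Tokenization pass: split the root text into word, number, and punctuation leaves."""
    for tok in re.findall(r"[A-Za-z]+|[0-9]+|[^\sA-Za-z0-9]", root.text):
        if tok.isalpha():
            label = "ALPHA"
        elif tok.isdigit():
            label = "NUM"
        else:
            label = "PUNCT"
        root.children.append(Node(label, tok))


def group_pass(context, pattern, new_label):
    """Build a pass that reduces `pattern` (a list of labels) to a single node,
    but only among the direct children of nodes labeled `context`."""
    def run(node):
        for child in node.children:
            run(child)
        if node.label != context:
            return  # outside the required context: the rule cannot fire here
        out, i, n = [], 0, len(pattern)
        while i < len(node.children):
            window = node.children[i:i + n]
            if [c.label for c in window] == pattern:
                out.append(Node(new_label, children=window))
                i += n
            else:
                out.append(node.children[i])
                i += 1
        node.children = out
    return run


def analyze(text, passes):
    """Run the passes in order over one shared parse tree."""
    root = Node("ROOT", text)
    for apply_pass in passes:
        apply_pass(root)
    return root


passes = [
    tokenize_pass,
    group_pass("ROOT", ["NUM", "PUNCT", "NUM"], "RANGE"),
    group_pass("RANGE", ["NUM"], "BOUND"),
]
tree = analyze("pages 10-20 cost 5", passes)
print([c.label for c in tree.children])
print([c.label for c in tree.children[1].children])
```

In this sketch the last pass relabels numbers only inside a RANGE node, so the standalone number 5 at the top level is left untouched. That is the sense in which restricting a rule to a context limits spurious matches.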
NLP++ can run interpreted or compiled.
In VisualText, NLP++ is interpreted to support fast edit-and-test development cycles for rapid prototyping of analyzers. For deployment, NLP++ can be compiled, which typically makes text analysis about three times faster.