Orekhov/SentenceBreaking

Simple application for sentence boundary disambiguation.

Main idea:
The file 'separators' lists all valid separators. The file 'filters'
contains patterns that define what is a sentence boundary. The file 'exclusions'
contains patterns that define what is not a sentence boundary.
The application reads the file input.txt and shows every boundary it finds.


File 'separators' structure:
Each line of the file is a regular expression that defines a valid
separator. Compound separators must come before the simpler separators they contain.
(For example, '?!' is a compound of '?' and '!', so it must be listed before both.)
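The ordering rule can be illustrated with a minimal sketch. It is written in Python rather than .NET, and it assumes the separators are tried in file order at each position; the function name find_separators and the sample patterns are hypothetical and not taken from the application.

```python
import re

def find_separators(text, patterns):
    """Return (position, matched text) for each separator found in text.

    Assumption: patterns are tried in the order given, which is modelled
    here by joining them into one alternation (leftmost alternative wins).
    """
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    return [(m.start(), m.group()) for m in combined.finditer(text)]

# Compound separator '?!' listed first, as the file format requires.
good_order = [r"\?!", r"\?", r"!", r"\."]
# Same patterns with '?!' listed last: it can never match.
bad_order = [r"\?", r"!", r"\?!", r"\."]

text = "Really?! Yes."
print(find_separators(text, good_order))  # [(6, '?!'), (12, '.')]
print(find_separators(text, bad_order))   # [(6, '?'), (7, '!'), (12, '.')]
```

With the wrong order, '?!' is split into two separate separators, which is why compound separators have to come first.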

File 'filters' structure:
Each line of the file is a regular expression that defines what is a sentence boundary.

File 'exclusions' structure:
Each line of the file is a regular expression that defines what is not a sentence boundary.
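How the three pattern lists could work together is sketched below in Python. This is only one possible reading: it assumes a separator is reported as a boundary when it overlaps a match of some filter and overlaps no match of any exclusion. The function names and the sample patterns are hypothetical, and the real application uses .NET regular expressions, whose syntax differs from Python's in some details.

```python
import re

def spans(text, patterns):
    """Collect the (start, end) span of every match of every pattern."""
    return [m.span() for p in patterns for m in re.finditer(p, text)]

def overlaps(span, others):
    """True if span shares at least one character with any span in others."""
    start, end = span
    return any(s < end and start < e for s, e in others)

def sentence_bounds(text, separators, filters, exclusions):
    """Return the end offset of every separator accepted as a boundary.

    Assumed rule: accept a separator if a filter match covers it and
    no exclusion match covers it.
    """
    sep_re = re.compile("|".join(f"(?:{p})" for p in separators))
    allowed = spans(text, filters)
    denied = spans(text, exclusions)
    return [
        m.end()
        for m in sep_re.finditer(text)
        if overlaps(m.span(), allowed) and not overlaps(m.span(), denied)
    ]

# Hypothetical contents of the three files, one pattern per line.
separators = [r"\?!", r"\?", r"!", r"\."]
filters = [r"[.?!]+\s+[A-Z]", r"[.?!]+$"]   # before a capital, or at end of text
exclusions = [r"\b(?:Mr|Dr)\."]              # abbreviations do not end a sentence

text = "Dr. Smith arrived. Was he late?! No."
print(sentence_bounds(text, separators, filters, exclusions))  # [18, 32, 36]
```

The period after "Dr" satisfies a filter but is also covered by an exclusion, so it is not reported.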

Regular expression syntax:
All patterns use the Regular Expression Language of .NET Framework 4.5.
