Computing at Glasgow University
Paper ID: 8114
DCS Tech Report Number: TR-2005-209

Extracting Data from Personal Text Messages

Publication Type: Tech Report (internal)
Appeared in: DCS Technical Report Series
Page Numbers: 1-23
Publisher: Dept of Computing Science, University of Glasgow
Year: 2005

We present an approach to Information Extraction (IE) from short text (SMS) messages and electronic mail, restricted to the task of extracting data that can be added to a specified data repository. Whereas most IE systems start from a syntactic analysis of the message, our software generates possible sentence structures from the database metadata and then pattern-matches these structures against the input text. This technique finds new data and generates update statements that can be used to add the new data to the repository. The paper describes an initial version of a component that handles several kinds of sentence, as well as anaphoric references and synonyms.
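
The metadata-driven idea in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the schema, the synonym table, the two sentence templates, and the function names build_patterns and extract_updates are all hypothetical, chosen only to show how sentence patterns might be generated from metadata, matched against a message, and turned into parameterised update statements. Anaphoric references are not handled here.

```python
import re

# Hypothetical metadata for one table: its name, key column and data columns.
SCHEMA = {"table": "person", "key": "name", "columns": ["phone", "email", "address"]}

# Hypothetical synonyms: words in a message that may stand for a column.
SYNONYMS = {
    "phone": ["phone", "number", "mobile"],
    "email": ["email", "e-mail"],
    "address": ["address"],
}


def build_patterns(schema, synonyms):
    """Generate (column, compiled regex) pairs from the metadata."""
    patterns = []
    for column in schema["columns"]:
        words = "|".join(re.escape(w) for w in synonyms.get(column, [column]))
        templates = [
            # e.g. "Anna's new mobile is 07700900123"
            rf"\b(?P<key>\w+)'s (?:new )?(?:{words}) is (?P<value>\S+)",
            # e.g. "the email of Bob is bob@example.org"
            rf"\bthe (?:new )?(?:{words}) of (?P<key>\w+) is (?P<value>\S+)",
        ]
        for template in templates:
            patterns.append((column, re.compile(template, re.IGNORECASE)))
    return patterns


def extract_updates(message, schema=SCHEMA, synonyms=SYNONYMS):
    """Match generated patterns against a message; return (sql, params) pairs."""
    updates = []
    for column, pattern in build_patterns(schema, synonyms):
        for match in pattern.finditer(message):
            # Drop sentence punctuation that the \S+ capture may have absorbed.
            value = match.group("value").rstrip(".,;!?")
            sql = f"UPDATE {schema['table']} SET {column} = ? WHERE {schema['key']} = ?"
            updates.append((sql, (value, match.group("key"))))
    return updates


if __name__ == "__main__":
    for sql, params in extract_updates("Anna's new mobile is 07700900123."):
        print(sql, params)
```

In this sketch the patterns are derived from the schema rather than from a parse of the message, so adding a column or a synonym extends the set of recognised sentences without any change to the matching code. Values containing spaces, such as postal addresses, would need a richer value pattern than the single-token capture used here.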

Keywords: Information Extraction, Text Mining, E-mail, SMS Text

