What is Text Mining? by Rei Morikawa
Text mining, also called text data mining, is the process of deriving high-quality information from written natural language. Here, high-quality information means information that is new, relevant, and of interest for the project at hand. The data that we generate through e-mails, Word documents, PDF files, and text messages is written in natural language, but it isn’t typically stored in a structured format. Text mining is the process that we use to extract insights and patterns from that unstructured data.
For example, a simple text mining task is to scan a set of documents written in natural language and then do one of two things with them: model the documents for predictive classification, or populate a clean database with the extracted information.
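To make the second option concrete, here is a minimal sketch of scanning a few documents and populating a small database with what was extracted. Everything in it is an assumption chosen for illustration: the sample documents, the choice of e-mail addresses and ISO-style dates as the fields of interest, the regular expressions, and the table name `mentions`. Real projects usually need more robust extraction than two regular expressions.

```python
import re
import sqlite3

# Hypothetical sample documents; in practice these would be read from files.
documents = [
    "Invoice from alice@example.com dated 2021-03-14 for consulting.",
    "Reminder: bob@example.org has a meeting on 2021-04-01.",
    "No contact details in this note.",
]

# Deliberately simple patterns (assumptions, not production-grade validators).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_record(doc_id, text):
    """Return (doc_id, email, date); a field is None when nothing matches."""
    email = EMAIL.search(text)
    date = DATE.search(text)
    return (
        doc_id,
        email.group() if email else None,
        date.group() if date else None,
    )


def build_database(docs):
    """Scan each document and store the extracted fields in an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mentions (doc_id INTEGER, email TEXT, date TEXT)")
    conn.executemany(
        "INSERT INTO mentions VALUES (?, ?, ?)",
        [extract_record(i, text) for i, text in enumerate(docs)],
    )
    conn.commit()
    return conn


conn = build_database(documents)
rows = conn.execute(
    "SELECT doc_id, email, date FROM mentions ORDER BY doc_id"
).fetchall()
for row in rows:
    print(row)
```

Once the extracted fields sit in a table, the unstructured text can be queried like any other structured data, for example by selecting every document that mentions a given date.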