Text Mining
Text Mining is the discovery by computer of new, previously unknown
information, by automatically extracting information from different written
resources. A key element is the linking together of the extracted information
to form new facts or new hypotheses to be explored further by more
conventional means of experimentation. Text mining is different from what we are
familiar with in web search. In search, the user is typically looking for
something that is already known and has been written by someone else. The
problem is pushing aside all the material that currently is not relevant to
your needs in order to find the relevant information. In text mining, the goal
is to discover unknown information, something that no one yet knows and so could
not have yet written down.
The limits of machine intelligence are a problem for text mining. Natural language has
developed to help humans communicate with one another and record information.
Computers are a long way from comprehending natural language. Humans can
distinguish and apply linguistic patterns to text, and they easily overcome
obstacles that computers handle poorly, such as slang,
spelling variations and contextual meaning. However, although our language
capabilities allow us to comprehend unstructured data, we lack the computer’s
ability to process text in large volumes or at high speeds. The accompanying
figure depicts a generic process model for a text mining application.
Starting with a collection of documents, a text mining tool would retrieve a
particular document and preprocess it by checking format and character sets.
Then it would go through a text analysis phase, sometimes repeating techniques
until information is extracted. Three text analysis techniques are shown in the
example, but many other combinations of techniques could be used depending on
the goals of the organization. The resulting information can be placed in a
management information system, yielding an abundant amount of knowledge for the
user of that system.
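
The process described above can be sketched in a few lines of Python. This is only a minimal illustration, not the tool or the figure from the abstract: the function names, the tiny stop-word list, and the choice of three analysis steps (tokenizing, stop-word removal, term counting) are all assumptions made for the example. A real application would pick techniques to suit the goals of the organization.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for"}


def preprocess(raw: bytes) -> str:
    """Check the character set and normalize whitespace."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fall back to Latin-1, which accepts any byte sequence.
        text = raw.decode("latin-1")
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list:
    """Analysis step 1 (assumed): split text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens: list) -> list:
    """Analysis step 2 (assumed): drop very common words."""
    return [t for t in tokens if t not in STOP_WORDS]


def top_terms(tokens: list, n: int = 2) -> list:
    """Analysis step 3 (assumed): keep the n most frequent terms."""
    return Counter(tokens).most_common(n)


def mine(documents: dict, n: int = 2) -> dict:
    """Run every document in the collection through the whole pipeline."""
    results = {}
    for name, raw in documents.items():
        text = preprocess(raw)
        tokens = remove_stop_words(tokenize(text))
        results[name] = top_terms(tokens, n)
    return results


if __name__ == "__main__":
    collection = {"doc1": b"Text mining finds patterns.  Mining text is mining."}
    print(mine(collection))
```

The dictionary returned by mine() stands in for the extracted information that would be passed on to a management information system. Counting terms does not discover new knowledge on its own; it only shows where each stage of the process fits.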