Towards an intelligent process for collecting, analyzing and using information from multiple sources, based on an agent model and a data fusion framework
Nowadays the activities of most companies rely on large volumes of electronic documentation stored in internal and external data sources. Monitoring relevant data is a special type of information-analytical work that makes it possible to collect diverse business information and use it for management purposes. Developing a monitoring system raises several challenges. The first and most significant is that the vast amount of information on the Internet makes it difficult to find and select the information that is actually needed; raw, non-aggregated and unverified data cannot provide quality decision support. The second is that information on the Internet is inherently dynamic: it is posted, modified and deleted. The third is the automatic extraction of relevant data from textual information. The fourth is the need to identify non-obvious patterns and connections by processing multiple sources. Thus, the development of intelligent monitoring systems remains an open problem.
The conceptual foundations for building monitoring systems in a distributed information space for decision-making support are considered. Because traditional monitoring systems focus on collecting data from internal sources, it is proposed to supplement the existing approach with a monitoring system based on fusing information from internal and external sources. A conceptual model for monitoring up-to-date data is proposed. Three components supporting the monitoring process are identified: a source search model, a data extraction model, and a model for evaluating the information obtained. A monitoring framework is proposed that is based on collecting three types of information: information on performance, contained in internal information sources; information on the state of the external business environment, publicly available on the Internet; and information on the effectiveness of activities as reflected in the external information space.
Applying category theory to represent the content of the information space makes it possible to operate on heterogeneous objects using common mechanisms and tools. This approach is suited to developing general rules and implementing them for arbitrary types of content. Formalized descriptions are given of the information retrieval and collection processes, a data source model, a web page data collection model, a model for integrating information obtained from different data sources, and a model for extracting and identifying knowledge based on text processing.
The basic principles of creating a multi-agent system for searching, extracting and interpreting information are considered. These tasks are decomposed into interactions among agents. A formal agent architecture is proposed, which represents the agent function on the basis of the agent's mental model, given by a comparator. A multilayer agent model is proposed for creating intelligent monitoring systems that perform management tasks, based on an ontological description of the processes of searching for, collecting and evaluating relevant data. A prototype software system for monitoring up-to-date data has been developed following the agent-oriented paradigm and using open source software.
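The decomposition of search, extraction and evaluation into cooperating agents can be illustrated with a minimal Python sketch. It is not the prototype described above: the class names (SearchAgent, ExtractionAgent, EvaluationAgent), the in-memory SOURCES corpus, the price pattern and the ranking rule are all hypothetical stand-ins chosen only to show how the three tasks might hand results to one another.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    source: str
    text: str


class SearchAgent:
    """Selects documents that mention the query (stand-in for real source search)."""

    def __init__(self, sources):
        self.sources = sources  # hypothetical in-memory corpus: {source id: text}

    def run(self, query):
        q = query.lower()
        return [Document(s, t) for s, t in self.sources.items() if q in t.lower()]


class ExtractionAgent:
    """Extracts one (source, price) fact per document using an assumed 'N EUR' pattern."""

    PRICE = re.compile(r"(\d+(?:\.\d+)?)\s*EUR")

    def run(self, documents):
        facts = []
        for doc in documents:
            match = self.PRICE.search(doc.text)
            if match:
                facts.append((doc.source, float(match.group(1))))
        return facts


class EvaluationAgent:
    """Ranks extracted facts; here simply by ascending price (an assumed criterion)."""

    def run(self, facts):
        return sorted(facts, key=lambda fact: fact[1])


def monitor(query, sources):
    """Chains the three agents: search, then extraction, then evaluation."""
    documents = SearchAgent(sources).run(query)
    facts = ExtractionAgent().run(documents)
    return EvaluationAgent().run(facts)


# Hypothetical marketplace snippets used only for illustration.
SOURCES = {
    "shop-b.example/item9": "Laptop bag leather, price 80 EUR",
    "shop-c.example/item3": "Garden hose, price 20 EUR",
    "shop-a.example/item1": "Laptop bag, price 35 EUR",
}

ranked = monitor("laptop bag", SOURCES)
```

Each agent exposes the same run method, so an agent could be replaced (for example, by one that queries a real data source) without changing the others.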
Experiments testing and implementing the search, collection and extraction of data were carried out based on the proposed concepts. The information extraction methods were tested on data collected from electronic marketplaces. The experiments showed that processing textual information and determining a set of keywords describing a group of similar products makes it possible to automate the clustering of information objects based on linguistic processing of their textual descriptions. The results of experiments on information retrieval, extraction of facts from textual information, formation of tolerance classes of information objects, etc. are discussed.
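As a rough illustration of grouping products by the keywords in their descriptions, the Python sketch below builds tolerance neighborhoods under a keyword-overlap relation. The abstract does not specify the relation actually used, so the Jaccard measure, the 0.5 threshold, the stopword list and the sample descriptions are assumptions. The relation is reflexive and symmetric but not necessarily transitive, and the sketch computes each object's neighborhood rather than maximal tolerance classes.

```python
import re

# Assumed minimal stopword list; a real system would use proper linguistic processing.
STOPWORDS = {"the", "a", "an", "of", "for", "with", "and", "in"}


def keywords(description):
    """Lowercase the text, split it into alphanumeric tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", description.lower())
    return {t for t in tokens if t not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity of two keyword sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tolerance_neighborhoods(descriptions, threshold=0.5):
    """For each object, return the indices of all objects tolerant to it."""
    kw = [keywords(d) for d in descriptions]
    n = len(kw)
    return [
        {j for j in range(n) if jaccard(kw[i], kw[j]) >= threshold}
        for i in range(n)
    ]


# Hypothetical product descriptions used only for illustration.
products = [
    "Red cotton t-shirt for men",
    "Cotton t-shirt for men, blue",
    "Stainless steel kitchen knife",
    "Kitchen knife, stainless steel blade",
]

neighborhoods = tolerance_neighborhoods(products)
```

With these sample descriptions the two t-shirts fall into one neighborhood and the two knives into another; raising the threshold makes the relation stricter and the neighborhoods smaller.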
Room K71 – 11:00