Today, creating an integrated information space is important not only for individual enterprises
but also for systems of cooperating enterprises and, in the long run, for all enterprises. Creating
such an extensive information space obviously imposes special requirements on integration technologies.
This paper addresses the requirements that a data integration system must meet in order to
integrate an unrestricted, constantly growing number of information systems, and the ways
of fulfilling these requirements.
Introduction
An integrated information space can be created either at the user interface level or at the
data level. In the first case, all integrated systems are accessible through one window, such as a
web browser. However, this approach integrates the systems at the level of information presentation,
not at the level of the information itself. Data from different systems remain insufficiently connected,
which restricts users' ability to process them.
The second approach, integration at the data level, is more advanced
since it removes these restrictions on data processing. Under this approach, an
integrated data space is created and used to solve complex tasks: constructing
a unified user interface to all systems with no seams among them at the information level, creating
general data processing algorithms, etc. The integrated data space can be used as if
one were working with a single system, whereas in fact several information systems are used at the same time.
Two categories of approaches to data integration are met in practice: analytical
and transactional. Analytical approaches imply the creation of data warehouses for data analysis
purposes, in particular by means of ETL and ELT tools. This enables high-performance analytical
data processing, but the information space created in this way is incomplete: integrated data
cannot be modified with immediate reflection in the source data.
Transactional approaches are more complete in this respect since they allow both data
analysis and data modification. There are two variations of these approaches: with and
without an integrated schema. When an integrated schema exists, data are modified
where they are actually stored, but through an integrated interface that logically combines several
schemas, as a rule relational, into one. Differences among the systems are thus erased, and data
are processed through the single integrated [relational] schema as if a single system were used [DB2UD, DVDP].
In the case of integration without such a schema, interactions between systems have to be implemented
explicitly at the level of each system separately [ODII]. Here, to process
data, one must master the structure of the relational schemas of every system being integrated;
moreover, the information connections between the systems are not explicit and are known only to the developers.
Obviously, the absence of an integrated schema complicates the integration process
significantly, since it requires knowing the nuances of data organization in every integrated
system and provides no general access interface to all the information.
Thus, only transactional solutions that maintain integrated schemas allow full-blown integration of
information systems in a sufficiently efficient way, whereas analytical solutions do not allow direct
data modification without additional complicated programming, and transactional
solutions without integrated schemas do not allow working with data as a single whole. In what follows,
we focus only on transactional solutions with integrated schemas.
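
To make the idea of a transactional solution with an integrated schema more concrete, the
following is a minimal sketch, not the mechanism used by the systems cited above. It assumes
that two SQLite databases attached to one connection stand in for two independent information
systems; the names hr, sales, employee, orders and employee_orders are hypothetical and chosen
only for illustration. A temporary view plays the role of the integrated schema, and an
INSTEAD OF trigger routes a modification made through that view to the tables where the
data are actually stored.

```python
import sqlite3


def build_integrated_space():
    """Return a connection exposing two 'systems' through one integrated view."""
    conn = sqlite3.connect(":memory:")

    # Assumption: each attached in-memory database models a separate system.
    conn.execute("ATTACH DATABASE ':memory:' AS hr")
    conn.execute("ATTACH DATABASE ':memory:' AS sales")

    # Source schemas, owned by the individual systems (hypothetical names).
    conn.execute("CREATE TABLE hr.employee (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE sales.orders ("
        "id INTEGER PRIMARY KEY, employee_id INTEGER, amount REAL)"
    )
    conn.execute("INSERT INTO hr.employee VALUES (1, 'Ann')")
    conn.execute("INSERT INTO sales.orders VALUES (10, 1, 99.5)")

    # Integrated schema: a TEMP view may reference tables in any attached database.
    conn.execute(
        """
        CREATE TEMP VIEW employee_orders AS
        SELECT e.id AS employee_id, e.name AS name,
               o.id AS order_id, o.amount AS amount
        FROM hr.employee AS e
        JOIN sales.orders AS o ON o.employee_id = e.id
        """
    )

    # Modifications made through the view are applied to the source tables.
    # Inside a trigger body SQLite requires unqualified table names; for a
    # TEMP trigger they are resolved across all attached databases.
    conn.execute(
        """
        CREATE TEMP TRIGGER employee_orders_update
        INSTEAD OF UPDATE ON employee_orders
        BEGIN
            UPDATE employee SET name = NEW.name WHERE id = OLD.employee_id;
            UPDATE orders SET amount = NEW.amount WHERE id = OLD.order_id;
        END
        """
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_integrated_space()
    # The client works with one schema and does not address hr or sales directly.
    conn.execute(
        "UPDATE employee_orders SET name = 'Anna', amount = 120.0 WHERE order_id = 10"
    )
    conn.commit()
    print(conn.execute("SELECT name FROM hr.employee").fetchall())
    print(conn.execute("SELECT amount FROM sales.orders").fetchall())
```

In this sketch the client issues a single UPDATE against employee_orders, and the change is
immediately reflected in both source tables. Without the view and the trigger, the same change
would require knowledge of both source schemas and two explicit, separately written statements.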