Donetsk National Technical University

Faculty of Computer Information Technologies and Automation

Department of "Automatic Control Systems"


Abstract

Subject of master’s work: "Dynamic optimization of data allocation among network sites"

Baranova Svetlana


Introduction

      Distributed databases (DDBs) have become widely used as organizations look for better ways to organize and process large volumes of data. The main advantages of distributed databases include the following:

  • A DDB can mirror the structure of the organization.

  • Data availability increases.

  • If data is placed on the node where it is used most heavily, deploying a DDB can increase the speed of access to the database.

  • In a distributed environment, extending an existing system is much simpler: adding a new node to the network does not affect the operation of the existing ones.

      Database management systems have now reached a level of development at which sufficiently mature and reliable commercial systems are available on the market. There remains, however, the problem of increasing the operating efficiency of distributed databases. The key factor that degrades DDB performance is that communication networks covering large territories remain slow. The primary task in distributed systems is therefore to minimize use of the network, that is, to minimize the volume of data transferred or the time needed to transfer it.

          The purpose of the work is to increase DDB performance by optimizing data allocation among the sites of a computer network. This involves the following tasks:
  • Study how data distribution technologies (replication and queries) are implemented in the Oracle DBMS.
  • Develop a model of DDB operation under the Oracle DBMS.
  • Define the set of DDB operating parameters required to determine how efficiently the database operates.
  • Modify the algorithm for optimizing data distribution among sites.


          The scientific novelty of the work consists in the following:
  • Having studied the specifics of database operation in the Oracle DBMS (update propagation and query execution) and identified the DDB parameters that can be collected or obtained with existing software, develop a mathematical model of the optimal distribution of files among the sites of a computer network that minimizes the total time of query execution and update propagation. The model must take into account such DDB features as fragmentation and replication.

  • Because raw data is essential when making decisions about a database in a particular DBMS, develop a tool for collecting statistics on the processes that occur while the DDB operates (the timing parameters of update propagation and query execution).

  • Develop a modification of the algorithm that finds an optimal distribution of data among the sites of the computer network.

    Review of the literature on the theme "Dynamic optimization of data allocation among sites"

          The problem of optimizing data distribution has been studied fairly extensively. It is the subject of the thesis of my supervisor, A. Telyatnikov, who developed an object model for dynamic optimization of data distribution. The idea is to combine a genetic algorithm (GA) with an object model of the DDB. A scheme for distributing data fragments among the DDB sites is encoded as a set of chromosomes. During optimization, the GA operators generate chromosomes, that is, data distribution schemes. These schemes serve as input to the object model, which computes the values of the DDB efficiency criterion. Those values, in turn, are the values of the GA fitness function for the given candidate solution. The input to modeling and optimization is statistics on query usage and update propagation, and the result is a modified data distribution scheme. The efficiency criterion is the minimum total average time of query execution and update propagation.
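          The encoding step can be sketched as follows. This is a minimal illustration, not the encoding used in the thesis: it assumes a binary allocation matrix (fragment by site) flattened into one chromosome, with one-point crossover and bit-flip mutation as the GA operators. The sizes and all function names are hypothetical.

```python
import random

# Hypothetical sizes for illustration: 3 fragments, 2 sites.
N_FRAGMENTS = 3
N_SITES = 2

def encode(scheme):
    """Flatten an allocation matrix (fragment x site, 0/1) into a chromosome."""
    return [bit for row in scheme for bit in row]

def decode(chromosome):
    """Rebuild the allocation matrix from a flat chromosome."""
    return [chromosome[i * N_SITES:(i + 1) * N_SITES] for i in range(N_FRAGMENTS)]

def crossover(parent_a, parent_b, point):
    """One-point crossover: swap the tails of two chromosomes at `point`."""
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]

# scheme[f][s] == 1 means fragment f has a copy on site s (assumed convention).
scheme_a = [[1, 0], [0, 1], [1, 1]]
scheme_b = [[0, 1], [1, 0], [0, 1]]
child_1, child_2 = crossover(encode(scheme_a), encode(scheme_b), point=2)
mutated = mutate(child_1, rate=0.1, rng=random.Random(42))
```

          In a full optimizer, each decoded scheme would be passed to the object model to obtain its fitness value; that step is left out here because it depends on the model itself.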
    The DDB optimization task is stated as follows: find the data distribution scheme under which the total average time of query execution and update propagation, for the queries and updates generated by the running system, is minimal.
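          One possible way to write this statement symbolically is given below. The notation is an assumption made for illustration and does not come from the cited thesis: X denotes a data distribution scheme, and the two terms are the total average query execution time and the total average update propagation time under X. Adding the two terms is one reading of "total".

```latex
\min_{X}\; T(X) = \bar{T}_{q}(X) + \bar{T}_{u}(X)
```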
          The query execution and update propagation times are computed by means of the DDB object model. To implement it, classes were designed that describe the standard DDB components: site, data communication channel, application, and query. When calculating the optimization criterion, the following restrictions must be taken into account:

    1. At least one copy of each data fragment must be present in the DDB.
    2. The total size of the data stored on a site must not exceed the total disk space of that site.
    3. The maximum query running time must not exceed a given limit.
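          The three restrictions above can be checked for a candidate scheme roughly as follows. This is a sketch under stated assumptions: the first restriction is read as "every fragment has at least one copy somewhere", the scheme is a 0/1 matrix (fragment by site), and the query times are supplied by the caller rather than computed by the object model. All names are hypothetical.

```python
def is_feasible(scheme, fragment_sizes, disk_space, query_times, time_limit):
    """Return True if a distribution scheme satisfies all three restrictions.

    scheme[f][s] == 1 means fragment f has a copy on site s (assumed convention).
    """
    # Restriction 1 (as interpreted here): each fragment is stored at least once.
    every_fragment_stored = all(any(row) for row in scheme)

    # Restriction 2: data stored on each site fits into that site's disk space.
    fits_on_disk = all(
        sum(size for row, size in zip(scheme, fragment_sizes) if row[site])
        <= disk_space[site]
        for site in range(len(disk_space))
    )

    # Restriction 3: no query runs longer than the given limit.
    fast_enough = all(t <= time_limit for t in query_times)

    return every_fragment_stored and fits_on_disk and fast_enough

# Illustrative values only.
scheme = [[1, 0], [0, 1], [1, 1]]
ok = is_feasible(scheme, [10, 20, 30], [50, 50], [1.5, 2.0], 3.0)
```

          A GA could use such a check either to reject infeasible chromosomes or to add a penalty to their fitness; the source does not say which approach is used.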

          This model is dynamic and takes data fragmentation and replication into account, so it meets modern requirements. However, it does not take into account the specifics of how distributed processing is implemented in a particular DBMS. We therefore consider one more approach to DDB modeling: the model proposed by I. Ahmad.

          Under this approach, optimizing data distribution is treated as a complex problem. It is assumed that there is a mutual dependency between the data distribution scheme (which gives the location of each fragment on the sites of the database) and the query optimization strategy (which decides how a query can be executed optimally under a given allocation scheme). The model introduces the notion of a fragment dependency graph, which models the relations between fragments and the amount of data that must be transferred to execute the queries. Besides the graph, the algorithm takes the following input parameters: the costs of data transfer between sites, the limits on the number of fragments allocated to each site, and the frequency with which queries are executed from each site. The result is a modified data distribution scheme.
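          To make the inputs concrete, the sketch below evaluates one placement against a small dependency graph. It is a simplified illustration, not the cost function of the cited model: each edge is assumed to carry a data volume and an execution frequency, each fragment is placed on exactly one site, and the cost of an edge is frequency times volume times the unit transfer cost between the two sites. All names and numbers are hypothetical.

```python
from collections import Counter

def transfer_cost(edges, placement, unit_cost):
    """Sum frequency * volume * unit cost over all dependency edges."""
    total = 0
    for src, dst, volume, frequency in edges:
        total += frequency * volume * unit_cost[placement[src]][placement[dst]]
    return total

def within_site_limits(placement, max_fragments):
    """Check the per-site limit on the number of allocated fragments."""
    counts = Counter(placement.values())
    return all(counts[site] <= limit for site, limit in enumerate(max_fragments))

# Edges: (source fragment, destination fragment, data volume, frequency).
edges = [("F1", "F2", 100, 5), ("F2", "F3", 40, 2)]
# unit_cost[s][t]: cost of sending one unit of data from site s to site t.
unit_cost = [[0, 3], [3, 0]]

placement_a = {"F1": 0, "F2": 1, "F3": 1}
placement_b = {"F1": 0, "F2": 0, "F3": 1}
cost_a = transfer_cost(edges, placement_a, unit_cost)  # 1500
cost_b = transfer_cost(edges, placement_b, unit_cost)  # 240
```

          With these numbers, co-locating F1 and F2 removes the most expensive transfer, which is why the second placement is cheaper.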

          The model takes into account two types of data transfer cost, which correspond to the different distributed query optimization strategies used in DBMSs. Thus, depending on the specifics of a particular DBMS, one or the other type of cost can be used. The model also has a drawback: it does not take the replication mechanism into account. However, because it considers the specifics of a particular DBMS, it makes it possible to obtain a data distribution scheme that corresponds more closely to the real environment. For this reason, the model proposed by I. Ahmad was chosen for study. Using this model and taking into account the specifics of a particular DBMS should make it possible to obtain an optimal distribution scheme. The Oracle DBMS is chosen as the environment. Adapting the source model to the chosen DBMS requires studying how distributed queries and update propagation are implemented in Oracle.

    Planned results
          The models and methods of data distribution optimization considered above lead to the following conclusions:
    1. The models rely on a number of assumptions and restrictions that make them unusable for analyzing and optimizing the operation of a real DDB.
    2. Earlier methods of DDB optimization, such as branch and bound, also cannot give good results, because their use is limited by the high dimensionality of the problem of distributing a set of data fragments among the sites of a computer network.
    3. It is reasonable to use evolutionary methods to solve the data distribution optimization problem.
    4. To solve the problem, the optimization models must take into account the update propagation and fragmentation mechanisms of the DDB, as well as the specifics of distributed query implementation in a particular DBMS.

          Thus, despite the earlier studies, the questions of modeling and optimizing the DDBs of computer information systems have not been finally resolved. The models and methods in use have a number of shortcomings, which makes their further improvement necessary.
          In the course of the master's work, a review was made of studies on optimizing the distribution of data among the sites of a computer network. The following results are planned:

    1. Study the specifics of DDB implementation using the Oracle DBMS.
    2. Modify the models of the distributed database.
    3. Modify the genetic algorithm for solving the optimization task.
    4. Conduct experimental research.

    Literature
    1. Cegelik G.G. Distributed Database Systems. - Lvov: 1990. - 168 p.
    2. T. Connolly, C. Begg. Databases: design, implementation and maintenance. Theory and practice. - M., 2000. - 1120 p.
    3. Telyatnikov A. Development of an object model of a distributed database // DonNTU. - Donetsk, 2004. - p. 192 - 200.
    4. I. Ahmad, K. Karlapalem. Evolutionary Algorithms for Allocating Data in Distributed Database Systems //
      http://ranger.uta.edu/~iahmad/journal-papers/[J39]%20Evolutionary%20algorithms%20for%20allocating%20data.pdf
    5. Distributed and parallel database systems. Database management systems, #04/1996 - http://www.osp.ru/dbms/1996/04/4.htm