DonNTU   Masters' portal

Abstract

Contents

1. Relevance of the topic

At present, the study of the socio-economic development of the world's countries is an urgent and appropriate task for two reasons. First, large databases of country development indicators are constantly being replenished. Second, studying and understanding the patterns of the past and of the present gives us the key to sustainable development in the future. Knowing the patterns observed in other countries makes it possible to adjust decisions taken at the level of the state or of individual regions in order to achieve better development.

2. Formulation of objectives

The purpose of the master's work is to research and develop a system, implemented as software, that identifies patterns in the development indicators of the world's countries. The system covers data preparation, data mining, and visualization using graphs and charts.

The object of the study is the development indicators of the world's countries according to World Bank data.

The subjects of the research are data mining methods, visualization techniques, and data pre-processing methods.

3. The content of the work in stages

3.1. Step 1

The source data are published on the World Bank's web resources [9]. The data are available as a zip archive (Fig. 1). The archive contains an MS Excel document in xlsx format. The task at this step is therefore to download the data from the Internet and extract them.
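The download-and-extract step could be sketched as follows. This is only an illustration using the Python standard library, not the program developed in the work; the function names `download_archive` and `extract_xlsx` are hypothetical, and the archive URL is left to the caller because the exact address is not given here.

```python
import io
import urllib.request
import zipfile
from pathlib import Path
from typing import List


def download_archive(url: str) -> bytes:
    """Fetch the zip archive from the given URL and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_xlsx(archive_bytes: bytes, out_dir: str) -> List[Path]:
    """Extract every .xlsx member of the archive into out_dir.

    Returns the paths of the extracted files; other members are skipped.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xlsx"):
                extracted.append(Path(archive.extract(name, path=target)))
    return extracted
```

A call such as `extract_xlsx(download_archive(archive_url), "data")` would then leave the Excel document in the `data` directory, where `archive_url` stands for whatever address the archive is actually published at.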

3.2. Step 2

By default, the data are stored as a single two-dimensional table. The format shows that the data are three-dimensional in nature, so the task is to convert them into a set of two-dimensional tables (Fig. 1).

Figure 1 – Stages of the master's work
(animation: 7 frames, many repetition cycles, 91 kilobytes)

A convenient data format for further work is an MS Excel document with multiple sheets, one per indicator. Each sheet contains a table with the list of countries in rows and the years in columns. Consequently, software needs to be developed that performs this transformation of the data.
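The transformation from one flat table into one table per indicator might look like the sketch below. It assumes, for illustration only, that each source row is a dictionary with the keys `country` and `indicator` plus one key per year. These key names and the function `split_by_indicator` are hypothetical and do not come from the original program.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Row = Dict[str, object]
# One sheet: country -> year -> value (None marks a missing value).
Sheet = Dict[str, Dict[str, Optional[float]]]


def split_by_indicator(rows: List[Row], years: List[str]) -> Dict[str, Sheet]:
    """Group flat rows into one country-by-year table per indicator."""
    sheets: Dict[str, Sheet] = defaultdict(dict)
    for row in rows:
        country = str(row["country"])      # assumed column name
        indicator = str(row["indicator"])  # assumed column name
        # Keep only the year columns; absent years become None.
        sheets[indicator][country] = {year: row.get(year) for year in years}
    return dict(sheets)
```

Each key of the returned dictionary corresponds to one sheet of the target workbook. Writing the sheets to an xlsx file would need a spreadsheet library and is not shown here.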

3.3. Step 3

The database produced at the second stage will contain gaps (missing values) (Fig. 1). The gaps are to be filled by linear interpolation of the "forward", "backward", and "neutral" types [5].
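A minimal sketch of gap filling for one row of a table is given below. The three interpolation types are not defined in this text, so the sketch rests on an assumption: "neutral" is taken to mean interpolation between the two known neighbours of a gap, "backward" to mean extending the line through the first two known points to leading gaps, and "forward" to mean extending the line through the last two known points to trailing gaps. The function name `fill_gaps` is hypothetical.

```python
from typing import List, Optional


def fill_gaps(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill None entries using straight lines through known points."""
    known = [i for i, v in enumerate(values) if v is not None]
    if len(known) < 2:
        return list(values)  # fewer than two points do not define a line

    def line(i0: int, i1: int, i: int) -> float:
        # Value at index i on the line through (i0, values[i0]) and (i1, values[i1]).
        slope = (values[i1] - values[i0]) / (i1 - i0)
        return values[i0] + slope * (i - i0)

    result = list(values)
    # Assumed "neutral" type: gaps between two known values.
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            result[i] = line(left, right, i)
    # Assumed "backward" type: gaps before the first known value.
    for i in range(0, known[0]):
        result[i] = line(known[0], known[1], i)
    # Assumed "forward" type: gaps after the last known value.
    for i in range(known[-1] + 1, len(values)):
        result[i] = line(known[-2], known[-1], i)
    return result
```

For example, `fill_gaps([None, 2.0, None, 4.0, None])` gives `[1.0, 2.0, 3.0, 4.0, 5.0]` under these assumptions.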

3.4. Step 4

At this stage the data are smoothed (Fig. 1). Smoothing can be performed by applying exponential smoothing to the tables obtained at the third stage.
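Simple exponential smoothing of a single series can be sketched as below, using the recurrence s[0] = x[0] and s[t] = alpha * x[t] + (1 - alpha) * s[t-1]. The text does not say which variant or which smoothing factor is used, so the choice of the simple form, the initial value, and the parameter `alpha` are assumptions made for illustration.

```python
from typing import List


def exponential_smoothing(series: List[float], alpha: float) -> List[float]:
    """Return the simple exponentially smoothed version of a gap-free series."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in the interval (0, 1]")
    if not series:
        return []
    smoothed = [series[0]]  # assumed initialisation: first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed
```

With `alpha=0.5`, the series `[1.0, 3.0, 5.0]` is smoothed to `[1.0, 2.0, 3.5]`. Smaller values of `alpha` give a smoother curve that reacts more slowly to changes.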

3.5. Step 5

At this stage, the data mining techniques described in the overview of worldwide research in the subject area, or a combination of them, are applied (Fig. 1).

3.6. Step 6

Data visualization is a problem faced by any researcher in the course of their work (Fig. 1). The knowledge obtained at the fifth stage is to be visualized using graphs and charts.

Findings

In studying the socio-economic development of the world's countries, the task was set of writing software that performs all stages of data processing, analysis, and visualization. An analysis of the literature on data preparation and analysis methods was carried out. The first and second stages of the master's work, as well as a number of transformations of the original database, have been implemented in software.

References

  1. The classification method [Electronic resource]. Access mode: http://www.inftech.webservis.ru/it/database/datamining/ar2.html (10.04.2013);
  2. Clustering method [Electronic resource]. Access mode: http://habrahabr.ru/post/101338/ (15.04.2013);
  3. The method of association [Electronic resource]. Access mode: http://www.inftech.webservis.ru/it/database/datamining/ar1.html#Ассоциация (21.04.2013);
  4. The method of decision trees [Electronic resource]. Access mode: http://www.inftech.webservis.ru/it/database/datamining/ar2.html#4.5. Деревья решений (decision trees) (27.04.2013);
  5. Wikipedia: the free encyclopedia [Electronic resource]. Access mode: http://ru.wikipedia.org/wiki/Линейная_интерполяция (2.05.2013);
  6. Data Mining [Electronic resource]. Access mode: http://compit.by/upload/Data_Mining.pdf (2.05.2013);
  7. Data Mining [Electronic resource]. Access mode: http://www.iteam.ru/publications/it/section_92/article_1448/ (5.05.2013);
  8. A multivariate visualization [Electronic resource]. Access mode: http://pca.narod.ru/ZINANN.htm (6.05.2013);
  9. World Bank data [Electronic resource]. Access mode: http://data.worldbank.org (9.05.2013);
  10. Software protection [Electronic resource]. Access mode: ru.wikipedia.org/wiki/Защита_программного_обеспечения (12.05.2013);
  11. About the statistical methods of data analysis [Electronic resource]. Access mode: www.omsu.ru/file.php?id=4948 (13.05.2013);
  12. Ramon Antonio Rodriguez's work [Electronic resource]. Access mode: http://ea.donntu.ru:8080/jspui/browse?type=author&value=%D0%A0%D0%BE%D0%B4%D1%80%D0%B8%D0%B3%D0%B5%D1%81+%D0%97%D0%B0%D0%BB%D0%B5%D0%BF%D0%B8%D0%BD%D0%BE%D1%81%2C+%D0%A0%D0%B0%D0%BC%D0%BE%D0%BD+%D0%90%D0%BD%D1%82%D0%BE%D0%BD%D0%B8%D0%BE (15.05.2013);
  13. An analysis of Internet traffic using data mining [Electronic resource]. Access mode: http://masters.donntu.ru/2012/fknt/paushchik/links/index.htm (15.05.2013);
  14. Methods of data mining [Electronic resource]. Access mode: http://masters.donntu.ru/2011/fknt/pominchuk/library/tez1.htm (20.05.2013);
  15. Review of methods of displaying spatial data by clustering [Electronic resource]. Access mode: http://masters.donntu.ru/2012/fknt/prikhodko/library/article1.htm (22.05.2013);
  16. Methods for interactive visualization of geospatial data of complex structure [Electronic resource]. Access mode: http://masters.donntu.ru/2011/fknt/serik/index.htm (23.05.2013).