Abstract

Attention! At the time of writing this abstract, the master's thesis had not yet been completed. The estimated completion date is May 2024. The full text of the work, as well as materials on the topic, can be obtained from the author or the scientific adviser after that date.

Introduction
1. Purpose and objectives of the study, planned results
2. Basic concepts and methods of multidimensional statistics
2.1 Factor analysis
2.2 Discriminant analysis
2.3 Cluster analysis
2.4 Multidimensional scaling
2.5 Quality control methods
3. Application of multidimensional statistical analysis in regional studies
4. Application of mathematical statistics methods to analyze the demographic situation in the DPR
Conclusions
References

Introduction

One of the key aspects of the development of modern states is the effective governance of territorial development. In a constantly changing economic, social and political environment, a multidimensional mathematical and statistical analysis of regional development is needed to identify the main trends and regularities and to determine priority areas of development.

Multidimensional mathematical and statistical analysis is a set of methods and tools that make it possible to investigate the relationships between the various factors affecting the development of a region and to determine the degree of their influence on key indicators. This makes it possible to identify the most promising regions and develop strategies for their development, as well as to determine priority areas for attracting investment and improving the quality of life of the population.

1. Purpose and objectives of the study, planned results

The purpose of this study is to conduct a multidimensional mathematical and statistical analysis of the development of regions in order to determine the main trends and patterns, and to develop recommendations for optimizing the management of territorial development and improving the quality of life of the population at the regional level.

2. Basic concepts and methods of multidimensional statistics

In the simplest situations, random variability is represented by one or two random variables (features). For example, when studying a statistical population of people, we may be interested in height and weight. In this situation, no matter how many people are included in the statistical population, we can always draw a scatter plot and see the whole picture. But if a third feature is added, for example a person's age, the scatter plot has to be drawn in three-dimensional space, and a set of points in three-dimensional space is already quite difficult to visualize.

In practice, an observation is usually represented not by one, two or three numbers but by some manageable set of numbers that describe dozens of features. To draw a scatter plot in this situation, one would have to consider multidimensional spaces. The branch of statistics devoted to the study of experiments with multidimensional observations is called multidimensional statistical analysis [1].

Measuring several features (properties of an object) at once in a single experiment is generally more natural than measuring only one or two. Multidimensional statistical analysis therefore has a potentially wide field of application.

Multidimensional statistical analysis includes the following sections:

factor analysis;
discriminant analysis;
cluster analysis;
multidimensional scaling;
methods of quality control.

2.1 Factor analysis

In modern statistics, factor analysis is used to identify hidden factors that affect the observed variables. The basic idea of factor analysis is that many variables can be reduced to a smaller number of factors that explain the main trends and the relationships between these variables.

Factor analysis can be applied in many fields, from economics and business to medical and psychological statistics. This method helps researchers determine which factors influence the observed variables and which variables are most sensitive to the effects of these factors.

One of the most important tasks that factor analysis can solve is identifying the main factors that influence changes in indicators within a given sample. In addition, factor analysis can be used to determine the relationships between variables and to predict the values of these variables from the available data [2].

In general, factor analysis is a powerful tool that allows researchers to more deeply and accurately study the relationships between different variables, identify hidden factors and create more accurate models for predicting the future values of these variables.

The main idea of factor analysis is to identify the most important and significant factors in a set of variables with similar characteristics.

To understand factor analysis, it is necessary to know the basic concepts and definitions that make up its theoretical apparatus. The most important of them are the concepts of factor, factor loading, communality, eigenvalue and factor space.

A factor is a hidden variable that explains the relationship between a set of variables. Factors differ from field to field: in medical statistics, diseases can act as factors; in economics, the parameters of economic activity; and in sociology, social facts. With the help of factor analysis, it is possible to identify the most significant factors, those that have the greatest correlation with the data set and explain the largest part of the variability.

A factor loading is a coefficient that shows how strongly an indicator is related to a given factor. A factor loading can be positive or negative, which indicates the direction of the relationship. The higher the absolute value of the loading, the more important the role that indicator plays in forming the factor.

Communality is a coefficient that shows how much of a single indicator is explained by the common factors. The communality is close to one if the variable is a good representative of these factors. The closer the communality is to zero, the weaker the relation of the variable to the common factors [3].

An eigenvalue is a value that reflects how important a factor extracted from the data set is. It is calculated together with the eigenvectors of the correlation matrix of the variables, and the higher the eigenvalue, the more important the corresponding factor.

A factor space is a multidimensional space whose axes are the factors and in which each variable is represented by a point. The factor space shows how strongly different indicators are correlated and makes it possible to see which variables are closer to each other and which are further apart.

The principle of factor analysis is that a data matrix is analyzed, where each column represents a variable, and each row is an observation.

Within the framework of factor analysis, variables are grouped by similarity and factorized, that is, they are separated on the basis of common properties present in their correlation structure. Factor analysis makes it possible to identify dependencies between variables and to determine which of them are the most significant and which do not affect the overall picture.

The results of factor analysis can be principal components, which are combinations of the initial variables, together with their weight coefficients. The main goal of factor analysis is to reduce the number of variables, minimize the overlap between them and identify the real distinguishing features of the data set.
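The concepts above (eigenvalue, factor loading, communality) can be illustrated with a minimal sketch. It assumes the principal-component method of extraction applied to the correlation matrix, which is only one of several extraction methods; the function name `principal_factors` and the synthetic data are hypothetical and are not taken from the thesis.

```python
import numpy as np

def principal_factors(data, n_factors):
    """Extract factors from the correlation matrix (principal-component method)."""
    # Rows are observations, columns are variables.
    corr = np.corrcoef(data, rowvar=False)
    # eigh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Loadings: correlation of each variable with each retained factor.
    loadings = eigenvectors[:, :n_factors] * np.sqrt(eigenvalues[:n_factors])
    # Communality: share of a variable's variance explained by the retained factors.
    communalities = (loadings ** 2).sum(axis=1)
    return eigenvalues, loadings, communalities

# Synthetic data (an assumption): four observed variables driven by one hidden factor.
rng = np.random.default_rng(0)
hidden = rng.normal(size=200)
data = np.column_stack([hidden + 0.3 * rng.normal(size=200) for _ in range(4)])

eigenvalues, loadings, communalities = principal_factors(data, n_factors=1)
```

Because all four variables share one hidden source, the first eigenvalue should dominate and every communality should be close to one.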

2.2 Discriminant analysis

Suppose there is a collection of objects divided into several groups, and for each object it is possible to determine which group it belongs to. For each object there are measurements of several quantitative characteristics. A way must be found to determine, on the basis of these characteristics, the group to which an object belongs. This makes it possible to assign new objects from the same population to groups. Discriminant analysis methods are used to solve this problem.

Discriminant analysis is a branch of statistics concerned with developing methods for solving problems of discrimination (distinguishing) of objects of observation according to certain features.

Discriminant analysis is convenient for processing the test results of individuals when deciding on admission to a particular position. In this case, all candidates must be divided into two groups: suitable and not suitable [4].

Bank management can use discriminant analysis to assess the financial condition of clients when granting them loans. The bank classifies clients as reliable or unreliable on the basis of a number of features.

Discriminant analysis can be used as a method of dividing a set of enterprises into several homogeneous groups according to the values of any indicators of production and economic activity.

The methods of discriminant analysis allow us to construct functions of the measured characteristics whose values explain the partition of objects into groups. It is desirable that there be few of these functions (discriminant features), since the results of the analysis are then easier to interpret meaningfully.

Due to its simplicity, linear discriminant analysis plays a special role, in which classifying features are selected as linear functions of primary features.
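A linear discriminant function for two groups can be sketched as follows. This is an illustration of Fisher's linear discriminant under the assumption of a common within-group covariance matrix; the function names and the synthetic "reliable" and "unreliable" client data are hypothetical.

```python
import numpy as np

def fit_linear_discriminant(group_a, group_b):
    """Return weights w and threshold c; x is assigned to group A when w @ x > c."""
    mean_a, mean_b = group_a.mean(axis=0), group_b.mean(axis=0)
    n_a, n_b = len(group_a), len(group_b)
    # Pooled within-group covariance matrix.
    pooled = ((n_a - 1) * np.cov(group_a, rowvar=False)
              + (n_b - 1) * np.cov(group_b, rowvar=False)) / (n_a + n_b - 2)
    weights = np.linalg.solve(pooled, mean_a - mean_b)
    # The threshold lies halfway between the projected group means.
    threshold = weights @ (mean_a + mean_b) / 2
    return weights, threshold

def classify(x, weights, threshold):
    return "A" if weights @ x > threshold else "B"

# Synthetic training data (an assumption): two features per client.
rng = np.random.default_rng(1)
reliable = rng.normal(loc=[5.0, 2.0], scale=0.5, size=(50, 2))
unreliable = rng.normal(loc=[2.0, 4.0], scale=0.5, size=(50, 2))

weights, threshold = fit_linear_discriminant(reliable, unreliable)
label = classify(np.array([4.8, 2.1]), weights, threshold)
```

The single linear function `weights @ x` plays the role of the discriminant feature: a new object is classified by comparing its value with the threshold.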

2.3 Cluster analysis

Cluster analysis methods allow you to divide the studied set of objects into groups of "similar" objects, called clusters. The English word "cluster" means a bunch, bundle, group or swarm.

Cluster analysis solves the following tasks:

Classifies objects taking into account all the features that characterize them. The very possibility of classification leads to a deeper understanding of the population under consideration and of the objects included in it.
Checks for the presence of an a priori given structure or classification in the available population. Such a check makes it possible to use the standard hypothetico-deductive scheme of scientific research.

Most hierarchical clustering methods are agglomerative (unifying): they start by creating elementary clusters, each consisting of exactly one initial observation (one point), and at each subsequent step the two closest clusters are combined into one [5].

The moment at which this process stops can be set by the researcher (for example, by specifying the required number of clusters or the maximum distance at which clusters are merged). A graphical representation of the cluster merging process can be obtained using a dendrogram, a tree of cluster unions.
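The agglomerative procedure can be sketched in a few lines. The text does not say how the distance between two clusters is measured, so this sketch assumes single linkage (the distance between the closest pair of members) and stops at a required number of clusters; the function name and the sample points are hypothetical.

```python
import math

def agglomerate(points, n_clusters):
    """Single-linkage agglomerative clustering; returns lists of point indices."""
    # Start with elementary clusters of exactly one observation each.
    clusters = [[i] for i in range(len(points))]

    def distance(c1, c2):
        # Single linkage (an assumption): distance between the closest members.
        return min(math.dist(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        # Find the two closest clusters and merge them into one.
        a, b = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda pair: distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
clusters = agglomerate(points, n_clusters=2)
```

Recording the merge distance at each step would give the data needed to draw a dendrogram.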

2.4 Multidimensional scaling

Multidimensional scaling (MDS) is a statistical method that is used to analyze and visualize data based on comparative estimates or similarities between objects. It allows you to represent multidimensional data in the form of a geometric structure in a space of lower dimension.

The main idea of the multidimensional scaling method is to find a mapping (transformation) that preserves relative distances or similarities between objects. Thus, more similar objects will be located closer to each other, and less similar objects will be located further from each other. The multidimensional scaling method can be used for various types of data, such as estimates of similarity between products, estimates of consumer preferences and results of psychological tests. It is widely used in various fields, including marketing, sociology, psychology, biology and others.

The basic principle of the multidimensional scaling method is to preserve the relative distances between objects. This means that if two objects are more similar to each other, then they should be located closer to each other than objects that are less similar. Thus, the method seeks to preserve the structure of the source data in a space of smaller dimension.
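One concrete way to build such a mapping is classical (metric) multidimensional scaling, sketched below. It is only one variant of MDS and is chosen here as an assumption; the function name and the toy distance matrix (four objects at the corners of a unit square) are hypothetical.

```python
import numpy as np

def classical_mds(distances, n_dims=2):
    """Embed objects in n_dims dimensions from a matrix of pairwise distances."""
    n = distances.shape[0]
    # Double-centre the squared distances to obtain an inner-product matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (distances ** 2) @ centering
    eigenvalues, eigenvectors = np.linalg.eigh(gram)
    # Keep the n_dims largest eigenvalues; clip tiny negatives caused by rounding.
    top = np.argsort(eigenvalues)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigenvalues[top], 0.0, None))
    return eigenvectors[:, top] * scale

# Pairwise distances between the corners of a unit square.
side, diag = 1.0, np.sqrt(2.0)
distances = np.array([
    [0.0, side, diag, side],
    [side, 0.0, side, diag],
    [diag, side, 0.0, side],
    [side, diag, side, 0.0],
])

coords = classical_mds(distances)
# Distances between the embedded points, for comparison with the input.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

For this toy input the recovered distances coincide with the original ones up to rounding, because the objects really do lie in a plane; with real similarity data the match is only approximate.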

2.5 Quality control methods

Quality control methods are designed to control the quality of manufactured products in order to identify violations and "bottlenecks" in the organization of production and in technological processes. The widespread use of scientifically based quality control methods was an important factor in the success of the leading countries of the world economy, especially Japan.

Recently, new methods of more effective management aimed at improving quality have become known as "six sigma". They are considered a formula for the success of most multinational corporations [7]. Unlike most of the methods of multidimensional analysis described above, quality control methods do not require time-consuming calculations: they are extremely simple and visual. The simplicity, clarity and effectiveness of statistical quality control methods have made possible and justified their widespread use in advanced countries, down to the level of foremen and sometimes individual workers.
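The simplicity of these methods can be seen in a control chart for individual measurements, sketched below. The choice of a chart with limits at the mean plus or minus three standard deviations is an assumption, since the text does not name a specific method; the function names and the measurements are hypothetical.

```python
from statistics import mean, stdev

def control_limits(reference_sample, sigmas=3.0):
    """Lower limit, centre line and upper limit estimated from an in-control sample."""
    centre = mean(reference_sample)
    spread = stdev(reference_sample)
    return centre - sigmas * spread, centre, centre + sigmas * spread

def out_of_control(measurements, lower, upper):
    """Indices of measurements that fall outside the control limits."""
    return [i for i, x in enumerate(measurements) if not lower <= x <= upper]

# Hypothetical reference measurements taken while the process was stable.
reference = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 10.0]
lower, centre, upper = control_limits(reference)

# New measurements are checked against the limits; the third one is an outlier.
flagged = out_of_control([10.0, 10.1, 11.2, 9.9], lower, upper)
```

A flagged point is a signal to look for a cause in the production process, not a proof that the process has changed.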

3. Application of multidimensional statistical analysis in regional studies

Multidimensional statistical analysis is used in regional studies to examine the relationships between the various factors that influence the development of a region. For example, one can study the relationship between the level of economic development of a region and the level of education of the population, between the unemployment rate and the crime rate, between the level of investment and the level of infrastructure development, and so on.

Multidimensional statistical analysis can also be used to classify regions according to various criteria, such as the level of economic development, the level of social tension, the level of environmental safety, etc. This allows you to identify the most problematic regions and develop measures to support and develop them.

4. Application of mathematical statistics methods to analyze the demographic situation in the DPR

The most important indicators of public health in any country are demographic indicators, which characterize its stability and security, as well as the prospects for its further development.

In this context, the analysis of the demographic situation using the methods of mathematical statistics becomes especially relevant. After all, this approach allows us to obtain the most accurate and objective data on the state of the population, its structure and the dynamics of changes.

One of the key tools of mathematical statistics is the analysis of time series, which allows you to identify trends and patterns in changing demographic indicators. Modeling and forecasting methods are also used to predict possible scenarios for the development of the demographic situation [8-10].

In general, the use of mathematical statistics methods allows us to gain a deeper understanding of demographic processes and make informed decisions in the field of social policy, healthcare and other areas related to public health and the well-being of the population.

Table 1 – Total fertility and mortality rate

Period (year)	Total population	Number of births per year	Number of deaths per year	Total fertility rate (births per 1,000 population)	Total mortality rate (deaths per 1,000 population)
2015	2 322 532	9 162	29 300	3.945	12.616
2016	2 326 254	11 771	34 833	5.060	14.974
2017	2 306 263	11 800	33 636	5.117	14.585
2018	2 293 431	8 239	32 574	3.592	14.203
2019	2 276 573	9 300	33 010	4.085	14.500
2020	2 257 012	8 644	35 435	3.830	15.700
2021	2 235 406	7 990	45 155	3.574	20.200
2022	2 202 440	6 986	36 780	3.172	16.700
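The rates in Table 1 are consistent with the number of events per 1,000 population. A minimal check for the 2015 row, with a hypothetical helper name, might look like this:

```python
def rate_per_thousand(events, population):
    """Number of events per 1,000 population, rounded to three decimals."""
    return round(events / population * 1000, 3)

# Figures for 2015 from Table 1.
population_2015, births_2015, deaths_2015 = 2_322_532, 9_162, 29_300

birth_rate = rate_per_thousand(births_2015, population_2015)   # 3.945
death_rate = rate_per_thousand(deaths_2015, population_2015)   # 12.616
```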

Figure 1 – Dynamics of fertility and mortality in the DPR for the period from 2015 to 2022

Figure 2 – Dynamics of fertility and mortality rates in the DPR for the period from 2015 to 2022
(animation: 9 frames, 9 repetition cycles, 113 kilobytes)

Over the years studied, the mortality rate has clearly been increasing and can be characterized as high. Given the low birth rate as well, it can be concluded that the population of the Donetsk People's Republic is facing serious demographic problems that may lead to a decrease in the population.

Various methods can be used to forecast the population, including time series analysis, regression analysis, cluster analysis and others. However, one of the most popular approaches is to use the indicators of a dynamics series, such as the growth rate, the rate of increase and the average value. These indicators make it possible to analyze how data change over time and to make predictions for the future [11].

The main indicators characterizing absolute and relative changes in a dynamics series are: absolute increase (decrease), growth coefficient, growth rate, rate of increase, and the absolute value of one percent of increase (decrease).

By the way they are calculated, the indicators of a dynamics series are divided into chain and basic indicators.

Chain indicators:

Absolute increase - shows by how much the current level is greater (or less) than the previous one.
Growth rate - shows what percentage of the previous level the current level is, that is, how many times it is greater (or less).
Rate of increase - shows by how many percent the current level is greater (or less) than the previous one.

Basic indicators:

Absolute value of the level - shows which level of the series was selected as the basic one.
Basic absolute increase - shows the change in the current level compared to the basic level.
Basic growth rate - shows how many times the current level is greater (or less) than the basic one.
Basic rate of increase - shows by how many percent the current level is higher (or lower) than the basic one.
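The chain and basic indicators can be computed as in the sketch below. It takes the first level of the series as the basic level, treats the growth rate as the ratio of levels in percent and the rate of increase as the growth rate minus 100, and uses the population figures for 2015 to 2017 given in this work; the function and key names are hypothetical.

```python
def dynamics_indicators(levels):
    """Chain and basic indicators of a series; the first level is the basic one."""
    base = levels[0]
    rows = []
    for previous, current in zip(levels, levels[1:]):
        rows.append({
            "chain_increase": current - previous,
            "basic_increase": current - base,
            "chain_growth_rate": round(current / previous * 100, 2),
            "basic_growth_rate": round(current / base * 100, 2),
            "chain_rate_of_increase": round(current / previous * 100 - 100, 2),
            "basic_rate_of_increase": round(current / base * 100 - 100, 2),
            # Absolute value of one percent of increase: 1% of the previous level.
            "value_of_one_percent": round(previous / 100),
        })
    return rows

population = [2_322_532, 2_326_254, 2_306_263]  # 2015, 2016, 2017
rows = dynamics_indicators(population)
# rows[1] describes 2017: chain increase -19991, basic increase -16269,
# chain growth rate 99.14, basic growth rate 99.3.
```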

Conclusions

Multidimensional statistical analysis is an important tool for studying the development of regions and identifying patterns and trends. It helps to identify the most promising areas of development and to develop strategies for improving the situation in a region. Multidimensional statistical analysis can also be used to classify regions according to various characteristics, which makes it possible to identify the most problematic regions and develop measures to support them. In general, the use of multidimensional statistical analysis in regional studies is necessary for obtaining more accurate and comprehensive information about the development of regions and for making informed decisions in the management and development of territories.

The demographic situation in the DPR can be called catastrophic. The death rate is 3.9 times higher than the birth rate. The demographic crisis poses a serious threat to the stability and prosperity of society. Therefore, it is necessary to invest in the future and take measures to preserve and develop the population, improve living conditions and create a favorable environment for the birth and upbringing of children.

Table 2 – Population of the Donetsk People's Republic

Period	Total population	Absolute increase (chain)	Absolute increase (basic)	Growth rate, % (chain)	Growth rate, % (basic)	Rate of increase, % (chain)	Rate of increase, % (basic)	Absolute value of 1% of increase
2015	2 322 532				100
2016	2 326 254	3 722	3 722	100.16	100.16	0.16	0.16	23 225
2017	2 306 263	-19 991	-16 269	99.14	99.30	-0.86	-0.70	23 263
2018	2 293 431	-12 832	-29 101	99.44	98.75	-0.56	-1.25	23 063
2019	2 276 573	-16 858	-45 959	99.26	98.02	-0.74	-1.98	22 934
2020	2 257 012	-19 561	-65 520	99.14	97.18	-0.86	-2.82	22 766
2021	2 235 406	-21 606	-87 126	99.04	96.25	-0.96	-3.75	22 570
2022	2 202 440	-32 966	-120 092	98.53	94.83	-1.47	-5.17	22 354
2023	2 121 453	-80 987	-201 079	96.32	91.34	-3.68	-8.66	22 024

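Table 3 – The projected population of the Donetsk People's Republic in the period from 2024 to 2030

Year	2024	2025	2026	2027	2028	2029	2030
Population	2 096 318	2 071 183	2 046 048	2 020 914	1 995 779	1 970 644	1 945 509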
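The text does not state how the projection for 2024 to 2030 was obtained. As an inference rather than the author's documented method, the projected figures are reproduced by extending the 2015 to 2023 population series by its average absolute increase, as in this sketch with a hypothetical function name:

```python
def extrapolate(first_level, last_level, n_intervals, horizon):
    """Extend a series by its average absolute increase per interval."""
    average_increase = (last_level - first_level) / n_intervals
    return [round(last_level + step * average_increase)
            for step in range(1, horizon + 1)]

# Population in 2015 and 2023, eight yearly intervals apart; seven years ahead.
forecast = extrapolate(2_322_532, 2_121_453, n_intervals=8, horizon=7)
# forecast[0] is the value for 2024, forecast[-1] the value for 2030.
```

A straight-line extrapolation of this kind assumes that the average yearly decrease of the past eight years continues unchanged, so it should be read as a baseline scenario rather than a prediction.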

Darya Buksha

Institute of Computer Science and Technology (ICST)
Faculty of Intelligent Systems and Programming

Department of Applied Mathematics (AM)

Speciality «Applied Mathematics»

Multidimensional mathematical and statistical analysis of the development of the regions of the Russian Federation

Scientific adviser: Ph.D., E.V. Prokopenko
