Abstract

Topic of the master's thesis:

"Creation of a monitoring system for personal computer user actions"


Author:

Taran Anton


Abstract

          A monitoring system for personal computer user actions is software that observes what users do while working on a PC.

The main components of such a monitoring system are:

  1. Keyboard monitor - logs the user's keyboard input together with the program in which it was typed and the time of input.
  2. User activity monitor - logs when the user starts and finishes working on the PC, the processor and memory load during the session, etc.
  3. Web monitor - logs information about web activity, such as the URLs opened and the amount of traffic used.
  4. Program run monitor - tracks program launches, logging each program's name, window title, start time, run duration, etc.
  5. Screenshot module - takes screenshots on a timer or when the system load is high.
  6. Information processing and interface modules - process the logged data and present it to the administrator of the monitoring system in readable form (diagrams, tables, graphs).
Most monitoring systems support only some of these functions.
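
          Each of the monitors listed above produces timestamped log records. As a minimal sketch, one record of the program run monitor might be modelled as follows. The class name and field names are assumptions made for illustration and are not taken from the author's implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ProgramRunRecord:
    """One hypothetical log entry of the program run monitor."""
    program_name: str      # executable name, e.g. "winword.exe"
    window_title: str      # title of the program's main window
    started_at: datetime   # moment the program was launched
    duration: timedelta    # how long the program was running

    def ended_at(self) -> datetime:
        # The end time is derived rather than stored, so it cannot disagree
        # with the start time and duration.
        return self.started_at + self.duration


record = ProgramRunRecord(
    program_name="winword.exe",
    window_title="report.doc - Microsoft Word",
    started_at=datetime(2007, 10, 1, 9, 0, 0),
    duration=timedelta(minutes=45),
)
print(record.ended_at())  # 2007-10-01 09:45:00
```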

Relevance

           At the current stage of information technology development, almost every workplace is equipped with a personal computer, so the problems of administering and supervising personnel are highly relevant.

          Many managers are aware that solving these problems correctly can raise operating efficiency, reduce administration costs and improve on-site labour discipline. Developing software that handles these tasks effectively is therefore a relevant goal.

Aims and tasks of the work

          The object of research is the methods of monitoring personal computer user actions in the Windows operating system, the methods of processing the collected data, and their software implementation in C#.

          The aim of the master's thesis is to study the program control facilities of the Windows operating system, available in the .NET Framework environment, that can be used in a system for monitoring personal computer user actions, and to study methods of processing the monitoring results in order to classify running programs into the types "work", "games", "web", "community" and others.

          To achieve these aims, the following tasks have been set:

  • study the program control facilities of the Windows operating system that can be used in a system for monitoring personal computer user actions, and implement them in C#;
  • develop the monitoring system for personal computer user actions;
  • use the monitoring system to collect log data sets for later use;
  • investigate whether data mining methods can be applied to the analysis of monitoring results, in particular to the problem of classifying running programs;
  • design an algorithm for classifying running programs, based on the logged data sets and on the investigation of classification methods.
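
          To make the classification task from the list above concrete, here is a minimal sketch of a hand-written baseline: a lookup table from executable name to program type, with "other" as the fallback. This is not the data mining algorithm the thesis sets out to design. The table contents, function names and log format are all assumptions made for illustration.

```python
# Hypothetical hand-labelled examples: executable name -> program type.
KNOWN_PROGRAMS = {
    "winword.exe": "work",
    "excel.exe": "work",
    "iexplore.exe": "web",
    "firefox.exe": "web",
    "icq.exe": "community",
    "sol.exe": "games",
}


def classify_program(executable: str) -> str:
    """Return the type of a program, or "other" when it is not in the table."""
    return KNOWN_PROGRAMS.get(executable.lower(), "other")


def time_per_type(log):
    """Sum run time in minutes per program type for (executable, minutes) pairs."""
    totals = {}
    for executable, minutes in log:
        program_type = classify_program(executable)
        totals[program_type] = totals.get(program_type, 0) + minutes
    return totals


log = [("WINWORD.EXE", 45), ("firefox.exe", 20), ("unknown.exe", 5), ("excel.exe", 15)]
print(time_per_type(log))  # {'work': 60, 'web': 20, 'other': 5}
```

          A table like this cannot label programs it has never seen, which is the gap a classifier learned from logged data is meant to close.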

Review of the task of monitoring personal computer user actions

          The foundations for manageability in Windows operating systems are Windows Management Instrumentation (WMI, Microsoft's implementation of Web-Based Enterprise Management, WBEM) and the WMI extensions for the Windows Driver Model (WDM).

          Effective management of PC and server systems in an enterprise network benefits from well-instrumented computer software and hardware, which allow system components to be monitored and controlled both locally and remotely. Microsoft is committed to simplifying instrumentation of hardware and software for the Microsoft Windows operating system. Microsoft is also committed to providing consistent access to this instrumentation for both Windows-based management systems and legacy management systems that are hosted in other environments.

          Purpose of WMI

          The purpose of WMI is to define a non-proprietary set of environment-independent specifications which allow management information to be shared between management applications that run in diverse enterprise operating system environments. WMI prescribes enterprise management standards and related technologies that work with existing management standards, such as Desktop Management Interface (DMI) and SNMP. WMI complements these other standards by providing a uniform model. This model represents the managed environment through which management data from any source can be accessed in a common way.

          Overview

          To unify management techniques for the sake of simplicity, the DMTF defined CIM to represent real-world manageable entities in a uniform way. The CIM object model is an object database model that uses terms and semantics common to all manufacturers and software developers. This object model is implemented in a database called the CIM repository.

          Based on the CIM model, WMI includes real-world manageable components, available from the DMTF standards with some specific extensions that represent the various Windows components. Moreover, WMI exposes a collection of COM-scriptable objects that allow various applications to take advantage of the management information.

          As part of the installation process, most of the Microsoft applications available today (e.g. SQL Server, Exchange Server, Microsoft Office, Internet Explorer, Host Integration Server, Automated Deployment Services) extend the standard CIM object model to add the representation of their manageable entities to the CIM repository. This representation is called a WMI class; it exposes information through properties and allows actions to be executed via methods. Access to the manageable entities goes through a software component called a “provider”, which is simply a DLL implementing a COM object written in C/C++. Because each provider is designed to access specific management information, the CIM repository is also logically divided into several areas called namespaces. Each namespace contains a set of providers with their related classes specific to a management area (e.g. Root\Directory\LDAP for Active Directory, Root\SNMP for SNMP information or Root\MicrosoftIISv2 for Internet Information Services information).

          To locate the huge amount of management information available from the CIM repository, WMI comes with a SQL-like language called the WMI Query Language (WQL).
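
          A WQL query looks much like an SQL SELECT statement over a WMI class. The sketch below only builds the query text. `build_wql` is a hypothetical helper introduced here for illustration, it does no escaping of its arguments, and actually running the query would require a WMI client on Windows, which is not shown. `Win32_Process`, with its `Name` and `ProcessId` properties, is the WMI class that represents running processes.

```python
def build_wql(wmi_class, properties=None, where=None):
    """Build a WQL SELECT statement as a plain string (no escaping is done)."""
    columns = ", ".join(properties) if properties else "*"
    query = f"SELECT {columns} FROM {wmi_class}"
    if where:
        query += f" WHERE {where}"
    return query


# Ask for the name and process id of every running Notepad instance.
query = build_wql("Win32_Process", ["Name", "ProcessId"], "Name = 'notepad.exe'")
print(query)
# SELECT Name, ProcessId FROM Win32_Process WHERE Name = 'notepad.exe'
```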

          The WMI architecture is shown below.

WMI architecture


          WMI Driver Extensions

          The WMI extensions to WDM provide kernel-level instrumentation such as publishing information, configuring device settings, supplying event notification from device drivers and allowing administrators to set data security through a WMI provider known as the WDM provider. The extensions are part of the WDM architecture; however, they have broad utility and can be used with other types of drivers as well (such as SCSI and NDIS). The WMI Driver Extensions service monitors all drivers and event trace providers that are configured to publish WMI or event trace information. Instrumented hardware data is provided by way of drivers instrumented for WMI extensions for WDM. WMI extensions for WDM provide a set of Windows device driver interfaces for instrumenting data within the driver models native to Windows, so OEMs and IHVs can easily extend the instrumented data set and add value to a hardware/software solution. The WMI Driver Extensions, however, are not supported by Windows Vista and later operating systems.

Review of the classification task in data mining

          Data mining is an information extraction activity whose goal is to discover hidden facts contained in databases. Using a combination of machine learning, statistical analysis, modeling techniques and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results. Typical applications include market segmentation, customer profiling, fraud detection, evaluation of retail promotions, and credit risk analysis.

          Data mining is a process that uses a variety of data analysis tools to discover patterns and relationships in data that may be used to make valid predictions.

          The first and simplest analytical step in data mining is to describe the data: summarize its statistical attributes (such as means and standard deviations), visually review it using charts and graphs, and look for potentially meaningful links among variables (such as values that often occur together). Collecting, exploring and selecting the right data are critically important.
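
          The descriptive step can be sketched with Python's standard `statistics` module. The CPU load samples below are invented for illustration and are not measurements from the monitoring system.

```python
from statistics import mean, stdev

# Hypothetical CPU load samples (percent) logged during one work session.
cpu_load = [12.0, 15.0, 80.0, 22.0, 18.0, 75.0]

# The mean summarizes the typical load; the sample standard deviation shows
# how strongly individual measurements scatter around it.
print(f"mean = {mean(cpu_load):.1f}%")
print(f"standard deviation = {stdev(cpu_load):.1f}%")
```

          A standard deviation close to the mean itself, as in this sample, hints that the session mixes idle periods with bursts of heavy load and is worth inspecting on a chart.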

          But data description alone cannot provide an action plan. You must build a predictive model based on patterns determined from known results, then test that model on results outside the original sample. A good model should never be confused with reality (you know a road map isn’t a perfect representation of the actual road), but it can be a useful guide to understanding your business.

          The final step is to empirically verify the model. For example, from a database of customers who have already responded to a particular offer, you’ve built a model predicting which prospects are likeliest to respond to the same offer. Can you rely on this prediction? Send a mailing to a portion of the new list and see what results you get.
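
          The idea of testing a model on cases outside the original sample can be sketched as a hold-out check. The data, the labels "busy" and "idle", the every-fourth-case split and the threshold model are all assumptions chosen to keep the example short.

```python
def split_holdout(cases, test_every=4):
    """Put every `test_every`-th case into the test set, the rest into training."""
    train = [c for i, c in enumerate(cases) if (i + 1) % test_every != 0]
    test = [c for i, c in enumerate(cases) if (i + 1) % test_every == 0]
    return train, test


def accuracy(model, test_cases):
    """Fraction of held-out cases for which the model predicts the true label."""
    correct = sum(1 for features, label in test_cases if model(features) == label)
    return correct / len(test_cases)


# Hypothetical cases: (hours of keyboard activity per day, label).
cases = [(0.5, "idle"), (6.0, "busy"), (5.5, "busy"), (1.0, "idle"),
         (7.0, "busy"), (0.2, "idle"), (4.8, "busy"), (4.5, "idle")]

train, test = split_holdout(cases)

# A deliberately simple model: the threshold is the mean of the training values.
threshold = sum(hours for hours, _ in train) / len(train)


def model(hours):
    return "busy" if hours > threshold else "idle"


print(f"accuracy on held-out cases: {accuracy(model, test):.2f}")
# accuracy on held-out cases: 0.50
```

          The model misclassifies one of the two held-out cases. That error would stay hidden if the model were judged only on the data it was built from.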

          Classification

          Classification problems aim to identify the characteristics that indicate the group to which each case belongs. This pattern can be used both to understand the existing data and to predict how new instances will behave. For example, you may want to predict whether individuals can be classified as likely to respond to a direct mail solicitation, vulnerable to switching over to a competing long-distance phone service, or a good candidate for a surgical procedure.

          Data mining creates classification models by examining already classified data (cases) and inductively finding a predictive pattern. These existing cases may come from an historical database, such as people who have already undergone a particular medical treatment or moved to a new long-distance service. They may come from an experiment in which a sample of the entire database is tested in the real world and the results used to create a classifier. For example, a sample of a mailing list would be sent an offer, and the results of the mailing used to develop a classification model to be applied to the entire database. Sometimes an expert classifies a sample of the database, and this classification is then used to create the model which will be applied to the entire database.

          Decision trees

          Decision trees are a way of representing a series of rules that lead to a class or value. For example, you may wish to classify loan applicants as good or bad credit risks. The figure below shows a simple decision tree that solves this problem while illustrating all the basic components of a decision tree: the decision node, branches and leaves.

A simple classification tree


          Depending on the algorithm, each node may have two or more branches. For example, CART generates trees with only two branches at each node. Such a tree is called a binary tree. When more than two branches are allowed it is called a multiway tree. Each branch will lead either to another decision node or to the bottom of the tree, called a leaf node. By navigating the decision tree you can assign a value or class to a case by deciding which branch to take, starting at the root node and moving to each subsequent node until a leaf node is reached. Each node uses the data from the case to choose the appropriate branch.
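
          Navigating a tree from the root to a leaf can be sketched as follows. The tree is a small binary tree for the loan example. Its attributes and thresholds are invented for illustration and are not read off the figure.

```python
# A decision node holds a test and two branches; a leaf is just a class label.
# The attributes and thresholds below are invented for illustration.
tree = {
    "test": lambda case: case["income"] > 40000,
    "yes": {
        "test": lambda case: case["high_debt"],
        "yes": "bad risk",
        "no": "good risk",
    },
    "no": {
        "test": lambda case: case["years_in_job"] > 5,
        "yes": "good risk",
        "no": "bad risk",
    },
}


def classify(node, case):
    """Walk from the root to a leaf, choosing a branch at every decision node."""
    while isinstance(node, dict):          # leaves are plain strings
        branch = "yes" if node["test"](case) else "no"
        node = node[branch]
    return node


applicant = {"income": 52000, "high_debt": False, "years_in_job": 2}
print(classify(tree, applicant))  # good risk
```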

          Decision tree models are commonly used in data mining to examine the data and induce the tree and its rules that will be used to make predictions. A number of different algorithms may be used for building decision trees, including CHAID (Chi-squared Automatic Interaction Detection), CART (Classification And Regression Trees), Quest, and C5.0.
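
          Inducing a tree means repeatedly choosing the test that best separates the classes. The sketch below shows one such step in the spirit of CART: a binary split on a single numeric attribute, scored by Gini impurity, a criterion commonly used with CART. It is a simplified illustration, not a full implementation of any of the algorithms named above, and the traffic data is hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0 for a pure group, larger for mixed groups."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_binary_split(values, labels):
    """Try each threshold on one numeric attribute; return (threshold, impurity)."""
    best = (None, float("inf"))
    # The largest value is skipped: splitting there would leave one side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical data: network traffic per session (MB) and the program type.
traffic = [0.1, 0.3, 0.2, 12.0, 25.0, 8.0]
types = ["work", "work", "work", "web", "web", "web"]

print(best_binary_split(traffic, types))  # (0.3, 0.0)
```

          The split "traffic <= 0.3" separates the two types perfectly, so the weighted impurity drops to zero and both branches become leaves.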

Conclusion

          This abstract presents intermediate results of the master's thesis; the estimated completion date is December 2007. So far, the core functionality of the monitoring system has been implemented and a number of classification methods have been reviewed. In the near future the monitoring module is expected to be completed and used to collect data, which will then be studied using data mining techniques. The final stage will be the analysis of the results and the development of an algorithm for classifying the programs launched by PC users.

References

1. Klaus Salchner. "An in-depth look at WMI and instrumentation, Part II". DeveloperLand.com. http://www.developerland.com/DotNet/Enterprise/147.aspx
2. "Введение в Windows Management Instrumentation (WMI)" [Introduction to Windows Management Instrumentation (WMI)]. script-coding.info. http://www.script-coding.info/WMI.html
3. Константин Леонтьев. "Вы всё ещё не используете WMI?" [Are you still not using WMI?]. AV5. http://www.av5.com/journals-magazines-online/1/9/84
4. Алексей Доля. "Внутренние угрозы ИТ-безопасности" [Internal threats to IT security]. BYTE/Россия. http://www.bytemag.ru/?ID=603365
5. StatSoft e-textbook. http://www.statsoft.ru/home/textbook/
6. "Data mining и статистика: плюсы и минусы" [Data mining and statistics: pros and cons]. SnowCactus. http://www.snowcactus.ru/dmvsstat.htm
7. "Деревья решений - общие принципы работы" [Decision trees: general principles of operation]. BaseGroup Labs. http://basegroup.ru/trees/description.htm
8. "Технология Data Mining" [Data Mining technology]. Deep Data Diver technology. http://www.datadiver.nw.ru/dm_tec.htm
9. "Отличия алгоритма дерева решений от ассоциативных правил в задачах классификации" [Differences between the decision tree algorithm and association rules in classification tasks]. Business Data Analytics. http://www.businessdataanalytics.ru/DecisionTreesVsAssociationAlgorithm.htm
10. Data Miner System & Scoring, StatSoft. http://www.spc-consulting.ru/DMS/index.htm
11. Chernov Ivan. "Automatic Data Mining form data bases" (abstract of a DonNTU master's thesis). http://masters.donntu.ru/2006/fvti/ichernov/diss/index.htm




At the time of writing this abstract, the master's thesis is not yet complete. Completion of the research is scheduled for 01.01.2008. The full text and materials on the topic can be obtained from the author after that date.
