DonNTU Masters' portal

Abstract

Introduction

Today a vast amount of information is available on the World Wide Web, and millions of users around the world access it every day. This information takes the form of hypertext documents, or web pages.

Web pages differ in their interfaces, that is, in the set of tools through which the user interacts with the page. Unfortunately, not all web pages are easy to use, and those that are not are less likely to succeed or to attract users. Such pages need to be improved.

1. Relevance of the topic

Many methods and services currently exist for evaluating the interface and navigation of web pages, but not all of them are reliable or of good quality.

New algorithms for evaluating the interface and navigation of web pages therefore need to be developed using data mining techniques.

2. Goal and tasks of the research

The goal of the research is to analyze existing methods for evaluating the interface and navigation of web pages and to develop new algorithms based on data mining techniques.

Main tasks of the research:

  1. Review techniques for mining navigation within web pages.
  2. Explore the data analysis techniques used to search for user behavior patterns.
  3. Investigate existing methods of web page usability analysis and propose approaches for using them jointly with data mining techniques.

Research object: web page.

As part of the master's work, methods for evaluating the interface and navigation of web pages will be studied in order to find an efficient algorithm.

3. Methods for evaluating the interface and navigation of web pages

Web Mining technology is now actively used to identify common patterns on the Internet. It covers mining techniques that can discover previously unknown knowledge in the unstructured, heterogeneous information contained in web pages, knowledge that can later be put to practical use.

One Web Mining technique aimed at analyzing user navigation on a web page is the search for user behavior patterns. The user activities analyzed include following links, submitting forms, scrolling pages, and so on. The patterns discovered are used to optimize the structure of the site.

The following methods are used to search for user behavior patterns (fig. 1):

  • association search;
  • sequence analysis;
  • sequence clustering.

Figure 1 – Methods of searching for user behavior patterns

Association search looks for items that occur together, such as jointly requested pages or jointly ordered goods [12]. It uncovers hidden connections in seemingly unrelated data; these connections are expressed as association rules. Rules whose frequency of occurrence exceeds a certain threshold are considered interesting [13].
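
The idea of an association rule over jointly requested pages can be sketched as follows. This is a minimal illustration, not the method of [12] or [13]: the function name `pair_rules`, the sample sessions, and the thresholds are all assumptions made for the example. Support is taken as the share of sessions that contain both pages, and confidence as the share of sessions with the first page that also contain the second.

```python
from collections import Counter
from itertools import combinations


def pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return rules (page_a, page_b, support, confidence) for page pairs.

    Hypothetical helper: a rule "page_a -> page_b" is kept when the pair
    occurs in at least min_support of the sessions and page_b appears in
    at least min_confidence of the sessions that contain page_a.
    """
    total = len(sessions)
    page_sets = [set(s) for s in sessions]
    # Number of sessions containing each page and each unordered pair.
    single = Counter(p for s in page_sets for p in s)
    pairs = Counter(
        frozenset(c) for s in page_sets for c in combinations(sorted(s), 2)
    )
    rules = []
    for pair, count in pairs.items():
        support = count / total
        if support < min_support:
            continue  # the pair is too rare to be interesting
        a, b = sorted(pair)
        for x, y in ((a, b), (b, a)):
            confidence = count / single[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Assumed sample data: each session is the list of pages one user requested.
sessions = [
    ["home", "catalog", "cart"],
    ["home", "catalog"],
    ["home", "about"],
    ["catalog", "cart"],
]
rules = pair_rules(sessions)
for x, y, support, confidence in rules:
    print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}")
```

With these sample sessions, the rule "cart -> catalog" has confidence 1.0, because every session that contains the cart page also contains the catalog page.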

Sequence analysis is a search for sequences of actions. The most commonly used algorithm is Apriori [12], a data mining algorithm intended for knowledge discovery in databases [13] (fig. 2).

Figure 2 – Algorithm Apriori
(animation: 8 frames, 6 cycles of repetition, 10.1 kilobytes)
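
As a rough illustration of how Apriori works, the sketch below mines frequent itemsets level by level. It shows the classic itemset form of the algorithm, which ignores the order of pages within a session; the function name `apriori_itemsets`, the sample sessions, and the absolute support threshold are assumptions made for the example and do not come from the sources cited above.

```python
from itertools import combinations


def apriori_itemsets(transactions, min_support):
    """Return {frozenset(items): count} for every frequent itemset.

    min_support is an absolute count: an itemset is frequent when at
    least that many transactions contain all of its items.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: keep the candidates that reach the threshold.
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_support:
                current[cand] = n
        frequent.update(current)
        k += 1
    return frequent


# Assumed sample data: pages requested in four user sessions.
sessions = [
    ["home", "catalog", "cart"],
    ["home", "catalog"],
    ["home", "about"],
    ["catalog", "cart"],
]
frequent = apriori_itemsets(sessions, min_support=2)
for itemset, count in sorted(frequent.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what keeps the search tractable: the three-page set {home, catalog, cart} is never counted, because its subset {home, cart} is already known to be infrequent.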

Sequence clustering is a search for groups of users with similar action sequences. The Microsoft Sequence Clustering algorithm, included in Microsoft Analysis Services 2012, can be used to implement this kind of analysis.

Fig. 3 shows the Sequence Clustering Model structure of Microsoft Analysis Services.

Figure 3 – The Sequence Clustering Model structure of Microsoft Analysis Services
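
To make the idea of grouping users by similar action sequences concrete, here is a deliberately simple sketch. It is not the Microsoft Sequence Clustering algorithm and does not use Analysis Services: as a stand-in, it describes each session by its set of page-to-page transitions and groups sessions greedily by Jaccard similarity. The function names, the sample sessions, and the 0.5 threshold are assumptions made for the example.

```python
def transitions(session):
    """Set of consecutive page-to-page moves in one session."""
    return set(zip(session, session[1:]))


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_sessions(sessions, threshold=0.5):
    """Greedy leader clustering of sessions by shared transitions.

    Each session joins the first cluster whose leader (its first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of session indices.
    """
    clusters = []  # pairs of (leader transitions, member indices)
    for index, session in enumerate(sessions):
        moves = transitions(session)
        for leader, members in clusters:
            if jaccard(leader, moves) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((moves, [index]))
    return [members for _, members in clusters]


# Assumed sample data: two shopping sessions and two informational ones.
sessions = [
    ["home", "catalog", "item", "cart"],
    ["home", "catalog", "item"],
    ["home", "about", "contacts"],
    ["about", "contacts"],
]
groups = cluster_sessions(sessions)
print(groups)
```

The result depends on the order of the sessions and on the threshold, which is acceptable for an illustration but is one reason a model-based algorithm is usually preferred in practice.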

Web Mining technology thus solves the following problems in evaluating web page navigation:

  • identifying typical sessions and user navigation paths (sequence analysis, association rules);
  • finding dependencies in the use of site services (association rule search, sequence clustering) [12].
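
The first of these problems, finding typical navigation paths, can be sketched by counting ordered, contiguous page paths that recur across sessions. This is only an illustration under stated assumptions: the function name `frequent_paths`, the path length, the threshold, and the sample sessions are all hypothetical.

```python
from collections import Counter


def frequent_paths(sessions, length=3, min_sessions=2):
    """Return {path: session_count} for contiguous paths of a given length.

    A path is a tuple of `length` consecutive pages. Each path is counted
    once per session, and only paths seen in at least `min_sessions`
    sessions are returned.
    """
    counts = Counter()
    for session in sessions:
        # Use a set so that a path repeated inside one session counts once.
        seen = {
            tuple(session[i:i + length])
            for i in range(len(session) - length + 1)
        }
        counts.update(seen)
    return {path: n for path, n in counts.items() if n >= min_sessions}


# Assumed sample data: ordered page requests in four sessions.
sessions = [
    ["home", "catalog", "item", "cart"],
    ["home", "catalog", "item"],
    ["search", "catalog", "item", "cart"],
    ["home", "about"],
]
paths = frequent_paths(sessions)
for path, n in sorted(paths.items()):
    print(" > ".join(path), n)
```

Unlike the itemset view, this count respects the order of actions, so "catalog > item > cart" and "cart > item > catalog" would be treated as different paths.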

Today the usability of a resource is one of the key considerations when working with behavioral factors. There are several methods of website usability analysis:

  1. Analysis using statistics services.
  2. Working with visitors' feedback.
  3. Testing with focus groups.
  4. Monitoring visitors' activity using focus groups and special services [16].

The methods above are not data mining techniques. However, certain usability analysis methods could in future be used jointly with mining techniques, in particular with the search for user navigation patterns. Such an approach would help to automate usability analysis fully.

Conclusion

Many existing methods for evaluating web page interfaces are biased or superficial. New approaches are needed that apply mining techniques to analyze web page data thoroughly and to extract new, useful information from it.

This paper considered one Web Mining method, the search for user behavior patterns, whose results can be used to optimize site structure.

Several usability analysis methods were also reviewed. They can be used in conjunction with mining techniques to obtain a better evaluation.

Further work will focus on developing algorithms, based on mining techniques, for evaluating the web page interface and navigation within pages.

References

  1. Yue Xu. Mining for User Navigation Patterns Based on Page Contents [Electronic resource]. Access mode: http://www2.cs.uregina.ca/~wss/wss03/03/wss03-127.pdf
  2. Melody Yvette Ivory. An Empirical Foundation for Automated Web Interface Evaluation [Electronic resource]. Access mode: http://webtango.berkeley.edu/papers/thesis/thesis.pdf
  3. Myra Spiliopoulou, Lukas C. Faulstich, Karsten Winkler. A Data Miner analyzing the Navigational Behaviour of Web Users [Electronic resource]. Access mode: http://www.hhl.info/fileadmin/LS/micro/Download/Spiliopoulou_1999_ADataMiner.pdf
  4. Jose Borges, Mark Levene. Data Mining of User Navigation Patterns [Electronic resource]. Access mode: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.23.484
  5. Chernov V.V. On the problem of web interface development [Electronic resource]. Access mode: http://www.rae.ru/fs/pdf/2012/11-2/30559.pdf
  6. Sortov A.A., Khoroshilov A.V. Functional testing of Web applications based on UniTesK technology [Electronic resource]. Access mode: http://ispras.ru/ru/proceedings/docs/2004/8/1/isp_2004_8_1_77.pdf
  7. Pochansky O.M. Application of structural characteristics of web documents in assessing their attractiveness to the end user [Electronic resource]. Access mode: http://archive.nbuv.gov.ua/portal/Natural/SOI/2011_4/pochan.pdf
  8. Egorova I.N., Istomina A.A. A study of the capabilities of web analytics [Electronic resource]. Access mode: http://journals.uran.ua/index.php/1729-3774/article/viewFile/3917/3585
  9. Paushchik Yu.V. Analysis of Internet traffic using data mining [Electronic resource]. Access mode: http://masters.donntu.ru/2012/fknt/paushchik/diss/index.htm
  10. Shynkarenko V.S. Audience analysis and traffic forecasting for an Internet resource [Electronic resource]. Access mode: http://www.masters.donntu.ru/2006/fvti/shynkarenko/diss/
  11. Raskin, Jef. Interface: New Directions in Designing Computer Systems. Moscow: Symbol-Plus, 2005. 272 p.
  12. Web Mining: data mining on the Internet [Electronic resource]. Access mode: https://sites.google.com/site/upravlenieznaniami/tehnologii-upravlenia-znaniami/text-mining-web-mining/web-mining
  13. Data analysis using association rules. The Apriori algorithm [Electronic resource]. Access mode: http://www.drupal-site.com/analiz-dannyh-metodom-associativnyh-pravil-algoritm-apriori
  14. Vilchinskaya O.S., Teslenko I.V. Extracting multilevel association rules from large databases [Electronic resource]. Access mode: http://archive.nbuv.gov.ua/portal/natural/vejpt/2006_3_3/EEJET_3_3_2006_27-31.pdf
  15. An algorithm for finding association rules [Electronic resource]. Access mode: http://www.drupal-site.com/algoritm-dlya-poiska-associativnyh-pravil
  16. Website usability: analysis, evaluation and testing [Electronic resource]. Access mode: http://www.sembook.ru/book/povyshenie-konversii-sayta/yuzabiliti-sayta/