Abstract

Introduction

Today, networks carry a huge volume of files of many types: digital photos, videos, music and many others. However, the most intensive traffic is still plain text. This practically endless flow of textual information makes it possible to embed a secret message in it and transmit that message unnoticed.

1. Relevance of the topic

Secure communication in text documents is the subject of a branch of secure communication technologies known as linguistic steganography. The distinguishing feature of this branch is that ordinary plain texts serve as the containers [1]. Moreover, these texts must look completely "harmless", i.e. they must not make the reader suspect that a secret message is hidden inside.

Linguistic steganography is understood as the hidden encoding of arbitrary information in an arbitrary carrier text, relying on non-trivial linguistic ideas and resources.

Clearly, in an unsafe world such an important application of linguistics attracts people far removed from science: software distributors (who need to hide a unique sales number in the product delivered to a customer), brokers (who need to report changes in an exchange rate or rating secretly), diplomats (who need to identify the source of a leak of information of state importance), and security personnel (no explanation is needed here) [2].

The main advantage of linguistic steganography, which explains its growing relevance in the modern world, is that, unlike other types of steganography, the secret message can be delivered in any way: by e-mail, in a handwritten note or in conversation.

2. Goal and tasks of the research

The main goal of this master's work is to develop a system for hiding textual information based on linguistic resources, namely on a method of linguistic steganography that generates meaningful text.

The main tasks of the research:

  • to analyze the state of the art and survey the literature on information protection methods;
  • to develop an algorithm for hiding information in a text file using a database;
  • to develop a rule base (sentence templates) and a knowledge base (dictionary);
  • to create a software application implementing the steganographic system;
  • to analyze the effectiveness of the method.

The object of this work is the modelling of a system for hiding textual information by generating meaningful text.

The subject of the research is hiding messages in a text container for subsequent transmission.

Research methods and technologies used: the algorithms for hiding and extracting textual information are implemented in the object-oriented language Java.

3. Prospective scientific novelty

This work aims to create a steganographic system built on a generator of meaningful text. The prospective scientific novelty lies in creating a generator of Russian texts, which will subsequently be turned into an information hiding system.

General description of the information hiding system

A steganographic system (stegosystem) is the combination of methods and tools used to create a covert channel for transmitting information. The following assumptions are made when constructing such a system.

  1. The adversary knows how the steganographic system works. The only thing unknown to the adversary is the key, which reveals the existence and content of the secret message.
  2. If the adversary detects the presence of a hidden message, he must not be able to extract it until he obtains the key.
  3. The adversary has no technical or other advantages [8].

The diagram of the system is shown in Figure 1.

Figure 1 — Visualization of information hiding system based on linguistic resources

(animation: 16 frames, 7 cycles of repetition, 11.7 KB)

First, the sender types the message to be hidden and transmitted. The message is converted into a bit string and sent to the database (DB) (1), where words from the dictionary are assembled into sentences according to predefined rules, each word being chosen according to whether the corresponding bit of the hidden message is a binary "0" or a binary "1". At the output of the database the user receives the finished stegotext (2), which is sent to the recipient. Having opened the letter, the recipient sends the text to the database (3), where it is matched against the dictionary and decoded. The database returns the recovered message to the user (4). In this scheme the key of the steganographic system is the database itself, which is available only to the sender and the recipient. This is the most general description of how the elements of the system interact.
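The conversion of a message into a bit string can be sketched as follows. This is a minimal illustration, not the author's implementation: the class name BitStrings, the choice of UTF-8 and the 8-bits-per-byte layout are assumptions, since the source does not specify how the message is serialized.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the name and the UTF-8 encoding are assumptions.
final class BitStrings {

    // Turns a message into a string of '0'/'1' characters, 8 bits per UTF-8 byte,
    // most significant bit first.
    static String toBits(String message) {
        StringBuilder bits = new StringBuilder();
        for (byte b : message.getBytes(StandardCharsets.UTF_8)) {
            for (int i = 7; i >= 0; i--) {
                bits.append((b >> i) & 1);
            }
        }
        return bits.toString();
    }

    // Inverse operation used on the recipient's side.
    static String fromBits(String bits) {
        if (bits.length() % 8 != 0) {
            throw new IllegalArgumentException("Bit string length must be a multiple of 8");
        }
        byte[] bytes = new byte[bits.length() / 8];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(bits.substring(i * 8, i * 8 + 8), 2);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

With this layout every character of a Latin-alphabet message costs eight bits, so even a short message needs many word choices in the generated text.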

Linguistic steganography method based on the generation of meaningful text

The essence of the method is that sentence templates and a dictionary are prepared in advance so that the generated text looks natural. To encode the data stream, a sentence template is first selected at random, and then words are chosen from the dictionary to fill it according to the bits of the message being hidden.

However, while forming the templates poses no particular difficulty, creating the dictionary requires taking the peculiarities of the Russian language into account.

In Russian, a word is rarely used without the proper ending, which makes it agree with the surrounding words in the text. To handle this, we propose splitting the dictionary into separate tables by part of speech and by case, gender, number, declension and conjugation.
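One possible way to organize such a split dictionary is sketched below. All names (FormDictionary, the enums, the demo entries) are hypothetical, the word forms are transliterated for readability, and declension and conjugation classes are left out for brevity. A table is selected by its grammatical attributes, and the position of a word form inside the table is the bit it encodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one table of word forms per combination of grammatical attributes.
final class FormDictionary {
    enum PartOfSpeech { NOUN, VERB }
    enum GrammaticalCase { NOMINATIVE, ACCUSATIVE, NOT_APPLICABLE }
    enum Gender { MASCULINE, FEMININE, NEUTER }
    enum GrammaticalNumber { SINGULAR, PLURAL }

    private final Map<String, List<String>> tables = new HashMap<>();

    private static String key(PartOfSpeech pos, GrammaticalCase c, Gender g, GrammaticalNumber n) {
        return pos + "/" + c + "/" + g + "/" + n;
    }

    // Appends a word form to the table identified by the given attributes.
    void add(PartOfSpeech pos, GrammaticalCase c, Gender g, GrammaticalNumber n, String form) {
        tables.computeIfAbsent(key(pos, c, g, n), k -> new ArrayList<>()).add(form);
    }

    // Returns the form whose index in the table equals the bit to be hidden.
    String pick(PartOfSpeech pos, GrammaticalCase c, Gender g, GrammaticalNumber n, int bit) {
        List<String> table = tables.get(key(pos, c, g, n));
        if (table == null || bit < 0 || bit >= table.size()) {
            throw new IllegalArgumentException("No word form for bit " + bit);
        }
        return table.get(bit);
    }

    // Small demo dictionary (assumed entries, transliterated Russian).
    static FormDictionary demo() {
        FormDictionary d = new FormDictionary();
        d.add(PartOfSpeech.NOUN, GrammaticalCase.NOMINATIVE, Gender.FEMININE, GrammaticalNumber.SINGULAR, "Mariya");
        d.add(PartOfSpeech.NOUN, GrammaticalCase.NOMINATIVE, Gender.FEMININE, GrammaticalNumber.SINGULAR, "Olga");
        d.add(PartOfSpeech.VERB, GrammaticalCase.NOT_APPLICABLE, Gender.FEMININE, GrammaticalNumber.SINGULAR, "kupila");
        d.add(PartOfSpeech.VERB, GrammaticalCase.NOT_APPLICABLE, Gender.FEMININE, GrammaticalNumber.SINGULAR, "prochitala");
        d.add(PartOfSpeech.NOUN, GrammaticalCase.ACCUSATIVE, Gender.FEMININE, GrammaticalNumber.SINGULAR, "knigu");
        d.add(PartOfSpeech.NOUN, GrammaticalCase.ACCUSATIVE, Gender.FEMININE, GrammaticalNumber.SINGULAR, "gazetu");
        return d;
    }
}
```

Because a template slot fixes the attributes (for example, a feminine singular subject requires a feminine singular past-tense verb), every word taken from the matching table already carries the correct ending.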

Thus, the method builds a binary tree and composes the text by selecting those leaves of the tree that encode the required bits.

For example, suppose the combination "100" must be encoded.

The text is to be generated according to the sentence template:

Subject Predicate Object

There is a table for the subject (Maria, Olga), a table for the predicate (Bought, Acquired) and a table for the object (A Dress, A Sundress).

0: Maria 1: Olga

0: Bought 1: Acquired

0: A Dress 1: A Sundress

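Bit 1 selects "Olga", the first bit 0 selects "bought" and the second bit 0 selects "a dress". Thus, the resulting sentence is: Olga bought a dress.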
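The example above can be reproduced with a short sketch. It assumes a single fixed template (Subject Predicate Object) and the three two-word tables from the example. The class name TemplateStego and its methods are hypothetical, and a real system would pick templates at random and handle word agreement.

```java
// Hypothetical toy encoder/decoder for the template "Subject Predicate Object".
final class TemplateStego {

    // One table per template slot; index 0 encodes bit 0, index 1 encodes bit 1.
    static final String[][] SLOTS = {
        {"Maria", "Olga"},
        {"bought", "acquired"},
        {"a dress", "a sundress"}
    };

    // Builds a sentence that hides exactly one bit per slot.
    static String encode(String bits) {
        if (bits.length() != SLOTS.length) {
            throw new IllegalArgumentException("Expected " + SLOTS.length + " bits");
        }
        StringBuilder sentence = new StringBuilder();
        for (int i = 0; i < SLOTS.length; i++) {
            char c = bits.charAt(i);
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("Not a bit: " + c);
            }
            if (i > 0) {
                sentence.append(' ');
            }
            sentence.append(SLOTS[i][c - '0']);
        }
        return sentence.append('.').toString();
    }

    // Recovers the bits by matching each slot against its table.
    static String decode(String sentence) {
        String rest = sentence.trim();
        if (rest.endsWith(".")) {
            rest = rest.substring(0, rest.length() - 1);
        }
        StringBuilder bits = new StringBuilder();
        for (String[] slot : SLOTS) {
            rest = rest.trim();
            int bit = -1;
            for (int b = 0; b < slot.length; b++) {
                if (rest.startsWith(slot[b])) {
                    bit = b;
                }
            }
            if (bit < 0) {
                throw new IllegalArgumentException("Unknown word in: " + sentence);
            }
            bits.append(bit);
            rest = rest.substring(slot[bit].length());
        }
        return bits.toString();
    }
}
```

Here TemplateStego.encode("100") yields "Olga bought a dress." and decoding that sentence returns "100". Both sides need the same tables, which is why the database plays the role of the key.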

Of course, the dictionary for this method must be compiled correctly and precisely, because the robustness of the method, i.e. its ability to generate stegotext that resembles natural text, is provided by the predefined grammar rules.

The absence of grammatical and spelling errors in the sentences makes it difficult to distinguish artificial text from natural text. The meaningfulness of a text can be assessed only by a human, which is not always feasible because of the sheer volume of information to be analyzed. The most effective method of analysis uses prediction to reveal the artificial nature of text generated by the program Nicetext. First, the words of the first half of the text are analyzed, and each subsequent word of the second half is then forecast. If the forecast succeeds in the vast majority of cases, the text is natural. Frequent forecasting errors may indicate artificial text. For the programs Texto and Markov-Chain-Based, methods are used that take into account the correlation of words between sentences. For instance, sentences containing words found only in technical texts are assumed not to stand next to sentences containing words that occur only in literary texts [10]. For this reason, it is recommended that the dictionaries in the method described above be split by topic.
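The prediction idea can be illustrated with a deliberately simplified sketch. It is not the method from [10], whose details the source does not give. It assumes a plain bigram model: the most frequent successor of each word is learned from the first half of the text and used to forecast words in the second half. The class name PredictionCheck and the method hitRate are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, strongly simplified illustration of prediction-based analysis.
final class PredictionCheck {

    // Returns the share of successful forecasts in the second half of the text
    // (0.0 if no forecast could be made at all).
    static double hitRate(String text) {
        String[] words = text.trim().toLowerCase().split("\\s+");
        int half = words.length / 2;

        // Count which word follows which in the first half.
        Map<String, Map<String, Integer>> followers = new HashMap<>();
        for (int i = 0; i + 1 < half; i++) {
            followers.computeIfAbsent(words[i], k -> new HashMap<>())
                     .merge(words[i + 1], 1, Integer::sum);
        }

        int forecasts = 0;
        int hits = 0;
        for (int i = half; i + 1 < words.length; i++) {
            Map<String, Integer> next = followers.get(words[i]);
            if (next == null) {
                continue; // word not seen in the first half, no forecast possible
            }
            forecasts++;
            String best = null;
            int bestCount = 0;
            for (Map.Entry<String, Integer> e : next.entrySet()) {
                if (e.getValue() > bestCount) {
                    best = e.getKey();
                    bestCount = e.getValue();
                }
            }
            if (best.equals(words[i + 1])) {
                hits++;
            }
        }
        return forecasts == 0 ? 0.0 : (double) hits / forecasts;
    }
}
```

Under this assumption a high hit rate would point to natural text and a low one to generated text. Any decision threshold would have to be chosen empirically.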

Conclusion

This material will form the basis for developing an information hiding system based on the method of linguistic steganography. The method is an improvement over the synonym replacement method described in [3]. The proposed system may be fundamentally new for working with texts in Russian.

At the time of writing this abstract, the master's work is not yet complete. Final completion: December 2014. The full text of the work and materials on the topic can be obtained from the author or the author's supervisor after that date.

References

  1. Aliev A.T. Linguistic steganography based on synonym substitution for texts in Russian / A.T. Aliev // Izvestiya YuFU. Tekhnicheskie nauki, No. 11, 2010, pp. 163-170.
  2. Bolshakov I.A. Using synonyms constrained by context collocations for the purposes of linguistic steganography / I.A. Bolshakov, 2004, pp. 23-29.
  3. Larionova K.E. Methods of encoding arbitrary information in computer texts based on linguistic resources [Electronic resource]. Access mode: http://masters.donntu.ru/2009/fvti/...
  4. Spam mimic disguises secret correspondence as spam [Electronic resource]. Access mode: http://daily.sec.ru/2000/12/19/Spam-mimic...
  5. Catalogue of linguistic programs and resources on the Web [Electronic resource]. Access mode: http://www.rvb.ru/soft/...
  6. Yandex.Referaty [Electronic resource]. Access mode: http://referats.yandex.ru
  7. A.S. Pushkin Poem Generator [Electronic resource]. Access mode: http://referats.yandex.ru/pushkin/
  8. Steganography. From Wikipedia, the free encyclopedia [Electronic resource]. Access mode: http://ru.wikipedia.org/wiki/Стеганография
  9. Bolshakov I.A. CrossLexica: a universe of links between Russian words / I.A. Bolshakov // Biznes-informatika, No. 3(25), 2013, pp. 19-26.
  10. Nechta I.V. Development of methods for ensuring the security of information technology use based on the ideas of steganography. Author's abstract [Electronic resource]. Access mode: www.sibsutis.ru/...