
Abstract

Introduction

Copyright is an intellectual property right. Anyone (the author) who has invested time, effort and other resources in creating something (an object of law) wants to protect that creation from encroachment by third parties (subjects of law). Ownership of a physical object can be supported legally by documents, but software is essentially a set of digital information, and information is an abstract object: extracting a piece of program code and reusing it in other software is very difficult to control or stop. For this reason, methods of copyright protection for software products are being actively developed today.

A copy protection system, or copyright protection system, is a set of tools that prevent or prohibit the illegal distribution, use and/or modification of software.

Illegal distribution refers to the sale, exchange, or free distribution of a software product whose copyright belongs to a third party, without that party's consent.

Today, the problem of copyright is very acute, since software products are, in essence, information and therefore cannot be protected from theft by the same methods as physical objects. Steganographic methods make it possible to embed information that proves authorship without revealing the existence of that information to third parties. Protection is considered effective if breaking it requires more resources than purchasing a copy of the product.

1. Goal and tasks of the research

The purpose of this work is to study existing algorithms for software protection by means of steganography and to improve the efficiency of such algorithms.

The main objectives of the study:

  1. Review of the literature on the problem.
  2. Review of existing technologies and solutions for copyright protection by steganographic methods.
  3. Analysis of the architecture of existing solutions. Comparison of technologies. Advantages and disadvantages of the proposed methods of copyright protection.
  4. Finding solutions to the shortcomings of existing software copyright protection algorithms.
  5. Creating requirements for an improved software copyright protection algorithm using steganographic protection tools.
  6. Development of an original algorithm for copyright protection using steganography.

Research object: digital steganography.

Research subject: software protection algorithms based on digital steganography methods.

As part of the master's work, it is planned to obtain relevant scientific results in the following directions:

  1. Relevance of applying steganography-based software protection mechanisms.
  2. Potential effectiveness of steganography-based software protection mechanisms.
  3. Modification of known software protection methods by steganographic methods.

2. Software protection problems

There are two main questions in software product protection: who is the author of the product, and was a given copy obtained in compliance with the agreements (for example, was it purchased or pirated)? The first question can be answered by embedding information about the author in the product. The second can be answered by embedding information about the copy (a copy number or information about the entity that purchased it). However, if such information is placed in the product without any attempt to hide it, attackers can delete or modify it for their own purposes. To avoid this, the information (the label) can be hidden or encrypted, and various methods of steganographic and cryptographic label embedding are used for this purpose. If a dispute arises between potential authors, authorship can be confirmed by revealing the authorship label embedded in the disputed copy of the product. This type of label is called a digital watermark (DWM) [1].

If the question is whether a copy was obtained in compliance with the agreements, a label containing data that identifies the copy can be extracted. Such labels are called digital fingerprints (DFP) [2]. Thus, for the preliminary protection of the software product, the author needs to solve the following issues:

3. Research and development overview

Steganographic protection methods have become an object of research relatively recently. With the advent of digital steganography, media files such as images, video and sound files were most often used as containers for stego messages. Embedding digital watermarks in software requires more sophisticated algorithms. The literature on applying digital labels to software is discussed below.

3.1 Relevance of the topic

Digital steganography is most often associated with media files (images, video and audio), less often with text, and even less often with program code. This is due to the complexity of implementing a message embedding algorithm for the latter containers. Media files are intended for perception by the human senses, which register the image as a whole rather than its individual elements, so information can easily be embedded unnoticed by a third-party observer, and even a computer may not always be able to detect the presence of a message. Text files are more difficult, since a person can distinguish individual elements (letters and symbols), so an algorithm for embedding a stego message in digital text is harder to implement. Program code is intended to be interpreted by a computer, so the slightest change in it can prevent the computer from performing the operations prescribed by the developer. For this reason, embedding stego messages in software is not widely used.

Software protection with a DFP makes sense if the product that may be disputed is released by a small company in a limited edition (software made to order). For software aimed at a wide range of users, a more effective solution is to provide the application through cloud services, allowing users to work only through a thin client. However, this approach is more expensive, since in addition to the cost of software development it requires the cost of maintaining the service. Software protection with a DFP therefore remains a compromise between cost and effectiveness.

The graph in Figure 1 shows that the term "steganography" is more common in foreign literature, especially in English-language literature published in the United States. For most of the first decade of the 21st century, German literature held the lead.

Figure 1 – Dynamics of the use of the term "steganography" in the literature (animation: 9 frames, 8 loops)

If the terms "digital" and "steganography" are considered together, American literature pulls ahead in the second half of the first decade of the 21st century and holds the lead throughout the second decade (see Figure 2).

Figure 2 – Dynamics of the use of the terms "digital" and "steganography" in the literature (animation: 9 frames, 8 loops)

3.2 Overview of national sources

The book Copy Protection (Защита от копирования) by A. Shcherbakov [3], published in 1992, covers the common software protection methods in use 30 years ago. The methods described in the book are based on protecting software from copying by binding it to a licensed medium and on protecting it from disassembly.

Today, protecting software against copying from physical magnetic and optical media has lost its relevance because much software has moved to digital distribution. The methods of protection against debugging and tracing described in the book target x86 processors and require modification for the x64 family, but the principles behind them remain relevant.

Chapter 6, "Running protected programs and working with protected data" (Выполнение защищённых программ и работа с защищёнными данными), proved useful. It describes the principles of building an external loader of executable modules that verifies the conditions under which the software is started and operated. This will help in creating a loader that refuses to work if, for example, the label has been detected and modified.

The article Peering Inside the PE [4] describes the structure of the PE file. The Portable Executable file is the main type of executable file in Windows operating systems. The article explains step by step how executable files are loaded, the purpose and structure of the headers, and the principle of alignment in memory. This suggests the possibility of embedding labels in the fields that are filled with zero values as a result of section alignment. The header structure also gives the rules for addressing headers and sections, which helps to avoid conflicts when embedding a label. Combined with the methods for building an external loader, this will help hide the label content (and the program code in general) from the end user.

The textbook by M. Yu. Monakhov and V. F. Tashmukhamedova, Copyright Protection (Защита авторских прав) [5], covers the legal side of software protection and also the basics of the technical methods and means of protecting software copyright. As in many other materials devoted to technical methods of software protection, it sets out the principles of encrypting program code while it is stored on local media and decrypting it only at the moment of execution. It lists the most common methods of breaking protection together with methods of counteracting them, which is very useful for protecting the embedded label. Like the book by A. Shcherbakov, Copy Protection, it notes that debuggers use standard interrupts, so, knowing the numbers of these interrupts, one can change their vectors and thereby complicate debugging for attackers. It also describes methods for embedding labels based on the structural features of PE files described in the article Peering Inside the PE.

S. A. Sereda, in his work Evaluating the Effectiveness of Software Protection (Оценка эффективности защиты программного обеспечения) [6], identifies two types of software protection tools: packers/encryptors, and protection systems against unauthorized copying and unauthorized access. The former were originally used to compress data and reduce the size of the executable module but, as the author notes, the goal of protecting software from analysis of its algorithms and from unauthorized modification later came to the fore. This technique can also be used to preserve the integrity of labels in the executable module.

However, the author also points out the disadvantages of this method, the most significant of which, in my opinion, is that packing and encrypting executable code conflicts with the prohibition on self-modifying code in modern operating systems.

The second type of tools binds the software to its distribution medium in order to complicate copying and, ideally, to make it impossible to launch the software without the medium present. As mentioned above, this method is no longer relevant because of the spread of digital software distribution.

To make it more difficult to copy software and run it on multiple machines, many authors suggest authenticating the PC by means of some immutable or rarely changed security parameters, for example the ID of the LCD or the CPU. The problem is that the open PC architecture effectively depersonalizes each IBM PC-compatible machine. First, the check may deny access to a user who obtained the copy legally but later replaced the hardware components used for authentication. Second, the check can be deceived if the request for hardware configuration data is intercepted and, instead of the real data, a response is returned that satisfies the access check [7].

3.3 Overview of international sources

A method for embedding a label based on a zero-knowledge proof is proposed in [8]. To describe it, the concepts of prover and verifier need to be introduced. The prover is an agent or subject who claims to know the proof of a statement and tries to demonstrate this. The verifier is an agent or subject who tries to obtain the proof from the prover. At the end of the interaction, called a protocol, the prover convinces the verifier that it knows the proof without passing on any additional knowledge about the content of the information being verified.

One of the best-known protocols for identification using a zero-knowledge proof is the protocol proposed by Amos Fiat and Adi Shamir, whose security rests on the difficulty of extracting a square root modulo a sufficiently large composite number n whose factorization is unknown [9]. In the same source, the author confirms our observation that steganographic methods for media objects have been actively developed since the 1990s, while the application of the same methods to software has become a topic of study only relatively recently.
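The prover/verifier exchange described above can be illustrated with a minimal sketch of one round of a Fiat-Shamir style identification. This is the simple textbook variant with the public value v = s² mod n, not the label embedding scheme from [8]. The function names and the toy primes are assumptions made for illustration, and the modulus is far too small to be secure.

```python
import math
import secrets


def keygen(p: int, q: int) -> tuple[int, int, int]:
    """Return (n, s, v): modulus, prover's secret, public value v = s^2 mod n."""
    n = p * q
    while True:
        s = secrets.randbelow(n - 2) + 2      # candidate secret in [2, n - 1]
        if math.gcd(s, n) == 1:               # the secret must be coprime to n
            return n, s, pow(s, 2, n)


def run_round(n: int, s: int, v: int) -> bool:
    """Run one commit / challenge / response round and return the verifier's verdict."""
    r = secrets.randbelow(n - 1) + 1          # prover: random nonce
    x = pow(r, 2, n)                          # prover -> verifier: commitment
    e = secrets.randbelow(2)                  # verifier -> prover: challenge bit
    y = (r * pow(s, e, n)) % n                # prover -> verifier: response
    return pow(y, 2, n) == (x * pow(v, e, n)) % n   # verifier: y^2 == x * v^e (mod n)


# Toy parameters for illustration only (assumed values, not from the source).
n, s, v = keygen(1009, 1013)
accepted = all(run_round(n, s, v) for _ in range(20))
```

A prover who does not know s can prepare a valid response for only one of the two possible challenge bits, so each round halves the chance of successful cheating. The verifier sees only x, e and y, which do not reveal s.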

A different method is proposed in the proceedings of the 2015 International Conference on Architectural, Energy and Information Engineering [10]. It is called "Improved dynamic software watermarking algorithm based on R-tree". The algorithm is divided into four stages:

  1. Watermarking–sharing based on m–n variable carrying rule.
  2. Zero coding.
  3. Variable–Base Factorial Number System.
  4. R–tree related features.

As its name implies, this method is an improved version of the original dynamic software watermarking (DWM) algorithm based on the R-tree [11]; compared with the original, it is more resistant to additive attacks and to attacks that substitute the DWM.
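One of the stages listed above names the factorial number system. As a generic illustration only (the details of how [10] uses it are not given here, so this is not that algorithm), the sketch below converts an integer, which could stand for a watermark value, to and from factorial-base digits. The function names are hypothetical.

```python
from math import factorial


def to_factoradic(n: int) -> list[int]:
    """Return little-endian factorial-base digits d_i with 0 <= d_i <= i."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    radix = 1
    while True:
        digits.append(n % radix)   # digit at position radix - 1
        n //= radix
        radix += 1
        if n == 0:
            return digits


def from_factoradic(digits: list[int]) -> int:
    """Inverse of to_factoradic: sum of d_i * i!."""
    return sum(d * factorial(i) for i, d in enumerate(digits))


value = 463
encoded = to_factoradic(value)
restored = from_factoradic(encoded)
```

Each digit at position i is bounded by i, so a digit sequence of length k corresponds to exactly one of k! values, which is why this representation is often paired with permutations.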

4. Comparison of steganographic methods for embedding tags in a software product

The method of embedding a label depends, first, on the type and format of the product and, second, on the ability to hide information in a product of that specific format. One way to implement a label is to introduce some information at the stage of writing the product code, for example by adding undocumented functionality to the product. Suppose the author of an application arranged for the label to be displayed when commands not described in the application documentation are entered.

Advantages:

Disadvantages:

Today, one of the most common executable file extensions is EXE. The structure of such files is not tied to the programming language in which the software product was written. Embedding labels in object code is therefore more universal and independent of the program interface.

One of these methods is based on the presence of free space in the object code of programs stored in executable files, which may contain fully or partially free sections. The method relies on the fact that, when a Portable Executable (PE) file is written, the alignment procedure fills the missing bytes with null values so that the section length is a multiple of a certain value (see Figure 3).

Figure 3 – The principle of alignment

Figure 3 shows a section whose size in physical memory is 0x28 bytes, while in virtual memory its size must be a multiple of 0x50. To achieve this, the bytes at addresses 0x28–0x50 are filled with null bytes (0x00). Labels can be hidden in such sections.

Embedding author information in free sections, first, guarantees the correct operation of the modified object code, because nothing is embedded in a significant part of the code, and second, does not change the file size after the label is embedded.
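The alignment arithmetic from Figure 3 and the idea of writing a label into the padding can be sketched as follows. This works on an in-memory byte buffer rather than on a real PE file, and the function names and the sample label are assumptions made for illustration.

```python
def padding_size(raw_size: int, alignment: int) -> int:
    """Number of fill bytes needed to round raw_size up to a multiple of alignment."""
    return (-raw_size) % alignment


def embed_in_padding(section: bytes, raw_size: int, label: bytes) -> bytes:
    """Write label into the zero padding that follows the first raw_size bytes."""
    padding = section[raw_size:]
    if any(padding):
        raise ValueError("padding is not empty; refusing to overwrite it")
    if len(label) > len(padding):
        raise ValueError("label does not fit into the padding")
    marked = bytearray(section)
    marked[raw_size:raw_size + len(label)] = label   # same-length overwrite
    return bytes(marked)


# Values from Figure 3: 0x28 bytes of content aligned to a multiple of 0x50.
content = bytes(range(1, 0x29))                       # 0x28 non-zero bytes
section = content + bytes(padding_size(0x28, 0x50))   # content + null padding
marked = embed_in_padding(section, 0x28, b"AUTHOR")   # hypothetical label
```

Because the label only overwrites bytes that were already null padding, the meaningful content and the total length of the section stay the same.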

Advantages:

Disadvantages:

The second method of embedding in the object code is to change parts of the object code that do not affect the operation of the software product. For example, the structure of executable PE files is such that changing the values of some fields does not affect the functionality of the software product (see Figures 4 and 5) [5].

Figure 4 – A fragment of object code without an embedded label

Figure 5 – A fragment of object code with an embedded label

For example, the DOS header has a size of 64 bytes, of which 6 bytes are critical: 2 bytes are reserved for the MZ signature at the beginning of each PE file, without which the file simply will not start, and 4 bytes store the offset to the PE header. That leaves 58 bytes for an arbitrary message.
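A minimal sketch of this idea is shown below. It assumes, as the text states, that only the MZ signature (offset 0) and the 4-byte offset to the PE header (offset 0x3C) matter to the loader. The function names, the sample header and the label are hypothetical, and a file whose DOS stub actually uses the other header fields would be affected by such a change.

```python
import struct

LABEL_START = 2        # first byte after the 2-byte MZ signature
LABEL_END = 0x3C       # offset of the 4-byte pointer to the PE header
CAPACITY = LABEL_END - LABEL_START   # 58 bytes


def embed_label(image: bytes, label: bytes) -> bytes:
    """Overwrite the non-critical DOS header bytes with a zero-padded label."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ image")
    if len(label) > CAPACITY:
        raise ValueError(f"label is longer than {CAPACITY} bytes")
    marked = bytearray(image)
    marked[LABEL_START:LABEL_END] = label.ljust(CAPACITY, b"\x00")
    return bytes(marked)


def extract_label(image: bytes) -> bytes:
    """Read the label back, dropping the trailing zero padding."""
    return image[LABEL_START:LABEL_END].rstrip(b"\x00")


# A synthetic 64-byte DOS header: signature, zeroed fields, PE offset 0x80.
dos_header = b"MZ" + bytes(CAPACITY) + struct.pack("<I", 0x80)
marked_header = embed_label(dos_header, b"(c) Author, copy 0042")
```

The signature and the PE header offset are left untouched, and the header length does not change, which matches the 2 + 4 critical bytes and 58 free bytes counted above.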

Advantages:

Disadvantages:

The third method of embedding a label in the object code is based on making the software product stop functioning correctly if the label is changed or deleted. This method can be implemented at the product coding stage or as a separate module that takes appropriate measures if it detects an integrity violation of the label (in the case of a DWM) or a mismatch in the control data (in the case of a DFP) [4].
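As a rough sketch of the "separate module" variant, the check below compares a digest of the embedded label with a reference value before the protected code is allowed to run. The use of SHA-256, the function names and the choice to raise an error as the "appropriate measure" are all assumptions made for illustration; the source does not specify them.

```python
import hashlib
import hmac
from typing import Callable


def label_digest(label: bytes) -> bytes:
    """Reference value stored with the program when the label is embedded."""
    return hashlib.sha256(label).digest()


def run_protected(label: bytes, expected: bytes, entry_point: Callable[[], str]) -> str:
    """Run entry_point only if the embedded label still matches the reference digest."""
    if not hmac.compare_digest(label_digest(label), expected):
        # Assumed reaction: refuse to start. A real product could react differently.
        raise RuntimeError("label integrity check failed")
    return entry_point()


original_label = b"(c) Author, copy 0042"     # hypothetical embedded label
reference = label_digest(original_label)
result = run_protected(original_label, reference, lambda: "program started")
```

In this form the check is easy to locate and patch out, so in practice it would need to be combined with the code hiding techniques discussed in the literature review.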

Advantages:

Disadvantages:

Conclusion

In the course of the research, existing methods of software copyright protection by means of steganography were analyzed, and their advantages and disadvantages were described. In the future, it is planned to develop an original algorithm for software protection by means of steganography, improving on the effectiveness of existing copyright protection algorithms.

At the time of writing this abstract, the master's thesis has not yet been completed. Final completion: May 2021. The full text of the work and materials on the topic can be obtained from the author or his supervisor after that date.

References

  1. Мельников Ю., Теренин А., Погуляев В. Цифровые водяные знаки – новые методы защиты информации. PC Week N48, 2007.
  2. Орешин Е. Эффективные способы защиты авторских и смежных прав в Интернете. Журнал Суда по интеллектуальным правам N9, 2015, с. 48‑55.
  3. Щербаков А. Защита от копирования. – М.: ЭДЭЛЬ, 1992. – 79 с.
  4. Peering Inside the PE: A Tour of the Win32 Portable Executable File Format. – Access mode: https://docs.microsoft.com/ru-ru/previous-versions/ms809762(v=msdn.10)
  5. Монахов М., Ташмухамедова В. Защита авторских прав на программное обеспечение. Владимирский государственный университет, 2009. – с. 58.
  6. Середа С. Оценка эффективности систем защиты программного обеспечения. – Access mode: http://www.security.ase.md/publ/ru/pubru30.html
  7. Anderson R. Security Engineering. – N.Y.: John Wiley & Sons, 2001. – p. 612.
  8. M. Barni et al. (Eds.): IWDW 2005, LNCS 3710, pp. 299–312, 2005.©Springer–Verlag Berlin Heidelberg 2005.
  9. Feige U., Fiat A., Shamir A. Zero Knowledge Proofs of Identity // Journal of Cryptology. – 1988. – vol. 1, Iss. 2. – pp. 77–94.
  10. He L., Xu J.F. Improved dynamic software watermarking algorithm based on R-tree // Architectural, Energy and Information Engineering (AEIE 2015), Xiamen, China, 2015. – pp. 531–535.
  11. Xu H., Chen H., Feng D., Li D. Dynamic software watermarking algorithm. – 2005. – 33. – pp. 172–174.