DonNTU > Masters' Portal of the DonNTU

Переключиться на русский вариант Switch to the English version Перемикнутися на український варіант

Main Page

Library

Links

Search Report

Individual


Denis Protivenskiy

Research and development of methods of the program source code obfuscation

ABSTRACT

Introduction

Obfuscation process is one of the most efficient ways of program protection from computer violations (such as industrial espionage, reverse engineering etc.) today. Obfuscation means applying semantics-preserving source code transformations in order to protect a program from being reverse engineered. The goal of the obfuscation is not to obtain complete and/or long-term protection, but to provide such level of complexity that makes program's analysis non-value-added compared to development cost of similar product from scratch.

There are four main classes of obfuscating transformations:

  1. Lexical obfuscation.
  2. Data structure obfuscation.
  3. Control flow graph obfuscation.
  4. Preventive obfuscation.

Fig. 1. Obfuscating Transformations Classification (5 frames, size 33,2 Kb).

Lexical obfuscation means removing all the comments from code or changing these comments to give misinformation; removing spaces and indents; identifier names distortion and other.

Data structure obfuscation is used to change program's data structures. It includes:

  • Data storage obfuscation, which means data storage transformation and data type transformation (for example, creation and use of uncommon data types);
  • Data merging and splitting obfuscation;
  • Reordering obfuscation, which is used to reorder variable declarations; internal data storage layout; methods, arrays, structures etc.

Control flow graph obfuscation is used to change a control flow graph of some function or a program at all. It may cause new functions' creation. It's the widest class of obfuscating transformations. Some of this class's methods are:

  • Deterministic dispatcher insertion. A deterministic dispatcher is a block, which gains control after running number of operators and continues execution of the algorithm, jumping to another block of operators depending on the specified condition. A dispatcher controls completely the way of algorithm's execution. A control flow graph may be transformed by using such a dispatcher;
  • Function inlining, which means that the function's body is being inlined into the invocation point;
  • Operator outlining, which is the opposite to the previous transformation;
  • Function interlacing. The idea is to mix two or more functions into one;
  • Loop rolling. The body of a loop is being copied two or more times, the exit condition and the counter increment operator is being modified respectively.

Preventive obfuscation is used to prevent the use of deobfuscators, decompilers and other deobfuscating tools.

Theme relevance

Source code obfuscation is a rather new way in the computer science. It appeared due to the evolution of the Internet as a protection from the "computer piracy". The first articles about obfuscation were published in 1996-1997 in the USA, where the level of this field of knowledge is the highest in the world up to the present time.

There is limited number of articles devoted to this theme in the CIS countries and in Ukraine. This situation might be caused by absence of big software companies of worldwide reputation here, as the necessity in creating protection tools arises with the products that need to be protected.

But I think that development of such an important branch of informatics as obfuscation must be performed in Ukraine even because of the wide theoretical basis of the obfuscating transformations (it intersects with compiler theory, discrete mathematics and theory of computation). Knowing theory, it's not hard to find an application of it, even if not in this realm, but in any related one as well.

Goals and objectives

The goal of the work is to develop an obfuscating method that transforms a program's control flow graph and/or a data flow graph. This method must be more optimal than other known methods of this obfuscation class for parameters of potency and/or resilience. Because of the fast growth of computing powers, parameter of cost of transformation will be considered as the least important during development process (but it's still considerable as an estimation criterion for the specified method). Following tasks are to be solved to realize the set goal:

  1. Consideration and analysis of existing methods of obfuscation, their classification.
  2. Study of issues of compilers' optimizing transformations and principles of work with control and data flow graphs.
  3. Creation of algorithm that performs obfuscating transformations and satisfies stated criteria of optimality.
Expected scientific novelty

Scientific novelty of the project is in creation of a new method, which helps make more diverse and complex obfuscating process of programs.

Experimental researches

A developed method must be realized in C++ programming language and tried out on test source code files of programs. Given results will be analyzed in order to obtain empirical values of main efficiency measures of the specified method.

Number of tests is planned to be carried out to detect possibility of successful interaction of this method with other known methods that are used in commercial obfuscators (if there are such obfuscators). If this interaction fails, analysis of reasons of such behavior will be made and recommendations will be given.

Overview of the results and conclusions

Main theoretical information about obfuscation methods is studied and systematized; area of development of new methods is researched at this stage of work.

Methods of obfuscation for procedural and object-oriented languages (like C++, Java) are considered, and these methods are a major portion of all available. They are related to substitution of standard constructions with equivalent but more complex for understanding and analysis.

Methods of data (arrays, classes, abstract data types) obfuscation are studied.

It's also paid attention to researches on creation of obfuscating algorithms for low-level programming languages (like Assembler and others).

Performed analysis set up a theoretical base to successful writing of the master's work.

Literature
  1. Christian Collberg, Clark Thomborson, and Douglas Low. A taxonomy of obfuscating transformations. Technical Report 148, Department of Computer Science, University of Auckland, July 1997.
  2. Christian Collberg, Clark Thomborson, and Douglas Low. Manufacturing cheap, resilient, and stealthy opaque constructs. In SIGPLAN-SIGACT POPL'98. ACM Press, San Diego, CA, January 1998.
  3. Boaz Barak, Oded Goldreich, Russel Impagliazzo, Steven Rudich, Amit Sahai, Salil Vadhan, Ke Yang. On the (im)possibility of obfuscating programs, Advances in Cryptology - CRYPTO'01, Springer Lecture Notes in Computer Science vol. 2139, pp. 1 - 18, Santa Barbara, CA, November 2001.
  4. Ахо А., Сети Р., Ульман Д. - Компиляторы: принципы, технологии и инструменты. : Пер. с англ. - М.: Вильямс, 2003 г. - 768 с.
  5. Ахо А., Ульман Д. - Теория синтаксического анализа, перевода и компиляции. В 2-х т. : Пер. с англ. - М.: Мир, 1978 г.

Main Page

Library

Links

Search Report

Individual

DonNTU > Masters' Portal of the DonNTU