Wednesday, March 5, 2008

Obfuscation and .NET

Reverse Engineering


Reverse engineering is the process of extracting the knowledge or design blueprints
from anything man-made. The concept has been around since long
before computers or modern technology, and probably dates back to the days
of the industrial revolution. It is very similar to scientific research, in which a
researcher is attempting to work out the “blueprint” of the atom or the human
mind. The difference between reverse engineering and conventional scientific
research is that with reverse engineering the artifact being investigated is man-made,
whereas scientific research investigates a natural phenomenon.

Traditionally, reverse engineering has been about taking shrink-wrapped
products and physically dissecting them to uncover the secrets of their design.
Such secrets were then typically used to make similar or better products. In
many industries, reverse engineering involves examining the product under a
microscope or taking it apart and figuring out what each piece does.

Assemblies generated under .NET may be decompiled into easily recognized source code that is either identical or similar to the original source code. Individuals or companies that deploy .NET assemblies to client machines may unwittingly be distributing their source code, which in most cases is an unintended consequence. Since source code is generally considered a valuable intellectual asset, measures should be taken to prevent decompilation into easily recognized source code.

MSIL Disassembler (Ildasm.exe)
The MSIL Disassembler is a companion tool to the MSIL Assembler (Ilasm.exe). Ildasm.exe takes a portable executable (PE) file that contains Microsoft intermediate language (MSIL) code and creates a text file suitable as input to Ilasm.exe.

Steps to use ILDASM

1. Locate ildasm.exe at “\Microsoft Visual Studio .Net 2005\ SDK\v2.0\Bin\ildasm.exe”.

2. Run ildasm.exe. The ILDASM program window opens.

3. Click File, then Open, and pick the exe or dll that you want to view. This displays the IL code of the exe or dll.

Note: After step 3, you can view a tree structure that displays information about the dll or exe.

4. Double-click the “MANIFEST” node to view the details of the assembly and its internal IL code.

5. Click the other nodes or namespaces at the top of the hierarchy.

6. To view the internals further, click the sub nodes.
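The same disassembly can also be produced without the GUI, since Ildasm.exe can write its output to a text file. The sketch below only builds and prints such a command line. It assumes ildasm is on the PATH and uses the `/out=` option. The file names `MyApp.exe` and `MyApp.il` and the helper `build_ildasm_command` are hypothetical, chosen for illustration.

```python
def build_ildasm_command(assembly_path, il_path, ildasm="ildasm"):
    """Return the argument list that asks Ildasm.exe to write IL text to a file."""
    # /out=<file> sends the disassembly to a text file instead of opening the GUI.
    return [ildasm, assembly_path, f"/out={il_path}"]


# Hypothetical file names; replace them with a real assembly and output path.
command = build_ildasm_command("MyApp.exe", "MyApp.il")
print(" ".join(command))
```

The printed command can be pasted into a Visual Studio command prompt. The resulting .il file is the text form that Ilasm.exe accepts as input.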



DotNet Reflection
Contains a decompiler and a powerful object browser.
Lutz Roeder's .NET Reflector
Reflector is a class browser for .NET components. It supports assembly and namespace views, type and member search, XML documentation, call and callee graphs, IL, Visual Basic, Delphi and C# decompilers, dependency trees, base type and derived type hierarchies, and resource viewers.

NDepend
NDepend analyses source code and .NET assemblies. It allows controlling the complexity, the internal dependencies and the quality of .NET code. NDepend provides a language (CQL, Code Query Language) dedicated to querying and constraining a codebase. It also comes with advanced code visualization (Dependencies Matrix, Metric treemap, Box and Arrows graph...), more than 60 metrics, and facilities to generate reports and to integrate with mainstream build technologies and development tools. NDepend also lets you compare different versions of your codebase precisely.

Lattix LDM
LDM reads in .NET code to extract intermodule dependencies which are then used to visualize and manage the architecture of .NET applications. The architecture is represented in a Dependency Structure Matrix (DSM) for a highly scalable representation that allows unwanted dependencies, often a result of unwanted architectural creep, to be identified quickly.

The Nature of .NET Assemblies


Assemblies in .NET consist of two major components: metadata and intermediate language code. The reflection capabilities of .NET languages, together with the assembly metadata that describes its classes and their method signatures, fields, properties, and events, make it possible to reverse engineer an assembly and retrieve this intellectual property. The intermediate language code provides the details of each method implementation. This enables well constructed decompilers to reveal the details of proprietary algorithms that you would typically not want users to have access to.



The Nature of Obfuscation





An ideal obfuscator mangles the large features (namespaces, class names, method signatures, and fields) and the small features (method details, and in particular string values defined as fields and within a method) of your assembly without changing its functionality. The more aggressive the mangling, the more likely it is that the obfuscated assembly will not run the same way as the original assembly. It is essential that an obfuscator keep the functionality of the software totally intact while making the original source code unrecognizable if the obfuscated assembly is decompiled.

There are several well known problem areas that a well designed obfuscator must allow the user to deal with. When runtime type identification is used in an application to determine whether an object is of a particular type, the class name of the type being sought is used. If the obfuscator has mangled this class name, the dynamic type identification will fail in the obfuscated assembly. The user of the obfuscator must therefore be given the option of selecting features that are not to be mangled. These include class, field, and method names. The same issue exists when dynamic class loading is performed within the assembly, or when serialization or remoting is used.
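The renaming problem described above can be illustrated with a toy model. This is a minimal sketch in Python, not how any particular .NET obfuscator works: `build_rename_map`, `obfuscate_registry`, and the type names are hypothetical, and a plain dictionary stands in for looking up a type by its name at run time.

```python
import itertools
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... as meaningless replacement names."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def build_rename_map(identifiers, keep=()):
    """Map each identifier to a short name, leaving names listed in `keep` untouched."""
    names = short_names()
    return {ident: ident if ident in keep else next(names) for ident in identifiers}


def obfuscate_registry(types_by_name, rename_map):
    """Re-key a name -> type table the way renaming changes the shipped names."""
    return {rename_map[name]: value for name, value in types_by_name.items()}


# Hypothetical application types that the program looks up by name at run time.
types_by_name = {"Invoice": "InvoiceType", "Customer": "CustomerType"}

mangled_all = obfuscate_registry(types_by_name, build_rename_map(types_by_name))
mangled_some = obfuscate_registry(
    types_by_name, build_rename_map(types_by_name, keep={"Invoice"})
)

print(mangled_all.get("Invoice"))   # None: the name-based lookup no longer finds the type
print(mangled_some.get("Invoice"))  # InvoiceType: the excluded name still resolves
```

The `keep` argument plays the role of the exclusion list: any name that the program refers to as a string at run time has to be left unmangled.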

One of the side effects of obfuscation is the difficulty of debugging obfuscated code. An exception that is generated and reported by a user will typically include mangled method and class names, making it almost impossible to identify the root cause. A clearly labeled map file provided by the obfuscation tool is essential for interpreting debugger output from the obfuscated assembly.

Hackers often search deployed assemblies for strings that contain keywords such as “Password” or “Enter password”. By locating such strings, hackers attempt to circumvent the password protection embedded in the product that they are hacking. Some obfuscators provide the option of string encryption. Although this may introduce a small performance penalty because of the need to decrypt strings at runtime, the overhead associated with such string encryption and decryption is often negligible.
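To show how a map file helps with the debugging problem mentioned above, here is a minimal sketch that translates a mangled stack trace back to the original names. The `original -> obfuscated` line format, the names in `MAP_FILE`, and the trace text are assumptions made for illustration; real obfuscators define their own map formats, and they usually scope names per type because the same short name is reused in many places.

```python
import re


def load_map(lines):
    """Parse 'original -> obfuscated' lines into an obfuscated -> original dict."""
    reverse = {}
    for line in lines:
        if "->" not in line:
            continue  # skip blank or malformed lines
        original, obfuscated = (part.strip() for part in line.split("->", 1))
        reverse[obfuscated] = original
    return reverse


def decode_trace(trace, reverse):
    """Replace every mangled identifier in a stack trace with its original name."""
    def restore(match):
        token = match.group(0)
        return reverse.get(token, token)  # leave unknown identifiers unchanged
    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", restore, trace)


# Hypothetical map file contents and a hypothetical trace reported by a user.
MAP_FILE = [
    "Billing -> a",
    "InvoiceCalculator -> b",
    "ComputeTotal -> c",
]
trace = "at a.b.c(Int32 x)"

decoded = decode_trace(trace, load_map(MAP_FILE))
print(decoded)  # at Billing.InvoiceCalculator.ComputeTotal(Int32 x)
```

A flat lookup like this is only safe when every mangled name is unique, which is why the sketch should be read as an illustration of the idea rather than a usable tool.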

Some obfuscator products include the ability to statically analyze the application and determine the parts that are not being used. This includes unused types, unused methods, and unused fields. This could be of great benefit if memory footprint is a concern.

Some obfuscators provide control flow obfuscation. Control flow obfuscation typically introduces false conditional statements and other misleading constructs in order to confuse decompilers. Some obfuscators destroy the code patterns that decompilers use to recreate source code. The trick is to confuse the decompiler without changing the functionality of the obfuscated assembly.

Incremental obfuscation allows the developer to make changes to the original sources after releasing an obfuscated assembly and then provide a patch to the user that reflects those changes while preserving the name mapping used in the original release. In order to accomplish this, a map file must be saved and later used to ensure that the renaming is preserved when making changes and re-releasing the obfuscated assembly. Some obfuscator products support this useful capability.

Finally, some obfuscators enable the user to embed watermarks such as user names and registration codes into the internal binary structures within the assembly.
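The incremental obfuscation described above comes down to reusing the saved map when renaming a later release. The following is a minimal sketch of that idea, with a JSON string standing in for the map file. The function `extend_rename_map` and the identifier names are hypothetical and do not correspond to any specific product.

```python
import itertools
import json
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... as meaningless replacement names."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def extend_rename_map(previous, identifiers):
    """Keep every earlier assignment and give fresh names only to new identifiers."""
    used = set(previous.values())
    fresh = (name for name in short_names() if name not in used)
    mapping = dict(previous)
    for ident in identifiers:
        if ident not in mapping:
            mapping[ident] = next(fresh)
    return mapping


# First release: nothing to preserve yet.
release_1 = extend_rename_map({}, ["Invoice", "Customer"])
saved_map = json.dumps(release_1)  # stands in for the map file kept with the release

# Patch release: one new identifier, and the source order has changed.
release_2 = extend_rename_map(json.loads(saved_map), ["Customer", "Discount", "Invoice"])
print(release_2)  # {'Invoice': 'a', 'Customer': 'b', 'Discount': 'c'}
```

Because the old assignments are loaded first, a patched assembly keeps the same mangled names as the original release, and only the new identifier receives a new name.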

Watermarking can assist in tracking distribution of the product on a per-executable basis if it is illicitly distributed or obtained.

Several well known obfuscator products were sought and obtained for this review. Only two survived the rigorous tests that each obfuscator was subjected to. The obfuscators that failed generally produced assemblies that would not run or did not provide sufficient features to be of use for deploying commercial obfuscated assemblies.