RetroGuard User's Guide
1. Introduction
Welcome to the RetroGuard User's Guide. This guide explains how to use the RetroGuard bytecode obfuscator to protect your Java code against decompilation and reverse engineering.
About RetroGuard
RetroGuard is a bytecode obfuscator, a tool designed to replace the human-readable identifiers and attributes in your Java classes with meaningless strings, making reverse engineering of the code almost impossible. The result is smaller code sizes for your shipping product and confidence that your valuable Java source code is secure.
RetroGuard is free software, distributed under the GNU Lesser General Public License. Features include:
About obfuscation
Java bytecode (*.class files) contains all of the information, apart from comments, that is in Java source (*.java) files. Using a tool called a decompiler, a hostile competitor can easily reverse engineer your Java classes. To counter this threat, it is possible to obfuscate your class files before distributing your software.
The obfuscation process strips all unnecessary information from the classes. This includes the line number tables, local variable names and source file names used by debuggers. Also, class, interface, field and method identifiers are renamed to render them meaningless. The Java virtual machine, which runs your bytecode, does not care at all about these changes. However, the decompiled version of these classes is extremely difficult to understand, frustrating any attempt to reverse engineer your code. The changes that an obfuscator makes to your Java classes are not reversible - there is no automated way for a reverse engineer to recover the lost information about your code.
An additional benefit to obfuscation is a substantial reduction in the size of your Java classes, due to the removal of unnecessary information and the replacement of large, human-readable identifiers with small machine generated names. This size reduction leads to faster download times for your Java applets.
To determine which classes are to be obfuscated, most obfuscators start at a single entry point (usually the 'main' method of an application, or the 'Applet'-derived class for an applet), and construct a tree of all classes accessible from that point. Unfortunately, this method is quite limiting and works only in simple cases. If your Java code has multiple entry points (several applications, applets, or JavaBeans, or if your code is intended to be used as a Java library) then this method is just not flexible enough. Instead, RetroGuard obfuscates all classes and interfaces within a JAR file.
JAR files are the industry standard mechanism for packaging Java classes for distribution - it is easy to package your classes as a jar using the 'jar' utility distributed with the Java Development Kit from Sun Microsystems. Any number of entry points to the JAR can be specified using a RetroGuard script file. This allows the obfuscation process to be completely flexible.
A technique used by several obfuscators is to introduce corrupt bytecode into the obfuscated Java classes. These corruptions are prohibited by the definitive text, the Java Virtual Machine Specification by Lindholm and Yellin, but do not happen to be noticed by the current virtual machine implementations. The corruptions are sufficient to break some of the simpler decompilers on the market. This class corruption is a very dangerous course to take, however, since virtual machines will certainly enforce the constraints of the Specification much more strictly in the future. At that point, code which uses this 'corrupting obfuscation' will simply fail. Corruption of classes is unacceptable for developers - one cannot afford to ship Java bytecode which only sometimes runs, or fails completely on some virtual machines.
For this reason the RetroGuard obfuscator produces only verifiable bytecode in full compliance with the Java Virtual Machine Specification. Instead of corrupting the bytecode, RetroGuard uses heavy overloading of identifiers (multiple uses of method names within a class) and the introduction of Java source-code keywords as identifiers to make it almost impossible to understand decompiled Java classes.
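To make the idea of identifier overloading concrete, here is a minimal sketch. It is a toy illustration only and is not RetroGuard's actual renaming algorithm: the class `OverloadRenamer`, its methods and the sample method names are all hypothetical. It assigns the shortest available name to each method, reusing the same short name whenever the parameter lists differ, which Java permits as overloading.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of overload-heavy renaming; not RetroGuard's algorithm. */
final class OverloadRenamer {

    /** Converts 0, 1, 2, ... into a, b, ..., z, aa, ab, ... */
    static String shortName(int index) {
        StringBuilder sb = new StringBuilder();
        int n = index;
        do {
            sb.insert(0, (char) ('a' + n % 26));
            n = n / 26 - 1;
        } while (n >= 0);
        return sb.toString();
    }

    /**
     * Maps each "name + parameter list" key to a short name. Methods with
     * different parameter lists may share a name (legal overloading); methods
     * with the same parameter list always receive distinct names.
     */
    static Map<String, String> rename(List<String[]> methods) {
        Map<String, Integer> usedPerParams = new HashMap<>();
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String[] m : methods) {
            String name = m[0];
            String params = m[1];
            // Count how many names were already handed out for this parameter list.
            int index = usedPerParams.merge(params, 1, Integer::sum) - 1;
            mapping.put(name + params, shortName(index));
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Hypothetical methods of a single class.
        Map<String, String> mapping = rename(Arrays.asList(
                new String[] {"load", "(String)"},
                new String[] {"save", "(String)"},
                new String[] {"size", "()"},
                new String[] {"clear", "()"}));
        mapping.forEach((original, renamed) ->
                System.out.println(original + " -> " + renamed));
    }
}
```

In this sketch two unrelated methods can both end up being called `a`, so a decompiled listing gives the reader very little to distinguish them by.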
The RetroGuard obfuscator is simple to install. Just ensure that the archive 'retroguard.jar' is listed in your 'CLASSPATH' environment variable. Please refer to the JDK documentation for your platform if you are unsure how to set the 'CLASSPATH' environment variable. Before obfuscation, you must package your classes and any associated
resource files in a Jar archive. For example, to package an entire directory
tree of classes into a Jar, use: '
One way to view obfuscation is as a phase in your build process where the interface to your Jar archive is specified and only that interface is left accessible to the outside world. This interface is the list of classes, interfaces, methods and fields that you provide in the RetroGuard script file.
Command line
The command for running RetroGuard has the form,
where:
[...] ' above
is optional. If INPUT-JAR cannot be read, or if
OUTPUT-JAR or LOGFILE cannot be
written, execution is terminated with a warning message.
Simple script examples
A full description of the scripting language is given below, but the most common examples of script entries are given here:
java RGgui '
assuming, as always, that 'retroguard.jar' is available in your CLASSPATH. A
full description of this script manager is given below.
Access to libraries
If your Java classes require access to other libraries when run, these libraries must also be available on the CLASSPATH when RetroGuard or RGgui is executed. For example, if you use the command
to run your application class 'MyApp.class' which lives in
the archive 'myclasses.jar' and which depends on classes
in the library 'thirdparty.jar', then a suitable command
for obfuscation would be
In this case, 'script.rgs' should be a text file
containing the line '.class MyApp public method', so that
your application class is still accessible after obfuscation. Similarly, a
suitable command to run the script management tool would be
(The specific form of the command line argument '-classpath' has
been given for the Windows platform. If you are running the JDK on another
platform such as Solaris or Linux, the form of this argument will differ -
please refer to the platform specific documentation that came with your copy
of the JDK).
Log file contents
A log file is written by RetroGuard during obfuscation. An explanation of the various sections which can occur in this file is given below.
Header:
The log file header lists the current version of RetroGuard, the date and time of obfuscation, and the names used for the input jar file, the output jar file and the script file.
Pass one phase:
During the first pass over the Jar file to be obfuscated, each class is analysed and any use of methods which reference identifiers by a 'String' is identified. After this pass, a list of these 'identifier-as-String' method calls (if any are found) is written to the log file. The problems that can be associated with these methods, and the solution, are described in the following section. The memory usage and current heap size are also written to the log file at this stage.
Renaming phase:
All identifiers in the analyzed Java classes which are to be obfuscated are then renamed. Some identifiers cannot be changed, for one of the following reasons:
Pass two phase:
A second pass is made over the input Jar file, during which debugging information is removed from each class and identifiers are replaced by their obfuscated version. The output Jar file is generated to contain the obfuscated classes, and a fresh manifest file is created for this Jar. Unless an error occurs during this pass, no output is sent to the log file.
Run-time problems
There are certain methods in the JDK classes 'java.lang.Class' and 'java.lang.ClassLoader' which refer to classes, methods or fields using a 'String' name. If these JDK methods are used to refer to identifiers inside your JAR, your code may behave incorrectly after obfuscation because the 'String' name may have been changed. Unfortunately, it is not possible for an obfuscator to solve this problem automatically (without in some fashion storing the entire name mapping table in your Jar archive), because the 'String' name can be constructed or changed in a way that is only known at run-time. If any of the following method calls are detected during obfuscation, a warning is posted to the log file.
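The run-time problem described above can be sketched with two JDK methods of this kind, 'Class.forName' and 'Class.getMethod'. The example is only an illustration: the class `StringLookupDemo` and the name 'com.example.OriginalName' are hypothetical, and the latter stands in for an identifier that existed before obfuscation but was renamed. A 'String' lookup succeeds only while the text of the name still matches a real identifier.

```java
/** Illustrates lookups by String name, which renaming can silently break. */
final class StringLookupDemo {

    /** Returns true if a class with exactly this name can be loaded. */
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Returns true if the type has a public no-argument method with this name. */
    static boolean hasMethod(Class<?> type, String methodName) {
        try {
            type.getMethod(methodName);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A name that still matches a real class resolves normally.
        System.out.println(canLoad("java.lang.String"));

        // Hypothetical name: if the class was renamed during obfuscation,
        // the unchanged String no longer matches anything.
        System.out.println(canLoad("com.example.OriginalName"));

        // The same applies to methods looked up by String.
        System.out.println(hasMethod(String.class, "length"));
        System.out.println(hasMethod(String.class, "originalMethodName"));
    }
}
```

Because the 'String' may be built at run-time (for example, read from a configuration file), a failure of this kind only shows up when the code path is actually executed.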
Also, by default, all unnecessary class attributes are trimmed from each class during obfuscation. In the unlikely case that any of these attributes are to be retained, this can also be done using the script file.
Below we give a BNF grammar for the script language, followed by some annotated examples of script entries.
BNF grammar
Notes:
Example script entries
# All text in a line following '#' is treated as a comment, and is ignored.
Resource files
The common idioms for accessing resource files from a Java class are:
RetroGuard's behavior is to update all components of a resource's path
with obfuscated versions. The possibility exists that classes lose track of
their resources if those resources have been referenced through absolute paths.
For this reason, we encourage the use of the relative path methods of
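The difference between relative and absolute resource paths can be sketched as follows. The helper mirrors the name resolution rule documented for 'Class.getResource' (a leading '/' makes the name absolute, otherwise the class's package is prepended); the class `ResourceNames` and the package and file names are hypothetical, and the renamed package 'a.b' is only an assumed outcome of obfuscation.

```java
/** Sketch of how a resource name is resolved against a class's package. */
final class ResourceNames {

    /**
     * Applies the resolution rule documented for Class.getResource:
     * a name starting with '/' is absolute, anything else is taken
     * relative to the package of the class doing the lookup.
     */
    static String absoluteName(String packageName, String name) {
        if (name.startsWith("/")) {
            return name.substring(1);
        }
        if (packageName.isEmpty()) {
            return name;
        }
        return packageName.replace('.', '/') + "/" + name;
    }

    public static void main(String[] args) {
        String before = "com.example.app"; // hypothetical original package
        String after = "a.b";              // hypothetical obfuscated package

        // A relative name follows the class into its renamed package.
        System.out.println(absoluteName(before, "icon.gif"));
        System.out.println(absoluteName(after, "icon.gif"));

        // An absolute name is a fixed String and keeps pointing at the old path.
        System.out.println(absoluteName(after, "/com/example/app/icon.gif"));
    }
}
```

With a relative name the lookup moves together with the renamed class, whereas the hard-coded absolute name still refers to the pre-obfuscation location.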
Manifest file
When a Jar file is created, a manifest of its contents is generated by default and stored in the text file 'META-INF/MANIFEST.MF' in the
Jar. In this manifest there is a section for each file in the Jar containing
the file name, some message digests (which are like checksums for verifying
the file contents), and possibly some additional information such as the line
'Java-Bean: True' if the file is a JavaBean class. Obfuscation of classes
causes the file names and the message digests in the manifest to become
invalid.
RetroGuard generates a new manifest for the obfuscated Jar with the obfuscated
class and resource names, with fresh MD5 and SHA-1 message digests, and with
the additional information, such as 'Java-Bean' specifiers, copied over from
the original manifest file.
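As a rough illustration of why the digests must be regenerated, the sketch below computes MD5 and SHA-1 digests of some bytes and Base64-encodes them, which is the general form in which digests appear in a manifest. It is not RetroGuard's own code: the class `ManifestDigest` and the sample content are hypothetical, and only standard JDK classes are used.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Computes Base64-encoded message digests of file contents. */
final class ManifestDigest {

    /** Returns the Base64 text of the digest of the given bytes. */
    static String digestBase64(String algorithm, byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            return Base64.getEncoder().encodeToString(md.digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for a class file before and after obfuscation.
        byte[] original = "original class bytes".getBytes(StandardCharsets.UTF_8);
        byte[] obfuscated = "obfuscated class bytes".getBytes(StandardCharsets.UTF_8);

        for (String algorithm : new String[] {"MD5", "SHA-1"}) {
            System.out.println(algorithm + " before: " + digestBase64(algorithm, original));
            System.out.println(algorithm + " after:  " + digestBase64(algorithm, obfuscated));
        }
    }
}
```

Any change to the bytes of a class changes its digests, so the values recorded in the original manifest can no longer match the obfuscated files.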
Signature files
Digital signatures in the Jar file cannot be updated by RetroGuard automatically. If a Jar is to be digitally signed, this should be done after obfuscation is complete. Any signature files ('META-INF/*.SF') found in the Jar prior to obfuscation
are discarded.
For information about digital signatures for Jar files, see the information on the JavaSoft website.
This visual tool scans a JAR automatically for applications, applets and JavaBeans (the most common entry points to a JAR) as well as allowing reservation of any other class, interface, method and field names. The tool takes the form of a multi-panel wizard, which leads the user through the selection of identifiers and attributes which are to be preserved during obfuscation. The following sections describe the script management tool, panel by panel.
Jar file selection
When the script management tool is run (using
' If a new script is being generated you can now hit 'Next' to proceed. If you
are editing an existing script file, turn on the checkbox in line '2.' and
enter or browse to the name of the existing script, and then hit 'Next' to
proceed.
Application, applet, bean selection
The Jar file is now analysed. Depending on the number of classes in the Jar this may take a few moments. Once analysis is complete, the class names of any applications, applets and JavaBeans are displayed in the lists shown above. If you are generating a fresh script file, these application, applet and Bean classes are automatically selected for preservation. If you are editing an existing script, the settings that were read from the input script are displayed instead.
To change an obfuscation setting, just select the class name in the list and select or de-select the appropriate checkbox. Once you are satisfied with the settings, hit 'Next' to proceed. Use the
'Back' button on any panel to come back and adjust settings, or even start
over with a different Jar file, at any time.
Class preservation
In this panel, the upper list is a tree control which gives access to all the packages, classes and interfaces in the Jar file. To open or close a package double-click on the '[package] name' line. If a class has inner classes within it, double-click on the class name to see the inner classes.
When you select a class or interface name in the tree control, the current obfuscation settings are displayed using the checkboxes beneath. Select or de-select the checkboxes to change obfuscation of that class and its methods and fields. If you select a '[package]' line in the tree control, the obfuscation settings for all classes and interfaces in that package can be set at once (this will be output as a wildcard line in the script file). Once you are satisfied, hit 'Next' to proceed.
Method and field preservation
In this panel, there are three sections: the upper list is a tree control which is synchronized to the packages / class / interface tree control of the previous panel, and which is used in just the same way. The lower two lists show methods and fields declared in the class or interface selected. Select a method or field and click the 'Preserved?' checkbox to change its obfuscation settings.
The 'Show all?' checkboxes are checked by default: this means that all methods and fields are shown. If 'Show all?' is unchecked, then a filter is applied and only methods and fields which are of immediate interest are shown. Specifically, 'public', 'protected' and default-access (package) methods and fields which have not been selected on the previous panel are displayed in these lists, with one exception: the special 'private' methods required by Java's 'Serializable' interface are also displayed. Unchecking the 'Show all?' checkbox can help to focus attention on methods and fields you may wish to preserve at this stage. Once you are satisfied, hit 'Next' to proceed.
Attribute preservation and script file generation
The checkboxes in the upper part of this panel allow certain class-file attributes to be preserved during obfuscation. These settings are applied globally across all classes and interfaces in the Jar file. It is unlikely you will want to adjust these settings; however, if you wish to use a debugger like 'jdb' on the obfuscated code, turn on the checkboxes to preserve 'SourceFile' and 'LineNumberTable'.
Type the name for the generated script file in the text field beneath, or use the 'Browse' button to select the name using a file dialog. Finally, if you are satisfied with the obfuscation settings shown in all of
the panels, hitting 'Finish' will construct the script file and exit.
In many cases, it will be enough to let the script generator's automatic analysis feature find your applications, applets and JavaBeans and then to skim through the other panels, making little or no change to the settings.
Incremental obfuscation
Normally, when obfuscation is applied to a Jar file, the obfuscated names bear no particular relation to the names in any previous or later version of your software. Even when the changes to your code are minimal there can be substantial changes to the name mappings, because of the process that RetroGuard uses to optimize the obfuscated name-spaces. The generation of a "patch" Jar which contains only the modified classes has not been possible.
This restriction has been overcome in RetroGuard v1.1 by modifying the format of the log-file, so that it can be used in place of the script file during obfuscation of a later version. The mapping entries in the log then act to restrict this later obfuscation. The classes containing your bug-fix or feature addition can then be extracted and shipped as a patch file.
The general rule is, when creating a patch between an old version of the software and a new version, use the output log-file from the old version's obfuscation as the input script file to the new version's obfuscation. This procedure will be demonstrated below in an example.
Patch utility
The extraction of updated classes and resource files from the obfuscated Jar and the creation of the patch Jar could be done by hand. Starting with a list of the classes and resources to be included in the patch, you would convert these names to their obfuscated form by referring to the log-file, then extract the obfuscated files from the full Jar and repack them into a small patch Jar.
This procedure is tiresome and error-prone, particularly on the Windows
operating systems due to the limitations of the file system (Java identifiers
are case-sensitive, but Windows file systems do not distinguish case in file
names, which can lead to one class overwriting another during the un-jar/re-jar process).
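The overwriting hazard mentioned above can be checked for before unpacking. The following is a minimal sketch, assuming you already have the list of entry names from the obfuscated Jar; the class `CaseCollisions` and the sample entry names are hypothetical. It reports pairs of names that differ only in letter case and would therefore map to the same file on a case-insensitive file system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Finds archive entry names that differ only by letter case. */
final class CaseCollisions {

    static List<String> findCollisions(List<String> entryNames) {
        Map<String, String> firstSeen = new HashMap<>();
        List<String> collisions = new ArrayList<>();
        for (String name : entryNames) {
            // Names are compared in lower case, as a case-insensitive file system would.
            String previous = firstSeen.putIfAbsent(name.toLowerCase(Locale.ROOT), name);
            if (previous != null && !previous.equals(name)) {
                collisions.add(previous + " <-> " + name);
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Hypothetical obfuscated entry names.
        List<String> names = Arrays.asList("a/A.class", "a/a.class", "a/b.class");
        findCollisions(names).forEach(System.out::println);
    }
}
```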
A utility called The command for running
where:
WHOLE-JAR , LOGFILE or
LISTFILE cannot be read, or if
PATCH-JAR cannot be written, execution is terminated
with a warning message.
An example
The generation of obfuscated patch files is easiest to understand through
an example. We consider the case where a company ships an initial
release (version 1.0) of its product, called "JTool", to its customers
through a website download.
To discourage reverse engineering of the product, the company obfuscates
their Jar file, called
The company then posts the obfuscated Jar jtool-1.0.jar
to its website, where it is downloaded by customers. The log-file
jtool-1.0.log contains all of the obfuscation mappings,
and so is archived carefully and securely by the company. This log-file will
be vital during the creation of future patches to the software.
Several weeks later, a customer reports a problem with the JTool product. It is found that fixing the problem requires only a small change to one class and to an associated resource file. While it would be possible to recompile and obfuscate the fixed software, and repost the whole Jar to the website, this is not an attractive option for JTool's customers because the whole Jar is several Mbytes in size and takes some time to download. A better alternative is to create a small patch file containing only the
modified class and resource. To do this, the whole JTool application is
first compiled and packaged into a Jar
Notice that instead of using the script file jtool-1.0.rgs
as input to this obfuscation, we have used the log-file from the
original (v1.0) obfuscation process. This log-file was generated in an
extended form of the RGS script format, and contains the original entry from
jtool-1.0.rgs as well as all the name mappings from the
first obfuscation process. Obfuscating in this way means that the
v1.1 code can be binary compatible with the originally shipped
v1.0 code.
All that remains is to extract the modified class and resource from the
The command to create the patch is then:
This utility works in two stages: first the log-file
jtool-1.1.log is used to convert the entries in
jtool-patch-1.0-1.1.txt to obfuscated form;
these converted entries are then extracted from
jtool-1.1.jar and used to create the file
jtool-patch-1.0-1.1.jar.
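The two stages described above can be sketched in a few lines. This is an illustration under stated assumptions and not the actual patch utility: the class `PatchSketch` is hypothetical, the name mapping is passed in as a plain 'Map' rather than parsed from a log-file (whose format is not reproduced here), and all entry names are invented. Archives are handled in memory with the JDK's 'java.util.zip' classes; 'InputStream.transferTo' requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Illustrative two-stage patch extraction; not the actual patch utility. */
final class PatchSketch {

    /** Stage 1: translate original entry names into their obfuscated names. */
    static Set<String> toObfuscated(List<String> originalNames, Map<String, String> mapping) {
        Set<String> wanted = new LinkedHashSet<>();
        for (String name : originalNames) {
            // Assumption: a name without a mapping was preserved and is unchanged.
            wanted.add(mapping.getOrDefault(name, name));
        }
        return wanted;
    }

    /** Stage 2: copy only the wanted entries from the whole archive into a new one. */
    static byte[] extract(byte[] wholeArchive, Set<String> wanted) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(wholeArchive));
             ZipOutputStream patch = new ZipOutputStream(out)) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                if (wanted.contains(e.getName())) {
                    patch.putNextEntry(new ZipEntry(e.getName()));
                    in.transferTo(patch); // copies the current entry only
                    patch.closeEntry();
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    /** Demo helper: builds an in-memory archive from name/content pairs. */
    static byte[] archiveOf(Map<String, String> files) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, String> f : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(f.getKey()));
                zip.write(f.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    /** Demo helper: lists entry names in archive order. */
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical obfuscated archive and name mapping.
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a/a.class", "unchanged");
        files.put("a/b.class", "fixed class");
        files.put("a/c.gif", "fixed resource");
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("com/example/Fixed.class", "a/b.class");
        mapping.put("com/example/icon.gif", "a/c.gif");

        Set<String> wanted = toObfuscated(
                Arrays.asList("com/example/Fixed.class", "com/example/icon.gif"), mapping);
        System.out.println(entryNames(extract(archiveOf(files), wanted)));
    }
}
```

Working on the archive streams directly, rather than unpacking to disk and repacking, also avoids the case-sensitivity problem of the manual procedure.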
Once again, the log-file
they could now take advantage of the bug-fix by using the patch file:
(Note: a colon ':' is used to separate classpath entries in the example above.
This is correct on Unix systems - on Windows systems a semi-colon ';' should
be used instead.)
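If a classpath is assembled programmatically, the platform difference noted above can be avoided by using 'java.io.File.pathSeparator', which is ':' on Unix systems and ';' on Windows. The sketch below is illustrative: the class `ClasspathJoin` is hypothetical, the Jar names are taken from the example, and placing the patch first is an assumption based on the general rule that the classpath is searched in order, so classes in the patch are found before the older versions.

```java
import java.io.File;

/** Builds a classpath string with the separator of the current platform. */
final class ClasspathJoin {

    static String join(String... entries) {
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        // Patch Jar first, so its classes take precedence over the original Jar.
        System.out.println(join("jtool-patch-1.0-1.1.jar", "jtool-1.0.jar"));
    }
}
```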
Advanced issues
In this section we touch on some advanced topics on incremental obfuscation and patching, using RetroGuard v1.1 and RGpatch.
Patch sequences: One method of providing patches is to include in each patch only those changes which have occurred since the last patch was issued. The sequence of major release and patch Jar names might be:
Using this method means that each patch is as small as possible. However,
the user's classpath can become polluted by a long series of patch files
which must be given in the correct order.
An alternative technique is to have each patch contain all changes since the last major release. The user has only the most recent patch file in their classpath, but each patch becomes larger than the last. A typical sequence of major release and patch Jar names might then be:
Whichever method suits your project, it is clear that a very careful accounting must be kept of code changes, obfuscation log-files and shipped versions of your software. A robust and flexible version control system such as CVS (available as free software from Cyclic Software) is recommended for this purpose.
In addition, it is recommended that a comprehensive suite of automated
tests be developed for your product. This is always desirable, but is
particularly helpful in testing patches. The suite can be used to exercise
the unobfuscated, the obfuscated, and the patched versions of the
product prior to shipping. Clearly, results should be identical in each
case.
Additional entry points: A patch can be used to introduce one or more
new entry points into your product. This will, as usual, require the listing
of these entry points in the RGS file used during obfuscation. Since the
log-file from the prior version is used as the RGS script file during
incremental obfuscation, the additional The existing entries in the log-file should not, under
any circumstances, be edited or removed: to do so would cause corruption of
the generated patch file. In addition, any new entry points (methods which
are to be left unobfuscated) must have been introduced to the code since the
previous version. It is not possible to change an existing method from being
obfuscated in one version to being unobfuscated in the next.
Inner classes: Since inner classes have privileged access to
their outer classes, a patch containing a class must also contain all of
its inner classes. The
Name conversion utility
It can be useful to convert a list of original class names to their
obfuscated form, when processing an obfuscated Jar-archive after using
RetroGuard. The utility The command for running
where:
LOGFILE cannot be read, execution is terminated
with a warning message.