RetroGuard User's Guide






Contents

  1. Introduction
  2. Installing RetroGuard
  3. Running the obfuscator
  4. Scripting language
  5. Handling of non-'.class' files
  6. Script generation wizard
  7. Creating patch files


1. Introduction

Welcome to the RetroGuard User's Guide. This guide explains how to use the RetroGuard bytecode obfuscator to protect your Java code against decompilation and reverse engineering.

About RetroGuard

RetroGuard is a bytecode obfuscator, a tool designed to replace the human-readable identifiers and attributes in your Java classes with meaningless strings, making reverse engineering of the code almost impossible. The result is smaller code sizes for your shipping product and confidence that your valuable Java source code is secure.

RetroGuard is free software, distributed under the GNU Lesser General Public License (LGPL).

Features include:

  • reduced size of Java bytecode (savings of up to 50% are possible, 20-30% is typical) leading to faster download times for your applets
  • designed to fit seamlessly into the automated build process for your Java projects
  • allows full customization of the obfuscation process
  • multiple entry points into your Java code are supported - allow access to as many applications, applets, JavaBeans and library interfaces as you require
  • acts on Jar archives, the industry standard for packaging Java classes and their associated resources
  • obfuscation is controlled by a flexible scripting language
  • a graphical 'wizard' is provided for simple management of script files
  • uses massive overloading of method and field names for even higher security
  • generates only verifiable Java bytecode in full compliance with the Java Virtual Machine Specification
  • updates the manifest file in the Jar archive, using obfuscated names and automatically generated MD5 and SHA-1 message digests

About obfuscation

Java bytecode (*.class files) contains all of the information, apart from comments, that is in Java source (*.java) files. Using a tool called a decompiler, a hostile competitor can easily reverse engineer your Java classes. To counter this threat, you can obfuscate your class files before distributing your software.

The obfuscation process strips all unnecessary information from the classes. This includes the line number tables, local variable names and source file names used by debuggers. Also, class, interface, field and method identifiers are renamed to render them meaningless. The Java virtual machine, which runs your bytecode, does not care at all about these changes. However, the decompiled version of these classes is extremely difficult to understand, frustrating any attempt to reverse engineer your code. The changes that an obfuscator makes to your Java classes are not reversible - there is no automated way for a reverse engineer to recover the lost information about your code.

An additional benefit to obfuscation is a substantial reduction in the size of your Java classes, due to the removal of unnecessary information and the replacement of large, human-readable identifiers with small machine generated names. This size reduction leads to faster download times for your Java applets.

To determine which classes are to be obfuscated, most obfuscators start at a single entry point (usually the 'main' method of an application, or the 'Applet'-derived class for an applet), and construct a tree of all classes accessible from that point. Unfortunately, this method is quite limiting and works only in simple cases. If your Java code has multiple entry points (several applications, applets, or JavaBeans, or if your code is intended to be used as a Java library) then this method is just not flexible enough.

Instead, RetroGuard obfuscates all classes and interfaces within a JAR file. JAR files are the industry standard mechanism for packaging Java classes for distribution - it is easy to package your classes as a jar using the 'jar' utility distributed with the Java Development Kit from Sun Microsystems. Any number of entry points to the JAR can be specified using a RetroGuard script file. This allows the obfuscation process to be completely flexible.

A technique used by several obfuscators is to introduce corrupt bytecode into the obfuscated Java classes. These corruptions are prohibited by the definitive text, the Java Virtual Machine Specification by Yellin and Lindholm, but do not happen to be noticed by the current virtual machine implementations. The corruptions are sufficient to break some of the simpler decompilers on the market. This class corruption is a very dangerous course to take, however, since virtual machines will certainly enforce the constraints of the Specification much more strictly in the future. At that point, code which uses this 'corrupting obfuscation' will simply fail.

Corruption of classes is unacceptable for developers - one cannot afford to ship Java bytecode which only sometimes runs, or fails completely on some virtual machines. For this reason the RetroGuard obfuscator produces only verifiable bytecode in full compliance with the Java Virtual Machine Specification. Instead of corrupting the bytecode, RetroGuard uses heavy overloading of identifiers (multiple uses of method names within a class) and the introduction of Java source-code keywords as identifiers to make it almost impossible to understand decompiled Java classes.
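
To illustrate what heavy overloading of identifiers looks like, the sketch below shows a hypothetical class after renaming. It is an illustration of the idea only, not actual RetroGuard output: the class name, the member names and the "was" comments are all invented for this example.

```java
// Hypothetical result of identifier overloading; NOT actual RetroGuard output.
public class OverloadSketch {
    // A field and several methods can all share the name 'a', because the
    // virtual machine tells members apart by name plus type descriptor.
    private int a = 10;

    public int a() { return a; }                  // was, say, getBalance()
    public int a(int x) { a += x; return a; }     // was, say, deposit(int)
    public String a(String s) { return s + a; }   // was, say, describe(String)
}
```

The class still behaves exactly as before, but a decompiled listing in which every member is called 'a' gives a reader very little to work with.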




2. Installing RetroGuard

RetroGuard is a Java application compatible with JDK1.1 and JDK1.2, so a Java virtual machine is required to run the tool. We recommend that you obtain the latest copy of JDK1.1 or JDK1.2 from Sun Microsystems.

The RetroGuard obfuscator is simple to install. Just ensure that the archive 'retroguard.jar' is listed in your 'CLASSPATH' environment variable. Please refer to the JDK documentation for your platform if you are unsure how to set the 'CLASSPATH' environment variable.

Before obfuscation, you must package your classes and any associated resource files in a Jar archive. For example, to package an entire directory tree of classes into a Jar, use: 'jar cf MyJar.jar *'. See the JDK documentation for full details about creating and using Jar archives.




3. Running the obfuscator

RetroGuard is designed to be integrated into a build procedure, so that obfuscation can become a consistently applied and automatic part of your regular build and QA cycle. For this reason, it is a command line tool with its options being set through a script file - this enforces stability from build to build.

One way to view obfuscation is as a phase in your build process where the interface to your Jar archive is specified and only that interface is left accessible to the outside world. This interface is the list of classes, interfaces, methods and fields that you provide in the RetroGuard script file.

Command line

The command for running RetroGuard has the form,
java RetroGuard [INPUT-JAR [OUTPUT-JAR [SCRIPT [LOGFILE]]]]
where:
  • INPUT-JAR is the filename for your original, unobfuscated Jar file (defaults to 'in.jar').
  • OUTPUT-JAR is the filename that will be used for the obfuscated Jar file (defaults to 'out.jar').
  • SCRIPT is the filename of the RetroGuard script file, specifying identifiers which are to be left unobfuscated (defaults to 'script.rgs').
  • LOGFILE is the filename that will be used for the text log of this obfuscation run (defaults to 'retroguard.log').
Each group of arguments in square brackets '[...]' above is optional. If INPUT-JAR cannot be read, or if OUTPUT-JAR or LOGFILE cannot be written, execution is terminated with a warning message.
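
The way the nested optional arguments fall back to their defaults can be sketched as follows. This is not RetroGuard's own argument handling, which is not shown in this guide: the class 'ArgDefaults', the method 'resolve' and the rejection of extra arguments are assumptions made for illustration. Only the four default file names come from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: mirrors the documented defaults, not RetroGuard's code.
public class ArgDefaults {
    private static final String[] NAMES =
        {"INPUT-JAR", "OUTPUT-JAR", "SCRIPT", "LOGFILE"};
    private static final String[] DEFAULTS =
        {"in.jar", "out.jar", "script.rgs", "retroguard.log"};

    // Arguments are positional: an argument can only be supplied if all
    // of the arguments before it are supplied too.
    public static Map<String, String> resolve(String... args) {
        if (args.length > NAMES.length) {
            // Assumption: extra arguments are treated as an error here.
            throw new IllegalArgumentException("too many arguments");
        }
        Map<String, String> resolved = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            resolved.put(NAMES[i], i < args.length ? args[i] : DEFAULTS[i]);
        }
        return resolved;
    }
}
```

For example, 'ArgDefaults.resolve("myclasses.jar", "myclasses-obf.jar")' keeps the two given Jar names and falls back to 'script.rgs' and 'retroguard.log' for the rest.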

Simple script examples

A full description of the scripting language is given below but the most common examples of script entries are given here:
  • To allow access to an application class 'MyApp.class' make an entry in the script, '.class MyApp public method'. This preserves the class name and its public methods, including the 'main' method.
  • To allow access to an applet class 'MyApplet.class' make an entry in the script file '.class MyApplet' which preserves the class name.
  • To allow access to a JavaBean class 'MyBean.class' make an entry in the script file '.class MyBean protected'. This preserves the class name and all public, protected and default accessible methods and fields in that class.
To simplify the generation of script files, a graphical script management tool is provided. It is invoked using the command 'java RGgui' assuming, as always, that 'retroguard.jar' is available in your CLASSPATH. A full description of this script manager is given below.

Access to libraries

If your Java classes require access to other libraries when run, these libraries must also be available on the CLASSPATH when RetroGuard or RGgui are executed. For example, if you use the command
java -classpath myclasses.jar;thirdparty.jar;d:\jdk1.1\lib\classes.zip MyApp
to run your application class 'MyApp.class' which lives in the archive 'myclasses.jar' and which depends on classes in the library 'thirdparty.jar', then a suitable command for obfuscation would be
java -classpath retroguard.jar;thirdparty.jar;d:\jdk1.1\lib\classes.zip RetroGuard myclasses.jar myclasses-obf.jar script.rgs
In this case, 'script.rgs' should be a text file containing the line '.class MyApp public method', so that your application class is still accessible after obfuscation. Similarly, a suitable command to run the script management tool would be
java -classpath retroguard.jar;thirdparty.jar;d:\jdk1.1\lib\classes.zip RGgui
(The specific form of the command line argument '-classpath' has been given for the Windows platform. If you are running JDK on another platform such as Solaris or Linux, the form of this argument will differ - please refer to the platform specific documentation that came with your copy of the JDK).
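
If you assemble the '-classpath' value from a Java-based build tool, the platform difference can be avoided by using the standard constant 'java.io.File.pathSeparator' (';' on Windows, ':' on Unix systems). The sketch below is a minimal illustration; the class and method names are invented for this example.

```java
import java.io.File;

// Hypothetical helper for building a platform-correct classpath string.
public class ClasspathSketch {
    // Joins the entries with the separator of the platform the JVM runs on.
    public static String join(String... entries) {
        return String.join(File.pathSeparator, entries);
    }
}
```

The result of 'ClasspathSketch.join("retroguard.jar", "thirdparty.jar")' can then be passed as the argument following '-classpath'.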

Log file contents

A log file is written by RetroGuard during obfuscation. An explanation of the various sections which can occur in this file is given below.
Header:
The log file header lists the current version of RetroGuard, the date and time of obfuscation, and the names used for the input jar file, the output jar file and the script file.
Pass one phase:
During the first pass over the Jar file to be obfuscated, each class is analysed and any use of methods which reference identifiers by a 'String' is identified. After this pass, a list of these 'identifier-as-String' method calls (if any are found) is written to the log file. The problems that can be associated with these methods, and the solution, are described in the following section. The memory usage and current heap size are also written to the log file at this stage.
Renaming phase:
All identifiers in the analyzed Java classes which are to be obfuscated are then renamed. Some identifiers cannot be changed, for one of the following reasons:
  • The identifier is listed for preservation in the RetroGuard script file.
  • Due to inheritance constraints, some identifiers cannot be modified. For example, if a class implements the interface 'java.util.Enumeration', the methods inherited from that interface ('hasMoreElements' and 'nextElement') cannot be changed.
  • Synthetic methods and fields, which are generated automatically by Java compilers as a way of implementing the JDK1.1/1.2 inner class features, are not obfuscated. Since these names are already short, meaningless, machine-generated strings this is no disadvantage.
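
The inheritance constraint in the list above can be seen in a small example. The class 'CountDown' and its method 'remaining' are hypothetical; the point is only that JDK code calls 'hasMoreElements' and 'nextElement' by those exact names, so they must survive obfuscation, while a method that belongs to the class alone may be renamed.

```java
import java.util.Enumeration;
import java.util.NoSuchElementException;

// Hypothetical class used to illustrate inheritance constraints.
public class CountDown implements Enumeration<Integer> {
    private int next;

    public CountDown(int start) { next = start; }

    // Inherited from java.util.Enumeration: these names cannot change,
    // because callers outside the Jar use the interface's method names.
    @Override
    public boolean hasMoreElements() { return next > 0; }

    @Override
    public Integer nextElement() {
        if (next <= 0) throw new NoSuchElementException();
        return next--;
    }

    // Not inherited and not listed in any script: free to be renamed.
    int remaining() { return next; }
}
```
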
After this renaming three tables are written to the log file:
  • A summary of the use of remapped identifiers.
  • A listing of identifiers which were explicitly preserved from obfuscation by entries in the script file.
  • A listing of obfuscated identifiers and identifiers which were left unchanged due to inheritance constraints.
Pass two phase:
A second pass is made over the input Jar file, during which debugging information is removed from each class and identifiers are replaced by their obfuscated version. The output Jar file is generated to contain the obfuscated classes, and a fresh manifest file is created for this Jar. Unless an error occurs during this pass, no output is sent to the log file.

Run-time problems

There are certain methods in the JDK classes 'java.lang.Class' and 'java.lang.ClassLoader' which refer to classes, methods or fields using a 'String' name. If these JDK methods are used to refer to identifiers inside your JAR, your code may behave incorrectly after obfuscation because the 'String' name may have been changed. Unfortunately, it is not possible for an obfuscator to solve this problem automatically (without in some fashion storing the entire name mapping table in your Jar archive), because the 'String' name can be constructed or changed in a way that is only known at run-time. If any of the following method calls are detected during obfuscation, a warning is posted to the log file.
  • In class java.lang.Class the methods:
    • Class forName(String className);
    • Field getDeclaredField(String name);
    • Field getField(String name);
    • Method getDeclaredMethod(String name, Class[] parameterTypes);
    • Method getMethod(String name, Class[] parameterTypes);
  • In class java.lang.ClassLoader the methods:
    • Class defineClass(String name, byte[] data, int offset, int length);
    • Class findLoadedClass(String name);
    • Class loadClass(String name);
    • Class loadClass(String name, boolean resolve);
If such a warning is found after obfuscation, you should examine your source code to determine if the methods are intended to act only on classes, methods and fields outside of the JAR. If so, they will cause no problems. If, however, the methods can refer to classes, methods or fields within your Jar file, run-time problems can arise because these identifiers may have been obfuscated. In this case, the solution is to reserve these identifiers using the script file, so that obfuscation does not change them.
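
The following sketch shows why the problem cannot be solved automatically: the class name is assembled at run-time, so no obfuscator can know in advance which identifier it will refer to. The helper class 'ForNameSketch' is hypothetical.

```java
// Hypothetical helper showing an 'identifier-as-String' lookup.
public class ForNameSketch {
    // Returns the named class, or null if no class of that name exists.
    public static Class<?> load(String packageName, String simpleName) {
        // The full name only exists once the program is running.
        String name = packageName + "." + simpleName;
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}
```

A call such as 'ForNameSketch.load("java.util", "ArrayList")' refers to a class outside the Jar and is unaffected by obfuscation. A call that names a class inside your Jar, for example 'COM.widgetco.MyClass', would return null after obfuscation unless the script contains the entry '.class COM/widgetco/MyClass'.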




4. Scripting language

RetroGuard uses a simple scripting language to specify which identifiers are to be left unchanged by obfuscation. Certain names have to be reserved from obfuscation because they are intended to be accessed from outside. These reserved names can include the 'main' method and class of applications, any classes derived from 'Applet', as well as JavaBean classes. For the writers of Java libraries this list of reserved names can become quite extensive due to the large number of entry points into the code. The RetroGuard script file is a listing of these reserved names.

Also, by default, all unnecessary class attributes are trimmed from each class during obfuscation. In the unlikely case that any of these attributes are to be retained, this can also be done using the script file.

Below we give a BNF grammar for the script language, followed by some annotated examples of script entries.

BNF grammar

  1. <script> ::= <script_line>*
  2. <script_line> ::= [<comment> | <exclusion>] "\n"
  3. <exclusion> ::= (<attribute_exc> | <class_exc> | <method_exc> | <field_exc>) [<comment>]
  4. <comment> ::= "#" <character>*
  5. <character> ::= "\040"-"\176"
  6. <attribute_exc> ::= ".attribute" ("SourceFile" | "LocalVariableTable" | "LineNumberTable")
  7. <class_exc> ::= ".class" <class_spec_wc> [("public" | "protected") ["method" | "field"]]
  8. <method_exc> ::= ".method" <method_field_spec> <method_descriptor>
  9. <field_exc> ::= ".field" <method_field_spec> <java_type>
  10. <method_field_spec> ::= <class_spec>"/"<java_identifier>
  11. <method_descriptor> ::= "("<java_type>*")"<java_type>
  12. <java_type> ::= ("["<java_type>) | "B" | "C" | "D" | "F" | "I" | "J" | "S" | "V" | "Z" | ("L"<class_spec>";")
  13. <class_spec> ::= ((<java_identifier>"/")*)((<java_identifier>"$")*)<java_identifier>
  14. <class_spec_wc> ::= <class_spec> | ((<java_identifier>"/")*)"*"
  15. <java_identifier> ::= <java_letter><java_letter_or_digit>*
  16. <java_letter> is a Unicode character for which java.lang.Character.isJavaIdentifierStart() is true
  17. <java_letter_or_digit> is a Unicode character for which java.lang.Character.isJavaIdentifierPart() is true
Notes:
  1. Interfaces are not treated separately from classes. The '.class' token can also refer to an interface.
  2. The Java virtual machine syntax is used for specifying types. These types are:
    • 'B' = byte
    • 'C' = char
    • 'D' = double
    • 'F' = float
    • 'I' = int
    • 'J' = long
    • 'S' = short
    • 'V' = void
    • 'Z' = boolean
    • 'LCOM/widgetco/MyClass;' = the type of class MyClass in package COM.widgetco
    • '[B' = a 1-dimensional array of bytes
    • '[[Ljava/lang/String;' = a 2-dimensional array of Strings
    For a full description of this syntax, see The Java Virtual Machine Specification by Yellin and Lindholm.
  3. Fully qualified class, method and field identifiers are used. So, for a class 'MyClass' with method 'void aMethod(boolean isSet)' and field 'float aField' in a package 'COM.widgetco', the script entries to reserve these identifiers would be:
    • .class COM/widgetco/MyClass
    • .method COM/widgetco/MyClass/aMethod (Z)V
    • .field COM/widgetco/MyClass/aField F
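
Writing type descriptors by hand is error-prone, so it can help to derive them from the classes themselves. The sketch below converts a 'java.lang.Class' object into the virtual machine syntax described in note 2. The class 'DescriptorSketch' and its methods are hypothetical helpers and are not part of RetroGuard.

```java
// Hypothetical helper: builds JVM type and method descriptors.
public class DescriptorSketch {
    // Descriptor for a single type, e.g. boolean -> "Z", byte[] -> "[B".
    public static String typeDescriptor(Class<?> type) {
        if (type.isArray()) return "[" + typeDescriptor(type.getComponentType());
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == double.class) return "D";
        if (type == float.class) return "F";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == short.class) return "S";
        if (type == void.class) return "V";
        if (type == boolean.class) return "Z";
        // Any other type is a class or interface: L<name with slashes>;
        return "L" + type.getName().replace('.', '/') + ";";
    }

    // Descriptor for a method: parameter types in parentheses, then the
    // return type.
    public static String methodDescriptor(Class<?> returnType, Class<?>... params) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> p : params) sb.append(typeDescriptor(p));
        return sb.append(")").append(typeDescriptor(returnType)).toString();
    }
}
```

For the method 'void aMethod(boolean isSet)' in note 3, 'DescriptorSketch.methodDescriptor(void.class, boolean.class)' gives '(Z)V', which is the final token of the '.method' entry.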

Example script entries

# All text in a line following '#' is treated as a comment, and is ignored.

# Preserve the class name 'MyClass' in package 'COM.widgetco'
# (note that the package identifiers 'COM' and 'widgetco' are
# automatically preserved also, so that 'MyClass' remains fully
# accessible):
.class COM/widgetco/MyClass

# Preserve the class name 'MyClass' and all public methods and
# fields declared in it:
.class COM/widgetco/MyClass public

# Preserve the class name 'MyClass' and all public, protected or
# default-access (package) methods and fields declared in it:
.class COM/widgetco/MyClass protected

# Preserve the class name 'MyClass' and all public, protected or
# default-access (package) methods (but not fields) declared in it:
.class COM/widgetco/MyClass protected method

# Preserve all classes and their public fields in the package
# 'COM.widgetco':
# (Reference to all classes in a package is the only situation
# where wildcards are allowed)
.class COM/widgetco/* public field

# Preserve the 'SourceFile' debugging attribute of all classes:
.attribute SourceFile

# Preserve the method 'double getValue(Object obj)' in class 'MyClass':
.method COM/widgetco/MyClass/getValue (Ljava/lang/Object;)D

# Preserve the field 'char aCharacter' in class 'MyClass':
.field COM/widgetco/MyClass/aCharacter C




5. Handling of non-'.class' files

For all but the most trivial Java applications and applets, there are resource files associated with the code and stored with it in the Jar file. These resources can include images, audio files and resource bundles for localizing internationalized code. In addition, the Jar file contains a manifest file and can contain files associated with digital signatures. Below we specify the way that RetroGuard deals with each of these types of non-'.class' files in the Jar.

Resource files

The common idioms for accessing resource files from a Java class are:
  1. using a relative path and the java.lang.Class methods
    • InputStream getResourceAsStream(String relativePath);
    • URL getResource(String relativePath);
  2. using an absolute path and the java.lang.ClassLoader methods
    • InputStream getResourceAsStream(String absolutePath);
    • URL getResource(String absolutePath);
The relative paths are prefixed at run-time by the package path of the current class (for a class 'COM.widgetco.MyClass' the prefix is 'COM/widgetco/'), and then the named resource is found. If the package name has been obfuscated, the path of the resource must also have been updated with the obfuscated version so that the resource can still be located.
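
The rule that 'java.lang.Class' documents for resolving a resource name can be sketched as below: a name beginning with '/' is treated as absolute, and any other name is resolved against the package of the class. The helper 'ResourceNameSketch' is hypothetical and only mirrors that rule for ordinary (non-array) classes; it does not load anything.

```java
// Hypothetical helper mirroring how Class.getResource resolves names.
public class ResourceNameSketch {
    public static String resolve(Class<?> owner, String name) {
        // A leading '/' marks an absolute name: only the slash is removed.
        if (name.startsWith("/")) return name.substring(1);
        String className = owner.getName();
        int lastDot = className.lastIndexOf('.');
        // A class in the default package has no prefix to add.
        if (lastDot < 0) return name;
        // Otherwise the package name, with '.' replaced by '/', is prepended.
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }
}
```

Because the prefix is taken from the class at run-time, a relative name such as 'anImage.gif' follows the class when its package is renamed, whereas an absolute name written into the code as a 'String' does not.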

RetroGuard's behavior is to update all components of a resource's path with obfuscated versions. The possibility exists that classes lose track of their resources if those resources have been referenced through absolute paths. For this reason, we encourage the use of the relative path methods of java.lang.Class, rather than the absolute path methods of java.lang.ClassLoader.

Manifest file

When a Jar file is created, a manifest of its contents is generated by default and stored in the text file 'META-INF/MANIFEST.MF' in the Jar. In this manifest there is a section for each file in the Jar containing the file name, some message digests (which are like checksums for verifying the file contents), and possibly some additional information such as the line 'Java-Bean: True' if the file is a JavaBean class. Obfuscation of classes causes the file names and the message digests in the manifest to become invalid. RetroGuard generates a new manifest for the obfuscated Jar with the obfuscated class and resource names, with fresh MD5 and SHA-1 message digests, and with the additional information, such as 'Java-Bean' specifiers, copied over from the original manifest file.
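
A message digest of the kind stored in a manifest can be computed with the standard 'java.security.MessageDigest' class, as in the sketch below. Manifest digests are written in Base64 form. The helper 'DigestSketch' is hypothetical, it uses 'java.util.Base64' from later JDK releases, and it does not reproduce the exact manifest attribute names that RetroGuard writes.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper: Base64-encoded digest of a file's bytes.
public class DigestSketch {
    // 'algorithm' is a standard name such as "MD5" or "SHA-1".
    public static String digest(String algorithm, byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            return Base64.getEncoder().encodeToString(md.digest(data));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Since the digest depends on every byte of the file, any change made to a class by obfuscation produces a different value, which is why the old manifest entries cannot be reused.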

Signature files

Digital signatures in the Jar file cannot be updated by RetroGuard automatically. If a Jar is to be digitally signed, this should be done after obfuscation is complete. Any signature files ('META-INF/*.SF') found in the Jar prior to obfuscation are discarded.

For information about digital signatures for Jar files, see the information on the JavaSoft website.




6. Script generation wizard

A graphical user interface is included with RetroGuard to make script generation and updating more straightforward. It must be stressed that this GUI will not run the obfuscator - RetroGuard is strictly a command line tool. The visual tool's purpose is to ease the generation of script files.

This visual tool scans a JAR automatically for applications, applets and JavaBeans (the most common entry points to a JAR) as well as allowing reservation of any other class, interface, method and field names.

The tool takes the form of a multi-panel wizard, which leads the user through the selection of identifiers and attributes which are to be preserved during obfuscation. The following sections describe the script management tool, panel by panel.

Jar file selection



When the script management tool is run (using 'java RGgui') the Jar file selection panel is displayed. On line '1.' the user selects the Jar file for which the script is to be generated, either by entering the relative or absolute path in the text field or by hitting the 'Browse' button, which causes a file selection dialog to be displayed.

If a new script is being generated you can now hit 'Next' to proceed. If you are editing an existing script file, turn on the checkbox in line '2.' and enter or browse to the name of the existing script, and then hit 'Next' to proceed.

Application, applet, bean selection



The Jar file is now analysed. Depending on the number of classes in the Jar this may take a few moments. Once analysis is complete, the class names of any applications, applets and JavaBeans are displayed in the lists on this panel.

If you are generating a fresh script file, these application, applet and Bean classes are automatically selected for preservation. If you are editing an existing script, the settings that were read from the input script are displayed instead.

To change an obfuscation setting, just select the class name in the list and select or de-select the appropriate checkbox.

Once you are satisfied with the settings, hit 'Next' to proceed. Use the 'Back' button on any panel to come back and adjust settings, or even start over with a different Jar file, at any time.

Class preservation



In this panel, the upper list is a tree control which gives access to all the packages, classes and interfaces in the Jar file. To open or close a package double-click on the '[package] name' line. If a class has inner classes within it, double-click on the class name to see the inner classes.

When you select a class or interface name in the tree control, the current obfuscation settings are displayed using the checkboxes beneath. Select or de-select the checkboxes to change obfuscation of that class and its methods and fields.

If you select a '[package]' line in the tree control, the obfuscation settings for all classes and interfaces in that package can be set at once (this will be output as a wildcard line in the script file).

Once you are satisfied, hit 'Next' to proceed.

Method and field preservation



In this panel, there are three sections: the upper list is a tree control which is synchronized to the packages / class / interface tree control of the previous panel, and which is used in just the same way.

The lower two lists show methods and fields declared in the class or interface selected. Select a method or field and click the 'Preserved?' checkbox to change its obfuscation settings. The 'Show all?' checkboxes are checked by default: this means that all methods and fields are shown. If 'Show all?' is unchecked, then a filter is applied and only methods and fields which are of immediate interest are shown. Specifically, 'public', 'protected' and default-access (package) methods and fields which have not been selected on the previous panel are displayed in these lists, with one exception: the special 'private' methods required by Java's 'Serializable' interface are also displayed. Unchecking the 'Show all?' checkbox can help to focus attention on methods and fields you may wish to preserve at this stage.

Once you are satisfied, hit 'Next' to proceed.

Attribute preservation and script file generation



The checkboxes in the upper part of this panel allow certain class-file attributes to be preserved during obfuscation. These settings are applied globally across all classes and interfaces in the Jar file. It is unlikely you will want to adjust these settings; however, if you wish to use a debugger like 'jdb' on the obfuscated code, turn on the checkboxes to preserve 'SourceFile' and 'LineNumberTable'.

Type the name for the generated script file in the text field beneath, or use the 'Browse' button to select the name using a file dialog.

Finally, if you are satisfied with the obfuscation settings shown in all of the panels, hitting 'Finish' will construct the script file and exit.

In many cases, it will be enough to let the script generator's automatic analysis feature find your applications, applets and JavaBeans and then to skim through the other panels, making little or no change to the settings.




7. Creating patch files

RetroGuard (v1.1 and later) has the ability to apply obfuscation incrementally to successive versions of your software. This feature allows you to ship small patch files to your customers when a bug-fix or other modification has occurred in your code, rather than shipping the whole software package again. Patches cannot be generated for obfuscated Jars which were created by earlier versions of RetroGuard.

Incremental obfuscation

Normally, when obfuscation is applied to a Jar file, the obfuscated names bear no particular relation to the names in any previous or later version of your software. Even when the changes to your code are minimal there can be substantial changes to the name mappings, because of the process that RetroGuard uses to optimize the obfuscated name-spaces. The generation of a "patch" Jar which contains only the modified classes has not been possible.

This restriction has been overcome in RetroGuard v1.1 by modifying the format of the log-file, so that it can be used in place of the script file during obfuscation of a later version. The mapping entries in the log then act to restrict this later obfuscation. The classes containing your bug-fix or feature addition can then be extracted and shipped as a patch file.

The general rule is, when creating a patch between an old version of the software and a new version, use the output log-file from the old version's obfuscation as the input script file to the new version's obfuscation. This procedure will be demonstrated below in an example.

Patch utility

The extraction of updated classes and resource files from the obfuscated Jar and the creation of the patch Jar could be done by hand. Starting with a list of the classes and resources to be included in the patch, you would convert these names to their obfuscated form by referring to the log-file, then extract the obfuscated files from the full Jar and repack them into a small patch Jar.

This procedure is tiresome and error-prone, particularly on the Windows operating systems due to the limitations of the file system (Java identifiers are case-sensitive, but Windows file names are not, which can lead to one class overwriting another during the un-jar/re-jar process). A utility called RGpatch has therefore been provided to automate this patch generation.

The command for running RGpatch has the form,

java RGpatch WHOLE-JAR PATCH-JAR LOGFILE LISTFILE
where:
  • WHOLE-JAR is the filename for the complete, incrementally obfuscated Jar file
  • PATCH-JAR is the filename that will be used for the generated patch Jar file
  • LOGFILE is the filename of the log output during the obfuscation of WHOLE-JAR
  • LISTFILE is the filename of a text file containing a list of the unobfuscated names of class and resource files which are to be included in the patch Jar.
There are no optional arguments for this command. If WHOLE-JAR, LOGFILE or LISTFILE cannot be read, or if PATCH-JAR cannot be written, execution is terminated with a warning message.

An example

The generation of obfuscated patch files is easiest to understand through an example. We consider the case where a company ships an initial release (version 1.0) of its product, called "JTool", to its customers through a website download. To discourage reverse engineering of the product, the company obfuscates their Jar file, called jtool-unobf-1.0.jar, using RetroGuard. The script file used is called jtool-1.0.rgs and contains a single entry .class JTool public method which provides access to the main method of the JTool Java application. The obfuscation command is:

java RetroGuard jtool-unobf-1.0.jar jtool-1.0.jar jtool-1.0.rgs jtool-1.0.log
The company then posts the obfuscated Jar jtool-1.0.jar to its website, where it is downloaded by customers. The log-file jtool-1.0.log contains all of the obfuscation mappings, and so is archived carefully and securely by the company. This log-file will be vital during the creation of future patches to the software.

Several weeks later, a customer reports a problem with the JTool product. It is found that fixing the problem requires only a small change to one class and to an associated resource file. While it would be possible to recompile and obfuscate the fixed software, and repost the whole Jar to the website, this is not an attractive option for JTool's customers because the whole Jar is several Mbytes in size and takes some time to download.

A better alternative is to create a small patch file containing only the modified class and resource. To do this, the whole JTool application is first compiled and packaged into a Jar jtool-unobf-1.1.jar. The Jar is then incrementally obfuscated using the command:

java RetroGuard jtool-unobf-1.1.jar jtool-1.1.jar jtool-1.0.log jtool-1.1.log
Notice that instead of using the script file jtool-1.0.rgs as input to this obfuscation, we have used the log-file from the original (v1.0) obfuscation process. This log-file was generated in an extended form of the RGS script format, and contains the original entry from jtool-1.0.rgs as well as all the name mappings from the first obfuscation process. Obfuscating in this way means that the v1.1 code can be binary compatible with the originally shipped v1.0 code.
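
If the obfuscation step is driven from a Java-based build tool, the rule "old log-file in, new log-file out" can be captured in one place. The sketch below only assembles the command line shown above; the class 'IncrementalCommandSketch' and the file naming pattern taken from the JTool example are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the incremental obfuscation command.
public class IncrementalCommandSketch {
    public static List<String> command(String product, String oldVersion, String newVersion) {
        return Arrays.asList(
            "java", "RetroGuard",
            product + "-unobf-" + newVersion + ".jar",  // INPUT-JAR
            product + "-" + newVersion + ".jar",        // OUTPUT-JAR
            product + "-" + oldVersion + ".log",        // SCRIPT: the old log-file
            product + "-" + newVersion + ".log");       // LOGFILE: archive this one
    }
}
```

The returned list could be handed to 'java.lang.ProcessBuilder' to run the command. For 'command("jtool", "1.0", "1.1")' the entries, joined by spaces, match the command given in this example.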

All that remains is to extract the modified class and resource from the jtool-1.1.jar using the RGpatch utility. First a text file jtool-patch-1.0-1.1.txt is created containing the unobfuscated names of the modified class and resource:

COM/widgetco/MyClass.class
COM/widgetco/anImage.gif
The command to create the patch is then:
java RGpatch jtool-1.1.jar jtool-patch-1.0-1.1.jar jtool-1.1.log jtool-patch-1.0-1.1.txt
This utility works in two stages: first the log-file jtool-1.1.log is used to convert the entries in jtool-patch-1.0-1.1.txt to obfuscated form; these converted entries are then extracted from jtool-1.1.jar and used to create the file jtool-patch-1.0-1.1.jar.
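
The two stages described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the RGpatch source: the class PatchSketch and its method are hypothetical, the name mappings are supplied as an in-memory Map rather than parsed from the log-file (whose format is not reproduced here), and names with no mapping are assumed to pass through unchanged. Manifest handling and inner classes are left out.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical sketch of a two-stage patch builder; not the RGpatch source.
final class PatchSketch {
    // nameMap stands in for the mappings held in the log-file (assumption).
    // Returns the number of entries copied into the patch Jar.
    static int createPatch(String fullJar, String patchJar,
                           Map<String, String> nameMap,
                           List<String> listedNames) throws IOException {
        int copied = 0;
        try (JarFile in = new JarFile(fullJar);
             JarOutputStream out =
                 new JarOutputStream(new FileOutputStream(patchJar))) {
            for (String original : listedNames) {
                // Stage 1: convert the unobfuscated name to obfuscated form.
                // Unmapped names are passed through unchanged (assumption).
                String obfuscated = nameMap.getOrDefault(original, original);

                // Stage 2: extract that entry from the full obfuscated Jar.
                JarEntry entry = in.getJarEntry(obfuscated);
                if (entry == null) {
                    throw new IOException("Not found in Jar: " + obfuscated);
                }
                out.putNextEntry(new JarEntry(obfuscated));
                try (InputStream data = in.getInputStream(entry)) {
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = data.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }
}
```

In this sketch a missing entry is treated as an error, so that a mistyped name in the list file cannot silently produce an incomplete patch. Whether RGpatch itself behaves this way is not stated in this guide.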

Once again, the log-file jtool-1.1.log is archived by the company for use in generating future patches to their software. The patch file jtool-patch-1.0-1.1.jar (which is only a few Kbytes in size), is posted to the website and downloaded by JTool customers. These customers just insert the patch Jar ahead of the original Jar in their classpaths, and run JTool as usual. So, if they previously ran JTool using the command:

jre -cp jtool-1.0.jar JTool
they could now take advantage of the bug-fix by using the patch file:
jre -cp jtool-patch-1.0-1.1.jar:jtool-1.0.jar JTool
(Note: a colon ':' is used to separate classpath entries in the example above. This is correct on Unix systems - on Windows systems a semi-colon ';' should be used instead.)
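
Where the classpath is assembled by a launcher program rather than typed by hand, the separator difference can be handled portably. The following is a minimal sketch; the class PatchClasspath and its method are hypothetical names introduced for illustration and are not part of RetroGuard.

```java
import java.io.File;

// Hypothetical helper, not part of RetroGuard: builds a classpath string
// with the patch Jar ahead of the original Jar.
final class PatchClasspath {
    // The separator is a parameter so the result can be shown for any platform.
    static String build(String separator, String patchJar, String originalJar) {
        // The patch comes first so that its classes are found before the
        // older copies in the original Jar.
        return patchJar + separator + originalJar;
    }

    public static void main(String[] args) {
        // File.pathSeparator is ":" on Unix systems and ";" on Windows systems.
        System.out.println(build(File.pathSeparator,
                "jtool-patch-1.0-1.1.jar", "jtool-1.0.jar"));
    }
}
```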

Advanced issues

In this section we touch on some advanced topics on incremental obfuscation and patching, using RetroGuard v1.1 and RGpatch.

Patch sequences: One method of providing patches is to include in each patch only those changes which have occurred since the last patch was issued. The sequence of major release and patch Jar names might be:

jtool-1.0.jar (2.2 MB)
jtool-patch-1.0-1.1.jar (7 KB)
jtool-patch-1.1-1.2.jar (4 KB)
jtool-patch-1.2-1.3.jar (17 KB)
jtool-patch-1.3-1.4.jar (11 KB)

jtool-2.0.jar (3.4 MB)
jtool-patch-2.0-2.1.jar (22 KB)
jtool-patch-2.1-2.2.jar (13 KB)
Using this method means that each patch is as small as possible. However, the user's classpath can become polluted by a long series of patch files which must be given in the correct order.

An alternative technique is to have each patch contain all changes since the last major release. The user has only the most recent patch file in their classpath, but each patch becomes larger than the last. A typical sequence of major release and patch Jar names might then be:

jtool-1.0.jar (2.2 MB)
jtool-patch-1.0-1.1.jar (7 KB)
jtool-patch-1.0-1.2.jar (11 KB)
jtool-patch-1.0-1.3.jar (28 KB)
jtool-patch-1.0-1.4.jar (39 KB)

jtool-2.0.jar (3.4 MB)
jtool-patch-2.0-2.1.jar (22 KB)
jtool-patch-2.0-2.2.jar (35 KB)
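
The effect of the two schemes on the user's classpath can be sketched as follows. The class PatchSequence is a hypothetical illustration using the Jar naming pattern from the lists above. It assumes that the newest patch must come first in the classpath, which is our reading of "the correct order" given that a patch Jar is placed ahead of the Jar it fixes.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the two patch schemes; not part of RetroGuard.
final class PatchSequence {
    // One patch per step: every patch since the major release is needed.
    static List<String> incremental(String major, List<String> patchVersions) {
        List<String> classpath = new ArrayList<>();
        String from = major;
        for (String to : patchVersions) {
            // Insert at the front so the newest patch ends up first (assumption).
            classpath.add(0, "jtool-patch-" + from + "-" + to + ".jar");
            from = to;
        }
        classpath.add("jtool-" + major + ".jar");
        return classpath;
    }

    // Each patch holds all changes since the major release: only the latest is needed.
    static List<String> cumulative(String major, List<String> patchVersions) {
        List<String> classpath = new ArrayList<>();
        if (!patchVersions.isEmpty()) {
            String latest = patchVersions.get(patchVersions.size() - 1);
            classpath.add("jtool-patch-" + major + "-" + latest + ".jar");
        }
        classpath.add("jtool-" + major + ".jar");
        return classpath;
    }

    public static void main(String[] args) {
        List<String> patches = Arrays.asList("1.1", "1.2", "1.3", "1.4");
        System.out.println(String.join(File.pathSeparator, incremental("1.0", patches)));
        System.out.println(String.join(File.pathSeparator, cumulative("1.0", patches)));
    }
}
```

With four patches issued, the first scheme yields a classpath of five Jars and the second a classpath of two.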

Whichever method suits your project, it is clear that a very careful accounting must be kept of code changes, obfuscation log-files and shipped versions of your software. A robust and flexible version control system such as CVS (available as free software from Cyclic Software) is recommended for this purpose.

In addition, it is recommended that a comprehensive suite of automated tests be developed for your product. Such a suite is always desirable, but it is particularly helpful in testing patches: it can be used to exercise the unobfuscated, obfuscated, and patched versions of the product prior to shipping. Results should be identical in each case.

Additional entry points: A patch can be used to introduce one or more new entry points into your product. This will, as usual, require the listing of these entry points in the RGS file used during obfuscation. Since the log-file from the prior version is used as the RGS script file during incremental obfuscation, the additional .class, .method, .field or .attribute entries should be placed in the log-file before obfuscation.

The existing entries in the log-file should not, under any circumstances, be edited or removed: to do so would cause corruption of the generated patch file. In addition, any new entry points (methods which are to be left unobfuscated) must have been introduced to the code since the previous version. It is not possible to change an existing method from being obfuscated in one version to being unobfuscated in the next.
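
A simple safeguard is to compare the edited log-file against the archived copy before running the incremental obfuscation. The sketch below is a hypothetical helper, not part of RetroGuard. It treats the log-file purely as lines of text and makes no assumption about the log format: it reports any line of the archived log that no longer appears verbatim in the edited copy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pre-obfuscation check; not part of RetroGuard.
final class LogEntryCheck {
    // Returns the archived lines that are missing from the edited log.
    static List<String> missingLines(List<String> archived, List<String> edited) {
        Set<String> present = new HashSet<>(edited);
        List<String> missing = new ArrayList<>();
        for (String line : archived) {
            if (!present.contains(line)) {
                missing.add(line); // this entry was edited or removed
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java LogEntryCheck ARCHIVED-LOG EDITED-LOG");
            return;
        }
        List<String> archived = Files.readAllLines(Paths.get(args[0]));
        List<String> edited = Files.readAllLines(Paths.get(args[1]));
        for (String line : missingLines(archived, edited)) {
            System.out.println("changed or removed: " + line);
        }
    }
}
```

New entries added to the log-file do not trigger a report, since only the archived lines are checked.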

Inner classes: Since inner classes have privileged access to their outer classes, a patch containing a class must also contain all of its inner classes. The RGpatch utility takes care of this automatically: inner class names in the LISTFILE are ignored, while the listing of an outer class causes all of its inner classes to be copied to the patch Jar as well.
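
The selection rule can be sketched as follows, using the standard compiler convention that an inner class of Outer is stored as Outer$Inner.class. The class InnerClassSelector is a hypothetical illustration working on unobfuscated entry names. How RGpatch implements the rule internally is not described in this guide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of outer/inner class selection; not the RGpatch source.
final class InnerClassSelector {
    // Returns the outer class entry plus all of its inner class entries,
    // in the order they appear in entryNames.
    static List<String> withInnerClasses(List<String> entryNames, String outerEntry) {
        if (!outerEntry.endsWith(".class")) {
            throw new IllegalArgumentException("Not a class entry: " + outerEntry);
        }
        // "COM/widgetco/MyClass.class" -> "COM/widgetco/MyClass$"
        String innerPrefix =
            outerEntry.substring(0, outerEntry.length() - ".class".length()) + "$";
        List<String> selected = new ArrayList<>();
        for (String name : entryNames) {
            boolean isOuter = name.equals(outerEntry);
            boolean isInner = name.startsWith(innerPrefix) && name.endsWith(".class");
            if (isOuter || isInner) {
                selected.add(name);
            }
        }
        return selected;
    }
}
```

Matching on the prefix including the '$' character ensures that an unrelated class whose name merely begins with the same letters is not selected.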

Name conversion utility

It can be useful to convert a list of original class names to their obfuscated form, when processing an obfuscated Jar-archive after using RetroGuard. The utility RGconv has been provided for this conversion.

The command for running RGconv has the form:

java RGconv LOGFILE [ORIG-NAMES [OBF-NAMES]]
where:
  • LOGFILE is the filename of the log output during the obfuscation process;
  • ORIG-NAMES is the filename of a text file listing the unobfuscated names of the classes to be mapped to obfuscated form, one per line in the format COM/mycorp/AClass.class (this argument is optional; if absent, the list of names is read from the standard input (stdin) stream);
  • OBF-NAMES is the target file for the converted, obfuscated class names (this argument is optional; if absent, the list of converted names is written to the standard output (stdout) stream).
If LOGFILE cannot be read, execution is terminated with a warning message.
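
The optional-argument behaviour described above follows the usual filter pattern, which can be sketched as below. This is not the RGconv source: the class NameConvSketch is hypothetical, parsing of LOGFILE is omitted, and the single mapping in main is an invented placeholder. The sketch therefore takes only [ORIG-NAMES [OBF-NAMES]] as arguments, and it assumes that blank lines are skipped and unmapped names are written unchanged.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a name-conversion filter; not the RGconv source.
final class NameConvSketch {
    // Reads one class name per line and writes its mapped form.
    static void convert(Map<String, String> nameMap,
                        BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            String name = line.trim();
            if (name.isEmpty()) {
                continue; // skip blank lines (assumption)
            }
            // Unmapped names are written unchanged (assumption).
            out.println(nameMap.getOrDefault(name, name));
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Invented placeholder mapping; a real tool would read it from LOGFILE.
        Map<String, String> nameMap = new HashMap<>();
        nameMap.put("COM/mycorp/AClass.class", "a/a/a.class");

        // Fall back to stdin and stdout when the optional arguments are absent.
        BufferedReader in = args.length > 0
            ? Files.newBufferedReader(Paths.get(args[0]))
            : new BufferedReader(new InputStreamReader(System.in));
        PrintWriter out = args.length > 1
            ? new PrintWriter(Files.newBufferedWriter(Paths.get(args[1])))
            : new PrintWriter(System.out);
        try {
            convert(nameMap, in, out);
        } finally {
            in.close();
            out.close();
        }
    }
}
```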





Copyright © 1999 Retrologic. All Rights Reserved.
Send comments and questions to info@retrologic.com.