Difference between revisions of "Descript"

From DoomWiki.org

(Wikify and partially convert language to encyclopedic style (original content was retained with author's permission))
m (Important notes: Cannot have literal = in a template argument)

Revision as of 14:27, 4 March 2016

Descript is an ACS decompiler for Hexen. It is an MS-DOS command-line program which, when given three parameters (the WAD file, the output file, and the map number), creates an approximation of the original ACS script. Descript is a freeware utility. It performs the opposite operation of the ACC program created by Raven Software, which compiles a readable ACS script into bytecode.

Introduction

Descript is a full-fledged, user-friendly Hexen script decompiler which generates human-readable source code from any compiled script. The compiled script may take the form of a self-contained object file, or the BEHAVIOR resource in a main or patch WAD file.

Being able to decompile the existing scripts inside HEXEN is a boon to anyone wishing to design their own levels for the game, as it allows them to learn the ACS language and experiment freely without having to do a lot of work up front. Since the source files that Descript produces can be readily recompiled, it is simple to make and test modifications to the scripts for existing levels.

Features

The source files generated by Descript are designed to be highly readable: not only are all string parameters decoded, but also any variables passed as string parameters are automatically shown as type str (unless they are script arguments), and any strings they are assigned to are also shown. Most of the enumerated types (in header file defs.acs) are also fully decoded, again with variable assignments correctly shown. Even the return values from functions such as gametype() and gameskill() are decoded when used directly in conditional or switch statements.

The net result is that the generated script files are usually close in appearance to those that Raven Software used when making the game. However, as with all compiled languages, some information in the source is not translated into the object file and is therefore permanently lost. For ACS, this applies to:

Variable names

Descript now names variables according to their usage, so for example, a variable used as a "thing ID" will be called "tid". This makes the code even more readable, especially where parameters are being passed into scripts. The full list of decoded types is:

   tid - thing ID on map
   tag - sector tag on map
   line - line identification
   poly - polyobject identification
   thing - thing type (assignments will be decoded)
   str - all strings (assignments will be decoded)
   spare - the variable is not used anywhere
   var - default type, if not decoded

Whenever Descript names a variable, it prefixes it with "world" for world-scope variables, "map" for map-scope variables, "a" for arguments and nothing for local variables. In all cases, it adds a suffix number, which is the number of the variable in the object code. Note that all map-scope and script-scope variables are listed in ascending numerical order, even if not all are used. For example, if only the 4th map variable is actually used, the first three will be displayed as "mapspare0" through "mapspare2". If this were not done, the recompiled script would differ from the original.
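
The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the scheme as stated in the text, not Descript's actual code; the function names variable_name and declared_map_variables, and the dictionary used as input, are hypothetical.

```python
# Illustrative sketch of the naming scheme described above; not Descript's code.
# Scope prefixes as given in the text; local variables get no prefix.
SCOPE_PREFIX = {"world": "world", "map": "map", "arg": "a", "local": ""}


def variable_name(scope, kind, number):
    """Build a name from scope prefix, decoded type, and object-code number."""
    return f"{SCOPE_PREFIX[scope]}{kind}{number}"


def declared_map_variables(used):
    """List map variables in ascending order up to the highest one used.

    `used` maps a variable number to its decoded type (hypothetical input).
    Unused lower-numbered variables are filled in as "spare".
    """
    if not used:
        return []
    return [
        variable_name("map", used.get(n, "spare"), n)
        for n in range(max(used) + 1)
    ]


# Only the 4th map variable (number 3) is used, here as a thing ID.
names = declared_map_variables({3: "tid"})
print(names)  # ['mapspare0', 'mapspare1', 'mapspare2', 'maptid3']
```

The three "spare" entries are what keep the used variable at number 3 when the source is recompiled.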

Variables declared but not used

Unless a script or map variable is numerically below one that is used, or forms a script parameter, it will not show up on the decompiled source. This is because only variable usage is compiled; variable declarations are not.

Strings assigned to a variable but not used

This case is slightly different, as the string will be compiled into the object code, but Descript will not reproduce it because the variable is never passed to anything that it recognizes as taking a string parameter. This is because strings are stored as handles: each handle is a number that indexes the list of strings stored at the end of the script. The only difference between a string and a number is where it is used.
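
As a rough model of why usage is the only clue, the following Python sketch treats a string handle as a plain integer index into a string table. The table contents and the decode_operand helper are hypothetical and only illustrate the idea; they do not reflect Descript's internals.

```python
# Illustrative model of string handles; names and data here are hypothetical.
# The strings stored at the end of the script, indexed by handle.
STRING_TABLE = ["Welcome", "Door locked"]


def decode_operand(value, used_as_string):
    """Show an operand as a quoted string only when its use marks it as a handle."""
    if used_as_string and 0 <= value < len(STRING_TABLE):
        return f'"{STRING_TABLE[value]}"'
    # Without a string-taking use, the same number is just a number.
    return str(value)


print(decode_operand(1, used_as_string=True))   # "Door locked"
print(decode_operand(1, used_as_string=False))  # 1
```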

User defined macros

If the user originally used their own macros (using the #define directive), these will not be reproduced, as they are not present in the object code.

Comments and spacing

These are not compiled, so do not show up.

Advanced features

Descript has the capability to display extended definitions for even greater readability, but because these are non-standard, you also need an extended definitions file. This can be generated with the /g option, and has a fixed name, DSEDEFS.ACS (DeScript Extended DEFinitionS). Then, when decompiling scripts, use the /u option to use extended definitions. The extended definitions currently provided are:

  • Line specials - The setlinespecial() function takes in a special number as its second parameter, which is the same number as the compiled code uses when calling one of these functions directly. Since Descript already decompiles these names, it is a simple matter to decode them in this context, too. All the names are prefixed by "LS_", to distinguish them from the actual special function names. Note that there are a few, such as UsePuzzleItem, which are not listed in SPECIALS.ACS, as they are not suitable for being called directly from within scripts (or it is pointless to do so).
  • Sector sounds - The special Sector_ChangeSound takes in a parameter defining the sound to be selected, as listed in the technical specs. Descript will decode this, prefixing it with type "SS_".
  • Keys and locks - Two specials, Door_LockedRaise and ACS_LockedExecute take in a parameter specifying the key number that the player must hold for the action to take place. These are listed in the technical specs, and prefixed by type "KEY_".
  • Puzzle items - The first parameter of UsePuzzleItem() is decoded to a sensible name, as listed in the Hexen technical specs. However, since this special is unlikely to crop up, because it is not in SPECIALS.ACS, you may think that it is pointless decoding this parameter. Actually, this is quite useful, as Descript also decodes the parameters when passed to setlinespecial().

If Descript finds a setlinespecial() function, with a numerical parameter for the special type, it decodes all subsequent numerical parameters as if they were being passed to the special function directly. This extends readability still further, especially as this is the only place that some specials, such as UsePuzzleItem(), are used.

Note that you only need to generate the DSEDEFS.ACS file once, as it will be available to all subsequent decompilations. You can generate it at the same time as doing a decompilation, or on its own, and you don't have to specify the /u option at the same time. The generation of this file takes place before Descript opens any other files, so it will not affect operation in any other way (though it will slow down execution slightly). Note that you should check that the version of Descript you are using matches that displayed in the DSEDEFS.ACS file, and regenerate it if it does not.

Usage

To use Descript to decompile an object file, or the first BEHAVIOR resource in a WAD, type:

   DESCRIPT <objectfile> <sourcefile>

For example, to decompile the first script in HEXEN.WAD into a source file called HEXLEV1.ACS, type:

   DESCRIPT HEXEN.WAD HEXLEV1.ACS

If you have a multi-level WAD, such as HEXEN.WAD, you can extract any script into a source file, by directly specifying the map number on the end of the line:

   DESCRIPT <wadfile> <sourcefile> <map>

For example, to decompile level MAP08 in HEXEN.WAD, into file HEXLEV8.ACS, type:

   DESCRIPT HEXEN.WAD HEXLEV8.ACS 8
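
When several maps need decompiling, the command lines can be generated rather than typed. The Python sketch below only builds the command strings from the syntax shown above; the helper name descript_command is hypothetical, and the HEXLEV<n>.ACS output names simply follow the examples in this section.

```python
def descript_command(wad, source, map_number=None):
    """Build a Descript command line using the syntax documented above."""
    args = ["DESCRIPT", wad, source]
    if map_number is not None:
        # The optional map number goes at the end of the line.
        args.append(str(map_number))
    return " ".join(args)


# One command per map; the output file names are an illustrative convention.
commands = [
    descript_command("HEXEN.WAD", f"HEXLEV{n}.ACS", n) for n in (1, 8, 30)
]
for line in commands:
    print(line)
```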

You can find out what levels are present in a WAD file by typing:

   DESCRIPT <wadfile> /l

Descript contains online help, accessed by just typing its name with no parameters, or by specifying the /? or ? switch. Additional debugging information can be printed out by adding the /v switch, which includes a list of all the strings and scripts in the file.

Other options

If a disassembly of the object module is desired rather than a decompilation, use the /a switch. Thus, to get an assembly listing of MAP30 in HEXEN.WAD, into file HEXLEV30.DIS, type:

   DESCRIPT HEXEN.WAD HEXLEV30.DIS 30 /a

If you do not like the nesting being shown with the brackets at the inner nesting level, you can use the /n option, where the brackets will be shown at the outer nesting level.

You can prevent Descript from displaying context sensitive variable names with the /c option, and from decoding variable assignments to enumerated types with the /e option. If you don't want assignments to text strings being displayed, use the /s option. All these are useful if you find that false information is being displayed due to the same variables being used for different purposes in the script.

Important notes

Descript is only guaranteed to work on code produced using the ACC compiler created by Ben Gokey of Raven Software. It relies on the compiler following a fixed set of rules, both in the compiled code and the layout of the ACS resource (it expects all the scripts to be sequentially arranged, followed by the strings, and finally the internal directory). Many changes which would still result in correctly executing code will be fatal for Descript, as its rules would be violated. This is not a design deficiency in Descript; several high-level statements can only be reproduced because the low-level opcodes appear in an exact sequence.

Descript relies on good programming practices having been used in the original scripts. For example, if a variable is used as both a string and an integer, this may result in Descript displaying false strings, and could result in the code recompiling differently. The enumerated types and context-sensitive variable naming could also cause a problem if the same variable is used for different purposes in different parts of the script, though the code will always recompile the same. If you find this a problem, you can disable variable assignments to enumerated types with the /e option, and context-sensitive variable names with the /c option. If string assignments are being falsely displayed, you can turn them off with the /s option, though beware that this will seriously affect how the code will recompile. Note that use of the /s or /e switches will also disable the relevant context-sensitive variable names, as they stop variables from being assigned the appropriate types.

There appears to be a minor feature in the ACC compiler. If you create two nested "do" loops without brackets, and put a "continue" statement in the inner statement, the compiler will actually make this jump to the outer loop. This does not result in corrupted source (in fact Descript fixes it), but the decompiled source will compile differently than the original.

In a few rare cases, two different source code fragments will compile to identical object code, so Descript has to choose one of the source representations. A classic example: if an if() statement occurs inside a do loop, and there is a continue statement inside the if() with nothing following it, Descript will generate an else statement instead of the "continue", which means the same thing and generates the same object code.
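
The equivalence of the two source forms can be illustrated with a Python analogue. This is not ACS and says nothing about ACC's object code; it assumes the "continue" is the last statement of the if block, uses a for loop because Python has no do loop, and only shows that the two forms behave identically. Both function names are hypothetical.

```python
def with_continue(values):
    """Loop body written with `continue` as the last statement of the if block."""
    out = []
    for v in values:
        if v < 0:
            out.append("neg")
            continue
        out.append("non-neg")
    return out


def with_else(values):
    """The same loop body written with `else` instead of `continue`."""
    out = []
    for v in values:
        if v < 0:
            out.append("neg")
        else:
            out.append("non-neg")
    return out


sample = [3, -1, 0]
print(with_continue(sample) == with_else(sample))  # True
```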

The ACC compiler appears to reject the unary minus operator when it crops up in expressions containing a multiplication or division. The expression var0 = var1 * -5 would not compile, but var0 = var1 * (-5) would. Descript therefore puts brackets around unary operators in this situation. The modulus (%) operator is handled in the same manner.
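
A minimal Python sketch of this bracketing rule follows. The format_binary helper and its use of plain strings for operands are hypothetical; leaving other operators such as + unbracketed is an assumption, since the text only describes multiplication, division and modulus.

```python
def format_binary(left, op, right):
    """Render `left op right`, bracketing a unary-minus right operand.

    Brackets are added only for *, / and %, as described in the text;
    the behaviour for other operators is an assumption of this sketch.
    """
    if op in ("*", "/", "%") and right.startswith("-"):
        right = f"({right})"
    return f"{left} {op} {right}"


print(format_binary("var1", "*", "-5"))  # var1 * (-5)
print(format_binary("var1", "+", "-5"))  # var1 + -5
```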

ACC does not like variable type "str" as a script argument. Therefore, Descript will always display arguments as type "int", but if an argument is used as a string, its name will be "astr".

External links