This page introduces concepts of translation tuning.
One of the most frequently asked questions we hear in our migration services work is: "how much will we have to do by hand to finish the job?" It is an important question, and one that can only be answered by asking and answering a few more questions:
The reason for the first question is to help you understand and articulate your target architecture requirements. These requirements define how you will use .NET and will have a big impact on the migration effort.
The reason for the second question is to help you understand that doing things "by hand" with gmStudio can mean one of two things:
"Better" means "more correct" in terms of reproducing the functions of the legacy application and "more conformant" in terms of following design and coding standards for the target platform. In many cases, you will find that it is more efficient to invest in configuring the translator to produce "better" code before spending time fixing translations by hand. This is particularly true if you have to migrate a large and active codebase.
Manual Design + Automated Implementation
Doing things by tool requires the same technical design work that is required when doing things by hand. Furthermore, you will typically fine-tune the .NET code by hand in your favorite .NET IDE, where you have access to features like IntelliSense. Once you have the details worked out, you should implement the design rules in the translation configuration so they can be applied in a repeatable and systematic manner across the codebase.
The claim is made that the process of rehosting VB6/ASP/COM source code in .NET via translation to C# or VB.NET is refactoring. Martin Fowler, in "Refactoring: Improving the Design of Existing Code", says: "Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure". First, it is clear that translating source code to a new language and then moving it to a new platform constitutes "changing a software system" and, if done correctly, "improves its internal structure". The problem, of course, is "does not alter the external behavior". This is hard to achieve when an old application is moved to .NET: the user interface components have changed, error handling has changed, security has changed, data access has changed, and so on. There does, however, seem to be a mutually agreed upon notion of "Functional Equivalence", whose meaning will not be expanded here. When you have it you know it, but never stop testing; enough said.
Different Refactoring operations fall on a spectrum from "Shallow" to "Deep":
gmStudio facilitates the entire spectrum of user-defined refactoring and propagates refactoring changes consistently across the appropriate scope of code.
This topic introduces the techniques for customizing the translation process so it generates code that is more correct and more conformant to your coding standards.
The simplest form of user-defined refactoring is adding translation options to the Translation Script. The translation options are simple statements (e.g., one-liners). There are three types of translation option statements:
Translation options can be used to direct many aspects of the translator's behavior:
In general, the Great Migrations translators operate at the semantic level; however, the system also includes an "editor" that works directly on the surface form of the source and target code. The editor is a command-driven, multi-line search and replace facility. The rationale for the editor is discussed here. All edits must be explicitly defined in the Translation Scripts.
Sometimes a block of VB6 code is "too creative", archaic, or just plain wrong and must be changed in order to facilitate a clean translation. Source Edits allow you to do this in a repeatable, documented, automated manner. Source Edits can search and replace code, delete code, or comment things out. You can also use Source Edits to delete code or controls altogether, but be sure you also remove all references to any deleted identifiers (see also refactor/remove). Source Edits are applied after all the code is read in from an original source file, and they do not modify that original file.
...
<Compile Project="%SrcPath%">
   <Fix host="Project1" name="Pre-Edit">
      <Replace status="active" name="remove unusual use of &">
         <OldBlock><![CDATA[Variant = &0)]]></OldBlock>
         <NewBlock><![CDATA[Variant = 0)]]></NewBlock>
      </Replace>
   </Fix>
   ...
</Compile>

<GlobalImports>
   <Storage Action="Create" Identifier="%UserFolder%\GlobalSettings" />
   <Registry type="EditFile" Source="%VirtualRoot%\INCLUDES\companyUsers\companyUserPreProc.asp">
      <Fix name="Pre-Edit">
         <Replace status="active" name="remove unusual use of &">
            <OldBlock><![CDATA[Variant = &0)]]></OldBlock>
            <NewBlock><![CDATA[Variant = 0)]]></NewBlock>
         </Replace>
      </Fix>
   </Registry>
   ...
</GlobalImports>
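The advice above about removing all references to a deleted identifier can be checked mechanically before you rerun the translation. The following is a minimal sketch in Python, not a gmStudio feature: the helper name `find_dangling_references` and the sample VB6 fragment are made up for illustration. It assumes a plain whole-word, case-insensitive text match is good enough, so it will also report mentions inside comments and string literals.

```python
import re

def find_dangling_references(source_text, identifier):
    """Return (line_number, line) pairs that still mention identifier as a whole word.

    Hypothetical helper for illustration; it only scans text and knows nothing
    about VB6 scoping rules.
    """
    # VB6 identifiers are case-insensitive, so match without regard to case.
    pattern = re.compile(rf"\b{re.escape(identifier)}\b", re.IGNORECASE)
    return [
        (number, line)
        for number, line in enumerate(source_text.splitlines(), start=1)
        if pattern.search(line)
    ]

# Made-up VB6 fragment: suppose a Source Edit deleted the txtLegacy control.
sample = """\
Private Sub Form_Load()
    txtLegacy.Text = "hello"
    lblTitle.Caption = "Title"
End Sub
"""

hits = find_dangling_references(sample, "txtLegacy")
for number, line in hits:
    print(f"line {number}: {line.strip()}")
```

Any line this reports needs its own Source Edit (or a refactoring command) before the deletion is safe.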
Target Edits are search and replace operations applied to the target code before it is authored. You can modify, add, or remove blocks of target code almost anywhere in the output code stream, including designer code and project files. There is a variation of the replace command that allows you to replace an entire file with a hand-written version.
<Author...>
<Author name="%MigName%">
   <Fix host="[%VirtualRoot%\includes\theFunctions.asp]">
      <Replace name="add forTemp to avoid naming conflict">
         <OldBlock><![CDATA[
   foreach(string value in arTransfer)
   {
]]></OldBlock>
         <NewBlock><![CDATA[
   foreach(string forTemp in arTransfer)
   {
      value = forTemp;
]]></NewBlock>
      </Replace>
   </Fix>
   ...
</Author>

Most edits are processed by the translation engine, gmBasic.exe. These edits do not support regular expressions, but they have other special properties for white-space handling. An edit done by gmBasic.exe will run as long as its Replace element has no status attribute or has the attribute status="active". gmBasic edits operate on the generated code in memory before it is written to disk by the Author command. Sometimes you may find that you want to use a regular expression to modify the generated code. If you find yourself fixing something with a regex replacement, it may indicate a matter that would be handled better by other techniques, and you should contact us for assistance. However, as a short-term workaround, we offer a special type of post-edit using status="regex". These edits are performed by the gmStudio IDE, which edits the generated code bundle file after it is written by gmBasic.
An example of a regex fix is shown below:
<Author ...>
   ...
   <Fix host="" name="Post-Edit">
      <Replace name="correct include tag malformation" status="regex">
         <OldBlock><![CDATA[runat="server" />\r\n; %>]]></OldBlock>
         <NewBlock><![CDATA[runat="server" /> ]]></NewBlock>
      </Replace>
      <Replace name="correct include tag malformation" status="regex">
         <OldBlock><![CDATA[;\r\n<inc:]]></OldBlock>
         <NewBlock><![CDATA[; %> <inc:]]></NewBlock>
      </Replace>
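To make the two kinds of edit easier to picture, here is a toy model in Python of how the status attribute selects between them. It is a sketch under stated assumptions, not gmStudio's implementation: the function names, the `foo(` and `bar(` edit, and the sample generated text are invented. Python's `re` module stands in for whatever regex dialect the gmStudio IDE uses, and plain `str.replace` ignores the special white-space handling that gmBasic applies.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative Fix block: one literal edit (no status) and one regex edit.
FIX_XML = r"""
<Fix name="Post-Edit">
  <Replace name="literal edit, no status attribute">
    <OldBlock><![CDATA[foo(]]></OldBlock>
    <NewBlock><![CDATA[bar(]]></NewBlock>
  </Replace>
  <Replace name="correct include tag malformation" status="regex">
    <OldBlock><![CDATA[;\r\n<inc:]]></OldBlock>
    <NewBlock><![CDATA[; %> <inc:]]></NewBlock>
  </Replace>
</Fix>
"""

def load_edits(fix_xml):
    """Return (status, old_block, new_block) for each Replace element."""
    root = ET.fromstring(fix_xml)
    return [
        (rep.get("status"), rep.findtext("OldBlock"), rep.findtext("NewBlock"))
        for rep in root.findall("Replace")
    ]

def engine_phase(text, edits):
    """Literal edits: run when status is absent or 'active' (in memory, before authoring)."""
    for status, old, new in edits:
        if status is None or status == "active":
            text = text.replace(old, new)
    return text

def ide_phase(text, edits):
    """Regex edits: run only when status is 'regex' (on the written bundle)."""
    for status, old, new in edits:
        if status == "regex":
            # A function replacement inserts new_block verbatim, with no
            # backslash or group-reference processing.
            text = re.sub(old, lambda match: new, text)
    return text

edits = load_edits(FIX_XML)
generated = 'foo(1);\r\n<inc:Header runat="server" />'
after_engine = engine_phase(generated, edits)
after_ide = ide_phase(after_engine, edits)
print(repr(after_engine))
print(repr(after_ide))
```

The ordering matters: a regex edit sees the code only after every literal edit has been applied, so its pattern must match the already-edited text.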
Within gmStudio many of the changes made to the code as it moves from its source form into its ultimate .NET target code form can be most easily formulated as refactoring operations. These include the following types of transformations:
Note that refactoring operations involve manipulation of the symbol table and of the semantic pseudocode produced. They do not directly reference the source or target code. To reemphasize this point, refactoring tends to affect all occurrences of a specific "type of" symbol across the entire codebase, whereas edits make individual changes to specific instances of source or target text in the code.
The rule of thumb for choosing a refactoring operation over an edit operation is that the target of the change is a symbol type rather than a line or block of code. Both techniques are useful, and the choice between them will vary by individual and by application.
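The difference between the two approaches can be shown with a deliberately small model in Python. This is only an illustration of the rule of thumb, not how gmStudio represents symbols: the `Symbol` class, the `render` function, and the `Collection` to `ArrayList` retyping are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Symbol:
    name: str
    type_name: str

def render(symbols):
    """Author one declaration line per symbol (stand-in for code generation)."""
    return [f"{s.type_name} {s.name};" for s in symbols]

def refactor_type(symbols, old_type, new_type):
    """Symbol-level change: every symbol of old_type is retyped before authoring."""
    return [
        Symbol(s.name, new_type) if s.type_name == old_type else s
        for s in symbols
    ]

def edit_text(lines, old_block, new_block):
    """Text-level change: only lines containing old_block verbatim are touched."""
    return [line.replace(old_block, new_block) for line in lines]

symbols = [
    Symbol("orders", "Collection"),
    Symbol("names", "Collection"),
    Symbol("count", "int"),
]

# Refactoring targets the symbol type, so both Collection variables change.
refactored = render(refactor_type(symbols, "Collection", "ArrayList"))

# An edit targets one block of text, so only the matching line changes.
edited = edit_text(render(symbols), "Collection orders;", "ArrayList orders;")

print(refactored)
print(edited)
```

In this model the edit leaves `names` untouched, which is the right result when one instance is special and the wrong one when the change is meant as a codebase-wide rule.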
gmStudio allows you to translate code that used COM types to code that uses .NET types. The following rewriting operations can be automated:
More details on this process are described in Custom COM Replacement.
gmStudio allows you to customize how VB6 statements and projects are expressed in .NET. The general process for doing this is:
More details on this process are described in Custom VB6 Language Replacement.
Examine the WPF Sample to see how a custom language replacement, COM replacement, and gmSL scripts can be used for an extremely advanced transformation: VB6 Forms to XAML and WPF.