"I would like it to produce code like this, which is roughly standard ADO.NET connected db access:"
"You can see that is a pretty drastic translation. What sort of effort are we talking to implement that? And what sort of technology area within gmStudio would you use?"
Following the request supplied by the user as closely as possible, the target translations for these two would be
the following unmigrated translation is produced.
With these changes in place, the unmigrated translations are now as follows.
The remaining work involves adding a refactoring specification and finally gmSL code.
The character string, which consists of a series of "name=value;" pairs, has been changed to remove the Driver pair and the DSN pair.
The first question to ask is "How do we determine which character strings need to be changed?" In an editing context, perhaps all strings could be searched for these pairs and the pairs could be removed universally. This would almost certainly work in this case, but in general changes need to be contingent on how the string is used. The string above needs to be changed because in this later statement
it is being assigned to a connection string. Connection strings in the target environment do not have these two attributes. A search needs to be made of assignments to the ConnectionString property, and the strings used in these assignments need to be edited, if possible and necessary.
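The string edit itself is simple once the right string has been located. The following is a minimal sketch in Python, not gmSL, of removing the two pairs. It assumes the pairs are separated by semicolons and that attribute names compare case-insensitively; the function name and the sample string are hypothetical and are not taken from the sample project.

```python
def strip_attributes(connection_string, unwanted=("driver", "dsn")):
    """Remove unwanted name=value pairs from a semicolon-delimited string."""
    kept = []
    for pair in connection_string.split(";"):
        if not pair.strip():
            continue  # skip empty segments, e.g. after a trailing semicolon
        name = pair.split("=", 1)[0].strip().lower()
        if name not in unwanted:
            kept.append(pair.strip())
    # Re-terminate every surviving pair with a semicolon.
    return "".join(p + ";" for p in kept)


# Hypothetical sample string, used only for illustration.
sample = "Driver={SQL Server};DSN=;UID=sa;PWD=secret;Database=pubs;"
print(strip_attributes(sample))
```

This sketch does not handle a semicolon embedded inside a braced value; a production edit would need to account for that.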
Transformation is not a process that works with source code or target code. It works with the intermediate code produced by the compiler. After the compiler has completed and after the analyser has finished the standard migrations built into gmBasic, references in the intermediate code to components in external libraries are scanned looking for those that have user supplied code registered for them. The purpose of that user code is to change that intermediate code so that it performs the target operation as opposed to its original source operation.
Once a statement that requires transformation has been identified, the first step is to examine the intermediate code currently produced for it. This can be seen in an audit report of the vbi file produced by the translation script or can be produced on-the-fly using the Opcode.DumpCode method. The intermediate code for the connection string assignment statement is as follows.
It can be seen that the external library component receiving the content of the variable dbs is Component:Connect:50673. Looking this component up in the symbol audit shows that it is
Starting the actual transformation process, then, will involve writing a gmSL method that will be notified of all code references to Lib_Property RDO._rdoConnection.Connect. The purpose of that method will be to locate any strings being assigned to that property and to edit them to remove any unwanted attributes.
The namespace for these methods is the event name used in the refactor statement and the class for the methods is Transform. Remember that in real migration projects many different libraries and bodies of code are being migrated; therefore, careful naming conventions are necessary. The actual gmSL code could be embedded within the gmSL statement; however, there are "intellisense" editors available for files that have the gmsl extension, so keeping this code separate makes it easier to author and maintain.
For now the file msrdo20Transform.gmsl is simply as follows.
It simply logs a message to the translation log file and returns a zero indicating that no change has been made by the method.
The names of transform methods are an "underline converted" form of the host relative identifier of the component whose reference code is to be transformed. Underline conversion changes all periods in the identifier to an underscore and changes all underscores in the identifier to double underscores. In this case the component identifier is RDO._rdoConnection.Connect. Making it host relative removes the leading RDO. and doing the underline conversion makes it __rdoConnection_Connect. This conversion is necessary to create unique but well-formed method identifiers. After the gmSL file is compiled, gmBasic scans the refactor host for components that match the underline converted methods, sets their hasCodeHandler property True, and sets their migTransform member equal to the root of the transform method.
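The naming rule can be checked with a short sketch. This is Python rather than gmSL, and the function name is hypothetical; it only reproduces the two substitutions described above.

```python
def underline_convert(identifier, host):
    """Return the transform method name for a component identifier."""
    prefix = host + "."
    # Make the identifier host relative by dropping the leading host name.
    if identifier.startswith(prefix):
        relative = identifier[len(prefix):]
    else:
        relative = identifier
    # Double the existing underscores first, so they stay distinguishable
    # from the underscores that replace periods in the next step.
    return relative.replace("_", "__").replace(".", "_")


print(underline_convert("RDO._rdoConnection.Connect", "RDO"))
# prints __rdoConnection_Connect
```

The order of the two replacements matters: converting periods first would cause the newly created underscores to be doubled as well, and the result would no longer be unique.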
All transform methods have three parameters. The parameter subRoot is the root of the source code component that references the component being transformed. The parameter iStart is a starting code location that marks where the referencing code began. Its exact value will vary by the type of reference. The parameter iRefer is the code location of the actual reference to the transform component. To bring this into focus, the initial version of the transform method merely logs the content of these parameters. Running the translation script with Progress="1" produces the following log.
Focusing first on the actual message produced by the transform method, it confirms what was seen in the code dump earlier. The method connectDB contains a reference to RDO._rdoConnection.Connect at code offset 89, and a good starting point is offset 77, which in this case is the offset of the reference to the variable dbs whose content will eventually be evaluated and modified.
More importantly, the log brings out the integration between the gmSL transform capability and the overall processing of the translation script. Translation, when it becomes migration, is a very complex, iterative process. Things go wrong. Things do not work as expected. The translation produces a log file that describes what happens, and it also produces a vbi file that contains all of the detailed information about how the source code was migrated and what it was migrated into. When things go wrong it is important to have available, in the vbi file that actually produced the migration, an exact representation of the logic used to do it. Note in lines 05 and 06 above that the transform class is being compiled in the same manner as the source code. All code associated with it is in the vbi file where it can be audited and examined in precisely the same manner as any other code. The translation produced is identical to the one before the refactor section was added, but the vbi is different. First, the transform method itself has been added into the symbol table.
The source code for the method can be viewed if the EchoInput Select attribute is turned on.
An actual code dump of the transform method can be examined.
Note that the gmSL, though it has a very different syntax from VB6, uses the identical gmIL operations. Finally, the actual entries for the migrated component have been updated to mark it as having a code handler whose offset is 85249, which is this method.
Though not in this sample, there could certainly be other references to the Connect property that are not assignments from local variables. So the first step is to verify that this is a local variable assignment to the property.
Remember that the gmSL itself is stored in the same overall structure as the user code; therefore, the first two calls in almost every transform method use the Opcode methods GetCode and GetLength, which reference the user code and not the running code. Next the method checks that the iStart parameter is a reference to a local variable and that the property reference is an assignment. From the dump, the expected code sequence is as follows.
Note that the call that checked for the local variable assignment also trapped its root in the localVar. The next step is to look for a preceding assignment to this local variable.
The code for the RefactorCode_FindAssign method is shown below. If it returns a nonzero value, then that value is the code offset of the value being assigned to the variable. The next step is to obtain the actual string value being assigned to the local variable.
The method Opcode.GetString is passed the starting and ending offset of code that may produce a string constant when it is executed. The actual code being passed to GetString is
The method Opcode.GetString literally executes this code, even though it is user code, using the same engine as is used to execute the gmSL code. Since this code is being executed at compile time, it may not be possible to resolve it into a string, for example if it contains variable references. If it can resolve it, it returns the string; if not, it returns a null-string. In this case the variable connect contains a resolved connection string, which the method can edit by removing the attribute-value pairs that are not to be used.
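The idea of resolving an expression at compile time, and giving up when it depends on a variable, can be illustrated with a toy evaluator. This Python sketch is not how Opcode.GetString is implemented; the postfix operation names "const", "var", and "concat" are invented for the illustration and are not gmIL operations.

```python
def try_resolve(ops):
    """Evaluate a postfix string expression; return "" if it is not constant."""
    stack = []
    for op in ops:
        kind = op[0]
        if kind == "const":
            stack.append(op[1])
        elif kind == "concat":
            if len(stack) < 2:
                return ""  # malformed expression
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            # A variable reference has no known value at compile time.
            return ""
    return stack[0] if len(stack) == 1 else ""


constant_expr = [("const", "UID=sa;"), ("const", "PWD=secret;"), ("concat",)]
variable_expr = [("const", "UID="), ("var", "userName"), ("concat",)]
print(repr(try_resolve(constant_expr)))
print(repr(try_resolve(variable_expr)))
```

The first expression resolves to a single string; the second returns an empty string because it refers to a variable, which mirrors the null-string case described above.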
The final step is to replace the old string expression with the revised string. Note that this replacement may well shift code that precedes the referencing code. The calling method that is scanning for transform references needs to be told that this has happened. A non-zero return value tells the scanner that the method has made a change in the code and that scanning should resume at the indicated code location.
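The return value convention can be sketched with a toy scanner. This is a Python illustration under stated assumptions, not the gmBasic scanner: the code is modeled as a plain list, and the function names and the "REF" marker are hypothetical.

```python
def scan_references(code, target, handler):
    """Call handler for every occurrence of target, honoring its return value."""
    position = 0
    while position < len(code):
        if code[position] == target:
            resume_at = handler(code, position)
            if resume_at != 0:
                # The handler edited the code: continue where it says.
                position = resume_at
                continue
        position += 1


def insert_before_reference(code, position):
    """Toy handler: insert one item at the front, which shifts the reference."""
    code.insert(0, "new")
    # The reference now sits at position + 1; resume just past it.
    return position + 2


code = ["load", "REF", "store", "REF"]
scan_references(code, "REF", insert_before_reference)
print(code)
```

Without the returned location, the scanner would step forward by one, land on the same shifted reference again, and call the handler repeatedly for a reference it has already processed.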
Moving now to the method RefactorCode_FindAssign, note that it does not contain an underline converted identifier so it is simply private to this class. Its parameters are the root offset of the variable for which an assignment is sought and the code location that the assignment must precede.
It consists of a simple scan from the front of the user method code for an assignment to the indicated variable. If found it returns the starting offset of the expression that defines the value being assigned.
The method RefactorCode_ReplaceAssign deletes the original code in the expression and then inserts a reference to the replacement string.
When doing code substitution, the most difficult step is deciding how much code should be deleted or inserted. In this case, when nDelete is computed, iEnd contains the location immediately after the ARG operation that closes the expression code and iAssign contains the location immediately after the LEV operation that opens the expression code. Thus, iEnd - iAssign needs to be offset by the size of the ARG operation which is needed in the new expression and by the size of the LSC operation which will load the new string.
Running this new code does now produce the desired change as the following file comparison shows.
As with the previous string change requirement, the need to change this string is determined from the fact that the variable SQL is used as the second argument to the method RDO._rdoConnection.CreateQuery.
Though the reference pattern is different and the actual editing is different, the logic of the transform method is about the same as the logic of the transform method for RDO._rdoConnection.Connect. The name of the method is now constructed to refer to CreateQuery; the parameters are the same three. The declaration of the method is followed by the declarations of the local variables.
The first step is to make certain that this reference is a valid call to the method whose second argument is a variable reference. The code is a bit long but straightforward.
The highlighted code shows where the root of the SQL variable in sqlVar is determined. The second step is to find the preceding assignment to this variable.
The third step is to obtain the actual string value being assigned to the local variable.
If there is a constant query string, then the fourth step is to replace the "?"s with @index and, if necessary, replace the string in the code.
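The placeholder edit can be sketched as follows. This is Python rather than gmSL; the function name is hypothetical, and the starting index of 0 is an assumption, since the text above does not say where the numbering begins.

```python
def number_placeholders(sql, start=0):
    """Replace each "?" marker with "@" followed by a running index."""
    parts = sql.split("?")
    result = parts[0]
    for offset, part in enumerate(parts[1:]):
        result += "@" + str(start + offset) + part
    return result


query = "SELECT * FROM authors WHERE state = ? AND contract = ?"
print(number_placeholders(query))
```

This sketch replaces every question mark, including any that appear inside a quoted literal in the query; a more careful version would skip those.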
Running this code causes the desired change in the translation as the following file comparison shows.
In addition to having to change the content of the query string, the actual call to the method needs to be changed into a combined method call followed by a property assignment. The actual difference is
The easiest way to achieve these types of changes is to invent a new method that reflects the revised method and then to associate with that method the final migPattern. These new methods are placed in a migClass defined within the refactor section. The added declaration is
Note that it must precede the gmSL statement as the gmSL code references it. A new section of code can now be added to the transform method that scans forward in the referencing code, removes the old method calling operations and set command code, and replaces them with a reference to the pattern defined above.
Note that the statement that gets the root offset of the new method is highlighted. The comparison log shows that the code achieved the desired result.
The compiler processes this by generating a generic COL.Item operation that must be replaced by a pattern string.
This pattern is again stored in the migClass DotNet. The code for the transform then is simply
It checks for the needed operation, finds the root of the new pattern variable, replaces the old operation code with the new reference to the pattern, and returns the new code reference scan location. As the following change log shows, both instances of the rdoParameters were changed.
In the migrated version, its scope is limited, as it is declared and opened in a using statement.
An important note here is that the target form of the OpenResultset has been migrated to ExecuteReader; however, within the symbol table it still has its source name, which must be used. Trying to transform something like rdoPreparedStatement_ExecuteReader would not work.
The transform method itself begins in the standard way with the required declarations.
The first step makes certain that the expected type of reference is present, and it obtains the root of the Results variable.
The second step is to set the DeadCode property of the variable to True. Doing this blocks the declaration of the variable in the list of local variables.
And finally the third step changes the CMD.Set into an IFS.Using operation. This operation requires a type as well as a variable reference. The operation TYV.root displays the type of a root; so it is inserted into the code as well.
The file comparison shows that the declaration has been removed as well as the following desired change.
The using statement enters a new indentation level into the code; therefore, all the statements below it are shifted. In fact the translation log now shows this warning
The end of the using block must be found as well and entered into the migrated code.
The Opcode.CommentOut method finds the end of the statement containing the Close method reference and replaces it with a CMT.Delete operation which will delete the entire statement from the target code. It returns the code offset of that CMT operation. The method then inserts an IFS.EndUsing operation after the CMT. This achieves the desired result.
In the current unmigrated code the while loop checks for an end-of-file while the target code performs the actual read. Using the types of techniques used earlier, the simple approach is to change the migName of the EOF to Read() using gmPL.
Then the transform method for the property can simply check for the NOT operation and remove it. This is what the actual reference code looks like.
The transform method then merely checks for the pattern and removes the NOT if present. The code is straightforward.
Checking the change log shows that the combination of the new migName and the removal of the NOT achieved the correct result.
Migrations that combine noncontingent renaming with contingent code modification are referred to as "shallow" transforms. The technology used by gmBasic is derived from the field of transformational grammar. The meaning of sentences is referred to as "deep structure", and the representation of the sentence as uttered is referred to as "surface structure". Rules that mix these two levels are called "shallow" and should generally be avoided. In our sample code, the only reference to the "EOF" property is in that while clause, where the renaming to "Read()" is valid. In other contexts, however, the transform would fail to apply while the "EOF" would still be changed to "Read()", which would certainly cause bad code.
In places such as this, shallow transforms are fine, but beware of them in larger scale migrations where they can introduce problems. A more complex approach would introduce a DotNet method Read and then do a contingent replacement.
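The hazard can be made concrete with a toy example. This Python sketch works on a list of source-like tokens rather than on gmIL, and the function name and token layout are hypothetical; it applies the rename to every EOF reference but removes a Not only when one happens to precede it.

```python
def shallow_migrate(tokens):
    """Rename every EOF reference; drop a preceding Not only when present."""
    out = []
    for token in tokens:
        if token.endswith(".EOF"):
            if out and out[-1] == "Not":
                out.pop()  # contingent step: remove the negation
            # Noncontingent step: the rename happens regardless of context.
            out.append(token[:-3] + "Read()")
        else:
            out.append(token)
    return out


print(shallow_migrate(["While", "Not", "rs.EOF"]))
print(shallow_migrate(["If", "rs.EOF", "Then"]))
```

The first call yields the intended loop test. The second turns a test for end-of-file into a call that reads a record and has the opposite truth value, which is the kind of bad code the warning above describes.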
This difference involves the references to two properties, rdColumns and Value, both of which can simply be removed. The actual references can be seen in the code audit.
Except for the names of the methods they are identical.
The change log shows that these changes produced the desired result.
This demonstration migration has now been completed,