Morteza Zakerihttps://m-zakeri.github.io/2024-03-10T00:15:00+03:30PhD in Computer ScienceList of My Teaching Courses2024-03-10T00:15:00+03:302024-03-10T00:15:00+03:30Mortezatag:m-zakeri.github.io,2024-03-10:/list-of-my-teaching-courses.html<style>
.button1 {
color: #ffffff;
background-color: #1a55c1;
font-size: 18px;
border: 1px solid #2d63c8;
padding: 20px 50px;
margin: 4px;
cursor: pointer
}
.button1:hover {
color: #2d63c8;
background-color: #ffffff;
}
.button2 {
color: #ffffff;
background-color: #c82d8f;
font-size: 19px;
border: 1px solid #bd2dc8;
padding: 15px 50px;
cursor: pointer
}
.button2:hover {
color: #2d63c8;
background-color: #ffffff;
}
</style>
<div style="text-align: center;">
<img src="https://capsule-render.vercel.app/api?type=waving&height=200&color=gradient&text=Teaching&section=header&animation=twinkling&fontColor=Brown&textBg=false"/>
<a href="https://m-zakeri.github.io/AI" target="_blank">
<button class="button1" type="button" name="ai">Artificial Intelligence</button>
</a>
<a href="https://m-zakeri.github.io/Compilers" target="_blank">
<button class="button1" type="button" name="compiler">Compilers</button>
</a>
<a href="https://m-zakeri.github.io/DatabaseDesign" target="_blank">
<button class="button1" type="button" name="db">Database Design</button>
</a>
<a href="https://m-zakeri.github.io/CP" target="_blank">
<button class="button1" type="button" name="compiler">Computer Programming</button>
</a>
<a href="https://m-zakeri.github.io/advanced-software-engineering.html#advanced-software-engineering" target="_blank">
<button class="button1" type="button" name="ase">Advanced Software Engineering</button>
</a>
<a href="https://m-zakeri.github.io/dynamic-complex-network.html#dynamic-complex-network" target="_blank">
<button class="button1" type="button" name="dcn">Dynamic Complex Networks</button>
</a>
<a href="https://m-zakeri.github.io/game-theory.html#game-theory" target="_blank">
<button class="button1" type="button" name="gt">Game Theory</button>
</a>
<a href="https://webpages.iust.ac.ir/morteza_zakeri/repo/iust_course_materials" target="_blank">
<button class="button2" type="button" name="gt">Find more</button>
</a>
</div>A gentle introduction to search-based software refactoring2022-05-05T00:45:00+04:302022-05-05T00:45:00+04:30Mortezatag:m-zakeri.github.io,2022-05-05:/a-gentle-introduction-to-search-based-software-refactoring.html<p>Finding the best sequence of refactoring operations to be applied to a software system is an optimization problem. It can be solved by search techniques in the field known as search-based software engineering (<span class="caps">SBSE</span>). In this approach, refactorings are applied stochastically to the original software solution, and then the software is measured using a fitness function consisting of one or more software quality measures. Unfortunately, there is no technical document describing an implementation of decent search-based refactoring. In this tutorial, I am going to explain the implementation of search-based refactoring at the source code level from scratch.</p><p>I will be back to complete this tutorial soon.</p>CodART: Automated Source Code Refactoring Toolkit2022-05-02T23:58:00+04:302022-05-02T23:58:00+04:30Mortezatag:m-zakeri.github.io,2022-05-02:/codart-automated-source-code-refactoring-toolkit.html<p>Refactoring engines are tools that automate the application of refactorings: first, the user chooses a refactoring to apply, then the engine checks if the transformation is safe, and if so, transforms the program.</p><p><strong>Abstract—</strong> Software refactoring is performed by changing the software structure without modifying its external behavior. Many software quality attributes can be enhanced through source code refactoring, such as reusability, flexibility, understandability, and testability. Refactoring engines are tools that automate the application of refactorings: first, the user chooses a refactoring to apply, then the engine checks if the transformation is safe, and if so, transforms the program.
Refactoring engines are a key component of modern Integrated Development Environments (IDEs), and programmers rely on them to perform refactorings. In this project, an open-source software toolkit for refactoring Java source codes, namely CodART, will be developed. <span class="caps">ANTLR</span> parser generator is used to create and modify the program syntax-tree and produce the refactored version of the program. To the best of our knowledge, CodART is the first open-source refactoring toolkit based on <span class="caps">ANTLR</span>.</p>
<p><strong>Index Terms:</strong> Software refactoring, refactoring engine, search-based refactoring, <span class="caps">ANTLR</span>, Java.</p>
<h2>1 Introduction</h2>
<p><strong>R</strong>efactoring is a behavior-preserving program transformation that improves the design of a program. Refactoring engines are tools that automate the application of refactorings. The programmer need only select which refactoring to apply, and the engine will automatically check the <em>preconditions</em> and apply the transformations across the entire program if the preconditions are satisfied. Refactoring is gaining popularity, as evidenced by the inclusion of refactoring engines in modern IDEs such as <a href="https://www.jetbrains.com/idea/">IntelliJ <span class="caps">IDEA</span></a>, <a href="http://www.eclipse.org">Eclipse</a>, or <a href="http://www.netbeans.org">NetBeans</a> for Java.</p>
<p>Consider the <em>EncapsulateField</em> refactoring as an illustrative example. This refactoring replaces all references to a field with accesses through setter and getter methods. The <em>EncapsulateField</em> refactoring takes as input the name of the field to encapsulate and the names of the new getter and setter methods. It performs the following transformations:</p>
<ul>
<li>Creates a public getter method that returns the field’s value,</li>
<li>Creates a public setter method that updates the field’s value to a given parameter’s value,</li>
<li>Replaces all field reads with calls to the getter method,</li>
<li>Replaces all field writes with calls to the setter method,</li>
<li>Changes the field’s access modifier to private.</li>
</ul>
<p>The <em>EncapsulateField</em> refactoring checks several preconditions, including that the code does not already contain accessor methods and that these methods are applicable to the expressions in which the field appears. Figure 1 shows a sample program before and after encapsulating the field <code>f</code> into the <code>getF</code> and <code>setF</code> methods.</p>
<p><img alt="Figure 1. Example EncapsulateField refactoring" src="../static/img/codart_example.png"></p>
<p><em>Figure 1. Example EncapsulateField refactoring</em></p>
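<p>To make the precondition idea concrete, here is a minimal, hypothetical sketch in Python (CodART itself targets Java through ANTLR): it checks whether a class already defines accessor methods for a field, which would block <em>EncapsulateField</em>. The <code>get_f</code>/<code>set_f</code> naming convention is an assumption for illustration only.</p>

```python
import ast

def has_accessors(source: str, field: str) -> bool:
    """Return True if any class in `source` already defines a getter or
    a setter for `field`: a precondition that blocks EncapsulateField."""
    wanted = {f"get_{field}", f"set_{field}"}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            method_names = {m.name for m in node.body
                            if isinstance(m, ast.FunctionDef)}
            if wanted & method_names:
                return True
    return False

with_getter = "class A:\n    def get_f(self):\n        return self.f\n"
plain = "class B:\n    pass\n"
print(has_accessors(with_getter, "f"))  # True
print(has_accessors(plain, "f"))        # False
```

<p>A real engine would perform this check on the Java parse tree; the sketch only shows where the precondition fits in the workflow.</p>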
<p>Refactoring engines must be reliable. A fault in a refactoring engine can silently introduce bugs in the refactored program and lead to challenging debugging sessions. If the original program compiles, but the refactored program does not, the refactoring is obviously incorrect and can be easily undone. However, if the refactoring engine erroneously produces a refactored program that compiles but does not preserve the semantics of the original program, this can have severe consequences. </p>
<p>To perform refactoring correctly, the tool has to operate on the syntax tree of the code, not on the raw text. Manipulating the syntax tree is far more reliable for preserving what the code does. However, refactoring is not just about understanding and updating the syntax tree: the tool also needs to figure out how to render the modified tree back into text in the editor view, a step called code transformation. All in all, implementing a decent refactoring is a challenging programming exercise that requires compiler knowledge. </p>
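<p>A tiny Python demonstration of why operating on raw text is unreliable: a naive string replacement that “renames” a variable also corrupts every identifier that happens to contain the same substring.</p>

```python
# Naive text-based rename: a plain substring replacement also rewrites
# unrelated identifiers, which is exactly why refactoring engines must
# operate on the syntax tree rather than on raw text.
source = "int f; int offset; f = offset;"
renamed = source.replace("f", "value")
print(renamed)  # int value; int ovaluevalueset; value = ovaluevalueset;
```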
<p>In this project, we develop CodART, a toolkit for applying a given refactoring to source code and obtaining the refactored code. To this aim, we use <span class="caps">ANTLR</span> [1] to generate and modify the program syntax tree. CodART development consists of two phases: in the first phase, <strong>47 common refactoring operations</strong> will be automated, and in the second phase, an algorithm that finds the best sequence of refactorings to apply to a given software system will be developed using many-objective search-based approaches.</p>
<p>The rest of this <em>white-paper</em> is organized as follows. Section 2 describes the refactoring operations in detail. Section 3 explains code smells in detail. Section 4 briefly discusses search-based refactoring techniques and many-objective evolutionary algorithms. Section 5 explains the implementation details of the current version of CodART. Section 6 lists the Java projects used to evaluate CodART. Section 7 articulates the proposals behind the CodART project. Finally, conclusions and future work are discussed in Section 8.</p>
<h2>2 Refactoring operations</h2>
<p>This section explains the refactoring operations used in the project. A catalog of 72 refactoring operations has been proposed by Fowler [2]. We call these refactorings atomic refactoring operations. </p>
<p>Each refactoring operation has a definition and is clearly specified by the entities it involves and the role of each. Table 1 describes the desirable refactorings, which we aim to automate. It is worth noting that not all of these refactoring operations were introduced by Fowler [2]. A concrete example for most of the refactoring operations in the table is available at <a href="https://refactoring.com/catalog/">https://refactoring.com/catalog/</a>. Examples of other refactorings can be found at <a href="https://refactoring.guru/refactoring/techniques">https://refactoring.guru/refactoring/techniques</a> and <a href="https://sourcemaking.com/refactoring/refactorings">https://sourcemaking.com/refactoring/refactorings</a>. </p>
<p><em>Table 1. Refactoring operations</em></p>
<table>
<thead>
<tr>
<th>Refactoring</th>
<th>Definition</th>
<th>Entities</th>
<th>Roles</th>
</tr>
</thead>
<tbody>
<tr>
<td>Move class</td>
<td>Move a class from a package to another</td>
<td>package class</td>
<td>source package, target package moved class</td>
</tr>
<tr>
<td>Move method</td>
<td>Move a method from a class to another.</td>
<td>class method</td>
<td>source class, target class moved method</td>
</tr>
<tr>
<td>Merge packages</td>
<td>Merge the elements of a set of packages in one of them</td>
<td>package</td>
<td>source package, target package</td>
</tr>
<tr>
<td>Extract/Split package</td>
<td>Add a package to compose the elements of another package</td>
<td>package</td>
<td>source package, target package</td>
</tr>
<tr>
<td>Extract class</td>
<td>Create a new class and move fields and methods from the old class to the new one</td>
<td>class method</td>
<td>source class, new class moved methods</td>
</tr>
<tr>
<td>Extract method</td>
<td>Extract a code fragment into a method</td>
<td>method statement</td>
<td>source method, new method moved statements</td>
</tr>
<tr>
<td>Inline class</td>
<td>Move all features of a class into another one and remove it</td>
<td>class</td>
<td>source class, target class</td>
</tr>
<tr>
<td>Move field</td>
<td>Move a field from a class to another</td>
<td>class field</td>
<td>source class, target class field</td>
</tr>
<tr>
<td><a href="refactorings/push_down_field.md">Push down field</a></td>
<td>Move a field of a superclass to a subclass</td>
<td>class field</td>
<td>super class, sub classes move field</td>
</tr>
<tr>
<td>Push down method</td>
<td>Move a method of a superclass to a subclass</td>
<td>class method</td>
<td>super class, sub classes moved method</td>
</tr>
<tr>
<td><a href="refactorings/pull_up_field.md">Pull up field</a></td>
<td>Move a field from subclasses to the superclass</td>
<td>class field</td>
<td>sub classes, super class moved field</td>
</tr>
<tr>
<td><a href="refactorings/pull_up_method.md">Pull up method</a></td>
<td>Move a method from subclasses to the superclass</td>
<td>class method</td>
<td>sub classes, super class moved method</td>
</tr>
<tr>
<td><a href="refactorings/increase_field_visibility.md">Increase field visibility</a></td>
<td>Increase the visibility of a field from public to protected, protected to package or package to private</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td><a href="refactorings/decrease_field_visibility.md">Decrease field visibility</a></td>
<td>Decrease the visibility of a field from private to package, package to protected or protected to public</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td><a href="refactorings/make_field_final.md">Make field final</a></td>
<td>Make a non-final field final</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td><a href="refactorings/make_field_non_final.md">Make field non-final</a></td>
<td>Make a final field non-final</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td>Make field static</td>
<td>Make a non-static field static</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td>Make field non-static</td>
<td>Make a static field non-static</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td>Remove field</td>
<td>Remove a field from a class</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td>Increase method visibility</td>
<td>Increase the visibility of a method from public to protected, protected to package or package to private</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Decrease method visibility</td>
<td>Decrease the visibility of a method from private to package, package to protected or protected to public</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Make method final</td>
<td>Make a non-final method final</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Make method non-final</td>
<td>Make a final method non-final</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Make method static</td>
<td>Make a non-static method static</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Make method non-static</td>
<td>Make a static method non-static</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Remove method</td>
<td>Remove a method from a class</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Make class-final</td>
<td>Make a non-final class final</td>
<td>class</td>
<td>source class</td>
</tr>
<tr>
<td>Make class non-final</td>
<td>Make a final class non-final</td>
<td>class</td>
<td>source class</td>
</tr>
<tr>
<td>Make class abstract</td>
<td>Change a concrete class to abstract</td>
<td>class</td>
<td>source class</td>
</tr>
<tr>
<td>Make class concrete</td>
<td>Change an abstract class to concrete</td>
<td>class</td>
<td>source class</td>
</tr>
<tr>
<td>Extract subclass</td>
<td>Create a subclass for a set of features</td>
<td>class method</td>
<td>source class, new subclass moved methods</td>
</tr>
<tr>
<td><a href="refactorings/extract_interface.md">Extract interface</a></td>
<td>Extract methods of a class into an interface</td>
<td>class method</td>
<td>source class, new interface interface methods</td>
</tr>
<tr>
<td>Inline method</td>
<td>Move the body of a method into its callers and remove the method</td>
<td>method</td>
<td>source method, callers method</td>
</tr>
<tr>
<td>Collapse hierarchy</td>
<td>Merge a superclass and a subclass</td>
<td>class</td>
<td>superclass, subclass</td>
</tr>
<tr>
<td>Remove control flag</td>
<td>Replace control flag with a break</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Replace nested conditional with guard clauses</td>
<td>Replace nested conditional with guard clauses</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Replace constructor with a factory function</td>
<td>Replace constructor with a factory function</td>
<td>class</td>
<td>source class</td>
</tr>
<tr>
<td>Replace exception with test</td>
<td>Replace exception with precheck</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Rename field</td>
<td>Rename a field</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td><a href="refactorings/rename_method.md">Rename method</a></td>
<td>Rename a method</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Rename class</td>
<td>Rename a class</td>
<td>class</td>
<td>source class</td>
</tr>
<tr>
<td>Rename package</td>
<td>Rename a package</td>
<td>package</td>
<td>source package</td>
</tr>
<tr>
<td>Encapsulate field</td>
<td>Create setter/mutator and getter/accessor methods for a private field</td>
<td>class field</td>
<td>source class source field</td>
</tr>
<tr>
<td>Replace parameter with query</td>
<td>Replace parameter with query</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Pull up constructor body</td>
<td>Move common constructor body code from subclasses to a superclass constructor</td>
<td>class method</td>
<td>subclass class, superclass constructor</td>
</tr>
<tr>
<td>Replace control flag with break</td>
<td>Replace control flag with break</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Remove flag argument</td>
<td>Remove flag argument</td>
<td>class method</td>
<td>source class source method</td>
</tr>
<tr>
<td>Total</td>
<td>47</td>
<td>—</td>
<td>—</td>
</tr>
</tbody>
</table>
<h2>3 Code smells</h2>
<p>Deciding when and where to start refactoring—and when and where to stop—is just as important to refactoring as knowing how to operate its mechanics [2]. To answer this important question, we should know the refactoring activities. The refactoring process consists of six distinct activities [9]:</p>
<ol>
<li>
<p>Identify where the software should be refactored.</p>
</li>
<li>
<p>Determine which refactoring(s) should be applied to the identified places.</p>
</li>
<li>
<p>Guarantee that the applied refactoring preserves behavior.</p>
</li>
<li>
<p>Apply the refactoring.</p>
</li>
<li>
<p>Assess the effect of the refactoring on quality characteristics of the software (e.g., complexity, understandability, maintainability) or the process (e.g., productivity, cost, effort).</p>
</li>
<li>
<p>Maintain the consistency between the refactored program code and other software artifacts (such as documentation, design documents, requirements specifications, tests, etc.).</p>
</li>
</ol>
<p><em>Table 2. Code smells</em></p>
<table>
<thead>
<tr>
<th>Code smell</th>
<th>Descriptions and other names</th>
</tr>
</thead>
<tbody>
<tr>
<td>God class</td>
<td>The class defines many data members (fields) and methods and exhibits low cohesion. The god class smell occurs when a huge class surrounded by many data classes acts as a controller (i.e., takes most of the decisions and monopolizes the software’s functionality). Other names: Blob, large class, brain class.</td>
</tr>
<tr>
<td>Long method</td>
<td>This smell occurs when a method is too long to understand and most presumably perform more than one responsibility. Other names: God method, brain method, large method.</td>
</tr>
<tr>
<td>Feature envy</td>
<td>This smell occurs when a method seems more interested in a class other than the one it actually is in.</td>
</tr>
<tr>
<td>Data class</td>
<td>This smell occurs when a class contains only fields and possibly getters/setters without any behavior (methods).</td>
</tr>
<tr>
<td>Shotgun surgery</td>
<td>This smell characterizes the situation when one kind of change leads to many changes to multiple different classes. When the changes are all over the place, they are hard to find, and it is easy to miss a necessary change.</td>
</tr>
<tr>
<td>Refused bequest</td>
<td>This smell occurs when a subclass rejects some of the methods or properties offered by its superclass.</td>
</tr>
<tr>
<td>Functional decomposition</td>
<td>This smell occurs when experienced developers with a procedural-language background write highly procedural, non-object-oriented code in an object-oriented language.</td>
</tr>
<tr>
<td>Long parameter list</td>
<td>This smell occurs when a method accepts a long list of parameters. Such lists are hard to understand and difficult to use.</td>
</tr>
<tr>
<td>Promiscuous package</td>
<td>A package can be considered promiscuous if it contains classes implementing too many features, making it too hard to understand and maintain. As with the god class and long method smells, this smell arises when the package has low cohesion because it manages different responsibilities.</td>
</tr>
<tr>
<td>Misplaced class</td>
<td>This smell occurs when a class sits in a package containing other classes unrelated to it.</td>
</tr>
<tr>
<td>Switch statement</td>
<td>This smell occurs when switch statements that switch on type codes are spread across the software system instead of exploiting polymorphism.</td>
</tr>
<tr>
<td>Spaghetti code</td>
<td>This smell refers to unmaintainable, incomprehensible code without any structure. Such code does not exploit, and even prevents the use of, object-orientation mechanisms and concepts.</td>
</tr>
<tr>
<td>Divergent change</td>
<td>Divergent change occurs when one class is commonly changed in different ways for different reasons. Other names: Multifaceted abstraction</td>
</tr>
<tr>
<td>Deficient encapsulation</td>
<td>This smell occurs when the declared accessibility of one or more members of abstraction is more permissive than actually required.</td>
</tr>
<tr>
<td>Swiss army knife</td>
<td>This smell arises when the designer attempts to provide all possible uses of the class and ends up in an excessively complex class interface.</td>
</tr>
<tr>
<td>Lazy class</td>
<td>This smell occurs when a class does too little to justify its existence. Other name: Unnecessary abstraction.</td>
</tr>
<tr>
<td>Cyclically-dependent modularization</td>
<td>This smell arises when two or more abstractions depend on each other directly or indirectly.</td>
</tr>
<tr>
<td>Primitive obsession</td>
<td>This smell occurs when primitive data types are used where an abstraction encapsulating the primitives could serve better.</td>
</tr>
<tr>
<td>Speculative generality</td>
<td>This smell occurs where an abstraction is created based on speculated requirements. It is often unnecessary and makes things difficult to understand and maintain.</td>
</tr>
<tr>
<td>Message chains</td>
<td>A message chain occurs when a client requests another object, that object requests yet another one, and so on. These chains mean that the client depends on navigation along the class structure. Any change in these relationships requires modifying the client.</td>
</tr>
<tr>
<td>Total</td>
<td>20</td>
</tr>
</tbody>
</table>
<h2>4 Search-based refactoring</h2>
<p>Once refactoring operations are automated, we must decide which refactorings should be performed in order to elevate software quality. The concern about using the refactoring operations in Table 1 is whether each of them has a positive impact on the quality of the refactored code. Finding the right sequence of refactorings to apply to a software artifact is a challenging task since there is a wide range of refactorings. The ideal sequence must therefore correlate with the different quality attributes to be improved as a result of applying the refactorings. </p>
<p>Finding the best refactoring sequence is an optimization problem that can be solved by search techniques in the field known as Search-Based Software Engineering (<span class="caps">SBSE</span>) [3]. In this approach, refactorings are applied stochastically to the original software solution, and then the software is measured using a fitness function consisting of one or more software metrics. There are various metric suites available to measure characteristics like cohesion and coupling, but different metrics measure the software in different ways, and thus how they are applied will have a different effect on the outcome. </p>
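<p>As a hedged sketch (the actual objectives and weights used in CodART are not specified here), a fitness function over software metrics can be as simple as a weighted sum, where metrics to be maximized get positive weights and metrics to be minimized get negative ones:</p>

```python
def fitness(metrics: dict, weights: dict) -> float:
    """Combine normalized software metrics into one fitness value.
    Metric names and weights below are illustrative, not CodART's."""
    return sum(weights[name] * value for name, value in metrics.items())

# Higher cohesion is better (positive weight); higher coupling is worse.
measured = {"cohesion": 0.75, "coupling": 0.25}
weights = {"cohesion": 1.0, "coupling": -1.0}
print(fitness(measured, weights))  # 0.5
```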
<p>The second phase of this project uses a many-objective search algorithm to find the best sequence of refactorings for a given project. Recently, many-objective <span class="caps">SBSE</span> approaches for refactoring [3]–[5] and for remodularization (regrouping a set of classes C in terms of packages P) [6] have gained attention due to their ability to find the best sequence of refactoring operations, which leads to improved software quality. Therefore, we first focus on implementing the approaches proposed in [3], [5], [6] as fundamental works in this area. Then, we will improve on them. As a new contribution, we add new refactoring operations and new objective functions to improve the quality attributes of the software. We also evaluate our method on new software projects that were not used in previous works.</p>
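<p>Many-objective search does not collapse the objectives into a single number; instead it compares candidate solutions by Pareto dominance. A minimal sketch, assuming all objectives are maximized:</p>

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b`: at least as
    good on every objective and strictly better on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

# Hypothetical (cohesion, reuse, understandability) scores of two solutions.
print(dominates((0.9, 0.8, 0.7), (0.9, 0.6, 0.7)))  # True
print(dominates((0.9, 0.8, 0.7), (0.9, 0.8, 0.7)))  # False
```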
<h2>5 Implementation</h2>
<p>This section describes the implementation details of CodART. It covers the CodART architecture, the high-level repository directory structure, refactoring automation with the <span class="caps">ANTLR</span> parser generator, and refactoring recommendation through many-objective search-based software engineering techniques. </p>
<h3>5.1 CodART architecture</h3>
<p>The high-level architecture of CodART is shown in Figure 2. The source code consists of several Python packages and directories. We briefly describe each component in CodART. </p>
<p><img alt="CodART_Architecture" src="../static/img/CodART_architecture__v0.1.1.png"></p>
<p><em>Figure 2. CodART architecture</em></p>
<p>I. <code>grammars</code>: This directory contains the <span class="caps">ANTLR4</span> grammar files for the Java programming language: </p>
<ol>
<li>
<p><code>Java9_v2.g4</code>: This grammar was used in the initial version of CodART. Its main problem is that parsing large source code files is very slow due to some decisions made in the grammar design. We have therefore switched to the fast grammar <code>JavaParserLabeled.g4</code>.</p>
</li>
<li>
<p><code>JavaLexer.g4</code>: The lexer of the fast Java grammar. This lexer is used by both fast parsers, i.e., <code>JavaParser.g4</code> and <code>JavaParserLabeled.g4</code>.</p>
</li>
<li>
<p><code>JavaParser.g4</code>: The original parser of the fast Java grammar. This parser is currently used in some refactorings. In a future release, it will be replaced with <code>JavaParserLabeled.g4</code>.</p>
</li>
<li>
<p><code>JavaParserLabeled.g4</code>: This file contains the same grammar as <code>JavaParser.g4</code>. The only difference is that rules with more than one alternative have each alternative labeled with a specific name. The <span class="caps">ANTLR</span> parser generator thus generates separate visitor and listener methods for each alternative. This grammar facilitates the development of some refactorings and is the preferred parser in the CodART project.</p>
</li>
</ol>
<p><span class="caps">II</span>. <code>gen</code>: The <code>gen</code> package contains all the generated source code for the parsers, lexers, visitors, and listeners of the different grammars in the grammars directory. To develop refactorings and code smells, the <code>gen.JavaLabled</code> package, which contains the source code generated from <code>JavaParserLabeled.g4</code>, must be used. The content of this package is generated <em>automatically</em> and therefore should <em>not</em> be modified <em>manually</em>. Modules within this package are only meant to be imported and used by other modules.</p>
<p><span class="caps">III</span>. <code>speedy</code>: The Python implementation of the <span class="caps">ANTLR</span> runtime is less efficient than the Java or C++ implementations. The <code>speedy</code> module implements a Java parser with a C++ back-end, improving the efficiency and speed of parsing. It uses the speedy-antlr implementation with some minor changes. The current version of the speedy module uses the <code>Java9_v2.g4</code> grammar, which is inherently slow, as described above. To switch to the C++ back-end, the speedy module must first be installed on the client system, which requires a C++ compiler. We <em>recommend</em> that CodART developers use the Python back-end, as switching to the C++ back-end will be done transparently in a future release. The Python back-end saves debugging and development time.</p>
<p><span class="caps">IV</span>. <code>refactorings</code>: The <code>refactorings</code> package is the main package in the CodART project and contains numerous Python modules that form the kernel functionality of CodART. Each module implements the automation of one refactoring operation according to standard practices. A module may include several classes that <em>inherit</em> from <span class="caps">ANTLR</span> listeners. Sub-packages of this package contain refactorings that are at an early stage of development or deprecated versions of existing refactorings. This package is under active development and testing. The modules in the root package can be used for testing purposes.</p>
<p>V. <code>refactoring_design_patters</code>: This package contains modules that automatically apply refactorings toward specific design patterns. </p>
<p><span class="caps">VI</span>. <code>smells</code>: The <code>smells</code> package implements the automatic detection of code and design smells relevant to the refactoring operations supported by CodART. Each smell corresponds to one or more refactorings in the <code>refactorings</code> package.</p>
<p><span class="caps">VII</span>. <code>metrics</code>: The <code>metrics</code> package contains several modules that implement the computation of the most well-known source code metrics. These metrics are used to detect code smells and to measure software quality in terms of quality attributes. </p>
<p><span class="caps">VIII</span>. <code>tests</code>: The test directory contains individual test data and test cases that are used for developing specific refactorings. Typically, each test case is a single Java file that contains one or more Java classes.</p>
<p><span class="caps">IX</span>. <code>benchmark_projects</code>: This directory contains several open-source Java projects formerly used in automated refactoring research. Once the implementation of a refactoring is complete, it is executed and tested on all projects in this benchmark to ensure that the implemented functionality generalizes. </p>
<p>X. <strong>Other packages</strong>: The information of other packages will be announced in the future. </p>
<h3>5.2 Refactoring automation</h3>
<p>Each refactoring operation in Table 1 is implemented as an <span class="caps">API</span> bearing the refactoring name. The <span class="caps">API</span> receives the involved entities, their refactoring roles, and other required data as inputs; checks the feasibility of the refactoring using the preconditions described in [2]; performs the refactoring if it is feasible; and returns the refactored code, or null if the refactoring is not feasible.</p>
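<p>The contract just described can be sketched as a Python stub. This is hypothetical, not CodART's actual code: the function and parameter names are illustrative, the precondition check is deliberately simplified, and the transformation itself is elided.</p>

```python
def encapsulate_field(source: str, field: str, getter: str, setter: str):
    """Check preconditions, then return the refactored code, or None
    when the refactoring is not feasible. The transformation is elided:
    the real engine rewrites the ANTLR parse tree, not raw text."""
    # Simplified precondition: the accessor methods must not already exist.
    if getter in source or setter in source:
        return None
    refactored = source  # placeholder for the parse-tree-based rewrite
    return refactored

print(encapsulate_field("class A { int getF() { } }", "f", "getF", "setF"))  # None
```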
<p>The core of our refactoring engine is a syntax-tree modification algorithm. Fundamentally, <span class="caps">ANTLR</span> is used to generate and modify the syntax tree of a given program. Each refactoring <span class="caps">API</span> is an <span class="caps">ANTLR</span> <em>listener</em> or <em>visitor</em> class that receives its required arguments through its constructor and performs the refactoring when invoked by a parse-tree walker object. The refactoring target and input parameters are read from a configuration file, which can be expressed in <span class="caps">JSON</span>, <span class="caps">XML</span>, or <span class="caps">YAML</span> format.</p>
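<p>Reading the refactoring target from such a configuration file might look like the following; the key names are illustrative, not a documented CodART schema:</p>

```python
import json

# A hypothetical JSON configuration naming the refactoring and its target.
config_text = '''
{
  "refactoring": "encapsulate_field",
  "source_class": "A",
  "field_identifier": "f"
}
'''

config = json.loads(config_text)
# These values would be passed to the constructor of the corresponding listener.
print(config["refactoring"], config["source_class"], config["field_identifier"])
```
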
<p>The key to using <span class="caps">ANTLR</span> for refactoring tasks is the <code>TokenStreamRewriter</code> object, which knows how to give altered views of a token stream without actually modifying the stream. It treats all of the manipulation methods as “instructions” and queues them up for lazy execution when traversing the token stream to render it back as text. The rewriter <em>executes</em> those instructions every time we call <code>getText()</code>. This strategy is very effective for the general problem of source code instrumentation or refactoring. The <code>TokenStreamRewriter</code> is a powerful and extremely efficient means of manipulating a token stream.</p>
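<p>The lazy-execution strategy can be illustrated with a tiny stdlib-only mock. This is not the real <code>TokenStreamRewriter</code> (which works on <span class="caps">ANTLR</span> token indices and supports insertions, deletions, and overlapping edits); it only demonstrates the idea of queuing instructions and applying them at render time:</p>

```python
class MiniRewriter:
    """Queue edit 'instructions' and apply them only when text is rendered."""
    def __init__(self, tokens):
        self.tokens = tokens      # the underlying stream is never modified
        self.instructions = []    # queued (index, new_text) replacements

    def replace(self, index, text):
        self.instructions.append((index, text))

    def get_text(self):
        view = list(self.tokens)  # instructions execute lazily, on demand
        for index, text in self.instructions:
            view[index] = text
        return " ".join(view)

tokens = ["public", "int", "f", ";"]
rw = MiniRewriter(tokens)
rw.replace(0, "private")
assert rw.get_text() == "private int f ;"
assert tokens == ["public", "int", "f", ";"]  # original stream untouched
```
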
<h3>5.3 Refactoring recommendation</h3>
<p>A solution consists of a sequence of n refactoring operations applied to different code elements of the source code to be fixed. To represent a candidate solution (individual/chromosome), we use a vector-based representation. Each dimension of the vector represents a refactoring operation, and the order in which these operations are applied corresponds to their positions in the vector. The initial population is generated by randomly assigning a sequence of refactorings to some code fragments. Each generated refactoring solution is executed on the software system <em>S</em>. Once all required data is computed, the solution is evaluated based on the quality of the resulting design.</p>
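<p>A minimal sketch of this vector-based encoding and random initialization follows; the operation and target names are hypothetical placeholders, and fitness evaluation (applying the solution to <em>S</em> and measuring design quality) is omitted:</p>

```python
import random

# Hypothetical refactoring operations and target code elements.
REFACTORINGS = ["move_method", "extract_class", "encapsulate_field", "pull_up_method"]
CODE_ELEMENTS = ["A.m1", "B.m2", "C.f", "D.m3"]

def random_solution(n):
    """A candidate solution: an ordered vector of n (operation, target) pairs."""
    return [(random.choice(REFACTORINGS), random.choice(CODE_ELEMENTS))
            for _ in range(n)]

def initial_population(size, n):
    """Randomly generated initial population of refactoring sequences."""
    return [random_solution(n) for _ in range(size)]

random.seed(42)
pop = initial_population(size=10, n=5)
assert len(pop) == 10 and all(len(sol) == 5 for sol in pop)
# Each solution would next be applied to system S and scored by quality metrics.
```
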
<h2>6 Benchmark projects and testbed</h2>
<p>To ensure CodART works properly, we are running it on many real-life software projects.
Refactorings are applied to the software systems listed in Table 3. The set of benchmark projects may be updated and extended in the future. For the time being, we use a set of well-known open-source Java projects that have been intensively studied in previous works. We have also added two new Java programs, <span class="caps">WEKA</span> and <span class="caps">ANTLR</span>, to examine the versatility and performance of CodART on real-life software projects. </p>
<p><em>Table 3. Software systems refactored in this project</em></p>
<table>
<thead>
<tr>
<th>System</th>
<th>Release</th>
<th>Previous releases</th>
<th>Domain</th>
<th>Reference</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://github.com/apache/xerces2-j">Xerces-J</a></td>
<td>v2.7.0</td>
<td>—</td>
<td>software packages for parsing <span class="caps">XML</span></td>
<td>[3], [6]</td>
</tr>
<tr>
<td><a href="https://github.com/vuze/vuze-remote-for-android">Azureus</a></td>
<td>v2.3.0.6</td>
<td>—</td>
<td>Java BitTorrent client for handling multiple torrents</td>
<td>[3]</td>
</tr>
<tr>
<td><a href="https://github.com/argouml-tigris-org/argouml">ArgoUML</a></td>
<td>v0.26 and v0.3</td>
<td>—</td>
<td><span class="caps">UML</span> tool for object-oriented design</td>
<td>[3]</td>
</tr>
<tr>
<td><a href="https://github.com/apache/ant">Apache Ant</a></td>
<td>v1.5.0 and v1.7.0</td>
<td>—</td>
<td>Java build tool and library</td>
<td>[3]</td>
</tr>
<tr>
<td><a href="https://github.com/bardsoftware/ganttproject">GanttProject</a></td>
<td>v1.10.2 and v1.11.1</td>
<td>—</td>
<td>project management</td>
<td>[3], [6], [5]</td>
</tr>
<tr>
<td><a href="https://github.com/wumpz/jhotdraw">JHotDraw</a></td>
<td>v6.1 and v6.0b1 and v5.3</td>
<td>—</td>
<td>graphics tool</td>
<td>[6], [5], [4]</td>
</tr>
<tr>
<td><a href="https://github.com/jfree/jfreechart">JFreeChart</a></td>
<td>v1.0.9</td>
<td>—</td>
<td>chart tool</td>
<td>[6]</td>
</tr>
<tr>
<td><a href="https://github.com/svn2github/beaver-parser-generator-v09">Beaver</a></td>
<td>v0.9.11 and v0.9.8</td>
<td>—</td>
<td>parser generator</td>
<td>[5], [4]</td>
</tr>
<tr>
<td><a href="https://ws.apache.org/xmlrpc/">Apache <span class="caps">XML</span>-<span class="caps">RPC</span></a></td>
<td>v3.1.1</td>
<td>—</td>
<td><span class="caps">B2B</span> communications</td>
<td>[5], [4]</td>
</tr>
<tr>
<td><a href="http://jrdf.sourceforge.net/index.html"><span class="caps">JRDF</span></a></td>
<td>v0.3.4.3</td>
<td>—</td>
<td>semantic web (resource management)</td>
<td>[5]</td>
</tr>
<tr>
<td><a href="https://github.com/elharo/xom"><span class="caps">XOM</span></a></td>
<td>v1.2.1</td>
<td>—</td>
<td><span class="caps">XML</span> tool</td>
<td>[5]</td>
</tr>
<tr>
<td><a href="https://github.com/stleary/JSON-java"><span class="caps">JSON</span></a></td>
<td>v1.1</td>
<td>—</td>
<td>software packages for parsing <span class="caps">JSON</span></td>
<td>[4]</td>
</tr>
<tr>
<td><a href="https://github.com/jflex-de/jflex">JFlex</a></td>
<td>v1.4.1</td>
<td>—</td>
<td>lexical analyzer generator</td>
<td>[4]</td>
</tr>
<tr>
<td><a href="https://github.com/jfaster/mango">Mango</a></td>
<td>v2.0.1</td>
<td>—</td>
<td>—</td>
<td>[4]</td>
</tr>
<tr>
<td><a href="https://github.com/ohmrefresh/Weka-Android-3.9.1-SNAPSHOT">Weka</a></td>
<td>v3.9</td>
<td>—</td>
<td>data mining tool</td>
<td>New</td>
</tr>
<tr>
<td><a href="https://github.com/antlr/antlr4"><span class="caps">ANTLR</span></a></td>
<td>v4.8.0</td>
<td>—</td>
<td>parser generator tool</td>
<td>New</td>
</tr>
</tbody>
</table>
<h2>7 CodART in <span class="caps">IUST</span></h2>
<p>Developing a comprehensive refactoring engine requires thousands of hours of programming. Refactoring is not just understanding and updating the syntax tree; the tool also needs to figure out how to re-render the code back into text in the editor view. As Fowler [2] notes in his well-known refactoring book: “implementing decent refactoring is a challenging programming exercise—one that I’m mostly unaware of as I gaily use the tools.”</p>
<p>We have defined the basic functionalities of the CodART system as several student projects with different proposals. Students who take our computer science courses, including compiler design and construction, advanced compilers, and advanced software engineering, are required to work on these proposals as part of their course fulfillment. These projects aim to familiarize students with the practical use of compilers from a software engineering point of view.
Detailed information about our current proposals is available at the following links:</p>
<ol>
<li>
<p><a href="./proposals/core_refactorings_development.md">Core refactoring operations development</a> (Fall 2020)</p>
</li>
<li>
<p><a href="./proposals/core_code_smell_development.md">Core code smells development</a> (Current semester: Winter and Spring 2021)</p>
</li>
<li>
<p><a href="./proposals/core_search_based_development.md">Core search-based development</a> (Future semesters)</p>
</li>
<li>
<p><a href="./proposals/core_refactoring_to_design_patterns_development.md">Core refactoring to design patterns development</a> (Future semesters)</p>
</li>
</ol>
<p><strong>Note:</strong> Students whose final project is confirmed by the reverse engineering laboratory have the opportunity to work on CodART as an independent and advanced research project. The only prerequisite is to pass the graduate compiler course taught by Dr. Saeed Parsa.</p>
<h2>8 Conclusion and remarks</h2>
<p>Software refactoring is used to reduce the costs and risks of software evolution.
Automated refactoring tools can reduce the risks of manual refactoring, improve efficiency, and ease the difficulties of software refactoring. Researchers have made great efforts to implement and improve automated refactoring tools. However, the results of automated refactoring tools often deviate from the intentions of the implementer. The goal of this project is to propose an open-source refactoring engine and toolkit that can automatically find the best refactoring sequence for a given piece of software and apply it. Since the tool works based on compiler principles, it is reliable enough to be used in practice and has many benefits for software development companies. Students who participate in the project will learn compiler techniques such as lexing, parsing, source code analysis, and source code transformation. They will also learn about software refactoring, search-based software engineering, optimization, software quality, and object-oriented metrics. </p>
<h3>Conflict of interest</h3>
<p>The project is supported by the <a href="http://reverse.iust.ac.ir"><span class="caps">IUST</span> Reverse Engineering Research Laboratory</a>.
Interested students may continue working on this project
to fulfill their final bachelor and master thesis or their internship.</p>
<h2>References</h2>
<p>[1] T. Parr and K. Fisher, “<span class="caps">LL</span>(*): the foundation of the <span class="caps">ANTLR</span> parser generator,” Proc. 32nd <span class="caps">ACM</span> <span class="caps">SIGPLAN</span> Conf. Program. Lang. Des. Implement., pp. 425–436, 2011.</p>
<p>[2] M. Fowler, Refactoring: Improving the Design of Existing Code, 2nd ed. Addison-Wesley, 2018.</p>
<p>[3] <span class="caps">M. W.</span> Mkaouer, M. Kessentini, S. Bechikh, M. Ó Cinnéide, and K. Deb, “On the use of many quality attributes for software refactoring: a many-objective search-based software engineering approach,” Empir. Softw. Eng., vol. 21, no. 6, pp. 2503–2545, Dec. 2016.</p>
<p>[4] M. Mohan, D. Greer, and P. McMullan, “Technical debt reduction using search based automated refactoring,” J. Syst. Softw., vol. 120, pp. 183–194, Oct. 2016.</p>
<p>[5] M. Mohan and D. Greer, “Using a many-objective approach to investigate automated refactoring,” Inf. Softw. Technol., vol. 112, pp. 83–101, Aug. 2019.</p>
<p>[6] W. Mkaouer et al., “Many-Objective Software Remodularization Using <span class="caps">NSGA</span>-<span class="caps">III</span>,” <span class="caps">ACM</span> Trans. Softw. Eng. Methodol., vol. 24, no. 3, pp. 1–45, May 2015.</p>
<p>[7] M. Mohan and D. Greer, “MultiRefactor: automated refactoring to improve software quality,” 2017, pp. 556–572.</p>
<p>[8] N. Tsantalis, T. Chaikalis, and A. Chatzigeorgiou, “Ten years of JDeodorant: lessons learned from the hunt for smells,” in 2018 <span class="caps">IEEE</span> 25th International Conference on Software Analysis, Evolution and Reengineering (<span class="caps">SANER</span>), 2018, pp. 4–14.</p>
<p>[9] T. Mens and T. Tourwe, “A survey of software refactoring,” <span class="caps">IEEE</span> Trans. Softw. Eng., vol. 30, no. 2, pp. 126–139, Feb. 2004.</p>
<h4>Related links</h4>
<p><a href="http://parsa.iust.ac.ir/courses/compilers/"><span class="caps">IUST</span> compiler course official webpage</a></p>
<p><span class="caps">ANTLR</span> slides: <span class="caps">PART</span> 1: <a href="http://parsa.iust.ac.ir/download_center/courses_material/compilers/slides/ANTLR_part1_introduction.pdf">Introduction</a></p>
<p><span class="caps">ANTLR</span> slides: <span class="caps">PART</span> 2: <a href="http://parsa.iust.ac.ir/download_center/courses_material/compilers/slides/ANTLR_part2_getting_started_in_Java.pdf">Getting started in Java</a></p>
<p><span class="caps">ANTLR</span> slides: <span class="caps">PART</span> 3: <a href="http://parsa.iust.ac.ir/download_center/courses_material/compilers/slides/ANTLR_part3_getting_started_in_CSharp.pdf">Getting started in C#</a></p>
<h4>Endnotes</h4>
<p>[1] <a href="https://www.jetbrains.com/idea/">https://www.jetbrains.com/idea/</a></p>
<p>[2] <a href="http://www.eclipse.org">http://www.eclipse.org</a></p>
<p>[3] <a href="http://www.netbeans.org">http://www.netbeans.org</a></p>
<p>[4] <a href="https://github.com/mmohan01/MultiRefactor">https://github.com/mmohan01/MultiRefactor</a> </p>
<p>[5] <a href="http://sourceforge.net/projects/recoder">http://sourceforge.net/projects/recoder</a> </p>
<p>[6] <a href="http://reverse.iust.ac.ir">http://reverse.iust.ac.ir</a> </p>Automated refactoring of the Java code using ANTLR in Python2022-05-02T00:30:00+04:302022-05-02T00:30:00+04:30Mortezatag:m-zakeri.github.io,2022-05-02:/automated-refactoring-of-the-java-code-using-antlr-in-python.html<p>Refactoring is a type of program transformation that preserves the program’s behavior. The goal of refactoring is to improve the program’s internal structure without changing its external behavior. In this way, the program quality, defined and measured in terms of quality attributes, is improved. The refactoring process could be automated to reduce the required time and cost and increase the reliability of applied transformation. In this tutorial, I give a short description of how we can automate the refactoring process with <span class="caps">ANTLR</span> in Python.</p><p>Refactoring is a type of program transformation that preserves the program’s behavior. The goal of refactoring is to improve the program’s internal structure without changing its external behavior. In this way, the program quality, defined and measured in terms of quality attributes, is improved. Researchers have recently studied the improvement of different quality attributes through refactoring (Mkaouer et al. 2016; Mohan and Greer 2019).</p>
<p>The refactoring process could be automated to reduce the required time and cost and increase the reliability of applied transformation. Refactoring engines are tools that automate the application of refactorings: first, the user chooses a refactoring to apply, then the engine checks if the transformation is safe and, if so, transforms the program. Refactoring engines are a key component of modern Integrated Development Environments (IDEs), and programmers rely on them to perform refactorings. The programmer need only select which refactoring to apply, and the engine will automatically check the preconditions and apply the transformations across the entire program if the preconditions are satisfied. Refactoring is gaining popularity, as evidenced by the inclusion of refactoring engines in modern IDEs such as IntelliJ <span class="caps">IDEA</span>, Eclipse, or NetBeans for Java.</p>
<p>According to Fowler (Fowler and Beck 2018), the biggest change to refactoring in the last decade is the availability of tools that support automated refactoring. Refactoring engines must be reliable. A fault in a refactoring engine can silently introduce bugs in the refactored program and lead to challenging debugging sessions. If the original program compiles, but the refactored program does not, the refactoring is obviously incorrect and can be easily undone. However, if the refactoring engine erroneously produces a refactored program that compiles but does not preserve the semantics of the original program, this can have severe consequences. Therefore, an automated refactoring tool must operate on the code’s syntax tree, not the text, to perform refactoring correctly. Manipulating the syntax tree is more reliable for preserving the program syntax, semantics, and behavior. For this reason, developing an automated refactoring tool requires deep knowledge of compiler techniques. Fowler (Fowler and Beck 2018) presents a catalog of more than 70 refactorings in his book and states that </p>
<blockquote>
<p>implementing decent refactoring is a challenging programming exercise—one that I am mostly unaware of as I gaily use the tools.</p>
<p><em>Martin Fowler</em></p>
</blockquote>
<h2><span class="caps">ANTLR</span> Background</h2>
<p>Before reading this tutorial, I recommend looking at the <a href="antlr_basics.md"><span class="caps">ANTLR</span> basics tutorial</a>, where I describe the background of using <span class="caps">ANTLR</span> to generate and walk parse trees and to implement custom program analysis applications with the help of the <span class="caps">ANTLR</span> listener mechanism.
The most important point is that we used real-world programming language grammars to show the parsing and analysis process. The discussed approach forms the underlying concepts of our approach to automated refactoring. Indeed, we implement appropriate listeners that perform the actions required to apply each refactoring. <span class="caps">ANTLR</span> provides the <code>TokenStreamRewriter</code> class, which can manipulate program tokens at specific indices in the program. </p>
<h2>Using <span class="caps">ANTLR</span> for automating refactoring</h2>
<p>The <em>key</em> to using <span class="caps">ANTLR</span> for refactoring tasks is the <code>TokenStreamRewriter</code> class, which knows how to give altered views of a token stream without actually modifying the stream. It treats all of the manipulation methods as “instructions” and queues them up for lazy execution when traversing the token stream to render it back as text. The rewriter <em>executes</em> those instructions every time we call the <code>getText()</code> method. This strategy is very effective for the general problem of source code instrumentation or refactoring. The <code>TokenStreamRewriter</code> is a powerful and extremely efficient means of manipulating a token stream.</p>
<p>In the remaining sections of this post, I discuss different refactoring techniques and describe how to automate the most important refactoring operations in Python using the <span class="caps">ANTLR</span> library.</p>
<p>Please note that the full implementation of the automation of any refactoring operations contains many details that are too complicated to describe here. Therefore, I only focused on the most important part of the automation process in this chapter. For the interested readers, the full implementation of discussed refactoring can be found on <a href="https://github.com/m-zakeri/CodART">https://github.com/m-zakeri/CodART</a>. </p>
<p>CodART is our recently developed open-source refactoring engine that automates the application of 16 refactoring operations with <span class="caps">ANTLR</span>. The up-to-date documentation of CodART is available at <a href="https://m-zakeri.github.io/CodART/">https://m-zakeri.github.io/CodART/</a>.
Each refactoring operation has a definition and is clearly specified by the entities in which it is involved and the role of each. I encourage you to look at <a href="https://github.com/m-zakeri/CodART">CodART’s white paper</a> to find a decent introduction to refactoring operations.</p>
<h2>Encapsulate field</h2>
<p>We begin with a simple yet important refactoring, encapsulate field, which provides information hiding, one of the basic principles of object-oriented design (Booch et al. 2008). The encapsulate field refactoring replaces all references to a field with accesses through setter and getter methods. This refactoring takes as input the name of the field to encapsulate and the name of its enclosing class. It performs the following transformations:</p>
<ul>
<li>Creates a public getter method that returns the field’s value, </li>
<li>Creates a public setter method that updates the field’s value to a given parameter’s value,</li>
<li>Replaces all field reads with calls to the getter method,</li>
<li>Replaces all field writes with calls to the setter method,</li>
<li>Changes the field’s access modifier to private.</li>
</ul>
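<p>The first two transformations follow a simple template. A sketch of that template-based code generation follows; the helper name and exact formatting are illustrative, not CodART’s actual output:</p>

```python
def getter_setter(field_name, field_type="int"):
    """Generate Java accessor and mutator source text for a field."""
    cap = field_name[0].upper() + field_name[1:]
    getter = (f"public {field_type} get{cap}() "
              f"{{ return this.{field_name}; }}")
    setter = (f"public void set{cap}({field_type} {field_name}) "
              f"{{ this.{field_name} = {field_name}; }}")
    return getter, setter

g, s = getter_setter("f")
assert g == "public int getF() { return this.f; }"
assert s == "public void setF(int f) { this.f = f; }"
```
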
<p>Figure 1 shows an example of encapsulate field refactoring for field f in class A. </p>
<p><img alt="Figure 1. Example EncapsulateField refactoring" src="../static/img/codart_example.png"></p>
<p><em>Figure 1: Example EncapsulateField refactoring</em></p>
<p>To perform this refactoring automatically, we develop a listener class, <code>EncapsulateFiledRefactoringListener</code>, that implements the aforementioned transformations. The constructor of this class is shown below. The class takes an instance of the <code>CommonTokenStream</code> class, the source class name, and the field identifier as input. The first parameter is used to initialize an instance of the <code>TokenStreamRewriter</code> class, which provides a set of methods to manipulate the program’s token stream. The second and third parameters specify the entity to be refactored. </p>
<div class="highlight"><pre><span></span><code><span class="n">__version__</span> <span class="o">=</span> <span class="s1">'0.1.0'</span>
<span class="n">__author__</span> <span class="o">=</span> <span class="s1">'Morteza Zakeri'</span>
<span class="k">class</span> <span class="nc">EncapsulateFiledRefactoringListener</span><span class="p">(</span><span class="n">JavaParserLabeledListener</span><span class="p">):</span>
<span class="sd">"""</span>
<span class="sd"> To implement the encapsulate field refactoring</span>
<span class="sd"> make a public field private and provide </span>
<span class="sd"> accessors and mutator methods.</span>
<span class="sd"> """</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">common_token_stream</span><span class="p">:</span> <span class="n">CommonTokenStream</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span>
<span class="n">source_class_name</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span>
<span class="n">field_identifier</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="kc">None</span><span class="p">):</span>
<span class="sd">"""</span>
<span class="sd"> :param common_token_stream: contains the program tokens</span>
<span class="sd"> :param source_class_name: contains the enclosing class of the field</span>
<span class="sd"> :param field_identifier: the field name to be encapsulated </span>
<span class="sd"> """</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream</span> <span class="o">=</span> <span class="n">common_token_stream</span>
<span class="bp">self</span><span class="o">.</span><span class="n">source_class_name</span> <span class="o">=</span> <span class="n">source_class_name</span>
<span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span> <span class="o">=</span> <span class="n">field_identifier</span>
<span class="bp">self</span><span class="o">.</span><span class="n">in_source_class</span> <span class="o">=</span> <span class="kc">False</span>
<span class="c1"># Move all the tokens in the source code in a buffer,</span>
<span class="c1"># token_stream_rewriter.</span>
<span class="k">if</span> <span class="n">common_token_stream</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span> <span class="o">=</span> \
<span class="n">TokenStreamRewriter</span><span class="p">(</span><span class="n">common_token_stream</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">TypeError</span><span class="p">(</span><span class="s1">'common_token_stream is None'</span><span class="p">)</span>
</code></pre></div>
<p>The entire refactoring application is performed in four steps. First, we should check whether the parse-tree walker is visiting the class that contains the given field. We use a flag variable, <code>in_source_class</code>, to indicate that the walker has entered the source class. This flag is set to <code>true</code> when the <code>enterClassDeclaration</code> method is called and set back to <code>false</code> when the <code>exitClassDeclaration</code> method is called:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">enterClassDeclaration</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">JavaParserLabeled</span><span class="o">.</span><span class="n">ClassDeclarationContext</span><span class="p">):</span>
<span class="k">if</span> <span class="n">ctx</span><span class="o">.</span><span class="n">IDENTIFIER</span><span class="p">()</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">source_class_name</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">in_source_class</span> <span class="o">=</span> <span class="kc">True</span>
<span class="k">def</span> <span class="nf">exitClassDeclaration</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">JavaParserLabeled</span><span class="o">.</span><span class="n">ClassDeclarationContext</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">in_source_class</span> <span class="o">=</span> <span class="kc">False</span>
</code></pre></div>
<p>The second step is to change the access modifier of the field from public to private. We could perform this either when entering or when exiting the <code>fieldDeclaration</code> rule in the Java grammar. We must ensure that we modify the given field, not other fields in the class or program. The first and second <code>if</code> statements in the following code perform this check. Afterward, the <code>replaceRange</code> method of the <code>token_stream_rewriter</code> is called to replace the “public” modifier token with the “private” modifier token, as shown in the following code snippet. </p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">enterFieldDeclaration</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">JavaParserLabeled</span><span class="o">.</span><span class="n">FieldDeclarationContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_source_class</span><span class="p">:</span>
<span class="k">if</span> <span class="n">ctx</span><span class="o">.</span><span class="n">variableDeclarators</span><span class="p">()</span><span class="o">.</span><span class="n">variableDeclarator</span><span class="p">(</span>
<span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">variableDeclaratorId</span><span class="p">()</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span><span class="p">:</span>
<span class="k">if</span> <span class="n">ctx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">modifier</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">==</span> <span class="s1">'public'</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span><span class="o">.</span><span class="n">replaceRange</span><span class="p">(</span>
<span class="n">from_idx</span><span class="o">=</span><span class="n">ctx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">modifier</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">start</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span>
<span class="n">to_idx</span><span class="o">=</span><span class="n">ctx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">modifier</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">stop</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span>
<span class="n">text</span><span class="o">=</span><span class="s1">'private'</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span>
</code></pre></div>
<p>Figure 2 shows part of the parse tree generated for the code snippet in Figure 1. The parse tree visualization helps in understanding the logic behind the implementation of the <code>enterFieldDeclaration</code> method in the above code. Indeed, we write a fragment of code by observing the position of nodes in the corresponding parse tree. For example, when we enter the <code>fieldDeclaration</code> rule (i.e., when <span class="caps">ANTLR</span> calls the above method), the <span class="caps">ANTLR</span> runtime library provides a <code>ctx</code> object of class <code>FieldDeclarationContext</code>, which contains pointers to its parent and children. These pointers allow us to move between different nodes in the parse tree, typically around our main node, which is <code>fieldDeclaration</code> in this example. </p>
<p><img alt="Figure 2. Part of the parse tree generated for the code is snipped in Figure 1." src="../static/img/refactoring/parse-tree-for-encapsulate-field.png"></p>
<p><em>Figure 2: Part of the parse tree generated for the code snippet in Figure 1 (left).</em></p>
<p>Our goal is to change the “public” token to “private,” which is the direct child of the <code>classOrInterfaceModifier</code> node and a descendant of the modifier node. Therefore, we need to navigate from the <code>FieldDeclaration</code> node to the modifier or <code>classOrInterfaceModifier</code> node. The statement <code>ctx.parentCtx.parentCtx.modifier(0)</code> gives us the first child of the modifier node, i.e., <code>classOrInterfaceModifier</code>. The green arrow in Figure 2 shows how we can move from the <code>FieldDeclaration</code> node to the <code>classOrInterfaceModifier</code> node.</p>
<p>In the third step, we add getter and setter methods for the encapsulated field to the class body. To this aim, we define a <code>new_code</code> variable that holds the generated code. The code can be generated from a simple <em>template</em> that accessor and mutator methods typically follow. When the generated code is complete, it is inserted into the class body after the encapsulated field declaration using the <code>insertAfter</code> method of the <code>token_stream_rewriter</code> object. We place these actions in the <code>exitFieldDeclaration</code> method. The following code snippet shows how accessor and mutator methods are generated and inserted.</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">exitFieldDeclaration</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">JavaParserLabeled</span><span class="o">.</span><span class="n">FieldDeclarationContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_source_class</span><span class="p">:</span>
<span class="k">if</span> <span class="n">ctx</span><span class="o">.</span><span class="n">variableDeclarators</span><span class="p">()</span><span class="o">.</span><span class="n">variableDeclarator</span><span class="p">(</span>
<span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">variableDeclaratorId</span><span class="p">()</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span><span class="p">:</span>
<span class="c1"># Check if getter or setter methods already exist</span>
<span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">ctx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">classBodyDeclaration</span><span class="p">():</span>
<span class="k">if</span> <span class="n">c</span><span class="o">.</span><span class="n">memberDeclaration</span><span class="p">()</span><span class="o">.</span><span class="n">methodDeclaration</span><span class="p">()</span><span class="o">.</span><span class="n">IDENTIFIER</span><span class="p">()</span> \
<span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">==</span> <span class="s1">'get'</span> <span class="o">+</span> <span class="nb">str</span><span class="o">.</span><span class="n">capitalize</span><span class="p">(</span>
<span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">getter_exist</span> <span class="o">=</span> <span class="kc">True</span>
<span class="k">if</span> <span class="n">c</span><span class="o">.</span><span class="n">memberDeclaration</span><span class="p">()</span><span class="o">.</span><span class="n">methodDeclaration</span><span class="p">()</span><span class="o">.</span><span class="n">IDENTIFIER</span><span class="p">()</span> \
<span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">==</span> <span class="s1">'set'</span> <span class="o">+</span> <span class="nb">str</span><span class="o">.</span><span class="n">capitalize</span><span class="p">(</span>
<span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">setter_exist</span> <span class="o">=</span> <span class="kc">True</span>
<span class="c1"># Generate accessor and mutator methods if not exist</span>
<span class="c1"># Accessor body</span>
<span class="n">new_code</span> <span class="o">=</span> <span class="s1">''</span>
<span class="k">if</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">getter_exist</span><span class="p">:</span>
<span class="n">new_code</span> <span class="o">=</span> <span class="s1">'</span><span class="se">\n\t</span><span class="s1">// new getter method</span><span class="se">\n\t</span><span class="s1">'</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="s1">'public '</span> <span class="o">+</span> <span class="n">ctx</span><span class="o">.</span><span class="n">typeType</span><span class="p">()</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">+</span> \
<span class="s1">' get'</span> <span class="o">+</span> <span class="nb">str</span><span class="o">.</span><span class="n">capitalize</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span><span class="p">)</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="s1">'() { </span><span class="se">\n\t\t</span><span class="s1">return this.'</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span> \
<span class="o">+</span> <span class="s1">';'</span> <span class="o">+</span> <span class="s1">'</span><span class="se">\n\t</span><span class="s1">}</span><span class="se">\n</span><span class="s1">'</span>
<span class="c1"># Mutator body</span>
<span class="k">if</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">setter_exist</span><span class="p">:</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="s1">'</span><span class="se">\n\t</span><span class="s1">// new setter method</span><span class="se">\n\t</span><span class="s1">'</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="s1">'public void set'</span> <span class="o">+</span> \
<span class="nb">str</span><span class="o">.</span><span class="n">capitalize</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span><span class="p">)</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="s1">'('</span> <span class="o">+</span> <span class="n">ctx</span><span class="o">.</span><span class="n">typeType</span><span class="p">()</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">+</span> <span class="s1">' '</span> \
<span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span> <span class="o">+</span> <span class="s1">') { </span><span class="se">\n\t\t</span><span class="s1">'</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="s1">'this.'</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span> <span class="o">+</span> <span class="s1">' = '</span> \
<span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span> <span class="o">+</span> <span class="s1">';'</span> <span class="o">+</span> <span class="s1">'</span><span class="se">\n\t</span><span class="s1">}</span><span class="se">\n</span><span class="s1">'</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span><span class="o">.</span><span class="n">insertAfter</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">stop</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span> <span class="n">new_code</span><span class="p">)</span>
</code></pre></div>
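<p>The template itself is independent of ANTLR. A hypothetical helper (the function name and parameters are my own, not part of the listener) that produces the same accessor/mutator text might look like this:</p>

```python
def make_accessors(field_type: str, field_name: str,
                   need_getter: bool = True, need_setter: bool = True) -> str:
    """Generate Java getter/setter source text for an encapsulated field.

    A sketch of the template used in exitFieldDeclaration; the real
    listener builds the same strings inline.
    """
    cap = field_name.capitalize()
    code = ''
    if need_getter:
        code += ('\n\t// new getter method\n\t'
                 f'public {field_type} get{cap}() {{\n'
                 f'\t\treturn this.{field_name};\n\t}}\n')
    if need_setter:
        code += ('\n\t// new setter method\n\t'
                 f'public void set{cap}({field_type} {field_name}) {{\n'
                 f'\t\tthis.{field_name} = {field_name};\n\t}}\n')
    return code

generated = make_accessors('int', 'id')
print('public int getId()' in generated)         # -> True
print('public void setId(int id)' in generated)  # -> True
```

The two boolean flags play the role of the <code>getter_exist</code>/<code>setter_exist</code> checks: only the missing methods are emitted.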
<p>The above code first checks whether a getter or setter already exists; only the missing methods are then added to the class body.
The fourth step is to update the references of the encapsulated field, replacing the field usages with the appropriate getter or setter call. There are different rules in the grammar that describe access to class fields, and a complete encapsulate field refactoring should consider all of them. However, the code for updating the field usages with getter and setter methods is almost the same in each case. For example, by looking at Figure 3, which corresponds to part of the code shown in Figure 1, we can see that when the right-hand sibling of node <code>expression1</code> is a binary operator such as <span class="caps">MUL</span>, the child of <code>expression1</code> must be replaced with the getter method. </p>
<p><img alt="Figure 3. Part of the parse tree generated for the code snippet in Figure 1 (left)" src="../static/img/refactoring/parse-tree-for-encapsulate-field-2.png"></p>
<p><em>Figure 3: Part of the parse tree generated for the code snippet in Figure 1 (left)</em></p>
<p>The following code snippet performs this transformation when the walker exits the <code>expression1</code> node.</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">exitExpression1</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">JavaParserLabeled</span><span class="o">.</span><span class="n">Expression1Context</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_source_class</span> <span class="ow">and</span> <span class="bp">self</span><span class="o">.</span><span class="n">in_selected_package</span><span class="p">:</span>
<span class="k">try</span><span class="p">:</span>
<span class="k">if</span> <span class="n">ctx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">getChild</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="ow">in</span> \
<span class="p">(</span><span class="s1">'='</span><span class="p">,</span> <span class="s1">'+='</span><span class="p">,</span> <span class="s1">'-='</span><span class="p">,</span> <span class="s1">'*='</span><span class="p">,</span> <span class="s1">'/='</span><span class="p">,</span> <span class="s1">'&='</span><span class="p">,</span><span class="s1">'|='</span><span class="p">,</span> <span class="s1">'^='</span><span class="p">,</span> <span class="s1">'>>='</span><span class="p">,</span>
<span class="s1">'>>>='</span><span class="p">,</span> <span class="s1">'<<='</span><span class="p">,</span> <span class="s1">'%='</span><span class="p">)</span> <span class="ow">and</span> <span class="n">ctx</span><span class="o">.</span><span class="n">parentCtx</span><span class="o">.</span><span class="n">getChild</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="n">ctx</span><span class="p">:</span>
<span class="k">return</span>
<span class="k">except</span><span class="p">:</span>
<span class="k">pass</span>
<span class="k">if</span> <span class="n">ctx</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span> <span class="o">==</span> <span class="s1">'this.'</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span><span class="p">:</span>
<span class="n">new_code</span> <span class="o">=</span> <span class="s1">'this.get'</span> <span class="o">+</span> <span class="nb">str</span><span class="o">.</span><span class="n">capitalize</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">field_identifier</span><span class="p">)</span> <span class="o">+</span> <span class="s1">'()'</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span><span class="o">.</span><span class="n">replaceRange</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">start</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span>
<span class="n">ctx</span><span class="o">.</span><span class="n">stop</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span>
<span class="n">new_code</span><span class="p">)</span>
</code></pre></div>
<p>Other places where the encapsulated field is accessed or modified should be found and updated in the same way described in this step. Once all four steps described in this section are applied, the code snippet shown in Figure 1 (left) is transformed into the code snippet shown in Figure 1 (right), and the encapsulate field refactoring is complete.</p>
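<p>A key property of the <code>TokenStreamRewriter</code> used throughout these steps is that edits are recorded against token indices and applied lazily, so indices from the original token stream stay valid while several edits accumulate. The class below is a simplified toy illustration of that idea, not ANTLR’s actual implementation:</p>

```python
class ToyRewriter:
    """Minimal stand-in for ANTLR's TokenStreamRewriter: edits are
    keyed by token index and applied lazily when the text is requested,
    so original token positions remain valid across multiple edits."""
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.inserts = {}        # token index -> text inserted after it
        self.replacements = {}   # start index -> (stop index, new text)

    def insertAfter(self, index, text):
        self.inserts[index] = self.inserts.get(index, '') + text

    def replaceRange(self, start, stop, text):
        self.replacements[start] = (stop, text)

    def getText(self):
        out, i = [], 0
        while i < len(self.tokens):
            if i in self.replacements:
                stop, text = self.replacements[i]
                out.append(text)
                if stop in self.inserts:  # keep inserts at the stop token
                    out.append(self.inserts[stop])
                i = stop + 1
                continue
            out.append(self.tokens[i])
            if i in self.inserts:
                out.append(self.inserts[i])
            i += 1
        return ''.join(out)

# 'this.id' occupies tokens 2-4; replace it with the getter call,
# mirroring the replaceRange call in exitExpression1.
tokens = ['int', ' x = ', 'this', '.', 'id', ' * 2;']
rw = ToyRewriter(tokens)
rw.replaceRange(2, 4, 'this.getId()')
print(rw.getText())  # -> int x = this.getId() * 2;
```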
<h2>Conclusion and remarks</h2>
<p>Most of the techniques described in this section can be used to automate other refactoring operations. The only difference is the set of required actions, which are often unique to each refactoring. The overall process consists of looking at the relevant parts of the parse tree, choosing a relevant node, and implementing the required actions.</p>
<p>The <code>ctx</code> object of the <code>Context</code> class contains all the information we need to find, check, or change when performing the refactoring. In addition, visualizing the parse tree helps us decide which node to choose for which actions and how the actions should be programmed. </p>
<p>It should be noted that selecting a parse tree node (or grammar rule) in which to put the required actions does not have a unique, deterministic answer. In other words, we can put our actions in any of several nodes when programming with <span class="caps">ANTLR</span>. For example, to change the “public” token to a “private” token, one may put the required actions in the <code>memberDeclaration</code> node, which slightly changes the above code. The node should be chosen so that it minimizes the implementation effort of those actions. As general advice, when automating refactoring operations, write the actions on the node nearest the refactored entities.</p>
<p>I will try to add the automation of more refactoring operations to this tutorial. </p>
<p>Stay hungry, stay incomplete :)</p>
<hr>
<h2>References</h2>
<p>Booch G, Maksimchuk <span class="caps">RA</span>, Engle <span class="caps">MW</span>, et al (2008) Object-oriented analysis and design with applications, third edition. <span class="caps">ACM</span> <span class="caps">SIGSOFT</span> Softw Eng Notes 33:29–29. https://doi.org/10.1145/1402521.1413138</p>
<p>Fowler M, Beck K (2018) Refactoring: improving the design of existing code, 2nd edition. Addison-Wesley</p>Program dynamic analysis with ANTLR2021-03-30T23:45:00+04:302021-03-30T23:45:00+04:30Mortezatag:m-zakeri.github.io,2021-03-30:/program-dynamic-analysis-with-antlr.html<p>Dynamic analysis refers to extracting specific information related to the program’s execution. Therefore, it requires executing the program under analysis. Often the source code must be augmented so that executing the program outputs the additional information required for dynamic analysis. A well-known technique for this purpose is program instrumentation. The <span class="caps">ANTLR</span> tool can be used to instrument source code effectively. In this tutorial, I explain how we can use <span class="caps">ANTLR</span> to instrument a C++ program in the Python programming language.</p><h2>Introduction</h2>
<p>In this tutorial, we describe a primary task in source code transformation, <em>i.e.</em>, program instrumentation, which is one of the CodA features.</p>
<p>This task can be performed by properly applying compiler techniques to add the required code snippets at specific places in the source code. Instrumentation is the fundamental prerequisite for almost all types of dynamic analysis. Let us begin with a simple case in which the purpose of instrumentation is to log the executed path of the program’s control flow graph for each execution. Consider the following C++ program, which calculates the greatest common divisor (<span class="caps">GCD</span>) of two integers:</p>
<div class="highlight"><pre><span></span><code><span class="cp">#include</span><span class="w"> </span><span class="cpf"><stdio.h></span><span class="cp"></span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf"><iostream></span><span class="cp"></span>
<span class="kt">int</span><span class="w"> </span><span class="nf">main</span><span class="p">()</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">num1</span><span class="p">,</span><span class="w"> </span><span class="n">num2</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">gcd</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"Enter two integers: "</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">cin</span><span class="w"> </span><span class="o">>></span><span class="w"> </span><span class="n">num1</span><span class="w"> </span><span class="o">>></span><span class="w"> </span><span class="n">num2</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o"><=</span><span class="w"> </span><span class="n">num1</span><span class="w"> </span><span class="o">&&</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o"><=</span><span class="w"> </span><span class="n">num2</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">i</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="c1">// Checks if i is factor of both integers</span>
<span class="w"> </span><span class="k">if</span><span class="p">(</span><span class="n">num1</span><span class="o">%</span><span class="n">i</span><span class="o">==</span><span class="mi">0</span><span class="w"> </span><span class="o">&&</span><span class="w"> </span><span class="n">num2</span><span class="o">%</span><span class="n">i</span><span class="o">==</span><span class="mi">0</span><span class="p">)</span><span class="w"> </span>
<span class="w"> </span><span class="n">gcd</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"G.C.D is "</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">gcd</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</code></pre></div>
<p><em>Figure 1. Source code of <span class="caps">GCD</span> program.</em></p>
<p>Appropriate instrumentation will put a log statement at the beginning of each basic block. For simplicity, we add a statement that writes the number of the executed basic block to a log file. In the <span class="caps">GCD</span> program, lines 6, 10, and 13 mark the starting points of basic blocks. Therefore, the instrumented version of the <span class="caps">GCD</span> program is similar to the following code, in which the log statements have been added manually:</p>
<div class="highlight"><pre><span></span><code><span class="cp">#include</span><span class="w"> </span><span class="cpf"><stdio.h></span><span class="cp"></span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf"><iostream></span><span class="cp"></span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf"><fstream></span><span class="cp"></span>
<span class="n">std</span><span class="o">::</span><span class="n">ofstream</span><span class="w"> </span><span class="n">logFile</span><span class="p">(</span><span class="s">"log_file.txt"</span><span class="p">);</span><span class="w"></span>
<span class="kt">int</span><span class="w"> </span><span class="nf">main</span><span class="p">()</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="n">logFile</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"p1"</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">num1</span><span class="p">,</span><span class="w"> </span><span class="n">num2</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">gcd</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"Enter two integers: "</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">cin</span><span class="w"> </span><span class="o">>></span><span class="w"> </span><span class="n">num1</span><span class="w"> </span><span class="o">>></span><span class="w"> </span><span class="n">num2</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o"><=</span><span class="w"> </span><span class="n">num1</span><span class="w"> </span><span class="o">&&</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o"><=</span><span class="w"> </span><span class="n">num2</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">i</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="n">logFile</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"p2"</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="c1">// Checks if i is factor of both integers</span>
<span class="w"> </span><span class="k">if</span><span class="p">(</span><span class="n">num1</span><span class="o">%</span><span class="n">i</span><span class="o">==</span><span class="mi">0</span><span class="w"> </span><span class="o">&&</span><span class="w"> </span><span class="n">num2</span><span class="o">%</span><span class="n">i</span><span class="o">==</span><span class="mi">0</span><span class="p">)</span><span class="w"> </span>
<span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="n">logFile</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"p3"</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">gcd</span><span class="o">=</span><span class="n">i</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="w"> </span><span class="c1">//continue;</span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"G.C.D is "</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">gcd</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="c1">//return 0;</span>
<span class="w"> </span><span class="n">logFile</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"p4"</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</code></pre></div>
<p><em>Figure 2. Source code of <span class="caps">GCD</span> program after instrumenting.</em></p>
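<p>After running the instrumented binary, the log file contains one probe label per executed basic block, so recovering the executed path and per-block hit counts is a small post-processing step. The helper below is a sketch of my own (the function name is hypothetical); the example log corresponds to running the instrumented <span class="caps">GCD</span> program on inputs 4 and 6:</p>

```python
from collections import Counter

def read_executed_path(log_text: str):
    """Turn the probe log produced by the instrumented program into
    the executed path and per-block hit counts."""
    path = [line.strip() for line in log_text.splitlines() if line.strip()]
    return path, Counter(path)

# For num1=4, num2=6 the loop body (p2) runs four times and the
# if-branch (p3) fires for i=1 and i=2.
log = "p1\np2\np3\np2\np3\np2\np2\np4\n"
path, counts = read_executed_path(log)
print(path[0], path[-1])  # -> p1 p4
print(counts['p2'])       # -> 4
```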
<p>One can see the log statements (p1–p4) added at the beginning of each basic block. For large programs, it is impossible to add such statements manually. To perform this instrumentation with <span class="caps">ANTLR</span>, we just need to identify conditional statements, including if statements, loop statements, and switch-case statements. In addition, the beginning of each function body should be recognized. <span class="caps">ANTLR</span> provides a listener interface that consists of an enter method and an exit method for each non-terminal in the target language grammar. The listener can be passed to the parse tree walker, which traverses the parse tree in <span class="caps">DFS</span> order. For instrumenting, we must implement the listener methods related to the conditional rules. The implementation of the listener interface in Python is shown in the following:</p>
<div class="highlight"><pre><span></span><code> <span class="k">class</span> <span class="nc">InstrumentationListener</span><span class="p">(</span><span class="n">CPP14Listener</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">tokenized_source_code</span><span class="p">:</span> <span class="n">CommonTokenStream</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">branch_number</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">tokenized_source_code</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="c1"># Move all the tokens in the source code into a buffer, token_stream_rewriter.</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span> <span class="o">=</span> <span class="n">TokenStreamRewriter</span><span class="o">.</span><span class="n">TokenStreamRewriter</span><span class="p">(</span><span class="n">tokenized_source_code</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">Exception</span><span class="p">(</span><span class="s1">'common_token_stream is None'</span><span class="p">)</span>
<span class="c1"># Create and open a text file for logging the instrumentation results at the beginning of the program</span>
<span class="k">def</span> <span class="nf">enterTranslationunit</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">TranslationunitContext</span><span class="p">):</span>
<span class="n">new_code</span> <span class="o">=</span> <span class="s1">'</span><span class="se">\n</span><span class="s1"> #include <fstream> </span><span class="se">\n</span><span class="s1"> std::ofstream logFile("log_file.txt"); </span><span class="se">\n</span><span class="s1">'</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span><span class="o">.</span><span class="n">insertAfter</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">start</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span> <span class="n">new_code</span><span class="p">)</span>
<span class="c1"># DFS traversal of a statement subtree, rooted at ctx; if the statement is a branching condition,</span>
<span class="c1"># insert a probe.</span>
<span class="k">def</span> <span class="nf">enterStatement</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">StatementContext</span><span class="p">):</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">parentCtx</span><span class="p">,</span> <span class="p">(</span><span class="n">CPP14Parser</span><span class="o">.</span><span class="n">SelectionstatementContext</span><span class="p">,</span>
<span class="n">CPP14Parser</span><span class="o">.</span><span class="n">IterationstatementContext</span><span class="p">)):</span>
<span class="c1"># if there is a compound statement after the branching condition:</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">children</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">CompoundstatementContext</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">branch_number</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">new_code</span> <span class="o">=</span> <span class="s1">'</span><span class="se">\n</span><span class="s1"> logFile << "p'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">branch_number</span><span class="p">)</span> <span class="o">+</span> <span class="s1">'" << std::endl; </span><span class="se">\n</span><span class="s1">'</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span><span class="o">.</span><span class="n">insertAfter</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">start</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span> <span class="n">new_code</span><span class="p">)</span>
<span class="c1"># if there is only one statement after the branching condition, then create a block.</span>
<span class="k">elif</span> <span class="ow">not</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">children</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
<span class="p">(</span><span class="n">CPP14Parser</span><span class="o">.</span><span class="n">SelectionstatementContext</span><span class="p">,</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">IterationstatementContext</span><span class="p">)):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">branch_number</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">new_code</span> <span class="o">=</span> <span class="s1">'{'</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="s1">'</span><span class="se">\n</span><span class="s1"> logFile << "p'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">branch_number</span><span class="p">)</span> <span class="o">+</span> <span class="s1">'" << std::endl; </span><span class="se">\n</span><span class="s1">'</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="n">ctx</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span>
<span class="n">new_code</span> <span class="o">+=</span> <span class="s1">'</span><span class="se">\n</span><span class="s1">}'</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span><span class="o">.</span><span class="n">replaceRange</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">start</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span> <span class="n">ctx</span><span class="o">.</span><span class="n">stop</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span> <span class="n">new_code</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">enterFunctionbody</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">FunctionbodyContext</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">branch_number</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">new_code</span> <span class="o">=</span> <span class="s1">'</span><span class="se">\n</span><span class="s1"> logFile << "p'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">branch_number</span><span class="p">)</span> <span class="o">+</span> <span class="s1">'" << endl;</span><span class="se">\n</span><span class="s1">'</span>
<span class="bp">self</span><span class="o">.</span><span class="n">token_stream_rewriter</span><span class="o">.</span><span class="n">insertAfter</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">start</span><span class="o">.</span><span class="n">tokenIndex</span><span class="p">,</span> <span class="n">new_code</span><span class="p">)</span>
</code></pre></div>
<p><em>Figure 3. <span class="caps">ANTLR</span> listener for instrumenting.</em></p>
<p>In the above code, the class <code>InstrumentationListener</code> implements the interface <code>CPP14Listener</code>, the base listener generated by <span class="caps">ANTLR</span> from the C++ grammar. Note that the C++14 grammar is available on the official <span class="caps">ANTLR</span> website. The two methods <code>enterStatement()</code> and <code>enterFunctionbody()</code> are implemented to add a print statement at the proper places in the program code: at the beginning of each conditional statement and of each function, respectively. These two methods are invoked by the <span class="caps">ANTLR</span> parse tree walker when we pass an instance of <code>InstrumentationListener</code> to it.</p>
<p>The <code>InstrumentationListener</code> class also has two attributes: <code>branch_number</code> and <code>token_stream_rewriter</code>. <code>branch_number</code> is used to track the number of instrumented blocks during instrumentation. Each time we add a print statement, we increment the value of <code>branch_number</code> by one. </p>
<p>Line 3 defines <code>branch_number</code> and initializes it to zero. The <code>token_stream_rewriter</code> object is an instance of the <code>TokenStreamRewriter</code> class, which is provided by <span class="caps">ANTLR</span> and holds the stream of source code tokens. <code>TokenStreamRewriter</code> is initialized with <code>common_token_stream</code>, which has already been built by <span class="caps">ANTLR</span> from the lexer output, and it provides methods for adding and manipulating code snippets within a given stream. Line 5 creates an instance of the <code>TokenStreamRewriter</code> class to access its methods. If <code>common_token_stream</code> is <code>None</code>, an exception is raised (Line 7).</p>
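<p>To make the rewriter's behavior concrete, the following toy stand-in mimics the two operations we rely on, <code>insertAfter()</code> and <code>getDefaultText()</code>. It is a simplified sketch for illustration only, not <span class="caps">ANTLR</span>'s actual <code>TokenStreamRewriter</code>; the token list used here is a hypothetical stand-in for a real token stream.</p>

```python
class ToyTokenStreamRewriter:
    """Toy stand-in for ANTLR's TokenStreamRewriter (illustration only).

    Edits are recorded against token indices and applied lazily when the
    text is requested, so the original token stream is never mutated.
    """

    def __init__(self, common_token_stream):
        if common_token_stream is None:
            # Mirrors the listener's check: no token stream, no rewriting.
            raise TypeError('common_token_stream is None')
        self.tokens = list(common_token_stream)
        self.insertions = {}  # token index -> text to emit after that token

    def insertAfter(self, index, text):
        # Record the edit; later edits at the same index are appended.
        self.insertions[index] = self.insertions.get(index, '') + text

    def getDefaultText(self):
        # Replay the original tokens, splicing in the recorded insertions.
        out = []
        for i, token_text in enumerate(self.tokens):
            out.append(token_text)
            out.append(self.insertions.get(i, ''))
        return ''.join(out)
```

<p>The key property this illustrates is that the original token stream is left untouched; all edits live in a side table until the final text is requested.</p>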
<p>Let us explain the logic of <code>enterFunctionbody()</code> first, as it is simpler than <code>enterStatement()</code>. This method is invoked each time a function definition occurs in the source code. First, <code>branch_number</code> is incremented by 1 (Line 25). At Line 26, the print statement, including the <code>branch_number</code>, is prepared, and then at Line 27 we tell <code>token_stream_rewriter</code> to insert this new code right after the token that opens the function body, <em>i.e.</em>, <code>{</code> in C++. </p>
<p>Adding a print statement after conditional and loop statements requires more effort. <code>enterStatement()</code> is invoked each time a statement node is visited. Line 10 checks whether the statement is an instance of <code>SelectionstatementContext</code> or <code>IterationstatementContext</code>, the rule contexts for conditional and loop statements in the C++ grammar. If this condition does not hold, <em>i.e.</em>, for regular statements, no action is performed. Otherwise, we face two different situations. In the first one (Line 11), the body of the conditional or loop statement is a compound statement, i.e., a block of statements enclosed between the braces <code>{</code> and <code>}</code>.</p>
<p>In such a case, we just need to add our print statement at the beginning of the compound statement, right after the token <code>{</code>. The code for this case is exactly the same as the code used in <code>enterFunctionbody()</code>. The second situation occurs when the conditional or loop statement has only a single statement as its body. In this state, only that first statement is considered part of the condition or loop by the compiler. If one adds a print statement without any enclosing braces, the execution path will not be captured correctly. Hence, in Line 15, after detecting that the body is a single statement rather than a compound statement or a nested branch, the proper code is produced. The required instrumentation code consists of a left brace, a print statement, the current statement (the text of the context), and, at the end, a right brace. </p>
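<p>The string assembled in this second case can be captured by a small helper function. This is a sketch for illustration; <code>wrap_single_statement</code> and its parameters are hypothetical names standing in for the listener's <code>branch_number</code> attribute and <code>ctx.getText()</code>:</p>

```python
def wrap_single_statement(branch_number, statement_text):
    """Wrap a single-statement branch body in braces and prepend a
    logging statement, mirroring the elif case of enterStatement()."""
    new_code = '{'
    new_code += '\n logFile << "p' + str(branch_number) + '" << endl; \n'
    new_code += statement_text  # the original statement, e.g. ctx.getText()
    new_code += '\n}'
    return new_code
```

<p>The listener then hands this string to <code>replaceRange()</code> so the braced block takes the place of the original single statement.</p>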
<p>Line 22 replaces the current statement in the source code with <code>new_code</code>.
With this, the implementation of our <code>InstrumentationListener</code> is finished. The next step is to write the main driver for the instrumentation tool and connect this listener to the parse tree walker. </p>
<p>Figure 4 shows the body of the main Python script required to create and run our straightforward yet efficient instrumentation tool. Each line of code is explained by a comment, so we omit further description. The only important note is that the instrumented code, <em>i.e.</em>, the modified source code, is accessible through the <code>token_stream_rewriter</code> object. The <code>getDefaultText()</code> method of the <code>token_stream_rewriter</code> object is called to retrieve the new source code in Line 18.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">antlr4</span> <span class="kn">import</span> <span class="o">*</span>
<span class="c1"># Step 1: Convert input to a byte stream</span>
<span class="n">stream</span> <span class="o">=</span> <span class="n">InputStream</span><span class="p">(</span><span class="n">input_string</span><span class="p">)</span>
<span class="c1"># Step 2: Create lexer</span>
<span class="n">lexer</span> <span class="o">=</span> <span class="n">test_v2Lexer</span><span class="p">(</span><span class="n">stream</span><span class="p">)</span>
<span class="c1"># Step 3: Create a list of tokens</span>
<span class="n">token_stream</span> <span class="o">=</span> <span class="n">CommonTokenStream</span><span class="p">(</span><span class="n">lexer</span><span class="p">)</span>
<span class="c1"># Step 4: Create parser</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">test_v2Parser</span><span class="p">(</span><span class="n">token_stream</span><span class="p">)</span>
<span class="c1"># Step 5: Create parse tree</span>
<span class="n">parse_tree</span> <span class="o">=</span> <span class="n">parser</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
<span class="c1"># Step 6: Adding a listener</span>
<span class="n">instrument_listener</span> <span class="o">=</span> <span class="n">InstrumentationListener</span><span class="p">(</span><span class="n">common_token_stream</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">common_token_stream</span><span class="p">)</span>
<span class="c1"># Step 7: Create parse tree walker</span>
<span class="n">walker</span> <span class="o">=</span> <span class="n">ParseTreeWalker</span><span class="p">()</span>
<span class="c1"># Step 8: Walk parse tree, attaching the listener to instrumented_programs the code</span>
<span class="n">walker</span><span class="o">.</span><span class="n">walk</span><span class="p">(</span><span class="n">listener</span><span class="o">=</span><span class="n">instrument_listener</span><span class="p">,</span> <span class="n">t</span><span class="o">=</span><span class="n">parse_tree</span><span class="p">)</span>
<span class="c1"># Step 9: </span>
<span class="n">new_source_code</span> <span class="o">=</span> <span class="n">instrument_listener</span><span class="o">.</span><span class="n">token_stream_rewriter</span><span class="o">.</span><span class="n">getDefaultText</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="n">new_source_code</span><span class="p">)</span>
</code></pre></div>
<p><em>Figure 4. The driver code for instrumenting.</em></p>
<p>After instrumentation is complete, the program must be compiled and then executed to observe the effect of the modification. Figure 5 shows an example of execution with a sample input. As shown in Figure 5, the executed path for the inputs 24 and 18 is logged to the console, in addition to the program output, which is 6 in this example. The sequence of printed labels shows the order in which basic blocks were executed. We may change the instrumentation to capture more detailed runtime information; however, the techniques and principles remain the same as in this simple example. Interested readers may find more exercises on instrumentation at the end of this chapter.</p>
<p><img alt="Figure 5" src="../static/img/dynamic_analysis/gcd_execution_output.png"></p>
<p><em>Figure 5. An example of executing the <span class="caps">GCD</span> program after instrumenting.</em></p>
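<p>Because the instrumented program prints one label per executed block (<code>p1</code>, <code>p2</code>, …), the captured log is easy to post-process. The sketch below, under the assumption that the log contains one label per line as produced by our print statements, reconstructs the executed path and a simple branch coverage ratio; the function name is a hypothetical choice:</p>

```python
def analyze_execution_log(log_text, total_branches):
    """Parse an instrumentation log with one 'pN' label per line.

    Returns the executed path (labels in execution order) and the
    fraction of instrumented blocks that were executed at least once.
    """
    path = [line.strip() for line in log_text.splitlines()
            if line.strip().startswith('p')]
    covered = set(path)
    coverage = len(covered) / total_branches
    return path, coverage
```

<p>The same log could feed richer analyses, e.g., reconstructing which branches were never taken for a given test suite.</p>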
<h2>Conclusion</h2>
<p>We showed that, using the <span class="caps">ANTLR</span> listener mechanism, it is straightforward to instrument real-world C++ programs. A similar technique can be used to instrument source code written in other programming languages such as Java and C#.
In the <a href="program_instrumentation.md">next tutorial</a>, we discuss using <span class="caps">ANTLR</span> for static analysis of the source code and computing some source code metrics.</p>Program static analysis with ANTLR2021-03-29T23:45:00+04:302021-03-29T23:45:00+04:30Mortezatag:m-zakeri.github.io,2021-03-29:/program-static-analysis-with-antlr.html<p>Static analysis means extracting specific information from the program artifacts, e.g., source code, without any execution of the program. The <span class="caps">ANTLR</span> tool can be used to perform all types of static analysis at the source-code level. In this tutorial, I explain how we can use the <span class="caps">ANTLR</span> tool to perform some basic kinds of static analysis of the C++ programs in the Python programming language. The task I chose to explain is extracting the class diagram and computing the relevant design metrics.</p><h1>Introduction</h1>
<p>Source code and design metrics are extracted from the source code of the software, and their values allow us to draw conclusions about the quality attributes they measure.</p>
<p>A practical approach to computing such metrics is static analysis of the program source code. Again, this analysis can be performed by using the compiler front-end to deal with the parse tree and symbol table. The idea is to create a symbol table for the program under analysis and extract the desired metrics from it. In this section, we demonstrate the use of <span class="caps">ANTLR</span> to compute two essential design metrics, <span class="caps">FANIN</span> and <span class="caps">FANOUT</span>, which affect the testability of a module. </p>
<p><span class="caps">FANIN</span> and <span class="caps">FANOUT</span> can be computed from <span class="caps">UML</span> class diagrams. In the case of source code, we require to extract the class diagram from the program source code. We begin with constructing a simple symbol table to hold the necessary entities, e.g., classes and their relationships. Similar to <a href="program_instrumentation.md">our source code instrumentation tutorial</a>, the <span class="caps">ANTLR</span> listener mechanism is utilized to build the symbol table. The structure of our symbol table is shown in Figure 1.</p>
<p><img alt="Figure 1" src="../static/img/static_analysis/symbol_table.png"></p>
<p><em>Figure 1: Class diagram of a simple symbol table for C++</em></p>
<p>The class diagram in Figure 1 has been implemented in Python. During the parse tree walk, each entity is recognized and saved in an instance of the corresponding symbol table class. For example, whenever a method is recognized, an instance of the Method class is created to hold it. The Model class keeps the list of recognized classes as the top-level entities. The implementation of the proposed symbol table in Python is straightforward, and we omit it here.</p>
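<p>For concreteness, one possible minimal sketch of the Figure 1 entities in Python is given below. The attribute names and the helper method are assumptions made for illustration; the actual implementation may differ:</p>

```python
from dataclasses import dataclass, field as dc_field
from typing import List, Optional


@dataclass
class Method:
    name: str = ''


@dataclass
class Field:
    name: str = ''
    type: Optional['Class'] = None  # resolved to a Class for user-defined types


@dataclass
class Class:
    name: str = ''
    fields: List[Field] = dc_field(default_factory=list)
    methods: List[Method] = dc_field(default_factory=list)


@dataclass
class Model:
    """Top-level container holding all recognized classes."""
    class_list: List[Class] = dc_field(default_factory=list)

    def find_class_by_name(self, name: str) -> bool:
        return any(c.name == name for c in self.class_list)
```

<p>With plain dataclasses like these, the listeners only need to create instances and append them to the right lists while walking the tree.</p>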
<p>The next step is creating a listener and adding code to fill the symbol table. Listing 1 shows the listener used to recognize source code classes and add them to the symbol table.</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">DefinitionPassListener</span><span class="p">(</span><span class="n">CPP14Listener</span><span class="p">):</span>
<span class="sd">"""</span>
<span class="sd"> Pass 1: Extracting the classes and structs from a given CPP source code</span>
<span class="sd"> """</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model</span><span class="p">:</span> <span class="n">Model</span> <span class="o">=</span> <span class="kc">None</span><span class="p">):</span>
<span class="k">if</span> <span class="n">model</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model</span> <span class="o">=</span> <span class="n">Model</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model</span> <span class="o">=</span> <span class="n">model</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="o">=</span> <span class="kc">None</span>
<span class="k">def</span> <span class="nf">enterClassspecifier</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">ClassspecifierContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="o">=</span> <span class="n">Class</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">exitClassspecifier</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">ClassspecifierContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">class_list</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="o">=</span> <span class="kc">None</span>
<span class="k">def</span> <span class="nf">enterClassheadname</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">ClassheadnameContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">ctx</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span>
</code></pre></div>
<p><em>Listing 1: Recognizing classes in source code and inserting them into the symbol table</em></p>
<p>When an instance of <code>DefinitionPassListener</code> in Listing 1 is passed to the <code>ParseTreeWalker</code> instance, the classes within the source code are identified and inserted into the symbol table. This task is performed simply by implementing the listener methods that correspond to the class definition rules in the C++ grammar. </p>
<p>To better understand which methods of the base listener (<code>CPP14Listener</code>), generated by <span class="caps">ANTLR</span>, should be implemented to perform this task, we can look at the parse tree of a simple program containing one class with one field, shown in Listing 2. </p>
<div class="highlight"><pre><span></span><code><span class="n">class</span><span class="w"> </span><span class="n">A</span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">name</span><span class="p">;</span><span class="w"></span>
<span class="p">};</span><span class="w"></span>
</code></pre></div>
<p><em>Listing 2: C++ code snippet with one class and one field.</em></p>
<p>The parse tree of the code snippet in Listing 2 is shown in Figure 2.
The parse tree visualization can be produced with the <a href="https://plugins.jetbrains.com/plugin/7358-antlr-v4-grammar-plugin"><span class="caps">ANTLR</span> plugin for IntelliJ <span class="caps">IDEA</span></a>.
One can see the complexity of the C++ language and its compilation. The parse tree for this program of only four lines of code has 39 nodes and more than 350 parse decisions (invocations in the recursive descent parser), which shows how complex real programming languages are. Therefore, a practical way to analyze and test them is to utilize compiler techniques.</p>
<p><img alt="Simple CPP/Java class parse tree" src="../static/img/static_analysis/simple_cpp_class_parse_tree.png"></p>
<p><em>Figure 2: The parse tree for the code snippet shown in Listing 2</em></p>
<p>The classes recognized by applying <code>DefinitionPassListener</code> only have a name (set in Line 25). The <code>DefinitionPassListener</code> class does not capture any of the relationships required for computing <span class="caps">FANIN</span> and <span class="caps">FANOUT</span> or any other analysis. </p>
<p>Relationships between classes occur in different ways, e.g., through aggregation. In aggregation, one class has a field whose type is another class. To extract the aggregation relationship, we should extract all fields whose types are user-defined. Therefore, we create another listener with the following code:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">ResolvePassListener</span><span class="p">(</span><span class="n">DefinitionPassListener</span><span class="p">):</span>
<span class="sd">"""</span>
<span class="sd"> Pass 2: Extracting the classes' fields</span>
<span class="sd"> """</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model</span><span class="p">:</span> <span class="n">Model</span> <span class="o">=</span> <span class="kc">None</span><span class="p">):</span>
<span class="nb">super</span><span class="p">(</span><span class="n">DefinitionPassListener</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">model</span><span class="o">=</span><span class="n">model</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">enter_member_specification</span> <span class="o">=</span> <span class="kc">False</span>
<span class="bp">self</span><span class="o">.</span><span class="n">field</span> <span class="o">=</span> <span class="n">Field</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">enterMemberspecification</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">MemberspecificationContext</span><span class="p">):</span>
<span class="k">if</span> <span class="n">ctx</span><span class="o">.</span><span class="n">getChildCount</span><span class="p">()</span> <span class="o">==</span> <span class="mi">3</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">enter_member_specification</span> <span class="o">=</span> <span class="kc">True</span>
<span class="k">def</span> <span class="nf">enterDeclspecifier</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">DeclspecifierContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">enter_member_specification</span><span class="p">:</span>
<span class="n">ctx_the_type_name</span> <span class="o">=</span> <span class="n">ctx</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span>
<span class="k">for</span> <span class="n">class_instance</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">class_list</span><span class="p">:</span>
<span class="k">if</span> <span class="n">ctx_the_type_name</span> <span class="o">==</span> <span class="n">class_instance</span><span class="o">.</span><span class="n">name</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">field</span><span class="o">.</span><span class="n">type</span> <span class="o">=</span> <span class="n">class_instance</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span><span class="o">.</span><span class="n">fields</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">field</span><span class="p">)</span>
<span class="k">break</span>
</code></pre></div>
<p><em>Listing 3: Adding class fields to the program symbol table</em></p>
<p>The method <code>enterDeclspecifier</code> is invoked by <code>ParseTreeWalker</code> each time a declaration specifier, such as a field type, appears in the program source code. In <code>ResolvePassListener</code>, an extra check is required to determine whether the recognized variable belongs to a class. The flag <code>enter_member_specification</code> is set to true in the <code>enterMemberspecification</code> method and is used to determine the scope of the variable. In the <code>enterDeclspecifier</code> method, the type name of the variable is checked to find out whether it is the name of another class. Indeed, if the field has a user-defined type, the type of this field is resolved and the field is added to the current class. </p>
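<p>With the aggregation relationships resolved, <span class="caps">FANOUT</span> of a class is the number of distinct classes its fields reference, and <span class="caps">FANIN</span> is the number of distinct classes that reference it. A hedged sketch of this computation over a simplified model follows; the dictionary-based input format is an assumption chosen to keep the example self-contained:</p>

```python
def compute_fan_metrics(class_fields):
    """Compute (FANIN, FANOUT) per class.

    class_fields maps a class name to the list of user-defined types
    (class names) of its fields, as extracted by the resolve pass.
    Self-references are ignored.
    """
    fanout = {name: len(set(t for t in types if t != name))
              for name, types in class_fields.items()}
    fanin = {name: 0 for name in class_fields}
    for name, types in class_fields.items():
        for t in set(types):
            if t in fanin and t != name:
                fanin[t] += 1
    return {name: (fanin[name], fanout[name]) for name in class_fields}
```

<p>For instance, if class A aggregates B and C, and B aggregates C, then C has a FANIN of 2 and a FANOUT of 0.</p>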
<p>There are some practical considerations at this point. Why is a separate class defined for resolving the fields of classes? And why does <code>ResolvePassListener</code> inherit from <code>DefinitionPassListener</code>? The reason for separating the listener code into two classes is that the symbol table cannot be completed by traversing the parse tree only once. If we try to add the fields of a class at the same time that we are adding the class itself, we may not be able to find the proper type of user-defined fields, since not all types have yet been inserted into the symbol table. The best practice is to apply two separate analysis passes: one for adding types to the symbol table, called the definition pass, and another for resolving the types to check or complete their information, called the resolve pass. Each pass in the compiling process reads the source code from start to end.</p>
<p>The resolve pass inherits from the definition pass since the operations of the definition pass are still required. For example, Line 20 in <code>ResolvePassListener</code> requires the current class when adding the recognized field to it. However, <code>DefinitionPassListener</code>, as given in Listing 1, is not suitable to use as a parent for <code>ResolvePassListener</code>. It only inserts new classes into the symbol table, whereas we need to retrieve them when <code>ResolvePassListener</code> is being applied. Another problem is that if the current code of <code>DefinitionPassListener</code> is executed more than once, the same class is inserted into the <code>self.model.class_list</code> object in the symbol table again. We should fix the <code>DefinitionPassListener</code> class to solve these two problems.</p>
<p>First, before adding a new class (Line 25 in Listing 1), it should be checked that the class does not already exist in the symbol table. Second, if the class already exists, the corresponding class should be retrieved by its name in the <code>enterClassheadname</code> method and assigned to the <code>self.class_instance</code> object. These conditions are expected to be met when <code>ResolvePassListener</code> is executed as the second pass of our analysis.
Listing 4 shows the modified version of <code>DefinitionPassListener</code>.</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">DefinitionPassListener</span><span class="p">(</span><span class="n">CPP14Listener</span><span class="p">):</span>
<span class="sd">"""</span>
<span class="sd"> Pass 1 (modified): Extracting the classes and structs</span>
<span class="sd"> """</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model</span><span class="p">:</span> <span class="n">Model</span> <span class="o">=</span> <span class="kc">None</span><span class="p">):</span>
<span class="k">if</span> <span class="n">model</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model</span> <span class="o">=</span> <span class="n">Model</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model</span> <span class="o">=</span> <span class="n">model</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="o">=</span> <span class="kc">None</span>
<span class="k">def</span> <span class="nf">enterClassspecifier</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">ClassspecifierContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="o">=</span> <span class="n">Class</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">exitClassspecifier</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">ClassspecifierContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">find_calss_by_name</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span><span class="p">)</span> <span class="o">==</span> <span class="kc">False</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">class_list</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="o">=</span> <span class="kc">None</span>
<span class="k">def</span> <span class="nf">enterClassheadname</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">ctx</span><span class="p">:</span> <span class="n">CPP14Parser</span><span class="o">.</span><span class="n">ClassheadnameContext</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">find_calss_by_name</span><span class="p">(</span><span class="n">ctx</span><span class="o">.</span><span class="n">getText</span><span class="p">()):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">medel</span><span class="o">.</span><span class="n">get_class_by_name</span><span class="p">(</span><span class="n">class_instance</span><span class="o">.</span><span class="n">name</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_instance</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">ctx</span><span class="o">.</span><span class="n">getText</span><span class="p">()</span>
</code></pre></div>
<p><em>Listing 4: The fixed version of the <code>DefinitionPassListener</code> class.</em></p>
<p>In this tutorial, we assumed that the input program is compilable, and hence we did not perform additional compile-time tasks such as type checking. The complete implementation of the two listeners, including import statements and some additional code, is available in the <a href="https://github.com/m-zakeri/CodA/">CodA repository</a>. </p>
<p>Once our listeners are complete, we can add driver code that attaches them to a <code>ParseTreeWalker</code> and performs the target task, as discussed in the <a href="antlr_basics.md"><span class="caps">ANTLR</span> basics tutorial</a>.
The only difference is that we now have two listeners that must be executed in sequence: the definition pass runs first, and the resolution pass then consumes the model it built. Listing 5 shows the driver code for our static analysis task.</p>
<div class="highlight"><pre><span></span><code><span class="n">stream</span> <span class="o">=</span> <span class="n">FileStream</span><span class="p">(</span><span class="n">input_string</span><span class="p">)</span>
<span class="n">lexer</span> <span class="o">=</span> <span class="n">CPP14Lexer</span><span class="p">(</span><span class="n">stream</span><span class="p">)</span>
<span class="n">token_stream</span> <span class="o">=</span> <span class="n">CommonTokenStream</span><span class="p">(</span><span class="n">lexer</span><span class="p">)</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">CPP14Parser</span><span class="p">(</span><span class="n">token_stream</span><span class="p">)</span>
<span class="n">parse_tree</span> <span class="o">=</span> <span class="n">parser</span><span class="o">.</span><span class="n">translationunit</span><span class="p">()</span>
<span class="n">pass1</span> <span class="o">=</span> <span class="n">DefinitionPassListener</span><span class="p">()</span>
<span class="n">walker</span> <span class="o">=</span> <span class="n">ParseTreeWalker</span><span class="p">()</span>
<span class="n">walker</span><span class="o">.</span><span class="n">walk</span><span class="p">(</span><span class="n">listener</span><span class="o">=</span><span class="n">pass1</span><span class="p">,</span> <span class="n">t</span><span class="o">=</span><span class="n">parse_tree</span><span class="p">)</span>
<span class="n">pass2</span> <span class="o">=</span> <span class="n">ResolvePassListener</span><span class="p">(</span><span class="n">model</span><span class="o">=</span><span class="n">pass1</span><span class="o">.</span><span class="n">model</span><span class="p">)</span>
<span class="n">walker</span><span class="o">.</span><span class="n">walk</span><span class="p">(</span><span class="n">listener</span><span class="o">=</span><span class="n">pass2</span><span class="p">,</span> <span class="n">t</span><span class="o">=</span><span class="n">parse_tree</span><span class="p">)</span>
</code></pre></div>
<p><em>Listing 5: Driver code to perform static analysis of the source code.</em></p>
<p>The last step in our analysis is to build the class diagram as an annotated directed graph over the symbol table and to compute the <em><span class="caps">FAN</span>-<span class="caps">IN</span></em> and <em><span class="caps">FAN</span>-<span class="caps">OUT</span></em> metrics for each class. We create a node for each class and add an edge between every two classes that have an aggregation relationship; the direction of each edge specifies the direction of the aggregation. </p>
<p>Listing 6 shows the methods that create and visualize the discussed graph. Both methods are defined in the <code>Model</code> class, which was part of our symbol table in previous steps. The first method, <code>create_graph</code>, builds the graph for the class diagram using the <em>NetworkX</em> library. The second method, <code>draw_graph</code>, visualizes the created graph. The <code>Model</code> class also has two fields, <code>class_list</code> and <code>class_diagram</code>, which are not shown in Listing 6.
The first field holds all class instances found in the source code, and the second holds the graph corresponding to the class diagram.</p>
<div class="highlight"><pre><span></span><code><span class="n">__date__</span> <span class="o">=</span> <span class="s1">'2021-07-19'</span>
<span class="n">__author__</span> <span class="o">=</span> <span class="s1">'Morteza Zakeri'</span>
<span class="k">def</span> <span class="nf">create_graph</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">class_diagram</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">DiGraph</span><span class="p">()</span>
<span class="k">for</span> <span class="n">class_instance</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_list</span><span class="p">:</span>
<span class="n">class_diagram</span><span class="o">.</span><span class="n">add_node</span><span class="p">(</span><span class="n">class_instance</span><span class="o">.</span><span class="n">name</span><span class="p">)</span>
<span class="k">for</span> <span class="n">class_instance</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_list</span><span class="p">:</span>
<span class="k">if</span> <span class="n">class_instance</span><span class="o">.</span><span class="n">attributes_list</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">for</span> <span class="n">class_attribute</span> <span class="ow">in</span> <span class="n">class_instance</span><span class="o">.</span><span class="n">attributes_list</span><span class="p">:</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">class_attribute</span><span class="o">.</span><span class="n">variable_type</span><span class="p">,</span> <span class="p">(</span><span class="n">Class</span><span class="p">,</span> <span class="n">Structure</span><span class="p">)):</span>
<span class="n">w</span> <span class="o">=</span> <span class="mi">1</span>
<span class="k">if</span> <span class="n">class_diagram</span><span class="o">.</span><span class="n">has_edge</span><span class="p">(</span><span class="n">class_instance</span><span class="o">.</span><span class="n">name</span><span class="p">,</span> <span class="n">class_attribute</span><span class="o">.</span><span class="n">variable_type</span><span class="o">.</span><span class="n">name</span><span class="p">):</span>
<span class="n">w</span> <span class="o">=</span> <span class="n">class_diagram</span><span class="p">[</span><span class="n">class_instance</span><span class="o">.</span><span class="n">name</span><span class="p">][</span><span class="n">class_attribute</span><span class="o">.</span><span class="n">variable_type</span><span class="o">.</span><span class="n">name</span><span class="p">][</span><span class="s1">'weight'</span><span class="p">]</span>
<span class="n">w</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">class_diagram</span><span class="o">.</span><span class="n">add_edge</span><span class="p">(</span><span class="n">class_instance</span><span class="o">.</span><span class="n">name</span><span class="p">,</span> <span class="n">class_attribute</span><span class="o">.</span><span class="n">variable_type</span><span class="o">.</span><span class="n">name</span><span class="p">,</span>
<span class="n">rel</span><span class="o">=</span><span class="s1">'Aggregation'</span><span class="p">,</span>
<span class="n">weight</span><span class="o">=</span><span class="n">w</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span> <span class="o">=</span> <span class="n">class_diagram</span>
<span class="k">def</span> <span class="nf">draw_graph</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">new_names_dict</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">()</span>
<span class="k">for</span> <span class="n">node_name</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="o">.</span><span class="n">nodes</span><span class="p">:</span>
<span class="n">new_names_dict</span><span class="o">.</span><span class="n">update</span><span class="p">({</span><span class="n">node_name</span><span class="p">:</span> <span class="n">node_name</span><span class="p">})</span>
<span class="n">edge_labels</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">get_edge_attributes</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="p">,</span> <span class="s1">'rel'</span><span class="p">)</span>
<span class="n">edge_labels2</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">get_edge_attributes</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="p">,</span> <span class="s1">'cardinality'</span><span class="p">)</span>
<span class="n">pos</span> <span class="o">=</span> <span class="n">nx</span><span class="o">.</span><span class="n">kamada_kawai_layout</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="p">)</span>
<span class="n">nx</span><span class="o">.</span><span class="n">draw_networkx_nodes</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span>
<span class="n">nodelist</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="o">.</span><span class="n">nodes</span><span class="p">,</span>
<span class="n">node_shape</span><span class="o">=</span><span class="s1">'s'</span><span class="p">,</span>
<span class="n">node_size</span><span class="o">=</span><span class="mi">1000</span><span class="p">,</span>
<span class="n">alpha</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span>
<span class="n">node_color</span><span class="o">=</span><span class="s1">'r'</span><span class="p">)</span>
<span class="n">nx</span><span class="o">.</span><span class="n">draw_networkx_edges</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span>
<span class="n">edgelist</span><span class="o">=</span><span class="nb">list</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="o">.</span><span class="n">edges</span><span class="p">),</span>
<span class="n">width</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span>
<span class="n">alpha</span><span class="o">=</span><span class="mf">0.95</span><span class="p">,</span>
<span class="n">edge_color</span><span class="o">=</span><span class="s1">'b'</span><span class="p">)</span>
<span class="n">nx</span><span class="o">.</span><span class="n">draw_networkx_edge_labels</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span> <span class="n">edge_labels</span><span class="o">=</span><span class="n">edge_labels</span><span class="p">)</span>
<span class="n">nx</span><span class="o">.</span><span class="n">draw_networkx_edge_labels</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span> <span class="n">edge_labels</span><span class="o">=</span><span class="n">edge_labels2</span><span class="p">)</span>
<span class="n">nx</span><span class="o">.</span><span class="n">draw_networkx_labels</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">class_diagram</span><span class="p">,</span> <span class="n">pos</span><span class="p">,</span> <span class="n">new_names_dict</span><span class="p">,</span> <span class="n">font_size</span><span class="o">=</span><span class="mi">11</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div>
<p><em>Listing 6: Methods for creating and visualizing a simple class diagram.</em></p>
<p><span class="caps">FAN</span>-<span class="caps">IN</span> and <span class="caps">FAN</span>-<span class="caps">OUT</span> for each class are defined as the in-degree and out-degree, respectively, of the corresponding node in the class diagram graph. Therefore, once we have that graph, these metrics can be computed quickly. To illustrate the discussed static analysis on a real program, consider the C++ program in Listing 7, which has four simple classes: <code>Person</code>, <code>Student</code>, <code>Teacher</code>, and <code>Course</code>. The implementation of the classes is kept minimal for simplicity. Both the <code>Student</code> and <code>Teacher</code> classes inherit from the <code>Person</code> class. In addition, the <code>Student</code> class aggregates an instance of the <code>Course</code> class. </p>
<div class="highlight"><pre><span></span><code><span class="cp">#</span><span class="w"> </span><span class="cp">include</span><span class="w"> </span><span class="cpf"><string></span><span class="cp"></span>
<span class="cp">#</span><span class="w"> </span><span class="cp">include</span><span class="w"> </span><span class="cpf"><iostream></span><span class="cp"></span>
<span class="k">using</span><span class="w"> </span><span class="k">namespace</span><span class="w"> </span><span class="nn">std</span><span class="p">;</span><span class="w"></span>
<span class="k">class</span><span class="w"> </span><span class="nc">Course</span><span class="p">;</span><span class="w"> </span><span class="c1">// forward declaration: Student refers to Course before its definition</span><span class="w"></span>
<span class="k">class</span><span class="w"> </span><span class="nc">Person</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="k">protected</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">;</span><span class="w"></span>
<span class="k">public</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">Person</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="kt">void</span><span class="w"> </span><span class="nf">setPersonName</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="k">virtual</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="nf">doJob</span><span class="p">();</span><span class="w"></span>
<span class="p">};</span><span class="w"></span>
<span class="n">Person</span><span class="o">::</span><span class="n">Person</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">)</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">firstName</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">firstName</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">lastName</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lastName</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">nationalCode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="kt">void</span><span class="w"> </span><span class="n">Person</span><span class="o">::</span><span class="n">setPersonName</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">)</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">firstName</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">firstName</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">lastName</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lastName</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="kt">int</span><span class="w"> </span><span class="n">Person</span><span class="o">::</span><span class="n">doJob</span><span class="p">()</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">firstName</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">" is a person "</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="k">class</span><span class="w"> </span><span class="nc">Student</span><span class="o">:</span><span class="w"> </span><span class="k">public</span><span class="w"> </span><span class="n">Person</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="k">private</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="kt">long</span><span class="w"> </span><span class="n">studentNumber</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">Course</span><span class="o">*</span><span class="w"> </span><span class="n">course</span><span class="p">;</span><span class="w"></span>
<span class="k">public</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">Student</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">,</span><span class="w"> </span><span class="kt">long</span><span class="w"> </span><span class="n">studentNumber</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="nf">doJob</span><span class="p">()</span><span class="w"> </span><span class="k">override</span><span class="p">;</span><span class="w"></span>
<span class="p">};</span><span class="w"></span>
<span class="n">Student</span><span class="o">::</span><span class="n">Student</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">,</span><span class="w"> </span><span class="kt">long</span><span class="w"> </span><span class="n">studentNumber</span><span class="p">)</span><span class="o">:</span><span class="n">Person</span><span class="p">(</span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">lastName</span><span class="p">,</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">)</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">studentNumber</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">studentNumber</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"I am a student: "</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">studentNumber</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">course</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">Course</span><span class="p">(</span><span class="s">""</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">course</span><span class="o">-></span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"Software Engineering"</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="kt">int</span><span class="w"> </span><span class="n">Student</span><span class="o">::</span><span class="n">doJob</span><span class="p">()</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">firstName</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">" is studying "</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="mi">20</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="k">class</span><span class="w"> </span><span class="nc">Teacher</span><span class="o">:</span><span class="w"> </span><span class="k">public</span><span class="w"> </span><span class="n">Person</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="k">private</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="kt">long</span><span class="w"> </span><span class="n">teacherNumber</span><span class="p">;</span><span class="w"></span>
<span class="k">public</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">Teacher</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">,</span><span class="w"> </span><span class="kt">long</span><span class="w"> </span><span class="n">teacherNumber</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="nf">doJob</span><span class="p">()</span><span class="w"> </span><span class="k">override</span><span class="p">;</span><span class="w"></span>
<span class="p">};</span><span class="w"></span>
<span class="n">Teacher</span><span class="o">::</span><span class="n">Teacher</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">lastName</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">,</span><span class="w"> </span><span class="kt">long</span><span class="w"> </span><span class="n">teacherNumber</span><span class="p">)</span><span class="o">:</span><span class="n">Person</span><span class="p">(</span><span class="n">firstName</span><span class="p">,</span><span class="w"> </span><span class="n">lastName</span><span class="p">,</span><span class="w"> </span><span class="n">nationalCode</span><span class="p">)</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">teacherNumber</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">teacherNumber</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">"I am a teacher: "</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">teacherNumber</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="kt">int</span><span class="w"> </span><span class="n">Teacher</span><span class="o">::</span><span class="n">doJob</span><span class="p">()</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="n">cout</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">firstName</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="s">" is teaching "</span><span class="w"> </span><span class="o"><<</span><span class="w"> </span><span class="n">endl</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="k">class</span><span class="w"> </span><span class="nc">Course</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="k">public</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">name</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">number</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="n">Course</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">course_name</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">course_number</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">);</span><span class="w"></span>
<span class="p">};</span><span class="w"></span>
<span class="n">Course</span><span class="o">::</span><span class="n">Course</span><span class="p">(</span><span class="n">string</span><span class="w"> </span><span class="n">course_name</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">course_number</span><span class="p">)</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">course_name</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">this</span><span class="o">-></span><span class="n">number</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">course_number</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="cm">/* main function */</span><span class="w"></span>
<span class="kt">int</span><span class="w"> </span><span class="n">main</span><span class="p">()</span><span class="w"></span>
<span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="n">Teacher</span><span class="w"> </span><span class="n">t1</span><span class="p">(</span><span class="s">"Saeed"</span><span class="p">,</span><span class="w"> </span><span class="s">"Parsa"</span><span class="p">,</span><span class="w"> </span><span class="mi">1234</span><span class="p">,</span><span class="w"> </span><span class="mi">1398</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">Student</span><span class="w"> </span><span class="n">s1</span><span class="p">(</span><span class="s">"Morteza"</span><span class="p">,</span><span class="w"> </span><span class="s">"Zakeri"</span><span class="p">,</span><span class="w"> </span><span class="mi">5678</span><span class="p">,</span><span class="w"> </span><span class="mi">2020</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">t1</span><span class="p">.</span><span class="n">doJob</span><span class="p">();</span><span class="w"></span>
<span class="w"> </span><span class="n">s1</span><span class="p">.</span><span class="n">doJob</span><span class="p">();</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</code></pre></div>
<p><em>Listing 7: A C++ application to test the developed static analysis program in this tutorial.</em></p>
<p>Figure 3 shows the graph corresponding to the class diagram of this program, produced by executing the code in Listings 5 and 6. As can be seen, the inheritance relationships are also shown in the figure. The code that captures inheritance relationships has been omitted from this section; you may try implementing their extraction yourself after reading this tutorial. </p>
<p><img alt="The extracted class diagram" src="../static/img/static_analysis/extracted_calss_diagram.png"></p>
<p><em>Figure 3: Class diagram for the program shown in Listing 7.</em></p>
<p><span class="caps">FAN</span>-<span class="caps">IN</span> and <span class="caps">FAN</span>-<span class="caps">OUT</span> metrics can be computed as discussed earlier. For this simple example, the <span class="caps">FAN</span>-<span class="caps">IN</span> of class Student is 0 and its <span class="caps">FAN</span>-<span class="caps">OUT</span> is 1; however, a complete computation of these metrics should consider all relationships, including associations, dependencies, and parameter passing. </p>
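<p>As a minimal sketch of this computation (not part of the tutorial's code), the metrics can be derived from a list of directed dependency edges produced by the analyzer; the single edge below mirrors the Student-to-Course association of Listing 7:</p>

```python
# Minimal sketch: FAN-IN / FAN-OUT from directed dependency edges
# (source class -> target class). Extracting the edges themselves is the
# job of the static analyzer developed in this tutorial.
from collections import defaultdict

def fan_metrics(edges):
    fan_in = defaultdict(int)   # how many classes depend on this class
    fan_out = defaultdict(int)  # how many classes this class depends on
    for source, target in edges:
        fan_out[source] += 1
        fan_in[target] += 1
    return fan_in, fan_out

# Student holds a Course attribute, as in Listing 7
fan_in, fan_out = fan_metrics([("Student", "Course")])
print(fan_in["Student"], fan_out["Student"])  # 0 1
```

<p>A complete implementation would add edges for the other relationship kinds (dependencies, parameter passing) before counting.</p>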
<h2>Summary</h2>
<p>In this tutorial and the <a href="program_instrumentation.md">previous one</a>, I discussed the application of compilers in static and dynamic software analysis. I demonstrated these applications through a simple example of source code instrumentation and metrics computation. The former is a transformation task that modifies the source code, and the latter is an analysis task that extracts some information from the source code. Both of them are essential tasks in the future of software engineering. </p>
<p>Systematic software testing and quality assurance tools can be built on top of compiler tools such as <span class="caps">ANTLR</span>, <span class="caps">LLVM</span>, <span class="caps">JDT</span>, and Roslyn, with techniques presented in this chapter. Compilers build a detailed model of application code as they validate the syntax and semantics of that code. While traditional compilers used such a model to build the executable output from the source code in a black-box manner, the new generation of compilers provides APIs to access the internal details of this model, which can be utilized to build more reliable software. Software testing becomes more realistic with such advanced compiler support.</p>Advanced Software Engineering2021-03-23T00:23:00+04:302021-03-23T00:23:00+04:30Mortezatag:m-zakeri.github.io,2021-03-23:/advanced-software-engineering.html<p>Advanced Software Engineering, graduate course.</p><p>The <span class="caps">IUST</span> advanced software engineering (<span class="caps">ASE</span>) course aims at teaching the latest and emerging topics and advances in the field of software engineering to the students who are already familiar with basic subjects in the field. Here, I will share relevant materials and resources with you.</p>
<h2>Teaching assistant</h2>
<h3>Foreword</h3>
<p>I was the teaching assistant for the Advanced Software Engineering M.Sc. and Ph.D. course taught by <a href="http://parsa.iust.ac.ir/" target="_blank">Dr. Saeed Parsa</a> for six semesters at Iran University of Science and Technology.
Our teaching materials during these three years are available to view and download.</p>
<h3>Useful links</h3>
<ul>
<li><a href="http://parsa.iust.ac.ir/courses/advanced-software-engineering/" target="_blank">Course official website</a></li>
</ul>Compilers2021-03-23T00:23:00+04:302021-03-23T00:23:00+04:30Mortezatag:m-zakeri.github.io,2021-03-23:/compilers.html<p>Compiler design and constructions, Undergraduate course (Bachelor).</p><h2>Teaching assistant</h2>
<h3>Foreword</h3>
<p>I was the teaching assistant for the Compiler Design and Construction B.Sc. course taught by <a href="http://parsa.iust.ac.ir/" target="_blank">Dr. Saeed Parsa</a> for seven semesters (more than three years) at Iran University of Science and Technology. Our teaching materials during these years are available to view and download.</p>
<p>All the source code I developed to teach compilers to students in practice is available on the GitHub page of the <a href="http://parsa.iust.ac.ir/courses/compilers/" target="_blank"><span class="caps">IUST</span> compiler course</a>. </p>
<h3>Useful links</h3>
<ul>
<li>
<p><a href="http://parsa.iust.ac.ir/courses/compilers/" target="_blank"><span class="caps">IUST</span> Compiler Course Official page</a></p>
</li>
<li>
<p><a href="https://m-zakeri.github.io/IUSTCompiler/" target="_blank"><span class="caps">IUST</span> Compiler Course GitHub Page (in English)</a></p>
</li>
<li>
<p><a href="https://compileriust.github.io/" target="_blank"><span class="caps">IUST</span> Compiler Course GitHub Page (in Persian)</a></p>
</li>
</ul>
<h3>Projects</h3>
<p>I designed and planned some <strong>practical</strong> projects about the applications of compiler science in <strong>program analysis</strong>.
The projects shown in Table 1 have been assigned to the students who take the <span class="caps">IUST</span> compiler course during different semesters. Click on the link in the “Project” column to see the project proposal. </p>
<p><strong>Table 1:</strong> Compiler projects.</p>
<table>
<thead>
<tr>
<th>Project</th>
<th>Description</th>
<th>Semesters</th>
<th>Courses</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/core_symbol_table_development/">OpenUnderstand 2</a></td>
<td>Low-level source code metrics calculation</td>
<td>Spring 2022</td>
<td>Compiler</td>
</tr>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/core_symbol_table_development/">OpenUnderstand</a></td>
<td>Symbols table development</td>
<td>Fall 2021, Spring 2022</td>
<td>Compiler</td>
</tr>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/core_software_metrics_development/">QualityMeter</a></td>
<td>- Source code quality attribute computation - Refactoring opportunity detection</td>
<td>Fall 2021</td>
<td>Advanced compiler</td>
</tr>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/core_code_smell_development/">CodART 2</a></td>
<td>Source code smell detection</td>
<td>Spring 2021 (Cancelled)</td>
<td>Compiler</td>
</tr>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/core_refactoring_to_design_patterns_development/">CodART</a></td>
<td>Source code refactoring</td>
<td>Fall 2020, Spring 2021</td>
<td>Compiler</td>
</tr>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/core_refactorings_development/">CodART</a></td>
<td>Refactoring to design pattern at the source code level</td>
<td>Fall 2020</td>
<td>Advanced compiler</td>
</tr>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/core_clean_code_development/">CleanCode</a></td>
<td>Source code smell detection</td>
<td>Fall 2019, Spring 2020</td>
<td>Compiler</td>
</tr>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/core_source_code_instrumentation_development/">CodA</a></td>
<td>Source code instrumentation and testbed analysis tool</td>
<td>Fall 2018</td>
<td>Compiler / Advanced compiler</td>
</tr>
<tr>
<td><a href="https://m-zakeri.github.io/IUSTCompiler/projects/mini_java_compiler_development/"><span class="caps">ANTLR</span> MiniJava</a></td>
<td>Parse-tree and intermediate code generation for the MiniJava programming language with <span class="caps">ANTLR</span></td>
<td>Fall 2016, Spring 2017</td>
<td>Compiler</td>
</tr>
</tbody>
</table>
<h2>As a student</h2>
<p>I always enjoy learning about compilers, code transformation, and their applications in automated software engineering. I firmly believe that the next generation of software engineering tools will be intelligent white-box compilers! Such compilers are structure-aware, context-aware, and domain-aware, assisting the programmer in writing high-quality and testable programs.
Compilers helped artificial intelligence (<span class="caps">AI</span>) in the past, and now <span class="caps">AI</span> boosts compilers!</p>An introduction to ANTLR in Python2021-03-22T23:00:00+04:302021-03-22T23:00:00+04:30Mortezatag:m-zakeri.github.io,2021-03-22:/an-introduction-to-antlr-in-python.html<p><span class="caps">ANTLR</span> is a parser generator that can generate the parser program from context-free grammar descriptions specified in the <span class="caps">ANTLR</span> grammar format. In this tutorial, I explain how we can generate and use the Java parser with <span class="caps">ANTLR</span> in the Python programming language.</p><h2>Background</h2>
<p>The <span class="caps">ANTLR</span> tool generates a top-down parser from the grammar rules defined with the <span class="caps">ANTLR</span> meta-grammar (Parr and Fisher 2011). The initial version of <span class="caps">ANTLR</span> generated the target parser source code in Java. In the current version (version 4), the parser source code can be generated in a wide range of programming languages listed on the <a href="https://www.antlr.org" target="_blank"><span class="caps">ANTLR</span> official website</a> (Parr 2022a).
For simplicity, we generate the parser in Python 3, which allows us to run the tool on any platform with Python 3 installed.
Another reason to use Python is that we can integrate the developed program easily with other libraries available in Python, such as machine learning and optimization libraries.
Finally, I found that there is no comprehensive tutorial on using <span class="caps">ANTLR</span> with the Python backend. </p>
<p>To use <span class="caps">ANTLR</span> in other programming languages, specifically Java and C#, refer to the <span class="caps">ANTLR</span> slides I created before this tutorial. </p>
<p>The <span class="caps">ANTLR</span> tool is a small “.jar” file that must be run from the command line to generate the parser code. The <span class="caps">ANTLR</span> tool jar file can be downloaded from <a href="https://www.antlr.org/download/antlr-4.10.1-complete.jar">here</a>. </p>
<h2>Generating parser</h2>
<p>As mentioned, to generate a parser for a programming language, the grammar specification described with <span class="caps">ANTLR</span> meta-grammar is required. <span class="caps">ANTLR</span> grammar files are named with the “.g4” suffix. </p>
<p>We obtain the grammar of Java 8 to build our parser for the Java programming language. The grammar can be downloaded from the <span class="caps">ANTLR</span> 4 grammar repository on GitHub: <a href="https://github.com/antlr/grammars-v4">https://github.com/antlr/grammars-v4</a>.
Once the <span class="caps">ANTLR</span> tool and the required grammar files are prepared, we can generate the parser with the following commands:</p>
<div class="highlight"><pre><span></span><code>> java -Xmx500M -cp antlr-4.9.3-complete.jar org.antlr.v4.Tool -Dlanguage=Python3 -o . JavaLexer.g4
> java -Xmx500M -cp antlr-4.9.3-complete.jar org.antlr.v4.Tool -Dlanguage=Python3 -visitor -listener -o . JavaLabeledParser.g4
</code></pre></div>
<p>The first command generates the lexer from the <code>JavaLexer.g4</code> description file, and the second command generates the parser from the <code>JavaLabeledParser.g4</code> description file. It is worth noting that the lexer and parser can be written in one file. In such a case, a <em>single command</em> generates all the required code in one step.</p>
<p>The grammar files used in the above command are also available in <a href="https://m-zakeri.github.io/IUSTCompiler/grammars" target="_blank">grammars directory</a> of the CodART repository.
You may see that I have made some modifications to the Parser rules. </p>
<p>In the above commands, <code>antlr-4.9.3-complete.jar</code> is the <span class="caps">ANTLR</span> tool, which requires Java to run. <code>-Dlanguage</code> specifies the target language in which the <span class="caps">ANTLR</span> parser (and lexer) source code is generated; in our case, we set it to Python3. </p>
<p>After executing the <span class="caps">ANTLR</span> parser generation commands, eight files, including the parser source code and other required information, are generated. Figure 1 shows the generated files. The “.py” files contain the lexer and parser source code that can parse any Java input file. The <code>-visitor -listener</code> switches in the second command result in generating two separate source files, <code>JavaLabeledParserListener.py</code> and <code>JavaLabeledParserVisitor.py</code>, which provide interfaces for implementing the code required by a specific language application. Our application is source code refactoring, which uses the listener mechanism to implement the actions necessary to transform the program into its refactored version.
The parse tree structure and the listener mechanism are discussed in the next sections.</p>
<p><img alt="ANTLR generated files " src="../static/img/antlr_basics/antlr-generated-files.png"></p>
<p><em>Figure 1. Generated files by <span class="caps">ANTLR</span>.</em></p>
<p>It should be noted that to use the generated classes in Figure 1 for developing a specific program, we need to install the appropriate <span class="caps">ANTLR</span> runtime library. For creating <span class="caps">ANTLR</span>-based programs in Python, the command <code>pip install antlr4-python3-runtime</code> can be used. It installs all the runtime dependencies required to program with the <span class="caps">ANTLR</span> library.</p>
<h2><span class="caps">ANTLR</span> parse tree</h2>
<p>The parser generated by <span class="caps">ANTLR</span> is responsible for parsing every Java source code file, producing the parse tree or reporting the syntax errors in the input file. The parse tree for real-world programs with thousands of lines of code has a non-trivial structure. <span class="caps">ANTLR</span> developers have provided some <span class="caps">IDE</span> plugins that visualize the parse tree to help better understand its structure. We use the PyCharm <span class="caps">IDE</span>, developed by JetBrains, to work with Python code. </p>
<p>Figure 2 shows how we can install the <span class="caps">ANTLR</span> plugin in PyCharm. The plugin source code is available in its <a href="https://github.com/antlr/intellij-plugin-v4" target="_blank">GitHub repo</a>. When the plugin is installed, the <span class="caps">ANTLR</span> preview window appears at the bottom of the PyCharm <span class="caps">IDE</span>. In addition, the <span class="caps">IDE</span> recognizes “.g4” files, and some other options are added to it. The main option is the ability to test a grammar rule and visualize the parse tree corresponding to that rule.</p>
<p><img alt="Installing ANTLR plugin in PyCharm IDE " src="../static/img/antlr_basics/installing-antlr4-plugin-in-pycharm.png"></p>
<p><em>Figure 2. Installing the <span class="caps">ANTLR</span> plugin in the PyCharm <span class="caps">IDE</span>.</em></p>
<p>In order to use the <span class="caps">ANTLR</span> preview tab, the <span class="caps">ANTLR</span> grammar should be opened in the PyCharm <span class="caps">IDE</span>. We then select a rule (typically the start rule) of our grammar, right-click on it, and select the “Test Rule <code>rule_name</code>” option from the context menu, as shown in Figure 3. We can then write a sample input program in the left panel of the <span class="caps">ANTLR</span> preview, and the parse tree is shown in the right panel. </p>
<p><img alt="Test the grammar rule in the ANTLR PyCharm plugin " src="../static/img/antlr_basics/selecting-rule-for-test-in-antlr-preview-window.png"></p>
<p><em>Figure 3. Test the grammar rule in the <span class="caps">ANTLR</span> PyCharm plugin.</em></p>
<p>Figure 4 shows a simple Java class and the corresponding parse tree generated by <span class="caps">ANTLR</span>. The leaves of the parse tree are the program tokens, while the intermediate nodes are the grammar rules from which the input program is derived. The root of the tree is the grammar rule we selected to start parsing from, which means that we can select and test every rule independently. However, a complete Java program can only be parsed from the start rule of the given grammar, i.e., the <code>compilationUnit</code> rule.</p>
<p><img alt="Test the grammar rule in the ANTLR PyCharm plugin" src="../static/img/antlr_basics/example_of_antlr-parse-tree-in-antlr-preview.png"></p>
<p><em>Figure 4. A simple Java class and its parse tree in the <span class="caps">ANTLR</span> preview window.</em></p>
<p>It should be mentioned that the <span class="caps">ANTLR</span> Preview window is based on a grammar interpreter, not on the actual generated parser described in the previous section. It means that grammar attributes such as actions and predicates will not be evaluated during live preview because the interpreter is language agnostic. For the same reasons, if the generated parser and/or lexer classes extend a custom implementation of the base parser/lexer classes, the custom code will not be run during the live preview. </p>
<p>In addition to the parse tree visualization, the <span class="caps">ANTLR</span> plugin provides facilities such as profiling, code generation, etc., described in <a href="https://github.com/antlr/intellij-plugin-v4" target="_blank">here</a> (Parr 2022b). For example, the profile tab shows the execution time of each rule in the parser for a given input string.</p>
<p>I want to emphasize that visualizing the parse tree with the <span class="caps">ANTLR</span> plugin is really helpful when developing code and fixing bugs, as described in the next section of this tutorial.</p>
<h2>Traversing the parse tree programmatically</h2>
<p><span class="caps">ANTLR</span> is not a simple parser generator. It provides depth-first parse tree visiting and a callback mechanism, called a listener, to implement the required program analysis or transformation passes. The depth-first traversal is performed by instantiating an object of the <span class="caps">ANTLR</span> <code>ParseTreeWalker</code> class and calling its walk method, which takes an instance of <code>ParseTree</code> as an input argument and traverses it.</p>
<p>Obviously, if we visit the parse tree with the depth-first search algorithm, all program tokens are visited in the same order in which they appear in the source code file. Moreover, the depth-first search provides additional information about when a node in the tree is visited and when the visit to all nodes in its subtree is finished. Therefore, we can attach the required actions to node visits to perform a specific task. For example, according to Figure 4, to count the number of classes in a code snippet, we can define a counter variable, initialize it to zero, and increase it whenever the walker visits a “classDeclaration” node. </p>
<p><span class="caps">ANTLR</span> provides two callback functions for each node in the parse tree. One is called by the walker when it enters a node, i.e., it visits the node before the node's children are visited. The other is called when all nodes in the subtree of the visited node have been visited and the walker is exiting the node. These callback functions are available in the listener class generated by <span class="caps">ANTLR</span> for every rule in a given grammar. In our example of counting the number of classes, we implement all the required logic in the body of the <code>enterClassDeclaration</code> method of the <code>JavaLabeledParserListener</code> class. We call this logic the <em>grammar’s actions</em> since, indeed, it is bound to a grammar rule. </p>
<p>It is worth noting that we can add this action code to the grammar file (<code>.g4</code> file) to form an attributed grammar. Embedding actions in the grammar increases the efficiency of the analysis process. However, when many complex actions are needed, the listener mechanism provides a better way to implement them. Indeed, <span class="caps">ANTLR</span> 4 emphasizes separating language applications from the language grammar by using the listener mechanism.</p>
<p>Listing 1 shows the implementation of the program for counting the number of classes using the <span class="caps">ANTLR</span> listener mechanism. The <code>DesignMetrics</code> class inherits from the <code>JavaLabeledParserListener</code> class, which is the default listener class generated by <span class="caps">ANTLR</span>. We only implement the <code>enterClassDeclaration</code> method, which increments the <code>__dsc</code> counter each time the walker visits a Java class.</p>
<div class="highlight"><pre><span></span><code><span class="c1"># module: JavaLabledParserListener.py</span>
<span class="n">__version__</span> <span class="o">=</span> <span class="s2">"0.1.0"</span>
<span class="n">__author__</span> <span class="o">=</span> <span class="s2">"Morteza"</span>
<span class="kn">from</span> <span class="nn">antlr4</span> <span class="kn">import</span> <span class="o">*</span>
<span class="k">if</span> <span class="vm">__name__</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="ow">and</span> <span class="s2">"."</span> <span class="ow">in</span> <span class="vm">__name__</span><span class="p">:</span>
<span class="kn">from</span> <span class="nn">.JavaLabeledParser</span> <span class="kn">import</span> <span class="n">JavaLabeledParser</span>
<span class="k">else</span><span class="p">:</span>
<span class="kn">from</span> <span class="nn">JavaLabeledParser</span> <span class="kn">import</span> <span class="n">JavaLabeledParser</span>
<span class="k">class</span> <span class="nc">JavaLabeledParserListener</span><span class="p">(</span><span class="n">ParseTreeListener</span><span class="p">):</span>
<span class="c1"># …</span>
<span class="k">def</span> <span class="nf">enterClassDeclaration</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span>
<span class="n">ctx</span><span class="p">:</span><span class="n">JavaLabeledParser</span><span class="o">.</span><span class="n">ClassDeclarationContext</span><span class="p">):</span>
<span class="k">pass</span>
<span class="c1"># …</span>
<span class="k">class</span> <span class="nc">DesignMetrics</span><span class="p">(</span><span class="n">JavaLabeledParserListener</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">__dsc</span><span class="p">:</span><span class="nb">int</span> <span class="o">=</span> <span class="mi">0</span> <span class="c1"># Keep design size in classes</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">get_design_size</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">__dsc</span>
<span class="k">def</span> <span class="nf">enterClassDeclaration</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span>
<span class="n">ctx</span><span class="p">:</span><span class="n">JavaLabeledParser</span><span class="o">.</span><span class="n">ClassDeclarationContext</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">__dsc</span> <span class="o">+=</span> <span class="mi">1</span>
</code></pre></div>
<p><em>Listing 1: Programs that count the number of classes in a Java source code.</em></p>
<h3>Wiring the modules</h3>
<p>To complete our simple analysis task, first, the parse tree for a given input should be constructed. Then, the <code>DesignMetrics</code> class should be instantiated and passed to an object of the <code>ParseTreeWalker</code> class. We created a driver module in Python alongside the code generated by <span class="caps">ANTLR</span> to connect the different parts of our program and complete the task. Listing 2 shows the implementation of the main driver for a program that counts the number of classes in Java source code.</p>
<div class="highlight"><pre><span></span><code><span class="c1"># Module: main_driver.py</span>
<span class="n">__version__</span> <span class="o">=</span> <span class="s2">"0.1.0"</span>
<span class="n">__author__</span> <span class="o">=</span> <span class="s2">"Morteza"</span>
<span class="kn">from</span> <span class="nn">antlr4</span> <span class="kn">import</span> <span class="o">*</span>
<span class="kn">from</span> <span class="nn">JavaLexer</span> <span class="kn">import</span> <span class="n">JavaLexer</span>
<span class="kn">from</span> <span class="nn">JavaLabeledParser</span> <span class="kn">import</span> <span class="n">JavaLabeledParser</span>
<span class="kn">from</span> <span class="nn">JavaLabeledParserListener</span> <span class="kn">import</span> <span class="n">DesignMetrics</span>
<span class="k">def</span> <span class="nf">main</span><span class="p">(</span><span class="n">args</span><span class="p">):</span>
<span class="c1"># Step 1: Load input source into the stream object</span>
<span class="n">stream</span> <span class="o">=</span> <span class="n">FileStream</span><span class="p">(</span><span class="n">args</span><span class="o">.</span><span class="n">file</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="s1">'utf8'</span><span class="p">)</span>
<span class="c1"># Step 2: Create an instance of the JavaLexer class</span>
<span class="n">lexer</span> <span class="o">=</span> <span class="n">JavaLexer</span><span class="p">(</span><span class="n">stream</span><span class="p">)</span>
<span class="c1"># Step 3: Convert the input source into a list of tokens</span>
<span class="n">token_stream</span> <span class="o">=</span> <span class="n">CommonTokenStream</span><span class="p">(</span><span class="n">lexer</span><span class="p">)</span>
<span class="c1"># Step 4: Create an instance of the JavaLabeledParser class</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">JavaLabeledParser</span><span class="p">(</span><span class="n">token_stream</span><span class="p">)</span>
<span class="c1"># Step 5: Create parse tree</span>
<span class="n">parse_tree</span> <span class="o">=</span> <span class="n">parser</span><span class="o">.</span><span class="n">compilationUnit</span><span class="p">()</span>
<span class="c1"># Step 6: Create an instance of DesignMetrics listener class</span>
<span class="n">my_listener</span> <span class="o">=</span> <span class="n">DesignMetrics</span><span class="p">()</span>
<span class="c1"># Step 7: Create a walker to traverse the parse tree and callback our listener</span>
<span class="n">walker</span> <span class="o">=</span> <span class="n">ParseTreeWalker</span><span class="p">()</span>
<span class="n">walker</span><span class="o">.</span><span class="n">walk</span><span class="p">(</span><span class="n">t</span><span class="o">=</span><span class="n">parse_tree</span><span class="p">,</span> <span class="n">listener</span><span class="o">=</span><span class="n">my_listener</span><span class="p">)</span>
<span class="c1"># Step 8: Getting the results</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'DSC=</span><span class="si">{</span><span class="n">my_listener</span><span class="o">.</span><span class="n">get_design_size</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
</code></pre></div>
<p><em>Listing 2: Main driver module for the program in Listing 1</em></p>
<h2>Conclusion and remarks</h2>
<p>In this tutorial, we described the basic concepts of using the <span class="caps">ANTLR</span> tool to generate and walk parse trees and to implement custom program analysis applications with the help of the <span class="caps">ANTLR</span> listener mechanism. The most important point is that we used real-world programming language grammars to show the parsing and analysis process. The discussed topics form the underlying concepts of our approach for automated refactoring used in CodART.
Check out the <a href="https://m-zakeri.github.io/CodART/tutorials/antlr_advanced/" target="_blank"><span class="caps">ANTLR</span> advanced tutorial</a> to find out how we can use <span class="caps">ANTLR</span> for reliable and efficient program transformation.</p>
<hr>
<h2>References</h2>
<p>Parr T (2022a) <span class="caps">ANTLR</span> (ANother Tool for Language Recognition). https://www.antlr.org. Accessed 10 Jan 2022</p>
<p>Parr T (2022b) IntelliJ Idea Plugin for <span class="caps">ANTLR</span> v4. https://github.com/antlr/intellij-plugin-v4. Accessed 10 Jan 2022</p>
<p>Parr T, Fisher K (2011) <span class="caps">LL</span>(*): the foundation of the <span class="caps">ANTLR</span> parser generator. Proc 32nd <span class="caps">ACM</span> <span class="caps">SIGPLAN</span> Conf Program Lang Des Implement 425–436. https://doi.org/10.1145/1993498.1993548</p>Innovations on Automatic Test Data Generation2021-03-22T23:00:00+04:302021-03-22T23:00:00+04:30Mortezatag:m-zakeri.github.io,2021-03-22:/innovations-on-automatic-test-data-generation.html<p>Fuzz testing (fuzzing) is a dynamic software testing technique. In this technique, by repeatedly generating and injecting malformed test data into the software under test (<span class="caps">SUT</span>), we look for possible faults and vulnerabilities.</p><p>Fuzz testing (fuzzing) is a dynamic software testing technique. In this technique, by repeatedly generating and injecting malformed test data into the software under test (<span class="caps">SUT</span>), we look for possible faults and vulnerabilities. To this end, fuzz testing requires a wide variety of test data. The most critical challenge is handling the complexity of file structures as program input. Surveys have revealed that, in these cases, many of the generated test data exercise only a restricted number of superficial paths because they are rejected by the parser of the <span class="caps">SUT</span> in the initial stages of parsing. Using the grammatical structure of input files to generate test data leads to increased code coverage. However, grammar extraction is often performed manually, which is a time-consuming, costly, and error-prone task.</p>
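<p>To make the mutation side of this process concrete, the following is a minimal, illustrative sketch of the core loop of a mutation-based file fuzzer. It is not the method proposed here and is far simpler than tools such as <span class="caps">AFL</span> (which add coverage feedback and many mutation operators); the seed content is a hypothetical stand-in for a real <span class="caps">PDF</span> seed file:</p>

```python
# Illustrative bit-flip mutator: the simplest mutation operator used by
# file-format fuzzers. It takes a valid seed file and flips a few random
# bits to produce a malformed variant to inject into the program under test.
import random

def mutate(seed, n_flips=4, rng=None):
    """Return a copy of `seed` with up to `n_flips` random bits flipped."""
    rng = rng or random.Random()
    data = bytearray(seed)
    for _ in range(n_flips):
        pos = rng.randrange(len(data))
        data[pos] ^= 1 << rng.randrange(8)  # flip one random bit
    return bytes(data)

seed = b"%PDF-1.4 minimal seed content"        # hypothetical seed bytes
variant = mutate(seed, rng=random.Random(42))  # same length, slightly malformed
```

<p>A real fuzzer would run this loop millions of times, feeding each variant to the <span class="caps">SUT</span> and keeping the inputs that reach new code.</p>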
<p>Recently, we proposed an automated method for hybrid test data generation. We applied neural language models (NLMs) constructed from recurrent neural networks (RNNs). Using deep learning techniques, the proposed models can learn the statistical structure of complex files and then generate new textual test data, based on the grammar, and binary data, based on mutations. Fuzzing with the generated data is done by two newly introduced algorithms, called neural fuzz algorithms, that use these models. We used our proposed method to generate test data for, and then fuzz test, MuPDF, a complex piece of software that takes portable document format (<span class="caps">PDF</span>) files as input. To train our generative models, we gathered a large corpus of <span class="caps">PDF</span> files. Our experiments demonstrate that the data generated by this method leads to an increase in code coverage of more than 7.0% compared to state-of-the-art file format fuzzers such as American fuzzy lop (<span class="caps">AFL</span>). The experiments also indicate a better learning accuracy for the simpler <span class="caps">NLM</span>s in comparison with the more complicated encoder-decoder model and confirm that our proposed models can outperform the encoder-decoder model in code coverage when fuzzing the <span class="caps">SUT</span>.</p>
<p>Our paper presents a solution for complex test data generation with the help of deep neural networks. The word “complex” in this context means that the test data consist of various data types put together according to a specific format or grammar. This is the case in most real-world applications that accept a file as their main input. For example, <span class="caps">PDF</span> reader software must handle <span class="caps">PDF</span> files as input, and <span class="caps">PDF</span> is one of the most complex input formats: a <span class="caps">PDF</span> file contains both textual and non-textual (binary) data, plus many human-defined rules that place such data fields next to each other to form a file. To handle a complex file, an application processes it in different stages, usually beginning with parsing the input file, continuing with semantic analysis, and finally executing the file content. Generating test data, here a complex input file, that achieves high code coverage and finds likely existing bugs requires that the test data pass all of these stages. To the best of our knowledge, the methods used in fuzzing, one of the most effective software testing techniques, show that randomly generating such test data leads to very low code coverage and hence cannot guarantee the absence of bugs or the reliability of the software. On the other hand, generating test data from a grammar requires extracting the grammar or model of the file manually, which is expensive and time-consuming.</p>
<h2>Challenges and Solutions</h2>
<p>The problem of generating complex test data that successfully passes the different stages of file processing is addressed by using machine learning techniques to learn the structure of a given input file and then generating new test data based on the learned model. A file can be seen as a sequence of bytes generated by the grammar of that file format. Hence, we can use a language model to learn this grammar automatically from a corpus containing various samples of the given file format. Neural language models are effective at learning the properties of natural language and are successfully used in complex natural language processing (<span class="caps">NLP</span>) tasks such as machine translation and image captioning. We apply a model based on deep neural language models to learn the grammar of the file. The learned model is then sampled to generate new files as test data.</p>
<p>The first problem we encountered was finding a mechanism to distinguish between data and meta-data. To do so, we applied a reasonable trick: since meta-data is repeated in almost every sample of a file format, the learned model predicts meta-data with higher probability than data. By putting a threshold on the model output, data and meta-data become distinguishable.</p>
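This thresholding trick can be sketched as follows. The sketch is purely illustrative: the model interface (<code>predict_proba</code>), the threshold value, and the toy stand-in model are assumptions, not the actual IUST-DeepFuzz code.

```python
# Hedged sketch: separating meta-data from data by thresholding a
# learned model's per-byte probabilities. High-confidence bytes are
# treated as meta-data (they recur in almost every sample).

def split_data_metadata(sequence, predict_proba, threshold=0.9):
    """Label each byte as 'meta' (high model confidence) or 'data'."""
    labels = []
    for i, byte in enumerate(sequence):
        # probability the model assigns to the byte actually observed
        p = predict_proba(sequence[:i]).get(byte, 0.0)
        labels.append("meta" if p >= threshold else "data")
    return labels

# Toy stand-in model: always predicts '%' (a PDF keyword character)
# with high probability, everything else with low probability.
def toy_model(prefix):
    return {"%": 0.95, "x": 0.01}

print(split_data_metadata("%x%", toy_model))  # ['meta', 'data', 'meta']
```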
<p>Since we aim to find bugs in software through fuzzing, the second problem was how to determine which byte should be fuzzed to reveal failures in the software under test (<span class="caps">SUT</span>). This can be done by targeting the different input-processing stages that the <span class="caps">SUT</span> uses. For example, if we would like to fuzz the parser, we should fuzz the meta-data, because the parser usually deals with meta-data to validate the format of the input file and to extract the data. On the other hand, if we would like to fuzz the execution stage, it may be better to fuzz the data. The learned model’s ability to distinguish data from meta-data is used to determine the place of the fuzz in the file.</p>
<p>In addition to determining the place of the fuzz, we must determine the value to substitute there, which is the third problem we addressed. Since the goal of fuzzing is to create malformed input, the most inappropriate byte should be put in the place of the fuzz. Again, the most inappropriate byte can be determined by the learned neural language model: it is enough to select the byte with the lowest likelihood instead of the highest likelihood used in the default manner.</p>
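As a minimal sketch, assuming the model exposes its next-byte distribution as a mapping from byte to probability, picking the fuzz value reduces to an argmin over that distribution:

```python
# Hedged sketch: choose the fuzz value as the *least* likely next byte
# under a learned language model, the opposite of ordinary sampling.
# The probability table below is illustrative, not model output.

def least_likely_byte(probs):
    """Return the byte the model considers most 'inappropriate' here."""
    return min(probs, key=probs.get)

probs = {0x25: 0.6, 0x0A: 0.3, 0xFF: 0.1}  # '%', '\n', 0xFF
assert least_likely_byte(probs) == 0xFF
```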
<p>The fourth problem was that we wanted to train a neural language model on the <span class="caps">ASCII</span> character set rather than on all bytes. To deal with the non-<span class="caps">ASCII</span> bytes that make up the non-textual parts of the input, we replaced those parts with a specific token, called the binary token, and asked the model to learn that token. At generation time, whenever the model predicts the binary token, we replace it with a real binary section previously removed from a file. This is a simple but effective way to reach a hybrid test data generation scheme. Two specific fuzzing algorithms, MetadataNeuralFuzz and DataNeuralFuzz, are proposed based on the learned generative model. The former targets the parsing stage, and the latter focuses on the rendering or execution stage of input-file processing. We believe that both algorithms are required to reach complete fuzz testing with high code coverage and, probably, a high number of discovered bugs. We investigate the effectiveness of various language models with different configurations and several sampling strategies in the context of complex test data generation, and we study the various parameters involved in generating and fuzzing test data with deep learning techniques.</p>
<h2>Tools and Publications</h2>
<p>To turn our theory into a practical tool, we designed and implemented <span class="caps">IUST</span>-DeepFuzz as a modular file format fuzzer. The main module of <span class="caps">IUST</span>-DeepFuzz is a test data generator that implements our neural fuzz algorithms. The fuzzer injects test data into the <span class="caps">SUT</span> and checks for unexpected results such as crashes or memory corruption in the <span class="caps">SUT</span>. <span class="caps">IUST</span>-DeepFuzz uses Microsoft Application Verifier, a free runtime monitoring tool, as its monitoring module to catch any memory corruption, and VSPerfMon, another Microsoft tool, to measure code coverage. The modules are connected with modest Python and batch scripts. <span class="caps">IUST</span>-DeepFuzz can automatically find both the place and the value of the fuzzed symbol while generating the input. Other file formats such as <span class="caps">HTML</span>, <span class="caps">CSS</span>, <span class="caps">XML</span>, and <span class="caps">JSON</span>, as well as all kinds of source code, can be produced in the same manner, making the approach suitable for fuzzing and quick quality assurance of any software system.</p>
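The inject-and-monitor loop might be sketched roughly as below. This is an assumption-laden simplification: the SUT command, the temp-file handling, and the plain return-code check are placeholders, since the real tool relies on Application Verifier and VSPerfMon on Windows for monitoring and coverage.

```python
import os
import subprocess
import sys
import tempfile

# Hedged sketch of a fuzzer's outer loop: write one generated test
# file, hand it to the SUT, and watch for abnormal termination.

def run_once(sut_cmd, test_data: bytes, timeout=10):
    """Run the SUT on one generated input; report its exit status."""
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as f:
        f.write(test_data)
        path = f.name
    try:
        proc = subprocess.run(sut_cmd + [path], timeout=timeout,
                              capture_output=True)
        return proc.returncode  # negative/non-zero may mean a crash
    except subprocess.TimeoutExpired:
        return "hang"
    finally:
        os.unlink(path)

# Example with a trivial stand-in "SUT" that always exits cleanly.
assert run_once([sys.executable, "-c", "pass"], b"%PDF-1.4") == 0
```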
<h2>Read more</h2>
<p>For more information about both the theoretical and practical aspect of <span class="caps">IUST</span>-DeepFuzz, refer to the <a href="https://m-zakeri.github.io/iust_deep_fuzz"><span class="caps">IUST</span>-DeepFuzz website</a>.</p>Dynamic Complex Network2020-03-04T21:12:00+03:302020-03-04T21:12:00+03:30Mortezatag:m-zakeri.github.io,2020-03-04:/dynamic-complex-network.html<p>Dynamic Complex Network, Graduate course.</p><h2>Teaching assistant</h2>
<h2>Foreword</h2>
<p>I was the teaching assistant for the Dynamic Complex Network M.Sc. and Ph.D. course taught by <a href="http://webpages.iust.ac.ir/h_rahmani" target="_blank">Dr. Hossein Rahmani</a> for one semester (winter and spring 2020) at Iran University of Science and Technology.
Our teaching materials are available to view and download.</p>
<h3>Useful links</h3>
<ul>
<li><a href="https://m-zakeri.github.io/iust_complex_network" target="_blank">Course code samples and homework solutions</a></li>
<li><a href="https://m-zakeri.github.io/CodART/" target="_blank">Software clustering</a></li>
</ul>Game Theory2020-03-04T21:12:00+03:302020-03-04T21:12:00+03:30Mortezatag:m-zakeri.github.io,2020-03-04:/game-theory.html<p>Game Theory, Graduate course.</p><p>The course discusses the fundamentals of game theory. Game theory is the study of mathematical models of strategic interactions among rational agents. It has applications in all fields of social science and logic, economics, systems science, and computer science. Initially, it addresses two-person zero-sum games, in which each participant’s gains or losses are exactly balanced by those of other participants. Game theory has recently found many applications in formulating network resource allocation problems and coordinating network entities’ behavior to achieve a stable operating point with global consensus property.</p>
<h2>Teaching assistant</h2>
<h2>Foreword</h2>
<p>I was the teaching assistant for the Game Theory M.Sc. and Ph.D. course taught by <a href="http://webpages.iust.ac.ir/vhakami" target="_blank">Dr. Vesal Hakami</a> for one semester (winter and spring 2020) at Iran University of Science and Technology. Our teaching materials are available here to view and download.</p>
<h3>Useful links</h3>
<ul>
<li><a href="https://www.dropbox.com/s/tup6h56v6hvfxe5/hw01.pdf?dl=0" target="_blank">Homework #1 (<span class="caps">PDF</span>)</a></li>
<li><a href="https://www.dropbox.com/s/1holxfir0f5qfmn/hw02.pdf?dl=0" target="_blank">Homework #2 (<span class="caps">PDF</span>)</a></li>
<li><a href="https://www.dropbox.com/s/p36nebp41n0j98o/hw03.pdf?dl=0" target="_blank">Homework #3 (<span class="caps">PDF</span>)</a></li>
<li><a href="https://www.dropbox.com/s/zdxudn8hyj4empo/GT982_online_midterm_instructions.pdf?dl=0" target="_blank">Midterm instructions</a></li>
</ul>WordPress for beginning2019-04-02T02:00:00+04:302019-04-02T02:00:00+04:30Mortezatag:m-zakeri.github.io,2019-04-02:/wordpress-for-beginning.html<p>WordPress essential training for beginners</p><p>Welcome to our WordPress essential training for beginners: <strong>WordPress for beginning</strong>.
WordPress is a free and open-source content management system (<span class="caps">CMS</span>) based on the <span class="caps">PHP</span> programming language and the MySQL database. Its features include a plugin architecture and a template system. It is most associated with blogging but supports other types of web content, including more traditional mailing lists and forums, media galleries, and online stores. Its use by more than 60 million websites, including 30.6% of the top 10 million websites, makes it a popular <span class="caps">CMS</span> and web framework. The following video tutorials help you start building your own website with WordPress. The tutorials are available in Persian.</p>
<h2>Video tutorials</h2>
<h3>Part 0: Introduction and syllabus</h3>
<div id="15541545029611626"><script type="text/JavaScript" src="https://www.aparat.com/embed/KPqS7?data[rnddiv]=15541545029611626&data[responsive]=yes"></script></div>
<h3>Part 1: Internet and world wide web (www)</h3>
<div id="15541545263083211"><script type="text/JavaScript" src="https://www.aparat.com/embed/xM7sP?data[rnddiv]=15541545263083211&data[responsive]=yes"></script></div>
<h3>Part 2: Content management systems (CMSs)</h3>
<div id="15541545411872107"><script type="text/JavaScript" src="https://www.aparat.com/embed/HPfO4?data[rnddiv]=15541545411872107&data[responsive]=yes"></script></div>
<h3>Part 3: Installing and configuring WordPress on cPanel</h3>
<p>Coming soon</p>
<h3>Part 4: Working with WordPress admin panel (dashboard)</h3>
<p>Coming soon</p>
<h3>Part 5: Plugin management</h3>
<p>Coming soon</p>
<h3>Part 6: Media management</h3>
<p>Coming soon</p>
<h3>Part 7: User management</h3>Children and programming2019-03-10T21:13:00+03:302019-03-10T21:13:00+03:30Mortezatag:m-zakeri.github.io,2019-03-10:/children-and-programming.html<p>Getting started: Teach computer programming to your children, today!</p><p><img alt="Children and Programming " src="../static/img/children-and-programming.jpg"></p>
<blockquote>
<p><span class="dquo">“</span>Children are a free resource of positive energy.” </p>
</blockquote>
<p>I love teaching computer programming to children. For many obvious reasons, I strongly believe that the 21st century is the century of <strong><span class="caps">CODE</span></strong>.
But where should we start?
What can we do for our 21st-century kids?</p>
<p>There are many ways to teach programming to a kid. I began with the <a href="https://scratch.mit.edu" target="_blank">Scratch</a> programming tool to teach some children the basics of computer programming.
Typically, children love computer games and enjoy playing them. However, I found that they enjoy learning how computer games are made even more.</p>
<p>Scratch is a block-based visual programming language targeted primarily at children. It is used as an introductory language because creating interesting programs with it is relatively easy, and the skills learned can be applied to other programming languages such as Python and Java. Although Scratch’s main user age group is 8–18 years of age, it has also been created with educators and parents in mind.</p>
<p>You can see the results of teaching programming to my nephew, Yasin, in <a href="https://www.limoonad.com/course/193887/%D8%A2%D9%85%D9%88%D8%B2%D8%B4-%D8%A8%D8%B1%D9%86%D8%A7%D9%85%D9%87-%D9%86%D9%88%DB%8C%D8%B3%DB%8C-%D8%A8%D9%87-%DA%A9%D9%88%D8%AF%DA%A9%D8%A7%D9%86-%D8%AF%D8%A7%D9%86%D8%B4-%D8%A2%D9%85%D9%88%D8%B2%D8%A7%D9%86-%D8%A7%D8%B3%DA%A9%D8%B1%DA%86-scratch">these video tutorials</a>, which he prepared.</p>A survey of sequence-to-sequence learning with neural networks2019-02-22T12:30:00+03:302019-02-22T12:30:00+03:30Mortezatag:m-zakeri.github.io,2019-02-22:/a-survey-of-sequence-to-sequence-learning-with-neural-networks.html<p>A survey of sequence-to-sequence learning with deep neural networks.</p><p><a href="https://boute.s3.amazonaws.com/290-IUST_logo_color.png"><b>School of Computer Engineering</b></a></p>
<p><strong><h1 align = "center">Sequence-to-Sequence Learning with Neural Networks</h1></strong></p>
<h4 align="center">
<b> Morteza Zakeri (M - Z A K E R I [ A T ] L I V E [ D O T ] C O M)
</b>
</h4>
<ul>
<li><a href="https://www.dropbox.com/s/gii9b6ncz28xfbn/nmt_lstm_sequence2sequence.zip?dl=0">Implementation + dataset - <span class="caps">ZIP</span> file (18.50 MB)</a> <em>Bahman 1396</em></li>
<li><a href="https://www.dropbox.com/s/3gris6459iw5tfp/Zakeri_NLP961_project_p3_final.pdf?dl=0">Phase 3 (final) - <span class="caps">PDF</span> version (1.80 MB)</a> <em>Bahman 1396</em></li>
<li>Presentation (<a href="https://www.dropbox.com/s/x8i9nwo8n1l3tlk/Zakeri_NLP961_project_p3_presentation.pdf?dl=0">slides (1.92 MB)</a> | <a href="https://www.dropbox.com/s/43xs7e0kmnil5ku/Zakeri_NLP961_project_p3_talk.mp4?dl=0">video (53.50 MB)</a>) <em>Dey 1396</em></li>
<li><a href="https://www.dropbox.com/s/mtngihanlci090r/Zakeri_NLP961_project_p2_rc02.pdf?dl=0">Phase 2 - <span class="caps">PDF</span> version (1.75 MB)</a> <em>Azar 1396</em></li>
<li><a href="https://www.dropbox.com/s/b3n2nfc1hgmvkhh/Zakeri_NLP961_project_p1_rc01.pdf?dl=0">Phase 1 - <span class="caps">PDF</span> version (1.23 MB)</a> <em>Aban 1396</em></li>
<li><a href="https://www.dropbox.com/s/xd5y5zxrvm9f67d/Zakeri_NLP961_project_figs.zip?dl=0">Figures - <span class="caps">ZIP</span> file (2.22 MB)</a></li>
<li><a href="https://www.dropbox.com/s/ikyd12fytka91f3/2014_sequence-to-sequence-learning-with-neural-networks.pdf?dl=0">Original paper (165 KB)</a></li>
<li>Last updated: 1396-11-19</li>
</ul>
<hr>
<p><strong><h2>Abstract</h2></strong>Deep learning is a relatively new branch of machine learning in which computational functions are composed as multi-level, or deep, graphs to identify and approximate the rule governing the solution of a complex problem. Deep neural networks are the tool for designing and implementing this learning model, and they have proven successful in many hard machine learning tasks. To apply deep networks to tasks in which the order of the input data matters, such as most natural language processing tasks, recurrent neural networks were devised; they provide a suitable representation of language models. In their simple form, however, these models are not suitable for every language-modeling task. This report examines a particular class of recurrent networks, the sequence-to-sequence or encoder-decoder model, which was developed for tasks whose input and output sequences have different lengths, such as machine translation, and which has produced acceptable results in this area.
<strong>Keywords:</strong> sequence-to-sequence model, recurrent neural network, deep learning, machine translation.</p>
<p><br/></p>
<h1><strong>Introduction</strong></h1>
<p>Learning models and methods based on deep neural networks (DNNs)<sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup> have recently received a great deal of attention, thanks to increases in the computing power of hardware and to solutions for some of the fundamental challenges in training these networks. DNNs have proven extraordinarily powerful at hard machine learning tasks such as speech recognition and object recognition, and in some cases have entirely displaced traditional methods. The great representational power of DNNs comes from their ability to perform many computations in parallel across several layers, approximating the solution of a given problem with a large number of parameters and thus providing a suitable model of it. Large DNNs can now be trained with the backpropagation<sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup> algorithm, in a supervised<sup id="fnref:3"><a class="footnote-ref" href="#fn:3">3</a></sup> fashion, on a sufficiently large labeled training set. Therefore, whenever the rule governing a problem has very many parameters and an optimal setting of those parameters exists (merely on the evidence that the human brain solves the same problem very quickly), backpropagation learning finds that setting (the optimal values) and solves the problem [1].
Many machine learning tasks belong to the field of natural language processing (<span class="caps">NLP</span>)<sup id="fnref:4"><a class="footnote-ref" href="#fn:4">4</a></sup>, where the order of a problem’s inputs and outputs usually matters. In machine translation, for example, two sentences with the same words in different orders have different meanings (outputs). Such tasks are said to be sequence-based<sup id="fnref:5"><a class="footnote-ref" href="#fn:5">5</a></sup>: their input is a sequence. Deep feed-forward neural networks<sup id="fnref:6"><a class="footnote-ref" href="#fn:6">6</a></sup> do not perform well on this class of tasks, because no mechanism for memorizing and modeling order is built into them. Recurrent neural networks (RNNs)<sup id="fnref:7"><a class="footnote-ref" href="#fn:7">7</a></sup> are a family of neural networks for processing sequence-based tasks. Just as convolutional neural networks (CNNs)<sup id="fnref:8"><a class="footnote-ref" href="#fn:8">8</a></sup> are designed specifically for processing a grid<sup id="fnref:9"><a class="footnote-ref" href="#fn:9">9</a></sup> of values, for example an image, an <span class="caps">RNN</span> is built for processing a sequence of input values $$ x = \langle x^{(1)}, x^{(2)}, \dots, x^{(n)} \rangle $$ [2]. In most tasks the output of an RNN, like its input, is a sequence. This ability to process sequences makes neural networks highly suitable for <span class="caps">NLP</span> tasks.</p>
<h2><strong>Problem Statement and Motivation</strong></h2>
<p>Despite the flexibility and power of RNNs, in their simple form these networks map an input sequence of fixed length to an output sequence of the same length. This is a serious limitation: many important problems are best expressed as sequences whose lengths are not known in advance, and assuming a fixed, predetermined length for the input and output does not model the problem well. Machine translation (<span class="caps">MT</span>)<sup id="fnref:10"><a class="footnote-ref" href="#fn:10">10</a></sup> and speech recognition<sup id="fnref:11"><a class="footnote-ref" href="#fn:11">11</a></sup> are problems of this kind. A question-answering system can likewise be viewed as mapping a sequence of words (the question) to another sequence of words (the answer). It is therefore clearly useful and justified to create a domain-independent method for learning sequence-to-sequence mappings [1].</p>
<h2><strong>Goals and Approach</strong></h2>
<p>As we have seen, a wide range of <span class="caps">NLP</span> tasks amounts to mapping variable-length sequences of unknown length onto each other. Traditional methods such as n-grams have their own limitations on this class of problems, and deep learning methods have clearly been promising. The goal is therefore to present an RNN-based model for sequence-to-sequence mapping. This report describes the approach proposed in [1] and its results in detail.
Sutskever et al. [1] showed how a straightforward application of a network with the long short-term memory (<span class="caps">LSTM</span>)<sup id="fnref:12"><a class="footnote-ref" href="#fn:12">12</a></sup> architecture can solve sequence-to-sequence mapping problems. The main idea is to use one <span class="caps">LSTM</span> to read the input sequence, one element per time step, to obtain a large fixed-dimensional vector, and then to use another <span class="caps">LSTM</span> to extract the output sequence from that vector. The second <span class="caps">LSTM</span> is exactly an RNN-based language model, except that it is also conditioned on the input sequence. The ability of the <span class="caps">LSTM</span> to successfully learn the long-range dependencies hidden in sequences makes it well suited to this model. Figure (1) shows a general schematic of the model.
<img alt="Figure (1): A schematic of the sequence-to-sequence model, consisting of two RNNs. The model reads the sequence ABC as input and produces the sequence WXYZ as output. It stops predicting after emitting the EOS token [1]." src="https://boute.s3.amazonaws.com/290-fig1.PNG"></p>
<h2><strong>Data and Results</strong></h2>
<p>The model of the previous section was evaluated on the neural machine translation (<span class="caps">NMT</span>)<sup id="fnref:13"><a class="footnote-ref" href="#fn:13">13</a></sup> task, using the <span class="caps">WMT</span>’14 English-to-French translation dataset [3]. A smaller dataset, suitable for training experimental, non-production models, is available in [4]; it also includes English-to-Persian translations.
The results are as follows. On the <span class="caps">WMT</span>’14 dataset, directly extracting translations from an ensemble of five deep LSTMs with 380 million parameters achieved a <span class="caps">BLEU</span> score of 34.81, the highest score obtained by <span class="caps">NMT</span> up to the publication of that paper. By comparison, the <span class="caps">BLEU</span> score of a statistical machine translation (<span class="caps">SMT</span>)<sup id="fnref:14"><a class="footnote-ref" href="#fn:14">14</a></sup> system on the same dataset is 33.30. Moreover, the 34.81 score was obtained with a vocabulary of 80,000 words and was penalized whenever a word appearing in the reference translation was out of vocabulary. The results thus show that a nearly unoptimized neural-network architecture, with much room for improvement, can beat the traditional phrase-based <span class="caps">SMT</span> system [1].</p>
<p><br/></p>
<h1><strong>Basic Concepts</strong></h1>
<p>This section briefly introduces the three main concepts of this report: the language model (<span class="caps">LM</span>)<sup id="fnref:15"><a class="footnote-ref" href="#fn:15">15</a></sup>, recurrent neural networks, and neural machine translation.</p>
<h2><strong>Language Model</strong></h2>
<p>The language model is a basic concept in <span class="caps">NLP</span> that makes it possible to predict the next token in a sequence. More precisely, an <span class="caps">LM</span> is a probability distribution over a sequence of tokens (usually words) that specifies the probability of a given sequence occurring. Among several given sequences, for example several sentences, one can thus pick the most probable [5]. For the sequence
$$ x = \langle x^{(1)}, x^{(2)}, \dots, x^{(n)} \rangle $$
the <span class="caps">LM</span> is:
<img alt="Equation 1" src="https://boute.s3.amazonaws.com/290-rel1.PNG">
To overcome computational challenges, traditional n-gram models use the Markov assumption to restrict relation (1) to only the previous n-1 tokens; they are therefore unsuitable for long (more than 4 or 5 tokens) and unseen sequences. Neural language models (NLMs)<sup id="fnref:16"><a class="footnote-ref" href="#fn:16">16</a></sup>, which predict the next word using neural networks, were at first combined with n-grams to assist them, which introduced considerable complexity while the problem of long sequences remained [5]. Recently, however, new <span class="caps">LM</span> architectures based entirely on DNNs have been created. The cornerstone of these architectures is the <span class="caps">RNN</span>, introduced in the next section.</p>
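The chain-rule factorization of relation (1) can be illustrated with a toy conditional model; the bigram table below is purely illustrative and not part of the original report.

```python
import math

# Hedged sketch of relation (1): the probability of a sequence is the
# product of each token's conditional probability given its prefix.

def sequence_log_prob(tokens, cond_prob):
    """Sum of log p(x_t | x_1..x_{t-1}) over the sequence."""
    return sum(math.log(cond_prob(tokens[:t], tokens[t]))
               for t in range(len(tokens)))

# Toy conditional model: p(next | prefix) depends only on the last token
# (a bigram assumption, i.e. a Markov restriction of relation (1)).
bigram = {(None, "I"): 0.5, ("I", "am"): 0.8, ("am", "here"): 0.25}

def cond_prob(prefix, tok):
    prev = prefix[-1] if prefix else None
    return bigram.get((prev, tok), 1e-9)

p = math.exp(sequence_log_prob(["I", "am", "here"], cond_prob))
print(round(p, 3))  # 0.5 * 0.8 * 0.25 = 0.1
```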
<h2><strong>Recurrent Neural Networks</strong></h2>
<p>Recurrent neural networks are a class of neural networks expressed as a <strong><em>directed cyclic graph</em></strong>: the input of each hidden (or output) layer includes, besides the output of the previous layer, feedback from the previous time step. Figure (2) shows an <span class="caps">RNN</span>; as can be seen, the hidden layer also receives feedback from earlier steps. At each time step t (from t=1 to t=n), one vector x<sup>(t)</sup> of the input sequence
$$ x = \langle x^{(1)}, x^{(2)}, \dots, x^{(n)} \rangle $$
is processed. In general, the update (forward-pass<sup id="fnref:17"><a class="footnote-ref" href="#fn:17">17</a></sup>) equations of an <span class="caps">RNN</span> at step t are [2]:
<img alt="Equation 2" src="https://boute.s3.amazonaws.com/290-rel2_5.PNG">
where the bias vectors b and c and the matrices <em>U</em>, <em>V</em>, and <em>W</em> (the weights of the input-to-hidden, hidden-to-output, and hidden-to-hidden connections, respectively) form the network’s parameter set. Φ is the activation function, usually chosen to be ReLU<sup id="fnref:18"><a class="footnote-ref" href="#fn:18">18</a></sup> or the sigmoid<sup id="fnref:19"><a class="footnote-ref" href="#fn:19">19</a></sup>. The last layer is a softmax<sup id="fnref:20"><a class="footnote-ref" href="#fn:20">20</a></sup>, which gives the probability of each output token.
<img alt="Figure (2): The computational graph of an RNN that maps an input sequence of values x to an output sequence of values o. The output o is assumed to be unnormalized probabilities, so the actual network output ŷ is obtained by applying the softmax to o. Left: the RNN drawn with a recurrent edge. Right: the same network unrolled in time, each node labeled with its time step [2]." src="https://boute.s3.amazonaws.com/290-fig2.PNG">
Figure (2) shows an <span class="caps">RNN</span> with one hidden layer, but a deep <span class="caps">RNN</span> with several hidden layers is also possible, and the lengths of the input and output sequences can differ depending on the problem at hand. Karpathy [6] categorizes RNNs by the lengths of their input and output sequences; Figure (3) shows this categorization.
<img alt="Figure (3): A schematic of different RNN configurations. (a): a standard neural network, (b): a one-to-many network, (c): a many-to-one network, (d) and (e): many-to-many networks [6]." src="https://boute.s3.amazonaws.com/290-fig3.PNG">
Karpathy’s depiction of RNN configurations postdates the paper surveyed in this report; nevertheless, Section 4 shows how combining these schemes also inspires the sequence-to-sequence architecture.</p>
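The update equations above can be sketched as a minimal NumPy forward pass. The dimensions are illustrative, and tanh stands in for the activation Φ; this is a sketch of the general equations in [2], not the surveyed paper's LSTM.

```python
import numpy as np

# Minimal sketch of the RNN forward pass:
#   h_t = tanh(b + W h_{t-1} + U x_t)
#   o_t = c + V h_t,  y_hat_t = softmax(o_t)

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 5
U = rng.normal(size=(n_hid, n_in))   # input -> hidden weights
W = rng.normal(size=(n_hid, n_hid))  # hidden -> hidden (recurrent)
V = rng.normal(size=(n_out, n_hid))  # hidden -> output weights
b = np.zeros(n_hid)                  # hidden bias
c = np.zeros(n_out)                  # output bias

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_forward(xs):
    """Process the input sequence one vector per time step."""
    h = np.zeros(n_hid)
    ys = []
    for x in xs:
        h = np.tanh(b + W @ h + U @ x)   # hidden state update
        ys.append(softmax(c + V @ h))    # per-step output distribution
    return ys

ys = rnn_forward([rng.normal(size=n_in) for _ in range(3)])
assert len(ys) == 3 and abs(ys[0].sum() - 1.0) < 1e-9
```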
<h2><strong>Neural Machine Translation</strong></h2>
<p>In general, <span class="caps">MT</span> can be modeled as an <span class="caps">LM</span> conditioned on the source-language sentence. Accordingly, <span class="caps">NMT</span> can be regarded as a recurrent language model that directly models the conditional probability p(y|x) of translating a source-language sentence
$$ x = \langle x^{(1)}, x^{(2)}, \dots, x^{(n)} \rangle $$
into a target-language sentence
$$ y = \langle y^{(1)}, y^{(2)}, \dots, y^{(m)} \rangle $$
Note that the length of the source sentence, n, and that of the target sentence, m, are not necessarily equal. The goal in <span class="caps">NMT</span> is thus to compute this probability and then to use it to generate target-language sentences, both by means of DNNs [5].</p>
<p><br/></p>
<h1><strong>Related Work</strong></h1>
<p>A great deal of work has been done on NLMs. Most of it uses feed-forward or recurrent neural networks, typically applied to an <span class="caps">MT</span> task by rescoring the n-best list<sup id="fnref:21"><a class="footnote-ref" href="#fn:21">21</a></sup>, and the results have generally improved on previous scores [1].
More recently, there has been work on compressing source-language information into the <span class="caps">NLM</span>. For instance, Auli et al. [7] combined an <span class="caps">NLM</span> with a topic model<sup id="fnref:22"><a class="footnote-ref" href="#fn:22">22</a></sup> of the input sentence, which improved results. The work in [1] is closest to [8], whose authors were the first to compress the input sequence into a vector and then convert it into the output sequence; however, they used CNNs for the sequence-to-vector mapping, which does not preserve word order. Cho et al. [9] used an LSTM-like architecture to map the input sequence to a vector, extract the output sequence from it, and finally combine the result with <span class="caps">SMT</span>. Their architecture consists of two RNNs, called the encoder and the decoder: the first <span class="caps">RNN</span> converts a variable-length sequence into a fixed-length vector in the form of a context cell c, and the second <span class="caps">RNN</span> generates the output sequence conditioned on c and the target-language start symbol. Their proposed architecture, known as the <span class="caps">RNN</span> encoder-decoder, is shown in Figure (4). Since they did not use LSTMs and focused mainly on combining this method with earlier <span class="caps">SMT</span> models, the problem of losing memory over long input and output sequences remains.
Bahdanau et al. [10] proposed a direct neural translation method that uses an <em>attention</em> mechanism to overcome the weak performance of [9] on long sentences, and achieved good results.</p>
<p><img alt="Figure (4): The RNN encoder-decoder model, used to learn to generate the output sequence y from the input sequence x by extracting the memory cell c from the input sequence [2]." src="https://boute.s3.amazonaws.com/290-fig4.PNG"></p>
<p><br/></p>
<h1><strong>The Sequence-to-Sequence Model</strong></h1>
<p>The sequence-to-sequence model uses two RNNs with <span class="caps">LSTM</span> units. The goal of the <span class="caps">LSTM</span> here is to estimate the conditional probability
$$ p(\langle y^{(1)}, \dots, y^{(m)} \rangle \mid \langle x^{(1)}, \dots, x^{(n)} \rangle) $$
seen earlier (Section 2-3). The <span class="caps">LSTM</span> computes this conditional probability by first obtaining the fixed-dimensional representation v of the input sequence
$$ \langle x^{(1)}, \dots, x^{(n)} \rangle $$
from its last hidden state, and then computing the probability of
$$ \langle y^{(1)}, \dots, y^{(m)} \rangle $$
with the standard <span class="caps">LM</span> formulation (relation (1)), with the initial hidden state set as given in the following relation:
<img alt="Equation 6" src="https://boute.s3.amazonaws.com/290-rel6.PNG">
In relation (6), each probability distribution
$$ p(y^{(t)} \mid v, y^{(1)}, \dots, y^{(t-1)}) $$
is represented by a softmax over all the words in the vocabulary. The <span class="caps">LSTM</span> equations of [11] are used. Each sentence in this model must be terminated by a special symbol such as <span class="caps">EOS</span>, which enables the model to define a probability distribution over sequences of any length. The overall scheme of the model is shown in Figure (1): the <span class="caps">LSTM</span> computes the representation of the input sequence
$$ \langle \mathrm{A}, \mathrm{B}, \mathrm{C}, \mathrm{EOS} \rangle $$
and then uses this representation to compute the probability of the output sequence<br>
$$ \langle \mathrm{W}, \mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{EOS} \rangle $$
At the same time, this model can be seen as a combination of parts (c) and (d) of Figure (3).
The model as implemented differs from the one introduced above in three ways. First, two separate LSTMs are used, one for the input sequence and one for the output sequence, because doing so greatly increases the number of model parameters at negligible computational cost. Second, deep LSTMs significantly outperform shallow LSTMs, so an <span class="caps">LSTM</span> four layers deep is used. Third, the authors found that reversing the input sequence has a remarkable effect on the convergence speed of training and on prediction accuracy. Therefore, instead of mapping the sequence a,b,c directly to the sequence α, β, γ, the <span class="caps">LSTM</span> is trained to map c,b,a to α, β, γ, where α, β, γ is the translation (output) corresponding to a,b,c. The explanation for this phenomenon is that with the reversed mapping, the beginnings of the corresponding phrases are brought close together, which makes the stochastic gradient descent (<span class="caps">SGD</span>) algorithm converge sooner and approach the optimal values [1].</p>
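The encoder-decoder flow, including the input reversal, can be sketched with toy step functions; these stand in for the LSTM cells and are assumptions for illustration only.

```python
# Hedged toy sketch of the sequence-to-sequence scheme: one network
# compresses the (reversed) input into a fixed summary v; a second
# network emits output tokens until it produces EOS.

def encode(xs, step, h0=0.0):
    h = h0
    for x in reversed(xs):          # feed the input in reverse order
        h = step(h, x)
    return h                        # v: fixed-size summary of the input

def decode(v, step, start, eos, max_len=10):
    ys, h, y = [], v, start
    while len(ys) < max_len:
        h, y = step(h, y)           # predict the next token
        if y == eos:                # EOS ends generation, so sequences
            break                   # of any length get a probability
        ys.append(y)
    return ys

# Toy cells: the encoder sums its inputs; the decoder emits one token
# per 2 units of "state" left in v, then EOS.
enc_step = lambda h, x: h + x

def dec_step(h, y):
    return h - 2, ("tok" if h > 2 else "<EOS>")

v = encode([1, 2, 3], enc_step)     # v == 6
assert decode(v, dec_step, "<GO>", "<EOS>") == ["tok", "tok"]
```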
<h2><strong>Training the Network</strong></h2>
<p>Since its introduction by Sutskever et al. [1], the sequence-to-sequence model has been cited time and again and has become a reference model in <span class="caps">NMT</span>. It is also explained in detail, with some refinements, in Luong's PhD dissertation [5]. In this section we go through some details of training the sequence-to-sequence network.
Figure (5) shows a more detailed depiction of the model of Figure (1). Training proceeds as follows. First, the target-language sentence is placed to the right of its corresponding source-language sentence. The '-' marker here plays the role of <span class="caps">EOS</span>; it can mark either the end of the source sentence or the beginning of the target sentence, and thus belongs to either of the two. The left <span class="caps">LSTM</span>, the encoder network, reads one word of the source sentence at each time step, converts it to a suitable representation, and updates the internal state of its hidden layer. When the last word has been processed, the hidden-layer values form the fixed vector that now represents the entire source-language input sentence. The second <span class="caps">LSTM</span>, the decoder network, then receives the first target-language word together with the vector v as input and makes its prediction. The true label for this input is the next word of the target sentence. After the error is computed, backpropagation is run over both networks, starting from the decoder, adjusting the parameters in the direction opposite to the gradient. This process continues until the end of the target sentence. In practice, the input may be fed to the network as a batch<sup id="fnref:23"><a class="footnote-ref" href="#fn:23">23</a></sup> and the gradient computed over the whole batch. In other words, the decoder network is trained to map the target sentence to that same target sentence with its words shifted one step ahead of the input. This technique is called teacher forcing [2] and is appropriate when the target sentence (the output sequence) is fully known: the next word serves as the label in a supervised training process, and the weights are adjusted accordingly.
<img alt="Figure (5): How the sequence-to-sequence model operates and is trained on the neural machine translation task [5]." src="https://boute.s3.amazonaws.com/290-fig5wc.PNG"></p>
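<p>The shift-by-one relation between the decoder's input and its per-step labels under teacher forcing can be sketched as follows (a toy illustration, not code from [1]; the start and end markers are invented):</p>

```python
def teacher_forcing_pairs(target_tokens):
    """Build the decoder input and per-step labels for teacher forcing:
    the label at step t is simply the target token at step t + 1."""
    decoder_input = target_tokens[:-1]   # begins with the start marker
    decoder_labels = target_tokens[1:]   # the same sentence, shifted one step ahead
    return decoder_input, decoder_labels

inp, labels = teacher_forcing_pairs(["<s>", "le", "chat", "</s>"])
```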
<p>At inference time<sup id="fnref:27"><a class="footnote-ref" href="#fn:27">27</a></sup>, that is, when we want to decode an unknown target-language sentence (output sequence), the process described above is carried out with slight differences, in the following steps:
1. The input sequence is converted to the context vector by the encoder network. If <span class="caps">LSTM</span> cells are used, the context vector contains two state variables per network layer; with <span class="caps">GRU</span> cells it contains one state variable per layer.
2. A sequence of length 1, initially containing the start-of-sentence token of the target language, is placed at the input of the decoder network.
3. The context vector from step 1 and the sequence from step 2 are fed to the decoder network, which predicts the next token (here, the next word) of the target sentence.
4. The prediction from step 3 is sampled (using greedy selection or the local beam search explained below) and the next word is chosen.
5. The word chosen in step 4 is appended to the target sentence (the output sequence).
6. The word chosen in step 4 is fed to the decoder in place of the start-of-sentence token, and steps 3 to 5 are repeated until the end-of-sentence token is generated or the length of the generated sentence exceeds a predefined limit.
One further point worth noting is that the input sequence at this stage is drawn from the test set: inference is performed on the test data, for the purpose of evaluating the model.</p>
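<p>The six inference steps above can be sketched as a greedy decoding loop. The encode and decode_step callables stand in for a trained encoder and decoder; they are purely hypothetical interfaces invented for this sketch:</p>

```python
def greedy_decode(encode, decode_step, start_token, end_token, max_len=20):
    """Greedy realization of the inference steps above.
    `encode(src)` returns the context state (step 1) and
    `decode_step(state, token)` returns a dict of next-token
    probabilities plus the new state (step 3)."""
    def run(src):
        state = encode(src)                          # step 1
        token, output = start_token, []              # step 2
        for _ in range(max_len):
            probs, state = decode_step(state, token)  # step 3
            token = max(probs, key=probs.get)         # step 4, greedy choice
            if token == end_token:                    # stop on end-of-sentence
                break
            output.append(token)                      # step 5
        return output                                 # step 6 is the loop above
    return run
```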
<h2><strong>Network Training Details</strong></h2>
<p>The paper [1] uses a deep <span class="caps">LSTM</span> with four layers and 1000 memory cells per layer, an input vocabulary of 160,000 words, and an output vocabulary of 80,000 words. The result is an <span class="caps">LSTM</span> network with 380 million parameters in total, of which 64 million are recurrent connections. Further details of the parameters and training are:</p>
<ul>
<li>The parameters were initialized with random values drawn uniformly from the interval [-0.08, +0.08].</li>
<li>Standard <span class="caps">SGD</span> with a learning rate of 0.7 was used for training. After five epochs<sup id="fnref:24"><a class="footnote-ref" href="#fn:24">24</a></sup>, the learning rate was halved every half epoch. Training ran for a total of 7.5 epochs.</li>
<li>The gradient was computed over batches of 128 sequences and divided by the batch size, 128.</li>
<li>Although LSTMs do not suffer from the vanishing-gradient problem<sup id="fnref:25"><a class="footnote-ref" href="#fn:25">25</a></sup>, they can exhibit exploding gradients<sup id="fnref:26"><a class="footnote-ref" href="#fn:26">26</a></sup>. A hard constraint is therefore placed on the norm of the gradient: whenever the norm exceeds a threshold, the gradient is rescaled. For each batch in the training set, the quantity
$$ s = { \lVert g \rVert }_{ 2 } $$</li>
</ul>
<p>is computed, where g is the gradient after division by 128. If s > 5, we set:
$$ g = \frac{ 5g }{ s }. $$
Sentences have varying lengths. Most are short (20 to 30 words), but some are long (over 100 words), so a batch of 128 randomly chosen sentences contains many short sentences and only a few long ones, and much of the computation within each batch is wasted. To overcome this, the sentences within each batch were chosen to be of roughly equal length, which sped up computation by a factor of 2.</p>
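<p>The gradient-norm constraint described above can be written in a few lines of NumPy (a sketch of the rule, not the paper's implementation):</p>

```python
import numpy as np

def clip_gradient(g, threshold=5.0):
    """Hard constraint on the gradient norm: compute s = ||g||_2 and,
    if s exceeds the threshold, rescale the gradient to g * threshold / s."""
    s = np.linalg.norm(g)
    if s > threshold:
        return g * (threshold / s)
    return g
```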
<p><br/></p>
<h1><strong>Experiments</strong></h1>
<p>The sequence-to-sequence learning method introduced above was tested on the English-to-French machine translation task in two different settings. In the first, the model was used to translate English sentences directly into French; in the second, it was used to rescore the n-best lists of sentences produced by an <span class="caps">SMT</span> system. This section reports the results of the experiments: the translation scores obtained, sample translated sentences, and finally a visualization of the input-sentence representations.</p>
<h2><strong>Implementation</strong></h2>
<p>The original model was implemented in C++. With the deep <span class="caps">LSTM</span> configured as described in Section 4-1-2, this implementation processes roughly 1,700 words per second on a single <span class="caps">GPU</span>, which is far too slow for a dataset as large as the <span class="caps">WMT</span> corpus. The model was therefore parallelized across 8 GPUs. Each <span class="caps">LSTM</span> layer runs on its own <span class="caps">GPU</span> and passes its activations to the next <span class="caps">GPU</span>/layer as soon as they are computed. Since the model has four layers, the remaining four GPUs were used to parallelize the softmax, each responsible for one matrix multiplication (a 1000 × 20000 matrix). This <span class="caps">GPU</span>-level parallelization reaches a processing speed of 6,300 words per second; training with this implementation took 10 days [1].
In addition to the original implementation, others have been published in various languages and frameworks, including two good Python implementations for the TensorFlow and Keras frameworks. The TensorFlow implementation also adds newer mechanisms such as <em>attention</em> [12], while the Keras implementation works at the <strong>character level</strong> rather than the word level [13]. Although machine translation is the chosen task in all of these implementations, the model is general and can be applied to any task that involves mapping an input sequence to an output sequence of a different length.</p>
<h2><strong>Dataset Details</strong></h2>
<p>As mentioned earlier (Section 3-1), the experiments use the <span class="caps">WMT</span>'14 English-to-French translation dataset [3]. The model described was trained on a 12-million-sentence subset consisting of 348 million French words and 340 million English words. The machine translation task, and this particular dataset, were chosen because of the public availability of a tokenized<sup id="fnref:29"><a class="footnote-ref" href="#fn:29">28</a></sup> training set and test set for training and evaluating the model; the sequence-to-sequence model itself is independent of any specific task.
Just as ordinary neural language models rely on a vector representation for each word, a fixed-size vocabulary is used here for both languages. To this end, the 160,000 most frequent words of the source language (English) and the 80,000 most frequent words of the target language (French) were selected. Every out-of-vocabulary word appearing in a sentence is replaced with the special token &#8220;<span class="caps">UNK</span>&#8221;.
The implementation of [12] uses the <span class="caps">WMT</span>'16 German-English translation dataset [14], and the sample model implemented in [13] uses the smaller dataset available at [4], which can also be swapped for the datasets mentioned above. The fundamental drawback of the character-level implementation [13] is that in machine translation it is usually words, not characters, that correspond to one another, so this model does not reach the accuracy of word-level models; it does, however, give a good idea of how to use the approach in other sequence-to-sequence tasks such as text generation.</p>
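<p>Building the fixed-size vocabularies and replacing out-of-vocabulary words with the UNK token can be sketched as follows (an illustrative sketch; the helper names are our own, not from [1]):</p>

```python
from collections import Counter

def build_vocab(corpus_tokens, size):
    # keep only the `size` most frequent words of the corpus
    return {w for w, _ in Counter(corpus_tokens).most_common(size)}

def replace_oov(sentence, vocab, unk="UNK"):
    # every word outside the vocabulary is replaced with the UNK token
    return [w if w in vocab else unk for w in sentence]
```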
<h2><strong>Decoding and Rescoring</strong></h2>
<p>The core of the experiments in [1] is training a large deep <span class="caps">LSTM</span> on many pairs of source- and target-language sentences. Training maximizes the log probability of a correct translation T given the source sentence S, so the training objective is:
<img alt="Equation 7" src="https://boute.s3.amazonaws.com/290-rel7.PNG">
where <strong>S</strong> is the training set. Once training is complete, translations are produced by finding the most likely translation according to the <span class="caps">LSTM</span>:
<img alt="Equation 8" src="https://boute.s3.amazonaws.com/290-rel8.PNG">
To find the most likely translation, a simple left-to-right local beam search<sup id="fnref:30"><a class="footnote-ref" href="#fn:30">29</a></sup> decoder is used that maintains B partial hypotheses<sup id="fnref:31"><a class="footnote-ref" href="#fn:31">30</a></sup>, each a prefix of some translation. At every time step, each partial hypothesis is extended with the possible words from the vocabulary. This quickly multiplies the number of partial hypotheses, so all but the B most likely hypotheses under the model's log probability are discarded. As soon as the &#8220;<span class="caps">EOS</span>&#8221; token is appended to a hypothesis, it is removed from the beam search and added to the set of complete hypotheses. Although this decoder is approximate, it is easy to implement; the proposed system performs well even with a beam size of 1, and a beam size of 2 already provides most of the benefits of this kind of search. The <span class="caps">BLEU</span> scores obtained from the experiments on the model are listed in Table (1).</p>
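<p>The left-to-right beam search described above can be sketched as follows. The step_fn callable, which returns a probability for each candidate next token given a prefix, stands in for the trained decoder and is a hypothetical interface invented for this sketch:</p>

```python
import math

def beam_search(step_fn, start, end, beam_size=2, max_len=10):
    """Left-to-right beam search keeping B partial hypotheses,
    each a prefix of a translation scored by its log probability."""
    beams = [([start], 0.0)]                 # (prefix, log-probability)
    complete = []
    for _ in range(max_len):
        candidates = []
        for prefix, lp in beams:
            for tok, p in step_fn(prefix).items():
                candidates.append((prefix + [tok], lp + math.log(p)))
        # keep only the B most probable hypotheses
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, lp in candidates[:beam_size]:
            # a hypothesis ending in EOS leaves the beam
            (complete if prefix[-1] == end else beams).append((prefix, lp))
        if not beams:
            break
    return max(complete + beams, key=lambda c: c[1])[0]
```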
<h2><strong>Reversing the Source Sentences</strong></h2>
<p>While the <span class="caps">LSTM</span> is capable of solving problems with long-term dependencies, the researchers found during the experiments of [1] that the <span class="caps">LSTM</span> learns much better when the source sentences are reversed before being fed to the encoder network. Note that the target sentences are not reversed. This simple trick reduced the model's perplexity<sup id="fnref:32"><a class="footnote-ref" href="#fn:32">31</a></sup> from 5.8 to 4.7 and raised the <span class="caps">BLEU</span> score of the model's decoded translations from 25.9 to 30.6.
The authors of [1] do not have a complete explanation for this phenomenon. Their initial justification is that reversing the source sentences introduces many short-term dependencies into the dataset. When a source sentence is concatenated with its target sentence, each word in the source sentence ends up far from its corresponding word in the target sentence; as a result, the problem has a very large <em>minimal time lag</em><sup id="fnref:33"><a class="footnote-ref" href="#fn:33">32</a></sup> [1]. Reversing the words of the source sentence does not change the average distance between corresponding source and target words, but the first few words of the source sentence now come very close to the first few words of the target sentence, so the problem's minimal time lag is greatly reduced. Backpropagation then needs less time to establish the connection between source-sentence and target-sentence words, which ultimately leads to a substantial improvement in the model's overall performance.
It was initially assumed that reversing the input sentences would only help make predictions of the first words of the target sentence more confident while making predictions of the final words less so. However, an <span class="caps">LSTM</span> trained on reversed source sentences performed better on long sentences than an ordinary <span class="caps">LSTM</span> (see Section 1-6). </p>
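<p>The source-reversal trick itself is a one-line transformation over the training pairs (an illustrative sketch, not the paper's code):</p>

```python
def reverse_sources(pairs):
    """Reverse each source sentence while leaving the target untouched,
    e.g. the pair (['a','b','c'], ['x','y','z']) becomes
    (['c','b','a'], ['x','y','z'])."""
    return [(list(reversed(src)), tgt) for src, tgt in pairs]
```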
<h2><strong>Evaluation of Results</strong></h2>
<p>The quality of the model's translations was evaluated with the automatic <span class="caps">BLEU</span> metric [16], computed with the ready-made multi-bleu.pl script<sup id="fnref:34"><a class="footnote-ref" href="#fn:34">33</a></sup>. This kind of scoring was also used in similar previous work [9] and [10], so it is reliable and makes the models comparable; for instance, this script produces a score of 28.45 for [10]. The results are presented in Tables (1) and (2). The best result was obtained from an ensemble of LSTMs that differed in their random initializations and in the random order of their mini-batches. Although the translation decoding mechanism used here (local beam search) is simple and weak, this is the first time that a pure neural machine translation system has beaten a phrase-based machine translation system by a sizeable margin. The system also lacks the ability to handle out-of-vocabulary words; as mentioned before, every word outside the vocabulary is replaced with &#8220;<span class="caps">UNK</span>&#8221;. Its performance would therefore improve further if a mechanism for handling such words were added to the model or the vocabulary size were increased.</p>
<p><br/></p>
<p align="center">
Table (1): Performance of the <span class="caps">LSTM</span> on the <span class="caps">WMT</span>'14 English-to-French test set (ntst14). Note that an ensemble of five LSTMs with a beam of size 2 is cheaper (lighter) than a single <span class="caps">LSTM</span> with a beam of size 12 [1].
</p>
<table>
<thead>
<tr>
<th style="text-align: center;"><strong>Method</strong></th>
<th style="text-align: center;"><strong><span class="caps">BLEU</span> score (ntst14)</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center;">Bahdanau et al. [10]</td>
<td style="text-align: center;">28.45</td>
</tr>
<tr>
<td style="text-align: center;">Single forward <span class="caps">LSTM</span>, beam size 12</td>
<td style="text-align: center;">26.17</td>
</tr>
<tr>
<td style="text-align: center;">Single reversed-input <span class="caps">LSTM</span>, beam size 12</td>
<td style="text-align: center;">30.59</td>
</tr>
<tr>
<td style="text-align: center;">Ensemble of 5 reversed-input LSTMs, beam size 1</td>
<td style="text-align: center;">33.00</td>
</tr>
<tr>
<td style="text-align: center;">Ensemble of 2 reversed-input LSTMs, beam size 12</td>
<td style="text-align: center;">33.27</td>
</tr>
<tr>
<td style="text-align: center;">Ensemble of 5 reversed-input LSTMs, beam size 2</td>
<td style="text-align: center;">34.50</td>
</tr>
<tr>
<td style="text-align: center;">Ensemble of 5 reversed-input LSTMs, beam size 12</td>
<td style="text-align: center;"><strong>34.81</strong></td>
</tr>
</tbody>
</table>
<p><br/></p>
<p align="center">
Table (2): Similar methods that use neural networks alongside a traditional machine translation system on the <span class="caps">WMT</span>'14 English-to-French dataset [1].
</p>
<table>
<thead>
<tr>
<th style="text-align: center;"><strong>Method</strong></th>
<th style="text-align: center;"><strong><span class="caps">BLEU</span> score (ntst14)</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center;">State of the art [15]</td>
<td style="text-align: center;"><strong>37.00</strong></td>
</tr>
<tr>
<td style="text-align: center;">Cho et al. [9]</td>
<td style="text-align: center;">34.54</td>
</tr>
<tr>
<td style="text-align: center;">Rescoring the 1000-best list with a single forward <span class="caps">LSTM</span></td>
<td style="text-align: center;">35.61</td>
</tr>
<tr>
<td style="text-align: center;">Rescoring the 1000-best list with a single reversed <span class="caps">LSTM</span></td>
<td style="text-align: center;">35.85</td>
</tr>
<tr>
<td style="text-align: center;">Rescoring the 1000-best list with an ensemble of 5 reversed LSTMs</td>
<td style="text-align: center;"><strong>36.50</strong></td>
</tr>
<tr>
<td style="text-align: center;">Oracle rescoring of the 1000-best list</td>
<td style="text-align: center;">~45</td>
</tr>
</tbody>
</table>
<h2><strong>Model Analysis</strong></h2>
<p>One of the attractive features of the sequence-to-sequence model of [1] is its ability to turn a sequence of words into a vector of fixed dimensionality. Figure (6) visualizes some of the representations learned during training. The figure clearly shows that the representations are sensitive to word order, since it uses sentences with the same words in different orders. The model's actual representations are higher-dimensional; <span class="caps">PCA</span> was used to project them onto two dimensions.</p>
<p><img alt="Figure (6): A two-dimensional PCA projection of the LSTM's hidden states, taken after processing the sentences shown in the figure. The phrases are clustered by meaning, which in these examples is mostly a function of word order; such a clustering is hard to achieve with existing traditional methods. Note that all the sentences use the same words and differ only in word order [1]. The small circles show the projected two-dimensional coordinates of each sentence." src="https://boute.s3.amazonaws.com/290-fig6.PNG"></p>
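<p>A projection of this kind can be reproduced with a small SVD-based PCA over the fixed-dimensional sentence vectors (an illustrative sketch; the paper does not publish its visualization code):</p>

```python
import numpy as np

def pca_2d(hidden_states):
    """Project high-dimensional sentence representations onto their
    first two principal components, the kind of projection used for
    two-dimensional visualizations like Figure (6)."""
    X = np.asarray(hidden_states, dtype=float)
    X = X - X.mean(axis=0)                       # center the data
    # the right singular vectors of the centered data are the principal axes
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T                          # shape: (n_sentences, 2)
```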
<h2><strong>Performance on Long Sentences</strong></h2>
<p>The model's output on long sentences (in terms of word count) confirms that the <span class="caps">LSTM</span> performs very well in this setting. A quantitative comparison of the results is shown in Figure (7), and Table (3) presents several long sentences together with the translations the model produced for them.
<br/></p>
<p><img alt="Figure (7): The left plot shows system performance as a function of sentence length; the horizontal axis is the true length of the sentences in words. There is no loss of score on sentences of fewer than 35 words, and only a minor degradation on the very longest sentences. The right plot shows the LSTM's performance on sentences with progressively rarer words; the horizontal axis is the test sentences sorted by the average frequency of their words [1]." src="https://boute.s3.amazonaws.com/290-fig7.PNG"></p>
<p align="center">Table (3): Three examples of long translations produced by the sequence-to-sequence model, alongside the reference translations. The reader can check the quality of the results reasonably well using Google Translate [1].</p>
<p><img alt="Table 3" src="https://boute.s3.amazonaws.com/290-table3.PNG"></p>
<p><br/></p>
<h1><strong>Conclusion and Future Work</strong></h1>
<p>This report presented and discussed a new deep learning model for learning to map a sequence of inputs to a sequence of outputs. It was shown that a deep <span class="caps">LSTM</span> network with a limited vocabulary can beat standard phrase-based machine translation systems with unlimited vocabularies on the machine translation task. The success of this relatively simple approach on machine translation suggests that, given enough training data, the model should also do very well on other sequence-based tasks.
During training it was also discovered that reversing the source sequence increases accuracy and improves the model's performance. One can conclude that any method of introducing short-term dependencies earlier makes training the model much simpler, so it seems that even a standard <span class="caps">RNN</span> (a non-sequence-to-sequence model) should train better with this trick. This has not been tested in practice, however, and therefore remains a hypothesis.
Another notable result is the LSTM's ability to correctly learn to translate long sequences. It was initially expected that the <span class="caps">LSTM</span> would fail on long sentences because of its limited memory, and other researchers had indeed reported poor <span class="caps">LSTM</span> performance in similar work. Nevertheless, on very long sentences, even in the reversed setting, the memory-degradation problem persists and probably leaves room for improvement. Finally, the satisfying results of this learning model show that a simple deep neural network model, which still has plenty of room for improvement and optimization, can beat the most mature traditional machine translation systems. Future work could focus on increasing the accuracy of the sequence-to-sequence model and making it more sophisticated so that it learns long sequences better; in the near future such models may render the traditional methods entirely obsolete. The results also indicate that this approach can succeed on other sequence-to-sequence mapping tasks, paving the way for solving a variety of problems in other fields of science.
This model could be used for machine translation of long English texts to Persian and vice versa. In that task, the effect of reversing the source sentence should be examined, because in right-to-left languages reversal may increase the minimal time lag and yield worse results.
The model can also be used in other tasks such as question answering. In content generation, using it to complete historical texts and poems with missing or lost parts seems interesting and worthwhile.
Beyond applying it to new tasks, modifying the architecture of the model itself is also suggested, in order to increase accuracy on the tasks mentioned. For example, using bidirectional, hybrid, or stateful <span class="caps">RNN</span>s in the encoder and decoder networks, using deeper layers, changing other network hyperparameters such as the learning rate, and adding an attention mechanism are all suggestions that could be used to build more accurate models. Moreover, in cases where not enough labeled data is available, or the full output sequence is not available at once (as in online learning or reinforcement learning), using the procedure described for the inference phase during training, instead of teacher forcing, appears to be a suitable approach. </p>
<p><br/></p>
<p><strong><h2>References</h2></strong></p>
<p>[1] I. Sutskever, O. Vinyals, and <span class="caps">Q. V.</span> Le, “Sequence to sequence learning with neural networks,” <em>NIPS</em>, pp. 1–9, 2014.
[2] I. Goodfellow, Y. Bengio, and A. Courville, <em>Deep learning</em>. <span class="caps">MIT</span> Press, 2016.
[3] “<span class="caps">ACL</span> 2014 ninth workshop on statistical machine translation.” [Online]. Available: http://www.statmt.org/wmt14/medical-task/index.html. [Accessed: 13-Nov-2017].
[4] “Tab-delimited bilingual sentence pairs from the Tatoeba project (good for Anki and similar flashcard applications).” [Online]. Available: http://www.manythings.org/anki/. [Accessed: 13-Nov-2017].
[5] <span class="caps">M. T.</span> Luong, “Neural machine translation,” Stanford university, 2016.
[6] A. Karpathy, “Connecting images and natural language,” Stanford University, 2016.
[7] M. Auli, M. Galley, C. Quirk, and G. Zweig, “Joint language and translation modeling with recurrent neural networks,” <em>EMNLP</em>, no. October, pp. 1044–1054, 2013.
[8] N. Kalchbrenner and P. Blunsom, “Recurrent continuous translation models,” <em>EMNLP</em>, no. October, pp. 1700–1709, 2013.
[9] K. Cho <em>et al.</em>, “Learning phrase representations using <span class="caps">RNN</span> encoder-decoder for statistical machine translation,” 2014.
[10] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” pp. 1–15, 2014.
[11] A. Graves, “Generating sequences with recurrent neural networks,” pp. 1–43, 2013.
[12] M.-T. Luong, E. Brevdo, and R. Zhao, “Neural machine translation (seq2seq) tutorial,” <em>https://github.com/tensorflow/nmt</em>, 2017.
[13] “Sequence to sequence example in Keras (character-level),” 2017. [Online]. Available: https://github.com/fchollet/keras/blob/master/examples/lstm_seq2seq.py. [Accessed: 13-Nov-2017].
[14] “Index of /wmt16/translation-task.” [Online]. Available: http://data.statmt.org/wmt16/translation-task/. [Accessed: 04-Dec-2017].
[15] N. Durrani, B. Haddow, P. Koehn, and K. Heafield, “Edinburgh’s phrase-based machine translation systems for <span class="caps">WMT</span>-14,” <em>Proc. Ninth Work. Stat. Mach. Transl.</em>, pp. 97–104, 2014.
[16] K. Papineni, S. Roukos, T. Ward, and W. Zhu, “<span class="caps">BLEU</span>: A method for automatic evaluation of machine translation,” <em>… 40th Annu. Meet. …</em>, no. July, pp. 311–318, 2002.</p>
<p><br/></p>
<p><strong><h2>Appendix A: Implementing the Sequence-to-Sequence Model in Keras</h2></strong>
This section describes the details of the sequence-to-sequence model implemented in Keras and the changes made to it. The code, together with the dataset, can be downloaded from the links at the beginning of this report. The implementation here works at the character level, that is, it performs machine translation character by character, even though word-level models are the norm for this task. Starting at the character level is simpler, and the model can later be trained at the word level simply by adding an embedding layer.
The training set is a text file in which each line contains an English phrase followed by its equivalent translation, the two separated by a \t character. The target sentence therefore begins with \t and ends with \n: the \t token signals the switch from encoder to decoder, and the \n token marks the end of the target sentence. We first read the input file line by line and build the input texts and target texts from it, then convert both to equivalent numeric vectors with one-hot encoding. The following snippet does this:</p>
<div class="highlight"><pre><span></span><code><span class="p">...</span><span class="w"></span>
<span class="n">input_token_index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">dict</span><span class="p">(</span><span class="o">[</span><span class="n">(char, i) for i, char in enumerate(input_characters)</span><span class="o">]</span><span class="p">)</span><span class="w"></span>
<span class="n">target_token_index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">dict</span><span class="p">(</span><span class="o">[</span><span class="n">(char, i) for i, char in enumerate(target_characters)</span><span class="o">]</span><span class="p">)</span><span class="w"></span>
<span class="n">encoder_input_data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="nf">len</span><span class="p">(</span><span class="n">input_texts</span><span class="p">),</span><span class="w"> </span><span class="n">max_encoder_seq_length</span><span class="p">,</span><span class="w"> </span><span class="n">num_encoder_tokens</span><span class="p">),</span><span class="w"> </span><span class="n">dtype</span><span class="o">=</span><span class="s1">'float32'</span><span class="p">)</span><span class="w"></span>
<span class="n">decoder_input_data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="nf">len</span><span class="p">(</span><span class="n">input_texts</span><span class="p">),</span><span class="w"> </span><span class="n">max_decoder_seq_length</span><span class="p">,</span><span class="w"> </span><span class="n">num_decoder_tokens</span><span class="p">),</span><span class="w"> </span><span class="n">dtype</span><span class="o">=</span><span class="s1">'float32'</span><span class="p">)</span><span class="w"></span>
<span class="n">decoder_target_data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="nf">len</span><span class="p">(</span><span class="n">input_texts</span><span class="p">),</span><span class="w"> </span><span class="n">max_decoder_seq_length</span><span class="p">,</span><span class="w"> </span><span class="n">num_decoder_tokens</span><span class="p">),</span><span class="n">dtype</span><span class="o">=</span><span class="s1">'float32'</span><span class="p">)</span><span class="w"></span>
<span class="err">#</span><span class="w"> </span><span class="n">bulid</span><span class="w"> </span><span class="n">one</span><span class="o">-</span><span class="n">hot</span><span class="w"> </span><span class="n">vector</span><span class="w"></span>
<span class="k">for</span><span class="w"> </span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="p">(</span><span class="n">input_text</span><span class="p">,</span><span class="w"> </span><span class="n">target_text</span><span class="p">)</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span class="n">enumerate</span><span class="p">(</span><span class="n">zip</span><span class="p">(</span><span class="n">input_texts</span><span class="p">,</span><span class="w"> </span><span class="n">target_texts</span><span class="p">))</span><span class="err">:</span><span class="w"></span>
<span class="k">for</span><span class="w"> </span><span class="n">t</span><span class="p">,</span><span class="w"> </span><span class="nc">char</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span class="n">enumerate</span><span class="p">(</span><span class="n">input_text</span><span class="p">)</span><span class="err">:</span><span class="w"></span>
<span class="w"> </span><span class="n">encoder_input_data</span><span class="o">[</span><span class="n">i, t, input_token_index[char</span><span class="o">]</span><span class="err">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">1.</span><span class="w"></span>
for t, char in enumerate(target_text):
    # decoder_target_data is ahead of decoder_input_data by one timestep
    decoder_input_data[i, t, target_token_index[char]] = 1.
    if t > 0:
        # decoder_target_data will be ahead by one timestep
        # and will not include the start character.
        decoder_target_data[i, t - 1, target_token_index[char]] = 1.
...
</code></pre></div>
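<p>To make the one-timestep offset between decoder_input_data and decoder_target_data concrete, here is a toy illustration in plain Python (the sample sentence is hypothetical; the tab and newline start/stop markers follow the conventions used in this article):</p>

```python
# A toy target text: '\t' marks the start of the sentence, '\n' its end.
target_text = '\tbonjour\n'

# What the decoder sees as input at each timestep:
decoder_input = list(target_text)       # ['\t', 'b', 'o', ..., '\n']
# What it must predict: the same sequence shifted one timestep ahead,
# without the start character (teacher forcing).
decoder_target = list(target_text[1:])  # ['b', 'o', ..., '\n']

# At timestep t, the target equals the input at timestep t + 1.
for t in range(len(decoder_target)):
    assert decoder_target[t] == decoder_input[t + 1]
```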
<p>Note that during training, the decoder's input is the target-language sentence (the target text), and its output is that same sentence shifted one timestep ahead (a technique known as teacher forcing). The code above prepares the decoder's target output in exactly this way.
Next we define the encoder <span class="caps">LSTM</span> and the decoder <span class="caps">LSTM</span>. In Keras, the <span class="caps">LSTM</span> class implements everything this type of network requires; it suffices to create an instance (object) of the class. The class also provides the <code>__call__</code> method, which takes an input layer as its argument and connects it to the created object. The encoder <span class="caps">LSTM</span> is therefore defined as follows:</p>
<div class="highlight"><pre><span></span><code>...
encoder_inputs = Input(shape=(None, num_encoder_tokens))
#print(type(encoder_inputs))
encoder = LSTM(latent_dim, return_state=True)
#print(type(encoder))
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
...
</code></pre></div>
<p>When an <span class="caps">LSTM</span> object is constructed with return_state=True, the two memory states of the <span class="caps">LSTM</span> are returned as outputs in addition to the main output sequence. In the code above these two states are named state_h and state_c. In the sequence-to-sequence model, the encoder's own output sequence is not used and is discarded; instead, state_h and state_c serve as the initial state of the decoder <span class="caps">LSTM</span>, as follows:</p>
<div class="highlight"><pre><span></span><code><span class="p">...</span><span class="w"></span>
<span class="n">encoder_states</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="err">[</span><span class="n">state_h</span><span class="p">,</span><span class="w"> </span><span class="n">state_c</span><span class="err">]</span><span class="w"></span>
<span class="c1"># Set up the decoder, using `encoder_states` as initial state.</span><span class="w"></span>
<span class="n">decoder_inputs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Input</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="k">None</span><span class="p">,</span><span class="w"> </span><span class="n">num_decoder_tokens</span><span class="p">))</span><span class="w"></span>
<span class="c1"># We set up our decoder to return full output sequences,</span><span class="w"></span>
<span class="c1"># and to return internal states as well. We don't use the</span><span class="w"></span>
<span class="c1"># return states in the training model, but we will use them in inference.</span><span class="w"></span>
<span class="n">decoder_lstm</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">LSTM</span><span class="p">(</span><span class="n">latent_dim</span><span class="p">,</span><span class="w"> </span><span class="n">return_sequences</span><span class="o">=</span><span class="no">True</span><span class="p">,</span><span class="w"> </span><span class="n">return_state</span><span class="o">=</span><span class="no">True</span><span class="p">)</span><span class="w"></span>
<span class="n">decoder_outputs</span><span class="p">,</span><span class="w"> </span><span class="n">_</span><span class="p">,</span><span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">decoder_lstm</span><span class="p">(</span><span class="n">decoder_inputs</span><span class="p">,</span><span class="n">initial_state</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">encoder_states</span><span class="p">)</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
</code></pre></div>
<p>We also place a softmax layer on top of the decoder <span class="caps">LSTM</span>'s final output sequence, to turn the outputs into valid probabilities:</p>
<div class="highlight"><pre><span></span><code>...
decoder_dense = Dense(num_decoder_tokens, activation = 'softmax')
decoder_outputs = decoder_dense(decoder_outputs)
...
</code></pre></div>
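<p>As a quick numeric check of what the softmax activation does, the snippet below applies the softmax formula to a small vector of illustrative scores and confirms that the result is a valid probability distribution:</p>

```python
import numpy as np

# Illustrative decoder scores (logits) for a 3-token vocabulary:
logits = np.array([2.0, 1.0, 0.1])

# Softmax: exponentiate, then normalize so the entries sum to 1.
probs = np.exp(logits) / np.exp(logits).sum()

# The result is a valid probability distribution, and the ordering of
# the scores is preserved (the largest logit gets the largest probability).
assert np.isclose(probs.sum(), 1.0)
assert probs.argmax() == logits.argmax()
```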
<p>The model's layers have now been created. They must be connected into a graph that forms a model with well-defined inputs and outputs. Keras offers two kinds of models. The first is the sequential model, which links a linear stack of layers into a model; the resulting graph is strictly linear, which makes the sequential model unsuitable for more complex architectures such as the sequence-to-sequence model. The second kind is built with the Keras functional <span class="caps">API</span>, which supports models with multiple inputs and outputs whose graphs need not be linear; that is the approach used here. Having defined each individual layer (in the previous code fragments), we use the Model class to specify the model's final inputs and outputs:</p>
<div class="highlight"><pre><span></span><code><span class="p">...</span><span class="w"></span>
<span class="c1"># Define the model that will turn</span><span class="w"></span>
<span class="c1">#encoder_input_data` & `decoder_input_data` into `decoder_target_data`</span><span class="w"></span>
<span class="n">model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Model</span><span class="p">(</span><span class="err">[</span><span class="n">encoder_inputs</span><span class="p">,</span><span class="w"> </span><span class="n">decoder_inputs</span><span class="err">]</span><span class="p">,</span><span class="w"> </span><span class="n">decoder_outputs</span><span class="p">)</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
</code></pre></div>
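<p>The functional style used above, where each layer object is called on a tensor to wire it into a graph, can be mimicked in plain Python. The sketch below is purely conceptual (the "layers" are hypothetical string-building callables, not Keras objects), but it shows why non-linear graphs such as this two-input model are easy to express in the functional style and impossible in a purely sequential stack:</p>

```python
# Each "layer" is a callable; calling it on inputs wires it into the graph.
def layer(name):
    def apply(*inputs):
        return f"{name}({', '.join(inputs)})"
    return apply

# Two inputs feeding one output -- a graph shape a linear stack cannot express:
encoder_inputs = "encoder_inputs"
decoder_inputs = "decoder_inputs"
merged = layer("concat")(encoder_inputs, decoder_inputs)
outputs = layer("dense")(merged)

print(outputs)  # dense(concat(encoder_inputs, decoder_inputs))
```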
<p>Using the plot_model function, we can view a graphical rendering of the resulting model: </p>
<div class="highlight"><pre><span></span><code>...
plot_model(model, to_file = './modelpic/seq2seq_model_' + dt + '.png', show_shapes=True, show_layer_names=True)
...
</code></pre></div>
<p>To use this function, the package containing plot_model must be imported at the top of the program with from keras.utils import plot_model. Running the function produces the following figure:
<img alt="Figure (A-1): the sequence-to-sequence model built in Keras, drawn by plot_model" src="https://boute.s3.amazonaws.com/290-seq2seq_model_20180206_155430.png"></p>
<p>Next, the model's loss function and optimization method are specified:</p>
<div class="highlight"><pre><span></span><code>...
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
...
</code></pre></div>
<p>Finally, we train the model on the actual data:</p>
<div class="highlight"><pre><span></span><code>...
# Run training
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
batch_size=batch_size,
epochs=epochs,
validation_split=0.2)
...
</code></pre></div>
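<p>The hyperparameters referenced above (batch_size, epochs, latent_dim) are defined earlier in the full program and are not shown in this excerpt. The values below are illustrative, matching the canonical Keras seq2seq example rather than necessarily the values used here:</p>

```python
batch_size = 64    # number of sentence pairs per gradient update
epochs = 100       # full passes over the training data
latent_dim = 256   # dimensionality of the LSTM hidden/cell state
```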
<p>After training comes testing the model. At this stage we want to feed the model a source-language sentence and decode its equivalent translation in the target language. To do so, we define models that are similar to, but separate from, the training model. First, the encoder model is defined so that it accepts a source-language sentence as input and produces the decoder's initial states as output:</p>
<div class="highlight"><pre><span></span><code>...
## encoder model
encoder_model = Model(encoder_inputs, encoder_states)
</code></pre></div>
<p>The decoder model is defined so that its inputs are its own states and output from the previous step. In the first decoding step, these inputs are the encoder model's output states and the start token of the target-language sentence, i.e. the tab character \t. The following snippet defines the decoder model:</p>
<div class="highlight"><pre><span></span><code><span class="p">...</span><span class="w"></span>
<span class="err">##</span><span class="w"> </span><span class="n">decoder</span><span class="w"> </span><span class="n">model</span><span class="w"></span>
<span class="n">decoder_state_input_h</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">Input</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="n">latent_dim</span><span class="p">,))</span><span class="w"></span>
<span class="n">decoder_state_input_c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">Input</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="n">latent_dim</span><span class="p">,))</span><span class="w"></span>
<span class="n">decoder_states_inputs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">[</span><span class="n">decoder_state_input_h, decoder_state_input_c</span><span class="o">]</span><span class="w"></span>
<span class="n">decoder_outputs</span><span class="p">,</span><span class="w"> </span><span class="n">state_h</span><span class="p">,</span><span class="w"> </span><span class="n">state_c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">decoder_lstm</span><span class="p">(</span><span class="n">decoder_inputs</span><span class="p">,</span><span class="w"> </span><span class="n">initial_state</span><span class="o">=</span><span class="n">decoder_states_inputs</span><span class="p">)</span><span class="w"></span>
<span class="n">decoder_states</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">[</span><span class="n">state_h, state_c</span><span class="o">]</span><span class="w"></span>
<span class="n">decoder_outputs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">decoder_dense</span><span class="p">(</span><span class="n">decoder_outputs</span><span class="p">)</span><span class="w"></span>
<span class="n">decoder_model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Model</span><span class="p">(</span><span class="o">[</span><span class="n">decoder_inputs</span><span class="o">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">decoder_states_inputs</span><span class="p">,</span><span class="o">[</span><span class="n">decoder_outputs</span><span class="o">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">decoder_states</span><span class="p">)</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
</code></pre></div>
<p>Note that these models are built from the same layers (the encoder and decoder <span class="caps">LSTM</span> objects) created during training, so the layer weights are shared between the two models: the weights come from the training phase, and these models are used only to predict outputs from inputs. The decode_sequence function below takes an input sequence as its argument and returns the corresponding output sequence:</p>
<div class="highlight"><pre><span></span><code><span class="p">...</span><span class="w"></span>
<span class="n">def</span><span class="w"> </span><span class="n">decode_sequence</span><span class="p">(</span><span class="n">input_seq</span><span class="p">)</span><span class="err">:</span><span class="w"></span>
<span class="err">#</span><span class="w"> </span><span class="n">Encode</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="k">input</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="k">state</span><span class="w"> </span><span class="n">vectors</span><span class="p">.</span><span class="w"></span>
<span class="n">states_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">encoder_model</span><span class="p">.</span><span class="n">predict</span><span class="p">(</span><span class="n">input_seq</span><span class="p">)</span><span class="w"></span>
<span class="err">#</span><span class="w"> </span><span class="n">Generate</span><span class="w"> </span><span class="n">empty</span><span class="w"> </span><span class="n">target</span><span class="w"> </span><span class="k">sequence</span><span class="w"> </span><span class="k">of</span><span class="w"> </span><span class="n">length</span><span class="w"> </span><span class="mf">1.</span><span class="w"></span>
<span class="n">target_seq</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="n">num_decoder_tokens</span><span class="p">))</span><span class="w"></span>
<span class="err">#</span><span class="w"> </span><span class="n">Populate</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="k">first</span><span class="w"> </span><span class="k">character</span><span class="w"> </span><span class="k">of</span><span class="w"> </span><span class="n">target</span><span class="w"> </span><span class="k">sequence</span><span class="w"> </span><span class="k">with</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="k">start</span><span class="w"> </span><span class="k">character</span><span class="p">.</span><span class="w"></span>
<span class="n">target_seq</span><span class="o">[</span><span class="n">0, 0, target_token_index['\t'</span><span class="o">]</span><span class="err">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">1.</span><span class="w"></span>
<span class="err">#</span><span class="w"> </span><span class="n">Sampling</span><span class="w"> </span><span class="n">loop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">batch</span><span class="w"> </span><span class="k">of</span><span class="w"> </span><span class="n">sequences</span><span class="w"></span>
<span class="err">#</span><span class="w"> </span><span class="p">(</span><span class="k">to</span><span class="w"> </span><span class="n">simplify</span><span class="p">,</span><span class="w"> </span><span class="n">here</span><span class="w"> </span><span class="n">we</span><span class="w"> </span><span class="n">assume</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">batch</span><span class="w"> </span><span class="k">of</span><span class="w"> </span><span class="k">size</span><span class="w"> </span><span class="mi">1</span><span class="p">).</span><span class="w"></span>
<span class="n">stop_condition</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">False</span><span class="w"></span>
<span class="n">decoded_sentence</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">''</span><span class="w"></span>
<span class="k">while</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nl">stop_condition</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="n">output_tokens</span><span class="p">,</span><span class="w"> </span><span class="n">h</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">decoder_model</span><span class="p">.</span><span class="n">predict</span><span class="p">(</span><span class="w"></span>
<span class="w"> </span><span class="o">[</span><span class="n">target_seq</span><span class="o">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">states_value</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="err">#</span><span class="w"> </span><span class="n">Sample</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">token</span><span class="w"></span>
<span class="w"> </span><span class="n">sampled_token_index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">np</span><span class="p">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">output_tokens</span><span class="o">[</span><span class="n">0, -1, :</span><span class="o">]</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="n">sampled_char</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">reverse_target_char_index</span><span class="o">[</span><span class="n">sampled_token_index</span><span class="o">]</span><span class="w"></span>
<span class="w"> </span><span class="n">decoded_sentence</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">sampled_char</span><span class="w"></span>
<span class="w"> </span><span class="err">#</span><span class="w"> </span><span class="k">Exit</span><span class="w"> </span><span class="k">condition</span><span class="err">:</span><span class="w"> </span><span class="n">either</span><span class="w"> </span><span class="n">hit</span><span class="w"> </span><span class="nf">max</span><span class="w"> </span><span class="n">length</span><span class="w"></span>
<span class="w"> </span><span class="err">#</span><span class="w"> </span><span class="ow">or</span><span class="w"> </span><span class="n">find</span><span class="w"> </span><span class="n">stop</span><span class="w"> </span><span class="k">character</span><span class="p">.</span><span class="w"></span>
<span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">sampled_char</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="s1">'\n'</span><span class="w"> </span><span class="ow">or</span><span class="w"></span>
<span class="w"> </span><span class="nf">len</span><span class="p">(</span><span class="n">decoded_sentence</span><span class="p">)</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="n">max_decoder_seq_length</span><span class="p">)</span><span class="err">:</span><span class="w"></span>
<span class="w"> </span><span class="n">stop_condition</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">True</span><span class="w"></span>
<span class="w"> </span><span class="err">#</span><span class="w"> </span><span class="k">Update</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">target</span><span class="w"> </span><span class="k">sequence</span><span class="w"> </span><span class="p">(</span><span class="k">of</span><span class="w"> </span><span class="n">length</span><span class="w"> </span><span class="mi">1</span><span class="p">).</span><span class="w"></span>
<span class="w"> </span><span class="n">target_seq</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="n">num_decoder_tokens</span><span class="p">))</span><span class="w"></span>
<span class="w"> </span><span class="n">target_seq</span><span class="o">[</span><span class="n">0, 0, sampled_token_index</span><span class="o">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">1.</span><span class="w"></span>
<span class="w"> </span><span class="err">#</span><span class="w"> </span><span class="k">Update</span><span class="w"> </span><span class="n">states</span><span class="w"></span>
<span class="w"> </span><span class="n">states_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">[</span><span class="n">h, c</span><span class="o">]</span><span class="w"></span>
<span class="k">return</span><span class="w"> </span><span class="n">decoded_sentence</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
</code></pre></div>
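<p>decode_sequence relies on reverse_target_char_index to map each sampled argmax index back to a character. That dictionary is simply the inverse of target_token_index; a minimal sketch of building it is shown below (the token index here is hypothetical, for illustration only):</p>

```python
# Hypothetical character-to-index mapping, as built earlier from the target texts:
target_token_index = {'\t': 0, '\n': 1, 'a': 2, 'b': 3}

# Invert it: index-to-character, used when sampling from the decoder's softmax.
reverse_target_char_index = {i: ch for ch, i in target_token_index.items()}

print(reverse_target_char_index[2])  # a
```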
<p>The encoder and decoder models used by this function, defined earlier, correspond to the following graphs, respectively:</p>
<p><img alt="Figure (A-2): the encoder model used during prediction (testing, or inference)" src="https://boute.s3.amazonaws.com/290-sampling_encoder_model_20180206_155430.png"></p>
<p><img alt="Figure (A-3): the decoder model used during prediction (testing, or inference)" src="https://boute.s3.amazonaws.com/290-sampling_decoder_model_20180206_155430.png">
<strong>Note:</strong> the numbers to the left of each node in this section's graphs are assigned sequentially by Keras and carry no meaning.</p>
<p><strong><h3>Deepening the Network</h3></strong></p>
<p>Although the model described in this section relies entirely on deep-neural-network concepts, it is not literally <em>deep</em>. In Keras a deep model is easily created by stacking layers on top of one another. For example, to give the encoder of the model above two <span class="caps">LSTM</span> layers, it suffices to define the first encoder layer (encoder_l1) so that it outputs a full sequence, connect the input layer to it, and then have the <span class="caps">LSTM</span> from the earlier code accept this new layer as its input:</p>
<div class="highlight"><pre><span></span><code>...<span class="w"></span>
#<span class="w"> </span><span class="nv">Define</span><span class="w"> </span><span class="nv">an</span><span class="w"> </span><span class="nv">input</span><span class="w"> </span><span class="nv">sequence</span>.<span class="w"> </span>
<span class="nv">encoder_inputs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nv">Input</span><span class="ss">(</span><span class="nv">shape</span><span class="o">=</span><span class="ss">(</span><span class="nv">None</span>,<span class="w"> </span><span class="nv">num_encoder_tokens</span><span class="ss">))</span><span class="w"></span>
#<span class="w"> </span><span class="nv">Define</span><span class="w"> </span><span class="nv">LSTM</span><span class="w"> </span><span class="nv">layer</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="nv">and</span><span class="w"> </span><span class="nv">pass</span><span class="w"> </span><span class="nv">the</span><span class="w"> </span><span class="nv">above</span><span class="w"> </span><span class="nv">encoder</span><span class="w"> </span><span class="nv">input</span><span class="w"> </span><span class="nv">sequence</span><span class="w"> </span><span class="nv">to</span><span class="w"> </span><span class="nv">it</span>.<span class="w"></span>
#<span class="w"> </span><span class="nv">note</span><span class="w"> </span><span class="nv">that</span><span class="w"> </span><span class="nv">return_sequences</span><span class="w"> </span><span class="nv">argument</span><span class="w"> </span><span class="nv">must</span><span class="w"> </span><span class="nv">set</span><span class="w"> </span><span class="nv">to</span><span class="w"> </span><span class="nv">be</span><span class="w"> </span><span class="nv">True</span><span class="w"> </span><span class="nv">in</span><span class="w"> </span><span class="nv">order</span><span class="w"> </span><span class="nv">to</span><span class="w"> </span><span class="k">connect</span><span class="w"> </span><span class="k">next</span><span class="w"> </span><span class="nv">to</span><span class="w"> </span><span class="nv">layer</span>.<span class="w"></span>
<span class="nv">encoder_l1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nv">LSTM</span><span class="ss">(</span><span class="nv">latent_dim</span>,<span class="w"> </span><span class="nv">return_sequences</span><span class="o">=</span><span class="nv">True</span>,<span class="w"> </span><span class="nv">return_state</span><span class="o">=</span><span class="nv">True</span><span class="ss">)(</span><span class="nv">encoder_inputs</span><span class="ss">)</span><span class="w"></span>
#<span class="w"> </span><span class="nv">Define</span><span class="w"> </span><span class="nv">LSTM</span><span class="w"> </span><span class="nv">layer</span><span class="w"> </span><span class="mi">2</span><span class="w"> </span><span class="ss">(</span><span class="nv">encoder</span><span class="ss">)</span><span class="w"></span>
<span class="nv">encoder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nv">LSTM</span><span class="ss">(</span><span class="nv">latent_dim</span>,<span class="w"> </span><span class="nv">return_state</span><span class="o">=</span><span class="nv">True</span><span class="ss">)</span><span class="w"></span>
#<span class="w"> </span><span class="nv">Pass</span><span class="w"> </span><span class="ss">(</span><span class="k">connect</span><span class="ss">)</span><span class="w"> </span><span class="nv">encoder_l1</span><span class="w"> </span><span class="nv">to</span><span class="w"> </span><span class="nv">LSTM</span><span class="w"> </span><span class="nv">layer</span><span class="w"> </span><span class="mi">2</span><span class="w"> </span><span class="ss">(</span><span class="nv">encoder</span><span class="ss">)</span><span class="w"></span>
<span class="nv">encoder_outputs</span>,<span class="w"> </span><span class="nv">state_h</span>,<span class="w"> </span><span class="nv">state_c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nv">encoder</span><span class="ss">(</span><span class="nv">encoder_l1</span><span class="ss">)</span><span class="w"></span>
...<span class="w"></span>
</code></pre></div>
<p>The same change must be made, in the same way, to the network's other layers.</p>
<p><strong><h3>Converting the Model to a Word-Level Model</h3></strong></p>
<p>The model above operates at the character level. If we instead have sequences of integers, where each integer is the index of a particular word in a dictionary, the Embedding layer in Keras lets the model consume these integer tokens. The following snippet adds this capability:</p>
<div class="highlight"><pre><span></span><code><span class="p">...</span><span class="w"></span>
<span class="c1"># Define an input sequence and process it.</span><span class="w"></span>
<span class="n">encoder_inputs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Input</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="k">None</span><span class="p">,))</span><span class="w"></span>
<span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Embedding</span><span class="p">(</span><span class="n">num_encoder_tokens</span><span class="p">,</span><span class="w"> </span><span class="n">latent_dim</span><span class="p">)(</span><span class="n">encoder_inputs</span><span class="p">)</span><span class="w"></span>
<span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">state_h</span><span class="p">,</span><span class="w"> </span><span class="n">state_c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">LSTM</span><span class="p">(</span><span class="n">latent_dim</span><span class="p">,</span><span class="n">return_state</span><span class="o">=</span><span class="no">True</span><span class="p">)(</span><span class="n">x</span><span class="p">)</span><span class="n">encoder_states</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="err">[</span><span class="n">state_h</span><span class="p">,</span><span class="w"> </span><span class="n">state_c</span><span class="err">]</span><span class="w"></span>
<span class="c1"># Set up the decoder, using `encoder_states` as initial state.</span><span class="w"></span>
<span class="n">decoder_inputs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Input</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="k">None</span><span class="p">,))</span><span class="w"></span>
<span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Embedding</span><span class="p">(</span><span class="n">num_decoder_tokens</span><span class="p">,</span><span class="w"> </span><span class="n">latent_dim</span><span class="p">)(</span><span class="n">decoder_inputs</span><span class="p">)</span><span class="w"></span>
<span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">LSTM</span><span class="p">(</span><span class="n">latent_dim</span><span class="p">,</span><span class="w"> </span><span class="n">return_sequences</span><span class="o">=</span><span class="no">True</span><span class="p">)(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">initial_state</span><span class="o">=</span><span class="n">encoder_states</span><span class="p">)</span><span class="w"></span>
<span class="n">decoder_outputs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Dense</span><span class="p">(</span><span class="n">num_decoder_tokens</span><span class="p">,</span><span class="w"> </span><span class="n">activation</span><span class="o">=</span><span class="s1">'softmax'</span><span class="p">)(</span><span class="n">x</span><span class="p">)</span><span class="w"></span>
<span class="c1"># Define the model that will turn</span><span class="w"></span>
<span class="c1">#encoder_input_data` & `decoder_input_data` into `decoder_target_data`</span><span class="w"></span>
<span class="n">model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Model</span><span class="p">(</span><span class="err">[</span><span class="n">encoder_inputs</span><span class="p">,</span><span class="w"> </span><span class="n">decoder_inputs</span><span class="err">]</span><span class="p">,</span><span class="w"> </span><span class="n">decoder_outputs</span><span class="p">)</span><span class="w"></span>
<span class="c1"># Compile & run training</span><span class="w"></span>
<span class="n">model</span><span class="p">.</span><span class="n">compile</span><span class="p">(</span><span class="n">optimizer</span><span class="o">=</span><span class="s1">'rmsprop'</span><span class="p">,</span><span class="w"> </span><span class="n">loss</span><span class="o">=</span><span class="s1">'categorical_crossentropy'</span><span class="p">)</span><span class="w"></span>
<span class="c1"># Note that `decoder_target_data` needs to be one-hot encoded,</span><span class="w"></span>
<span class="c1"># rather than sequences of integers like `decoder_input_data`!</span><span class="w"></span>
<span class="n">model</span><span class="p">.</span><span class="n">fit</span><span class="p">(</span><span class="err">[</span><span class="n">encoder_input_data</span><span class="p">,</span><span class="w"> </span><span class="n">decoder_input_data</span><span class="err">]</span><span class="p">,</span><span class="w"> </span><span class="n">decoder_target_data</span><span class="p">,</span><span class="n">batch_size</span><span class="o">=</span><span class="n">batch_size</span><span class="p">,</span><span class="n">epochs</span><span class="o">=</span><span class="n">epochs</span><span class="p">,</span><span class="w"> </span><span class="n">validation_split</span><span class="o">=</span><span class="mf">0.2</span><span class="p">)</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
</code></pre></div>
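<p>The comment in the snippet above notes that <code>decoder_target_data</code> must be one-hot encoded, while <code>decoder_input_data</code> remains integer sequences. A minimal NumPy sketch of that conversion (the token ids, <code>num_tokens=10</code>, and the <code>to_one_hot</code> helper are illustrative, not part of the Keras example):</p>

```python
import numpy as np

def to_one_hot(sequences, num_tokens):
    """Convert integer token sequences into a one-hot tensor of shape
    (num_samples, max_sequence_length, num_tokens)."""
    max_len = max(len(s) for s in sequences)
    batch = np.zeros((len(sequences), max_len, num_tokens), dtype="float32")
    for i, seq in enumerate(sequences):
        for t, token in enumerate(seq):
            batch[i, t, token] = 1.0
    return batch

# Targets are the decoder inputs shifted left by one timestep:
# the model learns to predict the next token at each position.
decoder_input_seqs = [[1, 5, 2], [1, 7, 2]]            # e.g. [START, word, END]
decoder_target_seqs = [s[1:] for s in decoder_input_seqs]
decoder_target_data = to_one_hot(decoder_target_seqs, num_tokens=10)
print(decoder_target_data.shape)  # (2, 2, 10)
```

Passing a tensor of this shape as the target matches the <code>categorical_crossentropy</code> loss used in <code>model.compile</code> above, which expects one-hot distributions rather than integer labels.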
<p><br>
<br></p>
<hr>
<h2>واژه‌نامه</h2>
<h3 align="center">واژه‌نامه فارسی به انگلیسی</h3>
<table>
<thead>
<tr>
<th style="text-align: center;"><strong>واژه‌ی فارسی</strong></th>
<th style="text-align: center;"></th>
<th style="text-align: center;"><strong>معادل انگلیسی</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center;">انفجار گرادیان</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Exploding Gradient</td>
</tr>
<tr>
<td style="text-align: center;">بانظارت</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Supervised</td>
</tr>
<tr>
<td style="text-align: center;">پردازش زبان طبیعی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Natural Language Processing (<span class="caps">NLP</span>)</td>
</tr>
<tr>
<td style="text-align: center;">پس‌انتشار</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Backpropagation</td>
</tr>
<tr>
<td style="text-align: center;">تابع بیشینه هموار</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Softmax Function</td>
</tr>
<tr>
<td style="text-align: center;">تأخیر زمانی کمینه</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Minimal Time Lag</td>
</tr>
<tr>
<td style="text-align: center;">ترجمه ماشینی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Machine Translation (<span class="caps">MT</span>)</td>
</tr>
<tr>
<td style="text-align: center;">ترجمه ماشینی آماری</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Statistical Machine Translation (<span class="caps">SMT</span>)</td>
</tr>
<tr>
<td style="text-align: center;">ترجمه ماشینی عصبی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Neural Machine Translation (<span class="caps">NMT</span>)</td>
</tr>
<tr>
<td style="text-align: center;">تشخیص گفتار</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Speech Recognition</td>
</tr>
<tr>
<td style="text-align: center;">توالی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Sequence</td>
</tr>
<tr>
<td style="text-align: center;">جست‌وجوی پرتوی محلی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Beam Search</td>
</tr>
<tr>
<td style="text-align: center;">حافظه کوتاه‌مدت بلند</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Long Short-Term Memory (<span class="caps">LSTM</span>)</td>
</tr>
<tr>
<td style="text-align: center;">دسته</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Batch</td>
</tr>
<tr>
<td style="text-align: center;">دوره</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Epoch</td>
</tr>
<tr>
<td style="text-align: center;">سرگشتگی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Perplexity</td>
</tr>
<tr>
<td style="text-align: center;">شبکه عصبی پیچشی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Convolutional Neural Network (<span class="caps">CNN</span>)</td>
</tr>
<tr>
<td style="text-align: center;">شبکه عصبی رو به جلو ژرف</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Deep Feed-forward Neural Network</td>
</tr>
<tr>
<td style="text-align: center;">شبکه عصبی ژرف</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Deep Neural Network (<span class="caps">DNN</span>)</td>
</tr>
<tr>
<td style="text-align: center;">شبکه عصبی مکرر</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Recurrent Neural Network (<span class="caps">RNN</span>)</td>
</tr>
<tr>
<td style="text-align: center;">فرضیه جزئی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Partial Hypothesis</td>
</tr>
<tr>
<td style="text-align: center;">کدگذار</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Encoder</td>
</tr>
<tr>
<td style="text-align: center;">کدگشا</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Decoder</td>
</tr>
<tr>
<td style="text-align: center;">گذر جلو</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Forward Pass</td>
</tr>
<tr>
<td style="text-align: center;">مدل زبانی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Language Model (<span class="caps">LM</span>)</td>
</tr>
<tr>
<td style="text-align: center;">مدل زبانی عصبی</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Neural Language Model (<span class="caps">NLM</span>)</td>
</tr>
<tr>
<td style="text-align: center;">میرایی گرادیان</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Vanishing Gradient</td>
</tr>
<tr>
<td style="text-align: center;">نشانه‌گذاری شده</td>
<td style="text-align: center;"></td>
<td style="text-align: center;">Tokenized</td>
</tr>
</tbody>
</table>
<p><br/></p>
<h3>پانوشت‌ها</h3>
<div class="footnote">
<hr>
<ol>
<li id="fn:1">
<p>deep neural networks <a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p>backpropagation <a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p>supervised <a class="footnote-backref" href="#fnref:3" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
<li id="fn:4">
<p>natural language processing <a class="footnote-backref" href="#fnref:4" title="Jump back to footnote 4 in the text">↩</a></p>
</li>
<li id="fn:5">
<p>sequence <a class="footnote-backref" href="#fnref:5" title="Jump back to footnote 5 in the text">↩</a></p>
</li>
<li id="fn:6">
<p>deep feed-forward neural networks <a class="footnote-backref" href="#fnref:6" title="Jump back to footnote 6 in the text">↩</a></p>
</li>
<li id="fn:7">
<p>recurrent neural networks <a class="footnote-backref" href="#fnref:7" title="Jump back to footnote 7 in the text">↩</a></p>
</li>
<li id="fn:8">
<p>convolutional neural networks <a class="footnote-backref" href="#fnref:8" title="Jump back to footnote 8 in the text">↩</a></p>
</li>
<li id="fn:9">
<p>grid <a class="footnote-backref" href="#fnref:9" title="Jump back to footnote 9 in the text">↩</a></p>
</li>
<li id="fn:10">
<p>machine translation <a class="footnote-backref" href="#fnref:10" title="Jump back to footnote 10 in the text">↩</a></p>
</li>
<li id="fn:11">
<p>speech recognition <a class="footnote-backref" href="#fnref:11" title="Jump back to footnote 11 in the text">↩</a></p>
</li>
<li id="fn:12">
<p>long short-term memory <a class="footnote-backref" href="#fnref:12" title="Jump back to footnote 12 in the text">↩</a></p>
</li>
<li id="fn:13">
<p>neural machine translation <a class="footnote-backref" href="#fnref:13" title="Jump back to footnote 13 in the text">↩</a></p>
</li>
<li id="fn:14">
<p>statistical machine translation <a class="footnote-backref" href="#fnref:14" title="Jump back to footnote 14 in the text">↩</a></p>
</li>
<li id="fn:15">
<p>language model <a class="footnote-backref" href="#fnref:15" title="Jump back to footnote 15 in the text">↩</a></p>
</li>
<li id="fn:16">
<p>neural language models <a class="footnote-backref" href="#fnref:16" title="Jump back to footnote 16 in the text">↩</a></p>
</li>
<li id="fn:17">
<p>forward pass <a class="footnote-backref" href="#fnref:17" title="Jump back to footnote 17 in the text">↩</a></p>
</li>
<li id="fn:18">
<p>rectified linear unit <a class="footnote-backref" href="#fnref:18" title="Jump back to footnote 18 in the text">↩</a></p>
</li>
<li id="fn:19">
<p>sigmoid <a class="footnote-backref" href="#fnref:19" title="Jump back to footnote 19 in the text">↩</a></p>
</li>
<li id="fn:20">
<p>softmax function <a class="footnote-backref" href="#fnref:20" title="Jump back to footnote 20 in the text">↩</a></p>
</li>
<li id="fn:21">
<p>n-best list <a class="footnote-backref" href="#fnref:21" title="Jump back to footnote 21 in the text">↩</a></p>
</li>
<li id="fn:22">
<p>topic model <a class="footnote-backref" href="#fnref:22" title="Jump back to footnote 22 in the text">↩</a></p>
</li>
<li id="fn:23">
<p>batch <a class="footnote-backref" href="#fnref:23" title="Jump back to footnote 23 in the text">↩</a></p>
</li>
<li id="fn:24">
<p>epoch <a class="footnote-backref" href="#fnref:24" title="Jump back to footnote 24 in the text">↩</a></p>
</li>
<li id="fn:25">
<p>vanishing gradient <a class="footnote-backref" href="#fnref:25" title="Jump back to footnote 25 in the text">↩</a></p>
</li>
<li id="fn:26">
<p>exploding gradient <a class="footnote-backref" href="#fnref:26" title="Jump back to footnote 26 in the text">↩</a></p>
</li>
<li id="fn:27">
<p>inference <a class="footnote-backref" href="#fnref:27" title="Jump back to footnote 27 in the text">↩</a></p>
</li>
<li id="fn:29">
<p>tokenized <a class="footnote-backref" href="#fnref:29" title="Jump back to footnote 28 in the text">↩</a></p>
</li>
<li id="fn:30">
<p>beam search <a class="footnote-backref" href="#fnref:30" title="Jump back to footnote 29 in the text">↩</a></p>
</li>
<li id="fn:31">
<p>partial hypothesis <a class="footnote-backref" href="#fnref:31" title="Jump back to footnote 30 in the text">↩</a></p>
</li>
<li id="fn:32">
<p>perplexity <a class="footnote-backref" href="#fnref:32" title="Jump back to footnote 31 in the text">↩</a></p>
</li>
<li id="fn:33">
<p>minimal time lag <a class="footnote-backref" href="#fnref:33" title="Jump back to footnote 32 in the text">↩</a></p>
</li>
<li id="fn:34">
<p>چندین نوع محاسبه از امتیاز <span class="caps">BLEU</span> وجود دارد که هر نوع با یک اسکریپت زبان Perl تعریف شده است و در این مقاله از این اسکریپت‌های موجود برای محاسبه امتیاز <span class="caps">BLEU</span> استفاده شده است. <a class="footnote-backref" href="#fnref:34" title="Jump back to footnote 33 in the text">↩</a></p>
</li>
</ol>
</div>
<hr>
<p><strong>Welcome</strong> · 2019-02-22 · Morteza</p>
<p><img alt="fractal-life" src="../static/img/fractal-life.gif"></p>
<blockquote>
<p><span class="dquo">“</span><strong>L</strong>ife <strong>I</strong>s a <strong>F</strong>ractal <strong>E</strong>vent (Exploration). Create unlimited (infinite) values in limited time!”</p>
<p><em>Morteza</em></p>
</blockquote>
<p>Hello and welcome to my new personal web page and blog on GitHub. The blog is under construction, and more pages will be added in the future. Please see the <a href="../pages/about.md">About me</a> page for more information.</p>
<h3>A note about the blogging tool</h3>
<p>I recently read about <a href="http://docs.getpelican.com" target="_blank">Pelican</a> and decided to switch my blog from plain, unstructured <span class="caps">HTML</span> to a structured static website. Pelican is a really beautiful blogging and publishing tool. Simply put, it is a neat static site generator (<span class="caps">SSG</span>) written in Python. Like all SSGs, it generates websites very quickly. It has approachable documentation, a straightforward installation, and powerful features such as plugins and extensibility.
I am new to Pelican, but it is simple and easy to use. I strongly recommend it!</p>
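<p>For readers who want to try it, a typical Pelican workflow looks roughly like this. This is only a sketch of the standard setup; the directory names (<code>content/</code>, <code>output/</code>) and settings file are the defaults created by <code>pelican-quickstart</code>, and your site may be configured differently:</p>

```shell
# Install Pelican with Markdown support
python -m pip install "pelican[markdown]"

# Scaffold a new site interactively (creates pelicanconf.py and content/)
pelican-quickstart

# Build the static site from content/ into output/
pelican content -o output -s pelicanconf.py

# Serve the generated site locally at http://localhost:8000
pelican --listen
```

From there, publishing to GitHub Pages is just a matter of pushing the generated <code>output/</code> directory to the appropriate branch.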