Code-Truncater is a tool built to support a research task: automatically repairing programs using code completion from CodeGPT.
The paper that describes that work and presents its results was published in APR 2022:
The main purpose of this tool is to analyze a line of Java source code and compute suitable column numbers at which the code can be truncated (cut). These column numbers correspond to:
- Textual boundaries of language constructs represented by AST nodes;
- Camel-case and underscore separation of words according to Java naming conventions.
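The second kind of boundary can be illustrated with a minimal sketch. It is not the tool's implementation: the class and method names are hypothetical, it scans raw characters instead of AST nodes (so it would also fire inside string literals and comments), and it assumes that a truncation column is the 1-based column of the first character of each new word.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Code-Truncater.
public class WordBoundarySketch {

    /**
     * Returns the 1-based columns at which a new word starts, based on
     * camel-case transitions (lower -> upper) and underscore separators.
     * Assumption: a "truncation column" is the first column of the new word.
     */
    public static List<Integer> wordBoundaries(String line) {
        List<Integer> columns = new ArrayList<>();
        for (int i = 1; i < line.length(); i++) {
            char prev = line.charAt(i - 1);
            char curr = line.charAt(i);
            // camelCase: a lowercase letter followed by an uppercase letter
            boolean camel = Character.isLowerCase(prev) && Character.isUpperCase(curr);
            // snake_case: an underscore followed by a letter or digit
            boolean underscore = prev == '_' && Character.isLetterOrDigit(curr);
            if (camel || underscore) {
                columns.add(i + 1); // convert 0-based index to 1-based column
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Prints [8, 13, 25]: the starts of "Retry", "Count" and "VALUE"
        System.out.println(wordBoundaries("int maxRetryCount = MAX_VALUE;"));
    }
}
```

A real implementation would restrict this scan to identifier tokens and merge the result with the AST node boundaries described in the first item.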
To build the jar file, run:
mvn package
which will create an executable package called code-truncater.jar with all dependencies and place it in the target/ directory.
To compute column numbers for a line of source code:
java -jar code-truncater.jar <java-src-file> <line-nr>
Code-Truncater also supports other modes of execution. The corresponding switch is given as the first argument:
--bulk: Requires a data file in which each line holds two space-separated fields:
<java-src-file> <line-nr>
and a prefix that is prepended to each <java-src-file>. Column numbers will be computed for each pair in the data file.
Example usage:
java -jar code-truncater.jar --bulk <data-file> <prefix>
<prefix> is provided separately because <data-file> is expected to contain file paths relative to a common root, which for this study is a publicly available repository, so that others can reproduce the experiments.
All paths in <data-file> are therefore resolved against <prefix>.
For example, given a data.txt containing:
src/main/java/File1.java <line_nr_1>
src/main/java/File2.java <line_nr_2>
...
referring to file paths in /home/user/project-dir/, the command to execute is:
java -jar code-truncater.jar --bulk data.txt /home/user/project-dir/
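How a data-file line and the prefix combine can be sketched as follows. This is only an illustration under assumptions: the class, the Entry type and parseEntry are hypothetical names, and using Path.resolve to join prefix and relative path is an assumption about behavior, not a description of the tool's code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; not part of Code-Truncater.
public class BulkEntrySketch {

    /** One parsed line of the data file (hypothetical helper type). */
    public static final class Entry {
        public final Path sourceFile;
        public final int lineNumber;

        Entry(Path sourceFile, int lineNumber) {
            this.sourceFile = sourceFile;
            this.lineNumber = lineNumber;
        }
    }

    /**
     * Parses "<java-src-file> <line-nr>" and resolves the file against prefix.
     * Because fields are space-separated, paths containing spaces are not handled.
     */
    public static Entry parseEntry(String dataLine, String prefix) {
        String[] fields = dataLine.trim().split("\\s+");
        if (fields.length < 2) {
            throw new IllegalArgumentException(
                    "Expected '<java-src-file> <line-nr>' but got: " + dataLine);
        }
        Path sourceFile = Paths.get(prefix).resolve(fields[0]);
        int lineNumber = Integer.parseInt(fields[1]);
        return new Entry(sourceFile, lineNumber);
    }

    public static void main(String[] args) {
        Entry entry = parseEntry("src/main/java/File1.java 10", "/home/user/project-dir/");
        System.out.println(entry.sourceFile + " " + entry.lineNumber);
    }
}
```

Keeping the prefix out of the data file means the same data.txt can be reused wherever the repository is checked out.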
--is-str: Tells whether the AST node at a particular line and column is a string literal. Works similarly to --bulk mode, but each line of the data file needs a third field containing the column number to analyze:
<java-src-file> <line-nr> <column-nr>
Example usage:
java -jar code-truncater.jar --is-str <data-file> <prefix>
Outputs <java-src-file> if a string literal is present at <line-nr> and <column-nr>.
Otherwise, the output is empty.
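The tool answers this question from the AST. For intuition only, the check can be approximated with a plain character scan over a single line. The sketch below swaps in that simpler technique; its names are hypothetical, columns are assumed to be 1-based, the quotes are counted as part of the literal, and character literals, comments and text blocks are deliberately ignored.

```java
// Hypothetical illustration; a character scan, not the AST lookup the tool performs.
public class StringColumnSketch {

    /**
     * Returns true if the 1-based column falls inside a double-quoted string
     * on the given line, quotes included. Backslash escapes are honored.
     */
    public static boolean isInsideStringLiteral(String line, int column) {
        if (column < 1 || column > line.length()) {
            return false; // column is outside the line
        }
        boolean inString = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean atTarget = (i + 1 == column);
            if (inString) {
                if (atTarget) {
                    return true; // any character up to and including the closing quote
                }
                if (c == '\\') {
                    i++; // the escaped character also belongs to the string
                    if (i + 1 == column) {
                        return true;
                    }
                } else if (c == '"') {
                    inString = false; // closing quote reached
                }
            } else if (c == '"') {
                inString = true; // opening quote
                if (atTarget) {
                    return true;
                }
            } else if (atTarget) {
                return false; // target column lies outside any string
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String line = "foo(\"string\");";
        System.out.println(isInsideStringLiteral(line, 4)); // column 4 is '('
        System.out.println(isInsideStringLiteral(line, 7)); // column 7 is 't'
    }
}
```

An AST-based check avoids the blind spots of this scan, which is why the tool works on AST nodes instead of raw characters.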
--node-type: Same as --is-str, but determines the AST node type. For each line in the data file, outputs that same line plus a fourth field containing the AST node type.
Assuming data.txt contains:
src/main/java/File1.java 10 4
src/main/java/File1.java 10 7
...
and src/main/java/File1.java contains:
...
10 foo("string");
...
and that columns 4 and 7 are characters '(' and 't', respectively, the output will be:
src/main/java/File1.java 10 4 MethodCallExpr
src/main/java/File1.java 10 7 StringLiteralExpr
...
Example usage:
java -jar code-truncater.jar --node-type <data-file> <prefix>