Week 8

Previous Goals:

  • Automate analysis script for table 3 in auto-analysis
  • Set up end to end tests on new-pipeline and respond to Thomas’ concerns
  • Generate all results on Defects4J, generate tables/figures, rewrite the analyses
  • Work on the paper

Progress:

This week is the last week before the ICSE submission deadline. Thomas is still sick since last week with COVID, and I will try to do as much as possible to help with the progress although I do expect certain delays in the progress.

  • My biggest progress this week was made on integrating Mike’s clean-java branch into the pipeline and generate the second round of new results. We have new terminology (“whitespace-free”, “cleaned”) to describe these important changes now. All the Defects4J artifacts are first piped into ‘clean-java’ to become whitespace-free (all documentation and whitespace-related changes are completely removed). Then, score calculation and ground truth construction are operated on the cleaned version. The goal of this is twofold: (1) To evaluate Flexeme and SmartCommit consistently given the differences in the tool (2) To account for the hard pipeline limitation, where the ground truth is computed on the cleaned version.
  • We came across a lot of bugs when trying to integrate pre-cleaning into the pipeline, particularly Git checkout and commit errors. I was blocked by the Git overwriting untracked files error (which resulted in 6 Time bugs failing), and Mike took it over. I finally was able to run the whole pipeline on all bugs end-to-end over the weekend and got new results for analysis by Monday.
  • I helped Thomas automating analysis Table 3, but my proficiency in R is limited, and I did not have time to work on it to act on Mike’s code review. Thankfully, Thomas worked on the RQs R auto-analysis implementation.

Next Goals

  • Review and merge the clean-java.
  • Run the new pipeline on the new Flexeme decomposition and SmartCommit decomposition
  • Update RQ2 analysis to remove the performance bias generated the tools
  • Open issues for remaining work on the paper. The todo list is in my personal notes at the moment.
  • Work on the paper
Written on July 30, 2023