Week 6

Previous Goals:

  • Integrate Mike’s implementation into the pipeline
  • Get Mike’s and Thomas’ code review on unix-patch branch
  • Read paper of replicating running the tools on synthetic commits & Paper on new tangled dataset
  • Run experiments and complete TODOs as requested by Thomas and Mike.

This Week’s Progress:

  • I break the 5 components of the pipeline (generate diffs, compute diff metrics, generate ground truth, decompose commit with tools, and compute scores) into 5 separate scripts, resulting into the elimination of evaluate.sh and evaluate_all.sh (scripts that run the whole pipeline end-to-end). The rationale for this is that we have results ready for certain components (such as SmartCommit decomposition), and the decompose component of the pipeline takes significantly much more time than the others. Thus, having individual scripts to run experiments is worth doing.
  • I moved from parallelizing the evaluate.sh script, which runs the whole pipeline on a provided list of bugs, to parallelization of 5 individual scripts.
  • I obtained Mike and Thomas’ on my PR. It was a very big PR, with a lot of tangled changes. Thus, I tried to solve this by separating it into 3 smaller pull requests that builds on each other sequentially. I apologize for this, and am learning to commit in small logical units and separate PRs into smaller concerns.
  • Thomas and Mike gave me a review of the three 3 PRs, and I addressed their requested changes.
  • I discussed the Git commit history manipulation idea with Mike and gave him a description of what the program should do. We discovered that both SmartCommit and Flexeme only care about the single buggy commit as input, and thus rewriting Git history is needed. Rather, we only need to add 3 commits to clean the diff to the history.
  • Thomas and I had a productive discussion about the pipeline limitations, our priorities, and the upcoming tasks. We came up with a direction/solution for the current hard pipeline limitation, and agree that the metrics, ground truth, and score will all be run on the clean VC diff, and decomposition will be run on unclean (original) VC diff. Mike’s implementation of rewriting Git commit history is a nice-to-have, but not a priority right now.

Next Goals:

  • Create a Docker Image on Linux to finally set up end to end testing on Github Actions
  • Top priority: run new diff metrics program and ground truth construction on all compatible bug files. The paper needs results now!
  • Resolved end to end testing issue: Java being non-deterministic for SmartCommit
  • I will be volunteering at ISSTA, so I expect to be taken away from research for some time :)
Written on July 14, 2023