This is the replication package for "On the Evaluation of Commit Message Generation Models: An Experimental Study", accepted to ICSME 2021, and "A Large-Scale Empirical Study of Commit Message Generation: Models, Datasets and Evaluation", accepted to EMSE 2022.
You are welcome to use our dataset, MCMD, and the evaluation scripts to measure the performance of commit message generation models!
Citations for both works are listed at the end of this README.
conda create -n MCMD python=3.8 numpy=1.19.2 -y
conda activate MCMD
conda install ipykernel -y # this line and the next help Jupyter Notebook find the correct kernel
python -m ipykernel install --user --name MCMD --display-name "MCMD" # registers the MCMD environment as a Jupyter kernel
pip install nltk==3.6.2 scipy==1.5.2 pandas==1.1.3 krippendorff==0.4.0 scikit-learn==0.24.1 sumeval==0.2.2 sacrebleu==1.5.1 matplotlib==3.5.1
docker pull itaowei/commit_msg_empirical
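After setting up the environment, it can be useful to confirm that the installed packages match the pinned versions. The following is a minimal sketch, not part of the replication package: the `EXPECTED` table is copied from the install commands above, and the helper name `find_mismatches` is hypothetical. It relies only on `importlib.metadata` from the Python 3.8 standard library.

```python
from importlib import metadata

# Pins copied from the conda/pip commands in this README.
EXPECTED = {
    "numpy": "1.19.2",
    "nltk": "3.6.2",
    "scipy": "1.5.2",
    "pandas": "1.1.3",
    "krippendorff": "0.4.0",
    "scikit-learn": "0.24.1",
    "sumeval": "0.2.2",
    "sacrebleu": "1.5.1",
    "matplotlib": "3.5.1",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every unmet pin.

    `get_version` is injectable (hypothetical parameter) so the check can be
    exercised without touching the real environment.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    for name, (wanted, installed) in problems.items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated `MCMD` environment (or inside the Docker container); any line it prints names a package whose version differs from the pin.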
Models: CommitGen (CmtGen), NMT, CoDiSum, PtrGNCMsg, NNGen, CoRec, CodeBERT.
- Existing Datasets: CommitGen_data, NNGen_data, CoDiSum_data
- Our Dataset: MCMD
More info about our dataset can be found here.
Usage demo about the metrics can be found here.
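To give a feel for what a reference-based metric computes on a generated commit message, here is a minimal pure-Python sketch of a sentence-level ROUGE-L F1 score. It is an illustration only and not the implementation used in this repository, which installs `sumeval`, `sacrebleu`, and `nltk` for its metrics. The lowercase-and-split tokenization and the function names `lcs_length` and `rouge_l_f1` are assumptions made for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, hypothesis):
    """Sentence-level ROUGE-L F1 between a reference and a generated message.

    Tokenization here is a simple lowercase + whitespace split (an assumption);
    real evaluation scripts may tokenize differently and report other variants.
    """
    ref_tokens = reference.lower().split()
    hyp_tokens = hypothesis.lower().split()
    if not ref_tokens or not hyp_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, hyp_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(hyp_tokens)  # share of generated tokens in the LCS
    recall = lcs / len(ref_tokens)     # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "fix null pointer in parser"
    generated = "fix parser crash"
    print(round(rouge_l_f1(reference, generated), 3))
```

In the example, the longest common subsequence is "fix parser" (2 tokens), so precision is 2/3 and recall is 2/5, giving an F1 of about 0.5. Scores from the repository's own scripts may differ because of tokenization and metric variants.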
See RQ1 results here.
RQ2 results: RQ2.ipynb.
RQ3 results: RQ3.ipynb.
RQ4 results: RQ4.ipynb.
RQ5 results: RQ5.ipynb.
Evaluation results of our improvements to NNGen can be found in nngen_improvement.ipynb.
If you use this code or MCMD, please consider citing us:)
@inproceedings{conf/icsme/TaoWSDH0ZZ21,
  author    = {Wei Tao and
               Yanlin Wang and
               Ensheng Shi and
               Lun Du and
               Shi Han and
               Hongyu Zhang and
               Dongmei Zhang and
               Wenqiang Zhang},
  title     = {On the Evaluation of Commit Message Generation Models: An Experimental
               Study},
  booktitle = {IEEE International Conference on Software Maintenance and Evolution,
               ICSME 2021, Luxembourg, September 27 - October 1, 2021},
  pages     = {126--136},
  year      = {2021},
  url       = {https://doi.org/10.1109/ICSME52107.2021.00018},
  doi       = {10.1109/ICSME52107.2021.00018}
}
@article{journals/emse/TaoWSDH0ZZ22,
  author  = {Wei Tao and
             Yanlin Wang and
             Ensheng Shi and
             Lun Du and
             Shi Han and
             Hongyu Zhang and
             Dongmei Zhang and
             Wenqiang Zhang},
  title   = {A Large-Scale Empirical Study of Commit Message Generation: Models,
             Datasets and Evaluation},
  journal = {Empir. Softw. Eng.},
  year    = {2022},
  doi     = {10.1007/s10664-022-10219-1}
}
Download ICSME citation, EMSE citation