Code Comment Analysis–A Review Paper

  • Syed Zohaib Hassan, Beijing Institute of Technology, Beijing, China
  • Ayesha Irshad, Abbottabad University of Science & Technology, Abbottabad, Pakistan
  • Jamaluddin Mir, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Malaysia
  • Ayesha Aslam, Abbottabad University of Science & Technology, Abbottabad, Pakistan
  • Kalsoom Ayaz, Abbottabad University of Science & Technology, Abbottabad, Pakistan
  • Muhammad Awais Bawazir, Bahria University, Islamabad, Pakistan
Keywords: Code comment, Comment analysis, Software, Code mapping

Abstract

Compilers are designed to ignore most comments in the source code of software systems. Yet code comments are a key source of system documentation and are essential for both developing and improving a system. At present, the only known techniques for evaluating software quality rely on system comments or on purely quantitative statements about the quality of the program. Commenting is standard practice in software development: comments make code clearer and convey programmers' intent more effectively. Programmers, however, seldom bother to keep their comments current, even though comments are an important source of information about how the system works. Various disciplines have studied the content of online user comments in different contexts, using manual quantitative or qualitative methods as well as (semi-)automatic ones. This variety, together with disciplinary divisions, makes it hard to get an overview of the aspects that have already been examined. The sheer number of daily comments flooding a newsroom can be overwhelming, particularly when a large share is hostile or "contaminated" in content and tone. With complex documents such as source code, it can be hard to connect the functional semantic information contained in the code with the corresponding textual explanation embedded in it, which makes that explanation difficult to use in program analysis and mining tasks. This paper examines the analysis of code comments for software improvement. It summarizes studies on code comments in four main areas: the relevance of code comments, the quality of code comment sources, code comment analysis, and research approaches for code comments together with their difficulties. By analyzing effective methods for this research problem, it provides more comprehensive information for future research.

Published
2022-01-22
How to Cite
Syed Zohaib Hassan, Ayesha Irshad, Jamaluddin Mir, Ayesha Aslam, Kalsoom Ayaz, & Muhammad Awais Bawazir. (2022). Code Comment Analysis–A Review Paper. Journal of Management Practices, Humanities and Social Sciences, 6(1), 88-105. https://doi.org/10.33152/jmphss-6.1.9