Code Duplication++

D. Zage and N. Whilte
Security and Software Engineering Research Center, Indiana, United States

Keywords: Cyber, Software Analysis, Clones, Vulnerabilities

Code duplication (cloning) increases software size, and every additional line of code eventually enters the maintenance process, increasing time and cost. Studies report code duplication rates ranging from 15% to 25%, suggesting that the software industry is a major candidate for improvement. However, identifying significant or flawed code patterns is more important and is not addressed by clone research. We developed a metrics-based approach for analyzing software designs (during design and implementation) that assists designers in engineering quality into the product. From our analysis of large industrial projects, we have discovered that a software system contains hidden relationships and structures that can be illuminated by evaluating and measuring software development artifacts. Clones are among these patterns. These relationships can be used to answer questions about the stability of the software product and to guide software development techniques. Software development is itself a pattern-selecting and pattern-making process, and the development pattern is inherent in the structure of the software. The coherent patterns of modules are gradually realized, becoming effective and ontologically significant by virtue of their development. Our aim is to categorize software modules to support developers’ efforts to recognize and apply patterns.