A Survey of Software Metric Use in Research Software Development
Abstract
Background
Complex software libraries and tools are essential for conducting research across various disciplines (science, engineering, business, humanities).
Ensuring the quality and reliability of software is crucial to avoid untrustworthy results that could mislead research conclusions.
Aims
This work aims to understand research software developers' use of traditional software engineering concepts, like metrics, to evaluate software quality and the software development process.
The study aims to identify relevant metrics for research software compared to traditional software engineering metrics.
Method
A survey of research software developers was conducted to assess their knowledge and usage of code and process metrics, along with the influence of demographics on these metrics.
Results
Participants: 129 respondents
Most respondents knew about metrics; knowledge of specific metrics was limited.
The metrics most used concerned performance and testing; code complexity metrics received less attention, even though complexity is reported as a challenge.
Conclusions
Research software developers value metrics but face obstacles in their implementation. Further research is needed to evaluate metrics for continuous process improvement.
Index Terms
Survey, Software Metrics, Software Engineering, Research Software
I. INTRODUCTION
Researchers in diverse fields increasingly use software for research (termed research software).
Research software engineers (RSEs) design and implement software for research and seek recognition for their contributions.
Quality of research software impacts the reliability of research results.
Previous work demonstrates the need for software engineering (SE) practices to improve research software quality, including requirements, design, testing, and code complexity.
Metrics are essential to evaluate software reliability and quality over time.
A. Software Metrics Overview
Definition: A metric is a function that assesses software or process effectiveness; measurements apply metrics to derive values.
Important metric categories:
In-process (development process metrics)
Code-oriented (complexity metrics)
Research software often includes elements like version control and testing in open-source projects.
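To ground the code-oriented category, here is a minimal sketch of one such metric, cyclomatic complexity, approximated by counting branch points in a Python AST (an illustrative approximation for this summary, not a reference implementation):

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 plus the number of branch points."""
    branch_nodes = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, branch_nodes) for node in ast.walk(tree))

sample = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    return "positive"
"""
print(cyclomatic_complexity(sample))  # two branch points -> 3
```

Applying a metric like this over time, as the measurement definition above suggests, yields values that can be tracked as the code evolves.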
II. RESEARCH QUESTIONS
The research seeks to answer:
RQ1: What is the level of metrics knowledge and use by research software developers?
RQ2: Which metrics are most commonly used?
RQ3: What is the relationship between knowledge of metrics and their perceived usefulness?
RQ4: Do developers perceive code complexity as a problem?
RQ5: Is there a relationship between complexity problems and the use of associated metrics?
Non-Traditional Metrics
Several unique metrics relevant to research software identified:
Performance Metrics:
Examples: FLOPS (floating point operations/second), I/O operations, network throughput (MB/sec).
Green Computing Metrics:
Focus on energy efficiency and carbon emissions.
Correctness and Reproducibility:
Metrics for acceptance of results and error tolerance in modeling and simulation.
Failure Rate Metrics:
Frequency of software failures is critical, especially in large-scale systems.
Recognition Metrics:
Recognition through citations or project downloads is important to RSEs.
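As a rough illustration of the performance category above, the sketch below estimates FLOPS with a simple multiply-add loop. A pure-Python loop vastly understates hardware capability; this only demonstrates the metric's definition as floating-point operations per second:

```python
import time

def estimate_flops(n: int = 1_000_000) -> float:
    """Crude FLOPS estimate: time n iterations of one multiply and one add."""
    x = 1.0
    start = time.perf_counter()
    for _ in range(n):
        x = x * 1.0000001 + 1e-9  # 2 floating-point operations per iteration
    elapsed = time.perf_counter() - start
    return (2 * n) / elapsed

print(f"~{estimate_flops():.2e} FLOPS")
```

Real performance measurements rely on hardware counters and dedicated benchmarking tools rather than timed loops like this.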
III. SURVEY DESIGN
Developed a survey to gather insights into the impact of metrics on projects:
Solicitation sent to high-performance computing and research software mailing lists.
Target Audience: Various domains of research software development.
Survey Questions
General Questions (GQ):
Project description, team size, project role, development stage, etc.
Metrics Questions (MQ): Knowledge, usefulness, specific metrics used.
Code Complexity Questions (CQ): Problems arising from complexity, frequency of use of complexity metrics.
IV. RESULTS
A. Demographics Analysis
Project Description:
79.8% of respondents focused on Scientific Computing Software.
Project Size:
Most respondents were part of small teams; this could affect their metrics usage and knowledge.
Project Role:
Predominantly technical roles (developers/architects), impacting perceived metrics importance.
Project Development Stage:
The majority were in the released phase, suggesting that established metrics programs should already be in place.
B. Overall Analysis of Knowledge and Use
The majority reported low metrics knowledge (GQ1) and low perceived usefulness (GQ3).
Metrics Knowledge vs Usefulness Table: Evidence of a correlation between knowledge level and perceived usefulness (p < .01).
C. Knowledge of Specific Metrics
89 unique metrics were identified, categorized into:
Code Metrics
Process Metrics
Testing Metrics
General Quality Metrics
Performance Metrics
Recognition Metrics
D. Productivity Evaluation Using Metrics
Metrics rarely used for individual/team evaluation.
E. Influence of Demographics on Metrics Knowledge and Use
Project Size: Smaller teams reported less knowledge and lower perceived usefulness than larger teams (χ² test, p < .001).
Project Role: No significant difference in metrics knowledge or usefulness perception across roles.
Project Stage: Researchers with unreleased software reported more varied perceptions of metrics knowledge and usefulness (p < .01).
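The χ² tests reported above compare categorical response distributions across groups. The sketch below computes the Pearson χ² statistic for a hypothetical 2×2 table of team size versus reported metrics knowledge (the counts are invented for illustration, not taken from the survey):

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = small/large teams, cols = low/high knowledge.
observed = [[40, 10],
            [15, 25]]
print(round(chi_square_statistic(observed), 2))  # 16.89
```

In practice the statistic is compared against the χ² distribution with (r−1)(c−1) degrees of freedom to obtain the p-values reported in the results.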
F. Code Complexity Findings
Most respondents acknowledged code complexity as a problem, yet reported infrequent use and low perceived usefulness of the associated metrics.
V. DISCUSSION
Insights by Research Questions
RQ1: The majority reported low to very low general knowledge of metrics, yet collectively demonstrated familiarity with a large number of specific metrics.
RQ2: Performance and testing metrics most frequently recognized and utilized.
RQ3: A strong relationship exists between perceived usefulness and the likelihood of using metrics.
RQ4: Most agree code complexity is an issue needing attention.
RQ5: Low usage and perceived utility of complexity metrics despite reported complexity problems.
VI. THREATS TO VALIDITY
A. Internal Threats
Survey design may introduce bias; questions were neutrally worded.
There may be selection bias in who participated.
B. External Threats
Survey sample may not represent all research software developers due to targeted mailing lists.
C. Construct Threats
Possible misunderstanding of survey questions by respondents.
VII. CONCLUSIONS
The survey showed that research software developers have metrics knowledge but limited actual application of SE metrics; code complexity remains poorly managed despite acknowledgment of its issues.
Further exploration of process metrics is needed to increase adoption of useful metrics.
ACKNOWLEDGMENTS
Recognition of survey respondents and support from NSF grants.
APPENDIX: SPECIFIC METRICS IDENTIFIED
High-Level Categories:
Code Metrics
General Quality Metrics
Performance Metrics
Process Metrics
Recognition Metrics
Testing Metrics