Software Engineering: A Modern Approach

Marco Tulio Valente

1 Code Review: A Brief Introduction

By Marco Tulio Valente and Aline Torres

1.1 Introduction

Code review is one of the most important practices to ensure the medium- and long-term health of a system’s codebase. It is now adopted by most software companies. For instance, the 2019 Stack Overflow Survey included a question about this practice (see figure). Among the over 70,000 developers who answered this question, only 23.6% did not use code review in their work (link).

Stack Overflow survey question on code review adoption

The idea of code review is simple: every piece of code written by a developer must be inspected by at least one other developer, called a reviewer. The reviewer can add comments to the code under review, seeking to clarify doubts, suggest improvements, and identify bugs.

Therefore, a dialogue is established—in the form of a comment thread—between the author of the code and the reviewer. As a result, the author can modify their implementation to meet any of the suggestions that the reviewer makes, or they can reply and justify why a suggestion does not make sense. After this dialogue, the code is expected to be approved by the reviewer and integrated into the repository.

1.2 Pull Requests

In this section, we present a brief tutorial on how to review code using GitHub’s Pull Requests (PR) feature. This feature enables developers to submit code for integration into the main repository of a given project. However, before integration, GitHub allows developers to review this code.

The next figure shows a comment on a PR of the microsoft\vscode project (link). As we can see, the reviewer initially congratulates the PR author, highlighting that the solution proposed by the submitted code is very elegant. Then, he also points out four possible changes related to internal design issues (exposing a setter method), layout (improving colors and tab alignment), and he finally identifies a bug. In this case, he also attaches a video that reproduces this bug.

1.3 Motivations

In 2013, Alberto Bacchelli and Christian Bird conducted a study on code review with 873 Microsoft developers and testers (link). The next figure shows the main motivations for performing code review, according to the participants’ opinions.

Survey with Microsoft developers on the motivations for code review

According to the participants, the main motivation is to find bugs in the submitted code. However, other factors are also important, such as (1) improving the code, (2) proposing alternative implementation solutions, and (3) sharing knowledge. This last benefit can occur in both directions, meaning the author can learn from the reviewer’s comments and vice versa.

Regarding the third point above, code review has an essential benefit in preventing the creation of knowledge islands in software projects. Instead, the practice helps to socialize knowledge about the code and to create a culture of collaboration and exchange of ideas among team members.

1.4 What to Review?

The following are issues that should be considered in code reviews:

General bugs
Code that is more complex than necessary
Code that uses a less efficient algorithm or data structure
Code that violates design principles (see more in Chapter 5)
Code that violates the system’s architecture (see more in Chapter 7)
Code that does not handle exceptions and errors
Code with code smells (see more in Chapter 9)
Premature optimizations
Lack of tests
Lack of documentation
Security or privacy vulnerabilities
Performance issues
Usability issues
Inappropriate or sub-optimal use of APIs
Use of unauthorized libraries or frameworks
Problems related to dynamic memory allocation
Concurrency problems and bugs
Code with layout or indentation problems
Code that violates naming conventions

1.5 Recommendations for Reviewers

In this section, we list some recommendations on how to behave in a code review.

To arrive at them, we started with Michaela Greiler’s article How to Give Respectful and Constructive Code Review Feedback (link). We then analyzed over 120 PRs from GitHub projects, looking for cases that follow the provided recommendations. In total, we examined at least 259 comments to select the examples we will show next.

Generic Recommendations

Let’s start with some general recommendations:

Reviewers should always remember that the objective is to detect clear problems in the submitted code. For example, it is natural that the reviewer might have solved the same problem differently. However, they should not propose alternatives without clear and unequivocal advantages. In other words, the reviewer should only propose an alternative if it is, in fact, better.
In line with the previous recommendation, you should avoid subjective comments and those related to personal style. For example, unless a variable’s name is really problematic, it is advisable not to start a debate about it.
In your reviews, never use offensive, sarcastic, or ironic words. Instead, always be polite and maintain professionalism.
Always restrict your comments to the code that was submitted and avoid discussing personal matters or other issues.

Specific Recommendations

The following are more specific recommendations:

In your comments, try to ask questions, not make judgments. Here is an actual example given by a reviewer of one of the projects we studied: Is this actually used? Or is it something necessary to make the template magic work?
If you have made a wrong or meaningless comment, acknowledge your mistake and express gratitude, as in the following example: Ah, I see the point. Thanks for your explanation.
Whenever possible, use emojis, as they make the language more informal and friendly.
Whenever possible, reference either internal or external documentation to the project. This will help support your comments, as in this example: [a given feature] allows id as in [URL]
Be sure to praise the submitted work if you think it is of higher quality, as in this example: Thanks for the clear test instructions.
If necessary, use images and screenshots to explain your comments. The next figure shows an example. In this case, the reviewer intended to clarify that the submitted code behaves incorrectly when a particular filter is defined as All.

Example of a screenshot added by a reviewer in their review

Try to justify your comments when the author is a junior developer. Here’s an example: I suggest switching the ArrayList of Students to a HashMap<String, Student>, because we can then verify more efficiently if a certain Student is present in this data structure….
Whenever reasonable, use the pronoun we instead of you, as it makes clear that the author and the reviewer are working together. For example, instead of saying, could you make this attribute private?, you can ask: could we make this attribute private?
Last but not least, if you have a very strong disagreement regarding the submitted code (for example, if you think everything is wrong) or if the review is not converging on approving the PR, try to schedule a meeting with the author to express your viewpoint and attempt to reach a consensus. However, this synchronous interaction should be used only as an exception and be reserved for important cases.

1.6 Recommendations for Authors

Authors should be professional and polite in their responses. They should also understand that code review is not a proficiency exam. Therefore, as an author, you should not take the review personally or think the reviewer is judging your competence.

Another relevant point is that authors should submit small PRs if they want to obtain quick and more useful response from their reviewers. For example, the authors of Software Engineering at Google (link) recommend that a PR should have no more than 200 lines of code.

If PRs are too large, the review quality may also significantly drop, as summarized in this tweet:

Ask a programmer to review 20 lines of code, they’ll find 7 issues. Ask them to review 500 lines & they’ll find 0 issues. — Hays Stanford (@haysstanford) March 10, 2021

1.7 Automatic Code Reviews

Before concluding, we would like to note that many issues addressed in code reviews can be automatically detected using static analysis tools. For instance, issues such as naming conventions (camel case, snake case, etc.), code layout, and indentation style (tabs or spaces, for example), can be standardized using linters. This prevents human reviewers from wasting time on these issues.

Exercises

What is the main difference between code review and pair programming?
Describe one disadvantage of code review.
Given the following code for a Stack class, what comments would you make if you were responsible for reviewing this code? There are at least four issues with this code that should be mentioned in a review.

If preferred, you may perform your review in a GitHub PR. That is, open a PR in one of your repositories with the code below. If you do not know how to do this, read the Pull Requests section of Appendix A. This procedure might seem strange, since you will be reviewing a PR that you created. However, our objective is to simulate, for educational purposes, a code review session.
```
import java.util.ArrayList;
import java.util.EmptyStackException;

public class Stack<T> {

private ArrayList<T> elements = new ArrayList<T>();
public int size = 0;

  public int size() {
    return size;
  }

  public boolean isEmpty() {
           return (size == 0);
  }

  public void empilha(T elem) {
    elements.add(elem);
    size++;
  }

  public T pop() throws EmptyStackException {
    if (isEmpty())
       throw new EmptyStackException();
    T elem = elements.get(size-1);
    size--;
    return elem;
  }
}
```
Is it possible to simultaneously use Code Review and Trunk-Based Development (TBD)? Yes or no? Justify your answer. If needed, refer to the following section of the textbook to learn more about TBD.
Is it possible to use Code Review and Continuous Integration (CI) simultaneously? Yes or no? Justify your answer. If needed, consult the following section of the textbook to learn more about CI.
The following link contains the code for a function from an open-source system called FitNesse, which is used as an example in the Clean Code book. If you were in charge of reviewing this function, what would be your first comment?

How would you review each of the following code snippets? For each snippet, identify any potential issues or improvements that should be addressed in a code review.

const yyyymmddstr = moment().format("YYYY/MM/DD");

getUserInfo();
getClientData();
getCustomerRecord();

const locations = ["Austin", "New York", "San Francisco"];
locations.forEach(l => {
  doStuff();
  doSomeOtherStuff();
  // ...
  // ...
  // ...
  dispatch(l);
});

These examples are from a repository that documents best practices for writing JavaScript code.

Check out the other articles on our site.