
Software Engineering: A Modern Approach

Marco Tulio Valente

5 Design Principles

The most fundamental problem in computer science is problem decomposition: how to take a complex problem and divide it up into pieces that can be solved independently. – John Ousterhout

This chapter starts with an introduction to software design, where we define and emphasize the importance of this activity (Section 5.1). Next, we discuss relevant properties of software design. Specifically, we cover Conceptual Integrity (Section 5.2), Information Hiding (Section 5.3), Cohesion (Section 5.4), and Coupling (Section 5.5). In Section 5.6, we discuss a set of design principles, including Single Responsibility, Interface Segregation, Dependency Inversion, Prefer Composition over Inheritance, Law of Demeter, Open/Closed, and Liskov Substitution. Lastly, we explore the use of metrics to evaluate the quality of software designs (Section 5.7).

5.1 Introduction

John Ousterhout’s statement that opens this chapter provides an excellent definition for software design. Although not explicitly stated, he assumes that when we talk about design, we are indeed seeking a solution to a particular problem. In Software Engineering, this problem consists of implementing a system that meets the functional and non-functional requirements defined by the customers—or Product Owner/Manager, to use modern terms. Subsequently, Ousterhout suggests how we should proceed to come up with this solution: we must decompose, i.e., break down the initial problem, which may be quite complex, into smaller parts. Finally, the statement imposes a restriction on this decomposition: it must allow each part of the project to be solved or implemented independently.

This definition may give the impression that design is a simple activity. However, when designing software, we have to contend with a significant adversary: the complexity that characterizes modern software systems. For this reason, Ousterhout emphasizes that problem decomposition is a fundamental challenge not only in Software Engineering but also in Computer Science!

The key strategy for combating the inherent complexity in software design is to create abstractions. In Software Engineering, an abstraction is a simplified representation of a given entity. Despite this simplification, we can interact with and take advantage of the abstracted entity without having to know the details of its implementation. Functions, classes, interfaces, and packages are classic abstractions provided by programming languages.

In summary, the primary objective of software design is to decompose a problem into smaller parts. Additionally, these parts should be implemented independently. Finally, and equally important, it should be possible to handle them at a higher level of abstraction. While their implementation might be challenging and complex for the developers directly involved in this task, the created abstraction should be simple to use for other developers.

5.1.1 Example

To illustrate this discussion on software design, we will use the example of a compiler. The requirements in this case are clear: given a program in a source language X, the compiler should translate it into an equivalent program in a target language Y, which is usually a machine language. However, designing a compiler is not trivial. Thus, after years of research, a solution (or design) for implementing compilers was proposed, as illustrated in the following figure.

Main modules of a compiler

Our initial problem, i.e., designing a compiler, was broken down into four smaller problems, which we will briefly describe next. First, we need to implement a lexical analyzer, which breaks the input file into tokens (like if, for, while, x, +, etc.).

Then, we need to implement a syntax analyzer, which analyzes the tokens and checks if they follow the source language grammar. After that, we should transform the list of tokens into a hierarchical structure, known as Abstract Syntax Tree (AST). Finally, we have to implement the semantic analyzer, which is used to detect, for example, type errors; and the code generator, which converts the program’s AST to a lower-level language that can be executed by a given hardware or virtual machine.

This description, although simple, makes clear the objective of software design: to decompose a problem into smaller parts. That is, in our example, we have to design and implement a lexical analyzer, a syntax analyzer, a semantic analyzer, and a code generator. In fact, there are many challenges in each of these tasks, but we are closer to a solution than when we started to think about the compiler design.

Continuing with the example, the lexical analyzer implementation may require some effort. However, it should be as easy to use as invoking a function that returns the next token of the input file:

String token = Scanner.next_token();

Thus, we were able to encapsulate (or abstract) the complexity of a lexical analyzer into a single function.
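To make the idea concrete, here is a minimal sketch of such a scanner in Java. Everything in it is assumed for illustration: the class name Scanner, the whitespace-based tokenization, and the method name nextToken (a more idiomatic Java spelling of next_token). A real lexical analyzer reads the input character by character and classifies each token, but its clients would still see only this simple interface.

```java
import java.util.Arrays;
import java.util.Iterator;

public class Scanner {

    private final Iterator<String> tokens;

    public Scanner(String input) {
        // Real lexers work character by character and recognize token
        // classes; splitting on whitespace is a simplification.
        this.tokens = Arrays.asList(input.trim().split("\\s+")).iterator();
    }

    // Returns the next token, or null when the input is exhausted.
    public String nextToken() {
        return tokens.hasNext() ? tokens.next() : null;
    }
}
```

Clients can iterate over the tokens of a program fragment such as "if x > 0" without knowing anything about how the input is read or split.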

5.1.2 Topics of Study

Software design depends on experience and, to some extent, on talent and creativity. Despite this, high-quality designs tend to share some properties. Thus, we will start by covering desired design properties, such as conceptual integrity, information hiding, cohesion, and coupling. After that, we will move our focus to design principles, which are guidelines to ensure that a design meets the mentioned properties. It is also possible to use a quantitative approach, based on metrics, to evaluate design properties. Thus, to conclude the chapter, we will study metrics to assess cohesion, coupling, and complexity.

Notice: The properties and principles discussed in this chapter apply to object-oriented designs. That is, the assumption is that we are dealing with systems implemented (or that will be implemented) in programming languages such as Java, C++, C#, Python, Go, Ruby, TypeScript, etc. Surely, some of the covered topics apply to structured designs (using languages such as C) or to functional designs (using languages such as Haskell, Elixir, Erlang, or Clojure). But it’s not our goal to provide complete coverage of the design aspects in such cases.

5.2 Conceptual Integrity

Conceptual integrity is a design property proposed by Frederick Brooks—the same professor behind Brooks’ Law, as we mentioned in Chapter 1. The property was articulated in 1975 in the first edition of his book, The Mythical Man-Month (link). Brooks argues that software should not be a collection of features without coherence and cohesion. Thus, conceptual integrity facilitates system usage and comprehension. For example, when we follow this property, a user familiar with one part of a system feels comfortable using another part as the features and user interface remain consistent.

To provide a counterexample, let’s assume a system that uses tables to present its results. However, depending on the pages where they are used, these tables have different layouts in terms of font sizes, the use of bold, line spacing, etc. Additionally, in some tables, we can sort the data by clicking on the column titles, but in other tables this feature is not available. Finally, the values representing prices are shown in different currencies. In some tables, they refer to euros; in other tables, they refer to dollars. These issues denote a lack of conceptual integrity and, as we have stated, add accidental complexity to the system’s use and understanding.

In the first edition of his book, Brooks emphatically defended the principle, stating on page 42 that:

Conceptual integrity is the most important consideration in system design. It is better to have a system omit certain anomalous features and improvements, but to reflect one set of design ideas, than to have one that contains many good but independent and uncoordinated ideas.

In 1995, in a commemorative edition marking the 20th anniversary of the book (link, page 257), Brooks again defended the principle, with even more emphasis:

Today I am more convinced than ever. Conceptual integrity is central to product quality.

Every time we discuss conceptual integrity, there is a debate about whether the principle requires a central authority, such as a single architect or product manager, to decide which functionalities will be included in the system. On the one hand, this role is not part of the definition of conceptual integrity. On the other hand, there is a consensus that key design decisions should not be delegated to large committees. When this happens, the tendency is to produce systems with more features than necessary, leading to bloated systems. For example, one group may advocate for feature A, while another supports feature B. Perhaps, the two features are mutually exclusive, but to reach a consensus, the committee might decide to implement both. As a result, the two groups are satisfied, even though the system’s conceptual integrity is compromised. There is a phrase that nicely illustrates this discussion: A camel is a horse designed by a committee.

In the previous paragraphs, we emphasized the impact of a lack of conceptual integrity on customers. However, the property also applies to the design and source code of software systems. In this case, the affected parties are the developers, who may face increased challenges in comprehending, maintaining, and evolving the system. Here are some examples of a lack of conceptual integrity at the code level:

  • When one part of the system uses a naming convention for variables (for instance, camel case, like totalScore), whereas in another part, another convention is used (for example, snake case, like total_score).

  • When one part of the system uses a certain framework for rendering web pages, while another part uses a second framework or a different version of the first framework.

  • When a problem is solved in one part of the system using a data structure X, while in another part, a similar problem is solved using a data structure Y.

  • When functions from one part of the system that need certain information, like a server address, retrieve it from a configuration file. However, in other functions, this information is passed as a parameter.

These examples reveal a lack of standardization and, consequently, of conceptual integrity. They are a problem because they make it harder for a developer responsible for maintaining one part of the system to be assigned to maintain another one.

Real World: Samuel Roso and Daniel Jackson, researchers at MIT in the USA, give an example of a system that implements two functionalities with very similar purposes, which is an indicator of conceptual integrity problems (link). According to them, in a well-known blogging system, when a user included a question mark in the title of a post, a window opened asking whether they wanted to allow answers to that post. However, the researchers argue that this strategy confused users, since the system already had a commenting feature. Thus, the confusion stemmed from two similar functionalities: comments (in regular posts) and answers (in posts with titles ending in a question mark).

5.3 Information Hiding

This property was first discussed in 1972 by David Parnas, in one of the most influential Software Engineering papers of all time, entitled On the criteria to be used in decomposing systems into modules (link). The paper’s abstract begins as follows:

This paper discusses modularization as a mechanism for improving the flexibility and comprehensibility of a system while allowing the shortening of its development time. The effectiveness of a modularization is dependent upon the criteria used in dividing the system into modules.

Parnas uses the term module in his paper, but at a time when object orientation had not yet emerged, at least not as we know it today. In this chapter, written almost 50 years after Parnas’ work, we use the term class instead of module. The reason is that classes are the principal unit of modularization in popular programming languages, such as Java, C++, and TypeScript. However, the contents of this chapter apply to other modularization units, including those smaller than classes, such as methods and functions, and also to larger units, like packages.

Information hiding brings the following advantages to a software project:

  • Parallel development: Suppose a system implemented using classes C1, C2, …, Cn. If these classes hide their internal design decisions, it becomes easier to assign their implementation to different developers. This ultimately leads to a reduction in the implementation time of the system.

  • Changeability: Suppose we discover that class Cx has performance problems. If the implementation details of Cx are hidden from the rest of the system, it becomes easier to replace it with another class Cy, which uses a more efficient data structure and algorithm. This change is also less likely to result in bugs in other classes.

  • Comprehensibility: Suppose a new developer is hired by the company. They can be assigned to work on a few classes only. In other words, they will not need to comprehend the entire system, but only the implementation of the classes for which they are responsible.

To achieve these benefits, classes must hide their design decisions that are subject to changes. A design decision is any aspect of the class’s design, such as the algorithms and data structures used in its code. Nowadays, the attributes and methods that a class intends to hide are declared with the private modifier, available in languages such as Java, C++, C#, and Ruby.

However, information hiding does not mean that a class should encapsulate all its data and code. Certainly, this would result in a class without utility. Indeed, a useful class must have public methods, i.e., methods that can be called and used by clients. We also say that a class’s public members define its interface. This concept is very important as it represents the visible part of the class.

Interfaces must be stable, as changes in a class’s interface may trigger updates in clients. To give an example, suppose a class Math, with methods providing mathematical operations. Suppose a method sqrt that computes the square root of its parameter. Suppose further that this method’s signature is changed to, for example, return an exception if the received argument is negative. This change affects the clients that call sqrt, since they will have to make their calls in a try block and be prepared to catch the exception.
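The effect of such a change can be sketched in Java. The names MyMath and NegativeArgumentException are assumptions, chosen to avoid clashing with java.lang.Math; the point is that a checked exception in the new signature forces every caller into a try block.

```java
public class MyMath {

    // A checked exception: the compiler now forces every caller of sqrt
    // to handle it, which is exactly why the change ripples to clients.
    public static class NegativeArgumentException extends Exception {
        public NegativeArgumentException(String message) {
            super(message);
        }
    }

    public static double sqrt(double x) throws NegativeArgumentException {
        if (x < 0) {
            throw new NegativeArgumentException("negative argument: " + x);
        }
        return Math.sqrt(x);
    }
}

// A hypothetical client: after the change, its call must sit in a try block.
class Client {
    static double hypotenuse(double a, double b) {
        try {
            return MyMath.sqrt(a * a + b * b);
        } catch (MyMath.NegativeArgumentException e) {
            throw new IllegalStateException(e); // unreachable: a*a + b*b >= 0
        }
    }
}
```

Even a client that can never trigger the exception, like hypotenuse above, has to be edited and recompiled, which illustrates the cost of changing a public interface.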

5.3.1 Example

Consider a system for managing parking lots. The main class of this system is as follows:

import java.util.Hashtable;

public class ParkingLot {

  public Hashtable<String, String> vehicles;

  public ParkingLot() {
    vehicles = new Hashtable<String, String>();
  }

  public static void main(String[] args) {
    ParkingLot p = new ParkingLot();
    p.vehicles.put("TCP-7030", "Accord");
    p.vehicles.put("BNF-4501", "Corolla");
    p.vehicles.put("JKL-3481", "Golf");
  }
}

This class does not hide its internal data that might change in the future. Specifically, the hash table that stores the vehicles in the parking lot is public. Thus, clients, such as the main method, have access to this data and can, for example, add vehicles to ParkingLot. If we decide to use another data structure in the future, we will need to update all clients.

By analogy, this first implementation of ParkingLot would be equivalent in a manual parking system to customers, after parking their car, entering the control booth and writing down their car’s license plate and model in the control sheet.

Next, we show an improved implementation, where the class encapsulates the data structure responsible for storing the vehicles. To park a vehicle, there is now the park method. This gives the class developers the freedom to change the internal data structures without impacting the clients. The only restriction is that the signature of the park method should be preserved.

import java.util.Hashtable;

public class ParkingLot {

  private Hashtable<String,String> vehicles;

  public ParkingLot() {
    vehicles = new Hashtable<String, String>();
  }

  public void park(String license, String vehicle) {
    vehicles.put(license, vehicle);
  }

  public static void main(String[] args) {
    ParkingLot p = new ParkingLot();
    p.park("TCP-7030", "Accord");
    p.park("BNF-4501", "Corolla");
    p.park("JKL-3481", "Golf");
  }
}

In summary, this new version hides a data structure that is subject to future changes. It also provides a stable interface for usage by the class clients, represented by the park method.
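As an illustration of this freedom, here is a hypothetical third version in which the internal Hashtable is swapped for a HashMap. The isParked query method is an assumption added here only to make the behavior observable; the essential point is that park's signature, and therefore every client, is untouched.

```java
import java.util.HashMap;
import java.util.Map;

public class ParkingLot {

    // The internal structure changed (Hashtable -> HashMap), but since the
    // field is private and park's signature is preserved, no client breaks.
    private Map<String, String> vehicles;

    public ParkingLot() {
        vehicles = new HashMap<>();
    }

    public void park(String license, String vehicle) {
        vehicles.put(license, vehicle);
    }

    // Hypothetical query method, included only so the behavior is testable.
    public boolean isParked(String license) {
        return vehicles.containsKey(license);
    }
}
```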

Real World: It is said that in 2002, Amazon’s CEO, Jeff Bezos, sent an email to the company developers with the following software design guidelines (this message is also mentioned in the Software Engineering textbook by Fox and Patterson (link, page 81)):

1. All teams responsible for different subsystems of Amazon.com will henceforth expose their subsystem’s data and functionality through service interfaces only.

2. No subsystem is to be allowed direct access to the data owned by another subsystem; the only access will be through an interface that exposes specific operations on the data.

3. Furthermore, every such interface must be designed so that someday it can be exposed to outside developers, not just used within Amazon.com itself.

Essentially, these guidelines mean that Amazon developers should, after receiving the message, follow the information hiding ideas proposed by Parnas in 1972.

5.3.2 Getters and Setters

Methods get and set, often referred to simply as getters and setters, are widely used in object-oriented languages. A common recommendation for using these methods is as follows: all data in a class should be private; if external access is required, it should be done via getters (for read access) and setters (for write access).

As an example, see the following Student class, in which get and set methods are used to access an enrollmentNumber attribute.

class Student {

  private int enrollmentNumber;
  ...
  public int getEnrollmentNumber() {
    return enrollmentNumber;
  }

  public void setEnrollmentNumber(int enrollmentNumber) {
    this.enrollmentNumber = enrollmentNumber;  
  }
  ...
}

However, getters and setters do not guarantee information hiding. On the contrary, they are a source of information leakage. Here’s what John Ousterhout says about these methods (link, Section 19.6):

Although it may make sense to use getters and setters if you must expose instance variables, it’s better not to expose instance variables in the first place. Exposed instance variables mean that part of the class’s implementation is visible externally, which violates the idea of information hiding and increases the complexity of the class’s interface.

In short: make sure you need to expose a class’s private information. If it is really necessary, consider implementing this exposure through getters and setters instead of making the attribute public.

In our example, let’s assume that it’s vital for clients to be able to read and modify the students’ enrollment numbers. Therefore, it’s better to provide access to this attribute through getters and setters since they offer a more stable interface for this purpose for the following reasons:

  • In the future, we may need to retrieve the enrollment number from a database, meaning this data will not always be in main memory. This logic could be implemented in the get method without impacting any class client.

  • In the future, we may need to add a check digit to the enrollment numbers. The logic for computing this extra digit could be implemented in the set method without affecting the clients.
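For instance, the check-digit scenario could be sketched as follows. The mod-10 digit-sum rule is an assumption used purely for illustration; what matters is that the new logic lives entirely inside the setter, so callers of setEnrollmentNumber are unchanged.

```java
public class Student {

    private int enrollmentNumber;

    public int getEnrollmentNumber() {
        return enrollmentNumber;
    }

    // Evolved setter: stores the number with a trailing check digit,
    // computed with an assumed mod-10 digit-sum rule. Every existing
    // call to setEnrollmentNumber keeps working without modification.
    public void setEnrollmentNumber(int number) {
        this.enrollmentNumber = number * 10 + checkDigit(number);
    }

    private static int checkDigit(int number) {
        int sum = 0;
        for (int n = Math.abs(number); n > 0; n /= 10) {
            sum += n % 10; // add each decimal digit
        }
        return sum % 10;
    }
}
```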

Furthermore, getters and setters are required by some libraries, such as debugging, serialization, and mock libraries (we will study more about mocks in Chapter 8).

5.4 Cohesion

The implementation of any class should be cohesive, meaning that classes should implement a single function or service. Specifically, all methods and attributes of a class should contribute to the implementation of the same service. Another way to explain cohesion is by stating that every class should have a single responsibility in the system. Or, there should only be one reason to modify a class.

Cohesion has the following advantages:

  • It facilitates the implementation, understanding, and maintenance of a class.

  • It facilitates defining a single developer for maintaining a class.

  • It facilitates reuse and testing, as it’s simpler to reuse and test a cohesive class than a class that is responsible for many tasks.

Separation of concerns is another recommended property in software design, which is very similar to the concept of cohesion. It recommends that a class should implement only one concern. In this context, concern refers to any functionality, requirement, or responsibility of the class. Therefore, the following recommendations are equivalent: (1) a class should have a single responsibility; (2) a class should implement a single concern; (3) a class should be cohesive.

5.4.1 Examples

Example 1: The previous discussion was about class cohesion. However, the concept also applies to methods or functions. For example, consider a function like the following:

double sin_or_cos(double x, int op) {
  if (op == 1)
    "calculates and returns the sine of x"
  else
    "calculates and returns the cosine of x"
}

This function—which is an extreme example, and we hope, uncommon in practice—has a serious cohesion problem, as it does two things: it calculates the sine or cosine of its argument. It is then highly advisable to have separate functions for each of these tasks.
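A cohesive rewrite might simply split the function in two, each with a single responsibility and a single reason to change (here delegating to the standard library's Math.sin and Math.cos):

```java
public class Trig {

    // One function per task: clients now call the function that names
    // exactly what they need, with no op flag to get wrong.
    static double sin(double x) {
        return Math.sin(x);
    }

    static double cos(double x) {
        return Math.cos(x);
    }
}
```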

Example 2: Now, consider the following class:

class Stack<T> {
  boolean empty() { ... }
  T pop() { ... }
  void push(T t) { ... }
  int size() { ... }
}

This is a cohesive class, as all its methods implement important operations in a Stack data structure.

Example 3: Let’s return to the ParkingLot class, to which we’ve now added four manager-related attributes:

class ParkingLot {
  ...
  private String managerName;
  private String managerPhone;
  private String managerSSN;
  private String managerAddress;
  ...
}  

However, this class’s key responsibility is to control the parking lot operation, including methods like park(), calculatePrice(), releaseVehicle(), among others. Thus, it should not take on responsibilities related to the management of the parking lot’s employees. For that, a second class should be created, called, for example, Employee.
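A sketch of this refactoring is shown below. The Employee class and its attributes are assumptions based on the four manager* fields above; ParkingLot now keeps a single reference to an Employee and stays focused on the parking operation.

```java
public class Employee {

    private final String name;
    private final String phone;
    private final String ssn;
    private final String address;

    public Employee(String name, String phone, String ssn, String address) {
        this.name = name;
        this.phone = phone;
        this.ssn = ssn;
        this.address = address;
    }

    public String getName() {
        return name;
    }
}

class ParkingLot {

    private Employee manager; // replaces the four manager* string attributes

    // park(), calculatePrice(), releaseVehicle(), ... remain focused on
    // the operation of the parking lot itself.

    public void setManager(Employee manager) {
        this.manager = manager;
    }

    public Employee getManager() {
        return manager;
    }
}
```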

5.5 Coupling

Coupling refers to the strength of the connection between two classes. Although it might sound simple, the concept is subject to some nuances. Thus, for didactic purposes, we will divide coupling into two main types: acceptable coupling and poor coupling.

We say there is an acceptable coupling between class A and class B when:

  • Class A only uses public methods from class B.

  • The interface provided by B is stable, both syntactically and semantically. This means that the signatures of B’s public methods do not change frequently, and neither does the behavior of these methods. Therefore, changes in B will rarely impact A.

By contrast, there is a poor coupling between class A and B when changes in B can easily impact A. This occurs mainly in the following situations:

  • When classes A and B share a global variable or data structure, for instance, when B changes the value of a global variable used by A.

  • When class A directly accesses a file or database of class B.

  • When B’s interface is not stable. For example, the public methods of B are frequently renamed.

Poor coupling is characterized by the fact that the dependency between the classes is not mediated by a stable interface. For example, it is difficult to assess the impact that updating a global variable may have on other parts of the system. Conversely, when an interface is updated, this impact is more explicit. For example, in statically typed languages, the compiler will indicate the clients that need to be updated.
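The global-variable case can be sketched in Java with a shared static field. Config, timeoutSeconds, and the two methods are hypothetical names; the point is that nothing in A's or B's interface documents the dependency between them.

```java
class Config {
    // Shared mutable global state: the only "interface" between A and B
    // is an unprotected static field.
    static int timeoutSeconds = 30;
}

class A {
    int retryBudget() {
        // A silently depends on whatever value was last written to Config.
        return Config.timeoutSeconds * 2;
    }
}

class B {
    void disableTimeouts() {
        // This write propagates invisibly to A; no method signature
        // reveals the dependency, so its impact is hard to assess.
        Config.timeoutSeconds = 0;
    }
}
```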

In summary, coupling is not necessarily bad, particularly when it occurs with the interface of a stable class that provides a relevant service to the source class. Poor coupling, however, should be avoided, as it refers to coupling that is not mediated by interfaces.

There is often a common recommendation regarding coupling and cohesion:

Maximize cohesion and minimize coupling.

Indeed, if a class depends on many other classes, it may be taking on too many responsibilities in the form of non-cohesive features. Remember that a class should have a single responsibility (or a single reason to change). However, we should be careful with the meaning of minimize coupling in this case. The objective is not to completely eliminate coupling, as it is natural for a class to depend on other classes, especially those that implement basic services, such as data structures and input/output. Rather, it is poor coupling that should be eliminated or reduced to the minimum possible.

5.5.1 Examples

Example 1: Consider the ParkingLot class, as used in Section 5.3, which has a Hashtable attribute. Therefore, ParkingLot is coupled to Hashtable. However, in our classification, this is an acceptable coupling; that is, it should not be a cause for major concerns, for the following reasons:

  • ParkingLot only uses the public methods of Hashtable.

  • The Hashtable interface is stable, as it is part of Java’s API (assuming the code is implemented in this language). Thus, a change in the public interface of Hashtable would not only break our ParkingLot class but perhaps millions of other classes in Java projects around the world.

Example 2: Consider the following code, where a file is shared by classes A and B, maintained by distinct developers. The B.g() method writes an integer to the file, which is read by A.f(). This form of communication results in poor coupling between these classes. For instance, B’s developer might not be aware that the file is being used by A. Consequently, to facilitate the implementation of a new feature, she might change the file format without notifying the maintainer of A.

class A {
  private void f() {
    int total; ...
    File file = File.open("file1.db");
    total = file.readInt();
    ...
  }
}
class B {
  private void g() {
    int total;
    // computes total value
    File file = File.open("file1.db");
    file.writeInt(total);
    ...
    file.close();
  }
}

Before moving on, a quick note: in the example, there is also coupling between B and File. However, this is an acceptable coupling since the class needs to manipulate a file. Thus, to achieve this, nothing is better than using a class (File) from the language API.

Example 3: Now, we show an improved implementation for classes A and B from the previous example:

class A {

  private void f(B b) {
    int total;
    total = b.getTotal();
    ...
  }
}
class B {

  int total;

  public int getTotal() {
    return total;
  }

  private void g() {
    // computes total value
    File file = File.open("file1.db");
    file.writeInt(total);
    ...
  }
}

In this new implementation, the dependency from A to B is made explicit. First, B has a public method that returns total. Moreover, class A depends on B through a parameter in f, which is used to call getTotal(). This method is public, so B’s developers have explicitly decided to expose this information to clients. Therefore, in this new version, the coupling from A to B is acceptable. In other words, it’s not a coupling that raises major concerns.

Interestingly, in the first implementation (Example 2), class A doesn’t declare any variable or parameter of type B. Despite that, there is a poor form of coupling between the classes. In the second implementation, the opposite occurs, as A.f() has a parameter of type B. Regardless, the coupling in this case has a better nature, as it is easier to study and maintain A without knowing details about B.

Some authors also use the terms structural coupling and evolutionary coupling, with the following meaning:

  • Structural Coupling between A and B occurs when a class A has an explicit reference in its code to a class B. For example, the coupling between ParkingLot and Hashtable is structural.

  • Evolutionary (or Logical) Coupling between A and B occurs when changes in class B usually propagate to A. Thus, in Example 2, where class A depends on an integer stored in a file maintained by B, there is an evolutionary coupling between A and B. For example, changes in the file format have an impact on A.

Structural coupling can be acceptable or poor, depending on the stability of the target class’s interface. Evolutionary coupling, especially when most changes in B propagate to other classes, represents poor coupling.

During his time working at Facebook (now Meta), Kent Beck created a glossary of software design terms. In this glossary, coupling is defined as follows (link):

Two elements are coupled if changing one implies changing the other. […] Coupling can be subtle (we often see examples of this at Facebook). Site events, where parts of the site stop working for a time, are often caused by nasty bits of coupling that no one expected—changing a configuration in system A causes timeouts in system B, which overloads system C.

The definition of coupling proposed by Beck—two elements are coupled if changing one implies changing the other—corresponds to our definition of evolutionary coupling. That is, it appears that Beck is not concerned about acceptable coupling (i.e., structural and stable). Indeed, he comments on that in the second paragraph of the glossary entry:

Coupling is expensive but some coupling is inevitable. The responsive designer eliminates coupling triggered by frequent or likely changes and leaves in coupling that doesn’t cause problems in practice.

The definition also makes it clear that coupling can be indirect. That is, changes in A can propagate to B, and then affect C.

Real World: An example of a real problem caused by indirect coupling became known as the left-pad episode. In 2016, a copyright dispute prompted a developer to remove one of his libraries from npm, the registry widely used for distributing Node.js software. The removed library implemented a single JavaScript function, named leftPad, with only 11 lines of code, which padded a string with blanks on the left. For example, leftPad('foo', 5) would return '  foo', i.e., foo preceded by two blanks.

Thousands of sites depended on this trivial function, but via indirect dependencies. For instance, they used npm to dynamically download a library A1, which depended on a library A2, and so on, until reaching a library An with a direct dependency on left-pad. As a result, A1, A2, …, An were broken for a few hours, until the library was reinserted into the npm registry. In short, the sites were affected by a problem in a trivial library that most of them had no idea they depended on.

5.6 SOLID and Other Design Principles

Design principles are recommendations that software developers should follow to achieve the design properties we presented in the previous sections. Thus, the properties can be seen as generic qualities of good designs, while the principles serve as guidelines to achieve them.

We will study seven design principles, as listed in the next table. The table also shows the design property promoted by following each principle.

Design Principle | Design Property
Single Responsibility | Cohesion
Interface Segregation | Cohesion
Dependency Inversion | Coupling
Prefer Composition over Inheritance | Coupling
Law of Demeter | Information Hiding
Open/Closed | Extensibility
Liskov Substitution | Extensibility

Five of these are known as the SOLID Principles, an acronym created by Robert Martin and Michael Feathers (link). The name comes from the initial letters of the principles:

  • Single Responsibility Principle
  • Open Closed Principle
  • Liskov Substitution Principle
  • Interface Segregation Principle
  • Dependency Inversion Principle

The design principles we’ll discuss share a common goal: they not only solve a design problem but also ensure that the solution can be more easily maintained and evolve over time. Major challenges within software projects often arise during maintenance. Typically, maintenance tasks, including bug fixes and the implementation of new features, tend to become progressively slower, more costly, and riskier. Thus, the design principles we will study help to produce flexible designs that make maintenance easier.

5.6.1 Single Responsibility Principle

This principle is an application of the idea of cohesion. It proposes the following: every class should have a single responsibility. Moreover, responsibility, in the context of the principle, means a reason for changing a class. That is, there should be only one reason to modify any class in a system.

An important corollary of this principle recommends separating presentation from business rules. In other words, a system should have presentation classes, which implement the interface with users, and classes responsible for business rules. These latter classes perform computations directly related to the system domain. These are distinct interests and responsibilities, and they undergo modifications for different reasons. Hence, they should be implemented in separate classes. In fact, it is not surprising that there are developers nowadays who specialize only in front-end requirements (i.e., presentation classes) and developers who specialize in backend requirements (i.e., classes that implement business rules).

Example: The following Course class illustrates a violation of the Single Responsibility Principle. The calculateDropoutRate method in this class has two responsibilities: calculating the dropout rate of a course and presenting it on the system console.

class Course {
  void calculateDropoutRate() {
    rate = "compute dropout rate";
    System.out.println(rate);
  }
}

Thus, a better design consists of dividing these responsibilities between two classes: a user interface class (Console) and a business rule class (Course), as shown in the following code. Among other benefits, this solution allows the reuse of the business class with other interface classes, such as web or mobile interfaces.

class Console {
  void printDropoutRate(Course course) {
    double rate = course.calculateDropoutRate();
    System.out.println(rate);
  }
}

class Course {
  double calculateDropoutRate() {
    double rate = "compute the dropout rate";
    return rate;
  }
}

5.6.2 Interface Segregation Principle

Like the previous principle, this one is also an application of the idea of cohesion. In fact, it is a particular case of Single Responsibility, but focused on interfaces. The principle states that interfaces should be small, cohesive, and, more importantly, specific to each type of client. The goal is to prevent clients from depending on interfaces with many methods they won’t use. To avoid this, two or more specific interfaces can, for example, replace a general-purpose one.

A violation of the principle occurs, for example, when an interface has two sets of methods, Mx and My. The first set is used by clients Cx (which do not use the methods My). Conversely, methods My are used only by clients Cy. Consequently, this interface should be broken down into two smaller and specific interfaces: one interface containing only the methods Mx and the second interface containing only methods My.

Example: Consider an Account interface with the following methods: (1) getBalance, (2) getInterest, and (3) getSalary. This interface violates the Interface Segregation Principle since only savings accounts pay interest; and only salary accounts have an associated salary (assuming the latter are used by companies to pay employees their monthly salaries).

interface Account {
  double getBalance();
  double getInterest(); // only applicable to SavingsAccount
  int getSalary(); // only applicable to SalaryAccount
}

An alternative, which follows the Interface Segregation Principle, involves creating two specific interfaces (SavingsAccount and SalaryAccount) that extend the generic one (Account).

interface Account {
  double getBalance();
}

interface SavingsAccount extends Account {
  double getInterest();
}

interface SalaryAccount extends Account {
  int getSalary();
}  
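To make the benefit concrete, here is a minimal sketch; the implementation class, its initial balance, and the interest rate are hypothetical, and the two interfaces are repeated for completeness. The Report client is programmed against Account only, so it cannot even see getInterest or getSalary:

```java
interface Account {
  double getBalance();
}

interface SavingsAccount extends Account {
  double getInterest();
}

// Hypothetical implementation; the balance and the 5% rate are
// used only for illustration
class BasicSavingsAccount implements SavingsAccount {
  private double balance = 100.0;

  public double getBalance() {
    return balance;
  }

  public double getInterest() {
    return balance * 0.05;
  }
}

class Report {
  // A client of Account: unaware of getInterest and getSalary
  static double totalBalance(Account[] accounts) {
    double total = 0.0;
    for (Account acc : accounts)
      total += acc.getBalance();
    return total;
  }
}
```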

5.6.3 Dependency Inversion Principle

This principle recommends that a client class should primarily depend on abstractions and not on concrete implementations, as abstractions (i.e., interfaces) tend to be more stable than concrete implementations (i.e., classes). The idea is to invert the dependencies: instead of relying on concrete classes, clients should depend on interfaces. Hence, a more intuitive name for the principle would be Prefer Interfaces to Classes.

To further explain the principle, consider an interface I and a class C1 that implements it. If possible, a client should depend on I and not on C1. The reason is that when a client depends on an interface, it becomes immune to changes in its implementations. For example, instead of C1, the implementation can be changed to C2, which will have no impact on the client in question.

Example 1: The following code illustrates the scenario we just described. In this code, the Client class can work with concrete objects from classes C1 and C2. In fact, it does not need to know the concrete class that implements the interface I that it references in its code.

interface I { ... }

class C1 implements I {
  ...
}
class C2 implements I {
  ...
}
class Client {

  I i;

  Client(I i) {
    this.i = i;
    ...
  }
  ...
}
class Main {

  void main () {
    C1 c1 = new C1();
    new Client(c1);
    ...
    C2 c2 = new C2();
    new Client(c2);
    ...
  }
}

Example 2: Now, we present another example of the Dependency Inversion Principle. In the following code, this principle justifies the choice of the interface Projector as the type of the g method parameter.

void f() {
  EpsonProjector projector = new EpsonProjector();
  ...
  g(projector);
}
void g(Projector projector) {  // Projector is an interface
  ...
}

Tomorrow, the type of the local variable projector in f can change, for instance, to SamsungProjector. If this happens, the implementation of g will continue to work since, when we use an interface, we are prepared to receive parameters of any class that implements it.

Example 3: As a last example, suppose a library that provides a List interface and three concrete implementations (classes) for it: ArrayList, LinkedList, and Vector. Whenever possible, clients of this library should declare variables, parameters, or attributes using the List interface because that way their code is automatically compatible with the three current implementations of this interface and also with future ones.
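A minimal sketch of this guideline (the helper name smallest is ours): the same client code works, unchanged, with different List implementations:

```java
import java.util.*;

class ListClients {
  // Depends only on the List interface, not on any concrete class
  static String smallest(List<String> names) {
    Collections.sort(names);
    return names.get(0);
  }
}
```

For example, smallest(new ArrayList<>(...)) and smallest(new LinkedList<>(...)) both work without any change to the function.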

5.6.4 Prefer Composition Over Inheritance

Before explaining the principle, let’s clarify that there are two types of inheritance:

  • Class inheritance (example: class A extends B) involves code reuse. Throughout this book, we will refer to class inheritance simply as inheritance.

  • Interface inheritance (example: interface I extends J) does not involve code reuse. Thus, this form of inheritance is simpler and does not raise major concerns. When we need to reference it, we will use the full name: interface inheritance.

Returning to the principle, when object-oriented programming gained popularity in the 1980s, inheritance was often seen as a magical solution to software reuse challenges. Consequently, designers at that time began to consider deep class hierarchies as indicators of robust design. However, as time progressed, it became clear that inheritance isn’t the magical solution it was once thought to be. Instead, it typically introduces maintenance and evolution problems due to the strong coupling between subclasses and their superclass. These problems are described, for example, by Gamma and colleagues in their book on design patterns (link, page 19):

Because inheritance exposes a subclass to details of its parent’s implementation, it’s often said that inheritance breaks encapsulation. The implementation of a subclass becomes so bound up with the implementation of its parent class that any change in the parent’s implementation will force the subclass to change.

However, we should note that the principle doesn’t forbid the usage of inheritance. It simply advises that if there are two design solutions—one based on inheritance and the other on composition—the latter is generally the preferable one. For clarity, a composition relationship exists when a class A contains an attribute of another class B.

Example: Suppose we need to implement a Stack class. There are at least two solutions, via inheritance or composition, as shown next:

Implementation using inheritance:

class Stack extends ArrayList {
  ...
}

Implementation using composition:

class Stack {
  private ArrayList elements;
  ...
}

The implementation using inheritance is not recommended for two reasons: (1) a Stack, conceptually, is not an ArrayList but a data structure that can use an ArrayList in its implementation; (2) when inheritance is used, Stack inherits methods like get and set from ArrayList, which are not part of the specification of stacks.

As another benefit, when we use composition the relationship between the classes is not static, as it is in the case of inheritance. In the implementation using inheritance, Stack is statically coupled to ArrayList, meaning that it is not possible to change this decision at runtime. On the other hand, when a composition-based solution is adopted, this change becomes possible, as shown in the following example:

class Stack {

  private List elements;

  Stack(List elements) {
    this.elements = elements;
  }
  ...
}

In this example, the data structure with the stack elements is passed as a parameter to the Stack constructor. Thus, we can create Stack objects with different data structures. For instance, in one stack, the elements can be stored in an ArrayList; in another stack, they can be in a Vector. It’s worth noting that the Stack constructor receives a List as a parameter, which is an interface type. Therefore, the example also illustrates the previous Prefer Interfaces to Classes principle.
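To illustrate, here is a runnable sketch of this flexible stack; the push and pop methods, implemented by delegation, are our addition, and the class is named FlexStack here only to avoid clashing with java.util.Stack:

```java
import java.util.*;

class FlexStack<E> {
  private List<E> elements;   // composition: the storage is injected

  FlexStack(List<E> elements) {
    this.elements = elements;
  }

  void push(E e) {
    elements.add(e);                              // delegates to the List
  }

  E pop() {
    return elements.remove(elements.size() - 1);  // last element = top
  }
}
```

Now new FlexStack<>(new ArrayList<>()) and new FlexStack<>(new Vector<>()) create stacks backed by different data structures.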

Let’s conclude our presentation by summarizing three points:

  • Inheritance is classified as a white-box reuse mechanism, as subclasses tend to have access to implementation details of the base class. On the other hand, composition is a mechanism of black-box reuse.

  • A design pattern that helps to replace a solution based on inheritance with one based on composition is the Decorator Pattern, which we will study in the next chapter.

  • Due to the problems discussed in this section, more recent programming languages, like Go and Rust, do not include support for inheritance.

5.6.5 Demeter Principle

This principle is named after a research group at Northeastern University in Boston, USA, called Demeter, which conducted research on software modularization. In the late ’80s, as part of their studies, they proposed a set of rules to improve information hiding in object-oriented systems, which became known as the Demeter Principle or Law of Demeter. The principle, also known as the Principle of Least Knowledge, argues that a method’s implementation should only invoke the following other methods:

  • from its own class (case 1)
  • from objects received as parameters (case 2)
  • from objects created by the method itself (case 3)
  • from attributes of the method’s class (case 4)

Example: In the following code, method m1 makes four calls that respect the Demeter Principle. Then, we have method m2, which contains a call that violates the principle.

class DemeterPrinciple {

  T1 attr;

  void f1() { 
    ...
  }

  void m1(T2 p) {   // method following Demeter
    f1();           // case 1: own class
    p.f2();         // case 2: parameter
    new T3().f3();  // case 3: created by the method
    attr.f4();      // case 4: class attribute
  }

  void m2(T4 p) {  // method violating Demeter
    p.getX().getY().getZ().doSomething();
  }

}

Method m2, by sequentially calling three get methods, violates the Demeter Principle. This is because the intermediate objects, returned by the getters, serve as mere pass-through objects to reach the final object. In this example, this final object is the one that provides a useful operation, such as doSomething(). However, the intermediate objects accessed along the chain of calls may expose information about their state. Apart from making the call more complex, the exposed information can change as a result of future maintenance in the system. Thus, if one of the links in the call sequence breaks, m2 has to find an alternative route to reach the final method. In summary, calls that violate the Demeter Principle tend to break encapsulation since they manipulate a sequence of intermediate objects, which are subject to change.

Thus, the Demeter Principle advises that methods should only communicate with their friends, that is, either methods of their own class or methods of objects they received as parameters or created. On the other hand, it is not recommended to communicate with friends of friends.

An example, formulated by David Bock (link), nicely illustrates the benefits of the Demeter Principle. The example is based on three objects: a newspaper delivery person, a customer, and the customer’s wallet. A violation of Demeter occurs if, to receive the payment for a newspaper, the delivery person has to execute the following code:

price = 3.00;
Wallet wallet = customer.getWallet();
if (wallet.getTotalValue() >= price) {  // violates Demeter
   wallet.debit(price);                 // violates Demeter
} else {
  // I'll be back tomorrow to collect the payment
}

In this code, the delivery person accesses the customer’s wallet, via getWallet(), and then directly checks and debits it to pay for the newspaper. However, no customer would accept a delivery person having such freedom. A more realistic solution is the following:

price = 3.00;
try {
  customer.pay(price);
}
catch (InsufficientValueException e) {
  // I'll be back tomorrow to collect the payment
}

In this new code, the customer does not give the delivery person access to his or her wallet. On the contrary, the delivery person is even unaware that the customer has a wallet. This data structure is encapsulated within the Customer class. Instead, the customer offers a pay method, which is called by the delivery person. Finally, an exception signals when the customer does not have enough money to pay for the newspaper.
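The Customer side of this solution can be sketched as follows; the Wallet internals are simplified by us just to make the example self-contained:

```java
class InsufficientValueException extends Exception {}

class Wallet {
  private double totalValue = 10.0;   // hypothetical initial amount

  double getTotalValue() {
    return totalValue;
  }

  void debit(double value) {
    totalValue -= value;
  }
}

class Customer {
  private Wallet wallet = new Wallet();   // encapsulated; never exposed

  void pay(double price) throws InsufficientValueException {
    if (wallet.getTotalValue() < price)
      throw new InsufficientValueException();
    wallet.debit(price);   // only Customer manipulates the wallet
  }
}
```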

5.6.6 Open/Closed Principle

This principle, originally proposed by Bertrand Meyer back in the ’80s (link), advocates an apparently paradoxical idea: a class must be closed for modifications, but open for extensions.

However, the key idea is that developers should not only implement a class, but also prepare it for extensions and customizations. To achieve this, they can use parameters, inheritance, higher-order (or lambda) functions, and design patterns like Abstract Factory, Template Method, and Strategy. In the next chapter, we will discuss some of these design patterns.

In short, the Open/Closed Principle recommends that, whenever necessary, we should implement flexible and extensible classes that can adapt to various usage scenarios without the need to edit their source code.

Example 1: An example of a class that follows the Open/Closed Principle is the Collections class in Java. It has a static method that sorts a list in ascending order. Here’s an example of using this method:

List<String> names;
names = Arrays.asList("john", "megan", "alexander", "zoe");

Collections.sort(names);

System.out.println(names);  
// result: ["alexander","john","megan","zoe"]

However, in the future, we might want to use sort to sort strings based on their length in characters. Fortunately, Collections is prepared for this new use case. But for this, we need to implement a Comparator object that compares the strings by their length, as in the following code:

Comparator<String> comparator = new Comparator<String>() {
  public int compare(String s1, String s2) {
    return s1.length() - s2.length();
  }
};
Collections.sort(names, comparator);

System.out.println(names);   
// result: [zoe, john, megan, alexander]

In other words, the Collections class turned out to be open to cope with this new requirement, while keeping its code closed, that is, we didn’t need to change the source code of the class.
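As an aside, since Java 8 the same Comparator can be expressed more concisely with a lambda or method reference; Collections remains closed for modification and open for this kind of extension:

```java
import java.util.*;

class SortByLength {
  public static void main(String[] args) {
    List<String> names = new ArrayList<>(
        Arrays.asList("john", "megan", "alexander", "zoe"));

    // Same comparator as before, now as a method reference
    names.sort(Comparator.comparingInt(String::length));

    System.out.println(names);  // [zoe, john, megan, alexander]
  }
}
```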

Example 2: Now we will show an example of a function that does not follow the Open/Closed Principle.

double calcTotalScholarships(Student[] list) {
  double total = 0.0;
  for (Student student : list) {
    if (student instanceof UndergraduateStudent) {
      UndergraduateStudent undergrad;
      undergrad = (UndergraduateStudent) student;
      total += "code that calculates undergrad scholarship";
    }
    else if (student instanceof MasterStudent) {
      MasterStudent master;
      master = (MasterStudent) student;
      total += "code that calculates master scholarship";
    }
  }
  return total;
}

If tomorrow we need to create another subclass of Student, for example, DoctoralStudent, the code of calcTotalScholarships needs to be changed too. In other words, the function is not ready to seamlessly accommodate extensions for new types of students.
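For contrast, here is a sketch of a version that follows the principle, moving the calculation into the subclasses (the scholarship values are hypothetical):

```java
abstract class Student {
  abstract double calcScholarship();   // each subclass brings its own rule
}

class UndergraduateStudent extends Student {
  double calcScholarship() { return 500.0; }    // hypothetical value
}

class MasterStudent extends Student {
  double calcScholarship() { return 1500.0; }   // hypothetical value
}

class Scholarships {
  // No instanceof tests: a new subclass requires no change here
  static double calcTotal(Student[] list) {
    double total = 0.0;
    for (Student student : list)
      total += student.calcScholarship();
    return total;
  }
}
```

Now adding DoctoralStudent only requires a new subclass; calcTotal remains untouched.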

The Open/Closed Principle requires class designers to anticipate possible extension points. However, it’s important to highlight that it is not feasible for a class to accommodate all potential extensions. For instance, the sort method (Example 1) uses a version of the MergeSort algorithm, but clients cannot customize this algorithm. Therefore, in terms of configuring the sorting algorithm, the method does not comply with the Open/Closed Principle.

5.6.7 Liskov Substitution Principle

As we discussed in the Prefer Composition over Inheritance principle, inheritance is no longer as popular as it was in the ’80s. Today, the use of inheritance is more restrained and less common. However, some use cases still justify its adoption. Indeed, inheritance establishes an is-a relationship between subclasses and a base class. The advantage is that methods common to the subclasses can be implemented only once, in the base class. After that, they are inherited by all subclasses.

The Liskov Substitution Principle defines rules for redefining methods in subclasses. The principle’s name references Barbara Liskov, an MIT professor and the recipient of the 2008 Turing Award. Among her various works, Liskov conducted research on object-oriented type systems. In one of these works, she enunciated the principle that later was named after her.

To explain the Liskov Substitution Principle, let’s start with this code:

void f(A a) {
  ...
  a.g();
  ...
}

The f method can receive as argument objects of subclasses B1, B2, …, Bn of the base class A, as shown below:

f(new B1());  // f can receive objects from subclass B1 
...
f(new B2());  // and from any subclass of A, such as B2
...
f(new B3());  // and B3

The Liskov Substitution Principle specifies the semantic conditions that subclasses must adhere to for a program to behave as expected.

Let’s consider that subclasses B1, B2, …, Bn redefine the implementation of g() inherited from the base class A. According to the Liskov Substitution Principle, these redefinitions are possible but they must not violate the original contract of g in the base class A.

Example 1: Suppose a PrimeNumber class with methods to perform computations related to prime numbers. This class has subclasses that implement alternative algorithms for the same purpose. Specifically, the getPrime(n) method returns the n-th prime number. This method is implemented in PrimeNumber and it is redefined in its subclasses.

Furthermore, assume that getPrime(n) contract specifies it should compute any prime number for the argument n varying from 1 to 1 million. In this context, a subclass would violate this contract if its implementation of getPrime(n) only handles prime numbers up to, for example, 900,000.

To illustrate the principle further, consider a client that calls p.getPrime(n), where initially p references a PrimeNumber object. Assume now that p is substituted by an object from a subclass. Consequently, after the substitution, the call will execute the getPrime(n) implementation provided by this subclass. Essentially, the Liskov Substitution Principle prescribes that this change in called methods should not affect the client’s behavior. To achieve this, the getPrime(n) implementations in the subclasses must perform the same tasks as the original method, possibly more efficiently. For example, they should accept the same input parameter range or a broader one. However, they cannot be redefined to accept a more restrictive range than the one provided by the base class implementation.
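The violation just described can be sketched as follows; the trial-division implementation and the class names are ours, used only to make the example runnable:

```java
class PrimeNumber {
  // Contract: getPrime(n) returns the n-th prime,
  // for n from 1 to 1,000,000
  long getPrime(int n) {
    int count = 0;
    long candidate = 1;
    while (count < n) {
      candidate++;
      if (isPrime(candidate)) count++;
    }
    return candidate;
  }

  private boolean isPrime(long x) {
    for (long d = 2; d * d <= x; d++)
      if (x % d == 0) return false;
    return x > 1;
  }
}

class RestrictedPrimeNumber extends PrimeNumber {
  // Violates Liskov Substitution: strengthens the precondition
  // by accepting a narrower range than the base class
  @Override
  long getPrime(int n) {
    if (n > 900_000)
      throw new IllegalArgumentException("n too large");
    return super.getPrime(n);
  }
}
```

A client holding a PrimeNumber reference that calls getPrime(950_000) will work with the base class but break after a substitution by RestrictedPrimeNumber.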

Example 2: Let’s present a second example that does not respect the Liskov Substitution Principle.

class A {
  int sum(int a, int b) {
    return a+b;
  }
}
class B extends A {

  int sum(int a, int b) {
    String r = String.valueOf(a) + String.valueOf(b);
    return Integer.parseInt(r);
  }

}
class Client {

  void f(A a) {
    ...
    a.sum(1,2); // can return 3 or 12
    ...
  }

}

class Main {

  void main() {
    A a = new A();
    B b = new B();
    Client client = new Client();
    client.f(a);
    client.f(b);
  }

}

In this example, the method that adds two integers is redefined in the subclass to concatenate the respective values converted to strings. Therefore, a developer maintaining the Client class might find this confusing. In one execution, calling sum(1,2) returns 3 (the integer sum). In the next one, the same call returns 12 (the concatenation of 1 and 2, converted back to an integer).

5.7 Source Code Metrics

Over the years, various metrics have been proposed to quantify software design properties. Typically, they require access to the source code of a system, meaning that the project must have been implemented. By analyzing the structure of the source code, these metrics express properties such as size, cohesion, coupling, and complexity of the code in a quantitative way, using numeric values. The goal is to facilitate an evaluation of the quality of an existing design in an objective manner.

However, monitoring a system’s design using source code metrics is not a common practice nowadays. One of the reasons is that several design properties, such as cohesion and coupling, involve a degree of subjectivity, making their measurement challenging. Additionally, the interpretation of metric values highly depends on contextual information. A specific result might be acceptable in one system but not in another system from a different domain. Even among the classes of a given system, the interpretation of metric values can vary considerably.

In this section, we will delve into metrics to measure the following properties of a design: size, cohesion, coupling, and complexity. We will outline the steps for calculating these metrics and provide some examples. Additionally, there are tools for calculating these metrics, with some functioning as plugins for well-known Integrated Development Environments (IDEs).

5.7.1 Size

The most popular source code metric is lines of code (LOC). It can be used to measure the size of a function, class, package, or an entire system. When reporting LOC results, it should be clear which lines are indeed counted; for example, whether comments and blank lines are included or not.

Although LOC can give an idea of a system’s size, it shouldn’t be used to measure developers’ productivity. For instance, if a developer implemented 1 KLOC in a week and another developer implemented 5 KLOC, we cannot affirm that the second one was five times more productive. Among other reasons, the requirements they implemented might have different complexities. Ken Thompson, one of the developers of the Unix operating system, has an interesting saying about this:

One of my most productive days was throwing away 1000 lines of code.

This quote is attributed to Thompson in a book authored by Eric Raymond, on page 24. Therefore, software metrics, whatever they may be, should not be viewed as a goal. In the case of LOC, for example, this could encourage developers to duplicate code simply to meet a set goal.

Other size metrics include number of methods, number of attributes, number of classes, and number of packages.

5.7.2 Cohesion

A well-known metric for calculating cohesion is called LCOM (Lack of Cohesion in Methods). In general, software metrics are interpreted as follows: the larger the metric value, the poorer the quality of the code or design. However, cohesion is an exception to this rule, as the higher the cohesion measure, the better the design. For this reason, LCOM measures the lack of cohesion in classes. The larger the LCOM value, the greater the lack of cohesion within a class, and consequently, the poorer its design.

To calculate the LCOM value of a class C, first, we should compute the following set:

M(C) = { (f1, f2) | f1 and f2 are distinct methods of C }

It consists of all unordered pairs of methods from class C. Then, we should also compute the following set:

A(f) = Set of attributes accessed by method f

The value of LCOM(C) is defined as follows:

LCOM(C) = | { (f1, f2) in M(C) | A(f1) and A(f2) are disjoint sets } |

That is, LCOM(C) is the number of pairs of methods within class C—taken from all possible pairs—that don’t use common attributes; in other words, the intersection is empty.

Example: To illustrate the computation of LCOM, suppose this class:

class A {

  int a1;
  int a2;
  int a3;

  void m1() {
    a1 = 10;
    a2 = 20;
  }

  void m2() {
    System.out.println(a1);
    a3 = 30;
  }

  void m3() {
    System.out.println(a3);
  }

}

Considering this class, the following table shows the sets M and A, and the intersection of the A sets.

Example of LCOM calculation

Therefore, in this example, LCOM(C) equals 1, as the class has three possible pairs of methods, but two access at least one attribute in common (refer to the third column of the table). There is only one pair of methods that does not share common attributes.

LCOM assumes that, in a cohesive class, any pair of methods should access at least one common attribute. In other words, what contributes to cohesion in a class are methods accessing the same attributes. Therefore, cohesion is compromised, i.e., LCOM increases by one, every time we find a pair of methods (f1, f2) where f1 and f2 manipulate completely different attributes.

In the LCOM calculation, constructors and getters/setters are not considered. Constructors often share attributes with most other methods, while the opposite tends to be true for getters and setters.

Lastly, it’s important to mention that there are alternative proposals for calculating LCOM. The version we present, known as LCOM1, was proposed by Shyam Chidamber and Chris Kemerer in 1991 (link). The alternative versions are called LCOM2, LCOM3, etc. Therefore, when reporting LCOM values, it is important to state which version of the metric is being used.
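The computation described above can also be sketched as a small program (the class and method names are ours). Fed with the attribute sets of class A from the previous example, it returns 1:

```java
import java.util.*;

class LCOM1 {
  // accessed maps each method name to the set of attributes it uses
  static int compute(Map<String, Set<String>> accessed) {
    List<String> methods = new ArrayList<>(accessed.keySet());
    int lcom = 0;
    // Examine every unordered pair of methods
    for (int i = 0; i < methods.size(); i++)
      for (int j = i + 1; j < methods.size(); j++) {
        Set<String> common = new HashSet<>(accessed.get(methods.get(i)));
        common.retainAll(accessed.get(methods.get(j)));
        if (common.isEmpty())   // disjoint attribute sets
          lcom++;
      }
    return lcom;
  }
}
```

For class A, the input would map m1 to {a1, a2}, m2 to {a1, a3}, and m3 to {a3}; only the pair (m1, m3) is disjoint, so the result is 1.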

5.7.3 Coupling

CBO (Coupling Between Objects) is a metric to measure structural coupling between two classes. It was proposed by Chidamber and Kemerer (link and link).

Given a class A, CBO counts the number of classes on which A has syntactical (or structural) dependencies. A depends on a class B when:

  • A calls a method in B
  • A accesses a public attribute of B
  • A inherits from B
  • A declares a local variable, parameter, or return type of type B
  • A catches an exception of type B
  • A throws an exception of type B
  • A creates an object of type B.

Suppose a class A with two methods (method1 and method2):

class A extends T1 implements T2 {

  T3 a;

  T4 method1(T5 p) throws T6 {
    T7 v;
    ...
  }

  void method2() {
    T8 t8 = new T8();
    try {
      ...
    }
    catch (T9 e) { ... }
  }

}

As indicated by the numbering of the types that class A depends on, we have that CBO(A) = 9.

The definition of CBO does not distinguish the target classes responsible for the dependencies. For instance, it doesn’t matter whether the dependency is on a class from the language’s library (e.g., Hashtable) or a less stable class from an application still under development.

5.7.4 Complexity

Cyclomatic Complexity (CC) is a metric proposed by Thomas McCabe in 1976 to measure the complexity of the code within a function or method (link). Sometimes, it is also referred to as McCabe’s Complexity. In the context of this metric, complexity relates to the difficulty of maintaining and testing a function. The definition of CC is based on the concept of control flow graphs. In these graphs, nodes represent the commands in a function or method, and edges represent possible control flows. Thus, commands like an if generate two control flows. The metric’s name derives from the fact that it is calculated using a concept from Graph Theory called the cyclomatic number.

However, there’s a simple alternative to calculate a function’s CC, which doesn’t require control flow graphs. This alternative defines CC as follows:

CC = number of decision commands in a function + 1

Decision commands include if, while, case, for, etc. The intuition behind this formula is that these commands make the code harder to understand and test, and hence, more complex.

Therefore, calculating CC is straightforward: given the source code of a function, count the number of commands listed above and add 1. The lowest CC value is 1, which occurs in code with no decision commands. In the article where he defined the metric, McCabe suggests that a reasonable, but not magical, upper limit for CC is 10.
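For instance, in the following method (ours), there are two decision commands, a for and an if, so CC = 2 + 1 = 3:

```java
class Metrics {
  // CC = 3: one for, one if, plus 1
  static int countNegatives(int[] values) {
    int count = 0;
    for (int v : values)   // decision command 1
      if (v < 0)           // decision command 2
        count++;
    return count;
  }
}
```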

Bibliography

Robert C. Martin. Clean Architecture: A Craftsman's Guide to Software Structure and Design, Prentice Hall, 2017.

John Ousterhout. A Philosophy of Software Design, Yaknyam Press, 2018.

Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.

Frederick Brooks. The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley, anniversary edition, 1995.

Diomidis Spinellis. Code Quality. Addison-Wesley, 2006.

Andrew Hunt, David Thomas. The Pragmatic Programmer: From Journeyman to Master. Addison-Wesley, 1999.

Mauricio Aniche. Orientação a Objetos e SOLID para Ninjas. Projetando classes flexíveis. Casa do Código, 2015.

Thomas J. McCabe. A Complexity Measure. IEEE Transactions on Software Engineering, 1976.

Shyam Chidamber and Chris Kemerer. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 1994.

Shyam Chidamber and Chris Kemerer. Towards a metrics suite for object oriented design. Conference on Object-oriented Programming Systems, Languages, and Applications (OOPSLA), 1991.

Exercises

1. Describe three benefits of information hiding.

2. Suppose two classes, A and B, that are implemented in different directories; class A has a reference in its code to class B. Moreover, whenever a developer needs, as part of a maintenance task, to modify classes A and B, they conclude the task by moving B to the same directory as A. (a) By acting in this way, which design property (measured at the level of directories) is the developer improving? (b) And which design property (also measured for directories) is negatively affected?

3. Classitis is a term used by John Ousterhout to describe the proliferation of small classes in a system. According to him, classitis results in classes that are individually simple but that, taken together, increase the total complexity of a system. Using the concepts of coupling and cohesion, explain the problem caused by this disease.

4. Define: (a) acceptable coupling; (b) poor coupling; (c) structural coupling; (d) evolutionary (or logical) coupling.

5. Give an example of: (1) acceptable and structural coupling; (2) poor and structural coupling.

6. Is it possible for class A to be coupled with class B without having a reference to B in its code? If so, is this coupling acceptable, or is it a case of poor coupling?

7. Suppose a program where all the code is implemented in the main method. Does it have a cohesion or coupling problem? Justify.

8. What design principle is not followed by this code?

void onclick() {
  id1 = textfield1.value();      // presentation
  account1 = BD.getAccount(id1);
  id2 = textfield2.value();
  account2 = BD.getAccount(id2);
  value = textfield3.value();
  beginTransaction();            // non-functional req.
  try {
    account1.withdraw(value);    // business rule
    account2.deposit(value);
    commit();
  }
  catch (Exception e) {
    rollback();
  }
}

9. There are three key concepts in object oriented programming: encapsulation, polymorphism, and inheritance. Suppose you have been asked to design a new object-oriented language. However, you can only choose two of the three discussed concepts. Which concept would you eliminate from your language? Justify your answer.

10. What design principle is not followed by this code? How would you modify the code to follow this principle?

void sendMail(BankAccount account, String msg) {
  Customer customer = account.getCustomer();
  String address = customer.getMailAddress();
  "code that sends the mail"
}  

11. What design principle is not followed by this code? How would you modify the code to follow this principle?

void printHiringDate(Employee employee) {
  Date date = employee.getHireDate();
  String msg = date.format();
  System.out.println(msg);
}  

12. The preconditions of a method are boolean expressions involving its parameters (and possibly the state of the class) that must be true before its execution. Similarly, postconditions are boolean expressions involving the method's result that must be true after its execution. Considering these definitions, which design principle is violated by this code?

class A {
  int f(int x) { // pre: x > 0
    ...
    return exp;
  }              // post: exp > 0
  ...
}

class B extends A {
  int f(int x) { // pre: x > 10
    ...
    return exp;
  }              // post: exp > -50
  ...
}

13. Calculate the CBO and LCOM of the following class:

class A extends B {

  C f1, f2, f3;

  void m1(D p) {
    "uses f1 and f2"
  }

  void m2(E p) {
    "uses f2 and f3"
  }

  void m3(F p) {
    "uses f3"  
  }
}

14. Which of the following classes is more cohesive? Justify by calculating the LCOM values for each one.

class A {

  X x = new X();

  void f() {
    x.m1();
  }

  void g() {
    x.m2();
  }

  void h() {
    x.m3();
  }
}
class B {

  X x = new X();
  Y y = new Y();
  Z z = new Z();

  void f() {
    x.m();
  }

  void g() {
    y.m();
  }

  void h() {
    z.m();
  }

}

15. Why does LCOM measure the lack and not the presence of cohesion? Justify.

16. Should all methods of a class be considered in the LCOM calculation? Yes or no? Justify.

17. The definition of cyclomatic complexity is independent of programming language. True or false? Justify.

18. Provide an example of a function with minimum cyclomatic complexity. What is this complexity?

19. Cristina Lopes is a professor at the University of California, Irvine, USA, and the author of a book about programming styles (link). In the book, she discusses several implementations of the same problem, called term frequency: given a text file, list the n most frequent words in descending order of frequency, ignoring stop words (articles, prepositions, etc.). The Python source code for all implementations discussed in the book is publicly available on GitHub (and for this exercise, we made a fork of the original repository). Analyze two of these versions:

  • Monolithic version (link).

  • Object-oriented version (link).

First, review the code of both versions (each version is under 100 lines). Then, argue for the advantages of the OO solution over the monolithic one. To do so, try to extrapolate the size of the system: for example, suppose it is implemented by several developers, each responsible for a part of the design.