Design and development of software could not withstand the ever-changing requirements and technologies if there was no way to write efficient modular code.

You learned ways of designing such software using functions, modules, and classes. These constructs will help you with modularity and better organization of your code, but just using them does not guarantee better results. Things can get even more complicated if they are misused or overused.

So how can you write your code using constructs like classes and modules, so that they will bring better readability and modularity of your system?

Previously, when discussing functions, you were introduced to a couple of suggestions on how they should be short and do one thing, or that the name of a function should reveal the intent, and other similar advices could be drawn. While these can be generally used across the project to improve the codebase, there are other factors that are leaning more toward the reusability and modularity of your code.

Systems are expected to have a long lifetime; therefore; it is crucial to understand and follow the techniques and best practices that promise more maintainable and modular software.

Maintaining Modularity

Modules ensure that the code is easier to understand, and they lower the overwhelming feeling when you delve into the code for improvements or feature additions. They should encapsulate parts of functionality of a system and constrain how these parts interact among each other.

Too many interactions between different modules can lead to confusion and unwanted side effects when something needs to be changed in a module. Responsibility of a module needs to be transparent so that you can reasonably know what its function is.

A change in one part of the application code should not affect or break other parts of the system. To enable independent evolvement of modules, they need to have well-defined interfaces that do not change. The logic behind that principle is that interfaces can of course be changed without affecting modules that depend on it.

Here are some design guidelines to consider:

Acyclic dependencies principle
Stable dependencies principle
Single responsibility principle

Note

Only a few of the design principles are mentioned here; some of them belong to the SOLID design principles introduced by Robert C. Martin. Different approaches can be found in other design principles.

The acyclic dependency principle ensures that when you split your monolithic application into multiple modules, these modules—and the classes accompanying them—have dependencies in one direction only. If there are cyclic dependencies, where modules or classes are dependent in both directions, changes in module A can lead to changes in module B, but then changes in module B can cause unexpected behavior in module A, from where changes originated. In large and complex systems, these kinds of cyclic dependencies are harder to detect and are often sources of code bugs. It is also not possible to separately reuse or test modules with such dependencies.

Depending on your development ecosystem and language of choice, build tools can help you with identifying circular dependencies. These tools are often using the fact that the module's source code is a part of the same code repository. You might decide to move a module in its own code repository, whether it is because you want to reuse that module in other projects or just because you want to trace it and version it separately from other modules and application code. In this case, you need to be especially careful not to end up having a circular dependency.

If this kind of dependency happens to occur anyway, then there are strategies to break the cyclic dependency chain. High-level modules that consist of complex logic should be reusable and not be affected by the changes in the low-level modules that provide you with application specifics. To decouple these two levels of modules, strategies like dependency inversion or dependency injection, which rely on introduction of an abstraction layer, can be used.

Dependency inversion is defined as follows:

High-level modules should not depend on low-level modules. Both should depend on abstractions.
Abstractions should not depend on details. Details should depend on abstractions.

There is a difference in how to implement this approach between statically typed languages, like Java or C#, and dynamically typed languages, like Python or Ruby. Statically typed languages typically support the definition of an interface. An interface is an abstraction that defines the skeleton code that needs to be extended in your other custom classes. Your code depends on abstraction and implements the details of the desired action. An example would be having a class device that defines a show() interface. The interface does not implement how the show() method should work. Instead, you can create other, more low-level classes, like a Firewall class, that extends and implements the show() method. The implementation is then specific to the low-level class. A Firewall class might have a different implementation than a LoadBalancer class, for example.

Developing against predefined interface abstractions promotes reusability and provides a stable bond with other modules. Besides using abstractions as a means of breaking the cyclic dependency, you also benefit from easier changes of the implementations and more flexible testability of the code, because you can produce mock implementations of interfaces in your tests.

In dynamically typed languages, no explicit interface is defined. In these types of languages, you would normally use duck typing, which means that the appropriateness of an object is not determined by its type (like in the statically typed languages) but rather by the presence of properties and methods. The name "duck typing" comes from the saying, "If it walks like a duck and it quacks like a duck, then it must be a duck." In other words, an object can be used in any context, until it is used in an unsupported way. Interfaces are defined implicitly with adding new methods and properties to the modules or classes.

Observe the following example of the app.py module.

import db


class App:
    def __init__(self):
        self.running = False
        self.database = db.DB()

    def startProgram(self):
        print('Starting the app...')
        self.database.setupDB()
        self.running = True

    def runTest(self, DB):
        print('Checking if app is ready')
        if 'Users' in DB.keys():
            return True
        else:
            return False

The signature of db.py is as follows:

from init import Initialization


class DB:
    def __init__(self):
        self.DB = None

    def setupDB(self):
        print('Creating database...')
        self.DB = {}
        init = Initialization()
        self.DB = init.loadData(DB)

The initialization class is part of the init module.

import app 

initData = {
    'Users': [ 
        {'name': 'Jon', 'title': 'Manager'}, 
        {'name': 'Jamie', 'title': 'SRE'} 
    ] 
}


class Initialization: 
    def __init__(self): 
        self.data = initData 
        self.application = app.App() 

    def loadData(self, DB): 
        print(self.data) 
        DB = self.data 
        validate = self.application.runTest(DB) 
        if validate: 
            return DB 
        else: 
            raise Exception('Data not loaded')

This is an example of a cyclic dependency where the app module uses the database module for setting up the database, and the database module uses the init module for initializing database data. In return, the init module calls the app module runTest() method that checks if the app can run.

In theory, you need to decide in which direction you want the dependency to progress. The heuristic is that frequently changing, unstable modules can depend on modules that do not change frequently and are as such more stable, but they should not depend on each other in the other direction. This is the so-called Stable-Dependencies Principle.

Observe how you can, in a simple way, break the cyclic dependency between these three modules by extracting another module, appropriately named validator.

class Validator: 
    def runTest(self, DB): 
        print('Checking if app is ready...') 
        if 'Users' in DB.keys(): 
            return True 
        else: 
            return False

The app class no longer implements the logic of the runTest() method, so the init module does not reference it anymore.

import validator

<...  output omitted ...>

class Initialization:
    def __init__(self): 
        self.data = initData 
        self.validator = validator.Validator() 

    def loadData(self, DB): 
        DB = self.data 
        validate = self.validator.runTest(DB) 
        if validate: 
            return DB 
        else: 
            raise Exception('Data not loaded')

The cyclic dependency is now broken by splitting the logic into separate modules with a more stable interface, so that other modules can rely on using it.

Note

Python supports explicit abstractions using the Abstract Base Class (ABC) module, which allows you to develop abstraction methods that are closer to the statically typed languages.

Modules with stable interfaces are also more plausible candidates for moving to separate code repositories, so they can be reused in other applications. When developing a modular monolithic application, it is not recommended to rush over and move modules in separate repositories if you are not sure how stable the module interfaces are. Microservices architecture pushes you to go into that direction, because each microservice needs to be an independent entity and in its own repository. Cyclic dependencies are harder to resolve in the case of microservices.

A single module should be responsible to cover some intelligible and specific technical or business feature. As said by Robert C. Martin, coauthor of the Agile Manifesto, a class should have one single reason to change. That will make it easier to understand where lay the code that needs to be changed when a part of an application must be revisited. When a change request comes, it should originate from a tightly coupled group of people, either from the technical or business side, that will ask for a change of a single narrowly defined service. It should not happen that a request from a technical group causes an unwanted change in the way business logic works. When you develop software modules, group the things that change for the same reasons, and separate those that change for different reasons. When this idea is followed, the single-responsibility design principle is satisfied.

Modules, classes, and functions are tools that should reduce complexity of your application and increase reusability. Sometimes, there is a thin line between modular, readable code, and code that is getting too complex.

Next, you will learn about how to improve modular designed software even further.

Loose Coupling

Loose coupling in software development vocabulary means reducing dependencies of a module, class, or function that uses different modules, classes, or functions directly. Loosely coupled systems tend to be easier to maintain and more reusable.

The opposite of loose coupling is tight coupling, where all the objects mentioned are more dependent on one another.

Reducing the dependencies between components of a system results in reducing the risk that changes of one component will require you to change any other component. Tightly coupled software becomes difficult to maintain in projects with many lines of code.

In a loosely coupled system, the code that handled interaction with the user interface will not be dependent on code that handles remote API calls. You should be able to change user interface code without affecting the way remote calls are being made, and vice versa.

Your code will benefit from designing self-contained components that have a well-defined purpose. Changing a part of your code without having to worry that some other components will be broken is crucial in fast-growing projects. Changes are smaller and do not cause any ripple effect across the system, so the development and testing of such code is faster. Adding new features is easier because the interface, for interaction with the module, and implementation will be separated.

So how do you define if a module is loosely or tightly coupled?

Coupling criteria can be defined by three parameters:

Size
Visibility
Flexibility

This criteria is based on the research of Steve McConnell, an author of many textbooks and articles on software development practices.

The number of relations between modules, classes, and functions defines the size criterion. Smaller objects are better because it takes less effort to connect to them from other modules. Generally speaking, functions and methods that take one parameter are more loosely coupled than functions that take 10. A loosely coupled function should not have more than two arguments; more than two should require justification. Functions that look similar, or they share some common code, should be avoided and alternated if they exist. A class with too many methods is not an example of loosely coupled code.

When you implement a new fancy solution to your problem, you should ask yourself if your code became less understandable by doing that. Your solutions should be obvious to other developers. You do not get extra points if you are hiding and passing data to functions in a complex way. Being undisguised and visible is better.

For your modules to be flexible, it should be straightforward to change the interface from one module to the other. Examine the following code:

import addressDb 


class Interface: 
    def __init__(self, name, address): 
        self.name = name 
        self.address = address 
        self.state = "Down" 


class Device: 
    def __init__(self, hostname): 
        self.hostname = hostname 
        self.motd = None 
        self.interface = Interface 

    def add_to_address_list(self): 
        addressDb.add(self.interface)

The device module interacts with the add() function of a module addressDb.

def add(interface): 
    <... implementation omitted ...> 
    print(f'adding address {interface.address}')

At first sight, this code looks good. There are no cyclic dependencies between the modules. There is actually just one dependency. The function takes one argument and there is no data hiding or global data modification, so it looks pretty good. What about flexibility? What if you have another class called "Routes" that also wants to add addresses to the database, but it does not use the same concept of interfaces? The addressDb module expects to get an interface object from where it can read the address. You cannot use the same function for the new Routes class; therefore, the rigidness of the add() function is making code that is tightly coupled. Try to solve this using the next approach.

def add(address): 
     <... implementation omitted ...> 
    print(f'adding address {address}')

The add() function now expects an address string that can be stored directly without traversing the object first. The function is not tied anymore to the interface object; it is the responsibility of the caller to send the appropriate value to the function.

class Device:
    def add_to_address_list(self):
        addressDb.add(self.interface.address)

Note

Python, which uses duck typing, did not require the exact object of type interface, but only an object that can conform with the add() function. In statically typed languages, the correct object type would be required.

If you fundamentally change the conditions of a function in a loosely coupled system, no more than one module should be affected.

The easier a module or function can call another module or function, the less tightly coupled it is, which is good for the flexibility and maintenance of your program code.

Cohesion

Cohesion is usually discussed together with loose coupling. It interprets classes, modules, and functions and defines if all of them aim for the same goal. The purpose of a class or module should be focused on one thing and not too broad in its actions. Modules that contain strongly related classes and functions can be considered to have strong or high cohesion.

The goal is to make cohesion as strong as possible. Aiming at strong cohesion, your code should become less complex, because the logically separated code blocks will have a clearly defined purpose. This should make it easier for developers to remember the functionality and intent of the code.

def save_and_notify(device, users):
    filepath = '/opt/var/'
    file = open(f'{filepath}', "w")
    file.write(device.show())
    file.close
    for user in users:
        sendEmail(user)

The save_and_notify() function is an example of low cohesion, because even the name suggests that the code in the function performs more than one action; it backs up the data and notifies the users.

Note

Do not rely on a function name to identify if it has high or low cohesion.

A function should focus on doing one thing well. When a function executes a single thing, it is considered a strong, functional cohesion as described by McConnell.

Here is an example of a code with lower cohesion:

def log(logdata): 
    file = open('/var/logs/app.log}', "w") 
    file.write(logdata) 
    file.close 
    logdata = [] 

    return logdata

In the log() function, the collected logs in the logdata variable are first being logged to disk, and then the same data is cleared in the next step. This is an example of communicational cohesion, in which there are multiple operations that need to be performed in a specific order, and those steps operate on the same data. Instead, you should separate the operations into their own functions, in which the first logs the data, and the second—ideally, somewhere close to the definition of the variable—clears the data for future usage.

Another example is logical cohesion, which happens when there are multiple operations in the same function and the specific operation is selected by passing a control flag in the arguments of a function.

def actions(device, users, action):
    if action == 'backup': 
        file = open(f'{filepath}', "w") 
        file.write(device.show()) 
        file.close 
    if action == 'notify': 
        for user in users: 
            sendEmail(user) 
    if action == 'wipe': 
        device = None 
        return device

Instead of relying on a flag inside a single function, it would be better to create three separate functions for these operations. If the task in a function would not be to implement the operations, but only to delegate commands based on a flag (event handler), then you would have stronger cohesion in your code.

Content Review Question

What is the optimal combination of coupling and cohesion?

暗夜星空

Friday, October 23, 2020

3.3 Modular Design Benefits

Maintaining Modularity

Note

Note

Loose Coupling

Note

Cohesion

Note

Content Review Question

No comments:

Post a Comment