What is Debugging?
Debugging is a broad topic in software engineering; whatever technology we use to build our software, we need to know how to debug it. Some debugging techniques are generic and can be applied anywhere, while others are specialized for specific technologies. In this article, we’ll talk about some of the debugging methods and approaches that we, as developers, should know in order to build awesome systems.
“Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program.” - P. Adragna, University of London
How to approach Debugging?
Debugging is not a single task; rather, it consists of a series of tasks. All bugs come from one basic proposition: something that was once right is no longer working as expected. Developers should follow a process that helps narrow down the search space and debug the issue. Without one, resolving bugs can seem like a hard nut to crack.
Sometimes we form a random hypothesis about the problem, change code without any solid idea of the cause, and run the program again and again to see whether the issue is resolved. Even if we manage to solve the issue once this way, it won’t work every time, and it is clearly not a good approach to solving bugs.
The debugging process can be divided into four main steps: localizing, classifying, understanding, and finally repairing the bug. Localizing means identifying where the bug is in the code. A particular service or function might be responsible for producing the bug.
Using a version control system is another good way to narrow a bug down to a specific commit; git-bisect is a useful tool for this. After that, we have to classify the bug: is it a syntactic error, a semantic error, a build error, or does it depend on some external values?
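The commit-narrowing idea behind git-bisect can be sketched as a binary search over the history. This is only an illustration of the principle, not how git implements it. The commit ids and the `is_bad` check are hypothetical, and the sketch assumes the first commit is good, the last one is bad, and every commit after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad(commit) is True.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the bug is already present here: look earlier
        else:
            lo = mid  # still good here: look later
    return commits[hi]


# Hypothetical history in which the bug was introduced in commit "d4".
history = ["a1", "b2", "c3", "d4", "e5", "f6"]
culprit = first_bad_commit(history, lambda c: history.index(c) >= 3)
print(culprit)
```

In practice, `is_bad` would check out the commit and run a test that reproduces the bug. Each check halves the remaining search space.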
Understanding the bug means forming a hypothesis about what is causing the issue and then trying to fix the issue based on that hypothesis. This is a cycle: the hypothesis might fail, in which case we go back to the previous step, and the cycle continues until the repair is successful.
Different debugging methods
When it comes to debugging, the tools and techniques we use are extremely important and can determine how easy it is to fix problems in our code. Let’s talk about a few popular techniques.
A breakpoint is an intentional stopping or pausing point at a specific line in a program, set up for debugging purposes. The debugger places a piece of code at that line; when execution reaches it, that code either calls the debugger or raises an interrupt at the CPU level, which is passed to the debugger’s exception handler.
Different debuggers implement breakpoints in their own ways. Many let us attach a condition to a breakpoint so that execution pauses only when the condition holds, and some let us modify a piece of data before resuming execution!
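As a small illustration of a conditional breakpoint, the sketch below uses Python’s built-in `breakpoint()`, which opens the pdb debugger by default. The function name, the order data, and the `suspicious_total` threshold are all hypothetical.

```python
def process_orders(orders, suspicious_total=1000):
    """Sum order totals, pausing in the debugger only for suspicious orders."""
    grand_total = 0
    for order in orders:
        # Conditional breakpoint: pause only when the condition holds.
        if order["total"] > suspicious_total:
            # Opens pdb by default; `order` can be inspected or modified
            # here before resuming with the `continue` command.
            breakpoint()
        grand_total += order["total"]
    return grand_total
```

Calling `process_orders([{"total": 5}, {"total": 2000}])` would pause once, on the second order. Most debuggers can set the same kind of condition from their own interface without editing the code.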
Print debugging is the most widely used technique, one that every developer uses at some point in their career. It basically means dumping values to the application console or user interface while the application is running.
We developers do this to understand the flow inside a function or a service, and sometimes to understand the shape of fetched or manipulated data. During debugging, we place print statements in the code where we suspect the bug originates; there is no convention for what to put in them.
Though this is not a very robust method of debugging, it is very useful when silly mistakes are involved, when the developer can’t follow the flow of the application, or when we have already identified the code block that might be responsible for the bug.
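A minimal sketch of print debugging might look like the following. The `average` function and the `[debug]` prefix are illustrative choices, not a convention; the messages go to standard error so they don’t mix with the program’s normal output.

```python
import sys


def average(values):
    """Return the mean of values, printing intermediate state to stderr."""
    print(f"[debug] average() got {len(values)} values: {values!r}", file=sys.stderr)
    total = 0
    for i, v in enumerate(values):
        total += v
        # Dump the loop state so the flow can be followed step by step.
        print(f"[debug] i={i} v={v} running total={total}", file=sys.stderr)
    result = total / len(values)
    print(f"[debug] result={result}", file=sys.stderr)
    return result
```

Once the bug is found, these statements are usually removed again, which is one reason the technique doesn’t scale well to large problems.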
Remote debugging is a popular technique nowadays. It involves debugging a process when the debugger and the application are not running on the same platform or server. Remote debugging is at times done in tandem with production debugging.
When we develop software for small devices with very limited resources, which cannot hold both the application and the debugger, remote debugging is the go-to method.
It is also useful when we debug large applications made up of multiple components that communicate with each other to serve client requests, e.g. microservices.
As we know, a debugger needs to be connected to both the OS and the application via basic hooks so that it can trigger the debugging process. For remote debugging, we install a remote-debugging component on the application server. This component is not itself a debugger; it just connects to the application running on the server.
It also connects to the remote debugger over the network and exchanges commands with it. We can use RookOut to do remote debugging easily and effectively, and there are other remote debugging tools worth looking into as well.
The figure above shows how remote debugging works: the remote machine on the left has the debugger attached to it, and the target machine on the right runs the process that we intend to debug. Both sides have a component called Network Routines, which is used to communicate between the two sides.
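To make the two sides more concrete, here is a toy sketch of a debugging component on the target and a debugger that sends it commands. The JSON line protocol, the `get`/`set` operations, and all names are assumptions made for illustration; real remote debuggers use their own protocols, and `socket.socketpair()` stands in for an actual network link.

```python
import json
import socket
import threading


def _recv_line(conn):
    """Read bytes until a newline arrives; return None if the peer closed."""
    data = b""
    while not data.endswith(b"\n"):
        chunk = conn.recv(4096)
        if not chunk:
            return None
        data += chunk
    return data


def debug_agent(conn, app_state):
    """Target side: answer inspection commands sent by the remote debugger."""
    with conn:
        while (line := _recv_line(conn)) is not None:
            command = json.loads(line)
            if command["op"] == "get":
                reply = {"value": app_state.get(command["name"])}
            elif command["op"] == "set":
                app_state[command["name"]] = command["value"]
                reply = {"ok": True}
            else:
                reply = {"error": "unknown op"}
            conn.sendall(json.dumps(reply).encode("utf-8") + b"\n")


def send_command(conn, command):
    """Debugger side: send one command and wait for the agent's reply."""
    conn.sendall(json.dumps(command).encode("utf-8") + b"\n")
    line = _recv_line(conn)
    if line is None:
        raise ConnectionError("agent closed the connection")
    return json.loads(line)


# Hypothetical application state living on the target machine.
state = {"retries": 3}
debugger_side, target_side = socket.socketpair()
agent = threading.Thread(target=debug_agent, args=(target_side, state), daemon=True)
agent.start()

print(send_command(debugger_side, {"op": "get", "name": "retries"}))
send_command(debugger_side, {"op": "set", "name": "retries", "value": 5})
debugger_side.close()  # the agent's loop ends when the debugger disconnects
agent.join(timeout=5)
```

The agent plays the role of the component installed on the target: it holds no debugging logic of its own and only relays reads and writes of application state on behalf of the debugger.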
When we have a service running in production, most of the time we use some sort of logging or stack-tracing framework that sends all debugging messages to one or more log files. Most logging services also let us set log levels (e.g. info, debug, warning, error), which make the logs much easier to manage and read through.
Whenever we face a bug, we can immediately look into the log files or the logger service to reproduce the bug or identify its root cause. But we have to make sure that we write sufficiently detailed logs to our logger services, some of which may require OS-level permissions.
For example, when a web application logs all requests made by clients and at some point crashes, we can see the last request that came from a client, which likely caused the crash. We can replicate the crash by sending the same request to the server during debugging. However, log files are not a standalone debugging method; they work best when combined with other tools and techniques.
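The request-logging example above could look roughly like this with Python’s standard `logging` module. The logger name `webapp`, the `/crash` path, and the simulated failure are hypothetical; in production the `StreamHandler` would typically be swapped for a `logging.FileHandler`.

```python
import logging


def build_logger(stream=None, level=logging.DEBUG):
    """Create a logger that writes timestamped, levelled messages to stream."""
    logger = logging.getLogger("webapp")  # hypothetical logger name
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)  # stderr when stream is None
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def handle_request(logger, method, path):
    """Log every request; log the full traceback if handling it fails."""
    logger.info("request %s %s", method, path)
    try:
        if path == "/crash":  # stand-in for a real bug
            raise ValueError("simulated failure")
        return 200
    except ValueError:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("request %s %s failed", method, path)
        return 500
```

After a failure, the last `INFO` line before the `ERROR` entry shows which request to replay against the server while debugging.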
A memory dump, often referred to as a core dump, is a snapshot of a program’s memory, and memory-dump debugging is the method of analyzing that snapshot. When a program crashes, the system creates a snapshot of the memory and the data it holds at that specific moment; this is the memory dump. It is useful because developers can find exactly where the process crashed, see a stack trace of the functions that were active at that point, and inspect the values of the accessible local and global variables.
An important fact here is that we can also take a memory dump without a crash, which helps with the kinds of bugs where the process never crashes. So far we have talked about the memory dump of a single process, but we can also take a memory dump of the whole system, where multiple processes from different applications are running.
This is helpful when more than one application is communicating with each other and we can’t identify exactly which one is responsible for the bug. A kernel crash, however, is different from a process or system crash, and it is sometimes difficult to get a memory dump from one because the memory is freed before the dump is taken.
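A real memory dump is produced by the operating system or a dedicated tool, but the idea of capturing a stack trace plus variable values without a crash can be sketched inside a running Python process. The helper below is an assumption-laden stand-in, not a core dump: it only walks the caller’s Python frames, and `snapshot_stack` is a hypothetical name.

```python
import sys


def _safe_repr(value):
    """repr() that never raises, so one odd object can't abort the snapshot."""
    try:
        return repr(value)
    except Exception:
        return "<unrepresentable>"


def snapshot_stack(max_frames=20):
    """Capture function names, line numbers, and locals of the calling stack.

    The process keeps running afterwards; nothing has to crash first.
    """
    frames = []
    frame = sys._getframe(1)  # start at the caller, skipping this helper
    while frame is not None and len(frames) < max_frames:
        frames.append({
            "function": frame.f_code.co_name,
            "line": frame.f_lineno,
            "locals": {name: _safe_repr(value)
                       for name, value in frame.f_locals.items()},
        })
        frame = frame.f_back  # move outward to the calling frame
    return frames
```

Calling `snapshot_stack()` from deep inside a misbehaving function returns the innermost frame first, much like reading a stack trace from a dump. The result could be serialized to a file for later inspection.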
Replay debugging is another useful technique. The idea comes from reverse engineering: we have a well-defined record of the original process’s flow, and from there we can trace back to the issue. The first step is to record the process during its execution, capturing event invocations, memory management, data changes on disk, user inputs, and the outputs of other connected devices.
The recording is generated by executing the process and monitoring the events it triggers. When the process ends (whether it finishes successfully, is interrupted by the system, or crashes), the recording contains all the information relevant to that execution.
Developers or Quality Assurance engineers can use this recording to walk through the execution path of the process, or connect it to a debugger, and try to identify what went wrong. This comes in handy when we try to reproduce a bug because it gives us a holistic view of the whole process.
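The record-then-replay idea can be sketched at a very small scale by recording only the nondeterministic inputs of a process and feeding them back later. Real replay debuggers record at a much lower level than this; the `Recorder`, `Replayer`, and `run_process` names, as well as the process itself, are hypothetical.

```python
import json
import random


class Recorder:
    """Wraps a source of nondeterministic input and records every value."""

    def __init__(self, source):
        self._source = source
        self.events = []

    def next_input(self):
        value = self._source()
        self.events.append(value)  # keep the value for later replay
        return value


class Replayer:
    """Feeds back previously recorded values in their original order."""

    def __init__(self, events):
        self._events = iter(events)

    def next_input(self):
        return next(self._events)


def run_process(inputs, steps=5):
    """Hypothetical process under test; its result depends on its inputs."""
    balance = 100
    history = []
    for _ in range(steps):
        balance -= inputs.next_input()
        history.append(balance)
    return history


# First run: record the random inputs while the process executes.
recorder = Recorder(lambda: random.randint(1, 50))
original = run_process(recorder)
recording = json.dumps(recorder.events)  # could be saved to a file

# Later run: replay the recording to reproduce the exact same execution.
replayed = run_process(Replayer(json.loads(recording)))
assert replayed == original
```

Because the replayed run is deterministic, it can be repeated under a debugger as many times as needed, which is what makes a hard-to-reproduce bug tractable.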
Good debugging skills are necessary for every developer, but they cannot be acquired overnight. They take experience, because there is no bulletproof technique or tool that suffices for every bug. So, experimenting, investigating the causes of a bug in depth, and using the techniques above will definitely help us dig into a bug and finally solve it.