Understanding logging and monitoring

Turning to a more grounded topic: one of the driving principles of DevOps is logging and monitoring instances, endpoints, services, and whatever else you can track and trace. This is necessary because regardless of what you do, how clean your code is, or how good your server configuration is, something will fail, go wrong, or just inexplicably stop working altogether. This will happen. It’s a fact of life. It is, in fact, Murphy’s law:

Anything that can go wrong will go wrong at the worst possible time.

Familiarizing yourself with this truth is important for a DevOps engineer. Once you have acknowledged it, you can deal with it. This is where logging and monitoring come in: when something does go wrong, you need the appropriate data to respond to that event, and sometimes that response can be automated.

The rest of this section has been laid out in terms of logging, monitoring, and alerts. Each one of these aspects plays an important role in keeping the DevOps train (workload) on the right track.

Logging

If you are not from a technical background or are new to logging principles, think of logging in this way:

Every day after school, a schoolboy would go to an old woman selling matches and give her money for one matchbox. However, he’d take no matchboxes in return. Then one day, as the boy went about his usual routine, he saw the woman about to speak up and he said, “I know you’re probably wondering why I give you money for the matchbox but don’t take one in return. Would you like me to tell you?” The woman replied, “No, I just wanted to tell you that the price of matches has gone up.”

In this case, the woman is the logger, and the boy is the person viewing the log. The woman doesn’t care about the reason. She’s just collecting the data, and when the data changes, she collects the changed data. The boy checks in every day and goes about his routine uninterrupted until something changes in the log. Once the log changes, the boy decides whether to react or not depending on what he would consider to be an appropriate response.
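To make the analogy concrete, here is a minimal sketch using Python’s standard logging module. The logger name "matchbox", the price values, and the find_changes helper are all made up for illustration; they are assumptions, not anything prescribed by the story or by a particular tool. The logger simply records each day’s price, and a separate reader scans the log and notices only when the recorded value changes.

```python
import io
import logging


def build_logger(stream):
    """The 'woman': records whatever price she is given, no questions asked."""
    logger = logging.getLogger("matchbox")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers if this is re-run
    logger.propagate = False  # keep output in our stream only
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s price=%(message)s"))
    logger.addHandler(handler)
    return logger


def find_changes(log_lines):
    """The 'boy': reads the log and reports only where the value changed."""
    changes = []
    previous = None
    for index, line in enumerate(log_lines):
        value = line.split("price=", 1)[1]
        if previous is not None and value != previous:
            changes.append((index, previous, value))
        previous = value
    return changes


# Write the log to an in-memory stream so the sketch is self-contained.
stream = io.StringIO()
logger = build_logger(stream)
for price in [5, 5, 5, 6]:  # made-up daily prices
    logger.info("%s", price)

log_lines = stream.getvalue().splitlines()
changes = find_changes(log_lines)
for index, old, new in changes:
    print(f"Entry {index}: price changed from {old} to {new}")
```

Note that the logger never decides what a change means. Deciding whether a price increase deserves a reaction is left entirely to whoever (or whatever) reads the log.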

In subsequent chapters, you’ll learn about logs, how to analyze them (usually with Python), and how to respond to them appropriately. For now, all you need to know is that good bookkeeping, which is what logging amounts to, has built empires, because history and the lessons we learn from it are important. They give us the perspective we need to respond to future events.