The Problem With Logging

Another aspect of logging levels are how/where it is configured and how easy it is to change. If your configuration management process means it takes 2 weeks of paperwork and the CIO to sign in blood to make a change from Error level to Info level to get some more info about a problem then those other levels may as well not exist. Being able to dynamically change these levels at run-time without recompilation or application restarts are key.

The code footprint of logging can be greatly reduced by using AOP techniques. It’s pretty much the canonical example of aspect oriented programming. A simple method boundary aspect (straight from the postsharp site) looks something like:

public class SimplestTraceAttribute : OnMethodBoundaryAspect
public override void OnEntry( MethodExecutionEventArgs eventArgs)
Trace.TraceInformation(Entering {0}., eventArgs.Method);
public override void OnExit( MethodExecutionEventArgs eventArgs)
Trace.TraceInformation(Leaving {0}., eventArgs.Method);

In postsharp these aspects can be projected onto your types conditionaly at compilation time using either method/class level attributes or assembly level multicast.

[assembly:SimplestTraceAttribute (AttributeTargetTypes=TypesThatIWantToLog;AndSomeOtherTypes)]

Having these as aspects also allows you to produce versions on the fly with higher levels of instrumentation.

There are also dynamic proxy based techniques that work well in combination with dependency injection to achieve the same result.

I agree with your overall point though,knowing what to log is a fine art; one that I am more than a little amateurish at myself.

It all depends on why you have logs in the first place. Is it for debugging, troubleshooting occasional errors / exceptions, helping to analyze the program flow, give extra technical information to your customers, etc …

In general the more complex the system gets, the more you will need logs to help see what happens and where, but that mainly interests developers.

Users would be more interested in why their instance is not getting or sending emails properly, in general communication with other systems that are configured by the user.

On my project we basically log anytime something important happens, how’s that for arbitrary and non-descript =)

We log everything deep down in the datalayer, and all unhandled get logged and use is redirected to an error page.

I think just like anything, moderation works best.

Also, filesystem based logging is not a good idea in systems you need to scale heavily, asynchronous stuff like UDP or message queuing are a better bet. Look ma, no locks.

First of all, there are two activities: logging and tracing.

Logging is for end user, it should be written in a language that the user understands and it should be informative for the user.

Tracing is for developers and one can put any internal debug informating in the file. I agree that too much information here can render tracing useless.

As for perfomance issues, the best solution is to delegate actual logging to an external application or service (syslog on Unixes is a good example).

Professionally, I develop technical software, i.e. software that controls a piece of hardware, which gets shipped to a customer. In our case, when a customer submits a problem report, logging is the only mechanism we have of getting some amount of technical information to tell us what the user was doing, and what went wrong.

Nice post

I totally agree with your opinion, for myself I don’t use log4net, because I found it to complicated. I wrote my own logging web service that I have implemented in all my webapps. It can log 2 types of logs, exception and action logs. After every request to the backoffice I do a total action log (contains input, ouput and authentication) or an exception log contains (input, exception and authentication) I also write the loggedby identifier and the duration time.

So basically I always do a total log write, but beside that I wrote some SQL jobs that handles the log entries daily, weekly, monthly and yearly for statistics or cleanup, based on the logged by identifier.

So the log will be (not) inserted into the statics based on the identifier and every log is deleted periodically.

For my opinion I think this is the best approach, log as much as possible, but do an automatic cleanup periodically.

Sounds like a mess there. Syslogd can log messages from hundreds of running processes running hundreds of threads on a busy busy server and not spend more than a running glance at the overhead it creates. Why oh why can’t all logging mechanisms be that efficient?

Can you elaborate on your example problem? I too am struggling to understand what causes a thread to acquire a log lock and then do anything that causes it to wait on another lock before releasing the log lock. Is it writing the log into a database?

God, how many times I was saved by logs. Then again, I’m a trouble shooter. My speciality is discovering what is wrong, whether it’s a design issue or a bug, and determining what the parameters for fixing it are. Not that anyway pays me to do that specifically, of course – they just ask me to do it beyond my normal day to day tasks.

There is a certain method to my madness. At the DEBUG level I want to know the I/O. A surprising amount of bugs result from one not sending what he thought he was sending, or receiving what he thought he was receiving. At the INFO level – if I used such a distinction from DEBUG – I put flow control. Every method being called, and every flow control fork being taken (including loops, of course). I think ERROR and FATAL are self-evidend, and yes, they should be communicated through the interface too – if the program has an interface at all.

Now, the thing Jeff seems to be forgetting is that this error logging framework is intended for debugging problems when you have access to two things: 1) the problem report and; 2) the error log.

When you are tending your own web site this does not apply, and if you happen to use open source and be willing to discover why Apache is behaving erraticaly, then you might not ever be on the user side either. But when you are tending a commercial application and CommercialApplication ™ keeps acting in unexpected ways, and, specially, when the support is not allowed access to your installation, then The Log is all there is, really.

I log only when debugging. I prefer it to a debugger, in fact.

If my loops are running fine, why the hell would I want my code outputting every time a loop iteration begins?

My lord.

I think you’re exaggerating. I use logging a lot, and it’s never caused me any deadlocks.

I do agree with points 1 and 2. However,
if the information you’re logging is at all valuable, it deserves to be surfaced in the application itself,
Of course not! If you use logs for debugging, it’s no reason to build a special page just for showing them. And even if you do, this doesn’t make them non-logs.

I don’t care how awesome your log parsing tools are
Ctrl-F in my editor works just fine

Good luck getting everyone on your team to agree on the exact definitions of FATAL, ERROR, DEBUG, INFO
Easy! An exception is ERROR, everything else is DEBUG. I’m not a big supporter of log levels myself, but that’s not an argument against logging.

My app keeps a length-limited in-memory log of recent events, at quite a high level of detail. It attaches the contents of that log to crash dumps. The extra context that the log provides over and above the instantaneous information provided by the crash dump is really useful - well worth a few extra lines of code and a few extra microseconds.

Why is your logging system acquiring a lock? What about simple logging, like to a file, a socket or a fifo/pipe, needs to be locked? Are you not opening the files with O_APPEND (or whatever the windows equivalent is)? Are entire log entries not being written with a single write call?

Yes, there is a balance. I’m trying to maintain some code, and it feels like 50% of the source lines are involved with writing to the log. This makes the code unreadable.

There’s a bug somewhere, and there’s so much gunf in the log already, that I feel the need to create a separate log to write stuff I actually want to know to help diagnose this problem. I’ll take my own logging system out again when the problem’s resolved.

I can depend on the type of app. In your case as a website the desire to log is low. However I work in transport. We have literally tens of thousands of machines world wide. If a bug report comes in we have very little chance of getting a good bug report from the original driver.

A good log is the difference between having a chance of identifying the issue and having no hope in hell.
And yes we should catch as many of these issues at development and we try to but one can never be sure if the hardware that works well in Europe will work in the same way in say: Kuwait. So logging and even a level of comprehensive logging can be v.useful to some.

Agree, but I think something is missing here: why logging.
In most cases, logging is made with debugging purposes.
If a log suits that goal, then it is OK. Otherwise, it is an excessive logging.

Of course, this is not something easy at all to determine - no one knows of bugs in advance :slight_smile:

Perhaps your goal was more profiling. There are a lot of reasons to log and it’s part of a complete application. IBM and Tandem does a fabulous job, and I really miss it over Linux. Some of the reasons to log:

Compliance and Audit
System troubleshooting
Proactive security
Application errors
Application information

Here’s a nice article:

I’m with andy on this one, why are you keeping the file open for extended periods of time? and why use a database connection to log things? Hey, why not use a deamon (or service in the windows world) to handle the logging, that way you will just have to pass the error to it and it handles the rest. This would give you a minimal load on your application too. Hey, you could even push it to it’s own box. How is that for scalability!

I’m for logging as much as possible. And the only problem I see with logging is I/O and hard drive space.