Logs can be tech debt - traces should replace? #2

michaelwilde · 2021-05-11T06:45:41Z

Curious what you think about logs. I think they have tech debt trapped in them, and I don't see a need for them in the world of distributed tracing. We should move to that world.

Logs rarely share a standard format (space-delimited, JSON, CSV, etc.), never mind consistent field names.
Improving them is often hard to justify.
Logs don't have the connective tissue that links events together in the way that traces do.
Few solutions let you sample high-noise logs and still represent reality when the data is used for reporting, search, or query.
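
To make the "connective tissue" and sampling points concrete, here is a minimal sketch in plain Python. It is not any tracing library's API: `Span`, `child`, `keep_trace`, and the operation names are all hypothetical, chosen only to show how a shared trace id plus a parent id links events, and how hashing the trace id lets you sample whole traces rather than isolated lines.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One unit of work. trace_id and parent_id are the links plain log lines lack."""
    name: str
    trace_id: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])

    def child(self, name: str) -> "Span":
        # A child inherits the trace id and records its parent's span id.
        return Span(name=name, trace_id=self.trace_id, parent_id=self.span_id)


def keep_trace(trace_id: str, rate: float) -> bool:
    """Deterministic sampling: the decision depends only on the trace id,
    so every span of a trace is kept or dropped together."""
    digest = hashlib.sha256(trace_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") < rate * 2**64


# Hypothetical request flow: three events, one trace.
root = Span(name="handle_request", trace_id=uuid.uuid4().hex)
auth = root.child("call_auth_service")
db = auth.child("query_database")

for span in (root, auth, db):
    if keep_trace(span.trace_id, rate=0.5):
        print(span.trace_id, span.span_id, span.parent_id, span.name)
```

Because the sampling decision is a function of the trace id alone, a sampled data set still contains complete request paths, which is what makes it usable for reporting.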

I've always thought CISOs should demand that their developers and their vendors emit traces from the code that runs. Example: you see a log entry of "david logged in at 3pm successfully", but you don't actually know all the function calls being made by your SSO provider, or by the app that is using a third-party authentication mechanism, etc.

Other than failing services or reputational impact, where's the motivation to go back and improve the logs, or to start getting more detail on how software is behaving so that use cases beyond debugging are better served?

cpswan · 2021-05-11T08:11:31Z

Maintainer

It's not a topic I've thought that deeply about, but here goes anyway...

Logs serve various purposes, and when it comes to diagnostics there's overlap with tracing. But like everything in IT there are trade-offs happening here. One of the main causes of tickets in most estates is disks filling up with logs, because the logs are being badly managed. The people managing those logs are terrified that tracing will just fill those disks up quicker. Of course these days we shouldn't be worrying about individual disks on individual servers when we have infinite object storage in the sky, but infinite storage can cost $∞.
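
As an illustration of what "managed" can look like on a single server, here is a small sketch using `RotatingFileHandler` from Python's standard library to cap how much disk a log can take. The logger name, the temporary directory, and the tiny size limits are assumptions picked for the demo, not recommendations.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Assumed demo location; a real service would use its own log directory.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

# Keep at most 1 KiB per file and 2 rotated backups: a hard ceiling on disk use.
handler = RotatingFileHandler(log_path, maxBytes=1024, backupCount=2)
logger = logging.getLogger("capped_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep demo output out of the root logger

for i in range(1000):
    logger.info("event %d happened", i)

handler.close()
files = sorted(os.listdir(log_dir))
total_bytes = sum(os.path.getsize(os.path.join(log_dir, f)) for f in files)
print(files, total_bytes)
```

The trade-off is visible in the result: the disk never fills, but only the most recent events survive, which is exactly the kind of loss that makes reporting from logs unreliable.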

Agreed that log formats have been a disaster since forever. Thoughtful people have invested countless hours coming up with sophisticated standard schemas for stuff, and idiots in a hurry have no comprehension of that work. That's almost the definition of a whole chunk of the tech debt that's out there, with more being thrown down as I type.

Having lived through the early days of Security Information and Event Management (SIEM) I've watched how logs can be abused, and how there's an entire cottage industry out there stitching a picture of what's happening in the world back together out of disparate logs from all over the place. It's a mess. Since we've now had open standards for messaging for some substantial time (MQTT, AMQP, NATS etc.) there are clearly better ways, and yet... we still see logs being used time and again as a poor person's messaging system.

As we watch Colonial Pipeline finally focus some minds in Washington on issues of resilience (almost 4 years after NotPetya nearly wiped out four $MultiBn companies in an afternoon) I'm once again hopeful that there's going to be a push for improvement; but if I learned anything from last time it's 'don't hold your breath'.
