theme | highlightTheme |
---|---|
night |
monokai |
Pillars of Observability
- Metric (coarse-grain)
- Log (🤷)
- Trace (fine-grain)
- https://www.oreilly.com/library/view/distributed-systems-observability/9781492033431/ch04.html
- https://medium.com/@copyconstruct/logs-and-metrics-6d34d3026e38
--
It's just emitting data to a tool with a variety of precision
The tool often limits the precision you can include--
Metrics are low-fidelity event aggregates
Tells you IF failure; but not WHY failure
--
Tracing is logging with opinion + tooling
You can DIY or SaaS:
- AWS X-Ray
- DataDog APM
- honeycomb.io
- context
- correlation
- level
You cannot predict future questions
— be generous
Logs impose no limits on size* / cardinality
... consider data sensitivity; probably don't dump blobs--
--
Task finished: CreateRoutes2: duration=3.014
- hard to parse
- slow to filter (using
like
operation) - ambiguous unit
--
example: ./examples.md
--
good logs are a consequence of good code
--
queue service > validator service > sender service
Q: who should log?
new problem; how to pass context + correlation?
--
controller > queue
controller > validator
controller > sender
controller > log
Q: who should log?
A: The controller
everything else performs a discrete function
--
// get next
work := queue.Pop()
log = logger.WithField("work_id", item.ID)
// validate
if reason := s.validator.IsValid(work.Body); reason != nil {
logger.WithField("reason", reason).Info("item invalid")
return
}
// send
if err := s.sender.Send(work.Body); !err != nil {
logger.WithError(err).Error("failed to send item")
return
}
// commit work
work.Delete()
// emit
logger.Info("done")
* assume error handling!
--
func Sender (i Item) error {
// serialize
body, err := json.Marshal(i)
if err != nil {
return errors.Wrap(err, "serialization failure")
}
// send
resp, err := http.Post(url, "application/json", bytes.NewBuffer(body))
if err != nil {
return errors.Wrap(err, "http failure")
}
defer res.Body.Close()
// check response
if response.StatusCode != http.StatusOK {
return fmt.Errorf("backend failure: %s", response.Status)
}
return nil
}
dont log AND throw (that's 2 things)
--
Code falls into 2 flavours:
- Controller
level=INFO
- Everything else
level=DEBUG
--
example: ./go/service/main.go
treat logs as an event emitting dependancy
... a cross-component concern
👉 Concensus 👈
☝
correlation_id / request_id / user_id / asset_id / ...
--
reaching consensus through tooling
package appcontext
type RequestContext struct { ... }
type ClientContext struct { ... }
type SystemContext struct {
Application string `json:"application,omitempty"`
Version string `json:"version,omitempty"`
Environment string `json:"environment,omitempty"`
}
func WithSystemContext(ctx context.Context, val SystemContext) context.Context {
return context.WithValue(ctx, key, val)
}
func GetSystemContext(ctx context.Context) (val SystemContext, ok bool) {
val, ok = ctx.Value(key).(SystemContext)
return
}
...
A broad category which is important to collectively agree on.
--
--
ERROR: client is not authorized
This belongs in the response to the client:
401 Unauthorized
(or maybe an "access log")
--
Uninteresting plumbing
INFO: executed 'SELECT * FROM foo'
INFO: parsed JSON
aka. i was prototyping and accidentally committed it
--
Predicting the future
INFO: about to handle request
aka. i don't trust my language to trap exceptions
--
--
The system cannot continue
FATAL: failed to connect to database
--
A transient problem during processing
ERROR: timeout while saving
--
Processing degraded but can continue
WARN: config unset; using default
opinion: use INFO
--
System did what you asked it to do
INFO: done
INFO: batch complete
INFO: cache refreshed
--
Low-level supporting steps.
Usually disabled due to poor signal-to-noise ratio.
Danger zone: Take care with sensitive data!
you'll get it wrong the first time; iterate