05
Feb 21

Observability

TL;DR:
Our industry doesn’t have the right definition of Observability and it’s going to cost enormous amounts of money and cause continued outages until that changes. ((Normally in this blog I don’t talk about things close to work much, because I’m a software engineer at a large company and engineers in that position leave work at work most of the time lest we be perceived as speaking for our company by people who don’t yet understand that companies do not usually let engineers handle public relations for rather good reasons. This post is an exception because it’s a very nonspecific thing that impacts our entire industry, not just one company. It’s my thoughts on this, not my company’s (they don’t pay me enough to work in PR).
For context, I’ve been working in monitoring for a decade or so and relying on it for about three decades. My original background was in both electronics and computers, so it focussed more on control theory and statistics and the device driver level of programming than it did on writing word processor applications and the like. Because of that I came to the monitoring field with a formal background in what monitoring is actually for and that influences my opinions.))

I’ve been getting increasingly frustrated with something for the last year or four. Mostly it gets dismissed because it’s difficult to monetise even though it’s not difficult to actually understand or implement. Engineers are so close to it that they rarely think about it and vendors think it’s not readily profitable which is a situation that lends itself well to being deliberately ignored. Specifically, I’m talking about the large gap between what our industry calls Observability and what Observability actually is.

And that gap is not some academic triviality – it means that there is a massive gaping hole in our tooling which directly results in lost revenue and a lot of grief and unnecessary work for engineers.

Observability tooling is simply not fit for purpose right now, not from any vendor. It’s not enough to have nice UIs and to be able to click from a metric to a log to a trace. Those are “nice to haves”. What we actually need to have is support in our tooling to let engineers readily codify their own mental models of how their service works so everyone else can see what’s going on without needing all the context and experience that the original engineers had to have to build up those mental models.

Detection of the difficult, nuanced nasties in our systems, and correctly answering hard questions like “how badly will this AWS API degradation affect our app’s latency” or “how much is it going to cost to run this service in two years at this growth rate” cause headaches in almost every company at every scale above “we have just one engineer and she knows how everything works”. Answering those questions correctly while engineers’ mental models are still stuck in their heads is often difficult or even impossible, because the answers are formed by combining several people’s knowledge and it won’t all fit in one head. We need to be able to codify models because that lets other people use them with their own data to answer these questions, or even automate answering them.

But right now we can’t do this easily, because in the monitoring industry over the last few years, the term “Observability” has been used exclusively as a marketing term; at first by a few vendors, and then more and more, but they have all used it to mean, in essence, “our product”. None have used the actual definition of the term or anything close to it, and so nobody has been working on the missing tools we need to achieve it.

The difference between what the term actually means and what we’ve been using it for is why just pushing to adopt what any particular vendor describes as observability tools isn’t a solution to any monitoring team’s problems. Observability in reality is a very fundamental concept and like most fundamentals, it can be stated in a very straightforward and easy to understand way. In five steps, it goes like this:

1. We have a system. This system can be anything – whether that is your company’s application as a whole, a particular service, a particular host or anything else does not matter.

2. We have a model of this system which has some number of internal states. Whether that’s two states (“Working” and “Broken”) or two million does not matter.

3. We have some outputs from the system. Whether these are in the form of metrics, logs, traces or something else does not matter.

4. If we can tell what internal state the system is in just by watching the outputs, then that system is Observable. If there are states the system can be in, but we cannot tell that we are in them from watching the outputs, then that system is only partially observable.

5. The system’s Observability is a measure of how many states we can tell we are in compared to how many states there are in our model.
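To make those five steps concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the state names, the output values and the `observability` function are all made up, and it uses the simplified "ratio of states" view from this post rather than the control theory definition. The model maps each internal state to the outputs we would expect to see in that state; a state counts as identifiable only if no other state produces the same outputs.

```python
from collections import defaultdict

# Hypothetical model of a service: each internal state maps to the
# outputs we would expect to see while the system is in that state.
MODEL = {
    "working": ("http_200", "queue_short"),
    "overloaded": ("http_200", "queue_long"),
    "db_down": ("http_500", "queue_short"),
    "cache_down": ("http_500", "queue_short"),
}


def observability(model):
    """Return (ratio, identifiable_states) for a state -> outputs model.

    A state is identifiable if its outputs are not shared with any
    other state in the model.
    """
    states_by_outputs = defaultdict(list)
    for state, outputs in model.items():
        states_by_outputs[outputs].append(state)

    identifiable = sorted(
        states[0] for states in states_by_outputs.values() if len(states) == 1
    )
    return len(identifiable) / len(model), identifiable


ratio, identifiable = observability(MODEL)
print(ratio, identifiable)  # 0.5 ['overloaded', 'working']
```

In this toy model "db_down" and "cache_down" produce identical outputs, so the system is only partially observable. Adding an output that differs between the two (say, a database connection error count) would bring the ratio up to 1.0 without changing anything else about the monitoring.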

Yes, I know that the Wikipedia definition is more involved. And if you’ve studied state space modelling and done a few courses on control theory, the Wikipedia definition is a better one. If you’re also familiar with the history of formal methods in computer science, you’ll know why this doesn’t really matter; and if not, the short version is that we as an industry can’t really create formal models for our systems because we don’t understand them well enough and we don’t teach the math for that to enough engineers, so a simple view of Observability is all we can use right now. But it is sufficient to do real work with.

Mostly the industry has looked at the original definition of Observability – when it’s looked at it at all – from the point of view of outputs, and making outputs more readable and making it easier to create new outputs. That’s where we got the deeply unhelpful “three pillars of observability” meme from. But for us as engineers and for companies in general the far more important part today is the other critical component the definition talks about — it requires that we have a model of the system. Monitoring is necessary but not sufficient to give us Observability.

Right now, we can pick any vendor you choose, build the ultimate data platform, ship all the metrics and logs and traces and any other data we can think of so that we can view it at any resolution, at any frequency of updates, in any combination, using any graphical display and we still won’t know what is going on without that model. We could have built the world’s best monitoring system and we wouldn’t be able to tell what’s going on.

Right now, that model only exists in the heads of us as engineers (especially those of us who pull shifts on-call). We can read some specific metrics or logs and tell you that a particular service is broken and how badly and why and what has to be done to fix it. And if one of us walks in front of a bus tomorrow (or, y’know, catches Covid-19), or gets poached by another company, or retires; well that model is now lost for as long as it takes another engineer to relearn it, if they can. It also means that onboarding of new hires takes longer than it has to, and that our systems are more opaque to the rest of the company.

Don’t believe me? Sit a C-level executive or a new hire down in front of a raw dashboard and ask them what the system is doing. They won’t have the context, so they won’t know. If they don’t give up immediately and think it’s a prank, they’ll probably look at the dashboard and in front of your eyes, try to build a model from scratch using any metadata they can find like the titles of graphs or the units on axes or using past data in the graph to see if the last hour looks different from the last week. You’ve probably seen new hires doing this during their onboarding if you were supervising, training or mentoring them.

This is the largest weakness in the monitoring industry’s market offerings at present. The industry is wholly focussed on extracting outputs in various forms from the system and presenting them to engineers in various ways, or giving them toolkits to do so themselves. Competition has centered on how many data points a vendor can ingest, on their pricing models, on comparative evaluations of their UIs and several other things of little fundamental merit. Very little attention has been given to developing tools that allow us to extract the model from the engineers’ heads and codify it so that people without their institutional knowledge can understand what is going on.

Yes. We’ve all heard of the wonderful sunlit uplands described with marketing terms like MLOps and AIOps, and there’s a wealth of things that could be said about this citing a history going back through Machine Learning, AI, Big Data, Heuristics, AI again, Expert Systems and even further back depending on how long you’ve been in this industry, but suffice it to say that these systems cannot do this job: they explicitly have no deeper context and just try to find patterns in data without understanding the data. When the lead of Google’s ML-for-SRE program says so openly in conference talks, maybe the time has arrived for the industry to accept that throwing data at a black box and hoping for magic to happen just isn’t a workable plan here.

Worse yet, the current trend is to talk about how Observability solves the “unknown unknowns” problem, ie. recovery when the system is in an unknown state. This is a category error. Observability is basically the ratio of the number of states you can tell you’re in to the number of states in your model; if you’re in an unknown state then your model was incomplete and how can you now use observability to fix your model? The idea is gibberish.

The reality is that if you are in an unknown state, you don’t know what’s going on, by literal definition. You need to build a new model or extend an old one to cope. This puts you in the category of fault diagnosis, not monitoring. You will be actively opening up the black box of the system and poking around to figure out what happened and that puts you outside the entire mathematical framework of observability. In that scenario, Observability is the end goal, not the means to get there. And you will need different tooling and there’s precious little going on in that space either.

The obvious question at this point is, “So what? Why does this matter? How does it cost us money to not do this?”. Okay, let’s look at the money.

Monitoring SaaS providers have various different billing models but to one extent or another, they all boil down to this: you pay according to the number of data points you produce. Produce more points, pay more money. This is less pronounced for on-prem solutions but their resource requirements mean that the same effect is there, just less directly measured and harder to plan for. This creates a tension between Engineering teams wanting to ship all the metrics they can at a very high frequency; and Finance teams wanting to reduce the final bill that creates. And one of the oldest tropes here is when Finance asks “do you need all these metrics” and Engineering says “I don’t know what I need to know until I need to know it”. This Sir Humphrey-esque response is caused by not having an adequate model of the system. Teams who have such a model, even informally in their heads, know what they need to see as outputs, and how often. It lets them produce data points showing what they need to know and not bother with what they don’t care about. It reduces their costs. The more they codify these models, the lower the final cost.
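As a rough worked example of that billing arithmetic, here is a small Python sketch. The series counts, intervals and the per-million-points price are hypothetical numbers picked only to show the shape of the calculation; they are not any vendor's actual pricing, and real billing models have more moving parts.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def monthly_points(series_count, interval_seconds):
    """Data points produced per month by series reporting at a fixed interval."""
    return series_count * (SECONDS_PER_MONTH // interval_seconds)


def monthly_cost(series_count, interval_seconds, price_per_million_points):
    """Bill for a month under a simple pay-per-data-point model."""
    points = monthly_points(series_count, interval_seconds)
    return points / 1_000_000 * price_per_million_points


# "Ship everything": 10,000 series every 10 seconds (hypothetical).
print(monthly_points(10_000, 10))  # 2592000000

# "Ship what the model says we need": 2,000 series every 60 seconds.
print(monthly_points(2_000, 60))  # 86400000
```

With these made-up figures the modelled setup produces a thirtieth of the data points, and under a flat per-point price the bill shrinks by the same factor. The point is not the specific numbers, just that knowing which outputs you need and how often feeds directly into the cost.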

When something does go wrong, having a model of the system that isn’t in an engineer’s head makes automated fault detection possible, and opens the door to automated or assisted remediation (this leans into the next buzzword the monitoring industry is going to co-opt, Controllability). All of this reduces time to restore service, which lets companies both meet their SLAs and offer tighter ones, reduces penalty payouts, and makes sales easier as your reputation for reliability increases.

Beyond the immediacy of operational requirements, when you have a model of a system, you can make better predictions about what it will do in the long term. This means predicting the costs of meeting teams’ and customers’ needs. It means the ability to invest when it’s needed and not until needed, and it lets you understand the right amount to invest.

In other words, if we as an industry want to improve reliability, reduce opex, optimise capex and know what is actually happening with our systems as it happens, then our true goal is to build an observable system and that requires two things – monitoring and modelling.

We have monitoring today and we will continue to improve it. We do not have modelling, and no vendor seems interested in providing it right now. Instead we find ourselves witnessing a rerun of the Intel-v-AMD frequency wars in a new space as competing vendors claim they can ingest more data points per second than their competitors (and as happened in the frequency wars, are looking to pivot to a new metric like cardinality as they hit fundamental limits), as if that was what was needed. Or we see them make up a definition of what Observability is and try to leverage that, as if that could help.

As an engineer at the coal face, I’d be happier if they would just build the tools we desperately need instead.


14
Jan 21

New toys!

I often get gift vouchers for woodworking shops for the solstice holidays and birthdays and so on, and with Covid this year there’s been no real chance to go pottering round woodworking shops in person so those all built up; over the holidays I finally cashed them in and over the last week or two the delivery guys have been dropping off new toys at the house.

First up, no more burnt fingers when sanding bowls…

I finally found a BS10 Charnwood bowl sander kit after a lot of searching – it seemed everyone in Ireland and the UK and Europe was out of stock of this and none was expected till February, but Raitt’s up in Donegal had one or two left and even though they were technically closed for the holidays, I guess they were having the same kind of holidays I was having and they had it in the post before the new year started (I will definitely be going back there). I’ve had a small play with this since I got it and it’s such a huge step forward. I’m actually looking forward to the next bowl sanding now which seems a bit off 😀

Next up, I’ve been watching Colwin Way’s videos that he was doing for Axminster all last year during lockdown from his home workshop:

They were pretty good tutorials and one of the things he kept using is in that thumbnail – it’s a small platform that mounts in the banjo of the lathe along with a velcro-attached sandpaper disk on a plywood disk which has a metal faceplate ring on the back that his chuck grabs onto, so the lathe becomes a disk sander. It’s not a new idea by any means, but the way he was doing it was very straightforward and one tutorial covered how to build it. I have an Axminster lathe myself and a second faceplate ring for my chuck, so the other parts (they’re not hugely expensive for my lathe, came to about 25 quid in total) went into the order from the Carpentry Store.

It’s a nice little bit of functionality to add for doing things like small toys and such on the lathe, as well as for ordinary rectilinear woodworking stuff. Now I just need to find storage space for it 😀

Next, I finally got myself a ring live center for the tailstock. Splitting stock with a 60 degree cone center is hopefully a thing of the past now (and since that caused some mushrooms to explode on me in the beginning, I will not miss it…).

That live center is the multihead one from Axminster so I can remove the ring part and replace it with one of a few others, and I’ve seen some examples on youtube of people turning their own insets for it for things like turning spheres and the like, so it looks to have some potential.

I also got a few carving bits from Saburrtooth for the dremel. I was watching Rebecca DeGroot making some experimental bits and pieces (seriously, you need to see her stuff on instagram, it’s phenomenal) and some of the techniques she was using looked like fun.

I only had a few minutes playing time with these so far, but there’s potential there. I also have had this idea in my head for the last six months for a bowl that I want to make that I’ll need these for (though that technique came more from Stewart Furini than Rebecca DeGroot). But that bowl isn’t even started yet, I’ve only just finished getting some resin into the blank to stabilise two annoying cracks…

I might have chosen the wrong pigments for those but I think I can work with it. If not, I’ll make another smaller one. Or something. I’m still working out how to get the thing I can see in my head onto paper, let alone into wood.

Moving on, one of the things Colwin Way was covering was pen turning and I kind of always thought I’d give that a try, it seems to be mandatory if you have a minilathe, so I got a beginner’s pen turning kit. I expect the first few to be the pen equivalent to a funnel, but hey, it’s all good fun, right? 😀

It’s a rather nice mandrel that one apparently, which is blind luck, it happened to be the one the Carpentry Store stocked that fitted my lathe. Anyway, I’ll give it a go and see how it works out.

There was a bit of restocking as well. I’ve used up all the black, yellow, green and red from my sample pack of Chestnut stains on various holiday decorations and things so I ordered bottles of those to replenish (I figure I’ll keep doing that as I use the little bottles up so that I only have to buy the stuff I’m actually using), as well as some more acrylic sealer because I was out. And I got one of the little timber kits for Calum to build and paint, which should be fun:

There was a light pull drive as well because I want a friction drive to try to get Calum turning little things on the lathe, and I don’t want him using a chuck or a pronged drive center just yet – if he sticks a tool into something with a friction drive center, all that happens is the work stops spinning, rather than there being enough of a connection between spindle and work that it can throw the tool around on him. That hasn’t arrived yet, but should be fun.

And lastly… well, look, I got two nice woodturning tools last year, a Crown PM 10mm bowl gouge and an Ashley Iles 1″ skew chisel. They’re lovely tools but I’ve been afraid to go near them because my sharpening setup was… well, it wasn’t getting it done. I was grinding into the jig itself for one gouge, for another I couldn’t get the jig to clamp it at all, the axis line for the inside and outside curves on the gouges weren’t aligned anymore so I had asymmetric tools, it was a mess.

That last one is an old photo from April – I did change out the carborundum wheels there for aluminium oxide ones a few months back, but the jigs themselves are cheap knockoffs of the wolverine guides and they just were driving me nuts and they were wearing in places and I couldn’t see how to get better performance out of them, so I splurged and bought the BGM-400 kit, took the grinder platform apart and rebuilt it all…

So now I have the Tormek mounting bar, and the gouge sharpening jig, the general purpose sharpening jig that does roughing gouges and parting tools and so on, and the skew chisel sharpening bit. And it’s very very neat indeed. I’m hoping to be able to fix the bad sharpening on all of my gouges, stop shortening the tools quite so fast by keeping a consistent grinding angle rather than constantly having to grind off another bit of steel to correct a bad grind and so on. I know, it’s not a slow speed grinder and it’s not got CBN wheels, but those will have to wait for another day and with a consistent jig setup for sharpening I might be able to start using that lovely Ashley Iles skew soon…

There was one more thing, and I know this is all excessive but it’s birthday gifts and solstice gifts and some savings that would have been spent over five or six months all happening at once before Brexit really bites and we can’t get this stuff or the price doubles in the next few weeks. That’s more likely than not I think – we already are having increased problems getting stuff from the UK or Europe, DPD won’t ship to Ireland at the moment at all because Brexit has mangled both exports from the UK to the EU and shipping from EU to Ireland (we are still in the EU in case anyone was wondering 😀 ) via the land bridge over the UK, which has caused a few things to go astray (and I don’t even want to think about what’s happening to hardwood prices at the moment).

Plus as I said about the bowl sander, a lot of stuff is out of stock everywhere with no really clear restocking date; and we’ve already seen one major woodturning shop in the UK (the Toolpost) close its doors with Covid and Brexit being cited as the cause. Hopefully more don’t follow, but 2020 was not a year that gave us all a lot of cause for optimism, and so far 2021 has started off not getting out of the first week before the US has an actual coup attempt and the news cycle in Ireland has been something of a nightmare as well; we’re now 20% worse than the US for Covid infection rates per capita – literally no other country in the world has had positivity test results this bad or this fast a descent into disaster – and on top of that we just had the Mother and Baby homes report published which simultaneously informed us of nine thousand dead infants in the small percentage of religious institutions investigated and also tried to absolve from blame the religious orders running those institutions and the state that signed contracts with them. Maybe this just isn’t optimism’s best year either.

But I’m one of the very lucky ones in that I can try retail therapy for now instead of the internal screaming. This last present to myself is a real luxury – I bought a set of six cheap airbrushes (the suction variety rather than the gravity fed ones) to put the Chestnut stains in. Changing bottles on the one gun was leading to a rather surprising amount of stain winding up on my hands and the bottles and the floor because of capillary action and general clumsiness – I’m hoping this will alleviate it (and there’s a seven-euro-special one as well for gravity feeding so I can try other paints). Now, the airbrushes from aliexpress aren’t exactly what you’d call good compared to a Badger or an Iwata (in fact I don’t think they’re good enough to even dust off an Iwata), but for the kind of thing I’m doing, I think using an Iwata would be an absolute travesty of waste. Those should be used by people who know what they’re doing on very fine work, not the stuff I’m doing, which is the airbrush equivalent of painting houses (badly). They’re cheap, they’re cheerful and they should be here before March and I’m looking forward to playing with them.


30
Dec 20

Snowman platoon

So along with the Gnomes, I wanted to do a batch job of something for solstice gifts from the lathe, and while I initially thought trees or baubles, it came down to snowmen in the end. So, first off, ordered a few sycamore spindle blanks from the Carpentry Store.

At that time, and at the moment, they’re one of the most economical (ie. I was being cheap) sources, but only for spindles. I mention this because we’re now two days from Brexit kicking in across the water in the UK, and already the costs of importing blanks from my normal source at HomeOfWood.co.uk have jumped (though not as much as they have for his continental customers). It’s a very small complaint compared to what’s coming, but the woodworking hobbyists here in Ireland may be facing into having to find either local suppliers or continental ones, so costs may be rising right as we face into the damage done by the covid pandemic to our economies. So, y’know, yay. Kinda glad I stocked up over the last few months.

First order of business was to rough out the blanks.

The normally really fast anglefinder off aliexpress wasn’t quite long enough to get to the center reliably on the 70x70mm blanks, so back to more traditional methods.

Then into the lathe and rough out to round. No neat tricks or features here, just the routine operation. Round with the roughing gouge, turn a tenon on both ends, measure to find the half-way point between shoulders and part it in half there.

I really need to get a ring center, this cone live center is fine and all but it wedges something awful into endgrain like this. I did buy a ring drive center, but I don’t think you can use those in the tailstock without a fair amount of grease and smoke. F.Pain talks about using them in The Practical Woodturner but he’s also talking about lathes made from wood and a few other things that don’t really get done much today because (a) we have better alternatives and (b) the skills are mostly lost.

And that’s them all done (along with the blanks that became test gnomes and test trees and so on). Every spindle blank gave two snowmen blanks (and you’ll note there are some missing, they got turned into a test snowman and a wood witch for a friend who’s recovering from the nasty bug of the year).

First thing to buy after you buy a lathe, is a large dustpan and brush.

Also, I think this is what killed my air compressor. It has since given up the ghost with a loud pop, a hiss as it dumped the full tank of air, and a strong stench of burning oil and rubber. Le sigh.

Next up, story stick. Because I don’t want to make that many snowmen and have them all horribly different and disfigured, that’s a Calvin and Hobbes routine.

I used nails here, driven in at the lines and then filed the heads off with the grinder. That’s a standard approach and I think I’m going to completely ditch the entire thing as a bad idea from now on. Yes, you can mark lines with the nails readily enough, but the lathe does want to grab the whole thing out of your hands if you’re clumsy and storing this would be a PITA. I plan to copy this onto a piece of thick-ish plasticard with lines drawn on it and that’ll store in an envelope more readily. Much much easier for the small shed. And marking off every line with a pencil is just easier than with nails, and that’s what I did with the gnomes.

Okay, so that’s the cutoff at the base marked out with a skew cut, and the head has been taken to thickness with my shed-made parting/beading tool thing.

And that’s the middle snowball thicknessed (the thorax? Absnowmen? What body part categories does a snowman have?). And pencil lines marked in at midpoints by eye. And now came the humility lesson.

I tried this with the skew. Yeah. Nope. At this point (and I’m pretty sure, now too), my skew skills were not up to the task. I mean, this was a month of work ago, and we all know that time dilation is in full effect here in 2020 so this was many years ago now, but still. I eventually had to give up on the skew and do the rest of the blanks with the spindle gouge.

Also, I’m going with BB Turning’s advice here about not getting perfect spheres in a snowman. Yup. That’s exactly what I did. Totally deliberately.

And then on to my woodturning secret, buying 80 grit sandpaper in bulk.

And from 80 grit up to 220 grit and then everything was set aside for finishing (the idea being to batch things, so you do each step and repeat, not follow through on every piece like normal).

That’s as close to a montage as I’m getting (also those other new blanks in the last one there were for trees and there are still a few out in the shed today, mocking my working speed).

At this point, various large work projects as well as the gnomes ate all the shed time and it took a while to get to the next stage. But eventually, the snowmen got back on the lathe for sanding sealer, sanding to 360 and then yorkshire grit and hampshire sheen wax (because it’s toy safe).

Then drilling for a nose was done by hand with a 3.5mm drill bit. And that meant it was time to crank out some noses….

Step jaws (which are really meant for expansion mode inside bowls but they do grip down to a small diameter internally and are more stable than my pen jaws), and a dowel length bought some years ago in Woodies of all places. It’s probably poplar, at least it’s not pine.

And yes, I did redeem myself doing all of this with the skew. And after the first few noses, I actually got the rhythm of it, and okay, Steve Jones’ job is safe, but at least I wasn’t quite so dismayed by the results this time.

And yeah, that dowel goes right through the headstock spindle and you just pull out the 70 or 80mm you need to work on at a time.

Mark off the nose length with dividers…

Deepen the mark for the base of the nose with the point of the skew and another mark for where the tenon will end (judged by eye, about 5mm).

Shape with the skew – no, not scraping, but cutting, like a real turner 😀
Don’t look too close at that bevel btw. I’m going to sort out my sharpening jig, honest.

Sand to 120 grit at this stage, and on the other side of the work so I don’t have to move the rest.

That there is a tiny dinky Moore&Wright vernier calipers that only goes to 70mm but is so damn handy to have to hand in the shed. Not giving up my 300mm one but this one is remarkably handy for stuff.

Also, that trick where you cut down to dimension with the caliper itself? Managed that a few times, kinda. Mostly it was just parting tool and guesswork but I got it once or twice.

And then part off at the headstock side of that tenon, using the skew to get as much tenon as possible.

Now pull through the next 80mm (stop the fecking lathe first you daft eejit) and repeat 20 times because you miscounted how many you needed.

Honestly though, I was really happy with those.

Drill a bunch of holes in some scraps and get out the airbrush and….

Two-tone carrot noses 😀

Oh, and also drill a 6mm hole in the top of every snowman’s head for hats, which you can turn in a few different styles (but you’ll have to explain the daft punk headgear to everyone apparently).

Chestnut ebonising lacquer for the hats works well I found.

And now you just have to post them off mid-lockdown. Well, most of them. The one in the Fez is staying with us, and one or two others might as well. We’ll see.

Honestly, I only did 18 of them (including the prototype). I don’t know how BB Turning gets through hundreds every year without going loopy or absolutely mastering the skew.

Overall, I’m happy with them. They came out pretty well, and honestly adding the eyes and arms and things doesn’t give them a lot more. I mean, I did make a few with eyes, but I wonder how long those eyes will remain CA’d in place…

(the original prototype)

Also, I did finally ask Steve Jones about the kit he’s using (yes, I know it’s not the tool, but still) and as a result…

Ashley Iles 1″ flat skew. Not that spendy, about forty euro, but it’s such a more solid tool in the hand. Looking forward to trying it once I sort out the sharpening jig. It’s flat on the faces and round on the edges, so when I get a catch and the work slams it into the rest, it won’t dig out as sharp a gouge as mine does right now.

Tip came dipped in wax to prevent slicing yourself when unpacking it. It’s the small stuff like that that makes the difference with good tools like this – the tang is a bit longer, the fit and finish is that bit better and when you put that all together, it’s just an entirely different class of tool even though there’s no one thing I can point to and say “this is why”.

Really looking forward to using this. And still have some more spindle blanks…