Data, not information
> Microsoft says it can measure 43 metrics* and spit 'em all out into a pretty, GUI-fied Azure console.
And, as usual, all these monitoring tools do is find easy-to-record data from the kernel and present it in a whizzy, pretty graphical format.
Even though it is all totally irrelevant.
Providing this is like telling a car driver the piston temperature, the headlight colour, the average pressure exerted by passengers on the seats and the methane content of the cabin. What drivers want to know is answers to the important questions, such as: Am I going too fast? Will something break? Do I need to take corrective action?
And so it is with computer monitoring. All the monitoring services seem to be in a race (and, truth be told, have been for decades) to provide the greatest number of different measurements of obscure, irrelevant and often inter-related factors. However, none of them provides anything of primary importance, such as: How long do I have to wait for the answer to appear? Can I run something in the background without affecting the important stuff? Is there time to back my stuff up before I go for lunch?
So if Microsoft want merely to mimic all the crap that today's tools produce, they'll go down the metrics route. Even though most of the stuff is irrelevant, has little effect on the BIG QUESTIONS and in itself (without knowing what the applications are doing) provides nothing of value - just as all the other monitoring tools that have come, promised and then disappeared into oblivion have done since the 1980s (remember "sar" and "vmstat"?).
However, if they want to provide something that is truly useful, user-centric and actionable, they will extract I/O data (volume, latency, cache efficiency) on a per-file basis; query times broken down into CPU, storage and network latency; Internet access times by site (or IP address); memory usage per process - identifying shared, CoW'd and local volumes - and, best of all, a calculation of how much "slack" is available in these key areas for users to fill up with additional workloads.
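To be fair, a sliver of that per-process memory breakdown is already exposed by Linux in `/proc/<pid>/smaps_rollup` (shared vs private pages, proportional set size) - the raw data exists; what's missing is anyone turning it into answers. A minimal sketch of a parser, assuming a recent kernel's field names; the sample text below is illustrative, not real output:

```python
# Sketch: per-process memory attribution on Linux via /proc/<pid>/smaps_rollup
# (present since kernel 4.14). Field names are the kernel's own; SAMPLE is
# made-up illustrative data, not captured from a real process.

def parse_smaps_rollup(text: str) -> dict:
    """Parse smaps_rollup-style 'Key: value kB' lines into kB totals."""
    fields = {}
    for line in text.splitlines():
        if ":" in line and line.rstrip().endswith("kB"):
            key, rest = line.split(":", 1)
            fields[key.strip()] = int(rest.split()[0])
    return {
        "rss_kb": fields.get("Rss", 0),
        # Pss = Rss with each shared page divided among its sharers
        "pss_kb": fields.get("Pss", 0),
        # Pages mapped by more than one process
        "shared_kb": fields.get("Shared_Clean", 0) + fields.get("Shared_Dirty", 0),
        # Pages belonging to this process alone
        "private_kb": fields.get("Private_Clean", 0) + fields.get("Private_Dirty", 0),
    }

SAMPLE = """\
Rss:                5000 kB
Pss:                3000 kB
Shared_Clean:       1500 kB
Shared_Dirty:        500 kB
Private_Clean:      1000 kB
Private_Dirty:      2000 kB
"""

mem = parse_smaps_rollup(SAMPLE)
print(mem)  # {'rss_kb': 5000, 'pss_kb': 3000, 'shared_kb': 2000, 'private_kb': 3000}
```

Even this only covers memory; the per-file I/O and latency breakdowns above have no equivalently cheap kernel file to read.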
All of this is hard. And most of it is not available from Unix or Linux kernels without a lot of hacking about. If it were easy, the dozens of other capacity-planning / performance-monitoring programs, companies and freeware would have done it years ago. However, they have all failed to produce information that users value - which is the tricky bit but, ultimately, all that matters.