Re: "dropping the mean absolute percent error by more than 50 per cent across all benchmarks"
There's a link to the paper in the article, you know.
A quick skim suggests that typical error margins for llvm-mca and IACA are around 17-24%, while those for Ithemal are in the 8-9% range.
As the paper points out, though, a number of applications just care which of several alternatives is likely to be fastest, and the accuracy of the specific predicted timings doesn't matter as long as the winner is correct. They use Spearman rank correlation to gauge that, and there Ithemal came in at around 0.96, versus ~0.91 for llvm-mca and ~0.92 for IACA. So if Ithemal is, say, being used by a compiler to select particular optimization strategies in inner loops, it might well squeeze out a non-trivial performance improvement over existing implementations.
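To make the Spearman point concrete: rank correlation ignores the magnitude of the errors and only asks whether the predictions put the candidates in the right order. A toy sketch in pure Python (the cycle counts here are made up, not from the paper):

```python
import math

def ranks(xs):
    # Simple ranking; assumes no ties among the values.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def spearman(xs, ys):
    # Spearman rho = Pearson correlation of the rank vectors.
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical measured vs. predicted cycle counts for four loop variants.
measured  = [10.0, 14.0, 9.0, 20.0]
predicted = [13.0, 18.0, 11.0, 30.0]  # off in absolute terms...
print(spearman(measured, predicted))  # ...but the ordering is perfect: 1.0
```

Here every prediction is wrong by 20-50%, yet a compiler picking the variant with the lowest predicted cost would still pick the true winner, which is exactly why the authors report Spearman correlation alongside the percent-error numbers.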
Ithemal also seems to frequently do better in edge cases where the other models are way off (see Figure 3 of the paper and accompanying discussion). The authors theorize that these represent cases dominated by undocumented Intel microarchitectural optimizations that even Intel's own IACA does not model.
The neural network in this application, by the way, is a pretty standard recurrent (RNN) architecture with LSTM (Long Short-Term Memory) components. That seems like a good choice for this sort of application, since what you want to do is train the network on a set of small, well-labeled atoms (instructions) and then on sequences of them. It's interesting, though, that they also tried a DAG-RNN and it didn't perform as well. Graph neural networks have done well in some other domains; Colyer summarized a paper on them not that long ago. The authors ascribe the underperformance here to the strong effect of specific instruction ordering on microarchitectural optimization.
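The atoms-then-sequences idea is hierarchical: one LSTM reads the tokens of a single instruction and emits an embedding for it, and a second LSTM reads those per-instruction embeddings across the basic block. Here's a deliberately tiny pure-Python sketch of that shape (dimensions, weights, and the random `embed` stand-in are all made up; a real model would learn these and put a regression layer on top):

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def vadd(a, b):
    return [ai + bi for ai, bi in zip(a, b)]

class TinyLSTM:
    """A bare-bones LSTM: one weight pair per gate, no biases."""
    def __init__(self, d_in, d_hid):
        self.d = d_hid
        rnd = lambda r, c: [[random.uniform(-0.1, 0.1) for _ in range(c)]
                            for _ in range(r)]
        # Input (i), forget (f), output (o), and candidate (c) gate weights,
        # each a pair: one matrix for the input x, one for the hidden state h.
        self.W = {g: (rnd(d_hid, d_in), rnd(d_hid, d_hid)) for g in "ifoc"}

    def run(self, xs):
        h = [0.0] * self.d
        c = [0.0] * self.d
        for x in xs:
            pre = {g: vadd(matvec(Wx, x), matvec(Wh, h))
                   for g, (Wx, Wh) in self.W.items()}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            g = [math.tanh(v) for v in pre["c"]]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h  # final hidden state summarizes the whole sequence

D = 8
token_lstm = TinyLSTM(D, D)  # reads the tokens of one instruction
block_lstm = TinyLSTM(D, D)  # reads one embedding per instruction

# Random stand-in for a learned token embedding table.
embed = lambda tok: [random.uniform(-1, 1) for _ in range(D)]

block = [["add", "rax", "rbx"], ["mov", "rcx", "rax"]]
instr_vecs = [token_lstm.run([embed(t) for t in toks]) for toks in block]
block_vec = block_lstm.run(instr_vecs)
# A linear layer on block_vec would produce the throughput prediction.
print(len(block_vec))  # 8
```

Because the block-level LSTM consumes instructions strictly in order, the representation is sensitive to instruction ordering by construction, which fits the authors' explanation for why it beat the DAG-RNN here.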