"Why is network latency acceptable for end-points but not in servers?"
I think it comes down to basic expectations. Network latency actually isn't acceptable for end points in many cases, which is why we have so many CDNs out there specifically to reduce it, plus fancy DNS systems that route traffic based on region or IP using things like BGP anycast. One place I was at changed DNS providers because it shaved something like 15ms off their DNS queries.
In other cases it's not a big deal: people have always been used to the latency, and it's difficult to break the speed of light, so there really isn't a lot you can do beyond the aforementioned options. Building multiple data centers and geo-diversifying your data is a really complicated thing to get right, which is of course why most folks don't do it.
Even going back a decade or more, in Google's early days they were addressing latency by storing everything in main memory.
There are solutions to address latency on the server end; it's just a matter of how much latency is acceptable. The extreme cases of "must have everything local on SSD" are, I think, very rare. My own workload on an e-commerce site that pulls in a bunch of $$ runs mostly out of memory as well (the storage workload is 90%+ writes). It wasn't intended to be that way (it was a surprise to all of us when we got the real stats), but with heavy caching it just worked out like that.
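That "heavy caching just worked out that way" effect is basically a read-through cache with a very hot key set. A minimal sketch (the `slow_lookup` backing store and the hit/miss counters are my own illustration, not anything from the original post):

```python
class ReadThroughCache:
    """Minimal read-through cache: serve reads from memory,
    fall back to the slower backing store on a miss."""

    def __init__(self, backing_lookup):
        self.backing_lookup = backing_lookup  # e.g. a DB/disk read
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.backing_lookup(key)  # slow path: disk/network
        self.cache[key] = value
        return value


# Hypothetical backing store standing in for a real DB/disk read.
def slow_lookup(key):
    return f"value-for-{key}"


cache = ReadThroughCache(slow_lookup)
# Skewed access pattern: the same hot key over and over,
# as with popular products on a real e-commerce site.
for _ in range(100):
    cache.get("hot-product")
print(cache.hits, cache.misses)  # → 99 1
```

With skewed access like this, almost all reads come out of memory, so what's left hitting storage is mostly writes, which is roughly what our stats showed.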
A better, longer-term solution is of course smarter applications: things that use I/O well rather than brute-forcing it with more hardware. As some of the early DBAs told me years ago - you can make a query 10% faster by adding hardware, or 100-1000% faster by fixing the query/data.
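That DBA rule of thumb is easy to demonstrate with a toy example (the data here is made up; the point is the shape of the fix, swapping a full scan for an index lookup, which is exactly what adding a database index does):

```python
import random
import time

# A fake "table" of 200k rows, deliberately unsorted.
N = 200_000
rows = [(i, f"user-{i}") for i in range(N)]
random.shuffle(rows)


def scan_lookup(rows, user_id):
    """The 'bad query': full table scan per lookup, O(n)."""
    for rid, name in rows:
        if rid == user_id:
            return name


# The 'fixed data': build an index once, then do O(1) lookups.
index = {rid: name for rid, name in rows}

targets = [random.randrange(N) for _ in range(200)]

t0 = time.perf_counter()
scan_results = [scan_lookup(rows, t) for t in targets]
t_scan = time.perf_counter() - t0

t0 = time.perf_counter()
idx_results = [index[t] for t in targets]
t_idx = time.perf_counter() - t0

assert scan_results == idx_results
print(f"scan: {t_scan:.3f}s  indexed: {t_idx:.6f}s")
```

On anything resembling real data sizes the gap is orders of magnitude, which is why fixing the query or the data layout beats throwing hardware at it.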