It's awesome. Try it, folks.
The commercial release by Google of its Spanner database as a public beta last month came as both a pleasant surprise and a wake-up call: perhaps the end of NoSQL databases is in sight. Spanner represents the pinnacle of the so-called NewSQL movement, which is a rejection of NoSQL values and a return to the old values of SQL, …
It may be awesome, and I use it often, but an embedded local database engine is not really the same level of technology as a distributed database server. The Jet4 database engine is pretty good too, despite its age, and better at multi-user use than SQLite (or was last time I looked), but they are minnows compared to Spanner.
As far as I am aware there's no limit on concurrent reads, but there is a limit on concurrent writes (one writer at a time).
I use SQLite for APIs (using the Slim framework) and it's blazing fast when loaded into RAM. It saves buckets of cash on disk IO costs when used this way, even for much larger databases, since RAM is usually cheaper than disk IO (especially on Azure).
Obviously if you have terabytes of data this isn't practical, but most people don't.
The biggest database I have loaded into RAM is 51GB, and the API it serves handles 200k requests a day with ease.
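For anyone wondering how the "loaded into RAM" bit works in practice, here's a minimal sketch using Python's sqlite3 module and its backup API; the file name and table are made up for illustration:

```python
import os
import sqlite3
import tempfile

# Build a small on-disk database (stand-in for a real API database).
path = os.path.join(tempfile.mkdtemp(), "api.db")
disk = sqlite3.connect(path)
disk.execute("CREATE TABLE hits (path TEXT, n INTEGER)")
disk.execute("INSERT INTO hits VALUES ('/status', 1)")
disk.commit()

# Copy it wholesale into an in-memory database with the backup API
# (sqlite3.Connection.backup, Python 3.7+). Reads now come from RAM,
# never touching disk IO.
mem = sqlite3.connect(":memory:")
disk.backup(mem)
disk.close()

row = mem.execute("SELECT path, n FROM hits").fetchone()
print(row)  # → ('/status', 1)
```

The same effect can also be had at the OS level (tmpfs/ramdisk), which is closer to what you'd do for a 51GB database.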
Using a combination of write queuing and read caching, the concurrent write limit generally isn't an issue though.
Nginx is very useful for caching, which helps prevent problems when the database is locked for writing.
Using an AMQP solution helps with setting up a jury-rigged write queuing system.
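The write-queuing idea above can be sketched with a plain in-process queue standing in for the AMQP broker (in a real setup something like RabbitMQ would sit between producers and the writer; everything here is illustrative):

```python
import queue
import sqlite3
import threading

# Single-writer queue: all writes are funnelled through one thread, so
# SQLite's one-writer-at-a-time limit is never contended.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE events (msg TEXT)")

writes = queue.Queue()

def writer():
    while True:
        sql, params = writes.get()
        if sql is None:          # sentinel: shut down the writer
            break
        db.execute(sql, params)
        db.commit()

t = threading.Thread(target=writer)
t.start()

# Producers just enqueue; they never block on the database write lock.
writes.put(("INSERT INTO events VALUES (?)", ("hello",)))
writes.put((None, None))
t.join()

msg = db.execute("SELECT msg FROM events").fetchone()[0]
print(msg)  # → hello
```

Swap `queue.Queue` for a durable broker and you get the jury-rigged version described above, with the bonus that queued writes survive a process restart.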
That said, I generally don't subscribe to the "more boxes is more scale" way of thinking when it comes to databases.
This particular setup is a hard sell for shops used to the Microsoft mentality that calls for more boxes (and therefore more licences... hmm, part of the plan, MS?) to increase scale. But I see that as a waste of resources, since it's unlikely that you're actually maxing a box out before you throw in another. What you really have is a bottleneck, something that can usually be resolved with more (or faster) NICs and faster storage.
If you use replication then reads are pretty much 100% available for an RDBMS. Writes can be too, with certain caveats. Google's system doesn't actually offer anything over that. The election process can still be a problem, as there's a pause, and what about data which had been stored by the now-dead system? It won't be available on the copies yet. It doesn't look like they've solved anything at all!
"This is an implementation of synchronized clocks using GPS receivers and atomic clocks in every data center. This can cause problems during a partition if a node can’t connect to a master – its clock will drift, causing the election of Paxos masters to slow down."
I'd have thought that a GPS/atomic clock would have enough accuracy and resolution that any data centre that was unable to connect would still have the same time as other data centres, for quite a while.
Exactly: if you use NTP and lose the time-server link you get drift, but if you have local stratum-1 servers (i.e. time servers that get their time from an atomic clock, either directly or, most commonly, via GPS time transfer) that simply should not happen.
Still, all that using 'time' as a marker does is reduce the window of uncertainty in any split-decision issues; it's not like an atomic (in the computing sense) transaction counter or similar that could be used to eliminate it. After all, you will get some variation in packet delays from originator(s) to SQL-like server(s), so time is not an absolute marker for event order in this case. But if you know your worst-case error is only tens of microseconds, you can at least narrow the window of event/decision uncertainty to be resolved.
Also (back to another rant of mine), regarding Google's time-smoothing: that is a bad idea, but it is only needed, or possibly justified, if you use time_t/UTC as your system clock. How do you guarantee drift at stable rates? Keeping all system clocks on atomic time (e.g. GPS, or TDT) avoids the leap-second issues and allows reliable syncing to an atomic-disciplined local clock.
I think it's unrealistic to expect a one page article to get just a random person up to speed on complex subjects like these.
Having said that, I found what was covered clearly and concisely explained and, as a mostly db-side dev, had no great trouble following the general gist of it.
Maybe it's just me though ;-)
p.s. I recognize the possibility that you might know more about the subject and found the treatment too simple. However, bear in mind that much more detail might have lost the rest of us.
@zebthecat, typically you would still only write to one master; this ensures that write data is "always" consistent, even if there is a split brain.
There is an alternative which is quite common though, where there is a quorum of hosts: data can be written to any host, but the transaction is not committed until the majority agree it has been successful.
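The quorum idea above boils down to counting acknowledgements and only committing past a majority. A toy sketch (replicas are modelled as plain callables that return True on a successful ack; all names are invented for illustration):

```python
def quorum_write(replicas, value):
    # Send the write to every replica and count acknowledgements.
    acks = 0
    for replica in replicas:
        if replica(value):   # replica returns True on a successful ack
            acks += 1
    # Commit only if a strict majority acknowledged the write.
    committed = acks > len(replicas) // 2
    return committed, acks

# Three replicas, some of which may be down (always fail to ack).
up = lambda v: True
down = lambda v: False

print(quorum_write([up, up, down], "x"))    # → (True, 2): majority reached
print(quorum_write([up, down, down], "x"))  # → (False, 1): no quorum, abort
```

Real systems (Paxos, Raft, and friends) add leader election and log ordering on top, but the majority-ack rule is the core of why a split brain can't commit conflicting writes on both sides.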
"I want to save data and I want to save it now. I dont want to define a new row with int or VARCHAR or define how large the table can be."
Don't we all? But the rest of us want to be able to find it again, and to have it consistent when we find it, especially if multiple people are saving related data. We also have multiple people wanting to access the same data.
However, there's always the much under-rated cardfile.exe, which may do just what you want.
There was a very good reason relational DB theory was created: it mirrors real-world business activity, for which flat-file DBs had reached the end of the road in the 70s. This is a crucial point that the NoSQL kids who refused to learn from history completely ignored (or didn't even know about), and they happily re-invented 1950s and 1960s wheels thinking they were being cutting edge. Oh, but they had some cool buzzwords and marketing. Yay.
I agree with your points in general. SQL has some massive advantages when dealing with structured data.
It's a lingua franca, rather than requiring you to learn a query language for each product.
Also, while setting up the tables may not be super easy (and why should it be?), you can do pretty much anything you want with the system afterwards, as long as your query skills and the performance are up to it.
However, even outside of webby scaling/distribution problems, SQL does have at least one massive Achilles' heel: self-referential data. Parts-of-parts, such as modelling a car engine, are not a good fit for the standard relational model. Configuration data is another case: defaults, options, overrides all make sense in JSON, less so in SQL.
I suspect a good solution might involve co-opting graph databases on occasion. And I rather like the liberty that comes with sprinkling some JSON data into PostgreSQL columns, without having to shoehorn every problem into SQL-type ERDs.
i.e. I don't think MongoDB and the like are so hot outside of some particular, albeit broad, domains, but I like the fact that we are experiencing so much innovation in alternative ways to organize data rather than being forced to use SQL for everything.
"SQL does have at least one massive Achille's heel - self-referential data, like parts-of-part"
Oh dear. I wish someone had told me that before I wrote one of those. Especially with the fun complication that one part at the bottom of the assembly could have a serial number and that that serial number would apply to each part above it.
Configuration bits and pieces have never been a problem in my experience. Bits made from other bits (bill of materials) can be, but nowadays there are a few standard things you can do to work around that in a logical way, at least in 99% of cases I've seen.
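One of those standard workarounds is an adjacency list walked with a recursive CTE, which standard SQL has supported for a while now. A minimal sketch via Python's sqlite3 (table and part names are made up):

```python
import sqlite3

# Parts-of-parts as an adjacency list: each part points at its parent.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT, parent INTEGER);
INSERT INTO part VALUES
  (1, 'engine', NULL),
  (2, 'block',  1),
  (3, 'piston', 2),
  (4, 'ring',   3);
""")

# Walk the assembly from the top down with a recursive CTE.
rows = db.execute("""
WITH RECURSIVE tree(id, name, depth) AS (
  SELECT id, name, 0 FROM part WHERE parent IS NULL
  UNION ALL
  SELECT p.id, p.name, t.depth + 1
  FROM part p JOIN tree t ON p.parent = t.id
)
SELECT name, depth FROM tree ORDER BY depth
""").fetchall()

for name, depth in rows:
    print("  " * depth + name)  # indented one level per depth
```

It's not as elegant as a native graph query, but it handles arbitrary-depth bills of materials without leaving the relational model.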
The classic problem with distributed relational databases occurs when attempting to update the same row in two locations. Heterogeneous distributed systems have to deal with consistency at the application level when ATM cash withdrawals happen at the same time as e-commerce, e-banking, batch and branch transactions. It turns out that replicating immutable transactions avoids many of the needs for ACID updates.
Spanner is a great example of what can be achieved when you have control over a dedicated network, but so are Teradata, Netezza, Exadata and SQL Server PDW.
So I don't understand now. The architecture of Spanner sounds interesting and all, but based on what I've read in this article, the benefits of Spanner are primarily achieved by having a very high-quality infrastructure under it.
Spanner itself is a NoSQL database that incorporates some of the SQL philosophy, but it is still just as susceptible to the exact same problems as literally every other database out there.
Did I miss something? What can Spanner do that every other distributed database on the planet couldn't do, given the same high-quality infrastructure?
The other major differentiator is the use of the Raft algorithm for leader election. The article says Paxos, but it's really Raft. This gives it a scalability that conventionally architected RDBMSs currently can't achieve, and is why the architecture's been lifted for Apache Kudu. This makes it stupidly scalable and very performant in a single datacentre. It's Google's addition of TrueTime that allows them to scale across datacentres.
"you can replicate doing the same but across geographic locations."
Not really, you can't. There's a very good reason Google dropped their already-heavily-modified MySQL for Spanner. Critically, traditional transactional architectures come with huge trade-offs when you try to distribute transactions geographically. Throwing Paxos into the mix gives you the consensus semantics needed to scale transaction management, and throwing global time synchronisation on top allows you to use those semantics consistently (for some large value of "consistently").
I mean, you're welcome to try, but you'll soon run into a wall.
It sounds like a layer over big table to aid partitioning and querying.
If you have been building scalable systems for any length of time, you'll already have developed all the strategies you need to make MySQL/NoSQL work for you, and at a fraction of the cost.
"It sounds like a layer over big table..."
Only on the surface. BigTable is a log-structured merge-tree. It is very, very highly optimised for random IO, to the point where doing any kind of scan at all is going to slow performance to a shocking crawl. Spanner sits in the middle ground between immutable columnar data and pure key-value stores. Not as fast for raw analytics, and not as fast for raw IO, but mutable and analytical at the same time.
Really? I think that's true for NoSQL, but there isn't a clear leader for transactional MySQL workloads. There seem to be a lot of strategies, and a lot of bolt-on solutions to MySQL that implement some kind of middleware sharding layer. There are very few truly scalable (shared-nothing) strategies that also provide the guarantees of ACID, transactions without caveats, and all the things you would want from your operational data store (e.g. consistent backups). ClustrixDB seems to provide that (as well as some measure of MySQL compatibility), as does NuoDB (though I haven't tried that one because I'd have to rewrite my app).