Why can't RDBMSs cluster the way NoSQL does?

Question

One of the big plusses for NoSQL DBMSs is that they can cluster more easily. Supposedly with NoSQL you can set up hundreds of cheap machines that store different pieces of data and query them all at once.

My question is this: why can't relational DBMSs such as MySQL or SQL Server do this? Is it that the vendors just haven't figured out a technical way to do this with their existing products, or is there some issue with the relational model that prevents this from being feasible? What is so great about the NoSQL way of storing and accessing data (key/value, documents, etc.) that makes clustering easier, if this is true at all?

Storing different bits of data over different machines (sharding) is technically incredibly easy, compared to something like Oracle RAC which can run on up to 63 nodes each presenting the same database, all being ACID compliant etc. "Clustering" in NoSQL is easy because there's no ACID, they use their own proprietary APIs and they're relatively simple. — Philᵀᴹ, Feb 17 '13 at 16:29
So I have read and understood (most of) concerned's reply below. The issue seems to have much more to do with ACID than it does with the relational model. Are there solutions that use the relational model, even if they are not fully ACID compliant, in the same way NoSQL is? It seems like NoSQL should really be named NoACID as it has nothing to do with SQL or the relational model, and everything to do with consistency, atomicity, data access and storage locations, etc. — fregas, Feb 22 '13 at 22:33
@fregas - NoSQL doesn't have any formal definition. It's just a buzzword applied to various database management systems. Quorum replication (A.K.A. eventual consistency) is used by many such systems (although by no means all) as a performance optimisation. I'm not aware of any RDBMS product that uses quorum replication - certainly none of the mainstream ones do. There's no theoretical reason why it couldn't be done, but it would be rather complex and of questionable value given the level of scalability that can be achieved by shared disk systems anyway. — ConcernedOfTunbridgeWells, Feb 22 '13 at 23:06
@ConcernedOfTunbridgeWells Quorum Replication is inconsistent with ACID though which is why it won't be done. — Chris Travers, Feb 23 '13 at 14:14
@fregas I agree with you. The same applies when people say "NoSQL avoids joins to give quicker results". Joins are avoided by denormalization, and if denormalization is OK then joins can be avoided in an RDBMS too! So, to extend your thought, it should be called NoACID-Denormalized-Datadump. It is just a data dump of any structure; it can be inconsistent; it may not be durable. In short, NoSQL systems accept no restrictions and call themselves DIFFERENT, which is very misleading. — Kaushik Lele, Jun 06 '15 at 11:32
@ChrisTravers MySQL replication by default follows eventual consistency. — asn, Jul 07 '23 at 06:22

ConcernedOfTunbridgeWells · Answer 1 · 2018-09-03T09:01:45.963

Distributed Database Systems 101

Or, Distributed Databases - what the FK does 'web scale' actually mean?

Distributed database systems are complex critters and come in a number of different flavours. Digging into the depths of my dimly remembered university studies on this, I'll try to explain some of the key engineering problems in building a distributed database system.

First, some terminology

ACID (Atomicity, Consistency, Isolation and Durability) properties: These are the key invariants that have to be enforced for a transaction to be reliably implemented without causing undesirable side effects.

Atomicity requires that the transaction complete or rollback completely. Partially finished transactions should never be visible, and the system has to be built in a way that prevents this from happening.

Consistency requires that a transaction should never violate any invariants (such as declarative referential integrity) that are guaranteed by the database schema. For example, if a foreign key exists it should be impossible to insert a child record with a reference to a non-existent parent.

Isolation requires that transactions should not interfere with each other. The system should guarantee the same results if the transactions are executed in parallel or sequentially. In practice most RDBMS products allow modes that trade off isolation against performance.

Durability requires that once committed, the transaction remains in persistent storage in a way that is robust to hardware or software failure.

I'll explain some of the technical hurdles these requirements present on distributed systems below.

Shared Disk Architecture: An architecture in which all processing nodes in a cluster have access to all of the storage. This can present a central bottleneck for data access. An example of a shared-disk system is Oracle RAC or Exadata.

Shared Nothing Architecture: An architecture in which processing nodes in a cluster have local storage that is not visible to other cluster nodes. Examples of shared-nothing systems are Teradata and Netezza.

Shared Memory Architecture: An architecture in which multiple CPUs (or nodes) can access a shared pool of memory. Most modern servers are of a shared memory type. Shared memory facilitates certain operations such as caches or atomic synchronisation primitives that are much harder to do on distributed systems.

Synchronisation: A generic term describing various methods for ensuring consistent access to a shared resource by multiple processes or threads. This is much harder to do on distributed systems than on shared memory systems, although some network architectures (e.g. Teradata's BYNET) had synchronisation primitives in the network protocol. Synchronisation can also come with a significant amount of overhead.

Semi-Join: A primitive used in joining data held in two different nodes of a distributed system. Essentially it consists of enough information about the rows to join being bundled up and passed by one node to the other in order to resolve the join. On a large query this could involve significant network traffic.
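To make the semi-join idea concrete, here is a minimal sketch that simulates two nodes as plain Python lists. The table names, columns and the `semi_join` helper are invented for illustration; a real DBMS would ship the keys (or a compact summary of them) over the network rather than pass lists around in one process.

```python
# Hypothetical data: 'orders' lives on node A, 'customers' on node B.
orders_on_node_a = [
    {"order_id": 1, "customer_id": 10, "total": 25.0},
    {"order_id": 2, "customer_id": 30, "total": 40.0},
    {"order_id": 3, "customer_id": 10, "total": 15.0},
]
customers_on_node_b = [
    {"customer_id": 10, "name": "Ann"},
    {"customer_id": 20, "name": "Bob"},
    {"customer_id": 30, "name": "Cy"},
]


def semi_join(local_rows, remote_rows, key):
    """Join local_rows to remote_rows, shipping only what the join needs."""
    # Step 1: the local node sends just the distinct join keys.
    keys_sent = {row[key] for row in local_rows}
    # Step 2: the remote node returns only the rows matching those keys.
    rows_returned = [row for row in remote_rows if row[key] in keys_sent]
    # Step 3: the local node completes the join with the reduced row set.
    lookup = {row[key]: row for row in rows_returned}
    joined = [{**row, **lookup[row[key]]} for row in local_rows if row[key] in lookup]
    return joined, len(keys_sent), len(rows_returned)


joined, keys_shipped, rows_shipped = semi_join(
    orders_on_node_a, customers_on_node_b, "customer_id"
)
```

Only two keys travel one way and two customer rows travel back, rather than the whole customer table. On a large query both numbers grow with the data, which is where the network traffic comes from.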

Eventual Consistency: A term used to describe transaction semantics that trade off immediate update (consistency on reads) on all nodes of a distributed system for performance (and therefore higher transaction throughput) on writes. Eventual consistency is a side effect of using Quorum Replication as a performance optimisation to speed up transaction commits in distributed databases where multiple copies of data are held on separate nodes.

Lamport's Algorithm: An algorithm for implementing mutual exclusion (synchronisation) across systems with no shared memory. Normally mutual exclusion within a system requires an atomic read-compare-write or similar instruction of a type normally only practical on a shared memory system. Other distributed synchronisation algorithms exist, but Lamport's was one of the first and is the best known. Like most distributed synchronisation mechanisms, Lamport's algorithm is heavily dependent on accurate timing and clock synchronisation between cluster nodes.

Two Phase Commit (2PC): A family of protocols that ensure that database updates involving multiple physical systems commit or roll back consistently. Whether 2PC is used within a system or across multiple systems via a transaction manager it carries a significant overhead.

In a two-phase commit protocol the transaction manager asks the participating nodes to persist the transaction in such a way that they can guarantee that it will commit, then signal this status. When all nodes have returned a 'happy' status it then signals the nodes to commit. The transaction is still regarded as open until all of the nodes send a reply indicating the commit is complete. If a node goes down before signalling the commit is complete the transaction manager will re-query the node when it comes back up until it gets a positive reply indicating the transaction has committed.
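The happy path and the abort path of the protocol described above can be sketched roughly as follows. The `Participant` class and `two_phase_commit` function are hypothetical names for this example, and the sketch deliberately leaves out the hard parts (logging, timeouts, and re-querying a node that went down mid-commit).

```python
class Participant:
    """One node taking part in a distributed transaction (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: persist enough to guarantee a later commit, then vote.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2: make the prepared changes permanent.
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: everyone voted yes, so tell them all to commit.
        for p in participants:
            p.commit()
        return True
    # At least one node could not prepare, so undo the work everywhere.
    for p in participants:
        p.rollback()
    return False


ok_nodes = [Participant("a"), Participant("b")]
bad_nodes = [Participant("a"), Participant("b", can_commit=False)]
committed = two_phase_commit(ok_nodes)
aborted = two_phase_commit(bad_nodes)
```

Even in this toy form every transaction costs two round trips to every participant, which is the overhead the definition refers to.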

Multi-Version Concurrency Control (MVCC): Managing contention by writing new versions of the data to a different location and allowing other transactions to see the old version of the data until the new version is committed. This reduces database contention at the expense of some additional write traffic to write the new version and then mark the old version as obsolete.
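A toy version store shows the basic mechanism. This is only a sketch under simplifying assumptions: `VersionedStore` is a made-up class, every write commits immediately, and old versions are never garbage collected.

```python
class VersionedStore:
    """Toy MVCC store: each key keeps a list of (commit_ts, value) versions."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # logical commit counter

    def write(self, key, value):
        # A write appends a new version instead of overwriting the old one.
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def snapshot(self):
        # A reader remembers the commit counter at the time it started.
        return self.clock

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None


store = VersionedStore()
store.write("balance", 100)
reader_snapshot = store.snapshot()  # a long-running reader starts here
store.write("balance", 80)          # a later writer commits a new version
old_view = store.read("balance", reader_snapshot)
new_view = store.read("balance", store.snapshot())
```

The reader that started before the second write keeps seeing 100 without blocking the writer. On a distributed system the commit counter itself has to be agreed between nodes, which is part of what makes isolation expensive there.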

Election Algorithm: Distributed systems involving multiple nodes are inherently less reliable than a single system as there are more failure modes. In many cases some mechanism is needed for clustered systems to deal with failure of a node. Election algorithms are a class of algorithms used to select a leader to coordinate a distributed computation in situations where the 'leader' node is not 100% determined or reliable.
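As a very small illustration, the selection rule used by bully-style election algorithms can be sketched like this. The `elect_leader` function and the `is_alive` callback are assumptions for the example; a real implementation exchanges messages with timeouts rather than calling a local function.

```python
def elect_leader(node_ids, is_alive):
    """Pick the highest-numbered node that responds (bully-style rule)."""
    live = [n for n in node_ids if is_alive(n)]
    return max(live) if live else None


nodes = [1, 2, 3, 4, 5]
down = {5}  # node 5, the old leader, has failed
leader = elect_leader(nodes, lambda n: n not in down)

down.add(4)  # the new leader fails as well
new_leader = elect_leader(nodes, lambda n: n not in down)
```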

Horizontal Partitioning: A table may be split across multiple nodes or storage volumes by its key. This allows a large data volume to be split into smaller chunks and distributed across storage nodes.

Sharding: A data set may be horizontally partitioned across multiple physical nodes in a shared-nothing architecture. Where this partitioning is not transparent (i.e. the client must be aware of the partition scheme and work out which node to query explicitly) this is known as sharding. Some systems (e.g. Teradata) do split data across nodes but the location is transparent to the client; the term is not normally used in conjunction with this type of system.

Consistent Hashing: An algorithm used to allocate data to partitions based on the key. It is characterised by even distribution of the hash keys and the ability to elastically expand or reduce the number of buckets efficiently. These attributes make it useful for partitioning data or load across a cluster of nodes where the size can change dynamically with nodes being added or dropping off the cluster (perhaps due to failure).
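A minimal hash ring illustrates the elastic property. This is a sketch, not any particular product's implementation: the `HashRing` class, the node names and the choice of 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value):
    # Stable hash (Python's built-in hash() is randomised per process).
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node owns many points on the ring to even out load.
        for i in range(self.vnodes):
            bisect.insort(self.ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def lookup(self, key):
        # Walk clockwise to the first virtual node at or after the key's hash.
        idx = bisect.bisect_left(self.ring, (_hash(key), ""))
        return self.ring[idx % len(self.ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("node-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

When `node-d` joins, the only keys that change owner are the ones it takes over; keys never shuffle between the existing nodes. With a plain `hash(key) % node_count` scheme nearly every key would move instead.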

Multi-Master Replication: A technique that allows writes across multiple nodes in a cluster to be replicated to the other nodes. This technique facilitates scaling by allowing some tables to be partitioned or sharded across servers and others to be synchronised across the cluster. Writes must be replicated to all nodes as opposed to a quorum, so transaction commits are more expensive on a multi-master replicated architecture than on a quorum replicated system.

Non-Blocking Switch: A network switch that uses internal hardware parallelism to achieve throughput that is proportional to the number of ports with no internal bottlenecks. A naive implementation can use a crossbar mechanism, but this has O(N^2) complexity for N ports, limiting it to smaller switches. Larger switches can use a more complex internal topology called a non-blocking minimal spanning switch to achieve linear throughput scaling without needing O(N^2) hardware.

Making a distributed DBMS - how hard can it be?

Several technical challenges make this quite difficult to do in practice. Apart from the added complexity of building a distributed system the architect of a distributed DBMS has to overcome some tricky engineering problems.

Atomicity on distributed systems: If the data updated by a transaction is spread across multiple nodes the commit/rollback of the nodes must be coordinated. This adds a significant overhead on shared-nothing systems. On shared-disk systems this is less of an issue as all of the storage can be seen by all of the nodes so a single node can coordinate the commit.

Consistency on distributed systems: To take the foreign key example cited above, the system must be able to evaluate a consistent state. For example, if the parent and child of a foreign key relationship could reside on different nodes, some sort of distributed locking mechanism is needed to ensure that outdated information is not used to validate the transaction. If this is not enforced you could have (for example) a race condition where the parent is deleted after its presence is verified but before the child is inserted.

Delayed enforcement of constraints (i.e. waiting until commit to validate DRI) requires the lock to be held for the duration of the transaction. This sort of distributed locking comes with a significant overhead.

If multiple copies of data are held (this may be necessary on shared-nothing systems to avoid unnecessary network traffic from semi-joins) then all copies of the data must be updated.

Isolation on distributed systems: Where data affected by a transaction resides on multiple system nodes, the locks and versions (if MVCC is in use) must be synchronised across the nodes. Guaranteeing serialisability of operations, particularly on shared-nothing architectures where redundant copies of data may be stored, requires a distributed synchronisation mechanism such as Lamport's Algorithm, which also comes with a significant overhead in network traffic.

Durability on distributed systems: On a shared disk system the durability issue is essentially the same as a shared-memory system, with the exception that distributed synchronisation protocols are still required across nodes. The DBMS must journal writes to the log and write the data out consistently. On a shared-nothing system there may be multiple copies of the data or parts of the data stored on different nodes. A two-phase commit protocol is needed to ensure that the commit happens correctly across the nodes. This also incurs significant overhead.

On a shared-nothing system the loss of a node can mean data is not available to the system. To mitigate this data may be replicated across more than one node. Consistency in this situation means that the data must be replicated to all nodes where it normally resides. This can incur substantial overhead on writes.

One common optimisation made in NoSQL systems is the use of quorum replication and eventual consistency to allow the data to be replicated lazily while guaranteeing a certain level of resiliency of the data by writing to a quorum before reporting the transaction as committed. The data is then replicated lazily to the other nodes where copies of the data reside.

Note that 'eventual consistency' is a major trade-off on consistency that may not be acceptable if the data must be viewed consistently as soon as the transaction is committed. For example, on a financial application an updated balance should be available immediately.

Shared-Disk systems

A shared-disk system is one where all of the nodes have access to all of the storage. Thus, computation is independent of location. Many DBMS platforms can also work in this mode - Oracle RAC is an example of such an architecture.

Shared disk systems can scale substantially as they can support an M:M relationship between storage nodes and processing nodes. A SAN can have multiple controllers and multiple servers can run the database. These architectures have a switch as a central bottleneck but crossbar switches allow this switch to have a lot of bandwidth. Some processing can be offloaded onto the storage nodes (as in the case of Oracle's Exadata) which can reduce the traffic on the storage bandwidth.

Although the switch is theoretically a bottleneck the bandwidth available means that shared-disk architectures will scale quite effectively to large transaction volumes. Most mainstream DBMS architectures take this approach because it affords 'good enough' scalability and high reliability. With a redundant storage architecture such as fibre channel there is no single point of failure as there are at least two paths between any processing node and any storage node.

Shared-Nothing systems

Shared-nothing systems are systems where at least some of the data is held locally to a node and is not directly visible to other nodes. This removes the bottleneck of a central switch, allowing the database to scale (at least in theory) with the number of nodes. Horizontal partitioning allows the data to be split across nodes; this may be transparent to the client or not (see Sharding above).

Because the data is inherently distributed a query may require data from more than one node. If a join needs data from different nodes a semi-join operation is used to transfer enough data to support the join from one node to another. This can result in a large amount of network traffic, so optimising the distribution of the data can make a big difference to query performance.

Often, data is replicated across nodes of a shared-nothing system to reduce the necessity for semi-joins. This works quite well on data warehouse appliances as the dimensions are typically many orders of magnitude smaller than the fact tables and can be easily replicated across nodes. They are also typically loaded in batches so the replication overhead is less of an issue than it would be on a transactional application.

The inherent parallelism of a shared-nothing architecture makes them well suited to the sort of table-scan/aggregate queries characteristic of a data warehouse. This sort of operation can scale almost linearly with the number of processing nodes. Large joins across nodes tend to incur more overhead as the semi-join operations can generate lots of network traffic.

Moving large data volumes is less useful for transaction processing applications, where the overhead of multiple updates makes this type of architecture less attractive than a shared disk. Thus, this type of architecture tends not to be used widely outside of data warehouse applications.

Sharding, Quorum Replication and Eventual Consistency

Quorum Replication is a facility where a DBMS replicates data for high availability. This is useful for systems intended to work on cheaper commodity hardware that has no built-in high-availability features like a SAN. In this type of system the data is replicated across multiple storage nodes for read performance and redundant storage to make the system resilient to hardware failure of a node.

However, replication of writes to all nodes is O(M x N) for M nodes and N writes. This makes writes expensive if the write must be replicated to all nodes before a transaction is allowed to commit. Quorum replication is a compromise that allows writes to be replicated to a subset of the nodes immediately and then lazily written out to the other nodes by a background task. Writes can be committed more quickly, while providing a certain degree of redundancy by ensuring that they are replicated to a minimal subset (quorum) of nodes before the transaction is reported as committed to the client.

This means that reads off nodes outside the quorum can see obsolete versions of the data until the background process has finished writing data to the rest of the nodes. The semantics are known as 'Eventual Consistency' and may or may not be acceptable depending on the requirements of your application but mean that transaction commits are closer to O(1) than O(n) in resource usage.
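The behaviour described in the last three paragraphs can be simulated in a few lines. This is a toy model under stated assumptions: `QuorumCluster` and `Replica` are made-up classes, the quorum is simply the first few replicas, and the "background task" is a method called by hand.

```python
class Replica:
    def __init__(self):
        self.data = {}


class QuorumCluster:
    """Toy model: write to a quorum synchronously, the rest lazily."""

    def __init__(self, n_replicas, write_quorum):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.write_quorum = write_quorum
        self.pending = []  # (replica_index, key, value) still to be applied

    def write(self, key, value):
        for i, replica in enumerate(self.replicas):
            if i < self.write_quorum:
                # Synchronous part: quorum members get the value before commit.
                replica.data[key] = value
            else:
                # Everything else is queued for the background task.
                self.pending.append((i, key, value))
        return self.write_quorum  # replica writes paid for before commit

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)

    def run_background_sync(self):
        # Apply queued writes in order so the latest value wins.
        for i, key, value in self.pending:
            self.replicas[i].data[key] = value
        self.pending.clear()


cluster = QuorumCluster(n_replicas=5, write_quorum=3)
cluster.write("balance", 100)
cluster.run_background_sync()           # all five replicas now hold 100
sync_cost = cluster.write("balance", 80)
fresh = cluster.read("balance", 0)      # replica inside the quorum
stale = cluster.read("balance", 4)      # replica outside the quorum
cluster.run_background_sync()
converged = cluster.read("balance", 4)
```

The commit pays for three replica writes rather than five, and that number stays fixed as replicas are added. The price is the window in which replica 4 still reports the old balance.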

Sharding requires the client to be aware of the partitioning of data within the databases, often using a type of algorithm known as 'consistent hashing'. In a sharded database the client hashes the key to determine which server in the cluster to issue the query to. As the requests are distributed across nodes in the cluster there is no bottleneck with a single query coordinator node.

These techniques allow a database to scale at a near-linear rate by adding nodes to the cluster. Theoretically, quorum replication is only necessary if the underlying storage medium is to be considered unreliable. This is useful if commodity servers are to be used but is of less value if the underlying storage mechanism has its own high availability scheme (for example a SAN with mirrored controllers and multi-path connectivity to the hosts).

For example, Google's BigTable does not implement Quorum Replication by itself, although it does sit on GFS, a clustered file system that does use quorum replication. BigTable (or any shared-nothing system) could use a reliable storage system with multiple controllers and partition the data among the controllers. Parallel access would then be achieved through partitioning of the data.

Back to RDBMS platforms

There is no inherent reason that these techniques could not be used with an RDBMS. However lock and version management would be quite complex on such a system and any market for such a system is likely to be quite specialised. None of the mainstream RDBMS platforms use quorum replication and I'm not specifically aware of any RDBMS product (at least not one with any significant uptake) that does.

Shared-disk and shared-nothing systems can scale up to very large workloads. For instance, Oracle RAC can support 63 processing nodes (which could be large SMP machines in their own right) and an arbitrary number of storage controllers on the SAN. An IBM Sysplex (a cluster of zSeries mainframes) can support multiple mainframes (each with substantial processing power and I/O bandwidth of their own) and multiple SAN controllers. These architectures can support very large transaction volumes with ACID semantics, although they do assume reliable storage. Teradata, Netezza and other vendors make high-performance analytic platforms based on shared-nothing designs that scale to extremely large data volumes.

So far, the market for cheap but ultra-high volume fully ACID RDBMS platforms is dominated by MySQL, which supports sharding and multi-master replication. MySQL does not use quorum replication to optimise write throughput, so transaction commits are more expensive than on a NoSQL system. Sharding allows very high read throughputs (for example Facebook uses MySQL extensively), so this type of architecture scales well on read-heavy workloads.

An interesting debate

BigTable is a shared-nothing architecture (essentially a distributed key-value pair) as pointed out by Michael Hausenblas below. My original evaluation of it included the MapReduce engine, which is not a part of BigTable but would normally be used in conjunction with it in its most common implementations (e.g. Hadoop/HBase and Google's MapReduce framework).

Comparing this architecture with Teradata, which has physical affinity between storage and processing (i.e. the nodes have local storage rather than a shared SAN) you could argue that BigTable/MapReduce is a shared disk architecture through the globally visible parallel storage system.

The processing throughput of a MapReduce style system such as Hadoop is constrained by the bandwidth of a non-blocking network switch.¹ Non-blocking switches can, however, handle large bandwidth aggregates due to the parallelism inherent in the design, so they are seldom a significant practical constraint on performance. This means that a shared disk architecture (perhaps better referred to as a shared-storage system) can scale to large workloads even though the network switch is theoretically a central bottleneck.

The original point was to note that although this central bottleneck exists in shared-disk systems, a partitioned storage subsystem with multiple storage nodes (e.g. BigTable tablet servers or SAN controllers) can still scale up to large workloads. A non-blocking switch architecture can (in theory) handle as many concurrent connections as it has ports.

¹ Of course the processing and I/O throughput available also constitutes a limit on performance but the network switch is a central point through which all traffic passes.

Good write-up, however one factual mistake: BigTable is shared-nothing, details see http://research.google.com/archive/bigtable.html - care to correct this? — Michael Hausenblas, Feb 21 '13 at 12:55
No, BigTable is shared disk. Any processing node can retrieve data from any storage node. There is no storage local to processing nodes that is not globally accessible. It does use a clustered file system but all nodes are globally visible. — ConcernedOfTunbridgeWells, Feb 21 '13 at 14:30
@Michael Hausenblas On second thoughts, if you take the BigTable DB in isolation I'd go with the shared-nothing claim. I have conflated it with the whole MapReduce/Hadoop stack (where there is no specific affinity between processing and storage) in this article. You could quite reasonably argue the inappropriateness of that conflation. — ConcernedOfTunbridgeWells, Feb 21 '13 at 16:24
A couple technical thoughts. In fact quorum replication is what is done on PostgreSQL's streaming replication for master/slave setups. Data must commit to the master only by default but you can also require that it is also written to n slaves before the commit is returned. — Chris Travers, Feb 23 '13 at 14:21

score 22 · Answer 2 · edited Mar 06 '14 at 15:17

Relational databases can cluster like NoSQL solutions. Maintaining ACID properties may make this more complex and one must be aware of the tradeoffs made to maintain these properties. Unfortunately, exactly what the trade-offs are depends on the workload and of course the decisions made while designing the database software.

For example, a primarily OLTP workload may have additional single query latency even as the cluster's throughput scales nicely. That extra latency could go un-noticed or be a real deal breaker, all depending on the application. In general, clustering will improve throughput and hurt latency, but even that 'rule' is suspect if an application's queries are particularly amenable to parallel processing.

What the company that I work for, Clustrix, offers is a series of homogeneous computation and storage nodes connected by a relatively high speed network. Relational data is hash distributed across the nodes on a per-index basis in chunks we call 'slices'. Each slice will have two or more replicas spread throughout the cluster for durability in the event of node or disk failure. Clients can connect to any node in the cluster to issue queries using the MySQL wire protocol.

It's a bit un-natural to think of the components of an ACID database independently since so much of it dovetails together, but here goes:

Atomicity - Clustrix uses two phase commits to ensure atomicity. UPDATE and DELETE operations will also lock rows through our distributed lock manager because we internally turn those operations into SELECT followed by exact UPDATE/DELETE operations.

Atomicity obviously increases the amount of messaging between the nodes participating in a transaction and increases load on those nodes to process the commit protocol. This is part of the overhead for having a distributed system and would limit scalability if every node participated in every transaction, but nodes only participate in a transaction if they have one of the replicas being written.

Consistency - Foreign keys are implemented as triggers, which are evaluated at commit time. Big range UPDATE and DELETE operations can hurt our performance due to locking, but we fortunately don't see these all that often. It's far more common to see a transaction update/delete a few rows and then commit.

The other part of consistency is maintaining a quorum via the PAXOS consensus protocol which ensures that only clusters with the majority of the known nodes are able to take writes. It's of course possible for a cluster to have quorum but still have data missing (all replicas of a slice offline), which will cause transactions that access one of those slices to fail.
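The majority rule itself is simple to state in code. The sketch below is not Paxos and is not taken from the Clustrix implementation; `has_quorum` is a hypothetical helper that only shows why at most one side of a network partition can keep accepting writes.

```python
def has_quorum(reachable_nodes, known_nodes):
    """True if this partition holds a strict majority of the known nodes."""
    reachable = set(reachable_nodes) & set(known_nodes)
    return len(reachable) > len(set(known_nodes)) // 2


known = {"n1", "n2", "n3", "n4", "n5"}

# A network split leaves three nodes on one side and two on the other.
majority_side = has_quorum({"n1", "n2", "n3"}, known)
minority_side = has_quorum({"n4", "n5"}, known)

# With an even number of nodes, a clean 2/2 split leaves nobody with quorum.
even_split = has_quorum({"n1", "n2"}, {"n1", "n2", "n3", "n4"})
```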

Isolation - Clustrix provides MVCC isolation at the container level. Our atomicity guarantee that all applicable replicas receive a particular write before we report the transaction committed mostly reduces the isolation problem to what you'd have in the non-clustered case.

Durability - Each slice of relational data is stored to two or more nodes to provide resiliency in case of node or disk failure. It's probably also worth noting here that the appliance version of our product has an NVRAM card where the WAL is stored for performance reasons. A lot of single instance databases will improve the performance of their WALs by checkpointing at intervals instead of at each commit. That approach is problematic in a distributed system because it makes 'replay to where?' a complicated question. We sidestep this in the appliance by providing a hard durability guarantee.

Chris Travers · Answer 3 · 2018-09-05T08:11:41.893

The fundamental answer is that the consistency model is different. I am writing this to expand on ConcernedOfTunbridgeWells' answer, which really ought to be the reference point for this.

The basic point of the ACID consistency model is that it makes a bunch of fundamental guarantees as to the state of the data globally within the system. These guarantees are subject to the CAP theorem limitations which mean, basically, that to make them work, you need to have all authoritative sources on the same page before you tell the application you have committed a transaction. Multi-master replication is thus very hard to do without running into these constraints. Certainly once you start doing asynchronous replication in a multi-master environment these guarantees go out the window. The ACID consistency model is a strong consistency model intended for important or critical information.

The BASE consistency model is a weak consistency model intended for non-critical information. Because the guarantees are significantly weaker, the ability to offer such weak guarantees in multi-master systems is more easily attainable because the guarantees are, well, weak.

RDBMS's can and do scale as well as NoSQL solutions though!

However there are cases where RDBMS's can and do scale up to an extent that NoSQL might not even be able to match. They just do so differently. I will look at Postgres-XC as the example of how scaling out is possible without sacrificing strong consistency guarantees.

The way in which these particular RDBMS's do it is to implement something kind of like a sharding solution with a proxy and kind of like a shared disk architecture but significantly more complex than either. These do not scale in the same way as NoSQL solutions and so the tradeoffs are different.

The Postgres-XC model is, I understand, inspired by Teradata. It consists of nodes in two different roles, as storage nodes or coordinators. Coordinators are multi-master (no real replication is involved) and they connect to storage nodes to handle actual data processing. The storage nodes replicate in a master-slave setup. Each storage node contains what is in essence a shard of the database, but the coordinators maintain a consistent picture of the data.

A significant separation of responsibilities is involved here. The storage nodes manage data, check constraints, locally enforceable foreign key constraints, and handle at least some aggregation of data. The coordinators handle those foreign keys that cannot be locally enforced, as well as windowing and other data considerations that may pull from multiple data nodes. In essence coordinators make ACID possible in distributed transactions in a multi-master setup where the user doesn't even know the transactions are distributed.

In this regard, Postgres-XC offers something a bit like the NoSQL scaling options but there is some added complexity due to the additional consistency guarantees. I understand that there are commercial RDBMS's that offer this sort of scalability out there however. Teradata does this. Additionally shared disk systems can scale out in a similar way and both DB2 and Oracle offer such solutions. So it is entirely unfair to say that RDBMS's can't do this. They can. The need however has been so small in the past that economies of scale have been insufficient to make the proprietary solutions very affordable to most players.

Finally a note on VoltDB. Like the NoSQL solutions, I see VoltDB as a very specialized tool. It is very fast but at the expense of multi-round-trip transactions and durability on disk. This means you have a very different set of concerns. VoltDB is what you get when RDBMS pioneers build a NoSQL solution ;-). VoltDB is fast in part because it defines concurrency and durability out of the equation. Durability becomes a network property, not an intra-host property and concurrency is managed by running queries one at a time, internally parallelized, in sequential order. It is not a traditional RDBMS (and that's a good thing btw since it can go places the traditional RDBMS can't, but the converse is also very much true).

Edit: It's also important to consider the implication of joins. In clustered systems, joins become a very different performance issue. If everything is on the same node, they can improve performance but if you have to make a round-trip to a different node this imposes a very high cost. So data models do make differences and the approach of clustering has performance impacts. Approaches like Postgres-XC and Postgres-XL assume that you can spend a fair bit of time thinking things through so you can appropriately shard your data and keep joined data together. But that imposes design overhead. On the other hand, this scales much better than many NoSQL solutions and can be tuned appropriately. For example we (at Adjust) use a NoSQL-like clustering strategy for our 3.5PB of data in PostgreSQL that is basically log analysis. And a lot of our design is deeply inspired by NoSQL clustering strategies. So sometimes the data model does constrain the clustering model as well.

score 6 · Answer 4 · answered Feb 22 '13 at 14:14

My answer won't be as well-written as the previous one, but here goes.

Michael Stonebraker of Ingres fame has created a MPP shared-nothing column-store (Vertica) and a MPP shared-nothing New SQL database (VoltDB) which distributes data between different nodes in a cluster and maintains ACID. Vertica has since been bought by HP.

I believe other New SQL databases maintain ACID as well, although I'm not sure how many of them distribute their rows over a cluster, etc.

Here's a talk Stonebraker gave on New SQL relative to NoSQL and "Old SQL". http://www.youtube.com/watch?v=uhDM4fcI2aI

geoffrobinson

What is this "New SQL" and "Old SQL"? Would you care to clarify? – ypercubeᵀᴹ Feb 22 '13 at 14:42
"Old SQL" would be SQL Server, Oracle, MySQL, PostgreSQL, etc. Here's the definition from Wikipedia for NewSQL which is pretty good: "NewSQL is a class of modern relational database management systems that seek to provide the same scalable performance of NoSQL systems for OLTP workloads while still maintaining the ACID guarantees of a traditional single-node database system." I highly recommend the video I posted if interested in learning more. – geoffrobinson Feb 22 '13 at 14:45
As a note here, and as I explained in my answer, VoltDB handles scalability by defining durability and concurrency out of the equation. In essence with VoltDB, you get no intrasystem durability, and no concurrent access to data. New SQL is like an Indy 500 race car, but Old SQL is still the semi truck or maybe the freight train's engine. – Chris Travers Dec 06 '13 at 06:30

score 0 · Answer 5 · answered Mar 15 '13 at 00:40

MySQL clustering can shard by using multi-master replication and hashing shards across the cluster.

Jeremy Singer
