Before getting to the point, let's first understand what the CAP theorem for distributed systems is.
According to Baidu Encyclopedia, the CAP theorem, also known as the CAP principle, states that a distributed system can have at most two of three properties at the same time: Consistency, Availability, and Partition tolerance. It cannot have all three.
One, Definition of CAP
Consistency:
"All nodes see the same data at the same time." After an update succeeds and returns to the client, all nodes hold identical data at the same moment; this is distributed consistency. Consistency problems are unavoidable in concurrent systems. From the client's side, consistency is about how to obtain the updated data under concurrent access. From the server's side, it is about how to replicate and distribute updates across the whole system so that the data eventually becomes consistent.
Availability:
Availability means "Reads and writes always succeed": the service is always available and responds within a normal time. Good availability means the system serves users well, with no bad experience such as failed operations or access timeouts.
Partition Tolerance:
When a distributed system encounters a node failure or a network partition, it can still provide a service to the outside that satisfies either consistency or availability.
Partition tolerance is what lets an application be distributed while still appearing to work as a single whole. For example, if one or several machines in a distributed system go down, the remaining machines keep working and meet the system's needs, with no impact on the user's experience.
Two, Proof of the CAP theorem
Now let's prove why the three properties cannot all be satisfied at the same time.
Suppose there are two servers: N1 hosts application A and database V, and N2 hosts application B and database V. The network between them is connected, so they form two parts of one distributed system.
When consistency is satisfied, the data on the two servers N1 and N2 is initially the same: both hold DB0. When availability is satisfied, a request to either N1 or N2 gets an immediate response. When partition tolerance is satisfied, if either N1 or N2 goes down, or the network between them becomes unavailable, the other keeps operating normally.
When a user sends a data update request to application A on N1, the database on N1 changes from DB0 to DB1. The distributed system then synchronizes the update, and the database on N2 is also updated from DB0 to DB1. A user who now sends a read request through application B on N2 gets the freshly updated data, DB1.
That is the normal case. But the biggest problem in a distributed system is network transmission. Now suppose an extreme situation: the network between N1 and N2 is disconnected, yet the system must still tolerate this failure, that is, it must satisfy partition tolerance. Can consistency and availability both be satisfied at the same time?
Suppose the network between N1 and N2 suddenly fails, and a user sends a data update request to N1. The data on N1 is updated from DB0 to DB1, but because the network is disconnected, the database on N2 is still DB0.
If a user now sends a read request to N2, the data has not been synchronized, so the application cannot immediately return the latest data, DB1. What can it do? There are two options. First, sacrifice consistency and respond with the old data, DB0. Second, sacrifice availability and block until the network connection is restored and the update has been synchronized, then respond with the latest data, DB1.
This argument is simplified, but it shows that a distributed system that must satisfy partition tolerance can choose only one of consistency and availability. In other words, a distributed system cannot satisfy all three properties at the same time. We therefore have to make a choice when building a system. So which choice is the better strategy?
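The partition scenario in this proof can be sketched as a small simulation. This is only a minimal illustration, not a real replication protocol: the `Node` class, its `mode` switch, and `PartitionError` are hypothetical names, and the CP branch raises an error instead of blocking until the network recovers.

```python
class PartitionError(Exception):
    """Raised by a CP node that cannot confirm its data is current."""


class Node:
    def __init__(self, name, value, mode):
        self.name = name
        self.value = value        # contents of the local database
        self.mode = mode          # "CP" or "AP" (hypothetical switch)
        self.peer = None
        self.link_up = True       # state of the network link to the peer

    def write(self, value):
        self.value = value
        if self.link_up:          # replicate only while the link works
            self.peer.value = value

    def read(self):
        if not self.link_up and self.mode == "CP":
            # Sacrifice availability: refuse rather than risk stale data.
            raise PartitionError(f"{self.name} cannot reach its peer")
        # In AP mode during a partition this value may be stale,
        # so consistency is what gets sacrificed.
        return self.value


def simulate(mode, partitioned):
    n1 = Node("N1", "DB0", mode)
    n2 = Node("N2", "DB0", mode)
    n1.peer, n2.peer = n2, n1
    if partitioned:
        n1.link_up = n2.link_up = False
    n1.write("DB1")               # the update request arrives at N1
    try:
        return n2.read()          # the read request arrives at N2
    except PartitionError:
        return "unavailable"


print(simulate("AP", partitioned=False))  # DB1
print(simulate("AP", partitioned=True))   # DB0
print(simulate("CP", partitioned=True))   # unavailable
```

With the link up, N2 returns DB1. During a partition, N2 can either answer with the stale DB0 or refuse to answer; it has no way to return DB1.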
Three, Trade-off strategies
Since only two of the three CAP properties can be satisfied, there are three strategies to choose from:
CA without P:
If P is not required (partitions are not allowed), then C (strong consistency) and A (availability) can both be guaranteed. But giving up P also means giving up the scalability of the system: it cannot be spread across multiple nodes, which goes against the original purpose of distributed system design.
CP without A:
If A is not required, every request must be kept consistent across servers, and a partition can make the synchronization time unbounded (the service can be accessed normally only after data synchronization completes). When the network fails or messages are lost, the user experience is sacrificed: users may access the system only after all data is consistent again. Many systems are designed as CP. The most commonly cited examples are distributed databases such as Redis and HBase, though the exact guarantees depend on how each is configured and deployed. For these distributed databases, data consistency is the most basic requirement; if they cannot meet even this standard, it would be better to use a relational database directly, with no need to spend resources deploying a distributed database.
AP without C:
To be highly available while tolerating partitions, consistency has to be given up. Once a partition occurs, nodes may lose contact with each other; to stay highly available, each node can only serve requests from its local data, which leads to globally inconsistent data. A typical example is shopping for a certain brand's mobile phone: a few seconds ago the product page showed the item in stock, but when you go to place the order, the system tells you the order failed because the item is sold out. Here the system first guarantees A (availability) so that it keeps serving normally, and makes some sacrifice in data consistency. This affects the user experience somewhat, but it does not seriously block the user's shopping flow.
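The shopping example can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `Inventory` and `ProductPage` are hypothetical classes, and a locally cached stock count stands in for a replica that has not yet received the latest update.

```python
class Inventory:
    """Authoritative stock count (hypothetical)."""

    def __init__(self, stock):
        self.stock = stock

    def reserve(self):
        # Succeeds only while at least one item is left.
        if self.stock <= 0:
            return False
        self.stock -= 1
        return True


class ProductPage:
    """Answers from a local copy of the stock count, which may be stale."""

    def __init__(self, inventory):
        self.inventory = inventory
        self.cached_stock = inventory.stock

    def refresh(self):
        # Synchronize the local copy with the authoritative count.
        self.cached_stock = self.inventory.stock

    def in_stock(self):
        # Always answers immediately from local data (availability first).
        return self.cached_stock > 0

    def place_order(self):
        # The order itself is checked against the authoritative count.
        return self.inventory.reserve()


inventory = Inventory(stock=1)
page = ProductPage(inventory)

inventory.reserve()            # another buyer takes the last item
print(page.in_stock())         # True: the page still shows stock
print(page.place_order())      # False: the order fails, item sold out
page.refresh()
print(page.in_stock())         # False: the local copy has caught up
```

The page never blocks the user, at the cost of briefly showing stock that no longer exists.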
Four, Summary
Today, most large-scale Internet applications run on many hosts in scattered deployments, and clusters keep growing with more and more nodes, so node failures and network failures are the norm. Partition tolerance therefore becomes unavoidable for a distributed system, and the choice can only be made between C and A. Traditional projects may differ. Take a bank's transfer system: data consistency involving money cannot be compromised, so C must be guaranteed. If the network fails, the bank would rather stop service. With C fixed, the remaining trade-off is between A and P.
In short, there is no single best strategy. A good system should be designed according to its business scenario; the strategy that fits is the best one.