Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid expiration and eviction during data syncing #1185

Open
wants to merge 5 commits into
base: unstable
Choose a base branch
from

Conversation

lyq2333
Copy link

@lyq2333 lyq2333 commented Oct 17, 2024

When we sync data from the source Valkey to the destination Valkey using some sync tools like redis-shake, the destination Valkey think it's a master and can perform expiration and eviction, which may cause data corruption. This problem has been discussed in redis/redis#9760 (reply in thread) and Redis already have a solution. But in Valkey we haven't fixed it by now.

i.e. we call set key 1 ex 1 on the source server and transfer this command to the destination server. Then we call incr key on the source server before the key expired, we will have a key on the source server and its value is 2. But when the command arrived at the destination server, the key may be expired and has deleted. So we will have a key on the destination server and its value is 1, which is inconsistent with the source server.

In standalone mode, we can use writable replica to simplify the sync process. However, in cluster mode, we still need a sync tool to help us transfer the source data to the destination. The sync tool usually work as a normal client and the destination works as a master which keep expiration and eviction.

In this PR, we add a new mode for master named 'pseudo-replica'. In this mode, server stop expiration and eviction just like a replica. Notice that this mode exists only in sync state to avoid data inconsistency caused by expiration and eviction. The server in pseudo-replica mode can't turn to a real replica by replicaof or cluster replicate and vice versa.

Any better idea?

Copy link

codecov bot commented Oct 17, 2024

Codecov Report

Attention: Patch coverage is 65.71429% with 12 lines in your changes missing coverage. Please review.

Project coverage is 70.68%. Comparing base (a62d1f1) to head (aed8834).

Files with missing lines Patch % Lines
src/config.c 42.85% 4 Missing ⚠️
src/networking.c 66.66% 4 Missing ⚠️
src/cluster_legacy.c 33.33% 2 Missing ⚠️
src/replication.c 33.33% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1185      +/-   ##
============================================
+ Coverage     70.65%   70.68%   +0.02%     
============================================
  Files           114      114              
  Lines         61799    63102    +1303     
============================================
+ Hits          43664    44602     +938     
- Misses        18135    18500     +365     
Files with missing lines Coverage Δ
src/commands.def 100.00% <ø> (ø)
src/db.c 88.79% <100.00%> (+0.28%) ⬆️
src/evict.c 97.75% <100.00%> (+0.02%) ⬆️
src/expire.c 96.57% <100.00%> (+0.14%) ⬆️
src/server.c 88.78% <100.00%> (+0.14%) ⬆️
src/server.h 100.00% <ø> (ø)
src/cluster_legacy.c 86.19% <33.33%> (-0.13%) ⬇️
src/replication.c 87.36% <33.33%> (-0.05%) ⬇️
src/config.c 78.68% <42.85%> (-0.02%) ⬇️
src/networking.c 88.49% <66.66%> (-0.01%) ⬇️

... and 83 files with indirect coverage changes

@soloestoy
Copy link
Member

good work! @valkey-io/core-team please take a look

@lyq2333 lyq2333 requested a review from a team October 17, 2024 11:34
Copy link
Contributor

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it. I think it's a little bit strange, but with some better documentation it's not too strange.

We need to have good support for migration tools, especially for users who want to migrate from proprietary software to open source software. 😄

Redis already have a solution.

I remember this discussion from the Redis times. Do you know what Redis solution is (the same REPLCONF PSEUDO-MASTER?) or is it secret?

Any better idea?

Have you considered the idea to let RediShake act as a primary and let the target database replicate from RediShake? It can act as a replication proxy?

+-----------+   PSYNC  +-----------+   PSYNC  +--------+
| Source DB |<---------| RediShake |<---------| Valkey |
+-----------+          +-----------+          +--------+

Comment on lines 1276 to 1280
* - pseudo-master <0|1>
* Set this connection behaving like a master if server.pseudo_replica is true.
* Sync tools can set their connections into 'pseudo-master' state to visit expired keys.
* */
void replconfCommand(client *c) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sync tool sends REPLCONF PSEUDO-MASTER to say "I'm a pseudo-master, so ignore expire for this connection"?

Almost all other REPLCONF commands are sent by the replica to the primary before doing the PSYNC. This one is different. Can we add a more explicit comment about this difference, similar to REPLCONF GETACK which is also different:

 * - getack <dummy>
 * Unlike other subcommands, this is used by primary to get the replication
 * offset from a replica.

Copy link
Author

@lyq2333 lyq2333 Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sync tool sends REPLCONF PSEUDO-MASTER to say "I'm a pseudo-master, so ignore expire for this connection"?

Yes. It does look a bit strange. I add a new command CLIENT IMPORT-SOURCE <on|off> like what @soloestoy say. Maybe it'll look better this way?

valkey.conf Outdated
Comment on lines 804 to 810
# Make the master behave like a replica, which forbids expiration and evcition.
# This is useful for sync tools, because expiration and evcition may cause the data corruption.
# Sync tools can set their connections into 'pseudo-master' state by REPLCONF PSEUDO-MASTER to
# behave like a master(i.e. visit expired keys).
#
# pseudo-replica no

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "pseudo-master" is a little bit confusing. :) I understand how it works now, but only after I read the test cases. Let's try to improve this documentation later.

In Valkey we are no longer not using "master", so new commands and configs should use "primary".

Maybe we can use master if Redis has exactly the same REPLCONF or config, but otherwise let's use primary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, the new mode server is master node, but it is named as pseudo-replica, it is confused by most people. Let us first give it a better name first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import-mode?

Should this node forbid writes from all other clients? It makes it behave even more like a replica.

Btw, I'm thinking now that if we want a better implementation of slot migration, maybe it can use the same or a similar feature. The slot replication is also similar to replication but initiated from the source node. @enjoy-binbin how is the implementation you want to upstream?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because clients may sync data to a working server, I think this is not very friendly to the 24/7 businesses if we simply forbid writes from all other clients. But if we allow writes from other clients, we may face the same dual-write problem as writable replica, I'll document it in the valkey.conf.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the slot replication is also similar the replication in the implementation, it is something like slot RDB + slot replication propagate + slot failover, something like that

@hwware hwware added the major-decision-pending Major decision pending by TSC team label Oct 17, 2024
@soloestoy
Copy link
Member

+-----------+   PSYNC  +-----------+   PSYNC  +--------+
| Source DB |<---------| RediShake |<---------| Valkey |
+-----------+          +-----------+          +--------+

interesting, and we have also considered such a method, but there are several issues.

First, as a cloud provider, we do not allow an instance to become a replica for external instances. This is a very risky operation, especially since the primary connection uses a super user, which has excessive permissions. Additionally, the replica needs to establish an outbound connection, this is also not allowed. These restrictions, I believe, are not unique to cloud providers, many users' security control policies also prohibit such actions.

Another point is that, in a cluster mode, the source and target instances for migration typically have different slot distributions, and redisShake can help with correctly routing the data.

Copy link
Member

@soloestoy soloestoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, I think we should only allow write commands for pseudo-master client when server is in pseudo-replica mode.

src/evict.c Outdated
@@ -546,8 +546,8 @@ int performEvictions(void) {
goto update_metrics;
}

if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION) {
result = EVICT_FAIL; /* We need to free memory, but policy forbids. */
if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION || server.pseudo_replica) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better to place it in isSafeToPerformEvictions together with the server.primary_host check.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If place it in isSafeToPerformEvictions , the server will ignore maxmemory. I think it's better to return an OOM error to stop the syncing process.

src/replication.c Outdated Show resolved Hide resolved
@lyq2333
Copy link
Author

lyq2333 commented Oct 18, 2024

Redis already have a solution.

I remember this discussion from the Redis times. Do you know what Redis solution is (the same REPLCONF PSEUDO-MASTER?) or is it secret?

It's secret. I used to push this PR to Redis Community(#13077) and they say they will PR their implementation. But no news after that.

Any better idea?

Have you considered the idea to let RediShake act as a primary and let the target database replicate from RediShake? It can act as a replication proxy?

+-----------+   PSYNC  +-----------+   PSYNC  +--------+
| Source DB |<---------| RediShake |<---------| Valkey |
+-----------+          +-----------+          +--------+

What @soloestoy said is a major limitation for us. BTW, sometimes the destination already has some data, PSYNC will delete them.

@enjoy-binbin
Copy link
Member

i did not read it carefully, internally we simply pause the expiration on both side i guess.
Redis has an issue that seems to be discussing this: redis/redis#13478

@lyq2333
Copy link
Author

lyq2333 commented Oct 18, 2024

i did not read it carefully, internally we simply pause the expiration on both side i guess. Redis has an issue that seems to be discussing this: redis/redis#13478

Interesting. I think this PR can solve that problem if the server has no other read. If the server has other read, one possible solution is to disable all expiration in lookupKey for all clients. But I don't want this PR to affect read/write from normal client.

src/networking.c Outdated
Comment on lines 3570 to 3572
"PSEUDO-PRIMARY (ON|OFF)",
" Set this connection behaving like a primary if server.import_mode is true.",
" Sync tools can set their connections into 'pseudo-primary' state to visit expired keys.",
Copy link
Contributor

@zuiderkwast zuiderkwast Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"PSEUDO-PRIMARY (ON|OFF)",
" Set this connection behaving like a primary if server.import_mode is true.",
" Sync tools can set their connections into 'pseudo-primary' state to visit expired keys.",
"IMPORT-SOURCE (ON|OFF)",
" Mark this connection as an import source if server.import_mode is true.",
" Sync tools can set their connections into 'import-source' state to visit",
" expired keys.",

/* Pseudo Replica */
int pseudo_replica; /* If true, server is a pseudo replica. */
/* Import Mode */
int import_mode; /* If true, server is in import mode and forbid expiration and evcition. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int import_mode; /* If true, server is in import mode and forbid expiration and evcition. */
int import_mode; /* If true, server is in import mode and forbid expiration and eviction. */

@@ -801,6 +801,13 @@ replica-priority 100
#
# replica-ignore-disk-write-errors no

# Make the primary forbid expiration and evcition.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Make the primary forbid expiration and evcition.
# Make the primary forbid expiration and eviction.

@@ -801,6 +801,13 @@ replica-priority 100
#
# replica-ignore-disk-write-errors no

# Make the primary forbid expiration and evcition.
# This is useful for sync tools, because expiration and evcition may cause the data corruption.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# This is useful for sync tools, because expiration and evcition may cause the data corruption.
# This is useful for sync tools, because expiration and eviction may cause the data corruption.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major-decision-pending Major decision pending by TSC team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants