
multishard reader: use buffer hint to reduce cross-shard round trips and opportunities for shard reader eviction-recreation cycles #21113

Open
denesb opened this issue Oct 15, 2024 · 0 comments · May be fixed by #20815
Assignees
Labels
area/repair area/repair-based operations repare based node operations backport/none Backport is not required


denesb commented Oct 15, 2024

In each internal iteration, repair tries to fill a 32MiB buffer, reading from its reader until the buffer is full. Readers have a default buffer size of 8KiB, so repair needs (32 * 1024 * 1024) / (8 * 1024) = 4096 reader buffer-fill iterations to fill the repair buffer. Normally this is not a problem, but when repair runs in a mixed-shard cluster, it can mean 4K cross-shard round trips and, consequently, 4K opportunities for the shard reader to be evicted and then recreated on the next fill-buffer call.
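The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are made up for illustration, and it assumes every fill returns exactly one full reader buffer.

```python
# Sizes quoted in the issue text; the names are illustrative only.
REPAIR_BUFFER_BYTES = 32 * 1024 * 1024  # 32MiB repair buffer
READER_BUFFER_BYTES = 8 * 1024          # 8KiB default reader buffer


def fill_iterations(target_bytes: int, reader_buffer_bytes: int) -> int:
    """Number of reader fills needed to reach target_bytes (ceiling division)."""
    return -(-target_bytes // reader_buffer_bytes)


iterations = fill_iterations(REPAIR_BUFFER_BYTES, READER_BUFFER_BYTES)
print(iterations)  # 4096
```

In a mixed-shard cluster each of those iterations may cost one cross-shard round trip, which is where the 4K figure comes from.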

This was identified as one of the main culprits behind the slowness of mixed-shard repairs. To help with this, a buffer hint is added to the multishard reader. This is an internal hint, which the multishard reader passes to the shard reader to say exactly how much data it needs from the respective shard. The hint eliminates extraneous cross-shard round trips and the shard reader evict-recreate cycles they make possible. Building on this, repair sets its own row buffer size as the max buffer size on the multishard reader, so that the row buffer is filled with the minimum number of cross-shard round trips and reader recreations.
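The effect of the hint can be illustrated with a toy model. This is not ScyllaDB code: `ToyShardReader`, `read_until_full`, and the `hint_bytes` parameter are hypothetical names, and the model assumes a single shard that holds all the data and returns exactly the number of bytes requested.

```python
DEFAULT_BUFFER_BYTES = 8 * 1024  # default reader buffer size from the issue


class ToyShardReader:
    """Hypothetical stand-in for a shard reader; counts cross-shard round trips."""

    def __init__(self, available_bytes: int) -> None:
        self.available_bytes = available_bytes
        self.round_trips = 0

    def fill_buffer(self, hint_bytes=None) -> int:
        # Each call models one cross-shard round trip, and therefore one
        # opportunity for the reader to be evicted before the next call.
        self.round_trips += 1
        want = DEFAULT_BUFFER_BYTES if hint_bytes is None else hint_bytes
        produced = min(want, self.available_bytes)
        self.available_bytes -= produced
        return produced


def read_until_full(reader: ToyShardReader, target_bytes: int, use_hint: bool) -> int:
    """Keep filling until target_bytes are collected or the reader runs dry."""
    filled = 0
    while filled < target_bytes:
        remaining = target_bytes - filled
        got = reader.fill_buffer(remaining if use_hint else None)
        if got == 0:
            break
        filled += got
    return filled


target = 32 * 1024 * 1024  # repair row buffer size from the issue

no_hint = ToyShardReader(available_bytes=target)
read_until_full(no_hint, target, use_hint=False)

with_hint = ToyShardReader(available_bytes=target)
read_until_full(with_hint, target, use_hint=True)

print(no_hint.round_trips, with_hint.round_trips)  # 4096 1
```

In a real cluster the data is spread over several shards, so the count with the hint would be higher than one, but it should scale with the number of shards involved rather than with the number of 8KiB buffers.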

@denesb denesb added area/repair area/repair-based operations repare based node operations labels Oct 15, 2024
@denesb denesb self-assigned this Oct 15, 2024
@github-actions github-actions bot added the backport/none Backport is not required label Oct 17, 2024