Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: boost: mulitshard_combining_reader_as_mutation_source_test: allow redundant range tombstone changes #1

Open
wants to merge 2 commits into
base: fix-evictable-reader
Choose a base branch
from

Conversation

michoecho
Copy link

test_range_tombstones_v2 is too strict for this reader -- it expects a particular sequence of range tombstone changes, but multishard_combining_reader, when tested with a small buffer, generates -- as expected -- additional (redundant) range tombstone change pairs (end+start).

We would rather not modify test_range_tombstones_v2, because catching redundancy is good in general, so to fix this problem, we wrap the tested reader with a filter that removes these redundant range tombstone changes, when they are expected.

kbr-scylla and others added 2 commits June 23, 2023 17:12
…ition

The evictable reader must ensure that each buffer fill makes forward
progress, i.e. the last fragment in the buffer has a position larger
than the last fragment from the previous buffer-fill. Otherwise, the
reader could get stuck in an infinite loop between buffer fills, if the
reader is evicted in-between.

The code guranteeing this forward progress had a bug: the comparison
between the position after the last buffer-fill and the current
last fragment position was done in the wrong direction.

So if the condition that we wanted to achieve was already true, we would
continue filling the buffer until partition end which may lead to OOMs
such as in scylladb#13491.

There was already a fix in this area to handle `partition_start`
fragments correctly - scylladb#13563 - but it missed that the position
comparison was done in the wrong order.

Fix the comparison and adjust one of the tests (added in scylladb#13563) to
detect this case.

Fixes scylladb#13491
…ow redundant range tombstone changes

test_range_tombstones_v2 is too strict for this reader -- it expects a
particular sequence of range tombstone changes, but multishard_combining_reader,
when tested with a small buffer, generates -- as expected -- additional
(redundant) range tombstone change pairs (end+start).

We would rather not modify test_range_tombstones_v2, because catching
redundancy is good in general, so to fix this problem, we wrap the
tested reader with a filter that removes these redundant range tombstone
changes, when they are expected.
@kbr-scylla kbr-scylla force-pushed the fix-evictable-reader branch 2 times, most recently from 853829f to fd0b1a4 Compare June 27, 2023 12:35
kbr-scylla pushed a commit that referenced this pull request Jun 27, 2023
View building from staging creates a reader from scratch (memtable
+ sstables - staging) for every partition, in order to calculate
the diff between new staging data and data in base sstable set,
and then pushes the result into the view replicas.

perf shows that the reader creation is very expensive:
+   12.15%    10.75%  reactor-3        scylla             [.] lexicographical_tri_compare<compound_type<(allow_prefixes)0>::iterator, compound_type<(allow_prefixes)0>::iterator, legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()(managed_bytes_basic_view<(mutable_view)0>, managed_bytes
+   10.01%     9.99%  reactor-3        scylla             [.] boost::icl::is_empty<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    8.95%     8.94%  reactor-3        scylla             [.] legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()
+    7.29%     7.28%  reactor-3        scylla             [.] dht::ring_position_tri_compare
+    6.28%     6.27%  reactor-3        scylla             [.] dht::tri_compare
+    4.11%     3.52%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+    4.09%     4.07%  reactor-3        scylla             [.] sstables::index_consume_entry_context<sstables::index_consumer>::process_state
+    3.46%     0.93%  reactor-3        scylla             [.] sstables::sstable_run::will_introduce_overlapping
+    2.53%     2.53%  reactor-3        libstdc++.so.6     [.] std::_Rb_tree_increment
+    2.45%     2.45%  reactor-3        scylla             [.] boost::icl::non_empty::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.14%     2.13%  reactor-3        scylla             [.] boost::icl::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.07%     2.07%  reactor-3        scylla             [.] logalloc::region_impl::free
+    2.06%     1.91%  reactor-3        scylla             [.] sstables::index_consumer::consume_entry(sstables::parsed_partition_index_entry&&)::{lambda()#1}::operator()() const::{lambda()#1}::operator()
+    2.04%     2.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+    1.87%     0.00%  reactor-3        [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
+    1.86%     0.00%  reactor-3        [kernel.kallsyms]  [k] do_syscall_64
+    1.39%     1.38%  reactor-3        libc.so.6          [.] __memcmp_avx2_movbe
+    1.37%     0.92%  reactor-3        scylla             [.] boost::icl::segmental::join_left<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::
+    1.34%     1.33%  reactor-3        scylla             [.] logalloc::region_impl::alloc_small
+    1.33%     1.33%  reactor-3        scylla             [.] seastar::memory::small_pool::add_more_objects
+    1.30%     0.35%  reactor-3        scylla             [.] seastar::reactor::do_run
+    1.29%     1.29%  reactor-3        scylla             [.] seastar::memory::allocate
+    1.19%     0.05%  reactor-3        libc.so.6          [.] syscall
+    1.16%     1.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst
+    1.07%     0.79%  reactor-3        scylla             [.] sstables::partitioned_sstable_set::insert

That shows some significant amount of work for inserting sstables
into the interval map and maintaining the sstable run (which sorts
fragments by first key and checks for overlapping).

The interval map is known for having issues with L0 sstables, as
it will have to be replicated almost to every single interval
stored by the map, causing terrible space and time complexity.
With enough L0 sstables, it can fall into quadratic behavior.

This overhead is fixed by not building a new fresh sstable set
when recreating the reader, but rather supplying a predicate
to sstable set that will filter out staging sstables when
creating either a single-key or range scan reader.

This could have another benefit over today's approach which
may incorrectly consider a staging sstable as non-staging, if
the staging sst wasn't included in the current batch for view
building.

With this improvement, view building was measured to be 3x faster.

from
INFO  2023-06-16 12:36:40,014 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 963957ms = 50kB/s

to
INFO  2023-06-16 14:47:12,129 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 319899ms = 150kB/s

Refs scylladb#14089.
Fixes scylladb#14244.

Signed-off-by: Raphael S. Carvalho <[email protected]>
kbr-scylla pushed a commit that referenced this pull request Aug 29, 2023
All rpc::client objects have to be stopped before they are
destroyed. Currently this is done in
messaging_service::shutdown(). The cql_test_env does not call
shutdown() currently. This can lead to use-after-free on the
rpc::client object, manifesting like this:

Segmentation fault on shard 0.
Backtrace:
column_mapping::~column_mapping() at schema.cc:?
db::cql_table_large_data_handler::internal_record_large_cells(sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long) const at ./db/large_data_handler.cc:180
operator() at ./db/large_data_handler.cc:123
 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long>(std::__invoke_other, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*&&, column_definition const&, unsigned long&&, unsigned long&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long>, seastar::future<void> >::type std::__invoke_r<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long>(db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*&&, column_definition const&, unsigned long&&, unsigned long&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:114
 (inlined by) std::_Function_handler<seastar::future<void> (sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long), db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1>::_M_invoke(std::_Any_data const&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*&&, column_definition const&, unsigned long&&, unsigned long&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:290
std::function<seastar::future<void> (sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long)>::operator()(sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long) const at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:591
 (inlined by) db::cql_table_large_data_handler::record_large_cells(sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long) const at ./db/large_data_handler.cc:175
seastar::rpc::log_exception(seastar::rpc::connection&, seastar::log_level, char const*, std::__exception_ptr::exception_ptr) at ./build/release/seastar/./seastar/src/rpc/rpc.cc:109
operator() at ./build/release/seastar/./seastar/src/rpc/rpc.cc:788
operator() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:1682
 (inlined by) void seastar::futurize<seastar::future<void> >::satisfy_with_result_of<seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}::operator()(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&) const::{lambda()#1}>(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}::operator()(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&) const::{lambda()#1}&&) at ./build/release/seastar/./seastar/include/seastar/core/future.hh:2134
 (inlined by) operator() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:1681
 (inlined by) seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:781
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./seastar/src/core/reactor.cc:2319
 (inlined by) seastar::reactor::run_some_tasks() at ./build/release/seastar/./seastar/src/core/reactor.cc:2756
seastar::reactor::do_run() at ./build/release/seastar/./seastar/src/core/reactor.cc:2925
seastar::reactor::run() at ./build/release/seastar/./seastar/src/core/reactor.cc:2808
seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:265
seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:156
operator() at ./build/release/seastar/./seastar/src/testing/test_runner.cc:75
 (inlined by) void std::__invoke_impl<void, seastar::testing::test_runner::start_thread(int, char**)::$_0&>(std::__invoke_other, seastar::testing::test_runner::start_thread(int, char**)::$_0&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<void, seastar::testing::test_runner::start_thread(int, char**)::$_0&>, void>::type std::__invoke_r<void, seastar::testing::test_runner::start_thread(int, char**)::$_0&>(seastar::testing::test_runner::start_thread(int, char**)::$_0&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:111
 (inlined by) std::_Function_handler<void (), seastar::testing::test_runner::start_thread(int, char**)::$_0>::_M_invoke(std::_Any_data const&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:290
std::function<void ()>::operator()() const at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:591
 (inlined by) seastar::posix_thread::start_routine(void*) at ./build/release/seastar/./seastar/src/core/posix.cc:73

Fix by making sure that shutdown() is called prior to destruction.

Fixes scylladb#12244

Closes scylladb#12276
kbr-scylla pushed a commit that referenced this pull request Aug 29, 2023
View building from staging creates a reader from scratch (memtable
+ sstables - staging) for every partition, in order to calculate
the diff between new staging data and data in base sstable set,
and then pushes the result into the view replicas.

perf shows that the reader creation is very expensive:
+   12.15%    10.75%  reactor-3        scylla             [.] lexicographical_tri_compare<compound_type<(allow_prefixes)0>::iterator, compound_type<(allow_prefixes)0>::iterator, legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()(managed_bytes_basic_view<(mutable_view)0>, managed_bytes
+   10.01%     9.99%  reactor-3        scylla             [.] boost::icl::is_empty<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    8.95%     8.94%  reactor-3        scylla             [.] legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()
+    7.29%     7.28%  reactor-3        scylla             [.] dht::ring_position_tri_compare
+    6.28%     6.27%  reactor-3        scylla             [.] dht::tri_compare
+    4.11%     3.52%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+    4.09%     4.07%  reactor-3        scylla             [.] sstables::index_consume_entry_context<sstables::index_consumer>::process_state
+    3.46%     0.93%  reactor-3        scylla             [.] sstables::sstable_run::will_introduce_overlapping
+    2.53%     2.53%  reactor-3        libstdc++.so.6     [.] std::_Rb_tree_increment
+    2.45%     2.45%  reactor-3        scylla             [.] boost::icl::non_empty::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.14%     2.13%  reactor-3        scylla             [.] boost::icl::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.07%     2.07%  reactor-3        scylla             [.] logalloc::region_impl::free
+    2.06%     1.91%  reactor-3        scylla             [.] sstables::index_consumer::consume_entry(sstables::parsed_partition_index_entry&&)::{lambda()#1}::operator()() const::{lambda()#1}::operator()
+    2.04%     2.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+    1.87%     0.00%  reactor-3        [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
+    1.86%     0.00%  reactor-3        [kernel.kallsyms]  [k] do_syscall_64
+    1.39%     1.38%  reactor-3        libc.so.6          [.] __memcmp_avx2_movbe
+    1.37%     0.92%  reactor-3        scylla             [.] boost::icl::segmental::join_left<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::
+    1.34%     1.33%  reactor-3        scylla             [.] logalloc::region_impl::alloc_small
+    1.33%     1.33%  reactor-3        scylla             [.] seastar::memory::small_pool::add_more_objects
+    1.30%     0.35%  reactor-3        scylla             [.] seastar::reactor::do_run
+    1.29%     1.29%  reactor-3        scylla             [.] seastar::memory::allocate
+    1.19%     0.05%  reactor-3        libc.so.6          [.] syscall
+    1.16%     1.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst
+    1.07%     0.79%  reactor-3        scylla             [.] sstables::partitioned_sstable_set::insert

That shows some significant amount of work for inserting sstables
into the interval map and maintaining the sstable run (which sorts
fragments by first key and checks for overlapping).

The interval map is known for having issues with L0 sstables, as
it will have to be replicated almost to every single interval
stored by the map, causing terrible space and time complexity.
With enough L0 sstables, it can fall into quadratic behavior.

This overhead is fixed by not building a new fresh sstable set
when recreating the reader, but rather supplying a predicate
to sstable set that will filter out staging sstables when
creating either a single-key or range scan reader.

This could have another benefit over today's approach which
may incorrectly consider a staging sstable as non-staging, if
the staging sst wasn't included in the current batch for view
building.

With this improvement, view building was measured to be 3x faster.

from
INFO  2023-06-16 12:36:40,014 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 963957ms = 50kB/s

to
INFO  2023-06-16 14:47:12,129 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 319899ms = 150kB/s

Refs scylladb#14089.
Fixes scylladb#14244.

Signed-off-by: Raphael S. Carvalho <[email protected]>

Closes scylladb#14476
kbr-scylla pushed a commit that referenced this pull request Aug 29, 2023
View building from staging creates a reader from scratch (memtable
+ sstables - staging) for every partition, in order to calculate
the diff between new staging data and data in base sstable set,
and then pushes the result into the view replicas.

perf shows that the reader creation is very expensive:
+   12.15%    10.75%  reactor-3        scylla             [.] lexicographical_tri_compare<compound_type<(allow_prefixes)0>::iterator, compound_type<(allow_prefixes)0>::iterator, legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()(managed_bytes_basic_view<(mutable_view)0>, managed_bytes
+   10.01%     9.99%  reactor-3        scylla             [.] boost::icl::is_empty<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    8.95%     8.94%  reactor-3        scylla             [.] legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()
+    7.29%     7.28%  reactor-3        scylla             [.] dht::ring_position_tri_compare
+    6.28%     6.27%  reactor-3        scylla             [.] dht::tri_compare
+    4.11%     3.52%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+    4.09%     4.07%  reactor-3        scylla             [.] sstables::index_consume_entry_context<sstables::index_consumer>::process_state
+    3.46%     0.93%  reactor-3        scylla             [.] sstables::sstable_run::will_introduce_overlapping
+    2.53%     2.53%  reactor-3        libstdc++.so.6     [.] std::_Rb_tree_increment
+    2.45%     2.45%  reactor-3        scylla             [.] boost::icl::non_empty::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.14%     2.13%  reactor-3        scylla             [.] boost::icl::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.07%     2.07%  reactor-3        scylla             [.] logalloc::region_impl::free
+    2.06%     1.91%  reactor-3        scylla             [.] sstables::index_consumer::consume_entry(sstables::parsed_partition_index_entry&&)::{lambda()#1}::operator()() const::{lambda()#1}::operator()
+    2.04%     2.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst+    1.87%     0.00%  reactor-3        [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
+    1.86%     0.00%  reactor-3        [kernel.kallsyms]  [k] do_syscall_64
+    1.39%     1.38%  reactor-3        libc.so.6          [.] __memcmp_avx2_movbe
+    1.37%     0.92%  reactor-3        scylla             [.] boost::icl::segmental::join_left<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::
+    1.34%     1.33%  reactor-3        scylla             [.] logalloc::region_impl::alloc_small
+    1.33%     1.33%  reactor-3        scylla             [.] seastar::memory::small_pool::add_more_objects
+    1.30%     0.35%  reactor-3        scylla             [.] seastar::reactor::do_run
+    1.29%     1.29%  reactor-3        scylla             [.] seastar::memory::allocate
+    1.19%     0.05%  reactor-3        libc.so.6          [.] syscall
+    1.16%     1.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst
+    1.07%     0.79%  reactor-3        scylla             [.] sstables::partitioned_sstable_set::insert

That shows some significant amount of work for inserting sstables
into the interval map and maintaining the sstable run (which sorts
fragments by first key and checks for overlapping).

The interval map is known for having issues with L0 sstables, as
it will have to be replicated almost to every single interval
stored by the map, causing terrible space and time complexity.
With enough L0 sstables, it can fall into quadratic behavior.

This overhead is fixed by not building a new fresh sstable set
when recreating the reader, but rather supplying a predicate
to sstable set that will filter out staging sstables when
creating either a single-key or range scan reader.

This could have another benefit over today's approach which
may incorrectly consider a staging sstable as non-staging, if
the staging sst wasn't included in the current batch for view
building.

With this improvement, view building was measured to be 3x faster.

from
INFO  2023-06-16 12:36:40,014 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 963957ms = 50kB/s

to
INFO  2023-06-16 14:47:12,129 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 319899ms = 150kB/s

Refs scylladb#14089.
Fixes scylladb#14244.

Signed-off-by: Raphael S. Carvalho <[email protected]>
(cherry picked from commit 1d8cb32)
Signed-off-by: Raphael S. Carvalho <[email protected]>

Closes scylladb#14764
kbr-scylla pushed a commit that referenced this pull request Sep 7, 2023
Add a document describing in detail how to use clang's "-ftime-trace"
option, and the ClangBuildAnalyzer tool, to find the source files,
header files and templates which slow down Scylla's build the most.

I've used this tool in the past to reduce Scylla build time - see
commits:

   fa7a302 (reduced 6.5%)
   f840943 (reduced 0.1%)
   6ebf32f (reduced 1%)
   d01e1a7 (reduced 4%)

I'm hoping that documenting how to use this tool will allow other
developers to suggest similar commits.

Refs #1.

Signed-off-by: Nadav Har'El <[email protected]>

Closes scylladb#15209
kbr-scylla pushed a commit that referenced this pull request Sep 15, 2023
…r stage

Example scenario:

  1. coordinator A sends RPC #1 to trigger streaming
  2. coordinator fails over to B
  3. coordinator B performs streaming successfully
  4. RPC #1 arrives and starts streaming
  5. coordinator B commits the transition to the post-streaming stage
  6. coordinator B executes global token metadata barrier

We end up with streaming running despite the fact that the current
coordinator moved on. Currently, this won't happen, because streaming
holds on to erm. But we want to change that (see scylladb#14995), so that it
does not block barriers for migrations of other tablets. The same
problem applies to tablet cleanup.

The fix is to use tablet_metadata_guard around such long running
operations, which will keep hold to erm so that in the above scenario
coordinator B will wait for it in step 6. The guard ensures that erm
doesn't block other migrations because it switches to the latest erm
if it's compatible. If it's not, it signals abort_source for the guard
so that such stale operation aborts soon and the barrier in step 6
doesn't wait for long.
kbr-scylla pushed a commit that referenced this pull request Mar 7, 2024
…ports' from Yaron Kaikov

This PR includes 3 commits:

- **[actions] Add a check for backport labels**: As part of the Automation of ScyllaDB backports project, each PR should get either a `backport/none` or `backport/X.Y` label. Based on this label we will automatically open a backport PR for the relevant OSS release.

In this commit, I am adding a GitHub action to verify if such a label was added. This only applies to PR with a based branch of `master` or `next`. For releases, we don't need this check

- **Add Mergify (https://mergify.com/) configuration file**: In this PR we introduce the `.mergify.yml` configuration file, which
include a set of rules that we will use for automating our backport
process.

For each supported OSS release (currently 5.2 and 5.4) we have an almost
identical configuration section which includes the four conditions before
we open a backport pr:
* PR should be closed
* PR should have the proper label. for example: backport/5.4 (we can
  have multiple labels)
* Base branch should be `master`
* PR should be set with a `promoted` label - this condition will be set
  automatically once the commits are promoted to the `master` branch (passed
gating)

Once all conditions are applied, the verify bot will open a backport PR and
will assign it to the author of the original PR, then CI will start
running, and only after it pass. we merge

- **[action] Add promoted label when commits are in master**: In Scylla, we don't merge our PR but use ./script/pull_github_pr.sh` to close the pull request, adding `closes scylladb/scylladb <PR number>` remark and push changes to `next` branch.

One of the conditions for opening a backport PR is that all relevant commits are in `master` (passed gating), in this GitHub action, we will go through the list of commits once a push was made to `master` and will identify the relevant PR, and add `promoted` label to it. This will allow Mergify to start the process of backporting

Closes scylladb#17365

* github.com:scylladb/scylladb:
  [action] Add promoted label when commits are in master
  Add mergify (https://mergify.com/) configuration file
  [actions] Add a check for backport labels
kbr-scylla pushed a commit that referenced this pull request May 9, 2024
Perevent stalls from "unpacking" of large
canonical mutations seen with test_add_many_nodes_under_load
when called from `group0_state_machine::transfer_snapshot`:

```
  ++[1#1/44 14%] addr=0x395b2f total=569 count=6 avg=95: ?? ??:0
  | ++[2#1/2 56%] addr=0x3991e3 total=321 count=4 avg=80: ?? ??:0
  | ++          - addr=0x1587159:
  | |             std::__new_allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/new_allocator.h:147
  | |             (inlined by) std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/allocator.h:198
  | |             (inlined by) std::allocator_traits<std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/alloc_traits.h:482
  | |             (inlined by) std::_Vector_base<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::_M_allocate at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/stl_vector.h:378
  | |             (inlined by) std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >::reserve at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/vector.tcc:79
  | |             (inlined by) ser::idl::serializers::internal::vector_serializer<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > > >::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer_impl.hh:226
  | |             (inlined by) ser::deserialize<std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264
  | |             (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:31
  | ++          - addr=0x1587085:
  | |             seastar::with_serialized_stream<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>, ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> >(seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator>&)::{lambda(auto:1&)#1}, void, void> at ././seastar/include/seastar/core/simple-stream.hh:646
  | |             (inlined by) ser::serializer<clustering_key_prefix>::read<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ./build/dev/gen/idl/keys.dist.impl.hh:28
  | |             (inlined by) ser::deserialize<clustering_key_prefix, seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> > at ././serializer.hh:264
  | |             (inlined by) ser::deletable_row_view::key() const::{lambda(auto:1&)#1}::operator()<seastar::fragmented_memory_input_stream<bytes_ostream::fragment_iterator> const> at ./build/dev/gen/idl/mutation.dist.impl.hh:1268
  | | ++[3#1/1 100%] addr=0x15865a3 total=577 count=7 avg=82:
  | | |              seastar::memory_input_stream<bytes_ostream::fragment_iterator>::with_stream<ser::deletable_row_view::key() const::{lambda(auto:1&)#1}> at ././seastar/include/seastar/core/simple-stream.hh:491
  | | |              (inlined by) seastar::with_serialized_stream<seastar::memory_input_stream<bytes_ostream::fragment_iterator> const, ser::deletable_row_view::key() const::{lambda(auto:1&)#1}, void> at ././seastar/include/seastar/core/simple-stream.hh:639
  | | |              (inlined by) ser::deletable_row_view::key at ./build/dev/gen/idl/mutation.dist.impl.hh:1264
  | |   ++[4#1/1 100%] addr=0x157cf27 total=643 count=8 avg=80:
  | |   |              mutation_partition_view::do_accept<partition_builder> at ./mutation/mutation_partition_view.cc:212
  | |   ++           - addr=0x1516cac:
  | |   |              mutation_partition::apply at ./mutation/mutation_partition.cc:497
  | |     ++[5#1/1 100%] addr=0x14e4433 total=1765 count=22 avg=80:
  | |     |              canonical_mutation::to_mutation at ./mutation/canonical_mutation.cc:60
  | |       ++[6#1/2 98%] addr=0x2452a60 total=1732 count=21 avg=82:
  | |       |             service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:761
  | |       ++          - addr=0x2858782:
  | |       |             service::group0_state_machine::transfer_snapshot at ./service/raft/group0_state_machine.cc:303
```

Signed-off-by: Benny Halevy <[email protected]>
kbr-scylla pushed a commit that referenced this pull request May 9, 2024
Freezing large mutations synchronously may cause
reactor stalls, as seen in the test_add_many_nodes_under_load
dtest:
```
  ++[1#1/37 5%] addr=0x15b0bf total=99 count=2 avg=50: ?? ??:0
  | ++[2#1/2 67%] addr=0x15a331f total=66 count=1 avg=66:
  | |             bytes_ostream::write at ././bytes_ostream.hh:248
  | |             (inlined by) bytes_ostream::write at ././bytes_ostream.hh:263
  | |             (inlined by) ser::serialize_integral<unsigned int, bytes_ostream> at ././serializer.hh:203
  | |             (inlined by) ser::integral_serializer<unsigned int>::write<bytes_ostream> at ././serializer.hh:217
  | |             (inlined by) ser::serialize<unsigned int, bytes_ostream> at ././serializer.hh:254
  | |             (inlined by) ser::writer_of_column<bytes_ostream>::write_id at ./build/dev/gen/idl/mutation.dist.impl.hh:4680
  | | ++[3#1/1 100%] addr=0x159df71 total=132 count=2 avg=66:
  | | |              (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}::operator() at ./mutation/mutation_partition_serializer.cc:99
  | | |              (inlined by) row::maybe_invoke_with_hash<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1} const, cell_and_hash const> at ./mutation/mutation_partition.hh:133
  | | |              (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}::operator() at ./mutation/mutation_partition.hh:152
  | | |              (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}, true>::operator() at ././utils/compact-radix-tree.hh:1888
  | | |              (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit_slot<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}, true>&> at ././utils/compact-radix-tree.hh:1560
  | | ++           - addr=0x159d84d:
  | | |              compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}, true>&> at ././utils/compact-radix-tree.hh:1364
  | | |              (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}, true>&, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> > at ././utils/compact-radix-tree.hh:799
  | | |              (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::node_base<cell_and_hash, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)1, 4u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)2, 8u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::indirect_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)3, 16u>, compact_radix_tree::tree<cell_and_hash, unsigned int>::direct_layout<cell_and_hash, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)6, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)0, 0u, (compact_radix_tree::tree<cell_and_hash, unsigned int>::layout)4, 32u> >::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}, true>&> at ././utils/compact-radix-tree.hh:807
  | |   ++[4#1/1 100%] addr=0x1596f4a total=329 count=5 avg=66:
  | |   |              compact_radix_tree::tree<cell_and_hash, unsigned int>::node_head::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}, true> > at ././utils/compact-radix-tree.hh:473
  | |   |              (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::visit<compact_radix_tree::tree<cell_and_hash, unsigned int>::walking_visitor<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}, true> > at ././utils/compact-radix-tree.hh:1626
  | |   |              (inlined by) compact_radix_tree::tree<cell_and_hash, unsigned int>::walk<row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}>(ser::deletable_row__cells<bytes_ostream>&&) const::{lambda(unsigned int, cell_and_hash const&)#1}> at ././utils/compact-radix-tree.hh:1909
  | |   |              (inlined by) row::for_each_cell<(anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> >(ser::deletable_row__cells<bytes_ostream>&&, row const&, schema const&, column_kind)::{lambda(unsigned int, atomic_cell_or_collection const&)#1}> at ./mutation/mutation_partition.hh:151
  | |   |              (inlined by) (anonymous namespace)::write_row_cells<ser::deletable_row__cells<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:97
  | |   |              (inlined by) write_row<ser::writer_of_deletable_row<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:168
  | |     ++[5#1/2 80%] addr=0x15a310c total=263 count=4 avg=66:
  | |     |             mutation_partition_serializer::write_serialized<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/mutation_partition_serializer.cc:180
  | |     | ++[6#1/2 62%] addr=0x14eb60a total=428 count=7 avg=61:
  | |     | |             frozen_mutation::frozen_mutation(mutation const&)::$_0::operator()<ser::writer_of_mutation_partition<bytes_ostream> > at ./mutation/frozen_mutation.cc:85
  | |     | |             (inlined by) ser::after_mutation__key<bytes_ostream>::partition<frozen_mutation::frozen_mutation(mutation const&)::$_0> at ./build/dev/gen/idl/mutation.dist.impl.hh:7058
  | |     | |             (inlined by) frozen_mutation::frozen_mutation at ./mutation/frozen_mutation.cc:84
  | |     | | ++[7#1/1 100%] addr=0x14ed388 total=532 count=9 avg=59:
  | |     | | |              freeze at ./mutation/frozen_mutation.cc:143
  | |     | |   ++[8#1/2 74%] addr=0x252cf55 total=394 count=6 avg=66:
  | |     | |   |             service::storage_service::merge_topology_snapshot at ./service/storage_service.cc:763
```

This change uses freeze_gently to freeze
the cdc_generations_v2 mutations one at a time
to prevent the stalls reported above.

Signed-off-by: Benny Halevy <[email protected]>
kbr-scylla pushed a commit that referenced this pull request May 9, 2024
Prevent stalls coming from applying large
mutations in memory synchronously,
like the ones seen with the test_add_many_nodes_under_load
dtest:
```
  | | |   ++[5#2/2 44%] addr=0x1498efb total=256 count=3 avg=85:
  | | |   |             replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()#1}::operator() at ./replica/memtable.cc:804
  | | |   |             (inlined by) logalloc::allocating_section::with_reclaiming_disabled<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()#1}&> at ././utils/logalloc.hh:500
  | | |   |             (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()#1}&&)::{lambda()#1}::operator() at ././utils/logalloc.hh:527
  | | |   |             (inlined by) logalloc::allocating_section::with_reserve<logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()#1}>(logalloc::region&, replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()#1}&&)::{lambda()#1}> at ././utils/logalloc.hh:471
  | | |   |             (inlined by) logalloc::allocating_section::operator()<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator()() const::{lambda()#1}> at ././utils/logalloc.hh:526
  | | |   |             (inlined by) replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0::operator() at ./replica/memtable.cc:800
  | | |   |             (inlined by) with_allocator<replica::memtable::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const> const&, db::rp_handle&&)::$_0> at ././utils/allocation_strategy.hh:318
  | | |   |             (inlined by) replica::memtable::apply at ./replica/memtable.cc:799
  | | |     ++[6#1/1 100%] addr=0x145047b total=1731 count=21 avg=82:
  | | |     |              replica::table::do_apply<frozen_mutation const&, seastar::lw_shared_ptr<schema const>&> at ./replica/table.cc:2896
  | | |       ++[7#1/1 100%] addr=0x13ddccb total=2852 count=32 avg=89:
  | | |       |              replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator() at ./replica/table.cc:2924
  | | |       |              (inlined by) seastar::futurize<void>::invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&> at ././seastar/include/seastar/core/future.hh:2032
  | | |       |              (inlined by) seastar::futurize_invoke<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&> at ././seastar/include/seastar/core/future.hh:2066
  | | |       |              (inlined by) replica::dirty_memory_manager_logalloc::region_group::run_when_memory_available<replica::table::apply(frozen_mutation const&, seastar::lw_shared_ptr<schema const>, db::rp_handle&&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0> at ./replica/dirty_memory_manager.hh:572
  | | |       |              (inlined by) replica::table::apply at ./replica/table.cc:2923
  | | |       ++           - addr=0x1330ba1:
  | | |       |              replica::database::apply_in_memory at ./replica/database.cc:1812
  | | |       ++           - addr=0x1360054:
  | | |       |              replica::database::do_apply at ./replica/database.cc:2032
```

This change has virtually no effect on small mutations
(up to 128KB in size).
build/release/scylla perf-simple-query --write --default-log-level=error --random-seed=1 -c 1
Before:
median 80092.06 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   53291 insns/op,        0 errors)
After:
median 78780.86 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   53311 insns/op,        0 errors)

To estimate the performance ramifications on
large mutations, I measured perf-simple-query --write
calling unfreeze_gently in all cases:
median 77411.26 tps ( 71.3 allocs/op,   8.0 logallocs/op,  14.3 tasks/op,   53280 insns/op,        0 errors)

Showing the allocations that moved out of logalloc
(in memtable::apply of frozen_mutation) into seastar
allocations (in unfreeze_gently) and <1% cpu overhead.

Signed-off-by: Benny Halevy <[email protected]>
kbr-scylla pushed a commit that referenced this pull request Jun 11, 2024
Seen the following unexplained assertion failure with
pytest -s -v --scylla-version=local_tarball --tablets repair_additional_test.py::TestRepairAdditional::test_repair_option_pr_multi_dc
```
INFO  2024-05-27 11:18:05,081 [shard 0:main] init - Shutting down repair service
INFO  2024-05-27 11:18:05,081 [shard 0:main] task_manager - Stopping module repair
INFO  2024-05-27 11:18:05,081 [shard 0:main] task_manager - Unregistered module repair
INFO  2024-05-27 11:18:05,081 [shard 1:main] task_manager - Stopping module repair
INFO  2024-05-27 11:18:05,081 [shard 1:main] task_manager - Unregistered module repair
scylla: repair/row_level.cc:3230: repair_service::~repair_service(): Assertion `_stopped' failed.
Aborting on shard 0.
Backtrace:
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3f040c
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x41c7a1
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x3dbaf
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x8e883
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x3dafd
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x2687e
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x2679a
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x36186
  0x26f2428
  0x10fb373
  0x10fc8b8
  0x10fc809
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x456c6d
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x456bcf
  0x10fc65b
  0x10fc5bc
  0x10808d0
  0x1080800
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3ff22f
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x4003b7
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3ff888
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x36dea8
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x36d0e2
  0x101cefa
  0x105a390
  0x101bde7
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x27b89
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x27c4a
  0x101a764
```

Decoded:
```
~repair_service at ./repair/row_level.cc:3230
~shared_ptr_count_for at ././seastar/include/seastar/core/shared_ptr.hh:491
 (inlined by) ~shared_ptr_count_for at ././seastar/include/seastar/core/shared_ptr.hh:491
~shared_ptr at ././seastar/include/seastar/core/shared_ptr.hh:569
 (inlined by) seastar::shared_ptr<repair_service>::operator=(seastar::shared_ptr<repair_service>&&) at ././seastar/include/seastar/core/shared_ptr.hh:582
 (inlined by) seastar::shared_ptr<repair_service>::operator=(decltype(nullptr)) at ././seastar/include/seastar/core/shared_ptr.hh:588
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:727
 (inlined by) seastar::future<void> seastar::futurize<seastar::future<void> >::invoke<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&>(seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&) at ././seastar/include/seastar/core/future.hh:2035
 (inlined by) seastar::futurize<std::invoke_result<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>::type>::type seastar::smp::submit_to<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::smp_submit_to_options, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) at ././seastar/include/seastar/core/smp.hh:367
seastar::futurize<std::invoke_result<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>::type>::type seastar::smp::submit_to<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) at ././seastar/include/seastar/core/smp.hh:394
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:725
 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>(std::__invoke_other, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>, seastar::future<void> >::type std::__invoke_r<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>(seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:114
 (inlined by) std::_Function_handler<seastar::future<void> (unsigned int), seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}>::_M_invoke(std::_Any_data const&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:290
```

FWIW, gdb crashed when opening the coredump.

This commit will help catch the issue earlier
when repair_service::stop() fails (and it must never fail)

Signed-off-by: Benny Halevy <[email protected]>
kbr-scylla pushed a commit that referenced this pull request Jun 11, 2024
Enriches the output of `scylla fiber` with resolved names of coroutine resume functions.

Before:

```
[shard  2] #0  (task*) 0x0000602004c9fbf0 0x0000000000642880 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16
[shard  2] #1  (task*) 0x0000602000344c90 0x0000000000642880 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16
[shard  2] scylladb#2  (task*) 0x0000602004b30c50 0x0000000000642880 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16
```

After:

```
[shard  2] #0  (task*) 0x0000602004c9fbf0 0x0000000000642880 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (.resume is seastar::future<void> sstables::parse<unsigned int, std::pair<sstables::metadata_type, unsigned int> >(schema const&, sstables::sstable_version_types, sstables::random_access_reader&, sstables::disk_array<unsigned int, std::pair<sstables::metadata_type, unsigned int> >&) [clone .resume] )
[shard  2] #1  (task*) 0x0000602000344c90 0x0000000000642880 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (.resume is sstables::parse(schema const&, sstables::sstable_version_types, sstables::random_access_reader&, sstables::statistics&) [clone .resume] )
[shard  2] scylladb#2  (task*) 0x0000602004b30c50 0x0000000000642880 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (.resume is sstables::sstable::read_simple<(sstables::component_type)8, sstables::statistics>(sstables::statistics&)::{lambda(sstables::sstable_version_types, seastar::file&&, unsigned long)#1}::operator()(sstables::sstable_version_types, seastar::file&&, unsigned long) const [clone .resume] )
```

Closes scylladb#19091
kbr-scylla pushed a commit that referenced this pull request Jun 27, 2024
For convenience.

Note that this line info only points to the function as a whole, not to the
current suspend point.
I think there's no facility for converting the `__coro_index` to the current suspend point automatically.

Before:

```
(gdb) scylla fiber seastar::local_engine->_current_task
[shard  1] #0  (task*) 0x0000601008e8e970 0x000000000047aae0 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (.resume is seastar::future<void> sstables::parse<unsigned int, std::pair<sstables::metadata_type, unsigned int> >(schema const&, sstables::sstable_version_types, sstables::random_access_reader&, sstables::disk_array<unsigned int, std::pair<sstables::metadata_type, unsigned int> >&) [clone .resume] )
[shard  1] #1  (task*) 0x00006010092acf10 0x000000000047aae0 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (.resume is sstables::parse(schema const&, sstables::sstable_version_types, sstables::random_access_reader&, sstables::statistics&) [clone .resume] )
[shard  1] scylladb#2  (task*) 0x0000601008e648d0 0x000000000047aae0 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (.resume is sstables::sstable::read_simple<(sstables::component_type)8, sstables::statistics>(sstables::statistics&)::{lambda(sstables::sstable_version_types, seastar::file&&, unsigned long)#1}::operator()(sstables::sstable_version_types, seastar::file&&, unsigned long) const [clone .resume] )
```

After:

```
(gdb) scylla fiber seastar::local_engine->_current_task
[shard  1] #0  (task*) 0x0000601008e8e970 0x000000000047aae0 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (sstables::parse<unsigned int, std::pair<sstables::metadata_type, unsigned int> >(schema const&, sstables::sstable_version_types, sstables::random_access_reader&, sstables::disk_array<unsigned int, std::pair<sstables::metadata_type, unsigned int> >&) at sstables/sstables.cc:352)
[shard  1] #1  (task*) 0x00006010092acf10 0x000000000047aae0 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (sstables::parse(schema const&, sstables::sstable_version_types, sstables::random_access_reader&, sstables::statistics&) at sstables/sstables.cc:570)
[shard  1] scylladb#2  (task*) 0x0000601008e648d0 0x000000000047aae0 vtable for seastar::internal::coroutine_traits_base<void>::promise_type + 16  (sstables::sstable::read_simple<(sstables::component_type)8, sstables::statistics>(sstables::statistics&)::{lambda(sstables::sstable_version_types, seastar::file&&, unsigned long)#1}::operator()(sstables::sstable_version_types, seastar::file&&, unsigned long) const at sstables/sstables.cc:992)

```

Closes scylladb#19478
kbr-scylla pushed a commit that referenced this pull request Aug 28, 2024
Large json return values are streamed to avoid memory pressure
and stalls, but are destroyed all at once. This in itself can cause
stalls [1].

Destroy them gently to avoid the stalls.

[1]

++[0#1/1 100%] addr=0x46880df total=514498 count=7004 avg=73:
|              seastar::backtrace<seastar::backtrace_buffer::append_backtrace_oneline()::{lambda(seastar::frame)#1}> at ./build/release/seastar.lto/./seastar/include/seastar/util/backtrace.hh:64
++           - addr=0x4680b35:
|              seastar::backtrace_buffer::append_backtrace_oneline at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:839
|              (inlined by) seastar::print_with_backtrace at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:858
++           - addr=0x46800f7:
|              seastar::internal::cpu_stall_detector::generate_trace at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:1469
++           - addr=0x4680178:
|              seastar::internal::cpu_stall_detector::maybe_report at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:1206
|              (inlined by) seastar::internal::cpu_stall_detector::on_signal at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:1226
++           - addr=0x3dbaf: ?? ??:0
  ++[1#1/812 13%] addr=0x217b774 total=69336 count=990 avg=70:
  |               rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>::~GenericValue at /usr/include/rapidjson/document.h:721
  | ++[2#1/3 85%] addr=0x217b7db total=58974 count=842 avg=70:
  | |             rapidjson::GenericMember<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>::~GenericMember at /usr/include/rapidjson/document.h:71
  | |             (inlined by) rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>::~GenericValue at /usr/include/rapidjson/document.h:733
  | | ++[3#1/4 45%] addr=0x217b7db total=902102 count=12903 avg=70:
  | | |             rapidjson::GenericMember<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>::~GenericMember at /usr/include/rapidjson/document.h:71
  | | |             (inlined by) rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>::~GenericValue at /usr/include/rapidjson/document.h:733
  | | -> continued at addr=0x217b7db above
  | | |+[3#2/4 40%] addr=0x217b8b3 total=794219 count=11363 avg=70:
  | | |             rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>::~GenericValue at /usr/include/rapidjson/document.h:726
  | | | ++[4#1/1 100%] addr=0x217b7db total=909571 count=13012 avg=70:
  | | | |              rapidjson::GenericMember<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>::~GenericMember at /usr/include/rapidjson/document.h:71
  | | | |              (inlined by) rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>::~GenericValue at /usr/include/rapidjson/document.h:733
  | | | -> continued at addr=0x217b7db above
  | | |+[3#3/4 15%] addr=0x43d35a3 total=296768 count=4246 avg=70:
  | | |             seastar::shared_ptr_count_for<rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator> >::~shared_ptr_count_for at ././seastar/include/seastar/core/shared_ptr.hh:492
  | | |             (inlined by) seastar::shared_ptr_count_for<rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator> >::~shared_ptr_count_for at ././seastar/include/seastar/core/shared_ptr.hh:492
  | | | ++[4#1/2 98%] addr=0x43e7d06 total=289680 count=4144 avg=70:
  | | | |             seastar::shared_ptr<rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator> >::~shared_ptr at ././seastar/include/seastar/core/shared_ptr.hh:570
  | | | |             (inlined by) alternator::make_streamed(rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>&&)::$_0::operator() at ./alternator/executor.cc:127
  | | | ++          - addr=0x184e0a6:
  | | | |             std::__n4861::coroutine_handle<seastar::internal::coroutine_traits_base<void>::promise_type>::resume at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/coroutine:240
  | | | |             (inlined by) seastar::internal::coroutine_traits_base<void>::promise_type::run_and_dispose at ./build/release/seastar.lto/./seastar/include/seastar/core/coroutine.hh:125
  | | | |             (inlined by) seastar::reactor::run_tasks at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:2651
  | | | |             (inlined by) seastar::reactor::run_some_tasks at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:3114
  | | | | ++[5#1/1 100%] addr=0x2503b87 total=310677 count=4417 avg=70:
  | | | | |              seastar::reactor::do_run at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:3283
  | | | |   ++[6#1/2 78%] addr=0x46a2898 total=400571 count=5450 avg=73:
  | | | |   |             seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0::operator() at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:4501
  | | | |   |             (inlined by) std::__invoke_impl<void, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&> at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
  | | | |   |             (inlined by) std::__invoke_r<void, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&> at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:111
  | | | |   |             (inlined by) std::_Function_handler<void (), seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0>::_M_invoke at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:290
  | | | |   ++          - addr=0x4673fda:
  | | | |   |             std::function<void ()>::operator() at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:591
  | | | |   |             (inlined by) seastar::posix_thread::start_routine at ./build/release/seastar.lto/./seastar/src/core/posix.cc:90
  | | | |   ++          - addr=0x8c946: ?? ??:0
  | | | |   ++          - addr=0x11296f: ?? ??:0
  | | | |   ++[6#2/2 22%] addr=0x2502c1e total=113613 count=1549 avg=73:
  | | | |   |             seastar::reactor::run at ./build/release/seastar.lto/./seastar/src/core/reactor.cc:3166
  | | | |   ++          - addr=0x22068e0:
  | | | |   |             seastar::app_template::run_deprecated at ./build/release/seastar.lto/./seastar/src/core/app-template.cc:276
  | | | |   ++          - addr=0x220630b:
  | | | |   |             seastar::app_template::run at ./build/release/seastar.lto/./seastar/src/core/app-template.cc:167
  | | | |   ++          - addr=0x22334bc:
  | | | |   |             scylla_main at ./main.cc:672
  | | | |   ++          - addr=0x20411cc:
  | | | |   |             std::function<int (int, char**)>::operator() at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:591
  | | | |   |             (inlined by) main at ./main.cc:2072
  | | | |   ++          - addr=0x27b89: ?? ??:0
  | | | |   ++          - addr=0x27c4a: ?? ??:0
  | | | |   ++          - addr=0x28c8fb4:
  | | | |   |             _start at ??:?

Closes scylladb#19968
kbr-scylla pushed a commit that referenced this pull request Sep 3, 2024
By turning server::shutdown() into a coroutine, we need not dynamically
allocate "nr_conn".

Verified as follows:

(1) In terminal #1:

    build/Dev/scylla --overprovisioned --developer-mode=yes \
        --memory=2G --smp=1 --default-log-level error \
        --logger-log-level cql_server=debug:cql_server_controller=debug

> INFO  [...] cql_server_controller - Starting listening for CQL clients
>                                     on 127.0.0.1:9042 (unencrypted,
>                                     non-shard-aware)
> INFO  [...] cql_server_controller - Starting listening for CQL clients
>                                     on 127.0.0.1:19042 (unencrypted,
>                                     shard-aware)

(2) In terminals scylladb#2 and scylladb#3:

    tools/cqlsh/bin/cqlsh.py

(3) Press ^C in terminal #1:

> DEBUG [...] cql_server - abort accept nr_total=2
> DEBUG [...] cql_server - abort accept 1 out of 2 done
> DEBUG [...] cql_server - abort accept 2 out of 2 done
> DEBUG [...] cql_server - shutdown connection nr_total=4
> DEBUG [...] cql_server - shutdown connection 1 out of 4 done
> DEBUG [...] cql_server - shutdown connection 2 out of 4 done
> DEBUG [...] cql_server - shutdown connection 3 out of 4 done
> DEBUG [...] cql_server - shutdown connection 4 out of 4 done
> INFO  [...] cql_server_controller - CQL server stopped

This patch is best viewed with "git show --word-diff=color".

Suggested-by: Benny Halevy <[email protected]>
Signed-off-by: Laszlo Ersek <[email protected]>
kbr-scylla pushed a commit that referenced this pull request Sep 12, 2024
Seen the following unexplained assertion failure with
pytest -s -v --scylla-version=local_tarball --tablets repair_additional_test.py::TestRepairAdditional::test_repair_option_pr_multi_dc
```
INFO  2024-05-27 11:18:05,081 [shard 0:main] init - Shutting down repair service
INFO  2024-05-27 11:18:05,081 [shard 0:main] task_manager - Stopping module repair
INFO  2024-05-27 11:18:05,081 [shard 0:main] task_manager - Unregistered module repair
INFO  2024-05-27 11:18:05,081 [shard 1:main] task_manager - Stopping module repair
INFO  2024-05-27 11:18:05,081 [shard 1:main] task_manager - Unregistered module repair
scylla: repair/row_level.cc:3230: repair_service::~repair_service(): Assertion `_stopped' failed.
Aborting on shard 0.
Backtrace:
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3f040c
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x41c7a1
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x3dbaf
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x8e883
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x3dafd
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x2687e
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x2679a
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x36186
  0x26f2428
  0x10fb373
  0x10fc8b8
  0x10fc809
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x456c6d
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x456bcf
  0x10fc65b
  0x10fc5bc
  0x10808d0
  0x1080800
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3ff22f
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x4003b7
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x3ff888
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x36dea8
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libseastar.so+0x36d0e2
  0x101cefa
  0x105a390
  0x101bde7
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x27b89
  /home/bhalevy/.ccm/scylla-repository/local_tarball/libreloc/libc.so.6+0x27c4a
  0x101a764
```

Decoded:
```
~repair_service at ./repair/row_level.cc:3230
~shared_ptr_count_for at ././seastar/include/seastar/core/shared_ptr.hh:491
 (inlined by) ~shared_ptr_count_for at ././seastar/include/seastar/core/shared_ptr.hh:491
~shared_ptr at ././seastar/include/seastar/core/shared_ptr.hh:569
 (inlined by) seastar::shared_ptr<repair_service>::operator=(seastar::shared_ptr<repair_service>&&) at ././seastar/include/seastar/core/shared_ptr.hh:582
 (inlined by) seastar::shared_ptr<repair_service>::operator=(decltype(nullptr)) at ././seastar/include/seastar/core/shared_ptr.hh:588
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:727
 (inlined by) seastar::future<void> seastar::futurize<seastar::future<void> >::invoke<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&>(seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&) at ././seastar/include/seastar/core/future.hh:2035
 (inlined by) seastar::futurize<std::invoke_result<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>::type>::type seastar::smp::submit_to<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::smp_submit_to_options, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) at ././seastar/include/seastar/core/smp.hh:367
seastar::futurize<std::invoke_result<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>::type>::type seastar::smp::submit_to<seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) at ././seastar/include/seastar/core/smp.hh:394
 (inlined by) operator() at ././seastar/include/seastar/core/sharded.hh:725
 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>(std::__invoke_other, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>, seastar::future<void> >::type std::__invoke_r<seastar::future<void>, seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int>(seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:114
 (inlined by) std::_Function_handler<seastar::future<void> (unsigned int), seastar::sharded<repair_service>::stop()::{lambda(seastar::future<void>)#1}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}>::_M_invoke(std::_Any_data const&, unsigned int&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:290
```

FWIW, gdb crashed when opening the coredump.

This commit will help catch the issue earlier
when repair_service::stop() fails (and it must never fail)

Signed-off-by: Benny Halevy <[email protected]>
(cherry picked from commit 3884575)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants