Ruben's Q2 update

cppalliance · Jul 10, 2024 · 5359a3c · 5359a3c
1 parent dbb2648
commit 5359a3c
Showing 1 changed file with 114 additions and 0 deletions.
diff --git a/_posts/2024-07-09-RubenQ2.md b/_posts/2024-07-09-RubenQ2.md
@@ -0,0 +1,114 @@
+---
+layout: post
+nav-class: dark
+categories: q2_update
+title: "Ruben's Q2 2024 Update"
+author-id: ruben
+author-name: Rubén Pérez Hidalgo
+---
+
+## On C++20 modules and Boost
+
+This quarter started with exciting discussions about the possibility to introduce C++20 modules in Boost. I've dedicated a lot of time to study and reduce Boost.MySQL build times, so I promptly volunteered to conduct some investigation on the benefits and costs of modules.
+
+I've written two articles (available [here](https://anarthal.github.io/cppblog/modules) and [here](https://anarthal.github.io/cppblog/modules2)) about this topic. They can be roughly summed up as:
+
+- Module clean builds aren't as fast as I'd expect, but partial re-builds are much faster.
+- Modules are still poorly supported by compilers and tooling, although work is being performed on them.
+- Making a library consumable as a module requires non-trivial development, testing and maintenance effort.
+- At the point of writing, mixing standard includes and imports causes problems (although the standard mandates the opposite). As a consequence, if we're to support consuming a certain Boost library as a module, all its dependencies must support it, too.
+- At the point of writing, most of the Boost community considers the effort is too big, and prefers to wait until modules are more stable.
+- Distributing modules so they can be consumed by CMake is highly non-trivial and one of the biggest blockers.
+
+I'm happy for this discussion to have taken place. We've all learnt a lot, and we now know enough to make informed decisions.
+
+## using std::cpp 2024
+
+I've had the honor of getting a talk on Boost.Asio's universal async model accepted for [using std::cpp 2024](https://eventos.uc3m.es/105614/section/47656/using-std-cpp-2024.html). You can find the slides and code samples [here](https://github.com/anarthal/usingstdcpp-2024).
+
+The best part of it has been getting in touch with the community. I've met a lot of real Asio users and been able to discuss their pain-points in person. I'm happy to see most people working with C++17 and C++20 rather than old standards. Many people manifested interest in C++20 coroutines, but are not willing to roll their own awaitable types - they just wanted to call `co_await` and go. I was surprised to learn that they didn't know that `asio::deferred` already could do this for them, so I think I got the talk topic right.
+
+I think we need to lead by example, and I'd like to re-write [servertech-chat](https://github.com/anarthal/servertech-chat) using C++20 coroutines - it's a great example for newcomers to learn these.
+
+I've also been answering many of the questions coming up in #boost-asio in Slack.
+
+## Client-side SQL formatting enhancements
+
+Boost.MySQL client-side SQL allows composing queries client-side without incurring in injection vulnerabilities. It works great for simple queries, but it was too verbose for cases involving ranges. Consider batch lookup:
+
+```cpp
+// Retrieve all users matching the IDs provided in ids
+asio::awaitable<std::vector<user>> lookup_users(any_connection& conn, std::span<const std::int64_t> ids)
+{
+    // Compose the query
+    mysql::format_context ctx(conn.format_opts().value());
+    ctx.append_raw("SELECT * FROM user WHERE id IN (");
+    bool is_first = true;
+    for (auto id: ids)
+    {
+        // Comma separator
+        if (!is_first) ctx.append_raw(", ");
+        is_first = false;
+
+        // Actual id
+        mysql::format_sql_to(ctx, "{}", id);
+    }
+    ctx.append_raw(")");
+    std::string query = std::move(ctx).get().value();
+
+    // Run it
+    mysql::static_results<user> res;
+    co_await conn.async_execute(query, res);
+    co_return {res.rows().begin(), res.rows().end()};
+}
+```
+
+That's verbose and easy to get wrong. And the price of an error here can be a vulnerability.
+To solve this, we've added built-in support for ranges:
+
+```cpp
+asio::awaitable<std::vector<user>> lookup_users(any_connection& conn, std::span<const std::int64_t> ids)
+{
+    // Compose the query. May generate "SELECT * FROM user WHERE id IN (10, 21, 202)"
+    auto query = mysql::format_sql(conn.format_opts().value(), "SELECT * FROM user WHERE id IN ({})", ids);
+
+    // Run it
+    mysql::static_results<user> res;
+    co_await conn.async_execute(query, res);
+    co_return {res.rows().begin(), res.rows().end()};
+}
+```
+
+Much better, isn't it? And if you need additional functionality, `mysql::sequence` allows to pass custom glue strings and per-element formatting functions. Most cases, including batch inserts, can be expressed in terms of a single format string.
+
+Our next step here is implementing an easy-to-use execution request that colaesces composing the query and executing it in a single step. This came up during the review, and it's finally going to be a reality.
+
+## Pipeline mode
+
+MySQL client/server protocol is strictly half duplex. The client sends a request, and the server responds. Performing some measuring, some use cases involving lightweight requests are dominated by round-trip time. In these cases, coalescing individual requests into a single message helps performance.
+
+Use cases fitting this description include connection setup code and preparing/closing statements in batch.
+
+The `connection_pool` class has been using this feature internally for a release, and we now expose it for the general public:
+
+```cpp
+// Sets up a connection for re-use. connection_pool cleans up connections in a similar way
+asio::awaitable<void> setup_connection(any_connection& conn)
+{
+    // Build a pipeline describing what to do.
+    mysql::pipeline_request req;
+    req.add_reset_connection() // wipe session state
+        .add_set_character_set(mysql::utf8mb4_charset) // SET NAMES utf8mb4
+        .add_execute("SET time_zone = '+00:00'"); // Use UTC as time zone
+    std::vector<mysql::stage_response> res;
+
+    // Execute the pipeline
+    co_await conn.async_run_pipeline(req, res, asio::deferred);
+}
+```
+
+We can write this because we fully control the library's serialization and networking, rather than wrapping other MySQL clients.
+
+## Other Boost.MySQL work
+
+I've also worked on lesser (but necessary) tasks on Boost.MySQL, including enabling buffer size limits for `any_connection`, adding support for C++20 time types to our `date` and `datetime` types, as well as some bug fixing and refactoring.