Skip to content

Commit

Permalink
Ruben's Q2 update
Browse files Browse the repository at this point in the history
  • Loading branch information
anarthal authored and louistatta committed Jul 10, 2024
1 parent dbb2648 commit 5359a3c
Showing 1 changed file with 114 additions and 0 deletions.
114 changes: 114 additions & 0 deletions _posts/2024-07-09-RubenQ2.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
---
layout: post
nav-class: dark
categories: q2_update
title: "Ruben's Q2 2024 Update"
author-id: ruben
author-name: Rubén Pérez Hidalgo
---

## On C++20 modules and Boost

This quarter started with exciting discussions about the possibility to introduce C++20 modules in Boost. I've dedicated a lot of time to study and reduce Boost.MySQL build times, so I promptly volunteered to conduct some investigation on the benefits and costs of modules.

I've written two articles (available [here](https://anarthal.github.io/cppblog/modules) and [here](https://anarthal.github.io/cppblog/modules2)) about this topic. They can be roughly summed up as:

- Module clean builds aren't as fast as I'd expect, but partial re-builds are much faster.
- Modules are still poorly supported by compilers and tooling, although work is being performed on them.
- Making a library consumable as a module requires non-trivial development, testing and maintenance effort.
- At the point of writing, mixing standard includes and imports causes problems (although the standard mandates the opposite). As a consequence, if we're to support consuming a certain Boost library as a module, all its dependencies must support it, too.
- At the point of writing, most of the Boost community considers the effort is too big, and prefers to wait until modules are more stable.
- Distributing modules so they can be consumed by CMake is highly non-trivial and one of the biggest blockers.

I'm happy for this discussion to have taken place. We've all learnt a lot, and we now know enough to make informed decisions.

## using std::cpp 2024

I've had the honor of getting a talk on Boost.Asio's universal async model accepted for [using std::cpp 2024](https://eventos.uc3m.es/105614/section/47656/using-std-cpp-2024.html). You can find the slides and code samples [here](https://github.com/anarthal/usingstdcpp-2024).

The best part of it has been getting in touch with the community. I've met a lot of real Asio users and been able to discuss their pain-points in person. I'm happy to see most people working with C++17 and C++20 rather than old standards. Many people manifested interest in C++20 coroutines, but are not willing to roll their own awaitable types - they just wanted to call `co_await` and go. I was surprised to learn that they didn't know that `asio::deferred` already could do this for them, so I think I got the talk topic right.

I think we need to lead by example, and I'd like to re-write [servertech-chat](https://github.com/anarthal/servertech-chat) using C++20 coroutines - it's a great example for newcomers to learn these.

I've also been answering many of the questions coming up in #boost-asio in Slack.

## Client-side SQL formatting enhancements

Boost.MySQL client-side SQL allows composing queries client-side without incurring in injection vulnerabilities. It works great for simple queries, but it was too verbose for cases involving ranges. Consider batch lookup:

```cpp
// Retrieve all users matching the IDs provided in ids
asio::awaitable<std::vector<user>> lookup_users(any_connection& conn, std::span<const std::int64_t> ids)
{
// Compose the query
mysql::format_context ctx(conn.format_opts().value());
ctx.append_raw("SELECT * FROM user WHERE id IN (");
bool is_first = true;
for (auto id: ids)
{
// Comma separator
if (!is_first) ctx.append_raw(", ");
is_first = false;

// Actual id
mysql::format_sql_to(ctx, "{}", id);
}
ctx.append_raw(")");
std::string query = std::move(ctx).get().value();

// Run it
mysql::static_results<user> res;
co_await conn.async_execute(query, res);
co_return {res.rows().begin(), res.rows().end()};
}
```
That's verbose and easy to get wrong. And the price of an error here can be a vulnerability.
To solve this, we've added built-in support for ranges:
```cpp
asio::awaitable<std::vector<user>> lookup_users(any_connection& conn, std::span<const std::int64_t> ids)
{
// Compose the query. May generate "SELECT * FROM user WHERE id IN (10, 21, 202)"
auto query = mysql::format_sql(conn.format_opts().value(), "SELECT * FROM user WHERE id IN ({})", ids);
// Run it
mysql::static_results<user> res;
co_await conn.async_execute(query, res);
co_return {res.rows().begin(), res.rows().end()};
}
```

Much better, isn't it? And if you need additional functionality, `mysql::sequence` allows to pass custom glue strings and per-element formatting functions. Most cases, including batch inserts, can be expressed in terms of a single format string.

Our next step here is implementing an easy-to-use execution request that colaesces composing the query and executing it in a single step. This came up during the review, and it's finally going to be a reality.

## Pipeline mode

MySQL client/server protocol is strictly half duplex. The client sends a request, and the server responds. Performing some measuring, some use cases involving lightweight requests are dominated by round-trip time. In these cases, coalescing individual requests into a single message helps performance.

Use cases fitting this description include connection setup code and preparing/closing statements in batch.

The `connection_pool` class has been using this feature internally for a release, and we now expose it for the general public:

```cpp
// Sets up a connection for re-use. connection_pool cleans up connections in a similar way
asio::awaitable<void> setup_connection(any_connection& conn)
{
// Build a pipeline describing what to do.
mysql::pipeline_request req;
req.add_reset_connection() // wipe session state
.add_set_character_set(mysql::utf8mb4_charset) // SET NAMES utf8mb4
.add_execute("SET time_zone = '+00:00'"); // Use UTC as time zone
std::vector<mysql::stage_response> res;

// Execute the pipeline
co_await conn.async_run_pipeline(req, res, asio::deferred);
}
```
We can write this because we fully control the library's serialization and networking, rather than wrapping other MySQL clients.
## Other Boost.MySQL work
I've also worked on lesser (but necessary) tasks on Boost.MySQL, including enabling buffer size limits for `any_connection`, adding support for C++20 time types to our `date` and `datetime` types, as well as some bug fixing and refactoring.

0 comments on commit 5359a3c

Please sign in to comment.