Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SST/IST request preparation missing on initial sync #627

Open
Hecke29 opened this issue Oct 28, 2022 · 2 comments
Open

SST/IST request preparation missing on initial sync #627

Hecke29 opened this issue Oct 28, 2022 · 2 comments

Comments

@Hecke29
Copy link

Hecke29 commented Oct 28, 2022

Using docker-image mariadb:10.9 but also mariadb:10.7.
The IST Receiver opens up but WSREP: Prepared SST request: rsync|10.1.0.96:4568/rsync_sst or similar never happens.
Basically the joiner never requests a transfer from any other node. So consequently the transfer never starts.

It's a docker-setup that uses bridge-networking which is complex I know: but what is the condition to the SST-Request being prepared at all? Why is it not happening?

Joiner-Logs:

mariadb_1   | 2022-10-28  9:03:00 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 4804590)
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: ####### processing CC 4804590, local, ordered
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: Process first view: 67936391-c563-11ec-8631-f70270539624 my uuid: 524090c4-569f-11ed-8a56-ab7e738222ef
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: Server mariadb-1 connected to cluster at position 67936391-c563-11ec-8631-f70270539624:4804590 with ID 524090c4-569f-11ed-8a56-ab7e738222ef
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: Server status change disconnected -> connected
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: ####### My UUID: 524090c4-569f-11ed-8a56-ab7e738222ef
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: yes
mariadb_1   | 2022-10-28  9:03:00 0 [Note] WSREP: Service thread queue flushed.
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: State transfer required: 
mariadb_1   | 	Group state: 67936391-c563-11ec-8631-f70270539624:4804590
mariadb_1   | 	Local state: 00000000-0000-0000-0000-000000000000:-1
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: Server status change connected -> joiner
mariadb_1   | 2022-10-28  9:03:00 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
mariadb_1   | 2022-10-28  9:03:00 0 [Note] WSREP: Joiner monitor thread started to monitor
mariadb_1   | 2022-10-28  9:03:00 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address 'mariadb-1.mariadb-galera-cluster' --datadir '/var/lib/mysql/' --parent 1 --progress 0'
mariadb_1   | WSREP_SST: [INFO] rsync SST started on joiner (20221028 09:03:00.889)
mariadb_1   | 2022-10-28  9:03:01 2 [Note] WSREP: ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 4804590, STRv: 3
mariadb_1   | 2022-10-28  9:03:01 2 [Note] WSREP: IST receiver addr using tcp://10.1.0.96:4568
mariadb_1   | 2022-10-28  9:03:01 2 [Note] WSREP: IST receiver bind using tcp://0.0.0.0:4568
mariadb_1   | 2022-10-28  9:03:01 2 [Note] WSREP: Prepared IST receiver for 0-4804590, listening at: tcp://0.0.0.0:4568
mariadb_1   | 2022-10-28  9:03:03 0 [Note] WSREP: (524090c4-8a56, 'tcp://0.0.0.0:4567') turning message relay requesting off

Output of ps aux:

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mysql          1  0.1  0.4 834088 34448 ?        Ssl  09:02   0:00 mariadbd
mysql         65  0.0  0.0   2880   996 ?        S    09:03   0:00 sh -c wsrep_sst_rsync --role 'joiner' --address 'mariadb-1.mariadb-galera-cluster' --datadir '/var/lib/mysql/' --parent 1 --pr
mysql         66  0.1  0.0   5012  3932 ?        S    09:03   0:00 bash /usr//bin/wsrep_sst_rsync --role joiner --address mariadb-1.mariadb-galera-cluster --datadir /var/lib/mysql/ --parent 1 -
mysql        191  0.0  0.0   8920  2700 ?        S    09:03   0:00 rsync --daemon --no-detach --port 4444 --config /var/lib/mysql/rsync_sst.conf
mysql       1239  0.0  0.0   2780  1052 ?        S    09:06   0:00 sleep 1
root        1240  0.0  0.0   7052  1560 pts/0    R+   09:06   0:00 ps aux

I already confirmed the wsrep_sst_rsync script is waiting for the MAGIC_FILE here:

while [ ! -r "$MAGIC_FILE" ] && check_pid "$TRANSFER_PID" && \
          ps -p $MYSQLD_PID >/dev/null 2>&1
do
        sleep 1
done
@ayurchen
Copy link
Member

Hi,
Just before that line you have

    echo "ready $ADDR:$RSYNC_PORT/$MODULE"

when the parent process (mysqld) reads that, it will send SST request to cluster and a donor will be chosen, that should be reflected in the error log. If there is nothing more in the error log after the lines that you posted, then something unrelated to networking happened. If this node managed to connect to cluster, then networking is fine.
What's in the error log on other nodes?

@hajnalmt
Copy link

We experience the same hanging with mariabackup sst too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants