
Why does accessing memory cause numerous pgfaults? #300

Open
xfan1024 opened this issue Dec 21, 2024 · 1 comment

xfan1024 commented Dec 21, 2024

Description

I've noticed that on the SG2042, memory-intensive workloads running in user mode spend a significant amount of time in kernel mode. It seems that many page faults are being taken in the kernel.

Typically, page faults happen on the first access to a page, or when the system has swap enabled. However, even with swap disabled and after every page has been touched once, there are still numerous page faults. Are these page faults necessary? If not, can they be optimized?
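For reference, the first-touch behavior is easy to demonstrate: a freshly mapped anonymous page faults once on its first access and then stays resident. Below is a minimal sketch using Python's resource module (ru_minflt counts this process's minor faults); the 64 MiB size and the 4 KiB page size are my own assumptions, not taken from the test below. On a machine without this problem, the second pass should report few or no faults.

import mmap
import resource

SIZE = 64 * 1024 * 1024   # 64 MiB, arbitrary
PAGE = 4096               # assumes 4 KiB base pages

def minflt():
    # Minor (soft) page faults taken by this process so far
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt

buf = mmap.mmap(-1, SIZE)  # anonymous mapping, no page touched yet

before = minflt()
for off in range(0, SIZE, PAGE):  # first touch: one write per page
    buf[off] = 1
after_first = minflt()
for off in range(0, SIZE, PAGE):  # second pass over the same pages
    buf[off] = 2
after_second = minflt()

print("first touch :", after_first - before)
print("second touch:", after_second - after_first)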

Steps to reproduce

pgfault.py

This script monitors the pgfault counter from /proc/vmstat.

import time

def read_pgfault():
    # "pgfault" in /proc/vmstat counts all page faults, minor and major
    with open("/proc/vmstat", "r") as f:
        for line in f:
            if line.startswith("pgfault"):
                return int(line.split()[1])
    return 0

def main():
    previous_pgfault = None
    while True:
        current_pgfault = read_pgfault()
        if previous_pgfault is not None:
            diff = current_pgfault - previous_pgfault
            print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} Current pgfault: {current_pgfault}, Diff: {diff}")
        previous_pgfault = current_pgfault
        time.sleep(1)

if __name__ == "__main__":
    main()
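A possible extension (my own addition, not part of the script above): also watch pgmajfault, which counts only major faults, to confirm that swap/storage is not involved, and the NUMA hint-fault counter if the kernel exposes it:

# Sketch: watch several /proc/vmstat counters, not just pgfault.
# "numa_hint_faults" only exists when the kernel has CONFIG_NUMA_BALANCING.
COUNTERS = ("pgfault", "pgmajfault", "numa_hint_faults")

def read_counters():
    values = {}
    with open("/proc/vmstat", "r") as f:
        for line in f:
            key, _, value = line.partition(" ")
            if key in COUNTERS:
                values[key] = int(value)
    return values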

memtest.c

This is the test program that accesses memory.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <pthread.h>

#define NUM_THREADS 64
#define NUM_ELEMENTS ((size_t)((1ull * 1024 * 1024 * 1024) / sizeof(uint64_t) / NUM_THREADS))
#define NUM_ITERATIONS 128

struct thread_data
{
    uint64_t *data;
    size_t elements;
    size_t iterations;
};

void memtest(uint64_t *data, size_t elements)
{
    for (size_t i = 0; i < elements; i++)
        data[i] = (uint64_t)i;
}

void *thread_memtest(void *arg)
{
    struct thread_data *data = (struct thread_data *)arg;
    for (size_t i = 0; i < data->iterations; i++)
        memtest(data->data, data->elements);
    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t threads[NUM_THREADS];
    struct thread_data thread_data[NUM_THREADS];
    
    for (size_t i = 0; i < NUM_THREADS; i++)
    {
        thread_data[i].data = (uint64_t *)malloc(NUM_ELEMENTS * sizeof(uint64_t));
        thread_data[i].elements = NUM_ELEMENTS;
        thread_data[i].iterations = NUM_ITERATIONS;
    }

    printf("press enter to warm up");
    getchar();
    for (size_t i = 0; i < NUM_THREADS; i++)
        memtest(thread_data[i].data, thread_data[i].elements);

    printf("press enter to start test");
    getchar();
    for (size_t i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, thread_memtest, &thread_data[i]);

    for (size_t i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}
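To build and run (assuming gcc): gcc -O2 -pthread memtest.c -o memtest. Each thread owns 16 MiB (1 GiB in total across the 64 threads), so after the warm-up pass every page of the data arrays should already be resident.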

Test Results

Test on SG2042 (linux 6.6)

warm up stage

2024-12-21 12:25:30 Current pgfault: 747897, Diff: 0
2024-12-21 12:25:31 Current pgfault: 762836, Diff: 14939
2024-12-21 12:25:32 Current pgfault: 781113, Diff: 18277
2024-12-21 12:25:33 Current pgfault: 781113, Diff: 0

test stage

A large number of pgfaults occur here

2024-12-21 12:25:34 Current pgfault: 781113, Diff: 0
2024-12-21 12:25:35 Current pgfault: 781247, Diff: 134
2024-12-21 12:25:36 Current pgfault: 781247, Diff: 0
2024-12-21 12:25:37 Current pgfault: 781247, Diff: 0
2024-12-21 12:25:38 Current pgfault: 785357, Diff: 4110
2024-12-21 12:25:39 Current pgfault: 800029, Diff: 14672
2024-12-21 12:25:40 Current pgfault: 817000, Diff: 16971
2024-12-21 12:25:41 Current pgfault: 834280, Diff: 17280
2024-12-21 12:25:43 Current pgfault: 836192, Diff: 1912
2024-12-21 12:25:44 Current pgfault: 836320, Diff: 128
2024-12-21 12:25:45 Current pgfault: 836320, Diff: 0
2024-12-21 12:25:46 Current pgfault: 836320, Diff: 0
2024-12-21 12:25:47 Current pgfault: 836320, Diff: 0
2024-12-21 12:25:48 Current pgfault: 836362, Diff: 42
2024-12-21 12:25:49 Current pgfault: 836362, Diff: 0

Test on x86_64

Only the warm-up stage causes pgfaults; the test stage causes none.

warm up stage

2024-12-22 01:39:34 Current pgfault: 1235160, Diff: 0
2024-12-22 01:39:35 Current pgfault: 1268376, Diff: 33216
2024-12-22 01:39:36 Current pgfault: 1268376, Diff: 0

test stage

These 134 pgfaults are most likely caused by starting the threads (e.g., their stacks being mapped in), not by accessing the data array.

2024-12-22 01:39:38 Current pgfault: 1268376, Diff: 0
2024-12-22 01:39:39 Current pgfault: 1268510, Diff: 134
2024-12-22 01:39:40 Current pgfault: 1268510, Diff: 0
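One kernel mechanism that can re-fault memory that is already resident, even with swap disabled, is automatic NUMA balancing: the kernel periodically write-protects ranges of a task's address space so that the next access takes a minor "hint fault". Whether that explains the SG2042 numbers is only a guess on my part; a quick check of the sysctl knob:

# Guesswork diagnostic: automatic NUMA balancing causes periodic minor
# faults ("hint faults") on warm memory. A value of 0 means disabled.
def numa_balancing_enabled():
    try:
        with open("/proc/sys/kernel/numa_balancing") as f:
            return f.read().strip() != "0"
    except FileNotFoundError:
        # kernel built without CONFIG_NUMA_BALANCING
        return False

print("numa_balancing enabled:", numa_balancing_enabled())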

xfan1024 (Author) commented Dec 21, 2024

On the earlier linux-6.1.55, the throughput of concurrent memory access by 64 threads sometimes fell below 10 MB/s, and that is the combined speed of all threads, not the per-thread speed.
