Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ksymbols): reimplement ksymbols #4464

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

oshaked1
Copy link
Contributor

@oshaked1 oshaked1 commented Dec 25, 2024

1. Explain what the PR does

The previous ksymbols implementation used a lazy lookup method, where only symbols marked as required ahead of time were stored. Trying to lookup a symbol that was not stored resulted in /proc/kallsyms being read and parsed in its entirety.
While most symbols being looked up were registered as required ahead of time, some weren't (in particular symbols needed for kprobe attachment) which incurred significant overhead when tracee is being initialized.

This new implementation stores all symbols, or if a requiredDataSymbolsOnly flag is used when creating the symbol table (used by default), only non-data symbols are stored (and required data symbols must be registered before updating). Some additional memory usage optimizations are included, for example encoding symbol owners as an index into a list of owner names, and also lazy symbol name lookups where the map of symbol name to symbol is populated only for symbols that were looked up once.

From measurements I performed, the extra memory consumption is around 25MB (from ~132MB to ~157MB when running tracee with no arguments on my machine).

Under the hood, this ksymbols implementation uses a generic symbol table implementation that can be used by future code for managing executable file symbols.

A significant advantage gained by storing all non-data symbols is the ability to lookup a function symbol that contains a given code address, a feature that I plan to use in the future.

This PR closes #4463 and renders #4325 irrelevant (because /proc/kallsyms reads no-longer happen "spontaneously").

Copy link
Collaborator

@NDStrahilevitz NDStrahilevitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work overall, though I do have some comments in mind.

pkg/utils/symbol_table.go Show resolved Hide resolved
mu sync.RWMutex
// All symbols sorted by their address in descending order,
// for quick binary searches by address.
sortedSymbols []*T
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why *T? Symbol is already a pointer by definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure it is? From my understanding if I use T then sortedSymbols contains all of the symbols inline, which I tried but it made it harder to manage symbol references in the name to symbol map.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC say you have a struct S which implements some interface I. Then you can pass s S as I only with &s.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what you're doing is requiring concretely a **KernelSymbol instead of a *KernelSymbol as Symbol.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC say you have a struct S which implements some interface I. Then you can pass s S as I only with &s.

If I understand correctly, this applies only if the struct methods are implemented using a pointer receiver, notice how I implemented them without it and given a non-pointer type when creating the underlying symbol table.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointer receivers tend to be more efficient but that's besides the point. Yes you're correct but then why add the indirection at this point? Just stick to the (hopefully) stack allocated objects. Now you're forcing heap escape. If we're going through pointer eventually might as well just author the struct as a pointer receiver and make the code simpler.


// Add the new symbols to the sorted slice (which now becomes unsorted).
// Allocate the slice with the needed capacity to avoid overallocation.
oldSymbols := st.sortedSymbols
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this "copy" is unsafe as its just a pointer reassignment.
I think you should rather allocate the new slice, use on copy the old and parameter slice into the new allocation, then assign that one to the object field (while the prior slice will be given to GC).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a copy, I'm just saving a reference to the old slice in a temporary variable before assigning an empty slice to the struct variable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall operation you're doing is copying the old into a new extended. I was pinning to the particular line here as its:

  1. The first of the operation
  2. Seemed problematic to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand why it's unsafe

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You assign the pointer of st.sortedSymbols to oldSymbols. This isn't a copy.
Then to the pointer of st.sortedSymbols you assign the make. This may be reassigning a pointer which is safe or assigning to the content of the pointer. It's probably the former, but if the latter that is unsafe as you are essentially overriding the content of oldSymbols as well.
In any case since there is no sorted insertion here, i'm not sure why you do the whole reallocation, just append the items, simpler and quicker, and sometimes you will avoid reallocation if the slice has preallocated space.

pkg/utils/symbol_table.go Outdated Show resolved Hide resolved
symbols = append(symbols, symbol)
}
}
st.mu.RUnlock()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should for safety's sake push one of these into the start of the following if block, and then duplicate this call before the last return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand what you mean, why is that safer?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you're using a mutex because different goroutines can call this data structure's methods from multiple locations. By unlocking here you're leaving a small gap where another goroutine can "sneak in", do changes and then unlock you back inside the if post changes.
That's why the gap between locks and unlocks should be as small as possible - so this unlock should be done near the inner Lock of the if or right before the return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood 👍

Owner: symbolOwner,
}
func (kst *KernelSymbolTable) ForEachSymbol(callback func(*KernelSymbol)) {
kst.symbols.ForEachSymbol(callback)
}

// copySliceOfPointersToSliceOfStructs converts a slice of pointers to a slice of structs.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is left unused inside the package, and I see you've copied it elsewhere, this can be removed, no reason to leave dangling code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in the test, should I remove it from the package and call the inner version in the test?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've copied it into the test file no? Then you're using that copy anyway and this is dangling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, getTheOnlySymbol() in the test file uses KernelSymbolTable.ForEachSymbol()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm referring to copySliceOfPointersToSliceOfStructs , I see now it might not be clear from the snippet.

// limit to prevent unrelated symbols from being returned for an address lookup
const maxSymbolSize = 0x100000

var ownersMu sync.RWMutex
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following owner tracking data structures should be localized to the struct I think. Since this package doesn't expose a singleton (which it certainly could given that a kernel symbol table is a single system file) I don't think we should mix global and struct variables.
That is unless you want to make this a singleton.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I made it global is so that I could access it in the Owner() function, without needing to save a back-reference to the symbol table in each symbol (which would incur a significant memory overhead). Is there another way you think I could localize it? I don't want to use a singleton because of the configurable aspect (requiredDataSymbolsOnly, lazyNameLookup).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What purpose does the index:owner mapping serve? I see that when you parse the file you do string->int and when you call Owner() you do int->string (via array), but I don't see where the int plays a role in itself. Why not just use strings overall?
Further, if you expect that an array of owner string will be rather small (up to 5 lets say), then the map duplication is redundant (consider that in event args we prefer arrays to a map), and you could also drop the mutex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose is to reduce memory usage, storing a uint16 takes less memory than an owner string. On my machine there are 103 possible owners so I'm not sure it would be smart to drop the map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it wouldn't be that bad storing the name, on my machine 30622/230472 symbols are owned by modules, and the system symbols could have a memory efficient representation (like a nil)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose is to reduce memory usage, storing a uint16 takes less memory than an owner string. On my machine there are 103 possible owners so I'm not sure it would be smart to drop the map.

That's why I said it depends on the expected count.

Maybe it wouldn't be that bad storing the name, on my machine 30622/230472 symbols are owned by modules, and the system symbols could have a memory efficient representation (like a nil)

That does sound like a lot and storing the string directly would be costly... So going with the int was a good call.

I still don't like the mixed approach but maybe that means the kernelsymbol table should be a singleton. Tentative on this for now.

environment.WithRequiredSymbols(t.requiredKsyms),
)
t.kernelSymbols = environment.NewKernelSymbolTable(true, true)
// t.requiredKsyms may contain non-data symbols, but it doesn't affect the validity of this call
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since plenty of the symbols i've attempted to include here belong to the T symbols, I think trimming that logic will save both RSS, loading time, and code complexity.
If you do it, please do so in a separate commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could trim the symbols included for attachment purposes, but anything included as a ksymbol event dependency needs to stay becuase we don't know if it's a data symbol or not. I think the effect on RSS and loading time would be negligible, but maybe it's worth it for the sake of code complexity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would still leave us with plenty of symbols down. Aren't the mem_dump symbols also mostly T symbols?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would still leave us with plenty of symbols down. Aren't the mem_dump symbols also mostly T symbols?

We have no way to distinguish between T and D symbols requested by print_mem_dump (until they are found in the kallsyms file).

This implementation stores all symbols, or if a `requiredDataSymbolsOnly`
flag is used when creating the symbol table, only non-data symbols are saved
(and required data symbols must be registered before updating).
This new implementation uses a generic symbol table implementation that is
responsible for managing symbol lookups, and can be used by future code for
managing exeutable file symbols.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Significant initialization slowdown from kallsyms parsing
2 participants