A process can't change its own PID namespace. Why? The reason given is that it would make the return value of `getpid` change during the process's execution; lots of existing code doesn't expect that, and the consequences would be too awful.
But why can't a process change its PID namespace during `exec`? It won't observe its own PID change, because any memory in which it might have recorded its old PID is lost by the `exec`. Other processes won't notice its PID change either, because they'll be in the parent PID namespace and so still see the original PID. (The process could notice if its original PID was passed in `argv` or `envp`, but why would anyone do such a thing?)
Use case: I want to start a child process from my shell (e.g. `bash`), and have that child process be PID 1 of its own PID namespace. Currently, I end up with another process in between my shell and the PID 1, so the shell won't consider the PID 1 to be its direct child. (I can make my PID 1 a direct child of the shell using `CLONE_PARENT`, but then the shell doesn't know what to do with that child process, since it didn't start it.)
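
To make the extra process concrete, here is a minimal sketch of what has to happen today. The helper name `pid1_via_intermediate` and its return convention are invented for this example, and the fallback to `CLONE_NEWUSER` is only an assumption so the sketch can run unprivileged on systems that allow user namespaces.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the grandchild saw getpid() == 1,
 * 0 if it saw some other PID, and -1 if any step failed (for example
 * because unshare() is not permitted here).
 */
int pid1_via_intermediate(void)
{
    pid_t middle = fork();              /* the process the shell starts */
    if (middle < 0)
        return -1;

    if (middle == 0) {
        /*
         * unshare(CLONE_NEWPID) only changes pid_ns_for_children;
         * this process keeps its PID and its PID namespace.
         * Assumption: retry with CLONE_NEWUSER when unprivileged.
         */
        if (unshare(CLONE_NEWPID) != 0 &&
            unshare(CLONE_NEWUSER | CLONE_NEWPID) != 0)
            _exit(2);

        pid_t init = fork();            /* first child becomes PID 1 */
        if (init < 0)
            _exit(2);
        if (init == 0)
            _exit(getpid() == 1 ? 0 : 1);

        /* The intermediate process has to stay around to relay status. */
        int st;
        if (waitpid(init, &st, 0) != init || !WIFEXITED(st))
            _exit(2);
        _exit(WEXITSTATUS(st));
    }

    int status;
    if (waitpid(middle, &status, 0) != middle || !WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```

The caller only ever sees `middle` as its child; the PID 1 process is a grandchild, which is exactly the situation described above.
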
Possible API: Add a new flag `AT_NEWPID` to the `execveat` system call. (It could use `0x200`, based on the `unlinkat`/`faccessat` precedent of using that value for a syscall-specific `AT_` flag.) Fail with `EINVAL` if `AT_NEWPID` is passed when `pid_ns_for_children` is the same as your own PID namespace. Once `execveat` gets to the "point of no return", if `AT_NEWPID` is set, move the process into its `pid_ns_for_children`.
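
From userspace, the proposed API might be used roughly as follows. This is only a sketch: `AT_NEWPID` does not exist in any kernel, its value here is just the one suggested above, and `spawn_as_pid1` and its exit codes are hypothetical. On current kernels the `execveat` call is expected to fail because the flag is unknown, so the child simply exits.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>   /* for waitpid() in callers */
#include <unistd.h>

/* HYPOTHETICAL: the proposed flag; not defined by any existing kernel. */
#define AT_NEWPID 0x200

/*
 * Hypothetical helper: fork a child that asks to become PID 1 of a new
 * PID namespace across exec. Returns the child's PID in the parent, or
 * -1 if fork() failed.
 */
pid_t spawn_as_pid1(const char *path, char *const argv[], char *const envp[])
{
    pid_t child = fork();
    if (child != 0)
        return child;

    /* Create the empty namespace; it becomes pid_ns_for_children. */
    if (unshare(CLONE_NEWPID) != 0)
        _exit(126);

    /*
     * Under the proposal, a successful exec would move this process
     * into that namespace as PID 1. Existing kernels reject the flag.
     */
    syscall(SYS_execveat, AT_FDCWD, path, argv, envp, AT_NEWPID);
    _exit(127);
}
```

The point of this design is that the parent (the shell) gets back an ordinary direct child that it can `waitpid` on, with no intermediate process.
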
Restrictions: (a) fail with `EINVAL` if you are PID 1 of your PID namespace; (b) fail with `EINVAL` if your `pid_ns_for_children` is not an empty, never-used PID namespace (that is, one that has never had a PID 1).
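
Taken together, the `EINVAL` conditions could be checked along these lines. This is a userspace model, not kernel code: the `model_*` types, their fields, and `at_newpid_check` are all invented for illustration, and how the kernel would actually detect a never-used namespace is left open.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PID namespace. */
struct model_pid_ns {
    bool ever_had_init;     /* true once a PID 1 has existed here */
};

/* Hypothetical stand-in for the calling task's namespace state. */
struct model_task {
    const struct model_pid_ns *active_ns;        /* namespace it lives in */
    const struct model_pid_ns *ns_for_children;  /* pid_ns_for_children */
    int pid_in_active_ns;                        /* its PID in active_ns */
};

/* Returns 0 if AT_NEWPID would be allowed, -EINVAL otherwise. */
int at_newpid_check(const struct model_task *t)
{
    /* No new namespace has been set up for children. */
    if (t->ns_for_children == t->active_ns)
        return -EINVAL;
    /* Restriction (a): the caller is PID 1 of its own namespace. */
    if (t->pid_in_active_ns == 1)
        return -EINVAL;
    /* Restriction (b): the target namespace has already been used. */
    if (t->ns_for_children->ever_had_init)
        return -EINVAL;
    return 0;
}
```
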
Possible implementation: Change `create_pid_cachep` to allocate storage for one extra `struct upid` level, so we can handle entering an empty child PID namespace. We only need a single extra level due to the above restrictions. After that, we allocate PID 1 in the child PID namespace's IDR, then increment the `struct pid` level. We'd also need some flag in `struct pid` to indicate this has happened, so `put_pid` knows to call `kmem_cache_free` using the parent namespace's (`pid->numbers[pid->level-1].ns`) `pid_cachep` rather than the actual namespace's.
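
The bookkeeping can be sketched with a simplified userspace model. Everything named `model_*` is invented for this example, `calloc`/`free` stand in for the per-namespace slab cache, and IDR allocation and reference counting are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct model_ns {
    unsigned int level;         /* nesting depth; 0 for the root */
    struct model_ns *parent;
};

struct model_upid {
    int nr;                     /* PID number as seen in ns */
    struct model_ns *ns;
};

struct model_pid {
    unsigned int level;         /* index of the innermost namespace */
    bool entered_child_ns;      /* the proposed extra flag */
    struct model_upid numbers[];
};

/* Object size for ns's cache: level + 1 slots, plus one spare slot. */
size_t model_pid_size(const struct model_ns *ns)
{
    return sizeof(struct model_pid) +
           (ns->level + 2) * sizeof(struct model_upid);
}

/* Allocate a pid for a process in ns; nrs[i] is its PID at level i. */
struct model_pid *model_alloc_pid(struct model_ns *ns, const int *nrs)
{
    struct model_pid *pid = calloc(1, model_pid_size(ns));
    if (!pid)
        return NULL;
    pid->level = ns->level;
    for (struct model_ns *cur = ns; cur; cur = cur->parent) {
        pid->numbers[cur->level].nr = nrs[cur->level];
        pid->numbers[cur->level].ns = cur;
    }
    return pid;
}

/* Use the spare slot to become PID 1 of child; allowed only once. */
int model_enter_child_ns(struct model_pid *pid, struct model_ns *child)
{
    if (pid->entered_child_ns ||
        child->parent != pid->numbers[pid->level].ns)
        return -1;
    pid->numbers[pid->level + 1].nr = 1;
    pid->numbers[pid->level + 1].ns = child;
    pid->level++;
    pid->entered_child_ns = true;
    return 0;
}

/* Namespace whose cache the object came from, and must be freed to. */
struct model_ns *model_cache_owner(const struct model_pid *pid)
{
    unsigned int idx = pid->entered_child_ns ? pid->level - 1 : pid->level;
    return pid->numbers[idx].ns;
}
```

Because the object was sized by the parent namespace's cache, `model_cache_owner` keeps returning the parent after the level is incremented, which is the role the proposed flag plays for `put_pid`.
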