gh-135953: Properly obtain main thread identifier in Gecko Collector#146045
gh-135953: Properly obtain main thread identifier in Gecko Collector#146045flowln wants to merge 3 commits intopython:mainfrom
Conversation
Since running a profiler via CLI (python -m profiling.sampling run) spawns a new subprocess where the actual user-specified code will run, a call to threading.main_thread() in the collector's process will not return the profiled process's main thread. To combat this, we rely on the fact that thread objects are inserted in such a way that the first object in the list represents the oldest ThreadState object [1], which corresponds to a ThreadState associated with the main thread. [1] - https://github.com/python/cpython/blob/1b118353bb0a9d816de6ef673f3b11775de5bec5/Include/internal/pycore_interp_structs.h#L831 Signed-off-by: Sofia Donato Ferreira <flowlnlnln@gmail.com>
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
The ordering is actually newest -> oldest, since the _remote_debugging code traverses the ThreadState linked list in the intepreter state, appending to a list of threads in-order. Signed-off-by: Sofia Donato Ferreira <flowlnlnln@gmail.com>
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
It seems to sometimes happen that some samples (possibly at the very start / end) do not have any active threads in the interpreter state. It is OK to make main_tid maybe None here, since if it is, the threads list is empty, and so the for loop will not execute anyways. Signed-off-by: Sofia Donato Ferreira <flowlnlnln@gmail.com>
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
|
Thanks for the PR! I think we should do this but I find the approach a bit flaky. We have enough information in the C layer to know what thread is the main thread so we can surface that in the data that we emit (or alternatively we can add to the unwinder a new method that tells us the tid of the main thread). Do you feel comfortable to do this in the C code? |
Since running a profiler via CLI (
python -m profiling.sampling run) spawns a new subprocess where the actual user-specified code will run, a call tothreading.main_thread()in the collector's process will not return the profiled process's main thread.To combat this, we rely on the fact that thread objects are inserted in such a way that the last object in the list represents the oldest
ThreadStateobject, which corresponds to aThreadStateassociated with the main thread.This allows the main thread to be highlighted on Firefox Profiler:
