
faulthandler: make per-call thread dump cap configurable #149085

@efroemling

Description


Feature or enhancement

Proposal:

faulthandler.dump_traceback() and faulthandler.dump_traceback_later() cap their output at 100 threads via a hardcoded MAX_NTHREADS = 100 in Python/traceback.c. Past the cap, the dump writes "...\n" and stops.

/* Python/traceback.c */
#define MAX_NTHREADS 100

const char* _Py_NO_SANITIZE_THREAD
_Py_DumpTracebackThreads(int fd, PyInterpreterState *interp,
                         PyThreadState *current_tstate)
{
    ...
    if (nthreads >= MAX_NTHREADS) {
        PUTS(fd, "...\n");
        break;
    }
    ...
}

The constant has been there since faulthandler was added in 2010 (gh-2762). It's still documented as a fixed limit in Doc/library/faulthandler.rst:

It is limited to 100 frames and 100 threads.

Proposal: add a keyword-only max_threads parameter to both functions, defaulting to 100. Callers that omit it get behavior identical to today's.

faulthandler.dump_traceback(file=sys.stderr, all_threads=True, *, max_threads=100)
faulthandler.dump_traceback_later(timeout, repeat=False, file=sys.stderr,
                                  exit=False, *, max_threads=100)
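The current behavior is easy to reproduce on an unpatched build. A minimal sketch (the worker count and parking event are illustrative, not part of the patch): it parks 105 workers so the live thread count clears 100, then dumps.

```python
import faulthandler
import tempfile
import threading

# Demonstration on an unpatched CPython: with more than 100 threads alive,
# the dump truncates at 100 headers, and because the walk is newest-first
# the main thread is among the entries that get cut.
park = threading.Event()
workers = [threading.Thread(target=park.wait, daemon=True) for _ in range(105)]
for t in workers:
    t.start()

with tempfile.TemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback(file=f, all_threads=True)
    f.seek(0)
    dump = f.read()

park.set()
for t in workers:
    t.join()

headers = [line for line in dump.splitlines()
           if line.startswith(("Thread 0x", "Current thread 0x"))]
print(len(headers))                   # 100: the hardcoded cap
print("Current thread 0x" in dump)    # False: the main thread was cut off
print(dump.rstrip().endswith("..."))  # True: the truncation marker
```

With the proposed kwarg, passing a larger max_threads would be enough to get the full dump back.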

Motivation

100 was reasonable in 2010, but a few things can make it a problem now:

  1. Library-internal thread counts. A google-cloud-firestore Watch subscription spawns ~2 long-lived gRPC threads (google.api_core.bidi, grpc._channel.consume_request_iterator). 50 active subscriptions ≈ 100 threads. Other widely-used libraries (gRPC channels generally, ThreadPoolExecutor, asyncio's default executor) push it further. Modern server processes routinely run with 100+ long-lived threads, regardless of what the application code itself spawns.

  2. Dump order is newest-thread-first. _Py_DumpTracebackThreads walks the thread-state list via PyInterpreterState_ThreadHead → tstate->next. New thread states are prepended to that list by add_threadstate() in Python/pystate.c, so the walk visits the most recently created thread first and the main thread last. With more than 100 threads alive, the cap is guaranteed to cut off the oldest entries - including the main thread.

When you reach for dump_traceback_later as a deadlock-detection watchdog, the dump can arrive missing the main thread - the one you actually wanted.
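The visit order can be observed directly, even well under the cap. A minimal sketch (worker count and parking event are illustrative): dumping from the main thread, the "Current thread" header comes out last, i.e. the main thread is first in line to be cut once the cap bites.

```python
import faulthandler
import tempfile
import threading

# Small repro of the walk order: newest thread states are printed first,
# the main thread (the oldest state) last.
park = threading.Event()
workers = [threading.Thread(target=park.wait, daemon=True) for _ in range(3)]
for t in workers:
    t.start()

with tempfile.TemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback(file=f, all_threads=True)
    f.seek(0)
    dump = f.read()

park.set()
for t in workers:
    t.join()

headers = [line for line in dump.splitlines()
           if line.startswith(("Thread 0x", "Current thread 0x"))]
# We dumped from the main thread, so it is marked "Current thread" - and
# it appears last in the output.
print(headers[-1].startswith("Current thread"))  # True
```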

Has this already been discussed elsewhere?

Searched the issue tracker, PR history, Discourse, and python-dev. Nothing found. gh-98825 is adjacent (per-thread context for tracebacks) but doesn't touch the cap.

Links to previous discussion of this feature

None found.


Implementation

Patch ready: ~180 lines across 12 files (mostly auto-generated clinic output and globals regeneration). Builds cleanly on current main; test_faulthandler passes 50/50, with 6 platform-specific skips. Includes:

  • Two tests in Lib/test/test_faulthandler.py:
    • test_dump_traceback_max_threads: spawns 6 worker threads, dumps with max_threads=3, asserts the truncation marker is present and only 3 thread headers appear.
    • test_dump_traceback_max_threads_default: dumps without the kwarg, asserts no marker on a process with <100 threads.
  • Doc updates in Doc/library/faulthandler.rst (signatures, versionchanged:: 3.15, the limits paragraph).
  • Misc/NEWS.d/next/Library/... entry.
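The default-behavior check can already be exercised against current CPython, since omitting the kwarg matches today's semantics. A standalone sketch of that test (script-style, not the actual Lib/test code):

```python
import faulthandler
import tempfile
import threading

# Standalone version of the described default-behavior check: with fewer
# than 100 threads alive and no kwarg, the dump carries no truncation marker.
assert threading.active_count() < 100  # precondition of the test

with tempfile.TemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback(file=f, all_threads=True)
    f.seek(0)
    dump = f.read()

# The marker is a line consisting solely of "...".
print("\n...\n" in dump)  # False: nothing was truncated
```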

Implementation summary

  • Python/traceback.c: _Py_DumpTracebackThreads gains a max_nthreads parameter; the MAX_NTHREADS define moves to the header (renamed _Py_TRACEBACK_MAX_NTHREADS).
  • Include/internal/pycore_traceback.h: expose _Py_TRACEBACK_MAX_NTHREADS; update the function signature.
  • Include/internal/pycore_faulthandler.h: add max_nthreads to the watchdog thread struct.
  • Modules/faulthandler.c: new max_threads clinic kwarg on both functions (default 100); the fatal-signal handler passes _Py_TRACEBACK_MAX_NTHREADS explicitly.
  • Python/pylifecycle.c: _Py_FatalError_DumpTracebacks passes _Py_TRACEBACK_MAX_NTHREADS explicitly.
  • Doc/library/faulthandler.rst: document the kwarg + versionchanged.
  • Lib/test/test_faulthandler.py: two new tests.

Backward compatibility

Default max_threads=100 matches the old cap, so omitting the kwarg gives no change. In-tree fatal-signal callers and the watchdog all pass _Py_TRACEBACK_MAX_NTHREADS explicitly.

No public C API change. _Py_DumpTracebackThreads and _Py_TRACEBACK_MAX_NTHREADS both live in pycore_traceback.h, gated by Py_BUILD_CORE.

Open questions

  1. register() could take the same kwarg. It's left out here to keep the diff minimal; the signal-handler path's async-signal-safety constraints make the rationale weaker there, though not zero. Easy follow-up.

  2. MAX_FRAME_DEPTH is similarly hardcoded. The same pattern would apply, but the ergonomics differ: a deep stack on a deadlocked thread is usually exactly what you want to see, not what you'd cap. Probably a separate proposal.

I'll open a PR if there's interest in the feature as scoped.

CPython versions tested on

3.15 (current main).

Operating systems tested on

Linux, macOS.


Labels: interpreter-core, type-feature
