Commit Graph

302 Commits

Author SHA1 Message Date
Neil Schemenauer
c98c5b3449 gh-131253: free-threaded build support for pystats (gh-137189)
Allow the --enable-pystats build option to be used with free-threading.  The
stats are now stored on a per-interpreter basis, rather than process global.
For free-threaded builds, the stats structure is allocated per-thread and
then periodically merged into the per-interpreter stats structure (on thread
exit or when the reporting function is called). Most of the pystats related
code has be moved into the file Python/pystats.c.
2025-11-03 11:36:37 -08:00
Dino Viehland
ff0cf0af10 gh-139525: Don't specialize functions which have a modified vectorcall (#139524)
Don't specialize functions which have a modified vectorcall
2025-10-03 09:58:32 -07:00
Tapeline
ea77feecbb gh-138302: Specialize int ops only if ints are compact (GH-138347) 2025-09-02 01:33:15 +08:00
Sam Gross
485b16b4f7 gh-137238: Fix data race in _Py_slot_tp_getattr_hook (gh-137240)
Replacing the slot isn't thread-safe if the GIL is disabled. Don't
require that the slot has been replaced when specializing.
2025-08-05 09:32:22 -04:00
PuQing
a9e66a7c50 gh-132815: Add support for JUMP_BACKWARD in specialization stats (#135606) 2025-06-17 14:11:09 +02:00
Mark Shannon
f6f4e8a662 GH-132554: "Virtual" iterators (GH-132555)
* FOR_ITER now pushes either the iterator and NULL or leaves the iterable and pushes tagged zero

* NEXT_ITER uses the tagged int as the index into the sequence or, if TOS is NULL, iterates as before.
2025-05-27 15:59:45 +01:00
Tomas R.
c492ac7252 GH-131798: Split up and optimize CALL_ISINSTANCE (GH-133339) 2025-05-08 14:26:30 -07:00
Irit Katriel
296cd128bf Revert "gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays (#133396)" (#133498) 2025-05-06 13:12:26 +03:00
Diego Russo
9cc77aaf9d GH-131798: Split CALL_LEN into several uops (GH-133180) 2025-05-05 14:31:48 -07:00
Irit Katriel
082dbf7788 gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays (#133396) 2025-05-05 17:46:56 +01:00
Irit Katriel
5529213d4e gh-100239: specialize BINARY_OP/SUBSCR for list-slice (#132626) 2025-05-01 10:28:52 +00:00
Mark Shannon
3831752689 PyStats: Make sure that the failure_kinds array is big enough. (#133245) 2025-05-01 10:02:51 +00:00
Mark Shannon
622300bdfa GH-132554: Add stats for GET_ITER (GH-132592)
* Add stats for GET_ITER

* Look for common iterable types, not iterator types

* Add stats for self iter and fix naming in summary
2025-04-29 09:00:14 +01:00
Kumar Aditya
f3d877a27a gh-132643: use atomic load for dict in specializer (#132653) 2025-04-18 15:06:27 +05:30
mpage
619edb802e gh-132336: Mark a few "slow path" functions used by the interpreter loop as noinline (#132337)
Mark a few functions used by the interpreter loop as noinline

These are all the slow path and should not be inlined into the interpreter
loop. Unfortunately, they end up being inlined with LTO and the current PGO
task.
2025-04-10 10:41:15 +02:00
Irit Katriel
8c9ef8f1f8 gh-100239: more stats for BINARY_OP/SUBSCR specialization (#132230) 2025-04-08 08:50:51 +00:00
Irit Katriel
68e72cf3a8 gh-100239: fix bug in comparison (#132093) 2025-04-04 18:09:49 +01:00
Irit Katriel
df59226997 gh-100239: more refined specialisation stats for BINARY_OP/SUBSCR (#132068) 2025-04-04 15:33:31 +01:00
Dino Viehland
2984ff9e51 gh-130373: Avoid locking in _LOAD_ATTR_WITH_HINT (#130372)
Avoid locking in _LOAD_ATTR_WITH_HINT
2025-03-28 15:16:41 -07:00
Victor Stinner
5c44d7d99c gh-130931: Add pycore_interpframe.h internal header (#131249)
Move _PyInterpreterFrame and associated functions
to a new pycore_interpframe.h header.
2025-03-19 18:17:44 +01:00
Ken Jin
b2ed7a6d6a gh-131281: Add include for pystats builds (#131369)
Add include to for pystats builds
2025-03-18 00:36:06 +08:00
Mark Shannon
a45f25361d GH-131238: More refactoring of core header files (GH-131351)
Adds new pycore_stats.h header file to help break dependencies involving the pycore_code.h header.
2025-03-17 14:41:05 +00:00
T. Wouters
de2f7da77d gh-115999: Add free-threaded specialization for FOR_ITER (#128798)
Add free-threaded versions of existing specialization for FOR_ITER (list, tuples, fast range iterators and generators), without significantly affecting their thread-safety. (Iterating over shared lists/tuples/ranges should be fine like before. Reusing iterators between threads is not fine, like before. Sharing generators between threads is a recipe for significant crashes, like before.)
2025-03-12 16:21:46 +01:00
Irit Katriel
a1417b211f gh-100239: replace BINARY_SUBSCR & family by BINARY_OP with oparg NB_SUBSCR (#129700) 2025-02-07 22:39:54 +00:00
Brandt Bucher
5fa7e1b7fd GH-129715: Remove _DYNAMIC_EXIT (GH-129716) 2025-02-07 11:41:17 -08:00
Diego Russo
567394517a GH-128842: Collect JIT memory stats (GH-128941) 2025-02-02 15:17:53 -08:00
Yan Yanchii
e6c76b947b GH-128872: Remove unused argument from _PyCode_Quicken (GH-128873)
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
2025-02-02 15:09:30 -08:00
Irit Katriel
4815131910 gh-100239: specialize bitwise logical binary ops on ints (#128927) 2025-01-29 09:28:21 +00:00
Sam Gross
a10f99375e Revert "GH-128914: Remove conditional stack effects from bytecodes.c and the code generators (GH-128918)" (GH-129202)
The commit introduced a ~2.5-3% regression in the free threading build.

This reverts commit ab61d3f430.
2025-01-23 09:26:25 +00:00
Mark Shannon
ab61d3f430 GH-128914: Remove conditional stack effects from bytecodes.c and the code generators (GH-128918) 2025-01-20 17:09:23 +00:00
Kirill Podoprigora
6c52ada551 gh-100239: Handle NaN and zero division in guards for BINARY_OP_EXTEND (#128963)
Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
2025-01-19 11:02:49 +00:00
Irit Katriel
3893a92d95 gh-100239: specialize long tail of binary operations (#128722) 2025-01-16 15:22:13 +00:00
mpage
b5ee0258bf gh-115999: Specialize LOAD_ATTR for instance and class receivers in free-threaded builds (#128164)
Finish specialization for LOAD_ATTR in the free-threaded build by adding support for class and instance receivers.
2025-01-14 11:56:11 -08:00
Mark Shannon
ddd959987c GH-128685: Specialize (rather than quicken) LOAD_CONST into LOAD_CONST_[IM]MORTAL (GH-128708) 2025-01-13 10:30:28 +00:00
T. Wouters
8f93dd8a8f gh-115999: Add free-threaded specialization for COMPARE_OP (#126410)
Add free-threaded specialization for COMPARE_OP, and tests for COMPARE_OP specialization in general.

Co-authored-by: Donghee Na <donghee.na92@gmail.com>
2025-01-07 06:41:01 -08:00
Ken Jin
7ef4907412 gh-128262: Allow specialization of calls to classes with __slots__ (GH-128263) 2024-12-31 12:24:17 +08:00
Yan Yanchii
fe4dd07a84 gh-119786: Mention InternalDocs/interpreter.md instead of non-existing adaptive.md (#128329)
`Python/specialize.c`: Mention `InternalDocs/interpreter.md` instead of non-existing `adaptive.md`


Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
2024-12-30 18:38:09 +00:00
Neil Schemenauer
1b15c89a17 gh-115999: Specialize STORE_ATTR in free-threaded builds. (gh-127838)
* Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and
  use in place of `_PyDictKeys_StringLookup`.

* Change `_PyObject_TryGetInstanceAttribute` to use that function
  in the case of split keys.

* Add `unicodekeys_lookup_split` helper which allows code sharing
  between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`.

* Fix locking for `STORE_ATTR_INSTANCE_VALUE`.  Create
  `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and
  `tp_version_tag` cannot change.

* Pass `tp_version_tag` to `specialize_dict_access()`, ensuring
  the version we store on the cache is the correct one (in case of
  it changing during the specalize analysis).

* Split `analyze_descriptor` into `analyze_descriptor_load` and
  `analyze_descriptor_store` since those don't share much logic.
  Add `descriptor_is_class` helper function.

* In `specialize_dict_access`, double check `_PyObject_GetManagedDict()`
  in case we race and dict was materialized before the lock.

* Avoid borrowed references in `_Py_Specialize_StoreAttr()`.

* Use `specialize()` and `unspecialize()` helpers.

* Add unit tests to ensure specializing happens as expected in FT builds.

* Add unit tests to attempt to trigger data races (useful for running under TSAN).

* Add `has_split_table` function to `_testinternalcapi`.
2024-12-19 10:21:17 -08:00
Donghee Na
48c70b8f7d gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737) 2024-12-19 11:08:17 +09:00
mpage
2de048ce79 gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711)
We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds:

_CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version.

_LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.
2024-12-13 10:17:16 -08:00
mpage
c84928ed6d gh-115999: Specialize CALL_KW in free-threaded builds (#127713)
* Enable specialization of CALL_KW

* Fix bug pushing frame in _PY_FRAME_KW

`_PY_FRAME_KW` pushes a pointer to the new frame onto the stack for
consumption by the next uop. When pushing the frame fails, we do not
want to push the result, `NULL`, to the stack because it is not
a valid stackref. This works in the default build because `PyStackRef_NULL`
 and `NULL` are the same value, so the `PyStackRef_XCLOSE()` in the error
handler ignores it. In the free-threaded build the values are not the same;
`PyStackRef_XCLOSE()` will attempt to decref a null pointer.
2024-12-11 15:18:22 -08:00
Sam Gross
a353455fca gh-125610: Fix STORE_ATTR_INSTANCE_VALUE specialization check (GH-125612)
The `STORE_ATTR_INSTANCE_VALUE` opcode doesn't support objects with
non-NULL managed dictionaries, so don't specialize to that op in that case.
2024-12-06 10:48:24 -05:00
mpage
dabcecfd6d gh-115999: Enable specialization of CALL instructions in free-threaded builds (#127123)
The CALL family of instructions were mostly thread-safe already and only required a small number of changes, which are documented below.

A few changes were needed to make CALL_ALLOC_AND_ENTER_INIT thread-safe:

Added _PyType_LookupRefAndVersion, which returns the type version corresponding to the returned ref.

Added _PyType_CacheInitForSpecialization, which takes an init method and the corresponding type version and only populates the specialization cache if the current type version matches the supplied version. This prevents potentially caching a stale value in free-threaded builds if we race with an update to __init__.

Only cache __init__ functions that are deferred in free-threaded builds. This ensures that the reference to __init__ that is stored in the specialization cache is valid if the type version guard in _CHECK_AND_ALLOCATE_OBJECT passes.
Fix a bug in _CREATE_INIT_FRAME where the frame is pushed to the stack on failure.

A few other miscellaneous changes were also needed:

Use {LOCK,UNLOCK}_OBJECT in LIST_APPEND. This ensures that the list's per-object lock is held while we are appending to it.

Add missing co_tlbc for _Py_InitCleanup.

Stop/start the world around setting the eval frame hook. This allows us to read interp->eval_frame non-atomically and preserves the behavior of _CHECK_PEP_523 documented below.
2024-12-03 11:20:20 -08:00
Neil Schemenauer
276cd66ccb gh-115999: Add free-threaded specialization for SEND (gh-127426)
No additional thread safety changes are required.  Note that sending to
a generator that is shared between threads is currently not safe in the
free-threaded build.
2024-12-03 10:25:12 -08:00
Neil Schemenauer
0cb5222079 gh-115999: Specialize LOAD_SUPER_ATTR in free-threaded builds (gh-127128)
Use existing helpers to atomically modify the bytecode.  Add unit tests
to ensure specializing is happening as expected.  Add test_specialize.py
that can be used with ThreadSanitizer to detect data races.  
Fix thread safety issue with cell_set_contents().
2024-12-03 09:32:26 -08:00
Michael Droettboom
edefb8678a gh-127518: Fix pystats build after #127169 (#127526)
gh-127518: Fix pystats build after #127619
2024-12-02 20:17:08 +00:00
Mark Shannon
a8dd821d5b GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-127110)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Update docs

* Clearer calculation of work to do.
2024-12-02 10:12:17 +00:00
Donghee Na
e2713409cf gh-115999: Add partial free-thread specialization for BINARY_SUBSCR (gh-127227) 2024-12-02 10:38:17 +09:00
Sam Gross
71ede1142d gh-115999: Add free-threaded specialization for STORE_SUBSCR (#127169)
The specialization only depends on the type, so no special thread-safety
considerations there.

STORE_SUBSCR_LIST_INT needs to lock the list before modifying it.

`_PyDict_SetItem_Take2` already internally locks the dictionary using a
critical section.
2024-11-26 16:46:06 -05:00
mpage
d24a22e9b6 gh-115999: Record success in specialize (#127167)
Record success in `specialize`

This matches the existing behavior where we increment the success
stat for the generic opcode each time we successfully specialize
an instruction.
2024-11-22 12:07:05 -08:00