Commit Graph

47 Commits

Author SHA1 Message Date
mpage
b5ee0258bf gh-115999: Specialize LOAD_ATTR for instance and class receivers in free-threaded builds (#128164)
Finish specialization for LOAD_ATTR in the free-threaded build by adding support for class and instance receivers.
2025-01-14 11:56:11 -08:00
Mark Shannon
ddd959987c GH-128685: Specialize (rather than quicken) LOAD_CONST into LOAD_CONST_[IM]MORTAL (GH-128708) 2025-01-13 10:30:28 +00:00
Brandt Bucher
65ae3d5a73 GH-127809: Fix the JIT's understanding of ** (GH-127844) 2025-01-07 17:25:48 -08:00
Yan Yanchii
30efede33c gh-128195: Add _REPLACE_WITH_TRUE to the tier2 optimizer (GH-128203)
Add `_REPLACE_WITH_TRUE` to the tier2 optimizer
2024-12-24 05:17:47 +08:00
Donghee Na
48c70b8f7d gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737) 2024-12-19 11:08:17 +09:00
mpage
2de048ce79 gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711)
We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds:

_CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version.

_LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.
2024-12-13 10:17:16 -08:00
Ken Jin
6293d00e72 gh-120619: Strength reduce function guards, support 2-operand uop forms (GH-124846)
Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
2024-11-09 11:35:33 +08:00
Mark Shannon
faa3272fb8 GH-125837: Split LOAD_CONST into three. (GH-125972)
* Add LOAD_CONST_IMMORTAL opcode

* Add LOAD_SMALL_INT opcode

* Remove RETURN_CONST opcode
2024-10-29 11:15:42 +00:00
Brandt Bucher
b5b06349eb GH-125912: Teach the JIT's optimizer about _BINARY_OP_INPLACE_ADD_UNICODE (GH-125935) 2024-10-28 14:37:16 -07:00
mpage
f978fb4f8d gh-115999: Refactor LOAD_GLOBAL specializations to avoid reloading {globals, builtins} keys (gh-124953)
Each of the `LOAD_GLOBAL` specializations is implemented roughly as:

1. Load keys version.
2. Load cached keys version.
3. Deopt if (1) and (2) don't match.
4. Load keys.
5. Load cached index into keys.
6. Load object from (4) at offset from (5).

This is not thread-safe in free-threaded builds; the keys object may be replaced
in between steps (3) and (4).

This change refactors the specializations to avoid reloading the keys object and
instead pass the keys object from guards to be consumed by downstream uops.
2024-10-09 15:18:25 +00:00
Mark Shannon
da071fa3e8 GH-119866: Spill the stack around escaping calls. (GH-124392)
* Spill the evaluation around escaping calls in the generated interpreter and JIT. 

* The code generator tracks live, cached values so they can be saved to memory when needed.

* Spills the stack pointer around escaping calls, so that the exact stack is visible to the cycle GC.
2024-10-07 14:56:39 +01:00
Ken Jin
b84a763dca gh-120619: Optimize through _Py_FRAME_GENERAL (GH-124518)
* Optimize through _Py_FRAME_GENERAL

* refactor
2024-10-03 01:10:51 +08:00
Mark Shannon
a4fd7aa4a6 GH-115776: Allow any fixed sized object to have inline values (GH-123192) 2024-08-21 15:52:04 +01:00
Mark Shannon
bb1d30336e GH-118093: Make CALL_ALLOC_AND_ENTER_INIT suitable for tier 2. (GH-123140)
* Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it

* Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.
2024-08-20 16:52:58 +01:00
Mark Shannon
c13e7d98fb GH-118093: Specialize CALL_KW (GH-123006) 2024-08-16 17:11:24 +01:00
Mark Shannon
1795d6ceba GH-122869: Add missing tier two optimizer cases (GH-122936) 2024-08-12 10:35:52 -07:00
Mark Shannon
1ca99ed240 Manually override bytecode definition in optimizer, to avoid build error (GH-122316) 2024-07-26 18:38:52 +01:00
Brandt Bucher
64857d849f GH-122294: Burn in the addresses of side exits (GH-122295) 2024-07-26 09:40:15 -07:00
Brandt Bucher
7b36b67b1e GH-118093: Add tier two support to several instructions (GH-121884) 2024-07-18 14:24:58 -07:00
Nadeshiko Manju
f385d99f57 gh-120437: Fix _CHECK_STACK_SPACE optimization problems introduced in gh-118322 (GH-120712)
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2024-06-19 23:34:39 +08:00
Mark Shannon
9cefcc0ee7 GH-120507: Lower the BEFORE_WITH and BEFORE_ASYNC_WITH instructions. (#120640)
* Remove BEFORE_WITH and BEFORE_ASYNC_WITH instructions.

* Add LOAD_SPECIAL instruction

* Reimplement `with` and `async with` statements using LOAD_SPECIAL
2024-06-18 12:17:46 +01:00
Mark Shannon
274f844830 GH-120619: Clean up RETURN_VALUE instruction (GH-120624)
* Rename _POP_FRAME to _RETURN_VALUE as it returns a value as well as popping a frame.

* Remove remaining _POP_FRAMEs
2024-06-17 14:40:11 +01:00
Saul Shanabrook
55402d3232 gh-119258: Eliminate Type Guards in Tier 2 Optimizer with Watcher (GH-119365)
Co-authored-by: parmeggiani <parmeggiani@spaziodati.eu>
Co-authored-by: dpdani <git@danieleparmeggiani.me>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
Co-authored-by: Ken Jin <kenjin@python.org>
2024-06-08 17:41:45 +08:00
Brandt Bucher
cfcc054dee GH-119476: Split _CHECK_FUNCTION_VERSION out of _CHECK_FUNCTION_EXACT_ARGS (GH-119510) 2024-05-28 12:45:11 -07:00
Mark Shannon
f5c6b9977a GH-118910: Less boilerplate in the tier 2 optimizer (#118913) 2024-05-10 17:43:23 +01:00
Mark Shannon
1ab6356ebe GH-118095: Use broader specializations of CALL in tier 1, for better tier 2 support of calls. (GH-118322)
* Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and call CALL_NON_PY_GENERAL specializations.

* Remove CALL_PY_WITH_DEFAULTS specialization

* Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize
2024-05-04 12:11:11 +01:00
Mark Shannon
5b05d452cd GH-118095: Add tier 2 support for YIELD_VALUE (GH-118380) 2024-04-30 11:33:13 +01:00
Mark Shannon
f180b31e76 GH-118095: Handle RETURN_GENERATOR in tier 2 (GH-118180) 2024-04-25 11:32:47 +01:00
Mark Shannon
a6647d16ab GH-115480: Reduce guard strength for binary ops when type of one operand is known already (GH-118050) 2024-04-22 13:34:06 +01:00
Mark Shannon
e32f6e9e4b GH-115419: Tidy up tier 2 optimizer. Merge peephole pass into main pass (GH-117997) 2024-04-18 11:09:30 +01:00
Kirill Podoprigora
eebea7e515 gh-117176: Fix compiler warning in Python/optimizer_bytecodes.c (GH-117199) 2024-03-24 20:34:55 +02:00
Guido van Rossum
570a82d46a gh-117045: Add code object to function version cache (#117028)
Changes to the function version cache:

- In addition to the function object, also store the code object,
  and allow the latter to be retrieved even if the function has been evicted.
- Stop assigning new function versions after a critical attribute (e.g. `__code__`)
  has been modified; the version is permanently reset to zero in this case.
- Changes to `__annotations__` are no longer considered critical. (This fixes gh-109998.)

Changes to the Tier 2 optimization machinery:

- If we cannot map a function version to a function, but it is still mapped to a code object,
  we continue projecting the trace.
  The operand of the `_PUSH_FRAME` and `_POP_FRAME` opcodes can be either NULL,
  a function object, or a code object with the lowest bit set.

This allows us to trace through code that calls an ephemeral function,
i.e., a function that may not be alive when we are constructing the executor,
e.g. a generator expression or certain nested functions.
We will lose globals removal inside such functions,
but we can still do other peephole operations
(and even possibly [call inlining](https://github.com/python/cpython/pull/116290),
if we decide to do it), which only need the code object.
As before, if we cannot retrieve the code object from the cache, we stop projecting.
2024-03-21 12:37:41 -07:00
Mark Shannon
63289b9dfb GH-117066: Tier 2 optimizer: Don't throw away good traces if we can't optimize them perfectly. (GH-117067) 2024-03-20 18:24:02 +00:00
Guido van Rossum
76d0868907 Cleanup tier2 debug output (#116920)
Various tweaks, including a slight refactor of the special cases for `_PUSH_FRAME`/`_POP_FRAME` to show the actual operand emitted.
2024-03-18 11:08:43 -07:00
Ken Jin
617aca9e74 gh-115419: Change default sym to not_null (GH-116562) 2024-03-13 20:57:48 +08:00
Ken Jin
4298d69d4b gh-116420: Fix unused var compilation warnings (GH-116466)
Fix unused var compilation warnings
2024-03-08 00:19:59 +08:00
Mark Shannon
33c0aa3bb9 GH-115687: Most comparisons create Booleans, so propagate that information (GH-116360)
Most comparisons create booleans
2024-03-06 10:46:42 +00:00
Mark Shannon
0c81ce1360 GH-115819: Eliminate Boolean guards when value is known (GH-116355) 2024-03-05 15:06:00 +00:00
Mark Shannon
cbf3d38cbe GH-115685: Optimize TO_BOOL and variants based on truthiness of input. (GH-116311) 2024-03-05 11:23:46 +00:00
Ken Jin
ff96b81d78 gh-115480: Type propagate _BINARY_OP_ADD_UNICODE (GH-115710) 2024-03-02 03:40:04 +08:00
Ken Jin
d01886c5c9 gh-115685: Type/values propagate for TO_BOOL in tier 2 (GH-115686) 2024-03-01 06:13:38 +08:00
Guido van Rossum
0656509033 gh-116088: Insert bottom checks after all sym_set_...() calls (#116089)
This changes the `sym_set_...()` functions to return a `bool` which is `false`
when the symbol is `bottom` after the operation.

All calls to such functions now check this result and go to `hit_bottom`,
a special error label that prints a different message and then reports
that it wasn't able to optimize the trace. No executor will be produced
in this case.
2024-02-29 18:55:29 +00:00
Guido van Rossum
3409bc29c9 gh-115859: Re-enable T2 optimizer pass by default (#116062)
This undoes the *temporary* default disabling of the T2 optimizer pass in gh-115860.

- Add a new test that reproduces Brandt's example from gh-115859; it indeed crashes before gh-116028 with PYTHONUOPSOPTIMIZE=1
- Re-enable the optimizer pass in T2, stop checking PYTHONUOPSOPTIMIZE
- Rename the env var to disable T2 entirely to PYTHON_UOPS_OPTIMIZE (must be explicitly set to 0 to disable)
- Fix skipIf conditions on tests in test_opt.py accordingly
- Export sym_is_bottom() (for debugging)
- Fix various things in the `_BINARY_OP_` specializations in the abstract interpreter:
  - DECREF(temp)
  - out-of-space check after sym_new_const()
  - add sym_matches_type() checks, so even if we somehow reach a binary op with symbolic constants of the wrong type on the stack we won't trigger the type assert
2024-02-28 22:38:01 +00:00
Guido van Rossum
e2a3e4b748 gh-115816: Improve internal symbols API in optimizer (#116028)
- Any `sym_set_...` call that attempts to set conflicting information
  cause the symbol to become `bottom` (contradiction).
- All `sym_is...` and similar calls return false or NULL for `bottom`.
- Everything's tested.
- The tests still pass with `PYTHONUOPSOPTIMIZE=1`.
2024-02-28 17:55:56 +00:00
Mark Shannon
6ecfcfe894 GH-115816: Assorted naming and formatting changes to improve maintainability. (GH-115987)
* Rename _Py_UOpsAbstractInterpContext to _Py_UOpsContext and _Py_UOpsSymType to _Py_UopsSymbol.

* #define shortened form of _Py_uop_... names for improved readability.
2024-02-27 13:25:02 +00:00
Mark Shannon
10fbcd6c5d GH-115816: Make tier2 optimizer symbols testable, and add a few tests. (GH-115953) 2024-02-27 10:51:26 +00:00
Guido van Rossum
c0fdfba7ff Rename tier 2 redundancy eliminator to optimizer (#115888)
The original name is just too much of a mouthful.
2024-02-26 08:42:53 -08:00