Enforce int_max_str_digits on int-to-str conversions#7688
Conversation
The str-to-int direction already enforced sys.get_int_max_str_digits() via bytes_to_int; the int-to-str direction did not. CPython 3.14 enforces both per PEP 644. Adds check_int_to_str_digits helper in builtins::int (bit-count fast path + digit upper-bound from log10(2)), wired into the four Python-level entry points: repr, the str fast path in protocol::object, int.__format__ (decimal/n/empty spec only — binary bases x/o/b are exempt per CPython), and the DecimalD/I/U branches of vm::cformat for both str % and bytes %. Unmasks 8 expectedFailure tests across test_int (max_str_digits, DoS prevention, int_from_other_bases — each mirrored in IntSubclass), test_ast (test_repr_large_input_crash) and test_reprlib (test_numbers). Boundary cases (4299/4300/4301 digits at limit=4300) match CPython 3.14.4.
|
Important Review skippedReview was skipped due to path filters ⛔ Files ignored due to path filters (1)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including ⚙️ Run configurationConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds PEP 644 enforcement for decimal integer-to-string conversions by introducing a FormatSpec helper and a VM-level digit check; the check is invoked across decimal formatting paths (str/repr/format/f-strings) and rejects conversions exceeding Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: (module 'ast test_int test_reprlib' not found) Legend:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/common/src/format.rs (1)
484-489: Optional: narrowNumber(_)toNumber(Case::Lower).
Number(Case::Upper)('N') is rejected asUnknownFormatCode('N', "int")byformat_int, so it isn’t actually a decimal-int output path. Including it here means a huge int formatted with"N"will raiseValueError("Exceeds the limit …")before the format-code error has a chance to surface, changing the message users see for an invalid spec. Functionally both areValueError, so this is a low-priority polish; restricting the helper to formats that genuinely produce decimal output keeps the helper’s name honest:♻️ Proposed narrowing
pub fn is_decimal_int_format(&self) -> bool { matches!( self.format_type, - None | Some(FormatType::Decimal) | Some(FormatType::Number(_)) + None | Some(FormatType::Decimal) | Some(FormatType::Number(Case::Lower)) ) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/common/src/format.rs` around lines 484 - 489, The helper is_decimal_int_format currently treats FormatType::Number(_) as decimal, but Number(Case::Upper) ('N') is rejected later by format_int (producing UnknownFormatCode('N', "int")), causing an order-change in errors; update is_decimal_int_format to only return true for None, Some(FormatType::Decimal), or Some(FormatType::Number(Case::Lower)) so it truly represents formats that produce decimal output, referencing the is_decimal_int_format function, FormatType::Number, Case::Lower, and format_int/UnknownFormatCode to locate the relevant logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/common/src/format.rs`:
- Around line 484-489: The helper is_decimal_int_format currently treats
FormatType::Number(_) as decimal, but Number(Case::Upper) ('N') is rejected
later by format_int (producing UnknownFormatCode('N', "int")), causing an
order-change in errors; update is_decimal_int_format to only return true for
None, Some(FormatType::Decimal), or Some(FormatType::Number(Case::Lower)) so it
truly represents formats that produce decimal output, referencing the
is_decimal_int_format function, FormatType::Number, Case::Lower, and
format_int/UnknownFormatCode to locate the relevant logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 021928ab-60d9-4eb4-a307-192b88cd47d4
⛔ Files ignored due to path filters (3)
Lib/test/test_ast/test_ast.pyis excluded by!Lib/**Lib/test/test_int.pyis excluded by!Lib/**Lib/test/test_reprlib.pyis excluded by!Lib/**
📒 Files selected for processing (5)
crates/common/src/format.rscrates/vm/src/builtins/int.rscrates/vm/src/cformat.rscrates/vm/src/protocol/object.rsextra_tests/snippets/builtin_int.py
ShaharNaveh
left a comment
There was a problem hiding this comment.
TYSM for all the PRs recently:)
The test_denial_of_service_prevented_int_to_str regression test uses
support.Stopwatch, which calls time.get_clock_info('monotonic'). In
RustPython that function is gated to unix/windows targets only, so on
wasm32-wasip1 it surfaces as AttributeError and breaks the wasm-wasi CI.
Guard the test with skipUnless(hasattr(time, 'get_clock_info'), ...) so
it runs everywhere it can and is skipped on wasm.
Also narrow is_decimal_int_format to Number(Case::Lower): 'N' is rejected
by format_int as UnknownFormatCode, so excluding it preserves that error
path instead of intercepting it with the digit-limit check.
thanks for checking ! and also, i pushed a follow-up that skips the DoS test on wasm — it depends on time.get_clock_info, which is unix/windows-only in RustPython and surfaces as AttributeError via support.Stopwatch. thanks for keep watching :) |
ShaharNaveh
left a comment
There was a problem hiding this comment.
For custom patches we need to set some type of comment saying that this is our patch so update_lib can migrate the custom patches:)
| @unittest.expectedFailure # TODO: RUSTPYTHON | ||
| @unittest.skipUnless( | ||
| hasattr(time, "get_clock_info"), | ||
| "requires time.get_clock_info (not available on wasm)", |
There was a problem hiding this comment.
| "requires time.get_clock_info (not available on wasm)", | |
| "TODO: RUSTPYTHON; requires time.get_clock_info (not available on wasm)", |
| str(i) | ||
|
|
||
| @unittest.expectedFailure # TODO: RUSTPYTHON | ||
| @unittest.skipUnless( |
There was a problem hiding this comment.
Also, I'd set it to unittest.expectedFailireIf(not hasattr(time, "get_clock_info"), "TODO: RUSTPYTHON; ...")
scripts/update_lib uses TODO: RUSTPYTHON markers inside unittest decorator reason strings to identify and migrate custom RustPython patches across CPython library updates.
skipUnless silently hides the test forever; expectedFailureIf surfaces unexpected success once RustPython implements time.get_clock_info on wasm, prompting marker removal.
|
thanks and all is done |
| @unittest.expectedFailure # TODO: RUSTPYTHON | ||
| @unittest.expectedFailureIf( | ||
| not hasattr(time, "get_clock_info"), | ||
| "TODO: RUSTPYTHON; time.get_clock_info is not available on wasm", |
There was a problem hiding this comment.
how CPython wasm build handle this?
There was a problem hiding this comment.
it's verified via CPython source
the "time_get_clock_info" in Modules/timemodule.c (v3.14.0 lines 1630, 1946) has no cfg guards, so cpython exposes it unconditionally including wasm builds. The "expectedFailureIf" will surface as unexpected success once we lift RustPython's #[cfg(unix)]/#[cfg(windows)] gating in crates/vm/src/stdlib/time.rs
thanks for catching !
RustPython enforced
sys.get_int_max_str_digits()on the str → int direction (viabytes_to_int) but not the int → str direction. CPython 3.14 enforces both per PEP 644's DoS mitigation. This adds the missing checks.Entry points covered
repr(int)builtins/int.rs::repr_strstr(int)protocol/object.rs::strfast pathformat(int, …)builtins/int.rs::__format__(decimal/n/empty only)'%d' % int(str + bytes %)vm/cformat.rsDecimal{D,I,U} branchesBinary bases (
'x','o','b') remain exempt per CPython spec.Implementation
check_int_to_str_digitshelper inbuiltins::int— bit-count fast path (skip if < 1500 bits ≈ 452 digits), then upper boundbits × log10(2) + 1vs the configured limit.FormatSpec::is_decimal_int_format()incommon::formatfor the decimal-vs-binary branch decision in__format__.Tests unmasked (8)
test_int.IntStrDigitLimitsTests::test_max_str_digitstest_int.IntStrDigitLimitsTests::test_denial_of_service_prevented_int_to_strtest_int.IntStrDigitLimitsTests::test_int_from_other_basesIntSubclassStrDigitLimitsTests)test_ast.AST_Tests::test_repr_large_input_crashtest_reprlib.ReprTests::test_numbersVerification
test_int test_ast test_reprlib test_decimal test_fractions test_compile test_sys test_cmd_line), 1,276 tests, no regressionsextra_tests/snippets/builtin_int.pycovers all 9 decimal paths + 3 binary-base exemptions + disabled-limit; passes on RustPython and CPython 3.14.4Summary by CodeRabbit