-
Notifications
You must be signed in to change notification settings - Fork 151
Fix array compound literal parsing #309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| } | ||
| lex_expect(T_close_curly); | ||
| var->array_size = count; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a new blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
DrXiao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add some test cases to the test suite for validation.
jserv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use static qualifier since shecc does not support. Check 'COMPLIANCE.md' carefully.
|
You MUST ensure bootstrapping is fully functional before submitting pull requests. |
|
IIUC, compound literals are a feature supported only since C99, and the shecc README mentions from the very beginning that this project aims to support ANSI C. Therefore, IMO, this at least does not "fix" anything. |
As far as I know, shecc is planned to fully support the C99 standard, so I think the term "fix" is acceptable. |
|
Fine, but since we haven't claimed to fully support C99 and, IIUC, array compound literals were never supported before, this seems more like supporting a new feature to me, rather than fixing an existing problem. |
|
In fact, shecc has ability to handle array compound literals, but it only captures the first element (#299). Therefore, this pull request specifically aims to fix it. |
Thanks, that resolves my doubt. |
jserv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't add tests/array_ptr.c. Instead, consolidate tests/driver.sh.
visitorckw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rebase your branch to keep the git history clean.
Commits that fix problems introduced within the same pull request should be avoided.
bf5f3d0 to
b88294c
Compare
|
Looking at the changes, the core logic for handling array compound literals looks solid—good job centralizing the decay behavior to avoid ad-hoc fixes everywhere. But yeah, there are some style inconsistencies and comment opportunities that could make this even tighter. Here's what I'd suggest, focused on Style Fixes for
|
0de5d30 to
08f1e29
Compare
|
I defer to @ChAoSUnItY for confirmation. |
| fatal("Unsupported truncation operation with invalid target size"); | ||
| } | ||
| return; | ||
| case OP_sign_ext: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removing the curly braces while this case branch has variable declaration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn’t really mean to drop those braces—they were just left behind while trying other tweaks. but it’s fine either way because C99 lets us declare source_size right after the case label without changing behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, although C99 allows this. But since in our compiler's standard workflow, stage 0 will be compiled by GCC with -Wpedantic compilation option included, and this will cause building process to output warning even in this case it didn't do anything in a harmful way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted it to the original order in my latest commit
08f1e29 to
b13f5b6
Compare
cc86cd4 to
b619761
Compare
|
@lorettayao, note that the commit message body should be wrapped at 72 characters. |
Unify and correct the handling of array compound literals across parsing, semantic analysis, and lowering. The compiler now builds a temporary array by writing each element, tracking initializer counts, and returning the array’s address instead of collapsing the literal to its first element. Decay to a scalar is performed only when a scalar value is required. A centralized helper now governs literal decay, replacing scattered ad-hoc callers in binary operators, assignments, function-call arguments, and ternary expressions. This resolves cases where array literals were incorrectly forced to scalars in pointer contexts and brings the behavior closer to standard C expectations. The update also corrects scalarization in variadic and pointer arithmetic contexts, ensures pointer-typed ternary results, handles zero-length array literals as constant zero, avoids double-brace consumption in the parser, and regenerates ARM and RISC-V snapshots to match the corrected lowering. These changes restore correct pointer semantics for array compound literals and address sysprog21#299.
b619761 to
79847d5
Compare
| } | ||
| lex_expect(T_close_curly); | ||
| var->array_size = count; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a blank line.
79847d5 to
a4aba54
Compare
DrXiao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second commit (a4aba54) modified both the C source code and the test suites, but its message only describes the latter. The message does not accurately reflect the changes made.
DrXiao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that the term "decay" appears in the commit messages, but I am not sure why it is used. Could you explain the reason behind this choice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice that a blank line is required between two function definitions. For example:
Correct example:
void func1(void)
{
/* ... */
}
/* (Comment)
* ...
* ...
*/
int func2(void)
{
/* ... */
}
void func3(int a)
{
/* ... */
}Incorrect example:
void func1(void)
{
/* ... */
}
/* (Comment)
* ...
* ...
*/
int func2(void)
{
/* ... */
}
void func3(int a)
{
/* ... */
}
i use the word decay to specify fixes around treating array compound literals this way—e.g., assigning (int[]){1,2,3} to an int * or passing it to a function should transparently yield a pointer |
a4aba54 to
3e1f41a
Compare
jserv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The work touches too many places with similar boilerplate:
if (!var->ptr_level && var->array_size == 0)
rhs = scalarize_array_literal(parent, &bb, rhs, var->type);
This pattern appears in:
- read_body_assignment (line 3815)
- read_body_statement assignments (line 4563)
- read_body_statement nv assignments (line 4661)
- read_ternary_operation twice
Problem: The special case logic is multiplying, not disappearing.
| bool is_array_literal_placeholder(var_t *var) | ||
| { | ||
| return var && var->array_size > 0 && !var->ptr_level && | ||
| var->var_name[0] == '.'; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Detecting temporaries by checking if name starts with . is fragile. This is pattern-matching on internal naming conventions rather than explicit type tracking.
Better approach: Add an explicit is_compound_literal flag to var_t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added is_compound_literal flag to var_t in defs.h and and switched is_array_literal_placeholder to rely on it instead of the synthetic .t naming.
| if (func && param_num < func->num_params) { | ||
| var_t *target = &func->param_defs[param_num]; | ||
| if (!target->ptr_level && !target->array_size) | ||
| param = | ||
| scalarize_array_literal(parent, bb, param, target->type); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only scalarizes for defined params, but variadic args (func->va_args) might still misbehave. Test case needed:
printf("%p", (int[]){1,2,3}); // Should print pointer, not 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the test case taking care this in the latest commit
|
Consider test case below // Variadic function with array literal
int main(void) {
int *p = (int[]){42, 43, 44};
return *p == 42 ? 0 : 1;
}The existing tests cover basic pointer decay but not all edge cases from the cubic analysis. |
83716cd to
d90a118
Compare
|
I encountered a CI failure with "buffer overflow detected" during the stage 1 bootstrap. Debugging with AddressSanitizer revealed a stack-buffer-overflow in src/parser.c where a generated temporary name (34 bytes) exceeded the MAX_VAR_LEN limit (32 bytes). I have increased MAX_VAR_LEN to 128 to accommodate these generated identifiers and prevent similar overflows in strcpy operations. Please let me know if you would prefer a local fix instead of this global change. |
Add regression tests that verify pointer and decay semantics of array
compound literals. The tests cover decay during initialization, passing
array literals to functions that expect pointer arguments, pointer-typed
ternary expressions involving literal branches, and correctness for
char[] and short[] literals in simple computations.
Add a driver test to ensure array compound literals decay to pointers
(int *p = (int[]){42,43,44}) and fix parsing to track compound literals
explicitly and keep them pointer-like for varargs and pointer initializers.
These tests confirm the behavior introduced by the refined compound
literal semantics implemented in the preceding commit.
Parser cleanup: flattened the nested parameter-handling conditionals in
parser.c so array-literal parameters are scalarized and promoted/resized
without deep if-nesting, improving readability while preserving the
decay semantics enforced by the new tests
Change the MAX_VAR_LEN to 128 to deal with generated temporary name
(34 bytes) exceeded the MAX_VAR_LEN limit (32 bytes)
d90a118 to
62c0f47
Compare
Instead of asking for decisions, you should present the relevant considerations from your own perspective, since you are both the implementer and the domain expert for the above code. |
|
I defer to @ChAoSUnItY for confirmation. |
Summary
Fix a segfault when evaluating array compound literals like
(int[]){1,2,3,4,5}in expression context. The literal now yields the temporary array’s address (via decay) instead of collapsing to a single element, so indexing and reads work correctly.Motivation
Previously, code like the snippet below crashed because the array literal was reduced to a scalar and later treated as a pointer.
Reproduction (manual)
Before: segfault
After: prints
a[0..4]and Sum = 6 as expectedApproach (high level)
(type[]){…}produces a real temporary array object and the expression value decays to its address.Scope
Tests
Compatibility
Issue
Summary by cubic
Fix parsing of array compound literals so they allocate a temporary array and decay to its address, preserving pointer semantics and preventing segfaults.
Bug Fixes
Refactors
Written for commit 62c0f47. Summary will update automatically on new commits.