3

If I understand correctly, this programme has undefined behavior in C++ because the intermediate value p + 1 is a pointer to uninitialized memory:

int main () {
    int x = 0;
    int *p = &x;
    p = p + 1 - 1;
    *p = 5;
}

If void were put in main's argument list (as required by the C grammar), would it also be undefined behavior in C?

9
  • 1
    A pointer to uninitialized memory is usually not a problem. Pointing to unallocated memory might be. I don't know if this counts as unallocated memory. The UB is triggered because p + 1 might not exist as an address, but 99.9999999% of the time it wouldn't be a problem because the stack is large enough to hold one more int. Additionally, I think any allocated memory in C has an extra element after it. Commented Feb 14, 2022 at 22:35
  • 1
    No, calculating a pointer with a value that points to an invalid memory address is not UB. Only dereferencing the pointer is. But, your final value is okay. So, here, using *p is not UB. Commented Feb 14, 2022 at 22:36
  • 2
    p + 1 is a pointer to memory that is not owned by x, but forming it is not undefined behavior. There is an exception for p + 1 in the C++ standard. Commented Feb 14, 2022 at 22:36
  • 4
    Right, p + 1 after the allocated memory is ok unless you dereference it. Commented Feb 14, 2022 at 22:38
  • 1
    That is wrong. A compiler knows nothing about the stack size and may treat any p + n with n > 1 as impossible, since it would otherwise result in UB, and may apply any optimization assuming n <= 1. Commented Feb 14, 2022 at 22:47

2 Answers 2

8

There is no undefined behavior in either language. You can consider a single object as an array with one element. With pointer arithmetic, a pointer may point one past the last element of an array, so this statement

p = p + 1 - 1;

is correct.

From the C Standard (6.5.6 Additive operators)

7 For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

and

8 ...Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object.

Note also, from the same paragraph:

8 ...If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
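
The quoted rules can be illustrated with a small sketch. The helper names store_through_round_trip and sum_ints are hypothetical, chosen only for this example; the first mirrors the question's p + 1 - 1 on a single object, and the second uses a one-past-the-end pointer purely as a loop bound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: forms the one-past pointer of a single int,
   steps back, and stores through the result. */
static int store_through_round_trip(int *obj, int value) {
    int *end = obj + 1;  /* one past the "array of length one": valid to form */
    int *p = end - 1;    /* back to the object itself */
    assert(p == obj);
    *p = value;          /* p may be dereferenced; end must not be */
    return *obj;
}

/* Hypothetical helper: sums n ints, using the one-past-the-end
   pointer only for comparison, never for access. */
static long sum_ints(const int *first, size_t n) {
    const int *last = first + n;  /* one past the last element */
    long total = 0;
    for (const int *it = first; it != last; ++it)
        total += *it;
    return total;
}
```

Calling sum_ints(&x, 1) on a single int x relies on the same "array of length one" rule as the question's code.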

15 Comments

@alfC Reread the quote #7 from the Standard provided in my answer.
@alfC 1) A stack is not something specified by the C standard. It's an implementation detail. 2) The C standard is pretty clear about p + 1 being a valid operation. So any implementation that uses a stack must make sure it's valid.
@alfC If you put a very large object on your stack, your program will most likely crash for other reasons.
@alfC: The C standard says, as quoted, that in pointer arithmetic, a single object acts as an array of one element and that you can add to make a pointer to one beyond the last element of an array. That is conclusive; the behavior is defined because the C standard defines it.
@ByteEater This is exactly the same in the C++ Standard; even the same phrases are used. Ranges over arrays in C++ algorithms and in the range-based for loop are built on this rule.
3

I think it's a bit unfortunate that the OP chose p + 1 - 1 as an example because p + 1 is not undefined behavior as shown in Vlad from Moscow's answer.

The question is more interesting if we consider p + 2 - 2. Here p + 2 is indeed undefined behavior. But does that matter if, in the full expression, we "undo this computation"?

There is an analog for integers. E.g. given i a signed integer and if i + 2 overflows, thus being undefined behavior, is the expression i + 2 - 2 ok or undefined behavior?

The answer to both is that it is undefined behavior. If an expression is undefined behavior and the program would reach that expression in its evaluation then the whole program exhibits undefined behavior.
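
For the integer analog, one way to stay within defined behavior is to test the intermediate sum before computing it. The following is only a sketch; add_would_overflow and round_trip_add are hypothetical names introduced for illustration, not standard functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if a + b is not representable as an int.
   The checks themselves never overflow. */
static bool add_would_overflow(int a, int b) {
    return (b > 0 && a > INT_MAX - b) ||
           (b < 0 && a < INT_MIN - b);
}

/* Hypothetical helper: evaluates i + k - k only when the
   intermediate i + k is representable; reports failure otherwise. */
static bool round_trip_add(int i, int k, int *out) {
    assert(out != NULL);
    if (add_would_overflow(i, k))
        return false;      /* evaluating i + k here would be UB */
    int tmp = i + k;       /* safe: checked above */
    *out = tmp - k;        /* equals i, so always representable */
    return true;
}
```

With these definitions, round_trip_add(INT_MAX, 2, &out) returns false instead of evaluating the overflowing sum.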

There is a better-known case of this: computing the midpoint of two signed integers. (a + b) / 2 is UB if a + b overflows, even if the final value would fit in the data type.
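
A possible overflow-free formulation is sketched below. The function name midpoint is an assumption made for this example, and the rounding direction (toward a when the signs match, toward zero otherwise) is a choice of this sketch rather than something the answer specifies.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: midpoint of two ints with no overflowing
   intermediate value. */
static int midpoint(int a, int b) {
    if ((a < 0) == (b < 0))
        return a + (b - a) / 2;  /* same sign: b - a cannot overflow */
    return (a + b) / 2;          /* opposite signs: a + b cannot overflow */
}

/* Small usage check on values where (a + b) / 2 would overflow. */
static void midpoint_self_check(void) {
    assert(midpoint(INT_MAX, INT_MAX) == INT_MAX);
    assert(midpoint(INT_MIN, INT_MIN + 2) == INT_MIN + 1);
}
```

The two branches cover all inputs: when both operands share a sign their difference fits in an int, and when the signs differ their sum does.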

7 Comments

The C standard allows the computation of p + 2 - 2. It does not define it. The C standard is not a dictatorship that controls what you can do or even a walled garden you must remain inside. It is an open world where the standard provides fundamental city services—roads, public transit, water and other utilities, libraries and other public buildings—but C implementations can both build their own buildings and go outside the city limits…
… Java, for example, was intended to be a closed language where only defined things are allowed, so a program that works in one Java implementation works in another. C was intended to be an open language where it is easy for programmers to pick it up on a new platform because it used a lot of common features but where each C implementation could add things useful for its targets or other purposes, even if they do not work in other systems. So the C standard allows programs to do all sorts of things it does not define; it is up to the C implementations to support them or not.
@EricPostpischil but it is Undefined Behavior, correct?
The C standard does not define it. That means it is “undefined behavior” as the C standard, perhaps unfortunately, uses the phrase. It does not mean the behavior cannot be defined, because things other than the C standard may define it. For example, the C implementation may define it. “Undefined Behavior” is not a proper noun; it should not be capitalized. It should be regarded only as the C standard taking its hands out of the matter, not as a prohibition on using the code.
@EricPostpischil: It is unfortunate that the C Standard recursively says all three ways by which it characterizes behaviors as UB have the same meaning, "behavior that is undefined", rather than saying e.g. "behavior over which the Standard waives jurisdiction". When the Standard says UB occurs because of "non-portable or erroneous" constructs, that doesn't mean constructs that are non-portable and therefore erroneous, but instead includes constructs that are non-portable but correct.