Showing posts with label sequence points. Show all posts

Sunday, February 20, 2011

Undefined Behavior

Kernighan and Ritchie wisely point out, ``if you don't know how they are done on various machines, that innocence may help to protect you.''

The C specifications are interesting because they leave many behaviors undefined. For example, if you try to use an uninitialized local variable, the results are technically undefined. In some languages, the specification might stipulate that the program halt gracefully with an error message, or that all variables be initialized with default values as soon as they are declared. The architects of C left it to the compiler writers to decide how to handle it. The reason is optimization: if the language required a graceful halt upon use of an uninitialized variable, compliant compilers would have to build in that check, which comes with a performance cost.

Some other behaviors that produce undefined results:
  • Division by zero. In practice, this usually results in the program halting (possibly with a core dump), but according to the C standard it doesn't have to. 1/0 could evaluate to absolutely anything, and computing it might cause the computer to format its hard drive; that would still be C-compliant.
  • i = i++ + i++. Assume i starts at 0. What do you think i should become after this operation? The standard does not say. The general rule is: if a variable is modified more than once between two sequence points, or is modified and also read for any purpose other than computing the value to be stored, the behavior is undefined.
  • Trying to read or write memory that hasn't been allocated. This is the source of all the trouble with buffer overflows.
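Two of the cases above (division by zero and out-of-range memory access) can be avoided by checking before acting. Here is a minimal sketch; the helper names safe_div and safe_read are made up for this post and are not part of any standard library.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores a / b in *out and returns true,
   or returns false when the division would be undefined. */
bool safe_div(int a, int b, int *out)
{
    if (b == 0)
        return false;               /* division by zero */
    if (a == INT_MIN && b == -1)
        return false;               /* quotient does not fit in an int */
    *out = a / b;
    return true;
}

/* Hypothetical helper: reads buf[idx] only when idx is inside the array. */
bool safe_read(const int *buf, size_t len, size_t idx, int *out)
{
    if (buf == NULL || idx >= len)
        return false;               /* would read memory we do not own */
    *out = buf[idx];
    return true;
}
```

The caller checks the return value and only then uses *out, so the undefined operation is never executed.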

Example:
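#include <stdio.h>

void f(int a, int b, int c);   /* declare f before it is called */

int main(void)
{
    int i = 10;
    f(i++, i++, i++);          /* undefined: i is modified three times with no sequence point in between */
    printf("i=:%d\n", i);
    return 0;
}

void f(int a, int b, int c)
{
    printf("a=%d:b=%d:c=%d\n", a, b, c);
}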
 
Output:
undefined

Explanation: The behavior of this program is undefined by the C standard, so the result depends on the compiler and can vary from one compiler to another. The problem is the call f(i++,i++,i++): i is modified three times with no sequence point between the modifications.
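If the intent was to pass three successive values of i from left to right (that is only a guess at the intent), one way to get defined behavior is to do each increment in its own statement. A rough sketch, with call_f_in_order as a made-up helper name:

```c
#include <stdio.h>

/* Same job as f in the example above: print the three arguments. */
static void f(int a, int b, int c)
{
    printf("a=%d:b=%d:c=%d\n", a, b, c);
}

/* Hypothetical helper: passes three successive values of *i to f,
   left to right, and leaves *i advanced by three. */
void call_f_in_order(int *i)
{
    int a = (*i)++;   /* the end of each declaration is a sequence point */
    int b = (*i)++;
    int c = (*i)++;
    f(a, b, c);
}
```

Starting from i = 10, this calls f(10, 11, 12) and leaves i at 13 on every conforming compiler.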

Thursday, February 10, 2011

Sequence points


From Wikipedia, the free encyclopedia
A sequence point in imperative programming defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed.

The C language defines the following sequence points:
  • Left operand of the logical-AND operator (&&). The left operand of the logical-AND operator is completely evaluated and all side effects complete before continuing. If the left operand evaluates to false (0), the other operand is not evaluated.
  • Left operand of the logical-OR operator (||). The left operand of the logical-OR operator is completely evaluated and all side effects complete before continuing. If the left operand evaluates to true (nonzero), the other operand is not evaluated.
  • Left operand of the comma operator. The left operand of the comma operator is completely evaluated and all side effects complete before continuing. Both operands of the comma operator are always evaluated. Note that the commas separating the arguments of a function call are not comma operators and do not guarantee an order of evaluation.
  • Function-call operator. All arguments to a function are evaluated and all side effects complete before entry to the function. No order of evaluation among the arguments is specified.
  • First operand of the conditional operator. The first operand of the conditional operator is completely evaluated and all side effects complete before continuing.
  • The end of a full initialization expression (that is, an expression that is not part of another expression), such as the end of an initialization in a declaration statement.
  • The expression in an expression statement. Expression statements consist of an optional expression followed by a semicolon (;). The expression is evaluated for its side effects and there is a sequence point following this evaluation.
  • The controlling expression in a selection (if or switch) statement. The expression is completely evaluated and all side effects complete before the code dependent on the selection is executed.
  • The controlling expression of a while or do statement. The expression is completely evaluated and all side effects complete before any statements in the next iteration of the while or do loop are executed.
  • Each of the three expressions of a for statement. The expressions are completely evaluated and all side effects complete before any statements in the next iteration of the for loop are executed.
  • The expression in a return statement. The expression is completely evaluated and all side effects complete before control returns to the calling function.
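A few of the sequence points listed above can be seen in small, well-defined functions. This is only an illustrative sketch; the function names are made up for this post.

```c
/* && : the side effect of i++ is complete before i is read on the right. */
int and_demo(int i)
{
    return (i++ == 0) && (i == 1);
}

/* Comma operator: the left operand is fully evaluated first,
   so the right operand sees the incremented i. */
int comma_demo(int i)
{
    return (i++, i * 2);
}

/* ?: : the condition's side effect is complete before
   whichever branch is chosen gets evaluated. */
int cond_demo(int i)
{
    return (i++ > 0) ? i : -i;
}
```

For example, comma_demo(3) returns 8 and cond_demo(1) returns 2, because in both cases the increment has already happened when i is read again.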

The Standard states that:
 Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.
In an expression statement, the ``next sequence point'' is usually at the terminating semicolon, and the ``previous sequence point'' is at the end of the previous statement. An expression may also contain intermediate sequence points.
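To make the rule concrete, here is a small sketch of what it allows and what it forbids. The function names are made up, and the forbidden forms are kept inside a comment so they are never compiled.

```c
/* Allowed: i is written once, and its old value is read only to
   compute the new one. */
int add_one(int i)
{
    i = i + 1;
    return i;
}

/* Allowed: two writes, but the semicolon ending the first statement
   is a sequence point that separates them. */
int add_two(int i)
{
    i++;
    i++;
    return i;
}

/* Not allowed:
 *     i = i++;        i is written twice before the semicolon
 *     j = i + i++;    i is read for a purpose other than computing its new value
 */
```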

Undefined Behaviour:
"C guarantees that all side effects of a given expression is completed by the next sequence point in the program. If two or more operations with side effects affecting each other occur before the next sequence point, the behavior is undefined. "

Take a look at this:

#include <stdio.h>
int main()
{
    int i = 5;
    printf("%d %d %d\n", i, i--, ++i);
    return 0;
}
The output was 5 6 5 when compiled with gcc and 6 6 6 when compiled with the Microsoft C/C++ compiler; other compilers, versions, or optimization settings may print something else.
Take a look at another one:

#include <stdio.h>
int main()
{
    int a = 5;
    a += a++ + a++;
    printf("%d\n", a);
    return 0;
}
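The output was 17 with both compilers, but that agreement does not make the behaviour defined.
The behaviour of such C programs is undefined. In the statements printf("%d %d %d\n", i, i--, ++i); and a += a++ + a++;, no sequence point separates the modifications of i (or of a): the arguments of a function call are evaluated in no specified order, and in a += a++ + a++; the next sequence point is the terminating semicolon. Such code may behave differently when compiled with different compilers.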
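There is no single "correct" meaning for a += a++ + a++, so any rewrite has to pick one. The sketch below picks a strict left-to-right reading and spells it out one statement at a time; the function name is made up, and the chosen ordering is an assumption, not what any particular compiler does.

```c
/* Hypothetical helper: one explicit reading of a += a++ + a++.
   Every modification of a sits in its own statement, so the
   result is the same on every conforming compiler. */
int sum_then_add(int a)
{
    int first = a;        /* value the first a++ would yield */
    a++;
    int second = a;       /* value the second a++ would yield */
    a++;
    a += first + second;
    return a;
}
```

With a starting at 5 this reading gives 18, not the 17 reported above, which is a good reminder that the original statement has no answer a compiler is obliged to produce.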

From K&R. In Section 2.12 (Precedence and Order of Evaluation) of the book, the authors write,
C, like most languages, does not specify the order in which the operands of an operator are evaluated. (The exceptions are &&, ||, ?:, and ','.) For example, in a statement like
x = f() + g();
f may be evaluated before g or vice versa; thus if either f or g alters a variable on which the other depends, x can depend on the order of evaluation. Intermediate results can be stored in temporary variables to ensure a particular sequence.
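The advice about temporary variables can be sketched like this (this code is not from K&R; the bodies of f and g and the shared counter are made up so that the order visibly matters):

```c
static int counter;                 /* shared state touched by both functions */

static int f(void) { counter += 1;  return counter; }
static int g(void) { counter *= 10; return counter; }

/* Hypothetical wrapper: f is guaranteed to run before g. */
int f_then_g(void)
{
    counter = 0;
    int left = f();                 /* sequence point at the end of each declaration */
    int right = g();
    return left + right;
}

/* Hypothetical wrapper: g is guaranteed to run before f. */
int g_then_f(void)
{
    counter = 0;
    int right = g();
    int left = f();
    return left + right;
}
```

Here f_then_g() returns 11 and g_then_f() returns 1, whereas a plain f() + g() could legally produce either result.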

One unhappy situation is typified by the statement
a[i] = i++;
The question is whether the subscript is the old value of i or the new. Compilers can interpret this in different ways, and generate different answers depending on their interpretation.
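Both readings of a[i] = i++ can be written so that nothing is left to interpretation. A minimal sketch, with made-up function names; pick whichever one matches what you actually meant:

```c
/* Reading 1: the subscript is the old value of i. */
void store_then_increment(int a[], int *i)
{
    a[*i] = *i;
    (*i)++;
}

/* Reading 2: the subscript is the new value of i,
   and the stored value is the old one. */
void increment_then_store(int a[], int *i)
{
    int old = *i;
    (*i)++;
    a[*i] = old;
}
```

The caller must make sure the index stays inside the array in both cases; the second version touches element i + 1.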