See N3042.
The addition of nullptr and nullptr_t is bad.
Introduction
The macro NULL, that goes back quite early, was meant to provide a tool to specify a null pointer constant such that it is easily visible and such that it makes the intention of the programmer to specify a pointer value clear. Unfortunately, the definition as it is given in the standard misses that goal, because the constant that is hidden behind the macro can be of very different nature.
A null pointer constant can be any integer constant of value 0 or such a constant converted to void*. Thereby several types are possible for NULL. Commonly used are 0 with int, 0L with long and (void*)0 with void*.
- This may lead to surprises when invoking a type-generic macro with a NULL argument.
- Conditional expressions such as (true ? 0 : NULL) and (true ? 1 : NULL) have different status depending on how NULL is defined. Whereas the first is always defined, the second is a constraint violation if NULL has type void*, and defined otherwise. In particular, the second happens to work in C++ but most of the time not in C.
- A NULL argument that is passed as a sentinel to a ... function that expects a pointer can have severe consequences. On many architectures nowadays int and void* have different sizes, and so if NULL is just 0, a wrongly sized argument is passed to the function.
- In particular, C++ can’t have NULL as (void*)0 because void* does not implicitly convert to other pointer types. Thus it is usually an integer constant of value zero. On the C side (e.g. by printf) such a passed integer constant is then interpreted as void* or char*; such a re-interpretation has undefined behavior.
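The claim that a null pointer constant can have several types can be checked with a small sketch. The macro name TYPE_NAME and the function check_null_spellings are hypothetical and appear neither in N3042 nor in the standard. The sketch only looks at the three spellings named above, because which of them NULL expands to depends on the implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: names the type of an expression via _Generic (C11). */
#define TYPE_NAME(x) _Generic((x), \
    int:     "int",                \
    long:    "long",               \
    void *:  "void*",              \
    default: "other")

/* Checks that the three common spellings of a null pointer constant
   are seen with three different types by a type-generic macro. */
void check_null_spellings(void)
{
    assert(strcmp(TYPE_NAME(0), "int") == 0);
    assert(strcmp(TYPE_NAME(0L), "long") == 0);
    assert(strcmp(TYPE_NAME((void *)0), "void*") == 0);
}
```

A type-generic macro invoked with NULL therefore takes whichever of these branches the implementation's definition happens to select.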
NULL
. It is not, however, an important
problem.
NULL
as sentinel for pointer types could
be done by giving it the proper type. We already make
void*
and char*
“compatible” for the
purpose of va_arg
(only).
NULL
defined as
void*
has no effect whatsoever on C.
Besides, the definitions of NULL
for C and C++
are likely to disagree already. My system’s
NULL
definition is:
/*
* Written by Todd C. Miller, September 9, 2016
* Public domain.
*/
#ifndef NULL
#if !defined(__cplusplus)
#define NULL ((void *)0)
#elif __cplusplus >= 201103L
#define NULL nullptr
#elif defined(__GNUG__)
#define NULL __null
#else
#define NULL 0L
#endif
#endif
It certainly is true that a NULL
defined as
integer constant expression can’t be used as variadic
argument where the callee expects a pointer. There is no fix
for all pointer types other than casting the null pointer, but
for void*
and char*
, the fix is
forbidding a definition of NULL
as integer
constant expression.
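As a minimal sketch of the sentinel case, assuming a hypothetical function total_length (not taken from N3042): the callee reads each variadic argument as char*, so a sentinel of type void* or char* is covered by the va_arg exception mentioned above, whereas a plain 0 would be passed as an int.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: sums the lengths of its string arguments.
   The argument list must end with a null pointer of type void* or char*. */
size_t total_length(const char *first, ...)
{
    size_t sum = 0;
    va_list ap;
    va_start(ap, first);
    /* Each further argument is read as char*, as in the text above. */
    for (const char *s = first; s; s = va_arg(ap, char *))
        sum += strlen(s);
    va_end(ap);
    return sum;
}

/* Both sentinel spellings are fine; a bare 0 would not be. */
void check_total_length(void)
{
    assert(total_length("ab", "cde", (void *)0) == 5);
    assert(total_length("ab", "cde", (char *)0) == 5);
}
```

With NULL defined as ((void *)0), writing total_length("ab", "cde", NULL) would fall under the first of the two calls.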
Rationale
Why do we need a specific nullptr constant?
Null pointer constants in C are a feature that is defined somewhat orthogonally to the type system. They are based on the concept of “integer constant expressions” and may in fact have any integer type (even bool, enumerations, character constants or expressions such as x-x are possible) as long as the value can be determined at translation time and happens to be zero. On top of that ambiguity concerning integer types, it is even permitted to use an explicit cast to void* and to still obtain a null pointer constant.
The standard macro NULL inherits from these confusing definitions and has no standardized type and no standardized behavior in contexts that are different from simple conversion to a pointer type. For example, a use of NULL as an argument to a ... function is not guaranteed to work.
- If NULL has integer type but different alignment or size than void*, any access with va_arg that interprets such an argument could crash the program.
- If NULL has integer type and null pointers are not represented as all-bits zero, such a transferred integer cannot be reinterpreted as a pointer value that would be a null pointer.
- If NULL has integer type (and not void*) and if even the integer type, say long, has the correct size and alignment, an interpretation of that passed-in integer in the form char* a = va_arg(ap, char*); has undefined behavior. As an exception, va_arg allows the reinterpretation between void* and char*, for example, but not from integer type to pointer type.
Note how the last point is not even fixed by
nullptr
.
Also, it is not easy to detect if an argument to a function or even macro is a null pointer constant or only an arbitrary null pointer value. In C, compile time code distinction is usually done in the preprocessor or by _Generic. The preprocessor doesn’t work with NULL because it might not even be a preprocessor constant. _Generic is difficult to use because it is based on types and not values, although there are ways to abuse properties of conditional expressions, integer constant expressions, null pointer constants and _Generic to do so.
This is utter nonsense. You really don’t need to
differentiate between null pointer constants and other null
pointer values; you don’t differentiate between integer
constant expressions and other integer values or perhaps string
literals and other char[]
expressions either. If,
for some reason, this was desired, the solution would be a
facility to do just that—let the function check whether the
argument is a constant expression or a literal or just any
other expression.
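For reference, the kind of abuse of conditional expressions and _Generic that the proposal alludes to can be sketched as follows. The names npc_probe and IS_NULL_POINTER_CONSTANT are hypothetical. The sketch is only meant for arguments that are integer constants or have type void*; other pointer types make the conditional expression a constraint violation.

```c
#include <assert.h>

struct npc_probe;   /* hypothetical tag, deliberately never completed */

/* If X is a null pointer constant, the conditional expression takes the
   type of the other operand, struct npc_probe*. If X is any other void*
   expression, the conditional expression has type void*. */
#define IS_NULL_POINTER_CONSTANT(X)             \
    _Generic((1 ? (X) : (struct npc_probe *)0), \
        struct npc_probe *: 1,                  \
        default: 0)

void check_npc_detection(void)
{
    void *p = 0;   /* a null pointer value, but not a constant */
    assert(IS_NULL_POINTER_CONSTANT(0));
    assert(IS_NULL_POINTER_CONSTANT((void *)0));
    assert(!IS_NULL_POINTER_CONSTANT(p));
}
```

The selection depends on whether the argument is a constant, not on its value at run time, which is exactly the distinction argued above to be unnecessary.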
Another reason to strengthen the definition of null pointer constants in C is the common confusion between a null pointer and a pointer that points to the zero address in the OS, as is suggested by using integer literals such as 0 to express null pointer constants. Also, the fact that on some architectures a null pointer is not necessarily represented with an all-zero bit-pattern always needs special attention when teaching C and is quite surprising for beginners. If these sophistic distinctions were necessary for the expressiveness of the language, that could perhaps be acceptable, but here it clearly is a random burden that is imposed on generations of teachers and students that is only rooted in history and has no raison d’être as of today; all other programming languages that have concepts similar to pointers in C do quite well without this ambiguity between numbers and pointers.
The idea of nullptr is to end this ambiguity and to provide a keyword with a value and a portable type that can be used anywhere where a null pointer constant is needed.
This “ambiguity” is not changed a bit by the introduction
of nullptr
. Even if we have an additional way of
expressing null pointers, the old ones will have to stay. The
overall burden on the student only increases. This does
not simplify the language.
We already have expressions with a value and a portable type
that can be used anywhere a null pointer constant is needed: a
null pointer constant; say, (void*)0
.
The
nullptr
feature presented in this paper has the following properties.
- It has a complete object type.
Same for (void*)0
.
- It does not have scalar type, so it is forbidden in arithmetic.
This is not true. In this revision of the proposal, it does have scalar type.
- It converts to any pointer type.
Same for (void*)0
.
- It converts to bool by always evaluating to false.
Same for (void*)0
.
- In memory, nullptr is represented with the same bit-pattern as a null pointer constant of type void*.
Same for (void*)0
.
- nullptr is permitted in all “Boolean” contexts such as && operators or if statements.
Same for (void*)0
.
- nullptr is permitted as argument to ..., as long as the function interprets it as pointer to void or character type.
Same for (void*)0
.
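The “Same for (void*)0” remarks can be checked mechanically for several of the properties listed. The following is a minimal sketch, assuming a hypothetical macro name VOID_NULL. It covers the complete object type, the conversion to pointer types and to bool, and the use in Boolean contexts.

```c
#include <assert.h>
#include <stdbool.h>

#define VOID_NULL ((void *)0)   /* hypothetical name, for illustration only */

/* "It has a complete object type": sizeof is applicable. */
static_assert(sizeof VOID_NULL == sizeof(void *), "complete object type");

void check_void_null_properties(void)
{
    int *ip = VOID_NULL;            /* converts to any object pointer type */
    const char *cp = VOID_NULL;
    void (*fp)(void) = VOID_NULL;   /* and, being a null pointer constant,
                                       to function pointer types as well */
    bool b = VOID_NULL;             /* converts to bool, evaluating to false */

    assert(ip == 0 && cp == 0 && fp == 0);
    assert(b == false);
    assert(!(VOID_NULL && true));   /* permitted as operand of && */
    if (VOID_NULL)                  /* permitted as condition of if */
        assert(false);
}
```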
The aim is that this feature has exactly the same behavior as the corresponding feature in C++.
If the aim were to enrich the C ABI in a way compatible with C++, I would understand, even if it were not a useful goal; this, I don’t understand. There is no reason to aim for the same behavior as a corresponding C++ feature.
Why do we need a specific nullptr_t type different from void*?
The secondary feature proposed in this paper is the type nullptr_t with the intent to allow better diagnostics for functions that possibly receive a null pointer argument and to potentially optimize the case where a null pointer constant is received.
Consider a function func that receives a pointer parameter that can either be valid or a null pointer to indicate a default choice.
// header "func.h"
void func_general(toto*);
// define a default action
// no parameter name, parameter is never read
inline void func_default(nullptr_t) { ... }
#define func(P) \
  _Generic((P), \
    nullptr_t: func_default, \
    default: func_general)(P)

// one translation unit
#include "func.h"
// emit an external definition
extern void func_default(nullptr_t);
// define the general action
void func_general(toto* p) {
  // p may still have value null
  if (!p) func_default(nullptr); // may only be called with nullptr
  else { ... }
}
Here, a function func_default is defined that receives a nullptr. The function needs no access to the parameter, since that parameter can only hold one specific value. A type-generic macro func then chooses this function or the general function func_general. The translation unit that defines func_general may then emit an external definition of func_default and also use it within the definition for the case that func_general receives a parameter value that is null without being recognized as such at translation time of the call.
#include "func.h"
...
func(0);        // ok, but uses the general function and may issue a diagnostic
func((void*)0); // ok, but uses the general function, no diagnostic
func(NULL);     // ok, but uses the general function, diagnostic or not
func((toto*)0); // ok, but uses the general function, no diagnostic
func(nullptr);  // uses default action directly
The use of the macro with a null pointer constant of integer type then uses the general function and sets the parameter to null; implementations that choose to diagnose the use of null pointer constants of integer type may do so for this call.
In contrast to that, a call that uses nullptr as an argument directly resolves to func_default, may or may not inline the corresponding action, and will not trigger such a diagnosis.
The emission of a diagnosis can be forced by restricting the admissible type as shown in the definition of func_strict.
#define func_strict(P) \
  _Generic((P), \
    nullptr_t: func_default, \
    toto*: func_general)(P)
...
func_strict(0);        // invalid, int argument is not a valid choice, constraint violation
func_strict((void*)0); // invalid, void* argument is not a valid choice, constraint violation
func_strict(NULL);     // invalid, void* or integer argument is not a valid choice, constraint violation
func_strict((toto*)0); // ok, but uses the general function, no diagnostic
func_strict(nullptr);  // uses default action directly
This one example is a giant hack. It’s abusing the generic selection to check not what would ordinarily be the type, but whether the caller provided a constant expression with a certain value.
For this specific example, since you need to document the special handling of nullptr anyway, you could’ve simply provided the two functions as they are: func and func_default. There is no point in trying to squish that extra bit of information (“I know that I definitely want the default; it doesn’t depend on runtime information”) into the one parameter.
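A minimal sketch of that alternative, with two plain functions and no generic selection. The struct layout of toto and the return values are assumptions made up for illustration; the proposal leaves both elided.

```c
#include <assert.h>

/* Hypothetical stand-in for the proposal's toto type. */
typedef struct toto { int value; } toto;

/* Default action: callable directly, takes no parameter at all.
   The return value -1 is a placeholder. */
int func_default(void)
{
    return -1;
}

/* General action: falls back to the default for a null argument. */
int func(toto *p)
{
    if (!p)
        return func_default();
    return p->value;
}

void check_two_functions(void)
{
    toto t = { 42 };
    assert(func(&t) == 42);
    assert(func(0) == func_default());          /* decided at run time */
    assert(func((void *)0) == func_default());
    assert(func_default() == -1);               /* chosen by the caller */
}
```

A caller who knows at translation time that the default is wanted calls func_default() directly; everyone else calls func.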
In general, you will note, this perceived problem is not addressed at all: it’s not specific to pointer parameters. You still can’t use a generic selection to find out whether an integer is an integer constant expression 0 or any other integer, non-zero or non-constant. (No, that’s wrong. You can. You can build a function with an int parameter and a type-generic macro that selects a function without parameter if the argument is of type nullptr_t. Go figure.)
Not only does a new type for pointer values that have
value 0 special-case the type (to pointer types), but
also the value (to 0): If your default is something other than
the null pointer, you’d be ill-advised to interpret
nullptr
as that default. Suddenly,
func((toto*)0)
and func(nullptr)
not
only go slightly different paths (the one first to
func_general
and the other directly to
func_default
), but behave completely differently!
TODO: Design choices and Impact
Prior art
The concept to present a null pointer constant as a keyword that is tightly integrated into the language as is proposed here is present in most other programming languages that have the concept of pointers, for example Pascal, Lisp, Smalltalk, Ruby, Objective-C, Lua, Scala, or Go, often with other spellings such as nil, NIL, None, null or Null. The fact that C still does express this concept with other language features is a rare exception in this picture and only a historic artifact and not a necessity.
It is neither a necessity nor a bug. And note how those languages do not have their own types for null pointers.
The nullptr feature together with nullptr_t is present in C++ since C++11 and has extensive implementation and application experience in that framework. This feature is also given under a different name in the Plan 9 C compiler, named nil. It approximates some of the features provided below, but not all of them.
This is slander. The Plan 9 C compiler has nothing similar to
nullptr
. The Plan 9 C library has
this line:
#define nil ((void*)0)
A plain old macro. Instead of “NULL” it’s called
“nil” and instead of being defined as any null pointer
constant it’s defined as ((void*)0)
. The name
is as fine as “NULL” (if not better—easier to type and
easier on the eyes) and the value is, for once, correct.
This is how it should be.
C users often shift between using literal 0 versus (void*)0 for a library-deployed, macro-based definition. There are various trade-offs for doing this (discussed as part of the design decisions above) that can make this have undesirable behaviors and qualities. Recently, users have tried to move away from their own personal definitions for portability and correctness reasons.
Actually, the design decisions do not discuss those trade-offs.
The introduction of nullptr
and
nullptr_t
does not address all the problems stated
and does not address any real problem better than changing
NULL
’s definition to ((void*)0)
would have.