Declarations

Section 10. Declarations

10.1: How do you decide which integer type to use?

If you might need large values (above 32767 or below -32767), use long. Otherwise, if space is very important (there are large arrays or many structures), use short. Otherwise, use int. If well-defined overflow characteristics are important and/or negative values are not, use the corresponding unsigned types. (But beware of mixing signed and unsigned in expressions.) Similar arguments apply when deciding between float and double.
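The warning about mixing signed and unsigned can be made concrete with a small sketch. The function names are made up for illustration, and the cast in naive_less spells out the conversion that a plain "n < len" would otherwise apply silently when n is an int and len is an unsigned int.

```c
/* Illustrative names only; not from any library. */

/* Compares the way "n < len" would: n is converted to unsigned int
   first, so a negative n becomes a very large positive value. */
int naive_less(int n, unsigned int len)
{
    return (unsigned int)n < len;
}

/* Handles the sign explicitly before comparing magnitudes. */
int safe_less(int n, unsigned int len)
{
    if (n < 0)
        return 1;   /* any negative value is below any unsigned value */
    return (unsigned int)n < len;
}
```

With these definitions, naive_less(-1, 10u) yields 0, while safe_less(-1, 10u) yields 1.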

Although char or unsigned char can be used as a "tiny" int type, doing so is often more trouble than it's worth, due to unpredictable sign extension and increased code size.

These rules obviously don't apply if the address of a variable is taken and must have a particular type.

If for some reason you need to declare something with an exact size (usually the only good reason for doing so is when attempting to conform to some externally-imposed storage layout, but see question 17.3), be sure to encapsulate the choice behind an appropriate typedef.

10.2: What should the 64-bit type on new, 64-bit machines be?

Some vendors of C products for 64-bit machines support 64-bit long ints. Others fear that too much existing code assumes that int and long are the same size (32 bits), and introduce a new 64-bit long long int (or __longlong) type instead.

Programmers interested in writing portable code should therefore insulate their 64-bit type needs behind appropriate typedefs. Vendors who feel compelled to introduce a new, longer integral type should advertise it as being "at least 64 bits" (which is truly new; a type traditional C doesn't have), and not "exactly 64 bits."
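One way to hide the choice behind a typedef might look like the sketch below. The name big_int is hypothetical, and the fallback branch assumes a compiler that provides long long and evaluates #if expressions in a type wide enough for the 64-bit constant (as C99 and later do).

```c
#include <limits.h>

/* big_int is a hypothetical name for "an integer of at least 64 bits". */
#if LONG_MAX >= 9223372036854775807LL
typedef long big_int;        /* long already has 64 bits or more */
#else
typedef long long big_int;   /* assumes the compiler provides long long */
#endif

/* Storage width of big_int in bits, so callers can check the
   "at least 64 bits" promise. */
int big_int_bits(void)
{
    return (int)(sizeof(big_int) * CHAR_BIT);
}
```

The rest of the program uses only big_int, so a port to a different compiler needs to touch just these few lines. Compilers following C99 or later also supply ready-made typedefs of this kind in <stdint.h>, such as int_least64_t.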

10.3: I can't seem to define a linked list successfully. I tried

	typedef struct
		{
		char *item;
		NODEPTR next;
		} *NODEPTR;

but the compiler gave me error messages. Can't a struct in C contain a pointer to itself?

Structs in C can certainly contain pointers to themselves; the discussion and example in section 6.5 of K&R make this clear. The problem with this example is that the NODEPTR typedef is not complete at the point where the "next" field is declared. To fix it, first give the structure a tag ("struct node"). Then, declare the "next" field as "struct node *next;", and/or move the typedef declaration wholly before or wholly after the struct declaration. One corrected version would be

	struct node
		{
		char *item;
		struct node *next;
		};

	typedef struct node *NODEPTR;

There are at least three other equally correct ways of arranging it.

A similar problem, with a similar solution, can arise when attempting to declare a pair of typedef'ed mutually referential structures.
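For the mutually referential case, a minimal sketch along the same lines follows. The tags, typedef names, and the helper function are invented for illustration; the point is only that both tags are declared before either structure body.

```c
/* Declare both tags first, so each typedef can be written
   before either structure body appears. */
struct a;
struct b;
typedef struct a *APTR;
typedef struct b *BPTR;

struct a
{
    int value;
    BPTR partner;
};

struct b
{
    int value;
    APTR partner;
};

/* Points each structure at the other, then reads each value back
   through the partner pointers. */
int link_and_sum(APTR pa, BPTR pb)
{
    pa->partner = pb;
    pb->partner = pa;
    return pa->partner->value + pb->partner->value;
}
```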

References: K&R I Sec. 6.5 p. 101; K&R II Sec. 6.5 p. 139; H&S Sec. 5.6.1 p. 102; ANSI Sec. 3.5.2.3 .

10.4: How do I declare an array of N pointers to functions returning pointers to functions returning pointers to characters?

This question can be answered in at least three ways:

1.  char *(*(*a[N])())();

2.  Build the declaration up in stages, using typedefs:

	typedef char *pc;	/* pointer to char */
	typedef pc fpc();	/* function returning pointer to char */
	typedef fpc *pfpc;	/* pointer to above */
	typedef pfpc fpfpc();	/* function returning... */
	typedef fpfpc *pfpfpc;	/* pointer to... */
	pfpfpc a[N];		/* array of... */

3.  Use the cdecl program, which turns English into C and vice versa:

	cdecl> declare a as array of pointer to function returning pointer to function returning pointer to char
	char *(*(*a[])())()

    cdecl can also explain complicated declarations, help with casts, and indicate which set of parentheses the arguments go in (for complicated function definitions, like the above). Versions of cdecl are in volume 14 of comp.sources.unix (see question 17.12) and K&R II.

Any good book on C should explain how to read these complicated C declarations "inside out" to understand them ("declaration mimics use").
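To see "declaration mimics use" in action, the sketch below declares the array both ways and calls through it. It uses prototype-style (void) parameter lists rather than the empty parentheses of the answer, and every function and variable name other than the typedefs is invented for illustration.

```c
/* Prototype-style versions of the staged typedefs. */
typedef char *pc;
typedef pc fpc(void);
typedef fpc *pfpc;
typedef pfpc fpfpc(void);
typedef fpfpc *pfpfpc;

static char greeting[] = "hello";

/* Innermost level: a function returning pointer to char. */
static char *get_greeting(void)
{
    return greeting;
}

/* Outer level: a function returning a pointer to the function above. */
static pfpc get_getter(void)
{
    return get_greeting;
}

/* The same one-element array, declared both ways. */
pfpfpc a[1] = { get_getter };
char *(*(*b[1])(void))(void) = { get_getter };

/* Use follows the declaration: index, call, call again. */
char *via_typedefs(void)
{
    return a[0]()();
}

char *via_raw_declaration(void)
{
    return (*(*b[0])())();
}
```

Both functions end up returning the same pointer to the greeting array.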

References: K&R II Sec. 5.12 p. 122; H&S Sec. 5.10.1 p. 116.

10.5: I'm building a state machine with a bunch of functions, one for each state. I want to implement state transitions by having each function return a pointer to the next state function. I find a limitation in C's declaration mechanism: there's no way to declare these functions as returning a pointer to a function returning a pointer to a function returning a pointer to a function...

You can't do it directly. Either have the function return a generic function pointer type, and apply a cast before calling through it; or have it return a structure containing only a pointer to a function returning that structure.
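The second approach, a structure wrapping the function pointer, might be sketched as follows. The two states, their names, and the toggle-on-nonzero-input rule are all invented for illustration.

```c
/* A state is a structure holding a pointer to a function that
   returns the next state. */
struct state;
typedef struct state state_fn(int input);

struct state
{
    state_fn *fn;
};

static struct state state_idle(int input);
static struct state state_running(int input);

/* In each state, a nonzero input switches to the other state. */
static struct state state_idle(int input)
{
    struct state next;
    next.fn = input ? state_running : state_idle;
    return next;
}

static struct state state_running(int input)
{
    struct state next;
    next.fn = input ? state_idle : state_running;
    return next;
}

/* Starts in the idle state, feeds n inputs through the machine, and
   returns 1 if it finishes in the running state, 0 otherwise. */
int ends_running(const int *inputs, int n)
{
    struct state s;
    int i;

    s.fn = state_idle;
    for (i = 0; i < n; i++)
        s = s.fn(inputs[i]);
    return s.fn == state_running;
}
```

No casts are needed with this arrangement, at the cost of writing s.fn(...) rather than calling the state directly.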

10.6: My compiler is complaining about an invalid redeclaration of a function, but I only define it once and call it once.

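If a function is called before any declaration of it is in scope, the compiler assumes that it returns an int; the later definition then conflicts with that assumption if the function actually returns something else. Non-int functions must be declared before they are called.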
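A minimal sketch of the fix, with made-up function names: the declaration comes first, so the call can precede the definition without trouble.

```c
/* Declaration (prototype) first, so the call below sees the
   real return type. */
double half(double x);

double half_of_three(void)
{
    return half(3.0);   /* call precedes the definition; fine now */
}

/* The definition agrees with the earlier declaration. */
double half(double x)
{
    return x / 2.0;
}
```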

References: K&R I Sec. 4.2 p. 70; K&R II Sec. 4.2 p. 72; ANSI Sec. 3.3.2.2 .

10.7: What's the best way to declare and define global variables?

First, though there can be many declarations (and in many translation units) of a single "global" (strictly speaking, "external") variable (or function), there must be exactly one definition. (The definition is the declaration that actually allocates space, and provides an initialization value, if any.) It is best to place the definition in some central (to the program, or to the module) .c file, with an external declaration in a header (".h") file, which is #included wherever the declaration is needed. The .c file containing the definition should also #include the header file containing the external declaration, so that the compiler can check that the declarations match.
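The arrangement can be sketched in one listing, with comments marking where each piece would live. The file name config.h and the variable verbosity are hypothetical.

```c
/* --- would live in a header, e.g. a hypothetical "config.h" --- */
extern int verbosity;          /* declaration: allocates nothing */

/* --- would live in exactly one .c file, which also includes
       the header so the compiler can compare the two --- */
int verbosity = 1;             /* definition: allocates and initializes */

/* --- any other .c file that includes the header can use it --- */
int is_verbose(void)
{
    return verbosity > 0;
}
```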

This rule promotes a high degree of portability, and is consistent with the requirements of the ANSI C Standard. Note that Unix compilers and linkers typically use a "common model" which allows multiple (uninitialized) definitions. A few very odd systems may require an explicit initializer to distinguish a definition from an external declaration.

It is possible to use preprocessor tricks to arrange that the declaration need only be typed once, in the header file, and "turned into" a definition, during exactly one #inclusion, via a special #define.
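One common shape for such a trick is sketched below, again collapsed into a single listing. The macro names DEFINE_GLOBALS and EXTERN, the file name globals.h, and the variable are all assumptions made for illustration.

```c
/* --- in exactly one .c file, before including the header --- */
#define DEFINE_GLOBALS

/* --- contents of the header (hypothetical "globals.h") --- */
#ifdef DEFINE_GLOBALS
#define EXTERN                 /* expands to nothing: the line below defines */
#else
#define EXTERN extern          /* the line below merely declares */
#endif

EXTERN int error_count;

/* --- ordinary code using the variable --- */
int note_error(void)
{
    return ++error_count;
}
```

This simple form does not handle variables that need explicit initializers; those would require a further macro.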

References: K&R I Sec. 4.5 pp. 76-7; K&R II Sec. 4.4 pp. 80-1; ANSI Sec. 3.1.2.2 (esp. Rationale), Secs. 3.7, 3.7.2, Sec. F.5.11; H&S Sec. 4.8 pp. 79-80; CT&P Sec. 4.2 pp. 54-56.

10.8: What does `extern` mean in a function declaration?

It can be used as a stylistic hint to indicate that the function's definition is probably in another source file, but there is no formal difference between

        extern int f();

and

        int f();

References: ANSI Sec. 3.1.2.2 .

10.9: I finally figured out the syntax for declaring pointers to functions, but now how do I initialize one?

Use something like

	extern int func();
	int (*fp)() = func;

When the name of a function appears in an expression but is not being called (i.e. is not followed by a "("), it "decays" into a pointer (i.e. it has its address implicitly taken), much as an array name does.

An explicit extern declaration for the function is normally needed, since implicit external function declaration does not happen in this case (again, because the function name is not followed by a "(").

10.10: I've seen different methods used for calling through pointers to functions. What's the story?

Originally, a pointer to a function had to be "turned into" a "real" function, with the * operator (and an extra pair of parentheses, to keep the precedence straight), before calling:

	int r, func(), (*fp)() = func;
	r = (*fp)();

It can also be argued that functions are always called through pointers, but that "real" functions decay implicitly into pointers (in expressions, as they do in initializations) and so cause no trouble. This reasoning, made widespread through pcc and adopted in the ANSI standard, means that

	r = fp();

is legal and works correctly, whether fp is a function or a pointer to one. (The usage has always been unambiguous; there is nothing you ever could have done with a function pointer followed by an argument list except call through it.) An explicit * is harmless, and still allowed (and recommended, if portability to older compilers is important).
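A small compilable sketch of the two call styles side by side follows. The function names are invented, and prototype-style (void) parameter lists are used in place of empty parentheses.

```c
static int answer(void)
{
    return 42;
}

/* Calls answer() through a pointer using both notations and
   returns the result if they agree, or -1 if they do not. */
int call_both_ways(void)
{
    int (*fp)(void) = answer;   /* function name decays to a pointer */
    int r1 = (*fp)();           /* older, explicit style */
    int r2 = fp();              /* style sanctioned by the ANSI standard */
    return (r1 == r2) ? r1 : -1;
}

/* A function name and its explicit address give the same pointer. */
int name_equals_address(void)
{
    int (*p)(void) = answer;
    int (*q)(void) = &answer;
    return p == q;
}
```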

References: ANSI Sec. 3.3.2.2 p. 41, Rationale p. 41.

10.11: What's the `auto` keyword good for?

Nothing; it's obsolete.