Learning C from Java: Differences
Primitive types
In C, the primitive types are referred to using a combination of the keywords char, int, float, double, signed, unsigned, long, short and void. The allowable combinations are listed below but, unlike in Java, their meanings depend on the compiler and platform in use.

unsigned char: The narrowest unsigned integral type, typically (and always at least) 8 bits wide.
signed char: The narrowest signed integral type, of the same width as unsigned char.
char: An integral type equivalent to one or other of the signed/unsigned variants, but its signedness is implementation-dependent. C treats it as a distinct type, though.
unsigned short int, unsigned short: An unsigned integral type at least as wide as unsigned char, typically (and always at least) 16 bits. The int is usually omitted.
signed short int, signed short, short int, short: A signed integral type of the same width as unsigned short int. This is often just called short.
unsigned int, unsigned: An unsigned integral type at least as wide as unsigned short, and usually wider than the char types. 16- or 32-bit widths are common.
signed int, signed, int: A signed integral type of the same size as unsigned int. This is often just called int.
unsigned long int, unsigned long: An unsigned integral type at least as wide as unsigned int, and always at least 32 bits (32- and 64-bit widths are both common). The int is often omitted.
signed long int, signed long, long int, long: A signed integral type of the same size as unsigned long int. This is often just called long.
unsigned long long int, unsigned long long: In C99, an unsigned integral type at least as wide as unsigned long, typically (and always at least) 64 bits. The int is often omitted.
signed long long int, signed long long, long long int, long long: In C99, a signed integral type of the same size as unsigned long long int. This is often just called long long.
float: A single precision floating-point type.
double: A double precision floating-point type.
long double: An extended double precision floating-point type.
void: An empty type. It has no value, and cannot be accessed. As in Java, C functions with no return value are defined to return void. Unlike Java, a function with no parameters has void in its parameter list.

Note that there is no boolean type. Instead, the test conditions of if, while and for statements, and the operands of the logical operators (!, && and ||), are integer expressions with a boolean interpretation: zero means false, non-zero means true. The relational operators (==, !=, <=, >=, < and >) and logical operators return 0 for false and 1 for true.

In C99, there is a boolean type _Bool (which is really just a very small integer type). Including <stdbool.h> makes it available under the name bool, along with the symbolic values true and false (i.e. just 1 and 0), but the other integer types work just as well as before.

Comments
Java allows the use of these forms of comment:
/* a multiline
comment */
// a single line comment

C only allows the former. It is not wise to use such comments to temporarily disable sections of code, since they do not nest. Use the preprocessor (see later) instead:

/* enabled code */
#if 0
/* disabled code */
#endif
/* enabled code */

In C99, the one-line comment is allowed.
Structures instead of classes
C does not allow you to declare class types (as you can in Java using the class construct), but you can declare C structures using the struct construct. A C structure is like a Java class that only contains public data members: there must be no functions, and all parts are visible to any code that knows the declaration. For example:

struct point {
    int x, y;
};

This declares a type called struct point (NB: ‘struct’ is part of the name; point is known as the structure type's tag).

Members of a C structure are accessed using the . operator, as class members can be in Java:

struct point location;
location.x = 10;
location.y = 13;

A structure object may be initialised where it is defined:

struct point location = { 10, 13 }; /* okay; initialisation (part of definition) */
location = { 4, 5 }; /* illegal; assignment (not part of definition) */

In C99, you can create anonymous structure objects (compound literals) to perform compound assignment:

location = (struct point) { 4, 5 }; /* legal in C99 */

In C99, a structure initialisation can specify which members are being set:

struct point location = { .y = 13, .x = 10 }; /* legal in C99 */

Unlike Java, where class variables are references to objects, C structure variables are the objects themselves. Assigning one to another causes copying of the members:

struct point a = { 1, 2 };
struct point b;
b = a; /* copies a.x to b.x, and a.y to b.y */
b.x = 10; /* does not affect a.x */

Unions
C allows an area of memory to be occupied by data of several types, though only one at a time, using a union. Unions are syntactically similar to structures:

union number {
    char c;
    int i;
    float f;
    double d;
};

This declares a type called union number (NB: ‘union’ is part of the name; number is known as the union's tag).

Members of a C union are accessed using the . operator, just as structure members are accessed:

union number n;
int j;
n.i = 10;
j = n.i;

Only the member to which a value was last assigned contains valid information to be read. There is no way to determine that member implicitly, so the programmer must take steps to identify it, for example, by using a separate variable to indicate the type:

union number n;
enum { CHAR, INT, FLOAT, DOUBLE } nt;
n.i = 10;
nt = INT;
switch (nt) {
case CHAR:
    /* access n.c */
    break;
case INT:
    /* access n.i */
    break;
case FLOAT:
    /* access n.f */
    break;
case DOUBLE:
    /* access n.d */
    break;
}

Java does not have unions, although it is possible for a reference to refer to an object of any class derived from its own. A reference of type java.lang.Object can refer to any class of object, since all classes are ultimately derived from java.lang.Object.

Single namespace for functions and global variables

Each class in Java defines a namespace which allows functions and variables in separate, unrelated classes to share the same name. When identifying a function or variable in Java, the namespace must be expressed, or implied using an import directive; for example, the method java.lang.Integer.toString() is distinct from java.lang.Long.toString(). Java packages allow distinct classes and interfaces to share the same name; for example, the name Object could refer to either java.lang.Object or org.omg.CORBA.Object.

In C, all functions are global, and must share a single namespace (i.e. one per program). Global variables can also be declared and defined, and they also share that namespace. Care must be taken in choosing names for functions in large projects, and often a strategy of using a common prefix for groups of related functions is employed, e.g. WSA prefixes most of the WinSock functions.

Note that other namespaces exist in C: a single namespace is shared by the tags of all structures, unions and enumerations; each structure and union holds a unique namespace for its members; each block statement holds a namespace for local variables.

Lack of function name overloading
In Java, two functions in the same namespace may share the same name if their parameter types are sufficiently different. In C, this is simply not the case, and all function names must be unique.

void myfunc(int a)
{
    /* ... */
}

void myfunc(float b) /* error: myfunc already defined */
{
    /* ... */
}

Type aliasing
New names or aliases for existing types may be created using typedef. For example:

typedef int int32_t;

This allows int32_t to be used anywhere in place of int, and such aliases are often used to hide implementation- or platform-specific details, or to allow the choice of a widely-used type to be changed easily.

typedefs are also useful for expressing complex compound types. For example, a prototype for the standard-library function signal has the following, rather cryptic form (in ISO C):

void (*signal(int signum, void (*handler)(int)))(int);

Erm, what? It becomes a little clearer when it is declared with the help of a typedef, as the GNU C library does (sighandler_t is a GNU extension rather than part of ISO C):

typedef void (*sighandler_t)(int);
sighandler_t signal(int signum, sighandler_t handler);

Now we can see that the function's second argument has the same type as its return value, and that that type is, in fact, a pointer-to-function type.

Note that a typedef is syntactically similar to a variable declaration, with the new type name appearing in the place of the variable name.

There is no equivalent of type aliasing in Java.
Declarations and definitions
C programs are built from collections of functions (which have behaviour) and objects (which have values; variables are objects), the natures of which are indicated by their types. C compilers read through source files sequentially, looking for names of types, objects and functions being referred to by other types, objects and functions.

A declaration of a type, object or function tells the compiler that a name exists and how it may be used, and so may be referred to later in the file. If the compiler encounters a name that does not have a preceding declaration, it may generate an error or a warning because it does not understand how the name is to be used.

In contrast, a Java compiler can look forward or back, or even into other source files, to find definitions for referenced names.

A definition of an object or function tells the compiler which module the object or function is in (see “Program modularity”). For an object, the definition may also indicate its initial value. For a function, the definition gives the function's behaviour.

Functions and their prototypes
In Java, the use of a function may appear earlier than its definition. In C, all functions being used in a source file should be declared somewhere earlier than their invocations in that file, allowing the compiler to check if the arguments match the function's formal parameters. A function declaration (or prototype) looks like a function definition, but its body (the code between and including the braces, ‘{’ and ‘}’) is replaced by a semicolon (similar to a native method, or an interface method, in Java). If the compiler finds a function invocation before any declaration, it will try to infer a declaration from the invocation, and this may not match the true definition. (C99 no longer permits such implicit declarations.) A proper declaration can be inferred from a function definition, should that be encountered first.

/* a declaration; parameter names may be omitted */
int power(int base, int exponent);

/* From here until the end of the file, we can make calls to power(),
   even though the definition hasn't been encountered. */

/* a definition; parameter names do not need to match declaration */
int power(int b, int e)
{
    int r = 1;
    while (e-- > 0)
        r *= b;
    return r;
}

Global objects
Global objects also have distinct declarative and definitive forms. A definition may be accompanied by an initialiser, e.g.

int globval = 34; /* initialised */
int another; /* no explicit initialiser; a global object starts as zero */

while a declaration should not have an initialiser, and should be preceded by extern.

extern int globval;
extern int another;

(extern can also appear before a function declaration, but it is optional.)

Local objects
For local objects in C, the definition and declaration are not distinguished. Unlike Java, all local variables must be defined at the beginning of their enclosing block, before any statements are reached. This restriction does not apply in C99.

{
    int x; /* a definition */
    x = 10; /* a statement */
    int y; /* illegal; follows a statement */
}

Furthermore, an iteration variable in a for loop cannot be declared within the initialisation of the statement:

{
    for (int x = 0; x < 10; x++) { /* illegal */
        /* ... */
    }
}

This restriction does not apply in C99.
Scope
All declarations have scope, which is the part of the program in which the declared name is valid. ‘File scope’ means from the declaration to the end of the file, and applies to types, functions and global objects.

‘Block scope’ means from the declaration to the end of the block statement in which it is declared. This always applies to local objects (and formal parameters), but can also apply to types, functions and global objects. All of the following declarations have block scope, and can be used by the trailing statements, but not beyond:

{
    /* a local type */
    typedef int MyInteger;
    /* a local variable */
    MyInteger x;
    /* global variable */
    extern int y;
    /* function (extern is implicit) */
    int power(int base, int exponent);
    /* statements... */
}

Unlike Java, a local variable in an inner block may hide one in an outer block by having the same name:

{
    int x;
    {
        int x; /* hides the other */
    }
    /* first one visible again */
}

Empty parameter lists
In Java, a function that takes no parameters is expressed using (). In C, such a function should be expressed with (void) in its declaration and definition. However, it is still invoked with ():

/* prototype/declaration */
int myfunc(void);

/* definition */
int myfunc(void)
{
    /* ... */
}

/* invocation */
myfunc();

The form () is permitted in declarations, but it means "unspecified arguments" rather than "no arguments". This tells the compiler to abandon type-checking of arguments where that function is invoked, and is not recommended.

Preprocessing
Each C source file undergoes a lexical preprocessing stage which serves several purposes, including conditional compilation and macro expansion. The main purpose is to allow common declarations of types, global data and functions to be conveniently and consistently made available to modules which need to access them. In general, the preprocessor is able to insert, remove or replace text from the source code as it is supplied to the compiler (the original source code doesn't change).

There is no equivalent of preprocessing in Java, but the purposes described below don't usually apply to it anyway.

File inclusion
When a large C program is split over several modules, code in one module may need to make references to named code in another, or may use types that the other module uses. The usual way to achieve this is to precede the reference with a declaration that shows what the name means. Some example declarations:

/* this declares the type struct point */
struct point {
    int x, y;
};

/* this declares the global variable errno */
extern int errno;

/* this declares the function getchar */
int getchar(void);

It would be tedious to repeat such declarations in each source file that requires them (particularly if they need to be modified as the program develops), but these could instead be placed in a separate file (usually with a .h extension), and inserted automatically by the preprocessor when it encounters an #include directive embedded in the source code, for example:

#include "mydecls.h"

These header files are also preprocessed, and so may contain further #include (or other) directives.

Header files containing declarations for the standard library are also available to the preprocessor. These are normally accessed with a variant of the #include directive:

/* include declarations for input/output routines */
#include <stdio.h>

You should normally use the "" form for your own headers rather than <>.

Do not put definitions of functions or variables in header files: it may result in multiple definitions of the same name, so linking will fail. Header files should normally only contain types, function prototypes, variable declarations, and macro definitions. Note that inline functions are exceptional.

Macros
The preprocessor allows macros to be defined which serve a number of purposes:

Some macros are used to hold constants or expressions:

#define PI 3.14159
double pi_twice = PI * 2;

PI will be replaced by the numeric value wherever it is used.

Some macros take arguments:

#define MAX(A,B) ((A) > (B) ? (A) : (B))

These provide a convenient way to emulate functions without the overhead of a real function call. (See a good book on C for the limitations of this.)

Some macros are merely defined to exist:

#define JOB_DONE

and are used in conditional compilation.
Conditional compilation
The preprocessor allows code to be compiled selectively, depending on some condition. For example, if we assume that the macro __unix__ is defined only when compiling for a UNIX system, and that the macro __windows__ is defined only when compiling for a Windows system, then we could provide a single piece of code containing two possible implementations depending on the intended target:

int file_exists(const char *name)
{
#if defined(__unix__)
    /* use UNIX system calls to find out if the file exists */
#elif defined(__windows__)
    /* use Windows system calls to find out if the file exists */
#else
    /* don't know what to do - abort compilation */
#error "No implementation for your platform."
#endif
}

The most common use of conditional compilation, though, is to prevent the declarations in a header file from being made more than once, should the file be inadvertently #included more than once:

/* in the file mydecls.h */
#if !defined(mydecls_header)
#define mydecls_header
typedef int myInteger;
#endif

You should normally protect all your header files in this way.
Pointer types
For every type, there is a pointer type. Since there is an int type, there is also a pointer-to-int type, written int *. float * is the pointer-to-float type. When assigning a pointer value to a variable, or comparing two pointer values, the types must match. Given these declarations:

int i, j;
float f;
int *ip;
float *fp;

…then i is of type int, so the expression &i must be of type int *. ip is of type int *, so you can assign &i to it. &j is of type int *, so it can be compared with &i, and so on.

But &f is of type float *, so it cannot be assigned to ip, or compared with ip, &i or &j.

Dangling pointers
In Java, an object will remain in existence so long as there is a reference to it. In C, an object may go out of existence even if there are pointers to it: the programmer is entirely responsible for ensuring that pointers contain valid addresses (either 0, or the address of an existing object) when used. This badly written function returns a pointer to an integer variable:

int *badfunc(void)
{
    int x = 18;
    return &x; /* bad - x won't exist after the call has finished */
}

The pointer returned by badfunc() is invalid.

Generic pointers
It is sometimes necessary to store or pass pointers without knowing what type they point to. For this, you can use the generic pointer type void *. You can convert between the generic pointer type and other pointer types (but not pointer-to-function types) whenever you need to:

int x;
int *xp, *yp;
void *vp;
xp = &x;
vp = xp; /* types are compatible */
/* later... */
yp = vp; /* types are compatible */

A generic pointer cannot be dereferenced, nor can pointer arithmetic be applied to it.

x = *vp; /* error: cannot dereference void * */
vp++; /* error: cannot do arithmetic on void * */

The generic pointer type simply allows you to tell the compiler that you're taking responsibility for a pointer's interpretation, and so no error messages or warnings are to be reported when assigning. It is the programmer's responsibility to ensure that the pointer value is interpreted as the correct type.

int *ip;
float *fp;
void *vp;
fp = ip; /* error: incompatible types */
vp = ip; /* okay */
fp = vp; /* no compiler error, but is misuse */

Generic pointers are used with dynamic memory management, among other things.

Inline functions
C99 supports inline functions. The programmer can indicate to the compiler that a function's speed is critical by making it inline:

inline int square(int x)
{
    return x * x;
}

If this definition is in scope, and you make a call to it, the compiler may choose not to actually go through the overhead of calling the function, but effectively place a copy of it inside the calling function.

Inline function definitions can (and often should) appear in header files instead of their prototypes. A normal (‘external’) definition must still be provided: for example, some part of your program may try to obtain a pointer to the function, and only a normal definition can provide that.

If the inline definition is in scope, an equivalent external definition can be generated from it by simply redeclaring the function with extern:

extern int square(int x);

If the inline definition isn't in scope, you could provide a normal definition which doesn't actually match the inline definition, but this could lead to confusing behaviour.

Characters and strings
A Java variable of type char can hold any 16-bit Unicode (UTF-16) code unit. In C, the char type can represent any character in a character set that depends on the type of system or platform for which the program is compiled. This is usually a variation of US ASCII, but it doesn't have to be, so beware. In particular, it could be a multibyte encoding, where a larger set of characters is represented by several char objects, e.g. UTF-8; the characters of a basic set, however, are always represented as single chars.

Java strings are objects of class java.lang.String or java.lang.StringBuffer, and represent sequences of char.

Strings in C are just arrays of, or pointers to, char. Functions which handle strings typically assume that the string is terminated with a null character '\0', rather than being passed a length parameter. A character array can be initialised like other arrays:

char word[] = { 'H', 'e', 'l', 'l', 'o', '!', '\0' };
char another[] = "Hello!";

Note that the second initialiser is a shorter form of the first, including the terminating null character. Such a string literal can also appear in an expression. It evaluates to a pointer to the first character.

const char *ptr;
ptr = "Hello!";

ptr now points to an anonymous, statically allocated array of characters. Attempting to write to a string literal like this has undefined behaviour, so the use of const ensures that such attempts are detected while compiling.

Utilities for handling character strings are declared in <string.h>. For example, the function to copy a string from one place to another is declared as:

char *strcpy(char *to, const char *from);

and may be used like this:

#include <string.h>
char words[100];
strcpy(words, "Madam, I'm Adam.");

Like many of the other <string.h> functions, strcpy assumes that you have already allocated sufficient space to store the string, including its terminating null character.

Dynamic memory management
Dynamic memory management is built into Java through its new keyword and its garbage collector. In C, it is available through two functions in <stdlib.h> which are declared as:

void *malloc(size_t s); /* reserve memory for s chars */
void free(void *); /* release memory reserved with malloc() */

(size_t is an alias for an unsigned integral type.)

malloc(s) returns a pointer to the start of a block of memory big enough for s chars. It returns a generic pointer which can be assigned to a pointer of any type. The memory is not initialised. All such allocated memory must be released when it is no longer required by passing a pointer to its start to free(). Only pointer values returned by malloc() (or a null pointer, which free() ignores) can be passed to free().

You can find out the amount of memory needed to store an object of a particular type using sizeof(type). For an array, multiply this by the required size of the array.

long *lp;
long *lap;
lp = malloc(sizeof(long));
lap = malloc(sizeof(long) * 10);
/* now we can access *lp as a long integer,
   and lap[0]..lap[9] form an array */
free(lap);
free(lp);
/* now we can't */

malloc() returns a null pointer (0) if it cannot allocate the requested amount of memory, so the result should be checked before it is used.

Lack of exceptions
Java supports exceptions to cover application-defined mistakes as well as more serious system or memory-access errors (such as accessing beyond the bounds of an array).

In C, application-defined error conditions are normally expressed through careful definition of the meaning of values returned by functions. More serious errors, such as an attempt to access memory that hasn't been allocated in some way, may go unnoticed (because the behaviour is undefined). Write-access to such memory may cause corruption of critical hidden data, which only results in an error at a later stage, so the original cause of the error may be difficult to trace. Just because some activity is illegal in C, it doesn't mean that you will necessarily be told about it, either by the compiler or by the running program.

main() function

In a Java application, execution begins in a static method (void main(String[])) of a specified class. In C, execution also begins at a function called main, but it has the following prototype:

int main(int argc, char **argv);

The parameters represent an array of character strings that form the command that ran the program. argv[0] is usually the name of the program, argv[1] is the first argument, argv[2] is the second, ..., argv[argc - 1] is the last, and argv[argc] is a null pointer. For example, the command

myprog wibbly wobbly

may cause main to be invoked as if by:

char a1[] = "myprog";
char a2[] = "wibbly";
char a3[] = "wobbly";
char *argv[4] = { a1, a2, a3, NULL };
main(3, argv);

The parameters are optional (you can replace them with a single void), but main always returns int in any portable program. Returning 0 tells the environment that the program completed successfully. Other values (implementation-defined) indicate some sort of failure. <stdlib.h> defines the macros EXIT_SUCCESS and EXIT_FAILURE as symbolic return codes.

Standard library facilities
Java comes with a rich and still-developing set of classes to support I/O, networking, GUIs, and other access to a process's environment.

Similarly, the C language has a core of facilities to access its environment. These functions, types and macros form C's Standard Library. It is necessarily limited in order to support maximum portability (it provides no GUI facilities, for example), but it is largely fixed and stable. Access to other facilities (GUI, networking) is through additional libraries that are usually specific to your platform.

The headers of the C Standard Library are briefly summarised below:
<stddef.h>: Some essential macros and additional type declarations
<stdlib.h>: Access to environment; dynamic memory allocation; miscellaneous utilities
<stdio.h>: Streamed input and output of characters
<string.h>: String handling
<ctype.h>: Classification of characters (upper/lower case, alphabetic/numeric etc)
<limits.h>: Implementation-defined limits for integral types
<float.h>: Implementation-defined limits for floating-point types
<math.h>: Mathematical functions
<assert.h>: Diagnostic utilities
<errno.h>: Error identification
<locale.h>: Regional/national variations in character sets, time formats, etc
<stdarg.h>: Support for functions with variable numbers of arguments
<time.h>: Representations of time, and clock access
<signal.h>: Handling of exceptional run-time events
<setjmp.h>: Restoration of execution to a previous state
C95 additionally provides the following headers:
<iso646.h>: Alphabetic names for operators
<wchar.h>: Manipulation of wide-character streams and strings
<wctype.h>: Classification of wide characters (upper/lower case, alphabetic/numeric etc)
C99 additionally provides the following headers:
<stdbool.h>: The boolean type and constants
<complex.h>: The complex types and constants
<inttypes.h>, <stdint.h>: Integer types of specific or minimum widths
<fenv.h>: Access to the floating-point environment
<tgmath.h>: Type-generic mathematics functions