Over the years I have written various things related to programming and hardware. They were published on websites that no longer exist and/or have disappeared into oblivion. I have found a few of them, which I think may still be relevant today, so I will re-publish them here.
I will start with a tutorial on programming in C. I think programming in bare C is still relevant today, even with fancy languages based on C, such as C++, C#, Objective-C and Java. It is good to have a firm grasp of the basics in C, and to be fluent with pointer manipulation and such. It can help you to make your code simpler and more efficient in more modern languages as well.
C: The beginning
----------------
Well, C is a relatively simple language, in that it does not have too many
language constructs. That's because the language is relatively low-level,
it's quite close to the machine, so to say. And the machine is a thing of
anarchy and chaos. C was developed to get some structure in this, without
sacrificing too much performance, and size.
We have:
- Data declarations
- Functions
- Operators
- Flow control statements
- Type definitions
That is what you have to work with. In short, code consists of functions,
containing (flow control) statements, with operations on data of specific
types.
That may seem a bit too much to grasp at this point, but we'll take it one
step at a time.
Data
----
Of course we need data… like text or numbers. On data, we can perform
operations, like adding, subtracting, multiplying and such.
So first, let's see how we can give our programs some data.
We have 2 kinds of data: initialized data, and uninitialized data. They are
declared in much the same way, except that initialized data gets a value
assigned at declaration, and uninitialized data does not. So the initial
value of an uninitialized variable is undefined.
First you give the type of your data variable, then you give the name:
char myChar;
This is an uninitialized data variable. Initialized data works by assigning an
initial value to the variable:
char myChar = 'a';
Well, char is just 1 primitive data type of C. I will give you the complete
list and their sizes in memory here:
- char       1 byte integer, also used for characters.
- int        2 or 4 bytes integer, depending on the system architecture.
- float      4 bytes floating point number.
- double     8 bytes floating point number.
- (pointers) depending on the system architecture.
As you see, some data types are dependent on the system. So to make things
easier, I will choose the popular x86 system in 32 bit mode from now on.
Note that other systems may vary.
There are also 2 'size modifier' directives:
- short   2 bytes
- long    4 bytes
These directives can be prefixed to int, and will determine the size. The C
standard also lets you omit the int, and just use long and short as if they
were primitive types themselves.
So for 2 byte integers, both these declarations are correct:
short int a;
short a;
Similarly for 4 byte integers:
long int a;
long a;
And if there's no size modifying directive for an int, the compiler will use
the default size, which is platform-dependent.
On the x86 system, ints are long, 4 bytes, and so are pointers. Pointers are a
special group of data types, which I will cover later.
(in 16 bit realmode OSes, ints used to default to short, and pointers could
be either near or far. This had to do with the segmented memory model on old
x86 processors (8086, 8088, 80186 and 80286). This legacy system is beyond
the scope of this text, since modern x86 systems use 32 bit addressing. But
when using a realmode OS such as DOS, you have to pay attention to this.)
The integer types are seen as signed numbers by default (for plain char this
is up to the compiler, although on x86 it is usually signed). Signed means
that the number can be both positive and negative. Unsigned variables cannot
have negative values.
This is interesting, because a char is only 1 byte, or 8 bits big. It can take
on 2^8, or 256 values. With signed, this would be -128 up to and including
127. With unsigned, it would be 0 up to and including 255.
You can control this behavior with the signed and unsigned directives:
signed int a;
unsigned int b;
It is also legal to define multiple variables of the same type on one line,
even intermixing initialized and uninitialized data.
It works by simply separating all variables by commas, like this:
unsigned int myVar1, myVar2, myVar3 = 50, myVar4;
Functions
---------
Functions are the core of any C program. They contain the actual code, and
therefore provide the functionality of the program. A function can receive
parameters, the data it will process. And a function can return a value of a
primitive data type. You declare a function in the sequence of return type,
function name, and parameter list (in parentheses):
int MyFunction(int param1, unsigned char param2, signed short param3)
Functions also have the possibility to not return anything. In that case we
have the special void data type. We will see this type again later with
pointers. This data type can also indicate that we want no parameters.
So if you don't need any return value, and no parameters, then you can do:
void MyFunction(void)
(note: there's old-style and new-style for functions with no parameters.
Official ANSI C wants MyFunction(void), but before the ANSI C standard was
introduced, MyFunction() was used. For most compilers, both styles should
work, but some (e.g. Borland) may enforce the ANSI C (void) style.)
Blocks of code are always between curly braces: {}. So the code that goes
into our function is no different. The code block immediately follows our
first line which declared the function prototype.
This might also be a good time to explain how to add comments to your program.
A C comment is prefixed by /* and postfixed by */. Anything between those
symbols is considered a comment, and will not be looked at by the compiler.
A small example:
#include <stdio.h> /* declares puts(); header files are covered later on */

int main(void)
{
    /* Print some text to the screen, using a library function */
    puts("Hello world!");
    /* Exit function with return value */
    return 0;
}
Here we have a function calling the puts() function with a text string as
a parameter (puts() will 'put' the 's'tring on screen; we will look at these
strings later, as well as the puts function), and then returning a signed
integer value of 0 (this is an immediate operand).
Note also that each statement in C is terminated by a semicolon (;).
Now, to look at the calling of functions more closely…
You can use functions from your own source, but you can also import functions
from earlier compiled modules of code, or libraries. Libraries are made up of
a number of modules of code. ANSI C comes with quite a few libraries of code,
which you can use in your programs. With these libraries, you also get header
files, which contain the prototypes of their functions, among other things. We
will look at these header files more closely later on, when we are actually
going to write a program.
Before you can call a function, the compiler needs to know how many parameters
are to be passed to the function, and what types they are. This is done via a
prototype of the function.
If a function is defined in your own source code, above the line where you
want to call it, then the compiler already knows the prototype, since it has
seen the actual function before, and you won't have to do anything.
If a function is below your call, or imported from a code library or module,
then the compiler won't know the function, so we have to provide a prototype
before using the function.
A prototype looks much like the first line of a function, except that the
parameter names are optional, and are usually omitted.
An example of a prototype:
int MyFunction(int x, char y, short z);
Or, omitting the names:
int MyFunction(int, char, short);
Then you're all set to use the function later on in your source.
Calling a function is as simple as filling in the blanks, basically. All you
have to do is write the function name and fill in a value for each parameter
in the prototype. The function will then be called, and the call yields its
return value. This works the same whether the function comes from a library
or from your own source.
For example, if we have a function like this:
double sqrt(double);
which will return the square root of a double we give it (this is the
prototype of the standard library's sqrt(), from math.h), we can make a small
piece of code like this:
double x = 25, sqrtOfX;
sqrtOfX = sqrt(x);
As you can see, you can treat a function call like a number. In this example,
the parameter x will be passed to the function, the function will do its work,
and return the square root of x. And this result will be assigned to the
sqrtOfX variable.
Operators
---------
So now that we know how and where to put our code, the next question of course
will be: "How do I write code?". That is not a trivial question, so we will
break code down to some subsets. Our first subset is "operations on data".
As with most language constructs, C does not have too many operations on data.
Here's the list:
Mathematical operators:
- + : addition.
- - : subtraction.
- * : multiplication.
- / : division.
- % : division remainder/modulus.
Bitwise operators:
- &  : AND
- |  : OR
- ^  : XOR
- ~  : NOT
- >> : shift right
- << : shift left
They all work the same, in that you specify a target variable, then the first
operand, the operator, and then the second operand. The exception is ~ (NOT),
which takes a single operand and is written in front of it.
I will give a small example, with some data:
int destination;
int operand1 = 10;
int operand2 = 20;
destination = operand1 * operand2;
This will assign the value of the expression 10 * 20 to the destination
variable.
Well, to be more precise, the right hand side is an expression, which yields
a result. You could just write this in C:
operand1 * operand2;
This would yield the result, but it never does anything with it. In these
first examples, we will assign the result to a variable, but we will see that
there are other things we can do with expressions, such as combining them to
larger expressions. You could say that the above expression is a 'primitive
expression'.
You can also use immediate operands instead of variables:
int destination;
int operand1 = 15;
destination = operand1 / 3;
This will assign the result of 15 divided by 3 to destination.
There is also shorthand notation for the case where one of the operands is
also the destination variable. The shorthand notation works by putting the
operator directly in front of the equals-sign, and specifying only the other
operand. So:
destination = destination ^ 10;
can be written as:
destination ^= 10;
There's another shorthand case, namely when you want to increase or decrease
the value of a variable by 1 unit. Why do I call it a unit? We'll see that
later, when discussing pointers. For numbers, the unit is simply the number
1. These are the operators for it:
- ++ : increase
- -- : decrease
They work slightly differently from the normal operators. You just prefix or
postfix them to a variable; there is no equals-sign involved. When postfixing
the operator, the value is used in an expression, and afterwards its value is
increased. When prefixing it, the value is increased first, then used in the
expression.
Some examples:
int destination;
int operand1 = 30;
int operand2 = 19;
destination = operand1 - ++operand2;
This will result in the following values:
destination = 30 - 20 = 10
operand1 = 30
operand2 = 20
Postfixing the operator:
destination = operand1 - operand2++;
Gives the following results: