This tutorial assumes you have a decent grasp of the Python language.
C is a compiled low-level language first developed at Bell Labs by Dennis Ritchie in 1972 (source).
As a Python programmer, you are accustomed to a language developed by Guido van Rossum. The reference implementation of Python (CPython) is itself written in C (source).
Additionally, Python is an interpreted language: your source code is executed by an interpreter at run time, whereas a compiled language like C is translated into machine code ahead of time.
Thus, as previously implied, Python is a high-level programming language (far abstracted from raw machine code) while C is a low-level programming language. Furthermore, as a C programmer, you shouldn't expect high-level features such as Object Oriented Programming and lists.
In C, one primarily and fundamentally deals with raw numbers and memory addresses. High-level features such as strings, lists, dictionaries and the like do not exist and must be coded from scratch.
To compile C programs you, of course, will need a compiler. For general purpose projects I'd use mingw-w64 (specifically i686-w64-mingw32 packaged with MSYS2). Mingw-w64 is packaged with a ton of tools and installing it can be tricky for beginners, but it allows one to compile C and C++ programs.
For this tutorial, I'll be using the Tiny C Compiler due to its small size and ease of use. I specifically downloaded tcc-0.9.27-win32-bin since I am on Windows 10. Don't stress too much about 32 bit versus 64 bit (unless, of course, you are running a 32 bit operating system).
Since tcc (Tiny C Compiler) is much simpler than gcc (one of mingw-w64's compilers), the resulting file size is very tiny. We're talking 2KB versus 240KB just for a simple "Hello World!" program.
(Note: For Windows users, I'd recommend adding tcc to PATH. If you do not know what that means, you probably aren't experienced enough in Python to be following this tutorial.)
A basic "Hello World!" program looks like this:
#include <stdio.h>
int main() {
printf("Hello World!\n");
return 0;
}
Here's a line-by-line explanation:
#include <stdio.h> - preprocessor directive functioning similarly to an import statement in Python code.
"std" means standard, "io" means input/output, and ".h" means it is a header file. "stdio.h" declares functions for standard input/output, like printf().
Preprocessor statements do not end with semicolons.
int main() { - defines the program entry point. Unlike Python, to run any code in C you must define a main function; additionally, this function must return an integer (the int).
The returned int signals to the operating system the success or failure of your program: 0 means success, while any nonzero value (conventionally 1) means failure.
printf("Hello World!\n"); - prints a character array to the console. The "f" in printf() stands for formatted (source).
Notice also how the line is terminated with a semicolon; this simply tells the compiler this line is finished (unlike the invisible newline [or '\n'] character that Python uses).
return 0; - simply returns 0, denoting to the system that the program ran successfully.
} - end bracket to define the end of the main() function. Brackets are nested like indentation in Python. Furthermore, C does not rely on indentation at all (a valid program can consist of a single line).
To compile this "Hello World!" program, first create a text file named "main.c". Unlike Java (and similar languages), the name of the .c source file does not correspond to the name of the main() function. You can name your file whatever you want as long as it ends in ".c".
Using tcc in the console, run tcc main.c. A main.exe (Windows) should be generated. Here's an example:
C:\Users\thbop\Desktop\C-For-Python>tcc main.c
C:\Users\thbop\Desktop\C-For-Python>main
Hello World!
C:\Users\thbop\Desktop\C-For-Python>main.exe
Hello World!
One benefit, made immediately apparent when using a compiled language, is that users downloading a program do not need a "compiler" (or "interpreter") to run it; their machine already knows how to run it.
C compilers basically convert C code into assembly, assemble it into an object file (".o"), and finally link it into a machine-code executable.
For this theory dump, I'll show the entire process, generating a program that can run on the 6502 Processor, an older and (relatively) simpler processor compared to modern processors. (6502 assembly is also much simpler to understand compared to modern assembly.)
First, you will have to understand the elementary concept of a Byte. (Simple explanation: 8 bits, 1 bit is an on or off state.)
For this walk-through, I'll be using this program:
// We do not #include <stdio.h> because the 6502 is just a processor; there is no defined standard input/output.
unsigned char main() {
unsigned char i = 0;
while ( i < 10 ) {
i++;
}
return 0;
}
Note: unsigned char declares that variable's / function's type as an unsigned (no negative sign, only positive) character (1 byte). You can think of it as an integer variable that can only store values from 0 to 255 (2^8 = 256 possible values).
Aside from that not-so-obvious (for some) note, this program is relatively simple and its Python "equivalent" is:
i = 0
while i < 10:
i += 1
The abridged 6502 assembly source file (compiled by cc65) looks like this (with comments added):
.segment "CODE"
lda #$00 ; Loads a 0 into the A register; the A register or Accumulator is general purpose single-byte storage in the CPU
jsr pusha ; Pushes A onto the stack (I'll cover the stack later)
jmp L0004 ; Jumps to the while loop (the assembly line starting with "L0004:")
L0002: ldy #$00 ; This section defines the "i++" expression within the while loop
ldx #$00
clc
lda #$01 ; Loads 1 into A so that we can add it to the C variable "i"
adc (sp),y ; Adds A + i
sta (sp),y ; Stores the result in i. After processing this line, the CPU will go to the next line (L0004)
L0004: ldy #$00 ; Start of the while loop line
ldx #$00
lda (sp),y ; Loads the value of i into A
cmp #$0A ; Compares A to 0A (hex representation of 10)
jsr boolult ; Jumps to the boolult subroutine to handle the "<" sign
jne L0002 ; If the comparison is satisfied (i.e. A is less than 10), jump to L0002
ldx #$00 ; When the CPU is on this instruction, the while loop has finished and everything just needs to be cleaned up.
lda #$00
jmp L0001 ; Jump to L0001
L0001: jsr incsp1
rts ; Return from program. This processor can only run one program at a time, so we need to be able to return.
Congratulations! You were reading assembly code!
When assembled, the object file's (".o") raw bytes look something like this:
00000000 55 7A 6E 61 11 00 00 00 60 00 00 00 0B 00 00 00 |Uzna....`.......|
00000010 6B 00 00 00 0F 00 00 00 7A 00 00 00 E5 00 00 00 |k.......z.......|
00000020 5F 01 00 00 20 00 00 00 7F 01 00 00 0C 00 00 00 |_...............|
00000030 8B 01 00 00 02 00 00 00 8E 01 00 00 9C 00 00 00 |................|
...
00000220 1F 01 00 00 00 2B 01 00 00 00 20 00 06 6D 61 69 |.....+.......mai| <--- Notice how the object file stores extra data about the program.
00000230 6E 2E 73 18 63 61 36 35 20 56 32 2E 31 39 20 2D |n.s.ca65.V2.19.-| <--- This data will be used by the linker
00000240 20 47 69 74 20 38 63 33 32 39 64 66 19 63 63 36 |.Git.8c329df.cc6| <---
00000250 35 20 76 20 32 2E 31 39 20 2D 20 47 69 74 20 38 |5.v.2.19.-.Git.8| <---
00000260 63 33 32 39 64 66 02 73 70 04 73 72 65 67 07 72 |c329df.sp.sreg.r| <---
00000270 65 67 73 61 76 65 07 72 65 67 62 61 6E 6B 04 74 |egsave.regbank.t| <---
00000280 6D 70 31 04 74 6D 70 32 04 74 6D 70 33 04 74 6D |mp1.tmp2.tmp3.tm| <---
00000290 70 34 04 70 74 72 31 04 70 74 72 32 04 70 74 72 |p4.ptr1.ptr2.ptr| <---
000002A0 33 04 70 74 72 34 0E 6C 6F 6E 67 62 72 61 6E 63 |3.ptr4.longbranc| <---
000002B0 68 2E 6D 61 63 0B 5F 5F 53 54 41 52 54 55 50 5F |h.mac.__STARTUP_| <---
000002C0 5F 05 5F 6D 61 69 6E 05 70 75 73 68 61 05 4C 30 |_._main.pusha.L0| <---
000002D0 30 30 34 05 4C 30 30 30 32 05 2E 73 69 7A 65 07 |004.L0002..size.| <---
000002E0 62 6F 6F 6C 75 6C 74 05 4C 30 30 30 31 06 69 6E |boolult.L0001.in| <---
000002F0 63 73 70 31 04 43 4F 44 45 06 52 4F 44 41 54 41 |csp1.CODE.RODATA| <---
00000300 03 42 53 53 04 44 41 54 41 08 5A 45 52 4F 50 41 |.BSS.DATA.ZEROPA| <---
00000310 47 45 04 4E 55 4C 4C 00 00 |GE.NULL..|
Now onto the final stage of "compilation," linking.
The linker basically inserts the program bytes from the main.o object file and its dependencies into one executable.
If you found this process and the 6502 interesting, check out the following projects / sources relating to the topic:
This section is going to be very simple compared to the last. So take a breath-- and let's continue.
C has a few common data types:
int - 16 or 32 bit (2 or 4 byte) representation of a signed integer. A signed integer can be negative or positive at the expense of a halved max value.
short - 16 bit or 2 byte representation of a signed integer.
char - 8 bit or 1 byte representation of an integer (whether a plain char is signed depends on the compiler). Commonly used to store a single character or an array/string of characters (more on that later).
float - 32 bit or 4 byte representation of a signed floating point number (decimal number).
double - 64 bit or 8 byte representation of a signed floating point number. A more precise "float" at the expense of memory and processing time.
The integer types (int, short, char) can be prefixed with unsigned to make them unsigned (not negative):
unsigned int a = -10; // Legal, but the value wraps around to a huge positive number (UINT_MAX - 9), which is rarely what you want
They can also be prefixed with const to define them as constant and immutable.
const int a = 76; // "a" cannot be modified
Additionally, int's can be prefixed with long to request an integer of at least 32 bits (4 bytes).
On some systems, int's by default are 16 bit while on other systems they are 32 bit. The keyword long guarantees at least 32 bits; on some 64-bit systems a long is 64 bits (source).
long int a = -45; // at least 32 bits
The sizeof operator, written sizeof(x), can be used to determine the size (in bytes) of a particular variable or type.
You might have already run into this issue while playing with numbers: "I can't print them!"
Don't worry! printf() is defined as:
static inline int printf(const char *__format, ...) { ... }
"Thbop, that doesn't help."
The arguments for printf() are: "const char* __format" (a pointer to constant characters; I'll cover pointers and arrays later, but for now you can think of this as a string) and "..." (any number of additional values, one for each format specifier in the string).
The word "format" should get you thinking of the Python str.format() method. The concept is the same, but the execution is slightly different:
int a = 10;
float b = 65.0f; // float literals like this are written with an "f" on the end
double c = 34.654; // as opposed to double literals, which take no suffix
printf("%d\n", a); // Prints "a" as 10; note %d is the same as %i, I just prefer %d
printf("%d %f %lf\n", a, b, c); // Outputs: 10 65.000000 34.654000
Some common type specifiers are shown below:
Symbol | Type |
---|---|
%d or %i | int |
%x | int (hex representation) |
%u | unsigned int |
%lu | long unsigned int |
%f or %F | float or double |
%lf (long float) | double (printf treats it the same as %f) |
%c | char (actually prints a character) |
%s | string (actually a char*) |
The for-loop in Python has the following structure:
for i in iterator:
In C, there is no iterator object; you spell out the loop yourself in three parts:
for ( variable-definition; condition; runs-every-iteration ) {...}
or, for example:
for ( int i = 0; i < 10; i++ ) {...}
Miscellaneous note: whitespaces do not affect the code in many instances, so:
for(int i=0;i<10;i++){...}//Hasthesameeffectasabove
Additionally, the variable i will only be defined within the scope between the squiggly brackets {}, so:
for ( int i = 0; i < 10; i++ ) {...}
int a = i; // Will result in an error; i is undefined in this scope.
Here are some other examples:
int a = 10;
if ( a == 10 || 5 == 5 ) { // || = or
printf("yes\n");
} else if ( a < 10 && a > 0 ) { // && = and
// do something
} else printf("hmm");
if ( a == 10 )
printf("If there's only one statement after if, {} are not required\n");
switch ( a ) {
case 1: {
printf("a = %d\n", a);
break; // case 1 is satisfied, no other cases will be evaluated
} case 2:
printf("Brackets also do not have to be used here"); // Note: there is subtle difference between having brackets and not having them
break;
default: printf("Neither are satisfied"); break;
}
Arrays and pointers are a source of much frustration for Python programmers because these features are handled more automatically in Python.
For example, in Python you can simply define a list, append, insert, etc to it to your heart's content. In C, it's not that simple. And what's a pointer?
First, let's discuss what an array is. Here's a simple array definition:
int arr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; // Define array with size of 10
for ( int i = 0; i < 10; i++ ) // "Iterate over the array," more like: "loop 10 times"
printf("%d\n", arr[i]); // Print the value at index i
Some notes:
- Negative indexing (arr[-1]) does not exist.
- The size of an array with an initializer must be known at compile time:
int a = 34;
int arr[a] = {0}; // "a" is a variable known at runtime, not compile time; this code would result in an error
// The {0} would initialize the array's values as 0 if this example was valid
To pass an array into a user-defined function, declare an array parameter; any size written there is ignored by the compiler:
int array_process( int arr[11] ) { // Notice this parameter says size 11; this still compiles, but reading index 10 of a 10-element array is out of bounds and yields garbage (random) data
printf("%d\n", arr[9]); // Print the 10th element
return 0;
}
int main() {
int arr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
array_process(arr); // Pass array into function
return 0;
}
What an interesting introduction into pointers! You might have noticed that I've been rather silent on the topic... until now.
The concept to grasp is this: in most expressions, an array is treated as a pointer.
Here's a simple pointer declaration (to leave you even more confused):
int a = 45; // Define a variable; this exists at a particular memory address
int* b = &a; // Define b to point to a; "int*" denotes a pointer variable and "&a" references a (returns its memory address)
printf("%d at %p\n", *b, (void*)b); // "*b" dereferences b (returning the value of a); b's actual value is a memory address, so we print that here with %p
Result (the address and its formatting will differ on your machine): 45 at 0019FF34
Now here's the trick: when used in an expression, an array is converted to a pointer to its first element.
Thus:
int arr[4] = {45, 23, 73, 1}; // An array of ints (not int*); the name arr converts to a pointer to its first element
printf("%d ", *arr); // Dereference first value in array, 45
printf("%d\n", *(arr+3)); // Adding 3 to the address will return the 4th element (or index 3), 1
Result: 45 1
The benefit is that we can rewrite the array_process() function as so:
int array_process( int* arr ) { ... }
The downside is that C performs no runtime bounds checks when accessing and modifying array members:
int array_process( int* arr ) {
printf("%d\n", arr[400]); // Print some random out-of-bound value
return 0;
}
I personally wouldn't recommend messing with out-of-bounds memory at all: it is undefined behavior and doesn't really have any legitimate uses.
Now to clear up strings:
const char* str = "Hello World!"; // String literals are read-only, so point to them with a const char*
printf("%s\n", str);
Yes, I haven't properly covered functions yet. Here's some example code to cover them real quick:
void printintarr( int* arr, int length ) { // A void function returns nothing
printf("{ ");
for ( int i = 0; i < length; i++ ) printf("%d, ", arr[i]);
printf("}\n");
}
int main() { // main() cannot return void, it must always return an int
int a[] = {1, 5, 3, 7, 3, 7, 2, 5};
printintarr(a, 8);
return 0;
}
The closest thing to a class, structs provide a sleek way to group variables in an "OOP" way.
The principal difference between a struct and a class is that a struct cannot have its own methods (or functions). You can, of course, create functions that modify a struct, but they must be external to the struct itself.
Here's an example of a simple struct declaration:
struct vec2 { // Declare struct
float x, y;
};
void printvec2( struct vec2 p ) { // Example of a function that prints the vec2 by passing the struct in directly
printf("vec2(%f, %f)\n", p.x, p.y);
}
void printvec2ptr( struct vec2* p ) { // Passing a pointer to the struct
printf("vec2(%f, %f)\n", p->x, p->y); // Notice when dereferencing struct values we use the "->" symbol instead of "."
}
int main() {
struct vec2 p = {4.0f, 1.0f}; // Declare p as type struct vec2; we initialize it with a list of values (x, then y)
printvec2(p);
printvec2ptr(&p); // Reference p
return 0;
}
As a simplification, we can use typedef:
typedef struct vec2 {
float x, y;
} vec2; // Define a custom type for the struct
vec2 subtractv2( vec2 p, vec2 q ) {
return (vec2){ p.x - q.x, p.y - q.y }; // "(vec2){ ... }" is a compound literal (C99): it builds a vec2 value in place
}
float dot( vec2 p, vec2 q ) {
return p.x * q.x + p.y * q.y;
}
int main() {
vec2 p = {5.0f, 20.0f};
vec2 q = {7.0f, 2.0f};
printf( "%f\n",
dot( subtractv2( p, q ), q )
);
return 0;
}
This introductory "course" did not cover topics like files, stack vs heap, void pointers, and others. Just keep experimenting and researching.
Website last updated Tue Nov 19 15:52:15 2024 by Thbop.