C Library
This page teaches you about the C Library - the bridge between your programs and the kernel.
Important: All source code references point to lib/inc/ for headers and lib/src/ for implementations in the MentOS repository.
libc is the C Standard Library - a collection of functions that:
- Wrap system calls - Make kernel services easy to use (fork(), read(), write())
- Provide utilities - Common functions every program needs (strlen(), malloc(), printf())
- Handle initialization - Set up your program before main() runs
- Handle termination - Clean up after main() returns
Key insight: libc is NOT part of the kernel. It's a separate library that every userspace program is linked with.
Think about this design decision:
Option 1 (Wrong): Put everything in the kernel
Kernel: 10MB (includes strlen, malloc, printf, fork, etc.)
Program 1: Linked with whole kernel
Program 2: Linked with whole kernel
Program 3: Linked with whole kernel
→ Total waste: 30MB+ of RAM (each program carries its own copy of everything, kernel code included!)
Option 2 (Better): Separate libc and kernel
MentOS keeps libc separate from the kernel and statically links it into each userspace program:
Kernel: small and focused (process/memory/files/IPC)
libc: user-mode library (strings, memory, I/O, formatting)
Program 1: statically linked with libc
Program 2: statically linked with libc
Program 3: statically linked with libc
Benefits of separation:
- ✅ Smaller kernel (fewer bugs, easier to test)
- ✅ Clear boundary between kernel and userspace
- ✅ Reusable userspace code (one libc implementation serves every program)
- ✅ Standard interface (POSIX-like compatibility)
Your Program
↓
libc Functions (lib/src/)
├─ String functions (strlen, strcpy, strcat)
├─ Memory (malloc, free, realloc)
├─ I/O (printf, read, write)
├─ Time (time, sleep, clock)
├─ System call wrappers
│ ├─ fork() → sys_fork
│ ├─ exec() → sys_execve
│ ├─ read() → sys_read
│ └─ write() → sys_write
└─ Startup (crt0.S, libc_start.c)
↓
Kernel System Calls
↓
Hardware
These run in userspace as ordinary C code. Most never enter the kernel; malloc() is the exception and calls brk() only when the heap has to grow:
// lib/src/string.c
size_t strlen(const char *s)
{
size_t len = 0;
while (s[len] != '\0') len++;
return len;
}
// lib/src/stdlib.c
void *malloc(size_t size)
{
// Allocate from heap using brk()
// (brk() is a syscall, but malloc wraps it)
}
// lib/src/stdio.c
int sprintf(char *buf, const char *fmt, ...)
{
// Format string into buffer
// No syscall needed!
}
These are fast - they're just C code running in your program.
These are thin wrappers that:
- Take userspace arguments
- Call the kernel via INT 0x80
- Return the result
// lib/src/unistd/fork.c
pid_t fork(void)
{
long __res;
__inline_syscall_0(__res, fork);
__syscall_return(pid_t, __res);
}
// lib/src/sys/ipc.c
long semget(key_t key, int nsems, int semflg)
{
long __res;
__inline_syscall_3(__res, semget, key, nsems, semflg);
__syscall_return(long, __res);
}
// lib/src/unistd/read.c
ssize_t read(int fd, void *buf, size_t nbytes)
{
long __res;
__inline_syscall_3(__res, read, fd, buf, nbytes);
__syscall_return(ssize_t, __res);
}
These are slower - they cross from user to kernel mode.
When you run a program, here's what happens:
1. Kernel loads ELF binary
↓
2. Kernel jumps to _start (in crt0.S)
↓
3. _start (assembly):
- Set EBP to 0
- Push pointer to main
- Call __libc_start_main
↓
4. __libc_start_main (libc_start.c):
- Validate argc/argv/envp (as provided by the kernel)
- Set environ
- Call main(argc, argv, envp)
↓
5. main() returns exit code
↓
6. _start issues int 0x80 exit syscall
Key source files:
- lib/src/crt0.S - Assembly entry point (_start)
- lib/src/libc_start.c - __libc_start_main initialization before main()
One of the most important jobs of libc is dynamic memory allocation.
#include <stdlib.h>
char *data = malloc(1024); // Request 1KB from libc
Behind the scenes:
Heap layout:
┌──────────────────────────────┐
│ Allocated (256 bytes) │ ← malloc returned this
├──────────────────────────────┤
│ Free space │
├──────────────────────────────┤
│ Allocated (512 bytes) │
├──────────────────────────────┤
│ Free space │
└──────────────────────────────┘
malloc() strategies:
1. Track which blocks are allocated/free
2. Find first/best free block
3. Split larger block if needed
4. Return pointer to new block
Implementation in MentOS:
- lib/src/stdlib.c - Main malloc/free implementation
- Uses brk() syscall to expand heap when needed
- Keeps internal list of allocated blocks
void *brk(void *addr); // Move heap end to addr
// Example:
// Initially: heap end = 0x08048000
brk((void *)0x08049000); // Expand heap by 1 page (4096 bytes)
// Now: heap end = 0x08049000
This is a syscall because only the kernel can change memory mappings!
Basic string utilities (no syscalls needed):
#include <string.h>
size_t strlen(const char *s); // String length
char *strcpy(char *dst, const char *src); // Copy string
char *strcat(char *dst, const char *src); // Concatenate
int strcmp(const char *s1, const char *s2); // Compare
Implementation: lib/src/string.c
Why these matter:
- Used by EVERY program
- Must be fast (called millions of times)
- MentOS implements them efficiently in C
#include <stdio.h>
printf("Hello %s, you are %d years old\n", name, age);
How printf works:
printf(format, args)
↓
1. Parse format string
2. Convert arguments to strings
3. Build output buffer
4. Call write() syscall with buffer
↓
write(1, buffer, length) // fd=1 is stdout
↓
Kernel outputs to screen
Key functions:
- printf() - Print to stdout
- fprintf() - Print to a file descriptor (MentOS stdio has no FILE * streams)
- sprintf() - Print to buffer (no syscall!)
- vsprintf() - Like sprintf(), but takes a va_list instead of variadic arguments
Implementation: lib/src/stdio.c, lib/src/vsprintf.c
#include <unistd.h>
char buffer[256];
ssize_t n = read(0, buffer, sizeof(buffer)); // Read from stdin
if (n > 0)
    write(1, buffer, n); // Write only the bytes actually read
These ARE syscalls:
- read() - Calls sys_read (kernel reads from file)
- write() - Calls sys_write (kernel writes to file)
Implementation: lib/src/unistd/read.c, lib/src/unistd/write.c
pid_t pid = fork();
if (pid == 0) {
// Child process
printf("I'm the child!\n");
} else if (pid > 0) {
// Parent process
printf("I created child %d\n", pid);
}
This IS a syscall:
- fork() wrapper calls sys_fork
- Kernel duplicates process
Implementation: lib/src/unistd/fork.c
execve("/bin/cat", argv, envp);
// After this, your program is replaced with cat
// (This line doesn't print - we're now running cat!)
This IS a syscall:
- execve() wrapper calls sys_execve
- Kernel loads new program
Implementation: lib/src/unistd/execve.c and related
exit(0); // Exit with code 0
This IS a syscall:
- exit() wrapper calls sys_exit
- Kernel cleans up process
Implementation: lib/src/unistd/exit.c
Every syscall in MentOS has:
- Kernel declaration - kernel/inc/system/syscall.h
    pid_t sys_fork(pt_regs_t *f);
    ssize_t sys_read(int fd, void *buf, size_t count);
- libc wrapper - lib/src/unistd/<function>.c
    pid_t fork(void)
    {
        long __res;
        __inline_syscall_0(__res, fork);
        __syscall_return(pid_t, __res);
    }
- User calls wrapper - Your program
    pid_t pid = fork(); // Calls the wrapper!
Key insight: libc provides the familiar POSIX interface that hides the syscall machinery.
#include <string.h> // String functions
#include <stdio.h> // I/O functions
#include <stdlib.h> // Memory, exit, etc.
#include <unistd.h> // fork, exec, read, write, sleep
#include <sys/ipc.h> // IPC (semget, msgget, shmget)
#include <sys/sem.h> // Semaphore functions
#include <sys/msg.h> // Message queue functions
MentOS programs are built via CMake and automatically linked against the in-tree libc target. If you build manually, make sure you link against the produced libc static library and use the same cross-compiler flags as the rest of the build.
lib/
├── inc/ # Headers students #include
│ ├── string.h, stdio.h, stdlib.h
│ ├── unistd.h # POSIX API
│ ├── sys/ipc.h, sys/sem.h # IPC headers
│ └── ...
├── src/
│ ├── crt0.S # Entry point ← Kernel jumps here
│ ├── libc_start.c # Initialization
│ ├── string.c # strlen, strcpy, etc.
│ ├── stdio.c, vsprintf.c # printf internals
│ ├── stdlib.c # malloc, free, exit
│ ├── unistd/
│ │ ├── fork.c # fork() wrapper → sys_fork
│ │ ├── execve.c # exec() wrapper → sys_execve
│ │ ├── read.c # read() wrapper → sys_read
│ │ ├── write.c # write() wrapper → sys_write
│ │ └── ...
│ ├── sys/
│ │ ├── ipc.c # semget, msgget, shmget wrappers
│ │ └── ...
│ └── ...
└── CMakeLists.txt # Build configuration
- libc ≠ kernel - It's a separate library providing a POSIX-like interface
- Two types of functions:
  - Library functions (strlen, malloc, printf) - mostly plain C; they enter the kernel only when needed (brk(), write())
  - Syscall wrappers (fork, read, write) - slower, cross to kernel
- Separation is intentional - Smaller kernel, shared code, easier updates
- Your program links with libc - Every program gets these functions
- MentOS libc is POSIX-like - Programs expect these functions to exist
- System Calls - How libc wrappers call the kernel
- Userspace Programs - How to use libc in your programs
- Architecture - Overall libc location in the stack
- POSIX Specification - Full API reference
Standard I/O (fd-based)
MentOS stdio operates on integer file descriptors (no FILE * streams).
Basic I/O:
int putchar(int c); // Write single character
void puts(const char *str); // Write string with newline
int getchar(void); // Read single character
char *gets(char *str); // Read string (unsafe)
int fgetc(int fd); // Read one char from fd
char *fgets(char *buf, int n, int fd);
int fflush(int fd); // Flush (no-op for unbuffered I/O)
Formatted output/input:
int printf(const char *fmt, ...);
int sprintf(char *buf, const char *fmt, ...);
int snprintf(char *buf, size_t size, const char *fmt, ...);
int fprintf(int fd, const char *fmt, ...);
int vfprintf(int fd, const char *fmt, va_list args);
int scanf(const char *fmt, ...);
int sscanf(const char *str, const char *fmt, ...);
int fscanf(int fd, const char *fmt, ...);
int vsprintf(char *buf, const char *fmt, va_list ap);
int vsnprintf(char *buf, size_t size, const char *fmt, va_list ap);
Memory Allocation:
void *malloc(size_t size); // Allocate memory
void *calloc(size_t nmem, size_t sz);// Allocate and zero
void *realloc(void *ptr, size_t sz); // Resize allocation
void free(void *ptr); // Deallocate memory
String Conversion:
int atoi(const char *str); // String to integer
long atol(const char *str); // String to long
long strtol(const char *s, char **ep, int base);
double atof(const char *str); // String to double
char *itoa(int n, char *buf, int base);
Random Numbers:
int rand(void); // Random number
void srand(unsigned int seed); // Seed RNG
Program Control:
void exit(int status); // Exit program
void abort(void); // Abort program
char *getenv(const char *name); // Get environment variable
int setenv(const char *name, const char *value, int overwrite);
Sorting and Searching:
void qsort(void *base, size_t nmem, size_t size,
int (*cmp)(const void *, const void *));
void *bsearch(const void *key, const void *base,
size_t nmem, size_t size,
int (*cmp)(const void *, const void *));
String Manipulation:
char *strcpy(char *dst, const char *src); // Copy string
char *strncpy(char *dst, const char *src, size_t n);
char *strcat(char *dst, const char *src); // Concatenate
char *strncat(char *dst, const char *src, size_t n);
String Examination:
size_t strlen(const char *str); // String length
int strcmp(const char *s1, const char *s2); // Compare strings
int strncmp(const char *s1, const char *s2, size_t n);
int strcasecmp(const char *s1, const char *s2);
char *strchr(const char *str, int c); // Find character
char *strrchr(const char *str, int c); // Find last character
char *strstr(const char *haystack, const char *needle);
size_t strspn(const char *str, const char *accept);
Memory Functions:
void *memcpy(void *dst, const void *src, size_t n);
void *memmove(void *dst, const void *src, size_t n);
void *memset(void *ptr, int val, size_t n);
int memcmp(const void *s1, const void *s2, size_t n);
void *memchr(const void *ptr, int val, size_t n);
Utility Functions:
char *strtok(char *str, const char *delim); // Tokenize string
char *strdup(const char *str); // Duplicate string
char *strerror(int errnum); // Error message
int isalpha(int c); // Alphabetic character
int isdigit(int c); // Decimal digit
int isalnum(int c); // Alphanumeric
int isspace(int c); // Whitespace
int isupper(int c); // Uppercase letter
int islower(int c); // Lowercase letter
int isgraph(int c); // Printable except space
int isprint(int c); // Printable character
int iscntrl(int c); // Control character
int isxdigit(int c); // Hexadecimal digit
int toupper(int c); // Convert to uppercase
int tolower(int c); // Convert to lowercase
Implemented (selected):
double round(double x);
double floor(double x);
double ceil(double x);
double fabs(double x);
float fabsf(float x);
double sqrt(double x);
float sqrtf(float x);
double pow(double base, double exponent);
double exp(double x);
double ln(double x);
double log10(double x);
double logx(double x, double y);
double modf(double x, double *intpart);
int isinf(double x);
int isnan(double x);
time_t time(time_t *timer); // Current time
struct tm *localtime(const time_t *timer); // Local time structure
char *ctime(const time_t *timer); // Time string
char *asctime(const struct tm *timeptr); // ASCII time string
These are wrapper functions that call kernel system calls:
Process Control:
pid_t fork(void); // Create child process
int execve(const char *path, char *const argv[],
char *const envp[]); // Execute program
void exit(int status); // Exit process
int getpid(void); // Get process ID
int getppid(void); // Get parent PID
User/Group:
uid_t getuid(void); // Get user ID
gid_t getgid(void); // Get group ID
int setuid(uid_t uid); // Set user ID
int setgid(gid_t gid); // Set group ID
File Operations:
ssize_t read(int fd, void *buf, size_t n); // Read from file
ssize_t write(int fd, const void *buf, size_t n); // Write to file
off_t lseek(int fd, off_t offset, int whence);// Seek in file
int close(int fd); // Close file
int unlink(const char *path); // Delete file
Directory Operations:
int chdir(const char *path); // Change directory
char *getcwd(char *buf, size_t size); // Get current directory
int mkdir(const char *path, mode_t mode); // Create directory
int rmdir(const char *path); // Remove directory
Signal Handling:
typedef void (*sighandler_t)(int);
sighandler_t signal(int signum, sighandler_t handler);
int sigaction(int signum, const struct sigaction *act,
struct sigaction *oldact);
int kill(pid_t pid, int sig); // Send signal
Timing:
unsigned int sleep(unsigned int seconds); // Sleep
int usleep(unsigned long microseconds); // Sleep microseconds
Password Database:
struct passwd {
char *pw_name; // Username
char *pw_passwd; // Password (if available)
uid_t pw_uid; // User ID
gid_t pw_gid; // Group ID
char *pw_dir; // Home directory
char *pw_shell; // Login shell
};
struct passwd *getpwuid(uid_t uid);
struct passwd *getpwnam(const char *name);
Group Database:
struct group {
char *gr_name; // Group name
char *gr_passwd; // Group password
gid_t gr_gid; // Group ID
char **gr_mem; // Member usernames
};
struct group *getgrgid(gid_t gid);
struct group *getgrnam(const char *name);
extern int errno; // Last error number
char *strerror(int errnum); // Error message
void perror(const char *prefix); // Print error message
void err(int status, const char *fmt, ...);
void warn(const char *fmt, ...);
The C runtime startup file (lib/src/crt0.S) performs x86-specific initialization:
- Entry Point - Initial execution from kernel
- Register Setup - Initialize registers per calling convention
- Stack Setup - Prepare stack for argc/argv
- Global Variables - Initialize .data section
- Call libc_start - Jump to libc_start()
- Call main - Execute user program
- Call exit - Handle program termination
.global _start
_start:
    /* Load arguments */
    mov %ebx, %edi      /* argc (from kernel) */
    mov %ecx, %esi      /* argv (from kernel) */
    /* Call libc initialization */
    call libc_start
    /* libc_start calls main and then exit */
The libc provides wrapper functions around kernel system calls:
// Example: syscall wrapper for read()
#include <unistd.h>
#include <system/syscall_types.h>
ssize_t read(int fd, void *buf, size_t nbytes)
{
long __res;
__inline_syscall_3(__res, read, fd, buf, nbytes);
__syscall_return(ssize_t, __res);
}
Key syscall wrappers in lib/src/unistd/:
- read.c / write.c - File I/O
- fork.c / execve.c - Process control
- open.c / close.c / ioctl.c - File control
- Signal management
- Memory management (brk, mmap)
- And many more...
Lists (list.h / list.c):
typedef struct list_node {
void *data;
struct list_node *next;
} list_node_t;
typedef struct list {
list_node_t *head;
size_t count;
} list_t;
Hash Maps (hashmap.h / hashmap.c):
typedef struct hashmap hashmap_t;
hashmap_t *hashmap_create(size_t capacity);
int hashmap_put(hashmap_t *map, const char *key, void *val);
void *hashmap_get(hashmap_t *map, const char *key);
int hashmap_remove(hashmap_t *map, const char *key);
Ring Buffers (ring_buffer.h):
typedef struct ring_buffer ring_buffer_t;
ring_buffer_t *rbuf_create(size_t capacity);
int rbuf_enqueue(ring_buffer_t *buf, void *item);
void *rbuf_dequeue(ring_buffer_t *buf);
The library is compiled into libc.a static library:
# Build libc
cd build
make libc
# Link with programs
gcc -c program.c -o program.o
gcc program.o -L../libc -lc -o program
When a program starts:
- Kernel loads program - Loads the ELF binary at its configured text address
- Jump to _start - Initial x86 entry point
- _start setup - Initialize registers and stack
- Call libc_start - Library initialization
- Call main - User program entry
- Call exit - Program termination
- Control back to kernel - Cleanup process
Note: For each example program below, add the source file to userspace/bin/CMakeLists.txt, then rebuild:
make programs
make filesystem
Goal: Use libc string functions to work with strings.
Program:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main() {
// String creation and manipulation
char str1[50] = "Hello";
char str2[50] = "World";
char result[100];
// Use libc string functions
    // strlen() returns size_t, so cast to match the %lu specifier
    printf("str1: %s (length: %lu)\n", str1, (unsigned long)strlen(str1));
    printf("str2: %s (length: %lu)\n", str2, (unsigned long)strlen(str2));
// Concatenation
strcpy(result, str1);
strcat(result, " ");
strcat(result, str2);
printf("Combined: %s\n", result);
// Comparison
if (strcmp(str1, str2) < 0) {
printf("%s comes before %s alphabetically\n", str1, str2);
}
// Substring search
char text[] = "The quick brown fox";
char *found = strstr(text, "brown");
if (found) {
        printf("Found 'brown' at position %ld\n", (long)(found - text));
}
return 0;
}
Steps:
- Create userspace/bin/string_test.c with the above code
- Add it to userspace/bin/CMakeLists.txt
- Build: make programs && make filesystem
- Run in MentOS: /bin/string_test
- Bonus: Implement your own strlen() function and compare performance!
Goal: Use malloc/free to allocate and deallocate memory.
Program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
// Allocate array of integers
int *numbers = malloc(10 * sizeof(int));
if (!numbers) {
printf("malloc failed!\n");
return 1;
}
// Fill array
for (int i = 0; i < 10; i++) {
numbers[i] = i * 10;
}
// Print array
printf("Array: ");
for (int i = 0; i < 10; i++) {
printf("%d ", numbers[i]);
}
printf("\n");
// Reallocate to larger size
int *resized = realloc(numbers, 20 * sizeof(int));
if (!resized) {
printf("realloc failed!\n");
free(numbers);
return 1;
}
// Fill new space
for (int i = 10; i < 20; i++) {
resized[i] = i * 10;
}
printf("Resized array size: 20\n");
// Free memory
free(resized);
return 0;
}
Key Points:
- Always check malloc() return value
- sizeof(type) ensures correct allocation
- Use realloc() to grow arrays
- Always free() allocated memory
Test:
- Create and compile
- Run: /bin/malloc_test
- Challenge: Implement your own malloc() tracker to log all allocations/frees
Goal: Master printf formatting for various data types.
Program:
#include <stdio.h>
int main() {
// Integer formats
int num = 42;
printf("Decimal: %d\n", num);
printf("Hex: 0x%x\n", num);
printf("Octal: %o\n", num);
    printf("Binary: (use %%d with bit operations)\n"); // %% prints a literal '%'
// String formats
char str[] = "MentOS";
printf("String: %s\n", str);
printf("String (5 chars): %.5s\n", str);
printf("String (padded): |%10s|\n", str);
printf("String (left): |%-10s|\n", str);
// Floating point (if math lib available)
// float pi = 3.14159;
// printf("Float: %f\n", pi);
// Character
char c = 'A';
printf("Char: %c, ASCII: %d\n", c, c);
// Pointers
    int *ptr = &num;
printf("Pointer: %p\n", (void *)ptr);
return 0;
}
Explore:
- The - flag for left alignment
- Number before the decimal point for width
- Number after the decimal point for precision
- Study lib/src/vsprintf.c to understand how printf works internally
Goal: Use fork/exec/wait to understand process creation.
Program:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main() {
printf("[Parent] PID %d starting\n", getpid());
pid_t pid = fork();
if (pid == 0) {
// Child process
printf("[Child] PID %d: I'm the child!\n", getpid());
printf("[Child] PID %d: My parent is %d\n", getpid(), getppid());
// Replace with new program
char *argv[] = {"sleep", "2", NULL};
        execve("/bin/sleep", argv, NULL);
        // Only reached if execve() fails
        perror("execve");
        return 1;
} else if (pid > 0) {
// Parent process
printf("[Parent] Created child PID %d\n", pid);
printf("[Parent] Waiting for child...\n");
int status;
waitpid(pid, &status, 0);
printf("[Parent] Child exited with status: %d\n", WEXITSTATUS(status));
} else {
perror("fork");
}
return 0;
}
Steps:
- Create program and compile
- Run: observe parent/child messages
- Bonus: Implement process pool (create N children, distribute work)
Goal: Understand how libc functions work internally.
Research these:
- String functions - lib/src/string.c
  - How does strlen() work?
  - Why is it fast?
- Memory allocation - lib/src/stdlib.c
  - How does malloc track allocated blocks?
  - What's the overhead per allocation?
- Printf - lib/src/vsprintf.c
  - How does printf parse format strings?
  - How does it handle different format specifiers?
- Syscall wrappers - lib/src/unistd/*.c
  - Compare with their kernel counterparts
  - Notice how simple the wrappers are!
Challenge: Modify strlen() to count only alphabetic characters. Recompile libc and verify your change works!
- System Calls - Available system calls
- Architecture - Program address space
- Development Guide - Using libc in programs
- POSIX Specifications - For detailed function behavior