
Webserv

C++ Networking HTTP

Available soon on GitHub

WORK IN PROGRESS


Introduction

The Webserv project is a custom, fully functional HTTP server built from scratch in C++98. It adheres to the HTTP/1.1 protocol specification and is designed to handle multiple simultaneous client connections efficiently. This project served as an intensive deep dive into network programming, multi-client communication, and parsing complex HTTP requests.

This project was completed during the 42 common core, in collaboration with maambuhl and lorey.

The HTTP Protocol

The Hypertext Transfer Protocol (HTTP) is the foundation of data communication for the World Wide Web. It is an application layer protocol designed for transmitting hypermedia documents, such as HTML. Essentially, HTTP defines how clients (e.g. web browsers) request information from servers (e.g. Webserv) and how servers transfer the response back.

We chose to implement HTTP/1.1 because of its improvements over HTTP/1.0. The main one is support for persistent connections (keep-alive), which are the default in HTTP/1.1 and let a client reuse a single TCP connection for several requests.

Diagram illustrating the HTTP request, server processing, and final response flow.
This diagram briefly illustrates an HTTP transaction, from the web browser initiating a request to the server processing it and sending back the final response.

I/O Multiplexing with `poll`

`poll` monitors a set of file descriptors (FDs) and reports which ones are ready for I/O (reading or writing). Combined with non-blocking sockets, this lets a single thread serve many clients at once, so we don't need to rely on multithreading.

Anatomy of HTTP: Requests and Responses

The fundamental unit of communication in our web server is the exchange of an HTTP request and its response. Correctly parsing and generating these messages is crucial for protocol compliance.

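A request is structured into three main parts: the request line (method, target, and HTTP version), the header fields, and an optional message body.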
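The first part of a request is the request line, for example `GET /index.html HTTP/1.1`. The sketch below shows one way to split it into its three tokens. The `RequestLine` struct and `parseRequestLine` function are hypothetical names for illustration. The parser is more lenient than the specification, since it accepts any run of whitespace between tokens.

```cpp
#include <sstream>
#include <string>

// Hypothetical container for the three tokens of a request line.
struct RequestLine {
    std::string method;
    std::string target;
    std::string version;
};

// Parses "METHOD target HTTP-version". Returns false if the line does not
// have exactly three tokens or the version does not start with "HTTP/".
bool parseRequestLine(const std::string& line, RequestLine& out)
{
    std::istringstream iss(line);
    std::string extra;
    if (!(iss >> out.method >> out.target >> out.version))
        return false;            // fewer than three tokens
    if (iss >> extra)
        return false;            // more than three tokens
    return out.version.compare(0, 5, "HTTP/") == 0;
}
```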

After processing the request, our server constructs a response for the client: a status line (HTTP version, status code, and reason phrase), the header fields, and an optional body.
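A minimal sketch of how such a response can be serialized is shown below. The function name `buildResponse` and its parameters are assumptions for illustration, and a real server would emit more headers (for example `Date` or `Connection`).

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: serialize a status line, two headers, and a body.
// Every line ends with CRLF, and an empty line separates headers from body.
std::string buildResponse(int status, const std::string& reason,
                          const std::string& contentType,
                          const std::string& body)
{
    std::ostringstream out;
    out << "HTTP/1.1 " << status << ' ' << reason << "\r\n"
        << "Content-Type: " << contentType << "\r\n"
        << "Content-Length: " << body.size() << "\r\n"
        << "\r\n"
        << body;
    return out.str();
}
```

For instance, `buildResponse(404, "Not Found", "text/html", page)` would produce a complete error response whose `Content-Length` matches the size of `page`.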

Screenshots showing communication between Firefox and our server.

Supported HTTP Methods

The server implements three widely used HTTP request methods, each with specific handling logic for security and file operations.

GET

Retrieves data from the specified resource. The server must handle file paths, directory listings (if configured), and manage 404 (Not Found) errors. This is the primary method for static content delivery.

POST

Submits data to be processed to a specified resource, often used for forms or uploading files. The server must correctly parse the request body, handle data size limits, and manage file writing.
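As an illustration of the size-limit check, the sketch below validates a `Content-Length` header value against a configured maximum. The names `BodyCheck` and `checkContentLength` are hypothetical, and the limit is assumed to be expressed in bytes. A malformed value would typically map to a 400 response and an oversized body to 413.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>

enum BodyCheck { BODY_OK, BODY_BAD_LENGTH, BODY_TOO_LARGE };

// Hypothetical helper: check a Content-Length value against a byte limit.
BodyCheck checkContentLength(const std::string& value, unsigned long maxBody)
{
    if (value.empty())
        return BODY_BAD_LENGTH;
    // Only decimal digits are allowed: no sign, no spaces, no suffix.
    for (std::string::size_type i = 0; i < value.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(value[i])))
            return BODY_BAD_LENGTH;
    }
    errno = 0;
    unsigned long n = std::strtoul(value.c_str(), NULL, 10);
    if (errno == ERANGE)         // does not even fit in unsigned long
        return BODY_TOO_LARGE;
    return n > maxBody ? BODY_TOO_LARGE : BODY_OK;
}
```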

DELETE

Deletes the specified resource. Because this method is destructive, it requires careful security checks to prevent unauthorized file removal.
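One such check is rejecting targets that try to climb out of the configured root with `..` segments. The sketch below is a hypothetical helper (`isSafeTarget` is not a name from the project) and assumes the target has already been percent-decoded. It does not account for symbolic links.

```cpp
#include <string>

// Hypothetical helper: returns false if walking the path segments of
// `target` would ever step above the root directory.
bool isSafeTarget(const std::string& target)
{
    if (target.empty() || target[0] != '/')
        return false;
    int depth = 0;                               // current depth below root
    std::string::size_type i = 1;
    while (i <= target.size()) {
        std::string::size_type j = target.find('/', i);
        if (j == std::string::npos)
            j = target.size();
        std::string seg = target.substr(i, j - i);
        if (seg == "..") {
            if (--depth < 0)
                return false;                    // escaped the root
        } else if (!seg.empty() && seg != ".") {
            ++depth;
        }
        i = j + 1;
    }
    return true;
}
```

With this check, `/upload/a.txt` and `/a/../b` are accepted, while `/../etc/passwd` is refused before any file is touched.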

The NGINX-like Configuration File

To allow flexible setup and easy configuration of multiple servers, we designed a custom configuration file format heavily inspired by NGINX. Our C++ parser processes this file, enabling dynamic configuration of ports, server names, file paths, and method restrictions.

An example configuration showing some of the supported features:

server {
    name                        moteurX;
    interface                   localhost;
    listen                      4242;
    send_timeout                60;
    error_pages                 400 404 417 /www/4xx.html;
    chunk_size                  16384;

    location / {
        root                    /www;
        methods                 GET;
        index                   index.html index.htm;
        client_max_body_size    1000;
    }
    location /cgi-bin {
        root                    /;
        index                   hello.py;
        methods                 GET;
        upload_authorized       on;
        storage_location        www/upload;
        autoindex               off;
        cgi_path                /usr/bin/python3;
        cgi_ext                 .py;
    }
    location /upload {
        root                    /;
        autoindex               on;
        methods                 GET/POST/DELETE;
        client_max_body_size    100000;
        upload_authorized       on;
        storage_location        /;
    }
    location /search {
        return                  301 /upload;
    }
}
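A common first step when parsing a format like this is to split the text into tokens, treating `{`, `}`, and `;` as tokens of their own. The sketch below is a hypothetical tokenizer, not the project's actual parser, and it assumes the format has no quoting or comments.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical tokenizer: words are separated by whitespace, and the
// punctuation characters '{', '}' and ';' are emitted as separate tokens.
std::vector<std::string> tokenize(const std::string& text)
{
    std::vector<std::string> tokens;
    std::string cur;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        char c = text[i];
        bool space = std::isspace(static_cast<unsigned char>(c)) != 0;
        if (space || c == '{' || c == '}' || c == ';') {
            if (!cur.empty()) {          // flush the word in progress
                tokens.push_back(cur);
                cur.clear();
            }
            if (!space)
                tokens.push_back(std::string(1, c));
        } else {
            cur += c;
        }
    }
    if (!cur.empty())
        tokens.push_back(cur);
    return tokens;
}
```

A parser can then walk the token list, opening a block on `{`, closing it on `}`, and treating everything between a directive name and `;` as its arguments.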
		

Common Gateway Interface (CGI)

A static server can only serve files. To introduce dynamic content and run external scripts (like Python or PHP), we implemented support for the Common Gateway Interface (CGI).

CGI support is implemented with process management functions such as fork(), dup2(), execve() and pipe(). The script (e.g. Python, PHP, Perl) is executed in a child process, and its output is sent to the client as the body of the response.
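The way these calls fit together can be sketched as follows. The function `runCgi` is a hypothetical simplification: it reads the child's output with a blocking loop, whereas a `poll`-based server would more likely register the read end of the pipe with its event loop. It also does not feed a request body to the script's standard input.

```cpp
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run `path` and capture everything it writes to stdout.
// Returns true only if the child exited normally with status 0.
bool runCgi(const char* path, char* const args[], char* const env[],
            std::string& output)
{
    int fds[2];
    if (pipe(fds) == -1)
        return false;
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {                      // child: becomes the script
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(126);
        close(fds[1]);
        execve(path, args, env);
        _exit(127);                      // only reached if execve failed
    }
    close(fds[1]);                       // parent: keep only the read end
    output.clear();
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof(buf))) > 0)
        output.append(buf, static_cast<std::size_t>(n));
    close(fds[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Closing the write end in the parent matters: without it, `read` would never see end-of-file once the script finishes.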

Our file management CGI script, inspired by Windows XP.