Do Not Fear Systems Programming
Author: Alexander Avery
Posted: Tue | Mar 12, 2024
computer-science linux
First time with low-level networking
In my junior year of college, I took a computer networking course. It was my introduction to C++, glibc, sockets, and other topics, covered in around six assignments. These assignments included making a chat program, and a client and server for a subset of FTP.
At that point, I had written HTTP servers, but had never dealt with raw bytes going over sockets. The beginning was intimidating, but the course and professor primed my intrigue. After the final, I had changed from avoiding systems programming to being distantly interested.
Today, I program on sockets as a hobby.
Most recently, I wrote a MUD that uses only TCP, via the net package from the Go stdlib, for communication.
It’s a decently concurrent program and a real joy to write in Go.
But, it got me wondering: how would I implement this program in a language without CSP-inspired features?
Channels, select statements, and goroutines are fundamental to my understanding of concurrent programs, but they do a lot of work behind the scenes. I wanted to challenge myself to see if I could write a concurrent program without these helpful features.
Writing the ’tick’ server in Go
I’ll begin by demonstrating how easily we can write a server in Go that can orchestrate multiple client connections. This server will send ’tick’ messages to connected clients each second, and allow clients to send messages of their own.
package main

import (
	"bufio"
	"io"
	"log"
	"net"
	"time"
)

// read all messages from client.
func read(rc io.ReadCloser, id int, msg chan<- string, done chan<- int) {
	defer rc.Close()
	scanner := bufio.NewScanner(rc)
	for scanner.Scan() {
		msg <- scanner.Text()
	}
	// we are done once the scanner has reached io.EOF
	done <- id
}

// write 'tick' message to all clients.
func write(conns []net.Conn) {
	for _, c := range conns {
		// some elements in the slice may
		// be nil, representing a disconnected client.
		if c != nil {
			io.WriteString(c, "tick\n")
		}
	}
}

// add a client and read messages if there is room.
func addClient(conns []net.Conn, conn net.Conn, msg chan<- string, done chan<- int) []net.Conn {
	for i, c := range conns {
		// the first available space will
		// receive the incoming client.
		if c == nil {
			go read(conn, i, msg, done)
			conns[i] = conn
			return conns
		}
	}
	// reject the client if there was no available
	// space in the slice.
	conn.Close()
	return conns
}

func main() {
	conns := make([]net.Conn, 15)
	connC := make(chan net.Conn, 0)
	msgC := make(chan string, 0)
	doneC := make(chan int, 0)

	listener, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatalf("listening on :8080: %s", err.Error())
	}

	go func() {
		for {
			c, err := listener.Accept()
			if err == nil {
				connC <- c
			}
		}
	}()

	// we will message all clients when the
	// channel on this ticker receives a value.
	ticker := time.NewTicker(time.Second)

	for {
		select {
		// when we receive a connection, we attempt to add
		// the new client
		case conn := <-connC:
			conns = addClient(conns, conn, msgC, doneC)
		// if a client completes, we free up space in the slice.
		case id := <-doneC:
			conns[id] = nil
		// we log any message a client sends us.
		case msg := <-msgC:
			log.Printf("client said: %s", msg)
		// on each tick, we message our connected clients.
		case <-ticker.C:
			write(conns)
		}
	}
}
Assuming you already know Go, the above program should be fairly straightforward to understand.
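If you would like to watch it run, a throwaway client only takes a few lines of Go. This is just a quick sketch for testing, and it assumes the server above is listening on 127.0.0.1:8080.
package main

import (
	"bufio"
	"fmt"
	"log"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "127.0.0.1:8080")
	if err != nil {
		log.Fatalf("dialing server: %s", err)
	}
	defer conn.Close()

	// send one message, then print everything the server
	// sends back, which should be a 'tick' line every second.
	fmt.Fprintln(conn, "hello from a test client")

	scanner := bufio.NewScanner(conn)
	for scanner.Scan() {
		log.Println(scanner.Text())
	}
}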
Writing the ’tick’ server in Hare
Now comes the challenge of writing the same program in a language that has no CSP features.
Our implementation will rely on the poll(2)
API.
I recommend reading the man page before or after reading the following program.
We’ll introduce the program one function at a time, so it is easier to understand.
Accepting TCP connections in Hare
The first function we can define is one that accepts client connections.
// tryconn attempts to add a client to a pollfd slice.
fn tryconn(fds: []poll::pollfd) void = {
	let socket = fds[0];
	if (socket.revents & poll::event::POLLIN > 0) {
		// try to accept the connection.
		let conn = match(net::accept(socket.fd)) {
		case let err: net::error =>
			log::println("failed to accept connection");
			return;
		case let sock: net::socket =>
			yield sock;
		};

		// the events we want to listen for
		let events = (poll::event::POLLIN | poll::event::POLLPRI);

		for (let i = 1z; i < len(fds); i += 1) {
			// empty slot is indicated by a
			// file descriptor equal to 0
			if (fds[i].fd == 0) {
				fds[i].fd = conn;
				fds[i].events = events;
				return;
			};
		};

		// close the connection if there is no room
		io::close(conn)!;
	};
};
Unlike our Go program, we are dealing with a pollfd slice. A pollfd is a struct that holds a file descriptor, an event mask, and a set of returned events. The returned events, stored in revents, are populated by the kernel if some event has taken place on the file descriptor. In this program, the socket on which we listen for connections is also a pollfd within the slice. We treat the first element in our slice as the listening socket of our server. The presence of POLLIN on the listening socket indicates an incoming connection.
Once we yield our new client connection, we define a variable named events that masks incoming events. The only events we should receive are those present in the mask. This program only needs to listen for POLLIN, but we include POLLPRI to demonstrate masking multiple event types. Later, when we receive these events, they will be set on the revents field of our pollfd.
- POLLIN on revents indicates that there is data to read from the file descriptor.
- POLLPRI on revents indicates that there is an exceptional condition on the file descriptor.
Next, we try to add our connected client to our pollfd slice. If a slot in our slice is available for a new connection, the file descriptor will be equal to zero. If we find an empty slot, we store the net::socket, which is just an io::file, in our empty pollfd. In the case where no empty slot is found, we will simply reject the client by closing the connection.
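If the pollfd bookkeeping is easier to follow in a familiar language, here is a rough Go sketch of the same idea using the poll(2) bindings from golang.org/x/sys/unix. The helper name tryConn and the package name are my own, not part of the original programs; the sketch mirrors the slot layout above, with fds[0] as the listener and a zero file descriptor marking an empty slot.
package tickpoll

import "golang.org/x/sys/unix"

// tryConn mirrors the Hare tryconn: if the listening socket in fds[0]
// is readable, accept the connection and place it in the first empty slot.
func tryConn(fds []unix.PollFd) {
	if fds[0].Revents&unix.POLLIN == 0 {
		return
	}

	nfd, _, err := unix.Accept(int(fds[0].Fd))
	if err != nil {
		return
	}

	for i := 1; i < len(fds); i++ {
		if fds[i].Fd == 0 {
			fds[i].Fd = int32(nfd)
			fds[i].Events = unix.POLLIN | unix.POLLPRI
			return
		}
	}

	// no free slot: reject the client
	unix.Close(nfd)
}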
Reading client messages in Hare
In our Go program, each client had a corresponding goroutine that read incoming data.
In our Hare program, we read incoming data by checking revents on each pollfd in a loop.
// tryread will attempt to read from all connected clients.
fn tryread(fds: []poll::pollfd) void = {
	for (let i = 1z; i < len(fds); i += 1) {
		let revents = fds[i].revents;

		// just close the file descriptor
		// on error or hangup events
		if (revents & (poll::event::POLLERR | poll::event::POLLHUP) > 0) {
			closefd(&fds[i]);
			continue;
		};

		// if there is data to read...
		if (revents & poll::event::POLLIN > 0) {
			let scanner = bufio::newscanner(fds[i].fd, 1024z);
			defer bufio::finish(&scanner);

			match(bufio::scan_line(&scanner)) {
			case let msg: const str =>
				log::printfln("client said: {}", msg);
			case =>
				// io::EOF, io::error, or utf8::invalid
				closefd(&fds[i]);
			};
		};
	};
};
We begin by checking for POLLERR or POLLHUP, which are two events we did not include in our event mask. The poll(2) API indicates those values are ignored on events but may be placed on revents anyway.
- POLLERR indicates an error condition on the file descriptor (for a pipe, it is set on the write end when the read end has been closed).
- POLLHUP indicates that the peer has closed its end of the channel (POLLHUP stands for poll hangup).
After receiving POLLHUP there may still be unread data in the channel, but our program will ignore that. If either of those events is present, we close our file descriptor and continue.
Whenever we encounter a POLLIN event, we have data to read. A simple way to read the line is to make a bufio::scanner with a buffer size of 1024 bytes. The Hare documentation indicates we are responsible for freeing the scanner with bufio::finish, so we defer that function. If we successfully read the message, we log it to the console; otherwise, we close the client connection.
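For comparison, the same revents checks map directly onto the constants in golang.org/x/sys/unix. The sketch below trades the Hare scanner for a plain read into a fixed buffer, and it leans on a closeFd helper that is sketched after the next section; none of these names come from the original post.
package tickpoll

import (
	"log"

	"golang.org/x/sys/unix"
)

// tryRead checks every client slot for errors, hangups, or readable data.
func tryRead(fds []unix.PollFd) {
	for i := 1; i < len(fds); i++ {
		re := fds[i].Revents

		// drop the client on error or hangup events
		if re&(unix.POLLERR|unix.POLLHUP) != 0 {
			closeFd(&fds[i])
			continue
		}

		// if there is data to read, read one chunk and log it
		if re&unix.POLLIN != 0 {
			buf := make([]byte, 1024)
			n, err := unix.Read(int(fds[i].Fd), buf)
			if err != nil || n == 0 {
				// treat read errors and EOF as a disconnect
				closeFd(&fds[i])
				continue
			}
			log.Printf("client said: %s", buf[:n])
		}
	}
}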
Closing a client connection in Hare
We can define the function closefd like this:
// closefd closes a provided file descriptor
// and resets all fields to 0.
fn closefd(fd: *poll::pollfd) void = {
	match(io::close(fd.fd)) {
	case void =>
		void;
	case io::error =>
		log::println("failed to close socket");
	};

	fd.fd = 0;
	fd.events = 0;
	fd.revents = 0;
};
It is not enough to just call io::close on our file descriptor. Closing the underlying file descriptor invalidates the descriptor stored on fd. If we don’t also reset the fd field to zero, we will run into issues when calling poll::poll after ending a client connection.
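The same cleanup translates almost mechanically to the Go sketch from the previous sections. closeFd below is my own helper: it closes the descriptor and zeroes the whole unix.PollFd so the slot reads as empty on the next pass.
package tickpoll

import (
	"log"

	"golang.org/x/sys/unix"
)

// closeFd closes the descriptor and zeroes the pollfd so the
// slot is treated as free and poll(2) stops reporting it.
func closeFd(pfd *unix.PollFd) {
	if err := unix.Close(int(pfd.Fd)); err != nil {
		log.Println("failed to close socket:", err)
	}
	pfd.Fd = 0
	pfd.Events = 0
	pfd.Revents = 0
}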
Writing to client connections in Hare
def tick: [5]u8 = ['t', 'i', 'c', 'k', '\n'];

// write sends a 'tick' message to all connected clients.
fn write(fds: []poll::pollfd) void = {
	for (let i = 1z; i < len(fds); i += 1) {
		if (fds[i].fd > 0) {
			io::writeall(fds[i].fd, tick)!;
		};
	};
};
The message we are sending to clients, ’tick’, is defined as a constant in our program. In our write function, we simply check all file descriptors and write ’tick’ to any that are open.
Writing our poll loop in Hare
Finally, we can write our main function. Our first implementation will have a bug, but we will fix that later.
export fn main() void = {
	let socket = tcp::listen(ip::LOCAL_V4, 8080)!;

	// initialize all pollfds with zero values
	let fds: [15]poll::pollfd = [
		poll::pollfd {
			fd = 0,
			events = 0,
			revents = 0,
		}...
	];

	// overwrite the first pollfd with our socket
	fds[0] = poll::pollfd {
		fd = socket,
		events = (poll::event::POLLIN | poll::event::POLLPRI),
		revents = 0
	};

	for (true) {
		match(poll::poll(fds, time::SECOND)) {
		case let n: uint =>
			if (n > 0) {
				tryconn(fds);
				tryread(fds);
			} else {
				write(fds);
			};
		case let err: poll::error =>
			log::fatal("poll failed");
		};
	};
};
First, we listen for connections on a new socket on port 8080. Next, we define a pollfd array with a size of fifteen. We initialize each slot with an empty pollfd, and subsequently overwrite index zero with our socket.
In our main loop, we repeatedly call poll::poll with a timeout of one second. If we get events back, we accept any incoming connections, or read messages from active clients. Otherwise, if we reach the timeout, we write our ’tick’ message to each client.
You can now build this program, and connect up to fourteen clients on your local machine with telnet 127.0.0.1 8080.
The bug in our Hare program
You may notice that this program doesn’t send ’tick’ messages as it should. If you connect with a new client, or read a message from a client, the one-second timeout restarts. That means repeatedly sending messages from clients can indefinitely prevent the timeout from firing. We’ll have to make some changes if we don’t want our writes to be blocked by our data reads and client connections.
Fixing our timeout code in Hare
We need to keep track of our tick events independently of our returned events. A naive solution would be to set our poll timeout to NONBLOCK and check if we should ’tick’ on each iteration. This will certainly work, but the CPU will work needlessly hard with this approach. A better solution is to compute our timeout duration each time we return from poll::poll.
diff --git a/main.ha b/main.ha
index 48babb8..e9450b5 100644
--- a/main.ha
+++ b/main.ha
@@ -102,6 +102,20 @@ fn closefd(fd: *poll::pollfd) void = {
 	fd.revents = 0;
 };
 
+// shouldtick returns a boolean representing if a tick
+// message should be sent and a time::duration indicating
+// when the next tick should be.
+fn shouldtick(next: time::instant) (bool, time::duration) = {
+	let now = time::now(time::clock::MONOTONIC);
+
+	if (time::compare(now, next) >= 0) {
+		return (true, time::SECOND);
+	};
+
+	let diff = time::diff(now, next);
+	return (false, diff);
+};
+
 export fn main() void = {
 	let socket = tcp::listen(ip::LOCAL_V4, 8080)!;
 
@@ -123,17 +137,28 @@ export fn main() void = {
 		revents = 0
 	};
 
+	let start = time::now(time::clock::MONOTONIC);
+	let nexttick = time::add(start, time::SECOND);
+	let wait = time::SECOND;
+
 	for (true) {
-		match(poll::poll(fds, time::SECOND)) {
+		match(poll::poll(fds, wait)) {
 		case let n: uint =>
 			if (n > 0) {
 				tryconn(fds);
 				tryread(fds);
-			} else {
+			};
+
+			let (shouldtick, d) = shouldtick(nexttick);
+			if (shouldtick) {
+				let now = time::now(time::clock::MONOTONIC);
+				nexttick = time::add(now, d);
 				write(fds);
 			};
+			wait = d;
+
 		case let err: poll::error =>
 			log::fatal("poll failed");
 		};
With these changes, poll::poll will only return when we have events, or have reached an appropriate time to ’tick’. On each iteration we check if we reached an appropriate time to ’tick’, regardless of what caused poll::poll to return. This means we don’t skip any ’tick’ messages, and we keep our CPU usage to a reasonable minimum.
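The same deadline arithmetic reads naturally with Go’s time package. The loop below is only a sketch: it reuses the hypothetical tryConn, tryRead, and closeFd helpers from the earlier snippets, adds a small writeTick of my own, and assumes the listener already sits in fds[0].
package tickpoll

import (
	"time"

	"golang.org/x/sys/unix"
)

// writeTick sends 'tick' to every open client descriptor.
func writeTick(fds []unix.PollFd) {
	for i := 1; i < len(fds); i++ {
		if fds[i].Fd > 0 {
			unix.Write(int(fds[i].Fd), []byte("tick\n"))
		}
	}
}

// pollLoop waits for I/O, but never longer than the next tick deadline.
func pollLoop(fds []unix.PollFd) {
	next := time.Now().Add(time.Second)

	for {
		wait := time.Until(next)
		if wait < 0 {
			wait = 0
		}
		// round up so we don't spin on a zero timeout just
		// before the deadline
		ms := int((wait + time.Millisecond - 1) / time.Millisecond)

		n, err := unix.Poll(fds, ms)
		if err != nil {
			// poll(2) can be interrupted by a signal; just retry
			continue
		}
		if n > 0 {
			tryConn(fds)
			tryRead(fds)
		}

		// tick whenever the deadline has passed, regardless of
		// why poll returned
		if !time.Now().Before(next) {
			writeTick(fds)
			next = time.Now().Add(time.Second)
		}
	}
}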
The full server code for both Go and Hare can be found here.
The poll(2) API is just one of several available on Linux that enable concurrent I/O. From here, I’d like to learn how to handle concurrent operations that don’t involve I/O. For example, what if we wanted to send our clients a hash that took some time to compute, instead of ’tick’ messages? To write such programs in a language like Hare, I might have to read the famous CSP paper by Tony Hoare first.