Python Multithreaded TCP Server
Key‑value store protocol with persistence and concurrent clients.
Tech: Python, Sockets, Threads
I built this multithreaded TCP server to explore systems fundamentals that web frameworks usually hide: sockets, concurrency, shared state, and durability. The goal was to implement a small but realistic key-value store that can be driven by multiple concurrent clients while remaining simple enough to reason about. Instead of relying on HTTP and a heavyweight framework, I chose raw TCP with a line-oriented text protocol so I could control message framing, error handling, and shutdown behavior directly. The focus areas were thread safety (using reentrant locks to guard the store), file-backed persistence so data survives restarts, and a clean separation between transport, command parsing, and storage logic so each layer is testable in isolation.
Functionally, the server supports a minimal, Redis-style command set designed for everyday use:
PING,
SET <key> <value>,
GET <key>,
REMOVE <key>,
PRINT (to list present keys), and
QUIT (to disconnect).

Each client connection is handled on its own thread: the main acceptor loop blocks on accept(), spawns a worker thread, and goes back to listening. The store itself is a single dictionary guarded by a reentrant lock (RLock) so that nested operations (such as writes that also trigger a persistence routine) don’t deadlock. Persistence is file-backed: on every mutating command (SET/REMOVE), the server appends an entry to an on-disk log and periodically updates a snapshot file. On startup, recovery replays the log on top of the snapshot to restore the last known state without losing committed operations.
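To make the shape concrete, here is a minimal sketch of that accept-and-spawn loop with an RLock-guarded dictionary. The names and reply strings are illustrative rather than lifted from the project’s server.py, and it uses socket.makefile for line framing (the manual buffering approach appears later):

import socket
import threading

store = {}
store_lock = threading.RLock()  # reentrant, so persistence helpers can re-acquire it

def handle_client(conn, addr):
    # One thread per client: read newline-terminated commands, reply one line each.
    with conn, conn.makefile("r", encoding="utf-8", newline="\n") as reader:
        for line in reader:
            parts = line.strip().split(" ", 2)
            cmd = parts[0].upper()
            if cmd == "PING":
                reply = "PONG"
            elif cmd == "SET" and len(parts) == 3:
                with store_lock:  # brief critical section around the mutation
                    store[parts[1]] = parts[2]
                reply = f"Added key '{parts[1]}' with value '{parts[2]}'"
            elif cmd == "GET" and len(parts) == 2:
                with store_lock:
                    reply = store.get(parts[1], "ERR no such key")
            else:
                reply = "ERR unknown or malformed command"
            conn.sendall((reply + "\n").encode("utf-8"))

def serve(host="0.0.0.0", port=3490):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen()
        print(f"Server listening on {host}:{port}")
        while True:
            conn, addr = srv.accept()  # block until a client connects
            threading.Thread(target=handle_client, args=(conn, addr), daemon=True).start()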
The primary use cases are (1) learning and teaching: this project is a compact sandbox for demonstrating how concurrent servers work, how to design a simple wire protocol, and how to introduce durability without a database; (2) prototyping: scripts, tests, and small services can use this KV store as a lightweight coordination mechanism; and (3) interview prep and portfolio material: it exhibits practical systems knowledge (parsing, sockets, synchronization, and fault handling) alongside thoughtful documentation and tests. Because the protocol is just text lines, any language can speak to it with a few socket calls.
# Clone the git repository
git clone https://github.com/BrianKYildirim/key-value-storage.git
cd key-value-storage
# Start the server
python server.py
# The server will start listening on port 3490 and display:
Server listening on 0.0.0.0:3490
# Open another terminal in the same project directory and run the client:
python client.py
PING -> PONG
SET user:42 Brian -> Added key 'user:42' with value 'Brian'
GET user:42 -> Brian
REMOVE user:42 -> Removed key 'user:42'.
PRINT -> session:abc, ... (lists the other stored key-value pairs)
QUIT -> Exiting client. Connection closed.

The protocol is line-oriented and space-delimited: the command and its arguments are separated by spaces, values are treated as opaque strings, and responses are a single line terminated by a newline. Errors are returned as "ERR <message>" with clear hints (e.g., "ERR wrong number of arguments"). Because the protocol is simple and deterministic, it’s easy to write black-box tests that assert the server’s exact responses.
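Because responses are single lines, a client fits in a dozen lines of any language. Here is a minimal sketch in Python, assuming the server is running on localhost:3490; it is not the project’s client.py, just the bare socket calls:

import socket

def send_command(command, host="localhost", port=3490):
    # One connection, one newline-terminated command, one line back.
    with socket.create_connection((host, port)) as conn:
        conn.sendall((command + "\n").encode("utf-8"))
        buf = b""
        while not buf.endswith(b"\n"):
            chunk = conn.recv(1024)
            if not chunk:  # server closed before a full line arrived
                break
            buf += chunk
        return buf.decode("utf-8").strip()

print(send_command("PING"))              # expect: PONG
print(send_command("SET user:42 Brian"))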
Under the hood, the concurrency model is intentionally conservative and easy to audit. Each client thread performs a blocking read loop with small, bounded buffers to avoid unbounded memory growth. A shared RLock protects the in-memory dictionary, and all mutations take the lock briefly, update the structure, and then trigger persistence. The persistence layer itself avoids long, blocking file I/O on the hot path by batching writes where possible and by using atomic file replacement for snapshots: write to a temp file, fsync it, and then rename it over the old snapshot. This approach is portable across platforms and yields crash-safe snapshots (rename is atomic on most filesystems). Recovery first loads the snapshot and then replays the append-only log to apply any operations that occurred after the last snapshot.
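A minimal sketch of that snapshot-and-replay pattern, assuming JSON for both files; the file names, record format, and function names are illustrative, not the project’s:

import json
import os
import tempfile

SNAPSHOT = "store.snapshot"
LOG = "store.log"

def write_snapshot(store):
    # Write to a temp file in the same directory, fsync, then atomically rename.
    dir_name = os.path.dirname(os.path.abspath(SNAPSHOT))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(store, f)
        f.flush()
        os.fsync(f.fileno())        # bytes are on disk before the rename
    os.replace(tmp_path, SNAPSHOT)  # atomic replacement of the old snapshot

def append_log(op, key, value=None):
    # One JSON record per line; fsync so committed operations survive a crash.
    with open(LOG, "a", encoding="utf-8") as f:
        f.write(json.dumps({"op": op, "key": key, "value": value}) + "\n")
        f.flush()
        os.fsync(f.fileno())

def recover():
    # Load the last snapshot, then replay log entries written after it.
    store = {}
    if os.path.exists(SNAPSHOT):
        with open(SNAPSHOT, encoding="utf-8") as f:
            store = json.load(f)
    if os.path.exists(LOG):
        with open(LOG, encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                if entry["op"] == "SET":
                    store[entry["key"]] = entry["value"]
                elif entry["op"] == "REMOVE":
                    store.pop(entry["key"], None)
    return store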
There were several interesting challenges. First, message framing: TCP is a stream, not a message bus; reads can return partial lines or multiple lines at once. I fixed this by keeping a per-connection buffer and splitting on newline boundaries so that commands are reassembled correctly regardless of how the kernel chunks data. Second, shutdown and resource cleanup: threads must exit cleanly, sockets must be closed, and files must be flushed and fsynced. I added signal handling and a graceful shutdown path that stops accepting new connections, notifies workers, and waits for them to drain. Third, concurrency correctness: without careful lock scoping, it’s easy to introduce deadlocks or data races. Reentrant locks simplified the persistence path where a write operation needs to call a function that also touches the store. Finally, persistence correctness: if the process crashes midway through a write, we must not corrupt the snapshot. Atomic rename and a simple write-ahead log made this both robust and understandable.
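The framing fix is the part most worth seeing in code: a minimal sketch of the per-connection buffer described above, with illustrative names:

def read_lines(conn):
    # TCP is a stream: recv() may return a partial line or several lines at once,
    # so keep a per-connection buffer and emit only complete newline-terminated lines.
    buf = b""
    while True:
        chunk = conn.recv(4096)
        if not chunk:       # empty read: the peer closed the connection
            break
        buf += chunk        # a hardened version would also cap len(buf) here
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            yield line.decode("utf-8").rstrip("\r")

Each yielded string is exactly one complete command, regardless of how the kernel chunked the stream.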
The design is intentionally modest yet extensible. It would be straightforward to add an expiry mechanism (EXPIRE <key> <seconds>), a background compaction that rolls log entries into the snapshot at configurable intervals, batched commands to reduce lock contention, or even a simple binary protocol for high-volume clients. Another direction would be replacing the global lock with a sharded map (consistent hashing based on key) to reduce contention under heavy write loads; the API wouldn’t need to change for clients. Security-wise, the current server is meant for trusted networks; enabling TLS and authentication would be the next production-readiness steps. Finally, containerizing the server with a small entrypoint script makes it easier to run repeatable benchmarks and to deploy it as a sidecar in local development environments.
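As a sketch of that sharding direction (purely hypothetical; the current codebase keeps a single global lock), here is a lock-per-shard map using simple modulo hashing rather than full consistent hashing:

import threading

class ShardedStore:
    def __init__(self, num_shards=16):
        # One dict and one lock per shard; writes to different shards never contend.
        self._shards = [({}, threading.RLock()) for _ in range(num_shards)]

    def _shard(self, key):
        return self._shards[hash(key) % len(self._shards)]

    def set(self, key, value):
        data, lock = self._shard(key)
        with lock:
            data[key] = value

    def get(self, key, default=None):
        data, lock = self._shard(key)
        with lock:
            return data.get(key, default)

Clients would see the same SET/GET semantics; only the locking granularity changes.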
In short, this project exists to demonstrate pragmatic systems engineering: a clear wire protocol, careful concurrency, crash-safe persistence, thorough testing, and meaningful performance measurements. It’s not a replacement for Redis or a production database, but it captures the essential ideas in a compact codebase that I can explain line by line. Building it forced me to think about everything from partial reads to atomic file operations, and it gave me a tangible platform for discussing design tradeoffs—exactly the kind of learning artifact I want in my portfolio.