This post is about coding. If you are not a developer, you can stop reading now.
I read some stuff about the C10K problem, and I am trying to implement an event-driven pattern with asynchronous I/O.
Since the event-driven pattern looks efficient for servers handling many connections, I will briefly explain how I implemented it.
Event-driven pattern means that we treat events as input and process a number of them sequentially until the operations they trigger are done. If the workflow "forks" (i.e. the events are dispatched into two output queues), we schedule a task for another thread to process the other branch.
Asynchronous I/O means that we do not block while waiting for events. Instead, we poll several queues in a non-blocking way. Basically, asynchronous means that data are queued until a thread processes them.
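The post does not show code, so here is a minimal Python sketch of what non-blocking polling can look like (the queue name and event string are mine, purely illustrative):

```python
import queue

# Illustrative event queue: get_nowait() returns immediately instead of
# blocking, so a single thread can poll several queues in turn.
events = queue.Queue()
events.put("connection-ready")

ready = []
for q in (events,):
    try:
        ready.append(q.get_nowait())  # non-blocking poll
    except queue.Empty:
        pass  # nothing queued yet; move on to the next queue
```

The point is that the polling thread never sleeps inside a queue: if a queue is empty, it immediately moves on.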
Putting it all together, we need to define the workflow of the events and cut it into small blocks. Each block is processed asynchronously, i.e. its input is a queue and its output is another queue that serves as the input of the next block.
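As a rough Python sketch (function and queue names are my own, not from the actual code), two chained blocks could look like this, with the output queue of the first block being the input queue of the second:

```python
import queue

def run_block(process, inbox, outbox):
    """Drain the block's input queue, apply its function, feed the next block."""
    while True:
        try:
            event = inbox.get_nowait()
        except queue.Empty:
            break
        outbox.put(process(event))

# Two chained blocks: q2 is both the output of the first and the input of the second.
q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
q1.put("raw-bytes")
run_block(lambda e: f"frame({e})", q1, q2)
run_block(lambda e: f"header({e})", q2, q3)
```

Each `run_block` call is independent, so different blocks can be driven by different threads without sharing anything but the queues.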
In the TCP layer, several blocks are defined; they are called workers.
There is a worker that detects data on the socket and dispatches it according to the context of the socket (authenticated, waiting for authentication, or new). There is a worker to reassemble frames, another to read the message header, and so on. Each worker passes its result to the next one in the workflow. Workers are thread-safe, so they support concurrency.
The workflow is defined as a list of worker functions. Since the workflow is a tree, the "forks" are the nodes of the tree, and there is a list of worker functions for each branch.
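One way to represent such a tree in Python (all worker names and the dict layout are assumptions of mine, not the post's actual structure) is a list of functions for the main lane, plus a mapping from a worker's index to the branch scheduled at that point:

```python
# Toy workers: each appends its name to the event so we can trace the path.
def detect(e): return e + ["detect"]
def reassemble(e): return e + ["reassemble"]
def read_header(e): return e + ["read_header"]
def log_event(e): return e + ["log"]

workflow = {
    "main": [detect, reassemble, read_header],   # high-priority lane
    "forks": {
        # branch scheduled after the worker at index 1 (reassemble)
        1: {"main": [log_event], "forks": {}},
    },
}
```

The nesting mirrors the tree: each branch is itself a workflow, so forks can fork again.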
To execute the workflow, we put the event in the input queue of the first block and run a thread on the first worker of the list, then iterate over the function list to run the second worker, and so on. When we need to fork, we request a task to execute the other branch while continuing on the "main" branch. The "main" branch is executed at once: the same thread executes all its blocks. "Forks" are scheduled tasks executed by other threads.
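A minimal sketch of that execution loop, assuming the tree layout from above and using Python's standard `ThreadPoolExecutor` in place of the post's own ThreadPool (so this is an approximation, not the actual implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def run_branch(funcs, event, pool, forks):
    """Run one branch in the current thread; hand fork branches to the pool."""
    for i, fn in enumerate(funcs):
        event = fn(event)
        branch = forks.get(i)
        if branch is not None:
            # The fork is scheduled on another thread while we continue here.
            pool.submit(run_branch, branch["main"], list(event), pool, branch["forks"])
    return event

with ThreadPoolExecutor(max_workers=2) as pool:
    result = run_branch([lambda e: e + ["a"], lambda e: e + ["b"]], [], pool, {})
```

The calling thread runs the whole main list without ever yielding, which is exactly the "executed at once" behaviour described above.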
Here comes the thread strategy. As said, a single thread executes the main list (the high-priority lane), while the secondary lists are scheduled. Note that we can assign a different priority class to each list. A task queue is managed by a thread pool: we have several threads, but only a limited number of them are active at the same time (a counter controls the number of active threads to avoid excessive context switching).
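The "counter that limits active threads" can be sketched with a semaphore (again a simplified stand-in for the post's ThreadPool class; the limit and the instrumentation are illustrative):

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

ACTIVE_LIMIT = 2
gate = threading.BoundedSemaphore(ACTIVE_LIMIT)  # the "counter"
lock = threading.Lock()
running = 0   # how many tasks are inside the gate right now
peak = 0      # highest concurrency observed, for demonstration

def task(_):
    global running, peak
    with gate:  # blocks while ACTIVE_LIMIT tasks are already running
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.005)  # stand-in for the block's actual work
        with lock:
            running -= 1

with ThreadPoolExecutor(max_workers=8) as pool:  # many threads in the pool...
    list(pool.map(task, range(20)))              # ...but few active at once
```

Even though the pool holds eight threads, at most two ever do work simultaneously, which keeps context switching down.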
The advantage of this pattern is scalability: we can add as many threads as we want to handle a high volume of events. Thread distribution is also nice, since threads always go to the blocks where there is something to do. We can also define priority classes. Finally, the workflows are lists of functions that can be centralized in a class.
This is still a work in progress, but I think the event-driven pattern can be used elsewhere than in the network layer. There is already a ThreadPool class to manage tasks.