noss in #erlang on freenode recently brought to my attention: Jetty Continuations
This is an interesting blog entry. The basic idea is, instead of using 1 thread per connection, since connections can last awhile, they use 1 thread per request that a connection has. The hope being, a connection will idle most of the time and only send requests once in awhile. The problem that they ran into is, a piece of software is using a request timeout to poll for data. So requests are now sticking around for a long time, so they have all these active threads that they don't want. So to deal with this, they use a concept of continuations so the thread can die but the request still hang around, and then once it's ready to be processed a thread is created again and the request is handled. So having all these requests hanging around that arn't doing anything is no longer a problem.
Well, this begs the question, why are you using a dynamic number of threads in the first place if you are going to have to limit how many you can even make. If the problem, in the first place, is they have too many threads running, then their solution works only for idle threads doesn't it? Being forced to push some of the requests to a continuation means they have applied some artificial limit to the number of threads which can be run. What happens then, when the number of valid active requests exceeds this limit? What then? Push active requests to a continuation and get to then when you have time? Simply don't let the new requests get handled? If they want to to use threads to solve their problem then putting a limit on them seems to make the choice of threads not a good one. Too poorly paraphrase Joe Armstrong, are they also going to put a limit on the number of objects they can use? If threads are integral to solving your problem, then it seems as though you are limiting how well you can solve the problem.
This also got me thinking about other issues involving threading in non-concurrent orientated languages. Using a COL (Concurrent Orientated Language) all the time would be nice (and I hope that is what the future holds for us). But today, I don't think it is always practical. We can't use Erlang or Mozart or Concurrent ML for every problem due to various limiting factors. But on the same token, using threads in a non-COL sometimes makes the solution to a problem a bit easier to work with. At the very least, making use of multiple processors sounds like a decent argument. But writing code in, say, java, as if it was Erlang does not work out. I think the best one can hope to do is a static number of threads. Spawning and destroying threads dynamically in a non-COL can be fairly expensive in the long run and you have to avoid situations where you start up too many threads. I think having a static number of threads i a pool or with each doing a specific task is somewhat the "best of both worlds". You get your concurrency and you, hopefully, avoid situations like Jetty is running into. As far as communication between the threads is concerned, I think message passing is the best one can hope for. The main reason I think one should use message passing in these non-COL's is, it forces all of the synchornization to happen in one localized place. You can, hopefully, avoid deadlocks this way. And if there is an error in your synchornization, you can fix it in one spot and it is fixed everywhere. As opposed to having things synchornized all over the code, god knows where you may have made an error.
I think this post most likely opened up a can of worms. I lightly touched on a lot of issues and most likely did not explain things in full. Perhaps this will raise some interesting questions.