| To: | l.andrew@xxxxxxxx |
|---|---|
| Subject: | [Discussion] Re: Reduce polling costs |
| From: | John Heffner <jheffner@xxxxxxx> |
| Date: | Wed, 22 Aug 2007 18:45:06 -0700 |
Lachlan Andrew wrote:
Greetings John,

Hm, I had forgotten how unoptimized some of the library was. ;) This is probably unrelated to your problem, though, at >=1ms polling frequency. BTW, I'm just starting a complete API redesign, so if you have suggestions/complaints, now's the time...

We've also noticed that our polling process occasionally freezes for several seconds when we poll web100 too quickly, but doesn't if we poll slower. This may be a bug in our experimental kernel, but the interaction with web100 seems odd. Has anyone seen anything like that before?

Interesting, I haven't heard of that one before. If you can help track that down, I'd be curious what might be causing the problem.

Tom Quetchenbach wrote:

I know Baruch Even at Hamilton had this issue, and rolled his own instrumentation of the things he was interested in using queues of events rather than a polling approach. One problem seems to be that, at times of heavy loss, Linux can hold the lock for a second or more going through all the retransmissions etc.

That is exactly the sort of Linux problem we would like to debug, but we need to see what is happening during those times.

Ah, I understand your issue now. This is kind of a tough case for Web100 -- it's not really what it was designed to do. Baruch's approach for this type of thing is more appropriate. Taking out the lock_sock() will definitely help you out. (I could add a switch for this pretty easily.) However, I think if you're running on a uni-processor system you may have some problems no matter what. The softirq processing in 2.6 is better than it used to be in 2.4 in terms of not starving out user processes, but I think it will still be an issue if you're looking at trying to get fine-grain data to see what's happening during recovery.

The lock_sock() is there to ensure the correctness of the stats. Also, though unlikely, some of the 64-bit values could be incorrect if read at the wrong time.

It spent 10s in lock_sock()? Or was this some other lock?
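To illustrate the point about 64-bit values being incorrect if read at the wrong time, here is a minimal sketch. It is not Web100 or kernel code. It assumes a 32-bit machine where a 64-bit counter is stored as two 32-bit words that are written separately, and the names `SplitCounter`, `add_steps` and `demo_torn_read` are made up for the example.

```python
MASK32 = 0xFFFFFFFF


class SplitCounter:
    """Hypothetical 64-bit counter kept as two 32-bit words (assumed 32-bit CPU)."""

    def __init__(self, value=0):
        self.lo = value & MASK32
        self.hi = (value >> 32) & MASK32

    def value(self):
        # An unlocked reader simply combines whatever the two words hold now.
        return (self.hi << 32) | self.lo

    def add_steps(self, n):
        """Add n using two separate word writes, pausing between them."""
        total = self.lo + n
        self.lo = total & MASK32
        yield  # a reader running here sees the new low word with the old high word
        self.hi = (self.hi + (total >> 32)) & MASK32


def demo_torn_read():
    counter = SplitCounter(0xFFFFFFFF)  # low word is about to wrap
    update = counter.add_steps(1)
    next(update)                        # writer has stored only the low word
    torn = counter.value()              # unlocked read lands mid-update
    for _ in update:                    # let the writer finish the high word
        pass
    return torn, counter.value()


if __name__ == "__main__":
    torn, final = demo_torn_read()
    print(hex(torn), hex(final))
```

The mid-update read matches neither the value before the update nor the value after it. Holding a lock across both the update and the read rules out that interleaving, which is the kind of protection a lock around the stats provides in this sketch.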
-John