Joey reported that websec failed him when he tried to read page modifications in the generated mails using a text mail agent. I wanted to point out that websec has ASCII marker support for exactly this purpose: highlighting changes in text. This feature was implemented by Javier M. Mora.
Baruch Even's blog
Tue, 26 Jun 2007
WebSec ASCII Markers
Category: Dev
Thu, 21 Sep 2006
IDE Roundup
Category: Dev
Des Traynor made a very nice roundup of various IDEs and development environments. These were collected from friends of his and it was quite a shock to find Notepad in there!
Sat, 11 Mar 2006
Unix-style utilities
Category: Dev
Joey mentioned a few unix utilities and started collecting them. I create utilities for my work, but usually they are too specialized and mostly fall into the "process my unique file format" category. I do, however, have a few more generally usable utilities which haven't seen the world outside my computer, and I thought I'd make them more accessible.
- todist
- Input is a list of numbers, one per line. Output is their distribution: each value and how many times it occurs in the input. Useful for my statistics and performance work.
- tostats
- Input is a list of numbers, one per line. Output is some statistics about the numbers: average, stddev, min, max, mid point.
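The two descriptions above could be sketched roughly as follows. This is not the actual todist or tostats source, only a minimal Python illustration of the described behaviour. The function names mirror the utilities, `read_numbers` is a hypothetical helper, the standard deviation is assumed to be the population form, and "mid point" is assumed to mean halfway between min and max (it could equally be the median).

```python
import math
from collections import Counter


def read_numbers(lines):
    """Parse one number per line, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]


def todist(numbers):
    """Return sorted (value, count) pairs: the distribution of the input."""
    return sorted(Counter(numbers).items())


def tostats(numbers):
    """Return average, stddev, min, max and mid point of a non-empty list."""
    n = len(numbers)
    mean = sum(numbers) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    stddev = math.sqrt(sum((x - mean) ** 2 for x in numbers) / n)
    lo, hi = min(numbers), max(numbers)
    # Assumption: "mid point" is the midrange, halfway between min and max.
    return {"average": mean, "stddev": stddev, "min": lo, "max": hi, "mid": (lo + hi) / 2}


data = read_numbers(["1\n", "2\n", "2\n", "\n", "5\n"])
for value, count in todist(data):
    print(value, count)
for name, value in tostats(data).items():
    print(name, value)
```

To use these as filters in the unix style, `read_numbers` can be handed `sys.stdin` instead of the inline list.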
I also have a few non-unix-style utils that are useful (to me). One provides fairly simple synchronization across multiple machines using multicast messages. It's also useful for synchronizing on a single machine: run the sender at the end of one command sequence, and run the receiver to wait for that sequence to end and then do some other commands. It enables the equivalent of: cmd && othercmd && thirdcmd for multiple branches, so I can start two or more command chains when one ends.
Wed, 14 Sep 2005
When statistical profiling fails
Category: Dev
I'm working on improving the performance of TCP for very high speed networks. The initial work was done by inspecting code and looking for inefficient operations; even O(n) operations are deadly when n>10,000 and the multiplier is high. We found cases where processing a single ACK takes more than a millisecond, and a few that took more than a second. These things simply stall the network and kill any performance you might have had.
For the second stage I wanted to use OProfile, a statistical profiler that uses CPU performance counters to gather information on where we spend most of our time. This worked well for me in the past, but this time it failed me.
The main reason it failed me is that the normal operation of the TCP stack works flawlessly and can already handle very high speeds. It has built-in mechanisms to reduce load on the machine on the receiver side, and so far it hasn't failed us. The performance issues happen when we lose a packet: at that point we start to use SACK and account for which packets were lost and which we need to resend. This part of the code is what keeps us from reaching full 1Gbps performance.
The problem this causes for statistical profiling is that loss recovery is only a small part of the time spent in the overall process of transferring the data. In a 600 second file transfer we will have less than 10 points where we lose packets, and each of them will take a second or two to correct. So we have about 15 seconds out of 600 where we measure interesting data and the rest is noise. Except that the noise is far greater than the interesting data.
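To put a number on that, here is a back-of-the-envelope calculation in Python using the figures from the paragraph above. The 1.5 seconds per recovery is an assumed midpoint of "a second or two", and `interesting_fraction` is a made-up helper name.

```python
def interesting_fraction(total_seconds, loss_events, seconds_per_recovery):
    """Share of the transfer time that is spent in loss recovery."""
    return loss_events * seconds_per_recovery / total_seconds


# 600 second transfer, at most 10 losses, roughly 1.5 s each (assumed midpoint).
frac = interesting_fraction(600, 10, 1.5)
print(f"{frac:.1%} of the profile samples fall inside loss recovery")
```

With a uniform sampling rate, roughly one sample in forty lands in the code we care about, which is why the SACK path barely shows up in the profile.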
It took me a while to implement, but a solution was found in the form of pausing and playing the profiling, so that profiling is only done during the times we are interested in. When a packet is lost we play the profiling, and when we have recovered from the loss we pause it. This works nicely and the data is sane again.
But it didn't help much. Running our changes through this pausing statistical profiler showed us that before our changes things were bad. After our changes the SACK code takes the same amount of time as the interrupt handling code. It looks like the statistical profiler says "You've got nothing to improve on now". But we do have performance problems: we see that the incoming queue fills up and we get to a throttle state where we drop plenty of incoming packets because the CPU can't process the ACKs fast enough. So what's the deal?
From some initial observations it looks like there is some sub-process at the beginning of SACK processing that is inefficient. Once it is past, we handle SACKs fast (faster than normal ACKs on average), but until it is finished we are too slow and ACKs come in faster than we can handle them.
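The play/pause idea can be illustrated with a toy model in Python. This is not OProfile's interface and not the mechanism actually used here; `GatedProfiler`, the function labels and the loss windows are all made up for illustration.

```python
from collections import Counter


class GatedProfiler:
    """Toy sample counter that only records while it is playing."""

    def __init__(self):
        self.playing = False
        self.samples = Counter()

    def play(self):
        self.playing = True

    def pause(self):
        self.playing = False

    def sample(self, function_name):
        # Samples taken while paused are simply discarded.
        if self.playing:
            self.samples[function_name] += 1


profiler = GatedProfiler()
loss_windows = [(100, 102), (300, 301)]  # made-up (start, end) seconds

for second in range(600):
    in_recovery = any(start <= second < end for start, end in loss_windows)
    if in_recovery:
        profiler.play()   # a packet was lost: start collecting
    else:
        profiler.pause()  # recovered: stop collecting
    profiler.sample("sack_processing" if in_recovery else "normal_ack_path")

print(dict(profiler.samples))
```

Only the samples taken inside the loss windows survive, so the steady-state path no longer drowns out the recovery code.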
And we find again that we need to limit the profiling scope, except that now we don't have an easy way to separate the 'measure' case from the 'dont-measure' case.
Go back to the drawing board, do not collect ₪100.
Tue, 23 Aug 2005
RSS2dl in use
Category: Dev
Someone has put my RSS2dl to good use, though it looks like their RSS feed is not working right now due to some PHP error.
Brought to you from the things-that-only-your-website-logs-tell-you department.
Sat, 20 Aug 2005
Some MSc results for August 2005
Category: Dev
After some time chasing non-existent bugs, which turned out to be configuration problems on the dummynet machine, I finally got to see the full impact of my work.
Linux 2.6.6 with H-TCP at a round-trip delay of 220ms can get to a speed of about 100 Mbit/s. After the patches on which I've been laboring for a while it gets to 500 Mbit/s and handles itself pretty well even up to 600 Mbit/s. That's not very far from the mark of 1 Gbit/s. Woooot!
Problem is, 2.6.6 is old by now, and my porting of the patches to 2.6.11 (which isn't that new either) introduced some bugs.
Mon, 23 May 2005
THE Programming Language...
Category: Dev
is Python, even Google says so.
This post was triggered by this thread.

