[ag-automation] some stresstest results with Xenomai and Preempt-RT

Wed Apr 19 18:50:12 CEST 2006

Hi Fu,

Luotao Fu wrote:
> Hi folks,
> I'm following the example of earlier threads and thus open this thread in
> English :-) 
> I ran some stress tests with Preempt-RT and Xenomai. Below are some
> simple results for possible further discussion:
> 
> Test candidates:
> A: Preempt-RT 2.6.16-rt16
> B: Xenomai (svn Rev. #949)

Just for the record: ipipe-1.2-04?

> 
> Hardware:
> Intel Celeron 733MHz, 256 MB RAM, no SWAP. 
> 
> Test tools:
> * Cyclictest (originally by tglx, modified by Jan Kiszka, available at
>   the svn trunk of the xenomai project)

I banged my head against several walls yesterday and today before I
found a problematic property of cyclictest and also a bug of my own:

 o The threads are starting in an unsynchronised fashion, which is ok
   for comparison as long as the start offsets are comparable as well.
   They weren't in the original implementation. I therefore pinned all
   threads to the same start time in the latest Xenomai trunk. [I'm now
   considering adding adjustable start offsets as a parameter.]

 o I placed an ioctl into the measurement path of the highest-priority
   thread in order to freeze backtraces on the highest latencies.
   Unfortunately, this call disturbed the measurement even without any
   tracing support installed. Bug fixed in trunk.
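
To illustrate the first point, here is a minimal sketch of what "pinning
all threads to the same start time" means. This is not the cyclictest
code from trunk; it is plain Python, and the names run_cyclic and
start_delay_us are made up for this example. Every thread derives its
absolute wake-up deadlines from one shared start value instead of from
the moment the thread itself happened to start:

```python
import threading
import time


def run_cyclic(intervals_us, cycles, start_delay_us=20000):
    """Run one periodic thread per interval, all released from one shared start time.

    Hypothetical helper for illustration only; returns the shared start
    time (ns) and the wake-up latencies (us) recorded by each thread.
    """
    # Common absolute start, chosen far enough ahead that all threads exist by then.
    start_ns = time.monotonic_ns() + start_delay_us * 1000
    results = {}

    def worker(index, interval_us):
        latencies_us = []
        for k in range(1, cycles + 1):
            # Deadlines are absolute and based on the shared start, so they do
            # not drift and do not depend on when this thread was created.
            deadline_ns = start_ns + k * interval_us * 1000
            remaining_ns = deadline_ns - time.monotonic_ns()
            if remaining_ns > 0:
                time.sleep(remaining_ns / 1e9)
            latencies_us.append((time.monotonic_ns() - deadline_ns) / 1000)
        results[index] = latencies_us

    threads = [
        threading.Thread(target=worker, args=(i, interval))
        for i, interval in enumerate(intervals_us)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return start_ns, results


if __name__ == "__main__":
    _, latencies = run_cyclic([1000, 1500, 2000], cycles=10)
    for index, values in sorted(latencies.items()):
        print(f"T:{index} Max:{max(values):.0f} us")
```

The numbers this prints say nothing about RT behaviour, of course (no
SCHED_FIFO, no memory locking); the only point is how the deadlines are
derived.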

To sum up: when time permits, please recheck both systems with
cyclictest from revision #955. I don't expect major differences on your
test machine (too much MHz...), but I faced significant effects on
Pentium 266 and 133 MHz boxes. What will change is the worst-case
latency "ladder"; it will become priority-sorted.

BTW, the effects that became observable with the unsynchronised
threads demonstrate how carefully one has to design a complex system:
Timer events were arranged in a cascading order, causing an IRQ storm
once in a while, which significantly influenced the system latency on
low-end machines. The point is that the number of timer events one has
to consider for the system design varies between both approaches due to
the fact that Preempt-rt provides high resolution timing for everyone
(RT and non-RT) automatically.
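
A toy model of the cascading effect, using the five intervals from the
quoted cyclictest runs (1000 to 3000 us). The start offsets of 0 to 40 us
and the 50 us window are assumptions picked for illustration, not
measured values, and the model ignores everything except the expiry
times themselves. It counts how many distinct timer expiry instants can
fall into one short window:

```python
from functools import reduce
from math import gcd

# Intervals (in microseconds) of the five cyclictest threads in the quoted runs.
INTERVALS_US = [1000, 1500, 2000, 2500, 3000]


def hyperperiod_us(intervals):
    """Least common multiple of all intervals: the expiry pattern repeats after this."""
    return reduce(lambda a, b: a * b // gcd(a, b), intervals)


def expiries(intervals, offsets, horizon_us):
    """Sorted expiry times of all timers before horizon_us."""
    events = []
    for interval, offset in zip(intervals, offsets):
        events.extend(range(offset, horizon_us, interval))
    return sorted(events)


def max_distinct_instants(events, window_us):
    """Largest number of distinct expiry instants inside any window of window_us."""
    instants = sorted(set(events))
    best, lo = 0, 0
    for hi, t in enumerate(instants):
        while t - instants[lo] >= window_us:
            lo += 1
        best = max(best, hi - lo + 1)
    return best


horizon = 2 * hyperperiod_us(INTERVALS_US)
synced = expiries(INTERVALS_US, [0, 0, 0, 0, 0], horizon)
staggered = expiries(INTERVALS_US, [0, 10, 20, 30, 40], horizon)  # hypothetical offsets

print(hyperperiod_us(INTERVALS_US))          # 30000
print(max_distinct_instants(synced, 50))     # 1
print(max_distinct_instants(staggered, 50))  # 5
```

With a common start, expiries that meet do so at exactly the same
instant; with slightly staggered starts, the same five timers line up as
five separate events within 50 us every 30 ms.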

> * irLat (interrupt latency tool, based on the concept of LPPTEST by tglx,
> we'll release the tool soon)
> 
> used libraries/skins:
> A: NPTL
> B: POSIX Skin, RTDM Skin
> 
> (non-realtime) Workload:
> * cache calibrator (http://monetdb.cwi.nl/Calibrator/)
> * flood ping
> 
> Test duration:
> about 5 hours each
> 
> Results:
> * irLat:
>   |  Min(usec)  |  Max(usec)  
> A |  3.1985     |  105.33 
> B |  3.2130     |  92.52
> **********************************

Just for clarification:

event -> irq handler -> user space task -> reply
or
event -> irq handler -> reply?

On Preempt-rt: handler is SA_NODELAY, i.e. non-threaded, right?

> 
> * cyclictest:
> A:
> root@krachkiste:/ptx/work/lfu/utils/cyclictest ./cyclictest -p 80 -t 5 -n
> 1.58 1.61 1.62 3/68 4079
> 
> T: 0 ( 3131) P:80 I:    1000 C:16469865 Min:       8 Act:      35 Max: 192
> T: 1 ( 3132) P:79 I:    1500 C: 9979903 Min:       8 Act:      34 Max: 215
> T: 2 ( 3133) P:78 I:    2000 C: 7934887 Min:       9 Act:      38 Max: 123
> T: 3 ( 3134) P:77 I:    2500 C: 6587910 Min:       9 Act:      35 Max: 161
> T: 4 ( 3135) P:76 I:    3000 C: 5489925 Min:       9 Act:      57 Max: 186
> ______________________________________
> 
> B:
> root@krachkiste:/ptx/work/lfu/local/xenomai/testsuite/cyclic ./cyclictest -n -p 80 -t 5
> 1.16 1.05 1.01 2/43 4063
> 
> T: 0 ( 3381) P:80 I:    1000 C:17461866 Min:       1 Act:       7 Max: 151
> T: 1 ( 3382) P:79 I:    1500 C:11641244 Min:       1 Act:       7 Max: 223
> T: 2 ( 3383) P:78 I:    2000 C: 8730933 Min:       1 Act:      10 Max: 125
> T: 3 ( 3384) P:77 I:    2500 C: 6984747 Min:       1 Act:      10 Max: 134
> T: 4 ( 3385) P:76 I:    3000 C: 5820622 Min:       1 Act:       8 Max: 99
> 
> 
> Both systems show reliable results under heavy cache and interrupt workload.
> Although the extreme values of the test results are quite close to each
> other, both candidates show quite different behaviour during the test. The
> presented results are just simple final results; if there is interest,
> we'll post more detailed information later, such as plots, results of
> tracing/profiling, results of further benchmark tools, etc.
> 
> P.S. 
> The kernel configurations of both candidates are mostly identical.
> The config files, together with a list of running processes, are attached
> to this mail.

Minor note on Xenomai: CONFIG_XENO_OPT_TIMING_PERIODIC is not required
when running highres timing (one-shot mode). But it will save you only a
few nanoseconds and bytes on mid-range platforms.

> 
> Cheers
> Luotao Fu
> 

Thanks for your effort! Looking forward to further details.

Jan
