[strongSwan] Help needed to achieve 250+ tunnel negotiations per second using the strongswan (5.0.4) and load tester plugin

Chinmaya Dwibedy ckdwibedy at yahoo.com
Wed Mar 5 07:07:16 CET 2014



Hi,

Can anyone please respond to this email? Thanks in advance for your support and help.

Regards,
Chinmaya



On Tuesday, March 4, 2014 4:53 PM, Chinmaya Dwibedy <ckdwibedy at yahoo.com> wrote:
  
Hi All,


I modified the strongSwan (5.0.4) code to add a new DH implementation using
the Octeon Core Crypto Library APIs. I ran a test with 200k IPsec tunnels
using DH group 1 (encryption algorithm: AES, integrity algorithm: SHA1) and
found the tunnel setup rate to be approximately 175-180 per second. With the
gmp library and the same set of parameters, the setup rate was 120-125 per
second. Note that the Octeon Core Crypto Library provides APIs on Octeon for
crypto acceleration and DH operations.

Then I profiled the charon implementation (using the perf profiler tool) to
find the functions that slow down the setup rate. The hotspot is
pthread_mutex_lock(), where most of the CPU cycles are consumed. I changed
the code in libstrongswan/threading/mutex.c to use the gcc atomic builtins
__sync_fetch_and_add()/__sync_sub_and_fetch() instead of
pthread_mutex_lock()/pthread_mutex_unlock(), expecting a higher setup rate.
After this change, the setup rate dropped to 50 per second. As I understand
it, atomic operations should speed things up, but the opposite happened in
this case.
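
For illustration, here is a minimal sketch of what a lock built only from
those two builtins might look like. It is an assumed reconstruction, not the
actual change made to mutex.c; the names atomic_lock_t, atomic_lock(),
atomic_unlock() and run_contended_counter() are hypothetical. On Linux, an
uncontended pthread_mutex_lock() is already a user-space atomic operation,
and a contended one puts the waiter to sleep in the kernel. A lock like the
one below instead makes every waiter retry in a loop, so with 64 worker
threads on 16 cores the waiters may compete for CPU time with the thread
that holds the lock. That could be one reason for a lower setup rate.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical lock using only the GCC builtins named in the text.
 * Assumption: this is NOT the real modified mutex.c, only a sketch. */
typedef struct {
    volatile int count;   /* 0 = free, >0 = held or being contended */
} atomic_lock_t;

static void atomic_lock(atomic_lock_t *l)
{
    /* We own the lock only if the counter was 0 before our increment. */
    while (__sync_fetch_and_add(&l->count, 1) != 0) {
        /* Someone else holds it: undo our increment and try again.
         * Every retry is another atomic write to the same cache line. */
        __sync_sub_and_fetch(&l->count, 1);
        sched_yield();    /* without this, waiters spin at full speed */
    }
}

static void atomic_unlock(atomic_lock_t *l)
{
    __sync_sub_and_fetch(&l->count, 1);
}

typedef struct {
    atomic_lock_t *lock;
    long *counter;
    long iterations;
} worker_arg_t;

static void *worker(void *p)
{
    worker_arg_t *a = p;
    for (long i = 0; i < a->iterations; i++) {
        atomic_lock(a->lock);
        (*a->counter)++;          /* critical section */
        atomic_unlock(a->lock);
    }
    return NULL;
}

/* Start nthreads workers that all fight for one lock and return the
 * final counter value (nthreads * iterations if the lock is correct). */
static long run_contended_counter(int nthreads, long iterations)
{
    enum { MAX_THREADS = 64 };
    pthread_t tids[MAX_THREADS];
    atomic_lock_t lock = { 0 };
    long counter = 0;
    worker_arg_t arg = { &lock, &counter, iterations };

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++) {
        pthread_join(tids[i], NULL);
    }
    return counter;
}
```

Calling run_contended_counter(64, 100000) and timing it against the same
loop protected by a pthread_mutex_t would be one way to check whether the
lock itself, rather than the IKE code, explains the drop.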

Can anyone please let me know what the issue might be and how to move
forward to achieve 250 tunnels per second? I am stuck, so any suggestions
are greatly appreciated. According to
https://lists.strongswan.org/pipermail/users/2009-December/004184.html,
Martin has measured 200+ tunnel negotiations/second (1 IKE + 1 CHILD_SA).
This implies that it is doable and that I am missing something.

Note that we are using two multi-core MIPS64 processors with
16 cnMIPS64 v2 cores (one acts as the IKE initiator and the other as the IKE
responder). We are running strongSwan on both systems. Both systems have
1 Gbps Ethernet cards, which are connected to a 1 Gbps L2 switch. Wind River
Linux runs on all 16 cores.

Here is the strongSwan configuration at both ends:

IKE Initiator  

        # number of worker threads in charon
        threads = 64
        replay_window = 32
        dos_protection = no
        block_threshold = 1000
        cookie_threshold = 1000
        init_limit_half_open = 1000
        retransmit_timeout = 10
        retransmit_tries = 5
        install_virtual_ip = no
        install_routes = no
        close_ike_on_child_failure = yes
        ikesa_table_size = 16384
        ikesa_table_segments = 256
        reuse_ikesa = no


  load-tester {
        enable = yes
        initiators = 10
        iterations = 25000
        delay = 20
        responder = 30.30.30.21
        proposal = aes128-sha1-modp768
        initiator_auth = psk
        responder_auth = psk
        request_virtual_ip = yes
        initiator_tsr = 40.0.0.0/8
        ike_rekey = 0
        child_rekey = 0
        delete_after_established = no
        shutdown_when_complete = no
        #fake_kernel = yes
  }
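
As a rough check on what these load-tester values allow, the arithmetic can
be sketched as follows. This assumes that delay is the per-thread pause
between initiations in milliseconds and that iterations counts IKE_SAs per
initiator thread; the function names are hypothetical. The bound ignores
the time spent on the negotiation itself, so the real rate can only be
lower.

```c
/* Total IKE_SAs the load test attempts: one batch per initiator thread. */
static long total_ike_sas(long initiators, long iterations)
{
    return initiators * iterations;
}

/* Upper bound on initiations per second when each initiator thread only
 * waits delay_ms between initiations (assumption: delay is in ms).
 * Returns 0.0 when there is no delay, meaning "no bound from delay". */
static double max_initiations_per_sec(long initiators, long delay_ms)
{
    if (delay_ms <= 0) {
        return 0.0;
    }
    return initiators * (1000.0 / delay_ms);
}

/* Time in seconds needed to set up 'total' tunnels at 'rate' per second. */
static double duration_sec(long total, double rate)
{
    return total / rate;
}
```

With initiators = 10, iterations = 25000 and delay = 20, this gives 250,000
IKE_SAs in total and a delay-imposed ceiling of 500 initiations per second,
so under these assumptions the delay setting alone does not prevent 250 per
second. At 250 per second the full run would take 1000 seconds.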

 

IKE Responder

 

        # number of worker threads in charon
        threads = 64
        replay_window = 32
        dos_protection = no
        block_threshold = 100
        cookie_threshold = 100
        init_limit_half_open = 100
        half_open_timeout = 100
        close_ike_on_child_failure = yes
        ikesa_table_size = 16384
        ikesa_table_segments = 256
        reuse_ikesa = no
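
Since the profile points at mutex contention, it may help to spell out what
ikesa_table_size and ikesa_table_segments are meant to do: the first sets
the size of the IKE_SA hash table and the second the number of exclusively
locked segments in it. The sketch below shows generic lock striping under
the assumption that both values are powers of two and that rows are mapped
to segments by masking. It is not strongSwan's code, and all names in it
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Masks derived from the two configuration values. */
typedef struct {
    uint32_t table_mask;     /* table_size - 1 */
    uint32_t segment_mask;   /* segments - 1 */
} table_layout_t;

static int is_power_of_two(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Build the layout; masking only works for power-of-two sizes. */
static table_layout_t make_layout(uint32_t table_size, uint32_t segments)
{
    table_layout_t l;

    assert(is_power_of_two(table_size));
    assert(is_power_of_two(segments));
    assert(segments <= table_size);
    l.table_mask = table_size - 1;
    l.segment_mask = segments - 1;
    return l;
}

/* Hash table row that an IKE_SA with this hash value lands in. */
static uint32_t row_for(const table_layout_t *l, uint32_t hash)
{
    return hash & l->table_mask;
}

/* Lock segment guarding that row; rows in different segments can be
 * accessed by different threads without waiting on each other. */
static uint32_t segment_for(const table_layout_t *l, uint32_t row)
{
    return row & l->segment_mask;
}
```

With 16384 rows and 256 segments, each lock covers 64 rows. If the table
locks were the bottleneck, 64 worker threads spread over 256 segments
should rarely collide, which suggests the contended mutex seen in the
profile may be somewhere else.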


Regards,
Chinmaya