[strongSwan-dev] strongswan crashes due to invalid pointer access
SM K
sacho.polo at gmail.com
Thu May 7 01:09:43 CEST 2015
Hi,
We are seeing intermittent crashes under load, while tunnels are going up
and down. I am using strongSwan 5.1.3 on a 64-bit Ubuntu server. The crash
does not reproduce consistently, and the stack traces show it happening in
different places.
There is a pattern across all the crashes, though: certain pointers are off
by 8 bytes from the object they should reference. Two examples are listed
below. We also noticed that the crash frequency went down when we reduced
the log level to 0. Has anyone seen a similar problem? Could a copy or
clone function be misbehaving and ending up one word off?
EXAMPLE 1:
The job_t pointer from the crash looked like this:
(gdb) p *(job_t *) 0x7feca438a760
$13 = {status = 3064964496, execute = 0,
  cancel = 0x7fecb6afa680 <get_priority>,
  get_priority = 0x7fecb6afa710 <destroy>,
  destroy = 0x7feca435cb30}
When I shift the pointer back by 8 bytes, I get:
(gdb) p *(job_t *) (0x7feca438a760 - 8)
$14 = {status = JOB_STATUS_QUEUED,
  execute = 0x7fecb6afa590 <execute>,
  cancel = 0,
  get_priority = 0x7fecb6afa680 <get_priority>,
  destroy = 0x7fecb6afa710 <destroy>}
So when queue_job tried to get the priority of the job by calling
get_priority on the job_t pointer, it ended up calling destroy instead and
crashed.
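The 8-byte shift in this example can be checked with plain offset arithmetic. The sketch below uses a hypothetical mirror of the job_t layout as gdb printed it. The names job_mirror_t, effective_offset and field_at are illustrative, not strongSwan code, and the sketch assumes an LP64 platform with 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the job_t layout as gdb printed it;
 * this is not the strongSwan header. */
typedef struct job_mirror job_mirror_t;
struct job_mirror {
    int status;                             /* enum in the dump: 4 bytes + 4 padding */
    void (*execute)(job_mirror_t *job);
    int (*cancel)(job_mirror_t *job);
    int (*get_priority)(job_mirror_t *job);
    void (*destroy)(job_mirror_t *job);
};

/* The arithmetic below only holds with 8-byte pointers (assumed LP64). */
static_assert(sizeof(void (*)(void)) == 8, "sketch assumes 8-byte pointers");

/* Offset inside the real object that gets read when the pointer
 * used for the access is `shift` bytes too high. */
size_t effective_offset(size_t field_offset, size_t shift)
{
    return field_offset + shift;
}

/* Name of the field that starts at `offset`, or "outside" if none does. */
const char *field_at(size_t offset)
{
    if (offset == offsetof(job_mirror_t, status))       return "status";
    if (offset == offsetof(job_mirror_t, execute))      return "execute";
    if (offset == offsetof(job_mirror_t, cancel))       return "cancel";
    if (offset == offsetof(job_mirror_t, get_priority)) return "get_priority";
    if (offset == offsetof(job_mirror_t, destroy))      return "destroy";
    return "outside";
}
```

Under these assumptions, field_at(effective_offset(offsetof(job_mirror_t, get_priority), 8)) names destroy, which is consistent with the two gdb dumps above: a get_priority call through the shifted pointer lands in destroy.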
The stack trace is:
(gdb) bt
#0 0x00007fecb60dd845 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007fecb60e1390 in *__GI_abort () at abort.c:92
#2 0x00000000004014f3 in segv_handler (signal=11) at charon.c:196
#3 <signal handler called>
#4 0x00007fecb6afa71b in destroy (this=0x7feca438a760) at processing/jobs/rekey_ike_sa_job.c:45
#5 0x00007fecb6f7f4df in queue_job (this=0xf174b0, job=0x7feca438a760) at processing/processor.c:395
#6 0x00007fecb6f803cf in schedule (this=0xf17a60) at processing/scheduler.c:197
#7 0x00007fecb6f7f26e in execute (this=0x1) at processing/jobs/callback_job.c:77
#8 0x00007fecb6f7fca9 in process_job (worker=0xf4eec0) at processing/processor.c:235
#9 process_jobs (worker=0xf4eec0) at processing/processor.c:321
#10 0x00007fecb6f83007 in thread_main (this=<value optimized out>) at threading/thread.c:309
#11 0x00007fecb663d9ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#12 0x00007fecb619545d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#13 0x0000000000000000 in ?? ()
EXAMPLE 2:
Similarly, another core showed the same problem with a
private_traffic_selector_t. When the code tried to destroy the traffic
selector, it ended up jumping to an invalid address.
(gdb) p *(private_traffic_selector_t *)0x7fdb44c43eb0
$3 = {public = {
    get_subset = 0x7fdb598965e0 <clone_>,
    clone = 0x7fdb59896370 <get_from_address>,
    get_from_address = 0x7fdb598963b0 <get_to_address>,
    get_to_address = 0x7fdb598963f0 <get_from_port>,
    get_from_port = 0x7fdb59896400 <get_to_port>,
    get_to_port = 0x7fdb59896410 <get_type>,
    get_type = 0x7fdb59896420 <get_protocol>,
    get_protocol = 0x7fdb59896900 <is_host>,
    is_host = 0x7fdb59896430 <is_dynamic>,
    is_dynamic = 0x7fdb59896e30 <set_address>,
    set_address = 0x7fdb59896850 <equals>,
    equals = 0x7fdb59896c20 <is_contained_in>,
    is_contained_in = 0x7fdb59896790 <includes>,
    includes = 0x7fdb59896ce0 <to_subnet>,
    to_subnet = 0x7fdb598965d0 <destroy>,
    destroy = 0x20000000000007},
  type = 1040435210, protocol = 0 '\000', dynamic = false, netbits = 0 '\000',
  {from = 0x7fdb44c43eb0 "\340e\211Y\333\177", from4 = {0}, from6 = {0, 0, 1040435210, 0}},
  {to = 0x7fdb44c43eb0 "\340e\211Y\333\177", to4 = {0}, to6 = {0, 0, 4294901760, 0}},
  from_port = 192, to_port = 0}
(gdb) p *(private_traffic_selector_t *)(0x7fdb44c43eb0-8)
$4 = {public = {
    get_subset = 0x7fdb598969f0 <get_subset>,
    clone = 0x7fdb598965e0 <clone_>,
    get_from_address = 0x7fdb59896370 <get_from_address>,
    get_to_address = 0x7fdb598963b0 <get_to_address>,
    get_from_port = 0x7fdb598963f0 <get_from_port>,
    get_to_port = 0x7fdb59896400 <get_to_port>,
    get_type = 0x7fdb59896410 <get_type>,
    get_protocol = 0x7fdb59896420 <get_protocol>,
    is_host = 0x7fdb59896900 <is_host>,
    is_dynamic = 0x7fdb59896430 <is_dynamic>,
    set_address = 0x7fdb59896e30 <set_address>,
    equals = 0x7fdb59896850 <equals>,
    is_contained_in = 0x7fdb59896c20 <is_contained_in>,
    includes = 0x7fdb59896790 <includes>,
    to_subnet = 0x7fdb59896ce0 <to_subnet>,
    destroy = 0x7fdb598965d0 <destroy>},
  type = TS_IPV4_ADDR_RANGE, protocol = 0 '\000', dynamic = false, netbits = 32 ' ',
  {from = 0x7fdb44c43ea8 "\360i\211Y\333\177", from4 = {1040435210}, from6 = {1040435210, 0, 0, 0}},
  {to = 0x7fdb44c43ea8 "\360i\211Y\333\177", to4 = {1040435210}, to6 = {1040435210, 0, 0, 0}},
  from_port = 0, to_port = 65535}
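The bogus destroy value 0x20000000000007 in the first dump looks like the intact object's type, protocol, dynamic and netbits fields read as a single 64-bit word, which is what an 8-byte shift would produce for the last of the 16 method slots (index 15, byte offset 15 * 8 = 120). The following is a minimal sketch of that check, not strongSwan code. It assumes a little-endian x86_64 layout, a 4-byte type field, a zero padding byte after netbits, and that TS_IPV4_ADDR_RANGE has the value 7 (its number in the IKEv2 traffic selector type registry; the dump only shows the symbolic name). The helper word_after_method_table is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed numeric value; the gdb dump only shows the symbolic name. */
enum { ASSUMED_TS_IPV4_ADDR_RANGE = 7 };

/* Assemble the 8 bytes that follow the method table in the intact
 * object (type, protocol, dynamic, netbits, one padding byte) and
 * interpret them as one little-endian 64-bit word. */
uint64_t word_after_method_table(uint32_t type, uint8_t protocol,
                                 uint8_t dynamic, uint8_t netbits)
{
    assert(dynamic <= 1); /* a C bool is stored as 0 or 1 */

    uint8_t bytes[8] = {
        (uint8_t)(type),
        (uint8_t)(type >> 8),
        (uint8_t)(type >> 16),
        (uint8_t)(type >> 24),
        protocol,
        dynamic,
        netbits,
        0 /* padding byte, assumed to be zero */
    };

    uint64_t word = 0;
    for (size_t i = 0; i < sizeof(bytes); i++) {
        word |= (uint64_t)bytes[i] << (8 * i);
    }
    return word;
}
```

With the values from the second dump (type 7, protocol 0, dynamic false, netbits 32), word_after_method_table(ASSUMED_TS_IPV4_ADDR_RANGE, 0, 0, 32) returns 0x20000000000007, the same value gdb shows for destroy through the shifted pointer.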
Stack trace for this core:
#0 0x00007fdb589f2845 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007fdb589f6390 in *__GI_abort () at abort.c:92
#2 0x00000000004014f3 in segv_handler (signal=11) at charon.c:196
#3 <signal handler called>
#4 array_invoke_offset (array=0x7fdb345d21c0, offset=120) at collections/array.c:531
#5 0x00007fdb5987ddc9 in array_destroy_offset (array=0x7fdb44c43eb0, offset=120) at collections/array.c:553
#6 0x00007fdb59411d88 in destroy (this=0x7fdb34b71f70) at sa/child_sa.c:1108
#7 0x00007fdb59414b83 in destroy_child_sa (this=0x7fdb2cbadcc0, protocol=<value optimized out>, spi=621215179) at sa/ike_sa.c:1441
#8 0x00007fdb59442d15 in delete_child (this=0x7fdb3cd20a20, protocol=<value optimized out>, spi=621215179, remote_close=<value optimized out>) at sa/ikev1/tasks/quick_delete.c:145
#9 0x00007fdb59443000 in delete_child (this=0x7fdb4ea5bb10, protocol=<value optimized out>, spi=1319484188, remote_close=<value optimized out>) at sa/ikev1/tasks/quick_delete.c:105
#10 0x000000007f19ff1b in ?? ()
#11 0x0000000000603190 in stderr@@GLIBC_2.2.5 ()
#12 0x0000000000000000 in ?? ()
Has anyone seen this before?
regards,
sk