-
Notifications
You must be signed in to change notification settings - Fork 110
Debugging and troubleshooting
If you're developing or debugging Tempesta's code (thanks for that by the way!), then you need to make several steps to simplify the process or even save your VM in a cloud. Tempesta FW is a Linux kernel extension, so the below are typical steps helping in Linux kernel development and debugging.
If a kernel crash occurs, then the system may hang, so it has a sense to setup automatic reboot. Normally your Linux should reboot automatically on kernel panic. You can check it by
# cat /proc/sys/kernel/panic
1
1 means 1 second before reboot on system panic. You can access and set the
setting by sysctl kernel.panic. Next, you can emulate the panic by
# echo c > /proc/sysrq-trigger
and see that the system reboots in 1 second. One more important setting is
sysctl kernel.panic_on_oops (you can find it also in
/proc/sys/kernel/panic_on_oops). Usually it's set to 1, i.e. reboot on any
kernel fault ("Oops") occurred. Setting this isn't necessary, but you may prefer
to use it.
Next, you need to reboot to testing kernel but make the system reboot automatically to safe kernel if the first one occasionally crashes. You can do this using
GRUB_DEFAULT=saved
GRUB_CMDLINE_LINUX_DEFAULT="panic=1"
in /etc/default/grub. The settings allow you to set a kernel as safe, i.e.
booted by default, and add the kernel parameter to reboot automatically in 1
second after panic. The kernel parameter automatically sets kernel.panic
sysctl. To apply the changes run
# update-grub
for Ubuntu. Now let's list all installed kernels in the system. For Ubuntu you might see following:
# grep 'menuentry\>' /boot/grub/grub.cfg
menuentry 'Ubuntu' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-cfda0544-9803-41e7-badb-43563085ff3a' {
menuentry 'Ubuntu, with Linux 4.8.15+' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.8.15+-advanced-cfda0544-9803-41e7-badb-43563085ff3a' {
menuentry 'Ubuntu, with Linux 4.8.15+ (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.8.15+-recovery-cfda0544-9803-41e7-badb-43563085ff3a' {
menuentry 'Ubuntu, with Linux 4.4.0-75-generic' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.4.0-75-generic-advanced-cfda0544-9803-41e7-badb-43563085ff3a' {
menuentry 'Ubuntu, with Linux 4.4.0-75-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.4.0-75-generic-recovery-cfda0544-9803-41e7-badb-43563085ff3a' {
menuentry 'Ubuntu, with Linux 4.4.0-45-generic' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.4.0-45-generic-advanced-cfda0544-9803-41e7-badb-43563085ff3a' {
menuentry 'Ubuntu, with Linux 4.4.0-45-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-4.4.0-45-generic-recovery-cfda0544-9803-41e7-badb-43563085ff3a' {
I.e. the new kernel is listed in submenu. However, we can choose the first kernel as testing and the second one as safe:
# grub-set-default 'Ubuntu, with Linux 4.4.0-75-generic'
# grub-reboot 'Ubuntu, with Linux 4.8.15+'
After the system reboot it will boot into 4.8.15-tfw kernel, but if it fails it will automatically reboot to 4.1.27+ kernel.
It might be useful to setup automatic storing kernel crash dumps. There are plenty of good documentation about kdump & crash utility, please explore the links from references.
Just some settings for quick start and to fix known issues:
-
kernel.panic=60in/etc/sysctl.conf- we've found that sometimes kdump isn't in time to create a memory image and this setting fixes the problem. -
it's better to use as small
tempesta_dbmemkernel parameter as possible, e.g.tempesta_dbmem=4. Smaller TempestaDB memory leave more space for kdump. -
adjust
/etc/default/grub.d/kdump-tools.defaultasGRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=2048M-:512M"
- http://people.redhat.com/anderson/crash_whitepaper/
- https://www.kernel.org/doc/Documentation/kdump/kdump.txt
- http://www.dedoimedo.com/computers/crash-book.html
It's extremely important for the kernel debugging to get output of the serial
console (dmesg), which contains call stacks and necessary system output just
before a problem occurred.
Add printk.synchronous=1 to the kernel cmdline to get all printk() messages.
If you see messages like
** 5029 printk messages dropped **
in dmesg, then you typically need to set the option. This is extremely important for debug builds producing massive printing.
You also might need to update your /etc/sysctl.conf with
kernel.printk = 8 4 1 8
to make the kernel to send even debug messages to the console.
If you use a KVM virtual machine, then it's very straightforward to enable serial console output:
- add
-serial file:serial.txtoption toqemu-system-x86_64call so KVM will write the output of the serial console to theserial.txtfile. - add
console=tty0 console=ttyS0,115200n8toGRUB_CMDLINE_LINUXline in/etc/default/grubto let the kernel forward the console output to the tty device. - update the grub as in the instructions above and reboot your VM.
If you use a VMware virtual machine:
- add
serial portin virtual machine settings: Virtual Machine -> Settings (Ctrl+D) -> Hardware tab -> Add... -> select Serial port -> Finish -> select Use output file. Or use official documentation. - add
console=tty0 console=ttyS0,115200n8toGRUB_CMDLINE_LINUXline in/etc/default/grubto let the kernel forward the console output to the tty device. You may need to use `console=tty1 console=ttyS1,115200n8'. - update the grub as in the instructions above and reboot your VM.
If you run Tempesta FW on a hardware server and do not have access to IPMI, then
a netconsole can be used to get the output of serial console. You can use the
Ubuntu guide how to set it up.
Call traces are dumped on kernel crash and on kernel warnings. Typical trace consists of entries which looks like
do_syscall_64+0x33/0x80
Here are
- a function name (do_syscall_64 in the example),
- an offset from function beginning (0x33 in the example),
- a function length (/0x80 in the example).
The addr2line utility helps to convert this information into a source file name and line number. The call to utility is:
addr2line -e /PATH/TO/MODULE -i HEX_ADDRESS
Here:
- /PATH/TO/MODULE is a path to a module where the function is located. In case of kernel built in functions, this should be a path to the kernel. Note that the kernel should not be compressed and debug info should not be stripped from it.
- HEX_ADDRESS - an address of the instruction, '0x' prefix is required.
To convert the function name and the offset into the plain hex address,
a mix from nm, grep, perl & awk could be used, so the overal command will be in
the form
addr2line -e /PATH/TO/MODULE -i $(perl -e 'printf("%x\n", 0x'$(nm /PATH/TO/MODULE | grep '\<FUNCTION_NAME$' | awk '{print $1}')'+HEX_OFFSET)')
; for example above it will be
addr2line -e vmlinux -i $(perl -e 'printf("%x\n", 0x'$(nm vmlinux | grep '\<do_syscall_64$'| awk '{print $1}')'+0x33)')
There are scripts in linux kernel tree capable of automating this task. Those can be used like this:
linux-kernel-source/scripts/decode_stacktrace.sh linux-kernel-source/vmlinux linux-kernel-source tempesta-source/fw/tempesta_fw.ko < dmesg.log
There exist several tools to help debug memory issues.
https://www.kernel.org/doc/html/v5.10/dev-tools/kmemleak.html
Kmemleak provides a way of detecting possible kernel memory leaks in a way similar to a tracing garbage collector.
In order to use kmemleak the kernel has to be built with the following options:
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE=16000 <- can be arbitrary
Basic flow:
-
Check if
debugfsis already mounted. Mount if necessary:# mount -t debugfs nodev /sys/kernel/debug/ -
echo clear > /sys/kernel/debug/kmemleak -
Run the test
-
echo scan > /sys/kernel/debug/kmemleakIf there are any hypothetical leaks, one might see the following meesage in
dmesg:[ 1185.562872] kmemleak: 39 new suspected memory leaks (see /sys/kernel/debug/kmemleak) -
Gather the report:
cat /sys/kernel/debug/kmemleakExample report:
unreferenced object 0xffff88813997d040 (size 704):
comm "softirq", pid 0, jiffies 4295173707 (age 614.949s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 c0 98 99 26 81 88 ff ff ...........&....
backtrace:
[<0000000035bfc8aa>] tfw_h2_add_stream+0xbe/0x410 [tempesta_fw]
[<00000000ba7e15f7>] tfw_h2_frame_recv+0x3140/0x64f0 [tempesta_fw]
[<000000004e5a3637>] ss_skb_process+0x3a3/0x5e0 [tempesta_fw]
[<00000000aedeaca5>] tfw_h2_frame_process+0x16d/0x1200 [tempesta_fw]
[<00000000e66f86b4>] tfw_connection_recv+0x100/0x1d0 [tempesta_fw]
[<000000003ca45a93>] tfw_tls_connection_recv+0x538/0xbb0 [tempesta_fw]
[<00000000b7f25f34>] ss_tcp_process_data+0x5d9/0xd90 [tempesta_fw]
[<000000004dcf244d>] ss_tcp_data_ready+0xc4/0x1c0 [tempesta_fw]
[<00000000832ba38a>] tcp_data_queue+0x1669/0x4cb0
[<000000001c743974>] tcp_rcv_established+0x6ae/0x1d10
[<00000000a6c79ebb>] tcp_v4_do_rcv+0x4fa/0x750
[<0000000035a2b47c>] tcp_v4_rcv+0x2588/0x3530
[<00000000158df7d6>] ip_protocol_deliver_rcu+0x6a/0x550
[<0000000085c0e07f>] ip_local_deliver_finish+0x1a4/0x250
[<000000006d5a9809>] ip_local_deliver+0x246/0x2a0
[<00000000d925252d>] ip_rcv+0x15a/0x180
unreferenced object 0xffff888137f47a80 (size 704):
https://www.kernel.org/doc/html/v5.10/vm/page_owner.html
kmemleak can't track raw pages allocations from the buddy allocator, so page_owner comes to rescue!
This tool is useful to track who allocated each page and can be used to find memory leaks.
The kernel has to be built with CONFIG_PAGE_OWNER=y option and page_owner=on should be added to boot cmdline.
- Build user space helper:
cd <linux kernel>/tools/vm && make page_owner_sort - Collect allocation statistics before the test:
cat /sys/kernel/debug/page_owner > page_owner_full_pre.txt ./page_owner_sort page_owner_full_pre.txt sorted_page_owner_pre.txt - Run the workload.
- Collect new statistics and compare it to the original data from above:
cat /sys/kernel/debug/page_owner > page_owner_full_post.txt ./page_owner_sort page_owner_full_post.txt sorted_page_owner_post.txt
Example output:
PRE >
57889 times:
Page allocated via order 0, mask 0x100a20(GFP_ATOMIC|__GFP_HARDWALL)
prep_new_page+0x1ad/0x250
get_page_from_freelist+0x22b1/0x2e80
__alloc_pages_nodemask+0x2b4/0x630
tfw_cache_do_action+0x2598/0x2e10 [tempesta_fw]
tfw_cache_process+0x1aa/0x7b0 [tempesta_fw]
tfw_http_req_process+0x94d/0x13c0 [tempesta_fw]
tfw_http_msg_process_generic+0x42e/0x11d0 [tempesta_fw]
tfw_h2_frame_process+0x7b0/0x11d0 [tempesta_fw]
tfw_connection_recv+0x102/0x1e0 [tempesta_fw]
tfw_tls_connection_recv+0x4d9/0xa30 [tempesta_fw]
ss_tcp_process_data+0x5ea/0xd50 [tempesta_fw]
ss_tcp_data_ready+0xb9/0x1e0 [tempesta_fw]
tcp_data_queue+0x1669/0x4cb0
tcp_rcv_established+0x6e8/0x1d70
tcp_v4_do_rcv+0x4fa/0x750
tcp_v4_rcv+0x2588/0x3530
POST >
120203 times:
Page allocated via order 0, mask 0x100a20(GFP_ATOMIC|__GFP_HARDWALL)
prep_new_page+0x1ad/0x250
get_page_from_freelist+0x22b1/0x2e80
__alloc_pages_nodemask+0x2b4/0x630
tfw_cache_do_action+0x2598/0x2e10 [tempesta_fw]
tfw_cache_process+0x1aa/0x7b0 [tempesta_fw]
tfw_http_req_process+0x94d/0x13c0 [tempesta_fw]
tfw_http_msg_process_generic+0x42e/0x11d0 [tempesta_fw]
tfw_h2_frame_process+0x7b0/0x11d0 [tempesta_fw]
tfw_connection_recv+0x102/0x1e0 [tempesta_fw]
tfw_tls_connection_recv+0x4d9/0xa30 [tempesta_fw]
ss_tcp_process_data+0x5ea/0xd50 [tempesta_fw]
ss_tcp_data_ready+0xb9/0x1e0 [tempesta_fw]
tcp_data_queue+0x1669/0x4cb0
tcp_rcv_established+0x6e8/0x1d70
tcp_v4_do_rcv+0x4fa/0x750
tcp_v4_rcv+0x2588/0x3530
https://www.kernel.org/doc/html/v5.10/dev-tools/kasan.html
KernelAddressSANitizer (KASAN) is a dynamic memory error detector designed to find out-of-bound and use-after-free bugs.
The following build options are needed to use it:
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y <- works faster than CONFIG_KASAN_OUTLINE
CONFIG_KASAN_STACK=1
CONFIG_KASAN_VMALLOC=y <- enable this option, so KASAN would also cover vmalloc space
Usage:
- Run the desired test
- Observe reports in
dmesg
Example report:
BUG: KASAN: slab-out-of-bounds in sg_next+0x72/0x80
Read of size 8 at addr ffff8881292b50e8 by task ksoftirqd/3/29
CPU: 3 PID: 29 Comm: ksoftirqd/3 Tainted: G OE 5.10.35+ #8
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
dump_stack+0x91/0xbc
print_address_description.constprop.0+0x1c/0x210
? _raw_spin_lock_irqsave+0x7d/0xc0
? _raw_write_unlock_bh+0x60/0x60
? sg_next+0x72/0x80
kasan_report.cold+0x37/0x7c
? kasan_poison_shadow+0x10/0x30
? sg_next+0x72/0x80
sg_next+0x72/0x80
ttls_decrypt+0x12cb/0x20b0 [tempesta_tls]
? receive_buf+0xb68/0x57f0 [virtio_net]
? virtnet_poll+0x4b0/0xc81 [virtio_net]
? ttls_send_alert+0x1d0/0x1d0 [tempesta_tls]
? smpboot_thread_fn+0x2eb/0x6a0
? kthread+0x31c/0x3e0
? ret_from_fork+0x1f/0x30
? console_trylock+0xb0/0xb0
ttls_recv+0x23e/0x1e10 [tempesta_tls]
? memset+0x10/0x40
ss_skb_process+0x3a3/0x5e0 [tempesta_fw]
? ttls_handle_alert+0x1a0/0x1a0 [tempesta_tls]
? __kasan_slab_free+0x124/0x150
? tfw_sock_srv_exit+0x30/0x30 [tempesta_fw]
? _raw_spin_lock+0x6a/0xb0
? _raw_write_lock_irqsave+0xc0/0xc0
tfw_tls_connection_recv+0x17d/0xc00 [tempesta_fw]
? ss_skb_unroll+0x9b0/0xf40 [tempesta_fw]
? tfw_tls_connection_lost+0x30/0x30 [tempesta_fw]
? dynamic_emit_prefix+0x350/0x350
ss_tcp_process_data+0x5d9/0xe30 [tempesta_fw]
? ss_linkerror+0x120/0x120 [tempesta_fw]
ss_tcp_data_ready+0x13b/0x270 [tempesta_fw]
tcp_rcv_established+0x1155/0x1d10
? tcp_data_queue+0x4cb0/0x4cb0
tcp_v4_do_rcv+0x4fa/0x750
tcp_v4_rcv+0x2588/0x3530
? nf_ct_deliver_cached_events+0xe4/0x410 [nf_conntrack]
? __inet_lookup_established+0x27e/0x4d0
? tcp_v4_early_demux+0x790/0x790
ip_protocol_deliver_rcu+0x6a/0x550
? nf_confirm+0x340/0x340 [nf_conntrack]
ip_local_deliver_finish+0x1a4/0x250
ip_local_deliver+0x246/0x2a0
? ip_local_deliver_finish+0x250/0x250
? ip_rcv_finish_core.constprop.0+0x2e9/0x18a0
? ip_protocol_deliver_rcu+0x550/0x550
ip_rcv+0x15a/0x180
? ip_local_deliver+0x2a0/0x2a0
? ip_rcv_finish_core.constprop.0+0x18a0/0x18a0
? kasan_unpoison_shadow+0x34/0x40
__netif_receive_skb_core.constprop.0+0xdd9/0x3050
? get_page_from_freelist+0x1b39/0x4da0
? generic_xdp_tx+0x410/0x410
? ip_finish_output2+0x60c/0x1a60
? __zone_watermark_ok+0x420/0x420
__netif_receive_skb_list_core+0x2cf/0x8b0
? __netif_receive_skb_core.constprop.0+0x3050/0x3050
? ret_from_fork+0x1f/0x30
? ktime_get_with_offset+0xa9/0x150
netif_receive_skb_list_internal+0x5d7/0xc10
? arch_stack_walk+0x6c/0xb0
? __netif_receive_skb_list_core+0x8b0/0x8b0
? dev_gro_receive+0x229/0x2330
? memcpy+0x39/0x60
gro_normal_one+0x137/0x220
napi_gro_receive+0x23c/0x2a0
receive_buf+0xb68/0x57f0 [virtio_net]
? page_to_skb+0x9f0/0x9f0 [virtio_net]
? tcp_schedule_loss_probe.part.0+0x383/0x510
? virtqueue_get_buf_ctx_split+0x273/0x630
? _raw_spin_unlock_bh+0x20/0x20
virtnet_poll+0x4b0/0xc81 [virtio_net]
? receive_buf+0x57f0/0x57f0 [virtio_net]
net_rx_action+0x37c/0xbc0
? napi_complete_done+0x5d0/0x5d0
? sched_clock_cpu+0x18/0x160
__do_softirq+0x1a6/0x5ab
? tasklet_hi_action+0x20/0x20
run_ksoftirqd+0x26/0x40
smpboot_thread_fn+0x2eb/0x6a0
? smpboot_register_percpu_thread+0x320/0x320
? __kthread_parkme+0x85/0x100
? smpboot_register_percpu_thread+0x320/0x320
kthread+0x31c/0x3e0
? __kthread_bind_mask+0x90/0x90
ret_from_fork+0x1f/0x30
Allocated by task 29:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc.constprop.0+0xc2/0xd0
ttls_decrypt+0xf6f/0x20b0 [tempesta_tls]
ttls_recv+0x23e/0x1e10 [tempesta_tls]
ss_skb_process+0x3a3/0x5e0 [tempesta_fw]
tfw_tls_connection_recv+0x17d/0xc00 [tempesta_fw]
ss_tcp_process_data+0x5d9/0xe30 [tempesta_fw]
ss_tcp_data_ready+0x13b/0x270 [tempesta_fw]
tcp_rcv_established+0x1155/0x1d10
tcp_v4_do_rcv+0x4fa/0x750
tcp_v4_rcv+0x2588/0x3530
ip_protocol_deliver_rcu+0x6a/0x550
ip_local_deliver_finish+0x1a4/0x250
ip_local_deliver+0x246/0x2a0
ip_rcv+0x15a/0x180
__netif_receive_skb_core.constprop.0+0xdd9/0x3050
__netif_receive_skb_list_core+0x2cf/0x8b0
netif_receive_skb_list_internal+0x5d7/0xc10
gro_normal_one+0x137/0x220
napi_gro_receive+0x23c/0x2a0
receive_buf+0xb68/0x57f0 [virtio_net]
virtnet_poll+0x4b0/0xc81 [virtio_net]
net_rx_action+0x37c/0xbc0
__do_softirq+0x1a6/0x5ab
The buggy address belongs to the object at ffff8881292b5000
which belongs to the cache kmalloc-256 of size 256
The buggy address is located 232 bytes inside of
256-byte region [ffff8881292b5000, ffff8881292b5100)
The buggy address belongs to the page:
page:00000000decafbbf refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1292b4
head:00000000decafbbf order:1 compound_mapcount:0
flags: 0x17ffffc0010200(slab|head)
raw: 0017ffffc0010200 0000000000000000 0000000500000001 ffff888100043400
raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8881292b4f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8881292b5000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff8881292b5080: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
^
ffff8881292b5100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8881292b5180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
- Home
- Requirements
- Installation
-
Configuration
- Migration from Nginx
- On-the-fly reconfiguration
- Handling clients
- Backend servers
- Load Balancing
- Caching Responses
- Non-Idempotent Requests
- Modify HTTP Messages
- Virtual hosts and locations
- HTTP Session Management
- HTTP Tables
- HTTP(S) Security
- Header Via
- Health monitor
- TLS
- Virtual host confusion
- Traffic Filtering by Fingerprints
- Access Log Analytics
- Run & Stop
- Application Performance Monitoring
- Use cases
- Performance
- Bot Protection
- Contributing