Project

General

Profile

Bug #5896

SIGABRT during record removal on FreeBSD

Added by Martin Rehak over 4 years ago. Updated over 4 years ago.

Status:
Fixed
Priority:
Normal
Assignee:
-
Category:
Crashes
Target version:
-
Start date:
2020-05-13
Due date:
% Done:

0%

Estimated time:
Found in version:
4.3-1857~g221c29b40-dirty
Affected Versions:

Description

I am hitting this frequently with HTS Tvheadend 4.3-1857~g221c29b40-dirty on FreeBSD-12.1. This is on console:

2020-05-13 09:21:07.705 [ DEBUG]:dvr: adding inotify watch to /tv/live (fd=1)
2020-05-13 09:21:07.705 [ INFO]:dvr: delete entry 317667e0c3cb6991651657e5bbd3569f "Fejeton ze skal a kamení" on "CT2 CZ" start time 2020-05-09 08:59:30, scheduled for recording by "192.168.0.38", retention "On file removal" removal "3 days"
2020-05-13 09:21:07.706 [ DEBUG]:dvr: adding inotify watch to /tv/live (fd=1)
2020-05-13 09:21:07.706 [ INFO] dvr: delete entry 6a23133e6d99310c02ca58e46e15a9ed "Australská železnice" on "Prima Zoom CZ" start time 2020-05-09 14:54:30, scheduled for recording by "192.168.0.38", retention "On file removal" removal "3 days"
Abort trap

The issue doesn't relate to any Tvheadend DVB work. If I stop all receivers
and DVR and just remove recordings this abort appears.

No core remains on the filesystem, no stack trace is in the log. I was not
able to track down exact place in the code where the abort happens even after
abort handlers breakpoint has been installed in gdb. The only thing what I
have is a truss output. Complete truss attached.

...
11222: rename("/tv/live/P<C5><99><C3><AD>b<C4><9B>hy ze Zambie-Prima Zoom CZ2020-05-1016-50-1.ts","/tv/live/.<C5><99><C3><AD>b<C4><9B>hy ze Zambie-Prima Zoom CZ2020-05-1016-50-1.ts.removing") = 0 (0x0)
11222: getdirentries(85,"b;\0\0\0\0\0\0u\^W9\^_\0\0\0\08"...,4096,{ 0x1f2de95c }) = 2456 (0x998)
11222: access("/tv/live/P<C5><99><C3><AD>b<C4><9B>hy ze Zambie-Prima Zoom CZ2020-05-1016-50-1.txt",F_OK) ERR#2 'No such file or directory'
11222: getdirentries(85,0x8043f7000,4096,{ 0x1ffd85e0 }) = 0 (0x0)
11222: lseek(91,0x0,SEEK_END) = 185762445 (0xb12828d)
11222: close(85) = 0 (0x0)
11222: access("/tv/live/P<C5><99><C3><AD>b<C4><9B>hy ze Zambie-Prima Zoom CZ2020-05-1016-50-1.edl",F_OK) ERR#2 'No such file or directory'
11222: fcntl(91,F_GETFD,) = 0 (0x0)
11222: fcntl(91,F_SETFD,FD_CLOEXEC) = 0 (0x0)
11222: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },0x0) = 0 (0x0)
11222: fstat(91,{ mode=-rw-r--r-- ,inode=107084,size=185762445,blksize=131072 }) = 0 (0x0)
11222: __sysctl(0x7fffd83be7a0,0x2,0x7fffd83bf4f0,0x7fffd83be798,0x0,0x0) = 0 (0x0)
11222: getpid() = 11222 (0x2bd6)
11222: write(91,"2020-05-13 10:17:36.254 [ INFO"...,247) = 247 (0xf7)
11222: sendto(8,"<26>1 2020-05-13T10:17:36.263645"...,109,0,NULL,0) = 109 (0x6d)
11222: close(91) = 0 (0x0)
11222: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 }) = 0 (0x0)
11222: kevent(44,0x0,0,{ 43,EVFILT_READ,0x0,0,0,0x800f90c00 },1,{ 0.150000000 }) = 1 (0x1)
11222: sigaction(SIGABRT,{ SIG_DFL 0x0 ss_t },{ SIG_DFL 0x0 ss_t }) = 0 (0x0)
11222: read(43,"GI\^F\^Q\0\0\^A\M-`\0\0\M^D\M-P"...,188000) = 31208 (0x79e8)
11222: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },0x0) = 0 (0x0)
11222: sendmsg(40,{NULL,0,[{"\^A\0\0\0\0\^B\0\0\0\0\0\0Q\0\0"...,97},{"\^A\0\0\0\0\^B\0\0\0\0\0\0O\0\0"...,95},{"\^A\0\0\0\0\^B\0\0\0\0\0\0005\0"...,69}],3,{},0,0},MSG_NOSIGNAL) = 261 (0x105)
11222: kevent(37,0x0,0,{ 38,EVFILT_VNODE,EV_CLEAR,NOTE_WRITE,0,0x804404530 },1,0x0) = 1 (0x1)
11222: _umtx_op(0x8044bdeb8,UMTX_OP_NWAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
11222: _umtx_op(0x800f98380,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
11222: getpid() = 11222 (0x2bd6)
11222: fcntl(38,F_DUPFD_CLOEXEC,0x0) = 85 (0x55)
11222: lseek(85,0x0,SEEK_SET) = 0 (0x0)
11222: kill(11222,SIGABRT) = 0 (0x0)
11222: _umtx_op(0x800f98008,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) ERR#4 'Interrupted system call'
11222: read(39,"\^A\0\0\0\0\^B\0\0\0\0\0\0Q\0\0"...,335) = 261 (0x105)
11222: SIGNAL 6 (SIGABRT) code=SI_USER pid=11222 uid=984
11222: <thread 101915 exited>
11222: <thread 102390 exited>
...

Could anyone guide me where to place a breakpoint or how to force the core to
be written, please?


Files

trace.gz (3.62 MB) trace.gz truss trace Martin Rehak, 2020-05-13 14:22
core.76255.gz (8.2 MB) core.76255.gz Martin Rehak, 2020-05-15 11:26

History

#1

Updated by Flole Systems over 4 years ago

Use a clean version from the git repository. Without a core dump or stack trace it is impossible to track down the issue. You need to enable that first and then you can provide the necessary information.

#2

Updated by Martin Rehak over 4 years ago

Flole Systems wrote:

Use a clean version from the git repository. Without a core dump or stack trace it is impossible to track down the issue. You need to enable that first and then you can provide the necessary information.

That's what I would like to achieve. Maybe I am missing something. I have cloned

https://github.com/tvheadend/tvheadend.git

To be able to compile latest 4.3 version I have checkout v4.3 tag. And that's where I am:

  1. git log
    commit 221c29b40b1e53ae09a69d9458442dd4fea665f5 (HEAD -> master, origin/master, origin/HEAD)
    Author: Flole998 <>
    Date: Sat Nov 23 17:11:58 2019 +0100

    Fix #5782
    ...

I have few changes in the workspace which are needed to compile it on FreeBSD (patches from FreeBSD ports).

Am I doing something wrong?

Thank you

#3

Updated by Flole Systems over 4 years ago

I just noticed the "dirty" suffix on your build number. That means that there was some kind of modification done (in this case the FreeBSD patches), as there are quite some weird (as in crazy) changes done by various people it is generally a bad sign when the dirty suffix is there. The OpenWRT Port for example is one of those that has several issues which are caused by such a modification.

You need to figure out how you can activate core dumps on your system, once you got those working it will probably be easy to find the issue.

#4

Updated by Martin Rehak over 4 years ago

And that's the issue. FreeBSD cores on disk by default so it seems tvheadend does something to prevent that.

I would say ABRT is blocked by process on demand:

11222: sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2 },0x0) = 0 (0x0)

The question is whether it is done by some code in src/trap.c intentionally or this is executed as part of some libc code temporarily. Any clue? Any clue to avoid it, please?

#5

Updated by Flole Systems over 4 years ago

Did you start tvheadend with the "-D" parameter? That is necessary to enable coredumps.

#6

Updated by Martin Rehak over 4 years ago

I have tried that. And it was not working. And now I am looking into the code and see:

/* Make dumpable */
if (opt_dump) {
if (chdir("/tmp"))
tvhwarn(LS_START, "failed to change cwd to /tmp");
#ifdef PLATFORM_LINUX
prctl(PR_SET_DUMPABLE, 1);
#else
tvhwarn(LS_START, "Coredumps not implemented on your platform");
#endif
}

Any advice? I think if tvheadend doesn't block SIGABRT it would dump it. I will try to hack it in trap.c.

#7

Updated by Flole Systems over 4 years ago

Even with the abort handler it is possible to get a coredump (at least on Ubuntu/Debian). There should be a stack trace in the Log (which isn't there for you either for some reason) aswell as a dump file, assuming the requirements (see the #ifdef directives) are fulfilled at compiletime.

#8

Updated by Martin Rehak over 4 years ago

So I have found that it is looking for /proc/self (which is linux procfs) and it is possible to emulate that in FreeBSD. So I have set it up and run tvh in gdb again, delete few files and got this:

2020-05-15 11:16:25.678 [ INFO] dvr: delete entry 00b1c60c6d4a92e61014e5d07a5ae2ba "AZ-kvíz" on "CT1 CZ" start time 2020-05-12 11:29:30, scheduled for recording by "192.168.0.38", retention "On file removal" removal "3 days"
2020-05-15 11:16:25.680 [ INFO] dvr: delete entry 10c328beb5bc4af1432c6e52d373add8 "Život je hra" on "Prima Cool CZ" start time 2020-05-12 11:34:30, scheduled for recording by "192.168.0.38", retention "On file removal" removal "3 days"
2020-05-15 11:16:25.682 [ INFO] dvr: delete entry 6515320b1710128cc0ea04141b9f9568 "Život je hra" on "Prima Cool CZ" start time 2020-05-12 11:49:30, scheduled for recording by "192.168.0.38", retention "On file removal" removal "3 days"
2020-05-15 11:16:25.684 [ INFO] dvr: delete entry 380b19ee80c2726355954ea2aa49a6b9 "Pravda o lvech" on "CT2 CZ" start time 2020-05-12 11:59:30, scheduled for recording by "192.168.0.38", retention "On file removal" removal "3 days"
2020-05-15 11:16:25.686 [ INFO] dvr: delete entry b0c944431810e0432ceaec8c0b7f4ac9 "Zprávy ve 12" on "CT24 CZ" start time 2020-05-12 11:59:30, scheduled for recording by "192.168.0.38", retention "On file removal" removal "3 days"
2020-05-15 11:16:25.687 [ INFO] dvr: delete entry adeeb335603bd770429fa582f6c8bda3 "Zprávy ve 12" on "CT1 CZ" start time 2020-05-12 11:59:30, scheduled for recording by "192.168.0.38", retention "On file removal" removal "3 days"
--Type <RET> for more, q to quit, c to continue without paging--c

Thread 99 "tvh:tcp-start" received signal SIGABRT, Aborted.
[Switching to LWP 102086 of process 76255]
0x0000000800c006ca in kill () from /lib/libc.so.7
(gdb) bt
#0 0x0000000800c006ca in kill () from /lib/libc.so.7
#1 0x0000000800c00680 in ?? () from /lib/libc.so.7
#2 0x0000000800c005f0 in __stack_chk_fail () from /lib/libc.so.7
#3 0x00000000003e1a20 in dvr_cutpoint_delete_files (
s=0x803fa6381 "/tv/live/Zpr\303\241vy ve 12-CT1 CZ2020-05-1212-00.ts") at src/dvr/dvr_cutpoints.c:307
#4 0x000000000033a344 in deferred_unlink (
filename=0x803fa6381 "/tv/live/Zpr\303\241vy ve 12-CT1 CZ2020-05-1212-00.ts", rootdir=0x0) at src/utils.c:950
#5 0x00000000003cc9d2 in dvr_entry_delete (de=<optimized out>;) at src/dvr/dvr_db.c:4736
#6 0x00000000003d1d71 in dvr_entry_cancel_delete_remove (de=0x804382900, rerecord=0, _delete=0)
at src/dvr/dvr_db.c:4889
#7 0x00000000003ada2b in api_idnode_handler (idc=0x505080 <dvr_entry_class>, perm=0x804908000, args=<optimized out>,
resp=<optimized out>, handler=0x3b4d80 <api_dvr_remove>, op=<optimized out>, destroyed=0)
at src/api/api_idnode.c:645
#8 0x00000000003b45b5 in api_dvr_entry_remove (perm=0x6, opaque=<optimized out>, op=<optimized out>, args=0x0,
resp=0x800c4f32a <getpid+10>;) at src/api/api_dvr.c:344
#9 0x00000000003eab47 in webui_api_handler (hc=0x7fffd89c2d48, remain=0x8050e965a "dvr/entry/remove",
opaque=<optimized out>;) at src/webui/webui_api.c:47
#10 0x000000000034bad0 in http_exec (hc=0x7fffd89c2d48, hp=0x7fffd89c2ad0, remain=<optimized out>;) at src/http.c:1246
#11 0x000000000034b67c in http_cmd_post (hc=0x7fffd89c2d48, spill=0x7fffd89c2cc8) at src/http.c:1383
#12 http_process_request (hc=0x7fffd89c2d48, spill=0x7fffd89c2cc8) at src/http.c:1405
#13 0x0000000000349f2a in process_request (hc=0x7fffd89c2d48, spill=0x7fffd89c2cc8) at src/http.c:1531
#14 0x000000000034988c in http_serve_requests (hc=0x7fffd89c2d48) at src/http.c:2023
#15 0x000000000034a38f in http_serve (fd=91, opaque=0x801bbdcb0, peer=0x801bbdcc8, self=0x801bbdd48)
at src/http.c:2074
#16 0x0000000000343824 in tcp_server_start (aux=0x801bbdc80) at src/tcp.c:724
#17 0x000000000033b69d in thread_wrapper (p=0x801b5b8e0) at src/tvh_thread.c:91
#18 0x000000080095c736 in ?? () from /lib/libthr.so.3
#19 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffd89c3000
(gdb) generate-core-file
Saved corefile core.76255
(gdb) c
Continuing.
[LWP 102355 of process 76255 exited]
...
[LWP 102892 of process 76255 exited]

Program terminated with signal SIGABRT, Aborted.
The program no longer exists.
(gdb)

No stack trace in the log file, don't know why, but now I am able to create a FreeBSD core file. I have attached the file, though I understand that it is not much useful for you (I assume you don't have tvh running on fbsd).

Do you see anything interesting from the stack trace, please? I can do whatever debugging you need to investigate the issue.

#10

Updated by Martin Rehak over 4 years ago

src/dvr/dvr_cutpoints.c:

void
dvr_cutpoint_delete_files (const char *s) {
char *path, *dot;
int i;

path = alloca(strlen(s) + 1);
/* Check each cutlist extension */
for (i = 0; i < ARRAY_SIZE(dvr_cutpoint_parsers); i++) {
strcpy(path, s);
if ((dot = (strrchr(path, '.') + 1)))
*dot = 0;
strcat(path, dvr_cutpoint_parsers[i].ext);
/* Check file exists */
if (access(path, F_OK))
continue;
/* Delete File */
tvhinfo(LS_DVR, "Erasing cutpoint file '%s'", (const char *)path);
if (unlink(path))
tvherror(LS_DVR, "unable to remove cutpoint file '%s'", (const char *)path);
}
}

alloca() allocates just enough space for file.ts, but when 'txt' or 'edl' is concated it runs out of the buffer.

#11

Updated by Flole Systems over 4 years ago

  • Status changed from New to Fixed

Fixed in latest

#12

Updated by Martin Rehak over 4 years ago

Great! How can I clone the repo with your fix, please?

Thank you very much.

#13

Updated by Flole Systems over 4 years ago

Latest master on github.com/tvheadend/tvheadend contains it, so git clone (or git pull) should get it.

Also available in: Atom PDF