Bug #5259
Bus Error at startup when compiled for ARMHF architecture
0%
Description
Hi everyone,
I have been trying to crosscompile for my mediabox (KII Pro Amlogic S905 running Libreelec 8.2), the latest version from the master branch and while it successfully compiles, I do get a bus error immediately as I execute the binary. I have analysed via gdb the coredump created and the output point at this particular line the code.
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `./tvheadend -c tvh'.
Program terminated with signal SIGBUS, Bus error.
#0 htsmsg_field_set_msg (sub=0xac8c22c8, f=<optimized out>, f=<optimized out> at src/htsmsg.c:604
Code in src/htsmsg.c:604 is m->hm_data_size = 0; which seems innocent enough.
I have eventually traced back that the last version that works for me is 4.3-925. The subsequent commit (https://github.com/tvheadend/tvheadend/commit/d35359fd828ea4011a6ae22e723269d2b04b8b31) broke tvheadend for me, with the above mentioned error.
Since my knowledge of C is almost non-existent, I was wondering if you guys are aware of this issue and what can be done to fix it. Are there any compiler flags I can use, or does the code needs to be modified to work for my box? I have already tried using both the toolchain built by LibreELEC's build process and the the cross-compiler supplied by Ubuntu 18.04.
Thanks for your help guys and keep up the awesome work.
History
Updated by Luis Alves about 6 years ago
The line "m->hm_data_size = 0;" isn't that innocent if the location of the variable pointer is an unaligned memory address
You can try to change line 81 on htsmsg.h from "size_t hmf_edata_size;" to "size_t hmf_edata_size attribute((packed));"
If that fails, you could try the compiler flag "-mno-unaligned-access".
On arm, the linux kernel can usually deal with unaligned accesses by software emulation. Can you paste here the output of:
cat /proc/cpu/alignment
Updated by Luis Alves about 6 years ago
The keywork "attribute" above should be enclosed by double "_" (forgot to use code formating)size_t hmf_edata_size __attribute__((packed));
Updated by Luis Alves about 6 years ago
Forgot to mention that you should be able to control behaviour with:echo N > /proc/cpu/alignment
where N is:
0 - Do nothing (default behavior)
1 - Warning in kernel-log with PC and Memory-Address printed.
2 - Fixup error
3 - Warn and Fixup
4 - Send a SIGBUS to the process
5 - Send SIGBUS and output Warning
Using 2 should allow you to run tvh (although with some overhead to fix mem access)
More info @ https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm/mem_alignment
Paste the output of the current alignment state and then try:echo 2 > /proc/cpu/alignment
Updated by Jaroslav Kysela about 6 years ago
Luis Alves wrote:
The line "m->hm_data_size = 0;" isn't that innocent if the location of the variable pointer is an unaligned memory address
You can try to change line 81 on htsmsg.h from "size_t hmf_edata_size;" to "size_t hmf_edata_size attribute((packed));"
The '__attribute__((packed))' forces the unaligned offsets for the variables!!! The compiler should align variables properly when this attribute is not used.
Updated by Jaroslav Kysela about 6 years ago
It looks like a broken compiler or a bad compiler configuration. Could you print addresses of 'm' and 'm->hm_data_size' in gdb?
Run program in gdb, wait for the signal. (gdb) bt (gdb) frame 1 # might be different number - analyze 'bt' (gdb) print &m (gdb) print &m->hm_data_size
Updated by Shaun Schembri about 6 years ago
Unfortunately under LibreELEC /proc/cpu/alignment is unavailable. I did look for it in other places but setting it or trying to cat from it fails. I guess its disabled when they build the kernel for it.
I tried Luis change and by adding the attribute you mentioned and the compiler flag it worked but I got the same error buserror 2 lines further down. Grrrrrrrr
But there is some light....I managed to get a working binary (at least it starts and doesn't crash at startup) by compiling using -O1 optimisations instead of -O2. I don't know how unoptimised the binary is, but tonight I try to pin-point which optimiser flag actually causes the problem. I think if I manage to get a binary with -O2 and exclude the problematic optimisation I should be good to go.
Thanks again
Updated by Luis Alves about 6 years ago
Jaroslav Kysela wrote:
The '__attribute__((packed))' forces the unaligned offsets for the variables!!! The compiler should align variables properly when this attribute is not used.
Not true when casting an arbitrary data block to something else.
Example on line 144 (https://github.com/tvheadend/tvheadend/blob/master/src/htsmsg.c#L144)
f->hmf_msg = (htsmsg_t *)(f->hmf_name + nsize);
It's being assigned to the "msg" pointer something based on a size which sometimes will end up on an unaligned address...
Shaun,
As workaround, you can change the htsmsg.h "typedef struct htsmsg" to this: https://pastebin.com/2J8Shcf1
(add the packed attribute to all fields)
Updated by Luis Alves about 6 years ago
Done a quick hack and found the 2 places where it happens:
https://github.com/tvheadend/tvheadend/blob/master/src/htsmsg.c#L144
and:
https://github.com/tvheadend/tvheadend/blob/master/src/htsmsg_binary2.c#L175
Updated by Luis Alves about 6 years ago
Forgot to post my "hack"...
https://pastebin.com/YisGtCuq
It basically aligns the htsmsg_t pointer to the cpu word boundary.
Wastes some padding bytes but you don't have to force the compiler to do aligned accesses with the packed attribute and performance wise should be better for all architectures)
Updated by Luis Alves about 6 years ago
ps: the above hack is not a fix (other parts of the code would need adjutments), just a proof of concept...
Updated by Shaun Schembri about 6 years ago
This is the gdb output Jaroslav requested...hope its what you needed
(gdb) bt
#0 htsmsg_field_set_msg (sub=0xac5f8ce8, f=<optimized out>, f=<optimized out> at src/htsmsg.c:604
#1 0xaadc860a in json_parse_value (
s=0xac5fa510 " [\n\t\t\"login\",\n\t\t\"storage\",\n\t\t\"time\"\n\t],\n\t\"chname_num\": true,\n\t\"chname_src\": false,\n\t\"date_mask\": \"\",\n\t\"label_formatting\": false,\n\t\"language\": [\n\t],\n\t\"epg_cutwindow\": 300,\n\t\"epg_window\": 86400,\n\t\"prefe"..., parent=parent@entry=0xac5f8b18, name=name@entry=0xac5f8c48 "info_area", jd=jd@entry=0xabf4f9e0 <json_to_htsmsg>, opaque=opaque@entry=0x0,
failp=failp@entry=0xff9dee10, failmsg=failmsg@entry=0xff9dee0c) at src/misc/json.c:367
#2 0xaadc8848 in json_parse_map (s=<optimized out>,
s@entry=0xac5fa438 "{\n\t\"server_name\": \"Tvheadend\",\n\t\"version\": 24,\n\t\"full_version\": \"4.3-1468~g3f74523d2-dirty\",\n\t\"theme_ui\": \"blue\",\n\t\"ui_quicktips\": true,\n\t\"uilevel\": 0,\n\t\"uilevel_nochange\": false,\n\t\"caclient_ui\": fals"..., endp=endp@entry=0xff9dee08, jd=jd@entry=0xabf4f9e0 <json_to_htsmsg>, opaque=opaque@entry=0x0, failp=failp@entry=0xff9dee10,
failmsg=failmsg@entry=0xff9dee0c) at src/misc/json.c:194
#3 0xaadc8a06 in json_deserialize (
src=src@entry=0xac5fa438 "{\n\t\"server_name\": \"Tvheadend\",\n\t\"version\": 24,\n\t\"full_version\": \"4.3-1468~g3f74523d2-dirty\",\n\t\"theme_ui\": \"blue\",\n\t\"ui_quicktips\": true,\n\t\"uilevel\": 0,\n\t\"uilevel_nochange\": false,\n\t\"caclient_ui\": fals"..., jd=0xabf4f9e0 <json_to_htsmsg>, opaque=opaque@entry=0x0, errbuf=errbuf@entry=0x0, errlen=errlen@entry=0) at src/misc/json.c:415
#4 0xaadc6f12 in htsmsg_json_deserialize (
src=src@entry=0xac5fa438 "{\n\t\"server_name\": \"Tvheadend\",\n\t\"version\": 24,\n\t\"full_version\": \"4.3-1468~g3f74523d2-dirty\",\n\t\"theme_ui\": \"blue\",\n\t\"ui_quicktips\": true,\n\t\"uilevel\": 0,\n\t\"uilevel_nochange\": false,\n\t\"caclient_ui\": fals"...) at src/htsmsg_json.c:213
#5 0xaadc9536 in hts_settings_load_one (filename=filename@entry=0xff9dee8c "/tmp/tvh/config") at src/settings.c:257
#6 0xaadc9684 in hts_settings_load_path (fullpath=0xff9dee8c "/tmp/tvh/config", depth=0) at src/settings.c:314
#7 0xaadc96f6 in hts_settings_vload (pathfmt=0xaae92988 "config", pathfmt@entry=0xabf82668 "8\325\"\001", ap=..., ap@entry=..., depth=depth@entry=0) at src/settings.c:334
#8 0xaadc9a84 in hts_settings_load (pathfmt=0xaae92988 "config") at src/settings.c:357
#9 0xaadcee86 in config_boot (path=0xff9e17cf "tvh", gid=4294967295, uid=4294967295, http_user_agent=<optimized out> at src/config.c:1756
#10 0xaad93b74 in main (argc=<optimized out>, argv=<optimized out> at src/main.c:1089
(gdb) frame 0
#0 htsmsg_field_set_msg (sub=0xac5f8ce8, f=<optimized out>, f=<optimized out> at src/htsmsg.c:604
604 in src/htsmsg.c
(gdb) print &m
Address requested for identifier "m" which is in register $r4
(gdb) print &m->hm_data_size
$1 = (size_t *) 0xac5fbb4a
Now I am gonna try...Luis "hack" and report the outcome
Updated by Shaun Schembri about 6 years ago
Sorry forgot to format
(gdb) bt
#0 htsmsg_field_set_msg (sub=0xac5f8ce8, f=<optimized out>, f=<optimized out>) at src/htsmsg.c:604
#1 0xaadc860a in json_parse_value (
s=0xac5fa510 " [\n\t\t\"login\",\n\t\t\"storage\",\n\t\t\"time\"\n\t],\n\t\"chname_num\": true,\n\t\"chname_src\": false,\n\t\"date_mask\": \"\",\n\t\"label_formatting\": false,\n\t\"language\": [\n\t],\n\t\"epg_cutwindow\": 300,\n\t\"epg_window\": 86400,\n\t\"prefe"..., parent=parent@entry=0xac5f8b18, name=name@entry=0xac5f8c48 "info_area", jd=jd@entry=0xabf4f9e0 <json_to_htsmsg>, opaque=opaque@entry=0x0,
failp=failp@entry=0xff9dee10, failmsg=failmsg@entry=0xff9dee0c) at src/misc/json.c:367
#2 0xaadc8848 in json_parse_map (s=<optimized out>,
s@entry=0xac5fa438 "{\n\t\"server_name\": \"Tvheadend\",\n\t\"version\": 24,\n\t\"full_version\": \"4.3-1468~g3f74523d2-dirty\",\n\t\"theme_ui\": \"blue\",\n\t\"ui_quicktips\": true,\n\t\"uilevel\": 0,\n\t\"uilevel_nochange\": false,\n\t\"caclient_ui\": fals"..., endp=endp@entry=0xff9dee08, jd=jd@entry=0xabf4f9e0 <json_to_htsmsg>, opaque=opaque@entry=0x0, failp=failp@entry=0xff9dee10,
failmsg=failmsg@entry=0xff9dee0c) at src/misc/json.c:194
#3 0xaadc8a06 in json_deserialize (
src=src@entry=0xac5fa438 "{\n\t\"server_name\": \"Tvheadend\",\n\t\"version\": 24,\n\t\"full_version\": \"4.3-1468~g3f74523d2-dirty\",\n\t\"theme_ui\": \"blue\",\n\t\"ui_quicktips\": true,\n\t\"uilevel\": 0,\n\t\"uilevel_nochange\": false,\n\t\"caclient_ui\": fals"..., jd=0xabf4f9e0 <json_to_htsmsg>, opaque=opaque@entry=0x0, errbuf=errbuf@entry=0x0, errlen=errlen@entry=0) at src/misc/json.c:415
#4 0xaadc6f12 in htsmsg_json_deserialize (
src=src@entry=0xac5fa438 "{\n\t\"server_name\": \"Tvheadend\",\n\t\"version\": 24,\n\t\"full_version\": \"4.3-1468~g3f74523d2-dirty\",\n\t\"theme_ui\": \"blue\",\n\t\"ui_quicktips\": true,\n\t\"uilevel\": 0,\n\t\"uilevel_nochange\": false,\n\t\"caclient_ui\": fals"...) at src/htsmsg_json.c:213
#5 0xaadc9536 in hts_settings_load_one (filename=filename@entry=0xff9dee8c "/tmp/tvh/config") at src/settings.c:257
#6 0xaadc9684 in hts_settings_load_path (fullpath=0xff9dee8c "/tmp/tvh/config", depth=0) at src/settings.c:314
#7 0xaadc96f6 in hts_settings_vload (pathfmt=0xaae92988 "config", pathfmt@entry=0xabf82668 "8\325\"\001", ap=..., ap@entry=..., depth=depth@entry=0) at src/settings.c:334
#8 0xaadc9a84 in hts_settings_load (pathfmt=0xaae92988 "config") at src/settings.c:357
#9 0xaadcee86 in config_boot (path=0xff9e17cf "tvh", gid=4294967295, uid=4294967295, http_user_agent=<optimized out>) at src/config.c:1756
#10 0xaad93b74 in main (argc=<optimized out>, argv=<optimized out>) at src/main.c:1089
(gdb) frame 0
#0 htsmsg_field_set_msg (sub=0xac5f8ce8, f=<optimized out>, f=<optimized out>) at src/htsmsg.c:604
604 in src/htsmsg.c
(gdb) print &m
Address requested for identifier "m" which is in register $r4
(gdb) print &m->hm_data_size
$1 = (size_t *) 0xac5fbb4a
Updated by Jaroslav Kysela about 6 years ago
Luis Alves wrote:
Jaroslav Kysela wrote:
The '__attribute__((packed))' forces the unaligned offsets for the variables!!! The compiler should align variables properly when this attribute is not used.
Not true when casting an arbitrary data block to something else.
Yep, I see the problem now. The casting to the allocated memory with an unaligned offset makes that. It has nothing to do with the compiler.
Shaun Schembri wrote:
$1 = (size_t *) 0xac5fbb4a
Yes, this is completely unaligned address.
Updated by Shaun Schembri about 6 years ago
Luis Alves wrote:
Forgot to post my "hack"...
https://pastebin.com/YisGtCuqIt basically aligns the htsmsg_t pointer to the cpu word boundary.
Wastes some padding bytes but you don't have to force the compiler to do aligned accesses with the packed attribute and performance wise should be better for all architectures)
You "hack" works and I managed to start tvheadend compiled with -O2. However I could only do limited tests with a fresh configuration to see if it crashes somewhere else. I haven't updated my conf directory since last December and some things seems to be broken (ex Kodi which I use as a client doesn't list any channels).
Any idea how its best to update my configuration without starting afresh? Now that I can build any version with -O1 maybe its a good idea I build interim versions and update my configuration till the latest commit.
Thanks again guys....you rock
Updated by Luis Alves about 6 years ago
Luis Alves wrote:
ps: the above hack is not a fix (other parts of the code would need adjutments), just a proof of concept...
Actually, I don't think anything else needs to be changed...
Shaun,
Can you try build and run the with the following changes: https://pastebin.com/YisGtCuq
(it's just adding 1 line on each file: htsmsg.c and htsmsg_binary2.c)
I'm sure Jaroslav can eventually find a prettier fix...
Updated by Shaun Schembri about 6 years ago
Luis Alves wrote:
Shaun,
Can you try build and run the with the following changes: https://pastebin.com/YisGtCuq
(it's just adding 1 line on each file: htsmsg.c and htsmsg_binary2.c)
Actually my results posted above was with you patch applied.
Any idea about how best to upgrade my configuration?
Updated by Shaun Schembri about 6 years ago
Found my issue why my channels where all gone...I didn't build with zlib so the new binary could not read my old compressed mux files. Now with zlib all is OK....I will keep testing and report and further issues.
Updated by Shaun Schembri about 6 years ago
Did a fresh build and I got another crash.....I believe I tested earlier with -O1 build before hence why I didn't get the error. Too much trail and error lately.
@Program terminated with signal SIGBUS, Bus error.
#0 htsmsg_binary_des0 (buf=0xabb03054 "\003\006", buf@entry=0xabb03048 "\001\006", len=3123, len@entry=3135, msg=<optimized out> at src/htsmsg_binary.c:119
Updated by Luis Alves about 6 years ago
Shaun Schembri wrote:
Did a fresh build and I got another crash.....I believe I tested earlier with -O1 build before hence why I didn't get the error. Too much trail and error lately.
@Program terminated with signal SIGBUS, Bus error.
#0 htsmsg_binary_des0 (buf=0xabb03054 "\003\006", buf@entry=0xabb03048 "\001\006", len=3123, len@entry=3135, msg=<optimized out> at src/htsmsg_binary.c:119
Missed that one... just apply the same kind of "fix": https://pastebin.com/AaGuhfaV
Updated by Luis Alves about 6 years ago
Just one comment on: [[https://github.com/tvheadend/tvheadend/commit/fdaa48b32b0566fbf56a588e54fec28b4350d35a#diff-397651fb104a69a9290d09e030123030R110]]
This alignment function is hard-coded to a 64-bit word alignment that will "waste" memory on 32 bit machines.
These are the methods I know of to do word alignment:
a) Use sizeof(int) which should always be the size of the machine for which it is being compiled on: mask = "sizeof(int)-1"
b) or #include <bits/wordsize.h> and use the "__WORDSIZE" definition: mask = "(__WORDSIZE >> 3) -1"
(both are replaced on compilation time so it doesn't cause any runtime overhead)
Anyway, this is not critical - at least until 128-bit ARMs that don't deal with unaligned mem addresses show up
Keep up the great work Jaroslav!
Updated by Shaun Schembri about 6 years ago
I have built a new binary however I will only be capable of doing a full scale test (covering all the features I use) over this coming weekend as I will be away from home.
Thanks a lot guys....I am sure other will find this patch useful as more and more people move away from 4.2 to newer versions
Updated by Jaroslav Kysela about 6 years ago
Luis Alves wrote:
Just one comment on: [[https://github.com/tvheadend/tvheadend/commit/fdaa48b32b0566fbf56a588e54fec28b4350d35a#diff-397651fb104a69a9290d09e030123030R110]]
This alignment function is hard-coded to a 64-bit word alignment that will "waste" memory on 32 bit machines.These are the methods I know of to do word alignment:
a) Use sizeof(int) which should always be the size of the machine for which it is being compiled on: mask = "sizeof(int)-1"
b) or #include <bits/wordsize.h> and use the "__WORDSIZE" definition: mask = "(__WORDSIZE >> 3) -1"
(both are replaced on compilation time so it doesn't cause any runtime overhead)
I though about this, but the null strings are eliminated now (no overhead because there is no extra offset) and usually we need more than 4 bytes for the variable names, thus almost all allocations will end with
8+ alignments. The cache alignment is mostly bigger than 4+, so we're fine.
Updated by Shaun Schembri about 6 years ago
Just a small update....have been running this a few days and today I "stress" tested it quite a bit. All looks ok now so you can close this issue.
Keep up the awesome work