Bug #1678
I2C timeout followed by system hang (Linux)
Description
The last two Git versions cause I2C timeouts and module traces pointing to tvheadend. After a while Linux becomes unresponsive and needs a hardware reset to reboot.
gc338802 does not have this problem (same hardware, same actions, no timeout).
History
Updated by Adam Sutton over 11 years ago
- Assignee set to Adam Sutton
- Target version set to 3.4
- Affected Versions 3.4 added
I will take a look at this, I've been seeing some instability on my NAS box of late and had wondered if TVH was at fault. However I had at least 1 other plausible candidate for the problems.
Adam
Updated by Rob vh over 11 years ago
Since I reverted to gc338802, 4 days ago, my server has been super stable again (no i2c timeouts), but tvheadend is missing recordings like before.
Updated by Adam Sutton over 11 years ago
Unfortunately there was nothing in my logs to say why my NAS box died :(
I've never seen the I2C errors before, I'm not entirely sure how TVH could
be responsible for such things, though I guess if somehow it was hanging
the system it might cause things to delay and cause I2C issues. But that's
speculation.
But I do feel like my own TVH server has been unstable of late, but as
before at the moment I'm not 100% certain TVH is the culprit and not
something else (as I've managed to trash the machine with some NFS testing,
and I've extended my NFS usage quite a bit recently).
But I'll try and take a look. If you get a chance, it might be worth trying
to use git bisect to see if you can figure out a better approx point at
which things started failing?
Adam
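The bisection suggested above can be sketched in a few lines. This is only an illustration of the search that `git bisect` automates, not a replacement for it. The intermediate commit names and the `soak_test` helper are hypothetical; only gc338802 (reported good) and ge9ce021 (the version the bug was found in) come from this ticket.

```python
from typing import Callable, List, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history; the middle entries are placeholders.
history = ["gc338802", "commit-a", "commit-b", "commit-c", "ge9ce021"]
broken_from = "commit-b"  # pretend the regression landed here
tests_run: List[str] = []


def soak_test(commit: str) -> bool:
    """Stand-in for building a commit and running it until it hangs or not."""
    tests_run.append(commit)
    return history.index(commit) >= history.index(broken_from)


culprit = first_bad(history, soak_test)
print(culprit, "found after", len(tests_run), "test runs")
```

With the real tool the same loop is driven by `git bisect start`, `git bisect good <rev>` and `git bisect bad`, marking each tested build by hand.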
On 3 April 2013 10:56, wrote:
Issue #1678 has been updated by Rob vh.
------------------------------
Bug #1678: I2C timeout followed by system hang (Linux)
- Author: Rob vh
- Status: New
- Priority: Normal
- Assignee: Adam Sutton
- Category: DVB
- Target version: 3.4
- Found in version: ge9ce021
- Affected Versions: 3.4
The last 2 GIT version cause I2C timeouts and module traces pointing to
tvheadend. After a while Linux becomes unresponsive and you need a hardware
reset to boot.
gc338802 does not have problem (same hardware, same actions, no timeout).
Updated by Ben Kibbey over 11 years ago
I accidentally created ticket #1691, which seems to be the same bug as this one. Maybe it's in the DVB kernel modules and a bug should be reported at bugzilla.kernel.org? Can you tell me which revision in the tvheadend git tree is working for you?
Updated by Rob vh over 11 years ago
My server has been running stable on gc338802 with Linux 3.2.0-39. I have a ddbridge with 3 Duoflex S2 cards.
Sometime after gc338802, there were changes in mutex locking and in the way tuning is done, so if there is a kernel bug it is exercised by changes applied to tvheadend.
I have tried to find the commit that causes the crash, but it can take 10 to 20 hours for the timeout to happen (after changing and restarting tvheadend). The last crashes occurred when there was no recording active, just EIT collection.
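To put the 10 to 20 hour wait in perspective, here is a rough estimate of what a full bisection would cost. It is a back-of-the-envelope sketch: the candidate count of 100 commits is a made-up figure, and the calculation assumes every test build needs the full soak time before it can be called good.

```python
import math
from typing import Tuple


def bisect_budget(n_candidates: int, soak_hours: float) -> Tuple[int, float]:
    """Worst-case test runs and total hours to bisect n_candidates commits.

    n_candidates counts the commits after the known-good one, up to and
    including the known-bad one.
    """
    steps = math.ceil(math.log2(n_candidates)) if n_candidates > 1 else 0
    return steps, steps * soak_hours


# Hypothetical: 100 candidate commits, 20 hours of soak time per build.
steps, hours = bisect_budget(100, 20)
print(steps, "test runs, about", hours, "hours")
```

Note that a run which survives the soak time is only evidence that a commit is good, not proof, so an intermittent hang like this one can still send a bisection down the wrong branch.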
Updated by Adam Sutton over 11 years ago
I'm not 100% certain, since I'm quite adept at cocking things up. But to me this really sounds like a bug in the driver. It's possible that changes to the way TVH operates could have exposed it. But if there were a general/serious error (or still is one, since there were some mistakes in the middle of the updates), a lot of users would be seeing it.
I'm running master on my own server with 4*S2 and 2*T2 tuners and have yet to see a lockup (except as noted above). I did see some lockups a while back, but I'm now fairly certain this was down to bad configuration of my NFS server (I'd locked the box this way a few times during testing). I no longer believe it was down to TVH.
However I won't close this one just yet, as I'd still like to see if we can learn something...
Adam
Updated by Rob vh over 11 years ago
Out of curiosity, are you collecting EIT/EPG over the air, or are you using XMLTV?
Updated by Ben Kibbey over 11 years ago
I am using XMLTV. Same here, it takes about 24 hours with idle scanning enabled to lock up the box. I did manage to attach gdb to a process running on another CPU and noticed that any FD activity (read/write/ioctl/etc) blocks, and a reboot is still required.
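One way to gather more detail before the box becomes completely unusable is to list processes stuck in uninterruptible sleep (state `D` in `/proc/<pid>/stat`), which is where a process blocked inside a driver call usually ends up. This is a minimal sketch; that the blocked processes described above were actually in state `D` is an assumption, not something reported in this ticket.

```python
import os
from typing import List, Tuple


def parse_stat(stat_line: str) -> Tuple[int, str, str]:
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """Return (pid, comm) for every process currently in state D."""
    if not os.path.isdir(proc_root):
        return []
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the line was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Running it periodically from cron and logging the output to disk would leave a record of which processes blocked first, which could be useful in a kernel bug report.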
Updated by Ben Kibbey over 11 years ago
Do you think a bug report should be submitted to bugzilla.kernel.org? If so, would you like to report it? Since you have I2C debugging enabled, I think yours would be a more complete bug report.
Updated by Rob vh over 11 years ago
On Thursday I installed http://linuxtv.org/hg/~endriss/media_build_experimental/ on my Mythbuntu 12.04 server (3.2.0-40). When this proved stable, I upgraded to the Friday version of tvheadend. Rock-solid. No timeouts for 2 days.
Strangely, an adapter that was unstable before (tuner 1 on a DuoFlex S2 card) is fully working again with the new driver. I suspect that the device driver shipping with Ubuntu is unstable. Quite a pity, because this makes products like tvheadend and DuoFlex look less attractive at first glance.
Updated by Ben Kibbey over 11 years ago
I updated my kernel to 3.8.7 and still get the lockup with tvheadend-gaa0e5b1 when idle scanning is enabled. It must be in the general dvb kernel module since I have a cx18 chipset.
Updated by Adam Sutton over 11 years ago
- Status changed from New to Invalid
I'm going to close this, I really don't believe this is an issue within TVH, but more likely (as Rob's recent input suggests) a problem in the kernel and/or DVB driver modules.
Adam