Bug #5366

open

EPG text is badly encoded and needs cleaning

Added by Dave H over 6 years ago. Updated over 6 years ago.

Status: New
Priority: Normal
Assignee:
Category: EPG
Target version: -
Start date: 2018-11-29
Due date:
% Done: 0%
Estimated time:
Found in version: 4.2.6 and others
Affected Versions:

Description

EPG text supplied by Freeview in the UK contains some illegal characters, apparently caused by a broken encoding step somewhere in the transmission path. The most common example is the byte 0x19, which appears to be the low byte of U+2019 (the right single quotation mark), sent without its high byte and in the wrong character set.
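To illustrate the suspected failure, here is a small Python sketch showing how U+2019 splits into a high and a low byte when stored as a 16-bit value, and what is left if only the low byte survives. The helper name `split_utf16_be` is made up for this example, and the 16-bit big-endian layout is an assumption about how the upstream system handled the character, not something confirmed by the broadcaster.

```python
def split_utf16_be(ch: str) -> tuple[int, int]:
    """Return the (high, low) bytes of a BMP character encoded as UTF-16 big-endian."""
    high, low = ch.encode("utf-16-be")
    return high, low


# U+2019 is the right single quotation mark (the curly apostrophe).
high, low = split_utf16_be("\u2019")
print(hex(high), hex(low))

# If the high byte is lost, only the low byte remains. 0x19 falls in the
# C0 control range (it is END OF MEDIUM in ASCII), so it is not printable text.
stray = bytes([low])
print(stray)
```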

To avoid propagating these errors to other parts of the system, the encoding should be corrected or substituted as early as possible in its path through TVH. A forum thread (https://tvheadend.org/boards/5/topics/35265?r=35325) explains the issue in more detail, lists the most likely incorrect character codes, and links to external sources that describe the problem in depth.
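One possible shape for such a cleaning step is sketched below in Python, purely to illustrate the logic; Tvheadend itself is written in C and this is not its code. The names `SUBSTITUTIONS` and `clean_epg_text` are hypothetical. Only the 0x19 to U+2019 pairing comes from this report; any further entries would need to be taken from the list in the forum thread. Dropping unrecognised control characters is a policy assumption, not a requirement stated here.

```python
import re

# Known bad code -> intended character. Only 0x19 is taken from this report;
# add further entries once they are confirmed against real EPG data.
SUBSTITUTIONS = {
    0x19: "\u2019",  # stray low byte of the right single quotation mark
}

# C0 control characters, excluding tab, line feed and carriage return.
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def clean_epg_text(text: str, drop_unknown: bool = True) -> str:
    """Replace known bad control characters; optionally drop the rest."""

    def fix(match: re.Match) -> str:
        code = ord(match.group())
        if code in SUBSTITUTIONS:
            return SUBSTITUTIONS[code]
        # Assumption: an unmapped control character carries no useful text.
        return "" if drop_unknown else match.group()

    return _CONTROL.sub(fix, text)


print(clean_epg_text("Today\x19s headlines"))
```

Running the substitution once, at the point where EPG strings are first decoded, would keep the bad bytes out of everything downstream of that point.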


Files

sample.ts (14.1 MB), Dave Pickles, 2018-12-01 10:48
sample2a.ts (93.1 MB), 30 second MUX sample, Dave Pickles, 2018-12-01 18:03