Bug #2644

htsmsg_xml_deserialize fails when <!DOCTYPE> exists in xmltv.xml

Added by dhead 666 about 10 years ago. Updated about 10 years ago.

Status: Fixed
Priority: Normal
Assignee:
Category: EPG - Grabbers
Target version: -
Start date: 2015-01-27
Due date:
% Done: 100%
Estimated time:
Found in version: git-b98e688f5792a9fb3906491cd51e3b5c62294cd1
Affected Versions:

Description

While testing a network tuner (VBox), I found this issue with the auto-generated xmltv.xml, which includes a <!DOCTYPE> declaration. See the attached file.

I'm sending xmltv.xml through xmltv.sock with "cat xmltv.xml | socat - UNIX-CONNECT:/home/hts/.hts/tvheadend/epggrab/xmltv.sock"

Log output

tvheadend[4025]: xmltv: htsmsg_xml_deserialize error Unknown syntatic element: <!DOCTYPE tv
tvheadend[4025]: xmltv: failed to read data


Files

xmltv.xml (567 KB), added by dhead 666, 2015-01-27 02:09

History

#1

Updated by dhead 666 about 10 years ago

I'm not sure whether to call this resolved, but the issue is probably caused by the file starting with the following hidden characters:
M-oM-;M-?

#2

Updated by Jaroslav Kysela about 10 years ago

It looks like UTF-8 BOM: http://unicode.org/faq/utf_bom.html
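
For reference, the UTF-8 BOM is the three-byte sequence EF BB BF at the very start of the stream, which is why the parser never reaches a clean `<!DOCTYPE` or `<?xml` token. A minimal C sketch of detecting it might look like the following; `has_utf8_bom` is a hypothetical helper name used only for illustration and is not part of tvheadend.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a tvheadend function): returns 1 if the first
 * three bytes of buf are the UTF-8 byte order mark EF BB BF, else 0. */
static int has_utf8_bom(const char *buf, size_t len)
{
  static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

  /* memcmp compares bytes as unsigned char, so the result does not
   * depend on whether plain char is signed on this platform. */
  return len >= 3 && memcmp(buf, bom, sizeof(bom)) == 0;
}
```

A caller would pass the raw buffer and its length before handing the data to the XML parser, and skip three bytes when the function returns 1.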

#3

Updated by Jaroslav Kysela about 10 years ago

Does this help?

diff --git a/src/htsmsg_xml.c b/src/htsmsg_xml.c
index d1ba7d5..81ff53c 100644
--- a/src/htsmsg_xml.c
+++ b/src/htsmsg_xml.c
@@ -833,6 +833,10 @@ htsmsg_xml_deserialize(char *src, char *errbuf, size_t errbufsize)
   xp.xp_encoding = XML_ENCODING_UTF8;
   LIST_INIT(&xp.xp_namespaces);

+  /* check for UTF-8 BOM */
+  if(src[0] == 0xef && src[1] == 0xbb && src[2] == 0xbf)
+    memmove(src, src + 3, strlen(src) - 2);
+
   if((src = htsmsg_parse_prolog(&xp, src)) == NULL)
     goto err;

#4

Updated by dhead 666 about 10 years ago

The patch works perfectly :)
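
The BOM handling from the patch above can be sketched as a standalone function. This is only an illustration under assumptions: `strip_utf8_bom` is a hypothetical name, and the cast to `unsigned char` is an addition here so that the comparisons do not depend on whether plain `char` is signed on the build platform.

```c
#include <string.h>

/* Hypothetical helper (not a tvheadend function): removes a leading UTF-8
 * BOM from a NUL-terminated, writable buffer in place and returns src. */
static char *strip_utf8_bom(char *src)
{
  const unsigned char *u = (const unsigned char *)src;

  /* && short-circuits at the NUL terminator, so a string shorter than
   * three bytes is never read past its end. */
  if (u[0] == 0xEF && u[1] == 0xBB && u[2] == 0xBF) {
    /* Move strlen(src) - 3 payload bytes plus the terminating NUL,
     * that is strlen(src) - 2 bytes in total. */
    memmove(src, src + 3, strlen(src) - 2);
  }
  return src;
}
```

memmove is used instead of memcpy because the source and destination regions overlap.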

#5

Updated by Jaroslav Kysela about 10 years ago

  • Status changed from New to Fixed
  • % Done changed from 0 to 100

Applied in changeset commit:tvheadend|3c0a2798251a4c40d1e89b6cf835f465438d4d1d.
