XMLTV, Kazer & French categories
Added by Stephane Chauveau almost 11 years ago
SEE THE REPLY POSTS BELOW FOR AN UPDATED SCRIPT THAT CAN PROCESS ANY INPUT LANGUAGE.
The following information is mostly intended for French users of www.kazer.org, but the scripts below can probably be adapted to other TV services. I am on Ubuntu/Linux using MythTV as frontend.
I assume in the following that the user has a Kazer account and that the tv_grab_fr_kazer command (from package xmltv-utils) is already configured. If so, running the following command should give you a nice XML file.
tv_grab_fr_kazer > tv.xml
Some XBMC themes such as Confluence can colorize TV programs according to their categories, but unfortunately that does not work well with Kazer because the categories are given in French instead of using the names defined in ETSI standard EN 300 468.
Ideally, it should be possible to configure tvheadend to recognize other strings, but this is not yet implemented (see the array _epg_genre_names in epg.c), so I made a quick and dirty Perl script to translate the categories.
The first step is to create an executable script /usr/local/bin/tv_grab_fr_kazer_2 containing:
#!/bin/bash
if [ "$1" == "--description" ] ; then
  echo "France (Kazer2)"
elif [ "$#" == 0 ] ; then
  /usr/bin/tv_grab_fr_kazer | /usr/local/bin/category-filter.pl
else
  /usr/bin/tv_grab_fr_kazer "$@"
fi

The conditions for that script to be recognized as a grabber by xmltv are:
- it must be executable and located in one of the $PATH directories used when running tvheadend
- its name must start with tv_grab_
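The two conditions above can be checked from a shell before restarting tvheadend. The following is only a minimal sketch and not part of xmltv: the helper name check_grabber is hypothetical, and it tests nothing more than the file name, the executable bit and whether the file's directory appears in $PATH. Keep in mind that the $PATH of your interactive shell may differ from the one tvheadend runs with.

```shell
# Hypothetical helper: report whether a file meets the two conditions above.
check_grabber() {
    local file="$1"
    local name dir
    name="$(basename "$file")"
    dir="$(cd "$(dirname "$file")" && pwd)" || return 1

    # The name must start with tv_grab_
    case "$name" in
        tv_grab_*) ;;
        *) echo "name does not start with tv_grab_"; return 1 ;;
    esac

    # The file must be executable
    if [ ! -x "$file" ]; then
        echo "not executable"
        return 1
    fi

    # Its directory must be listed in $PATH
    case ":$PATH:" in
        *":$dir:"*) ;;
        *) echo "directory $dir is not in \$PATH"; return 1 ;;
    esac

    echo "ok"
}
```

For example: check_grabber /usr/local/bin/tv_grab_fr_kazer_2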
XMLTV and Tvheadend should now be aware of a new grabber named "France (Kazer2)", which can be checked from the command line by running the command tv_find_grabbers:
$ tv_find_grabbers
/usr/local/bin/tv_grab_fr_kazer_2|France (Kazer2)
/usr/bin/tv_grab_ch_search|Switzerland (tv.search.ch)
/usr/bin/tv_grab_es_laguiatv|Spain (laguiatv.com)
/usr/bin/tv_grab_huro|Hungary/Romania
...
The file /usr/local/bin/category-filter.pl is given below. It is a perl script that reads an xml file from standard input, translates the categories and emits the result to standard output.
#!/usr/bin/perl -w
#
# The categories recognized by tvheadend (see epg.c)
#
my $MOVIE = "Movie / Drama";
my $THRILLER = "Detective / Thriller";
my $ADVENTURE = "Adventure / Western / War";
my $SF = "Science fiction / Fantasy / Horror";
my $COMEDY = "Comedy";
my $SOAP = "Soap / Melodrama / Folkloric";
my $ROMANCE = "Romance";
my $HISTORICAL = "Serious / Classical / Religious / Historical movie / Drama";
my $XXX = "Adult movie / Drama";

my $NEWS = "News / Current affairs";
my $WEATHER = "News / Weather report";
my $NEWS_MAGAZINE = "News magazine";
my $DOCUMENTARY = "Documentary";
my $DEBATE = "Discussion / Interview / Debate";
my $INTERVIEW = $DEBATE ;

my $SHOW = "Show / Game show";
my $GAME = "Game show / Quiz / Contest";
my $VARIETY = "Variety show";
my $TALKSHOW = "Talk show";

my $SPORT = "Sports";
my $SPORT_SPECIAL = "Special events (Olympic Games; World Cup; etc.)";
my $SPORT_MAGAZINE = "Sports magazines";
my $FOOTBALL = "Football / Soccer";
my $TENNIS = "Tennis / Squash";
my $SPORT_TEAM = "Team sports (excluding football)";
my $ATHLETICS = "Athletics";
my $SPORT_MOTOR = "Motor sport";
my $SPORT_WATER = "Water sport";

my $KIDS = "Children's / Youth programmes";
my $KIDS_0_5 = "Pre-school children's programmes";
my $KIDS_6_14 = "Entertainment programmes for 6 to 14";
my $KIDS_10_16 = "Entertainment programmes for 10 to 16";
my $EDUCATIONAL = "Informational / Educational / School programmes";
my $CARTOON = "Cartoons / Puppets";

my $MUSIC = "Music / Ballet / Dance";
my $ROCK_POP = "Rock / Pop";
my $CLASSICAL = "Serious music / Classical music";
my $FOLK = "Folk / Traditional music";
my $JAZZ = "Jazz";
my $OPERA = "Musical / Opera";

my $CULTURE = "Arts / Culture (without music)";
my $PERFORMING = "Performing arts";
my $FINE_ARTS = "Fine arts";
my $RELIGION = "Religion";
my $POPULAR_ART = "Popular culture / Traditional arts";
my $LITERATURE = "Literature";
my $FILM = "Film / Cinema";
my $EXPERIMENTAL_FILM = "Experimental film / Video";
my $BROADCASTING = "Broadcasting / Press";

my $SOCIAL = "Social / Political issues / Economics";
my $MAGAZINE = "Magazines / Reports / Documentary";
my $ECONOMIC = "Economics / Social advisory";
my $VIP = "Remarkable people";

my $SCIENCE = "Education / Science / Factual topics";
my $NATURE = "Nature / Animals / Environment";
my $TECHNOLOGY = "Technology / Natural sciences";
my $BIOLOGY = $TECHNOLOGY ;
my $MEDECINE = "Medicine / Physiology / Psychology";
my $FOREIGN = "Foreign countries / Expeditions";
my $SPIRITUAL = "Social / Spiritual sciences";
my $FURTHER_EDUCATION = "Further education";
my $LANGUAGES = "Languages";

my $HOBBIES = "Leisure hobbies";
my $TRAVEL = "Tourism / Travel";
my $HANDICRAF = "Handicraft";
my $MOTORING = "Motoring";
my $FITNESS = "Fitness and health";
my $COOKING = "Cooking";
my $SHOPPING = "Advertisement / Shopping";
my $GARDENING = "Gardening";

#
# Translation from the French (Kazer) categories to the categories above.
# Use 0 for categories that you do not care about.
#
my %REPLACE=(
  "Météo" => $WEATHER ,
  "Film" => $MOVIE ,
  "Théâtre" => $PERFORMING,
  "Ballet" => $OPERA ,
  "Clips" => $MUSIC ,
  "Concert" => $MUSIC ,
  "Court métrage" => $EXPERIMENTAL_FILM,
  "Débat" => $SOCIAL ,
  "Dessin animé" => $CARTOON ,
  "Divertissement" => $VARIETY ,
  "Documentaire" => $DOCUMENTARY ,
  "Drame" => $SOAP ,
  "Émission" => 0,
  "Feuilleton" => $SOAP ,
  "Fin" => 0,
  "Fin des programmes" => 0 ,
  "Interview" => $INTERVIEW ,
  "Jeu" => $GAME ,
  "Jeunesse" => $KIDS ,
  "Journal" => $NEWS ,
  "Loterie" => 0 ,
  "Magazine" => $MAGAZINE ,
  "Opéra" => $OPERA ,
  "Série" => $MOVIE ,
  "Spectacle" => $PERFORMING ,
  "Sport" => $SPORT ,
  "Talk show" => $TALKSHOW ,
  # "Téléfilm" => $MOVIE ,
  "Télé-réalité" => $VARIETY ,
  "Téléréalité" => $VARIETY ,
  "Tiercé" => $SPORT ,
  "Variétés" => $VARIETY ,
) ;

# Opening and closing tags of a category element, used in the regular expression below
my $PRE = '<category lang=\"fr\">' ;
my $POST = '</category>' ;

# Return the translation of a category, or the category itself (with a warning) if it is unknown
sub myfilter {
  my ($a) = @_;
  if ( exists $REPLACE{$a} ) {
    return $REPLACE{$a} ;
  } else {
    print STDERR "Warning: Unmanaged category: '$a'\n" ;
    return $a ;
  }
}

# Copy standard input to standard output, translating the categories on the way
while (<>) {
  my $line = $_ ;
  $line =~ s/($PRE)(.*)($POST)/"$1".myfilter("$2")."$3"/ge ;
  print $line;
}
Assuming that you have generated a Kazer XML file as indicated above, you can try the script manually as follows:
/usr/local/bin/category-filter.pl < tv.xml > new.xml
The resulting file new.xml should contain categories following the ETSI standard EN 300 468.
Categories that were not recognized, if any, are printed on standard error.
The variables such as $MOVIE and $THRILLER are the EN 300 468 categories. They should not be modified.
The array %REPLACE can be modified. It provides the translations from the French categories to the EN 300 468 categories. Use 0 for categories that you do not care about. Be aware that tvheadend (or is it XBMC?) does not manage sub-categories well. In practice, that means that all categories from the same group will have the same color in XBMC.
The variables $PRE and $POST specify the regular expression used to perform the replacement. They may have to be modified if you want to adapt the script to another service than Kazer.
For information, the categories in Kazer XML files look like this:
<category lang="fr">Magazine</category>
Using regular expressions to perform the replacements is ugly but simple. In the future, I may write a longer version using a proper XML parser and advanced features such as selecting the category according to multiple criteria (title, duration, channel, ...).
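Before editing %REPLACE, or $PRE for another service, it helps to see which category strings and lang attributes a file actually contains. The following is a minimal sketch using standard tools; it assumes, as the Perl script does, that each category element sits on a single line, and the function name list_categories is hypothetical.

```shell
# Hypothetical helper: print each distinct <category> element found on
# standard input, preceded by the number of times it occurs (most frequent first).
list_categories() {
    grep -o '<category[^>]*>[^<]*</category>' | sort | uniq -c | sort -rn
}
```

For example, list_categories < tv.xml prints one line per distinct category, which can then be compared with the keys of %REPLACE.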
Replies (85)
RE: XMLTV, Kazer & French categories - Added by thierry castelot about 8 years ago
Hi Renato,
I think it does. I will try tonight after work and let you know.
RE: XMLTV, Kazer & French categories - Added by Nicolas Rioja about 8 years ago
Stephane Chauveau wrote:
For Nicolas,
The easiest way to log the errors is to redirect the error output stream (number 2) to a file.
For example, you can run the script as follows:
/usr/local/bin/category-filter.pl 2> /tmp/category-filter.log
or, if you want to APPEND to the log file, use a double >> instead:
/usr/local/bin/category-filter.pl 2>> /tmp/category-filter.log
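In the grabber pipeline, the same redirection keeps the warnings out of the XML written to standard output. The following small sketch shows the difference between 2> and 2>> with a stand-in function; fake_filter is hypothetical and only imitates the two output streams of the script.

```shell
# Stand-in for category-filter.pl: copies stdin to stdout and writes one
# warning per input line to stderr.
fake_filter() {
    while IFS= read -r line; do
        printf '%s\n' "$line"
        printf "Warning: Unmanaged category: '%s'\n" "$line" >&2
    done
}

log="$(mktemp)"    # temporary log file used for this demonstration only

printf 'Foo\n' | fake_filter 2>  "$log" > /dev/null   # 2>  overwrites the log
printf 'Bar\n' | fake_filter 2>> "$log" > /dev/null   # 2>> appends to the log

cat "$log"         # the log now holds both warnings
```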
The alternative is to modify the perl script itself.
Add the following line at the beginning of the script to open the log file in append mode:
open(LOG, ">>", "/tmp/category-filter.log") or die "Can't open LOG file: $!";
Then clone the 'print STDERR' line and replace 'STDERR' by 'LOG':
print STDERR "Warning: Unmanaged category: '$a'\n" ;
print LOG "Warning: Unmanaged category: '$a'\n" ;
If you do not want to repeat the same error hundreds or thousands of times, you can remember the unknown categories as follows, so that a single error is emitted for each.
At the beginning of the script, create an empty map:
my %BAD ;
Then modify the prints to STDERR and LOG as follows:
if ( ! exists $BAD{$a} ) {
  print STDERR "Warning: Unmanaged category: '$a'\n" ;
  print LOG "Warning: Unmanaged category: '$a'\n" ;
  # Record in BAD map so next error won't produce a message
  $BAD{$a} = 1 ;
}
Hi Stephane,
I'm trying to implement your script with the BAD map now, because for some time I have been getting more unmanaged categories since I added more grabbers to fill my .xml file.
The problem is that I'm doing something wrong, since I'm receiving this output:
Missing right curly or square bracket at /share/CACHEDEV1_DATA/homes/nico/wg++/categorias/cambia_categorias.pl line 330, at end of line
syntax error at /share/CACHEDEV1_DATA/homes/nico/wg++/categorias/cambia_categorias.pl line 330, at EOF
Execution of /share/CACHEDEV1_DATA/homes/nico/wg++/categorias/cambia_categorias.pl aborted due to compilation errors.
I've tried some things but without success. Please review the script attached to this post and tell me what is wrong.
Thank you very much
cambia_categorias.pl (12.8 KB)
RE: XMLTV, Kazer & French categories - Added by james Bond about 8 years ago
thierry castelot wrote:
I assume that Perl is already installed on your Pi 3 and has the same path as on Ubuntu.
Move the two tv_grab scripts into /usr/bin and category into /usr/local/bin, make them executable and restart tvh; you should then be able to pick tv_grab_fr_alacarte_2.
Many thanks: it works!
At least partially, since not all categories are colored inside Kodi.
But at least the grabber is displayed in the Tvheadend interface.
EDIT: IT IS 100% WORKING! And it is maybe 20 times faster than the original script.
That was my fault...
First, I was using your zguidetv account :D
Second, I had trash EPG data preventing the correct mapping of new EPG data.
RE: XMLTV, Kazer & French categories - Added by thierry castelot about 8 years ago
@ james Bond
you're welcome
@ Renato
It works perfectly on a Synology DS112j and it's much, much faster than my previous script.
You just need to update categories.pl to minimize the number of unmanaged categories.
RE: XMLTV, Kazer & French categories - Added by james Bond about 8 years ago
thierry castelot wrote:
@ james Bond
you're welcome
@ Renato
It works perfectly on a Synology DS112j and it's much, much faster than my previous script.
You just need to update categories.pl to minimize the number of unmanaged categories.
I think your modded script should be posted in its own thread, since it is working, unlike 98% of the code posted on this thread.
RE: XMLTV, Kazer & French categories - Added by Renato Moscardini about 8 years ago
@Thierry,
Many thanks, I will try it
RE: XMLTV, Kazer & French categories - Added by thierry castelot about 8 years ago
I made some corrections on this one.
category-filter.pl (10.4 KB)
RE: XMLTV, Kazer & French categories - Added by John Mcenroy almost 8 years ago
Thank you very much for the script. But how can I make it case insensitive,
so that "Fin" = "fin"?
I have found the function ucfirst() to make the first letter upper case; maybe
this can help here, but I can't understand how to implement it.
Thanks
RE: XMLTV, Kazer & French categories - Added by Stephane Chauveau almost 8 years ago
The lc() function converts a string to lowercase, so replace the while loop at the end of the script with:
foreach my $key (keys %REPLACE) {
  $REPLACE{lc($key)} = $REPLACE{$key} ;
}
while (<>) {
  my $line = $_ ;
  $line =~ s/($PRE)(.*)($POST)/"$1".myfilter(lc("$2"))."$3"/ge ;
  print $line;
}
The purpose of the foreach loop is to clone each entry in %REPLACE with a lowercase key.
In the while loop, lc() is applied to the argument passed to the myfilter function.
RE: XMLTV, Kazer & French categories - Added by John Mcenroy almost 8 years ago
Thanks Stephane, but the problem is that all the entries I have in the script start with an upper case letter,
for example "Fin" => 0 , and my provider's xmltv has "Fin" and also "fin".
So, as I understand it, I need to upper case the first letter of the category entries in the xmltv.
RE: XMLTV, Kazer & French categories - Added by Stephane Chauveau almost 8 years ago
My previous post using lc is not tested.
Be aware that if you already have lowercase entries in %REPLACE, then they may be replaced by the value of a corresponding non-lowercase entry.
If this is a problem, then you may want to use something like this to prevent existing entries from being overwritten:
foreach my $key (keys %REPLACE) {
  if (!exists $REPLACE{lc($key)} ) {
    $REPLACE{lc($key)} = $REPLACE{$key} ;
  }
}
RE: XMLTV, Kazer & French categories - Added by Stephane Chauveau almost 8 years ago
Ho! Your keyboard is bro
RE: XMLTV, Kazer & French categories - Added by Stephane Chauveau almost 8 years ago
The while loop is calling myfilter(lc("$2")). For example, if your provider uses "fin", "Fin" or "FIN" then myfilter will always be called with "fin".
So that means that %REPLACE only needs to contain the lowercase key.
The foreach loop takes care of that.
The original non-lowercase values such as "Fin" are still present in %REPLACE but they will never be used.
RE: XMLTV, Kazer & French categories - Added by John Mcenroy almost 8 years ago
I can't make it work. My provider uses both forms in the same xmltv, "Fin" and "fin" for example. So there is a category "Movie" and a category "movie" in the same xmltv file. "Movie" works and "movie" does not. I am a complete beginner with Perl.
RE: XMLTV, Kazer & French categories - Added by Stephane Chauveau almost 8 years ago
Strange. That should work!
Can you send me the script you are currently using and a sample XML input file?
RE: XMLTV, Kazer & French categories - Added by John Mcenroy almost 8 years ago
Strange. I have just tested the original script with samples. All works. I understand nothing )
P.S. It seems that I am very tired, because I have put both Movie and movie in the REPLACE table of the pl script.
So it must be
my %REPLACE=( "Movie" => $MOVIE , ) ;
And in this case there will be one missing category: movie.
This sample works with your script. Thanks. It seems the error is somewhere on my side.
Update:
Your script works well. Thanks again. But I have found a problem:
it works only for English letters. If there are Chinese, Russian, Greek or similar letters,
it doesn't work and must be modified as follows:
use utf8 ;
use Encode ;
foreach my $key (keys %REPLACE) {
  $REPLACE{lc($key)} = $REPLACE{$key} ;
}
while (<>) {
  my $line = $_ ;
  $line = decode_utf8($line);
  $line =~ s/($PRE)(.*)($POST)/"$1".myfilter(lc("$2"))."$3"/ge ;
  $line = encode_utf8($line);
  print $line;
}
test.xml (74 Bytes)
xmltv_convert_categories.pl (543 Bytes)
RE: XMLTV, Kazer & French categories - Added by Meindert Oldenburger over 7 years ago
I like to have each unmanaged category reported only once, together with how often it appears, sorted by that count.
Add/replace the following code:
my %Categories;
my $PRE = '<category lang=\"nl\">';
my $POST = '</category>';
sub myfilter {
  my ($a) = @_;
  if (exists $REPLACE{$a}) {
    return $REPLACE{$a} ;
  } else {
    # Count the occurrences instead of printing a warning each time
    if (exists $Categories{$a}) {
      $Categories{$a} = $Categories{$a} + 1;
    } else {
      $Categories{$a} = 1;
    }
    return $a ;
  }
}
while (<>) {
  my $line = $_ ;
  $line =~ s/($PRE)(.*)($POST)/"$1".myfilter("$2")."$3"/ge ;
  print $line;
}
# Print the unmanaged categories once each, least frequent first
foreach my $category (sort { $Categories{$a} <=> $Categories{$b} } keys %Categories) {
  print STDERR "WARNING: Unmanaged category: '$category' is $Categories{$category}\n";
}
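If you would rather leave the Perl script untouched, a similar summary can be produced afterwards from the log written by the original script. This is only a sketch: it assumes the log contains the original "Warning: Unmanaged category: ..." lines, and the function name summarize_log is hypothetical.

```shell
# Hypothetical helper: count how often each unmanaged category appears in a
# log read from standard input, least frequent first.
summarize_log() {
    grep "^Warning: Unmanaged category: " | sort | uniq -c | sort -n
}
```

For example: summarize_log < /tmp/category-filter.log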
RE: XMLTV, Kazer & French categories - Added by Alexandre E about 7 years ago
The last source of XMLTV is not available anymore:
http://xmltv.dtdns.net/alacarte/
Has anyone found a different source?
In the end, has anyone been able to use KAZER as a source for Tvheadend on Synology DSM6+?
Right now, I am stuck with no program input, which is a shame as all the rest has proved to work perfectly!
Thanks for sharing
RE: XMLTV, Kazer & French categories - Added by thierry castelot about 7 years ago
Hello Alexandre,
I will take a look this weekend. I've found some new XML sources, but I need to rewrite category.pl to match them.
RE: XMLTV, Kazer & French categories - Added by Alexandre E about 7 years ago
Thierry,
Thanks for your answer !
Actually, I am a bit disappointed about the loss of the programs.
Even a plain XML input (without categories) would be enough for me :-)
It would be nice if we could get in touch.
Is my email available in my profile?
Thank you very much for your contribution.
Regards
RE: XMLTV, Kazer & French categories - Added by thierry castelot about 7 years ago
Alexandre,
Just replace your tv_grab_fr_alacarte with this one; it should work. I will update category.pl later.
tv_grab_fr_alacarte (340 Bytes)
RE: XMLTV, Kazer & French categories - Added by Renato Moscardini about 7 years ago
Hello Thierry,
Many thanks for your solution.
Is there a way to have the episode number in the episode column of tvheadend, and not in the title?
That way I could name the files as I prefer.
Anyway, in the meantime, it is a nice solution.
Kind regards
RE: XMLTV, Kazer & French categories - Added by Alexandre E about 7 years ago
Many thanks Thierry
I am off for a few days, and will try and report as soon as I am back
Once again thanks for your help !
RE: XMLTV, Kazer & French categories - Added by Alexandre E about 7 years ago
It works again!
But, for some reason, the categories don't work (they never have).
In the log, I can see many, many lines with "UNMANAGED CATEGORY", as if it knows none of them.
Is there something I have missed?
RE: XMLTV, Kazer & French categories - Added by thierry castelot about 7 years ago
Hello Renato,
According to racacax (author of this xmltv) it should be fixed soon:
"Changed how series episodes are displayed. The changes will be applied progressively and should be fully functional by next week. This concerns the majority of channels."
<programme start="20170909190000 +0200" stop="20170909194500 +0200" channel="6ter">
<title>Rénovation impossible</title>
<sub-title lang="fr">Ambiance country</sub-title>
<episode-num system="onscreen">S05E09</episode-num>
Alex, sadly that is normal. I'm waiting for a new update from racacax to clean up the xmltv (there are too many categories because he puts the length into the categories).