
Allocine.fr Scraper: Bug In Description/plot Scraping


2 replies to this topic

#1 FrankieGTH

FrankieGTH

    X-S Enthusiast

  • Members
  • 3 posts

Posted 02 December 2007 - 05:09 PM

Hi there,
First off, I'm a new user of an xbox I got from my niece, and I'm truly impressed by the work and quality of xbmc+mc360. Well done and thank you for all this, it's a gem.

Now here is the bug:
I have just installed the latest version of xbmc+mc360 (25Nov build + 29Nov build).
I have been adding videos and scraping info from both imdb and allocine.fr.
I can reproduce the bug in Library Mode, on most films whose info has been scraped from allocine.fr.
When displaying the movie info (via the context menu) in the MC360 skin, the xbox freezes on an empty info screen and a reboot is required. This happens only on the Movie Info page (via the context menu), not in the View: Movie Info list, where summary info and plot are displayed correctly. It is also specific to the MC360 skin and cannot be reproduced in the mayhem II skin.

After some investigation I pinned the bug down to this condition:
On some films whose info has been scraped from allocine.fr, the description and plot fields contain extra text that looks like HTML scripting. The bug happens on films that contain such additional text (e.g. La Belle Verte) but not on films that have proper plot and description text (e.g. Mensonges et trahisons et plus si affinités...).
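
One rough way to spot affected entries is to scan each stored plot for JavaScript-looking text. The sketch below is only an illustration in Python, not XBMC or MC360 code. The marker patterns and the function names `has_script_residue` and `strip_script_residue` are assumptions, chosen from the kind of text seen in the La Belle Verte entry.

```python
import re

# Hypothetical markers: picked by eye from the stray text in the DB entry,
# not taken from any XBMC or scraper source.
SCRIPT_MARKERS = re.compile(
    r"\bvar\s+\w+\s*=|\bfunction\s+\w+\s*\(|document\.write\s*\(|getElementById\s*\("
)


def has_script_residue(plot: str) -> bool:
    """Return True if the plot text contains something that looks like JavaScript."""
    return SCRIPT_MARKERS.search(plot) is not None


def strip_script_residue(plot: str) -> str:
    """Cut the plot at the first script-looking marker and trim trailing whitespace."""
    match = SCRIPT_MARKERS.search(plot)
    if match is None:
        return plot
    return plot[: match.start()].rstrip()
```

This is a heuristic only: it assumes the real plot always comes first and the stray script follows it, which is what the affected entries look like here.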

Here is what the DB contains for the movie La Belle Verte (great movie btw):

CODE
Quelque part dans l'univers existe une planete dont les habitants evolues et heureux vivent en parfaite harmonie. De temps en temps quelques-uns d'entre eux partent en excursion sur d'autres planetes. Curieusement depuis deux cents ans plus personne ne veut aller sur la planete Terre. Or un jour, pour des raisons personnelles, une jeune femme decide de se porter volontaire. Et c'est ainsi que les Terriens la voient atterrir en plein Paname.





var IMAGESERVER = 'http://a69.g.akamai.net/n/69/10688/v1/img5.allocine.fr/acmedia';
function save_alerte() {
    document.getElementById('div_add_alerte').innerHTML = 'Veuillez patienter...';
new ACAjax.Request('/monallocine/ajax/fiches.alerte.save.asp', {onSuccess:function(){getElementDiv('div_add_alerte').innerHTML='Le film "La Belle verte" a été ajouté à vos alertes.Gérez vos alertes';}, method:'post', parameters:'ref=15287typeref=filmalerte=1'});
return false;
}
if (GetCookie ('ACplugged') == '1') {
document.write('');
new ACAjax.Request(    'http://www.allocine.fr/js/alerte.html',    {        method:'post',        onSuccess:function(xhr){getElementDiv('span_add_alerte').innerHTML = xhr.responseText;},        onFailure:function(){DivOff('span_add_alerte');},        parameters:'cfilm=15287typeref=filmredir=http://www.allocine.fr/monallocine/mesalertes/default.htmlargs=cfilm%3D15287%26action%3D1%26url%3D%2Ffilm%2Ffichefilm%5Fgen%5Fcfilm%3D15287%2Ehtml'    })
} else {
document.write('');
document.write('');
document.write('');
document.write('');
document.write('');
document.write('');
document.write('');
document.write('');
document.write('');
document.write('Obtenez par e-mail les dernières nouvelles (news, photos, bande-annonce, avant-première) sur "La Belle verte"');
document.write('Identifiez-vous si vous êtes membre AlloCiné,');
document.write('ou saisissez votre mail ici :


Not sure how to investigate further but I'd appreciate any help in fixing this.
Many thanks!

#2 FrankieGTH


Posted 02 December 2007 - 05:34 PM

BTW: I know this is also a bug in the xbmc scraper, so I have reported it to the xbmc team as well (https://sourceforge....amp;atid=581838).
However, the crash only happens with the MC360 skin, not with Project Mayhem III, so fixing it on your end should make your skin more robust.
Cheers!

Edited by FrankieGTH, 02 December 2007 - 05:36 PM.


#3 FrankieGTH


Posted 02 December 2007 - 06:26 PM

Actually, I've just fixed xbmc's scraper code (see the bug reported on sourceforge for more info) and my MC360 is no longer crashing, but I still suggest you harden your code to avoid the crash on your side.
Cheers!
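
For anyone hitting the same issue, here is a minimal sketch of the general idea of a scraper-side cleanup: drop script blocks from the HTML fragment before turning it into plot text. It is written in Python for illustration and is not the actual change made to the xbmc scraper. It assumes the stray text came from inline `<script>` elements on the page, and `html_to_plot_text` is a hypothetical name.

```python
import re

# Matches a whole <script ...>...</script> element, across lines, any letter case.
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)
# Matches any remaining HTML tag.
TAG = re.compile(r"<[^>]+>")


def html_to_plot_text(fragment: str) -> str:
    """Remove script elements first, then the other tags, then collapse whitespace."""
    without_scripts = SCRIPT_BLOCK.sub("", fragment)
    text = TAG.sub("", without_scripts)
    return re.sub(r"\s+", " ", text).strip()
```

Removing the script elements before stripping tags matters: if only the tags are stripped, the JavaScript body is left behind as plain text, which matches what ended up in the DB.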

