Simple Site Scraping


1 reply to this topic

#1 Stefan Leroux

Stefan Leroux

    X-S X-perience

  • Members
  • 307 posts
  • Location:Scotland, UK
  • Xbox Version:v1.3
  • 360 version:v1 (xenon)

Posted 18 March 2008 - 07:04 PM

Hey all,

I'm looking to make a small script to run on my Xbox for personal use to retrieve minor bits of data from a website. At the moment my intention is to retrieve my Halo 3 information to display on my home screen via a skin edit. I've done this part before, and I just call the images/text from a setting, which I hope to set with the skin.setstring function.

However, I really don't have a clue how to scrape at all, and was hoping someone could give me some pointers.

The page I want to scrape is "http://www.bungie.ne...=Stefan Leroux". I was intending to use a prompt so I could check for other people, but first I just want a static one. I think editing it could definitely help me learn.

The information I want to scrape is:
CODE
<div id="ctl00_mainContent_identityStrip_divHeader" class="header_stats" style="border-bottom:solid 1px #626262;">
            <div id="ctl00_mainContent_identityStrip_divEmblem" class="profile_picA" style="background:#626262;"><a id="ctl00_mainContent_identityStrip_hypGamerTag" href="/Stats/Halo3/Default.aspx?player=$Stefan Leroux$  "><img id="ctl00_mainContent_identityStrip_EmblemCtrl_imgEmblem" src="$/Stats/halo2emblem.ashx?s=70&amp;0=0&amp;1=6&amp;2=2&amp;3=0&amp;fi=4&amp;bi=43&amp;fl=1&amp;m=1$" style="height:70px;width:70px;border-width:0px;" />
</a></div>
            <ul>
                <li><h3>$$Stefan Leroux$$   - <span id="ctl00_mainContent_identityStrip_lblServiceTag">$$O01$$</span></h3></li>
                <li>&nbsp;</li>
                <li><span id="ctl00_mainContent_identityStrip_lblRank">$$Corporal, Grade 2$$</span> &nbsp;</li>
                <li>Highest Skill: <span id="ctl00_mainContent_identityStrip_lblSkill">$5$</span>&nbsp;&nbsp;|&nbsp;&nbsp;Total EXP: <span id="ctl00_mainContent_identityStrip_lblTotalRP">$$19$$</span>&nbsp;&nbsp;|&nbsp;&nbsp; Next Rating: <a id="ctl00_mainContent_identityStrip_hypNextRank" href="$/Stats/Halo3/RankHistory.aspx?player=Stefan Leroux $ ">$$20$$ EXP</a></li>


In this I have enclosed the variables I want to retrieve and store in "$$", and enclosed the other variables that don't really matter in "$". I've made a similar script before in JScript, but as said I've never done any Python before, and I hope this will teach me.

Can anyone help show me how to scrape this info in a way that will hopefully help me and maybe others to learn? If I can figure out how to do this, I intend to expand it to pull stats from more sites, but really this is the basic thing I need.

Thanks
Leroux

Edited by Stefan Leroux, 18 March 2008 - 07:09 PM.


#2 enixidfrag

enixidfrag

    X-S Young Member

  • Members
  • 36 posts
  • Location:Worcester
  • Xbox Version:v1.1
  • 360 version:v1 (xenon)

Posted 21 March 2008 - 05:00 AM

Gamertag-related retrieval is typically extremely easy, believe it or not.
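
As a rough sketch of what that could look like in Python: the snippet below pulls the values out of the spans quoted in the first post by matching on their id attributes with a regular expression. The sample markup is copied from that post with the $ markers removed; the helper names (get_span, scrape_stats) and the dictionary keys are made up for illustration, and fetching the live page is left out, so treat this as a starting point rather than a finished script. It sticks to syntax that should run on both old and new Python versions.

```python
import re

# Sample markup copied from the first post, with the $ markers removed.
# A real script would download the page first; that step is omitted here.
SAMPLE_HTML = '''
<li><h3>Stefan Leroux   - <span id="ctl00_mainContent_identityStrip_lblServiceTag">O01</span></h3></li>
<li><span id="ctl00_mainContent_identityStrip_lblRank">Corporal, Grade 2</span> &nbsp;</li>
<li>Highest Skill: <span id="ctl00_mainContent_identityStrip_lblSkill">5</span>&nbsp;&nbsp;|&nbsp;&nbsp;Total EXP: <span id="ctl00_mainContent_identityStrip_lblTotalRP">19</span></li>
'''

# Common prefix of the span ids seen in the quoted markup.
ID_PREFIX = "ctl00_mainContent_identityStrip_"


def get_span(html, short_id):
    """Return the text inside <span id="PREFIX + short_id">...</span>, or None."""
    pattern = r'<span id="' + re.escape(ID_PREFIX + short_id) + r'">(.*?)</span>'
    match = re.search(pattern, html, re.DOTALL)
    if match:
        return match.group(1).strip()
    return None


def scrape_stats(html):
    """Collect the wanted fields into a dictionary (key names are hypothetical)."""
    fields = {
        "service_tag": "lblServiceTag",
        "rank": "lblRank",
        "skill": "lblSkill",
        "total_exp": "lblTotalRP",
    }
    return dict((name, get_span(html, short_id)) for name, short_id in fields.items())


stats = scrape_stats(SAMPLE_HTML)
for name in sorted(stats):
    print("%s: %s" % (name, stats[name]))
```

Matching on the id is fragile if the site changes its markup, but for a handful of fixed fields it keeps the script short. Each value in stats could then be passed on to the skin setting mentioned in the first post.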


I'm actually still waiting for someone to send me the old gamertag file so I can modify it to work again. This is pretty much the same principle. However, this would pretty much just be a blank page with your stats. Basically, you can't do it like a gamercard unless you also set up a site with a script that shows your stats in image form, like my gamercard does.




