Saturday, August 11, 2012

Data mining (updated)

Ever wished you could get data from a web page without reading the whole page, or just pull data straight from a server? One thing I like to do is get the football scores. I find that SI.com and some other sites have so much going on that it takes forever to navigate the pages. Not only that, you are subjected to all the ads. Just give me the scores so I can move on. Data mining allows me to do that. In other words, the computer can be your personal secretary and gather all the data you need for your special reports, without you having to do all that hard work or spend the extra time.




I have written a series of beginner guides for data mining. You can find them at:
http://www.instructables.com/id/Data-mining/

Note:  The football score capturing script works best for the preceding week or earlier in the same season.

Update: Let's take what we have learned already and apply it. I showed you how to extract data to make your own web page, and in another article I showed you how to cut and paste data. If you thought the last web page we created was cumbersome to look at, this time we will strip out everything but the teams and scores.


This time we are using the scores from week 3 of the preseason. It is so nice to be able to use the same code over and over. Anyway, we just want the teams and their scores. Extracting the data is simple: we cut out the fields we want, paste everything together, and it might look like this:


All it took was a short bit of code.

getscores.sh
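[code]
#!/bin/sh
#===================================
# Get scores
#
# Only lines of the page dump that contain this text are kept
team="awayteam"

# Dump the page as plain text; the large width keeps lynx from wrapping lines
lynx -width 1000 -dump "http://oesrvr1/testcode/getscores1.php" | grep "$team" > scorefile

# Cut each field out by character position:
# first team, its score, second team, its score
cut -c 12-25 scorefile > f1
cut -c 37-39 scorefile > f2
cut -c 49-60 scorefile > f3
cut -c 73-75 scorefile > f4

# Join the four columns side by side, separated by tabs
paste f1 f2 f3 f4 > allscoresfile.txt
#===================================
[/code]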

Make the script executable before running it:

$ chmod +x getscores.sh
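
If you want to try the cut and paste steps from getscores.sh without access to the server, here is a minimal sketch that runs the same idea on made-up data. The team names, scores, column positions, and the demo_scores function name are all assumptions for illustration; they are not the real getscores1.php output.

```shell
#!/bin/sh
# Demo of the cut/paste technique on made-up, fixed-width sample data.
# Names, scores, and column positions are illustrative only.

demo_scores() {
    # Work in a throwaway directory so no stray files are left behind
    tmpdir=$(mktemp -d) || return 1

    # Two fake score lines: first name in columns 1-8, its score in 10-11,
    # second name in columns 13-20, its score in 22-23
    printf '%-8s %2s %-8s %2s\n' Lions 21 Bears 17 >  "$tmpdir/scorefile"
    printf '%-8s %2s %-8s %2s\n' Hawks  7 Rams  10 >> "$tmpdir/scorefile"

    # Slice each field out by character position
    cut -c 1-8   "$tmpdir/scorefile" > "$tmpdir/f1"
    cut -c 10-11 "$tmpdir/scorefile" > "$tmpdir/f2"
    cut -c 13-20 "$tmpdir/scorefile" > "$tmpdir/f3"
    cut -c 22-23 "$tmpdir/scorefile" > "$tmpdir/f4"

    # Glue the slices back together, one tab between columns
    paste "$tmpdir/f1" "$tmpdir/f2" "$tmpdir/f3" "$tmpdir/f4"

    rm -rf "$tmpdir"
}

demo_scores
```

Running it prints one tab-separated line per game. Because cut works on fixed character positions, the column numbers have to be adjusted by hand if the layout of the page dump ever shifts.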
All for now.

Note: the scripts should work fine on *nix-based machines. For MS Windows, you will want to consider installing Cygwin.
