My project for next week is to write a small Perl script that queries the Shanghai stock/bond exchange website to download historical data on stock and bond prices. I was curious whether anyone on the list has already done something like this, or would find it useful.
Joseph,
I did something like this last year. The script below reads tickers from a text file, queries Bloomberg's website for details on each one, and writes the results to a spreadsheet. Pretty straightforward - hope it helps.

Regards,
Matthew

#!/usr/bin/perl
use LWP::Simple;
use HTML::TokeParser;
use Spreadsheet::WriteExcel;

my $row = my $col = 0;
my $workbook  = Spreadsheet::WriteExcel->new('c:\temp\stocks.xls');
my $worksheet = $workbook->add_worksheet();

# The input file holds one ticker per line.
open (TXT, 'c:\temp\stocks.txt') || die $!;
while (<TXT>) {
    chomp;                          # strip the trailing newline
    my $trimmedStock = $_;
    next unless length $trimmedStock;

    my $page = 'http://quote.bloomberg.com/apps/quote?ticker=' . $trimmedStock;
    my $html = get($page);

    # Each ticker gets its own column, with the ticker in the first row.
    $worksheet->write($row, $col, $trimmedStock);
    $row++;

    if (defined $html) {
        my $stream = HTML::TokeParser->new(\$html);
        while (my $token = $stream->get_token()) {
            # 'T' tokens are plain text; keep only the ones that look like quote data.
            if ($token->[0] eq 'T') {
                if ($token->[1] =~ /^Price|Open|Volume|High|Low|^Earnings|Bid|Ask|Beta|P\/E|Shares|Cap|Dividend|Change|^Return|N\.A|^[-\d*]/) {
                    $worksheet->write($row, $col, $token->[1]);
                    $row++;
                }
            }
        }
    }
    $col++;
    $row = 0;
}
close(TXT);

>From: Joseph Wang <[hidden email]>
>To: [hidden email]
>Subject: [Quantlib-users] Shanghai Stock/Bond prices
>Date: Fri, 04 Mar 2005 01:04:44 -0600
>
>My project for next week is going to be to write a small perl script that
>queries the Shanghai stock/bond exchange
>website to download historical data on stock and bond prices. I was
>curious if anyone on the list either has already done something like this
>or else would find this useful.
>
>_______________________________________________
>Quantlib-users mailing list
>[hidden email]
>https://lists.sourceforge.net/lists/listinfo/quantlib-users
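A note on the text filter in Matthew's script: in a Perl alternation, ^ applies only to the alternative it prefixes, so ^Price must match at the start of the text while Open, High, Cap and the other unanchored words match anywhere in it. The sketch below applies the same pattern to a few sample strings to show which ones would be written to the spreadsheet. The helper name keeps_token and the sample strings are made up for illustration and are not taken from any real page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as the filter in the script above.
my $filter = qr/^Price|Open|Volume|High|Low|^Earnings|Bid|Ask|Beta|P\/E|Shares|Cap|Dividend|Change|^Return|N\.A|^[-\d*]/;

# Hypothetical helper: returns 1 if a text token would be kept, else 0.
sub keeps_token {
    my ($text) = @_;
    return $text =~ $filter ? 1 : 0;
}

# Made-up sample tokens, for illustration only.
my @samples = ('Price', 'Share Price', 'Day High', 'Highlights',
               '12.50', '-0.3', 'Market Cap', 'Headline');

for my $text (@samples) {
    printf "%-12s %s\n", $text, keeps_token($text) ? 'kept' : 'skipped';
}
```

Unanchored alternatives also keep unrelated text that merely contains one of the words, so anchoring more of them is one way to tighten the filter if stray cells show up in the spreadsheet.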
Joseph:
Here's another Perl script, which queries chinabond.com.cn for bond spot and repo prices and dumps them to ASCII files. The first part worked (six months ago, at least) and gets the basic data. The issue is that the chinabond page is updated at random times with batch data. Since you only want data when it is new, the rest of the code is an attempt to diff the old and new pages (with some dirty parsing along the way). That part wasn't working 100%, so you'll have to debug it.
If you have access to a server where we can run a cron job and then put the data up on the web as a "public service" for researchers, let me know and I'll debug it. I could use the data myself.
Best,
Allen
#!/usr/bin/perl
use LWP::Simple;    # provides head() and get()

$spoturl = "http://www.chinabond.com.cn/chinabond/calculatecurveuser/user/whole_index/RealtimeSpotsFullPrice.jsp?Type=2";
$repourl = "http://www.chinabond.com.cn/chinabond/calculatecurveuser/user/whole_index/RealtimeReposData.jsp?Type=1";

if (!head($spoturl)) {
    print "webpage is down!\n";
}

# Start from empty "old" files so that the first fetch counts as new data.
$oldcount1 = 0;
$oldcount2 = 0;
$oldfilename1 = $oldcount1."spot";
$oldfilename2 = $oldcount2."repo";
print "printing to $oldfilename1 $oldfilename2\n";
`touch $oldfilename1`;
`touch $oldfilename2`;
$count1 = 1;
$count2 = 1;

do {
    # Fetch both pages, retrying until each one comes back non-empty.
    print "trying to get spot content1.....\n";
    do {
        $content1 = get($spoturl);
        sleep(60);
    } while (!$content1);
    print "got spot prices\n";

    print "trying to get repo content2.....\n";
    do {
        $content2 = get($repourl);
        sleep(60);
    } while (!$content2);
    print "got repo data\n";

    # Save the fresh pages.
    $filename1 = $count1."spot";
    $filename2 = $count2."repo";
    open(FILE1, ">$filename1");
    open(FILE2, ">$filename2");
    print "printing to $filename1 $filename2\n";
    print FILE1 $content1;
    print FILE2 $content2;
    # Close the files so the data is flushed to disk before grep and diff read them.
    close(FILE1);
    close(FILE2);

    # Compare old and new pages, ignoring lines that contain "div align".
    $filename1test = $filename1.".test";
    $filename2test = $filename2.".test";
    $oldfilename1test = $oldfilename1.".test";
    $oldfilename2test = $oldfilename2.".test";
    `grep -v "div align" $filename1 > $filename1test`;
    `grep -v "div align" $filename2 > $filename2test`;
    `grep -v "div align" $oldfilename1 > $oldfilename1test`;
    `grep -v "div align" $oldfilename2 > $oldfilename2test`;
    $result1 = `diff -q $oldfilename1test $filename1test`;
    $result2 = `diff -q $oldfilename2test $filename2test`;
    #print "diff results between $oldfilename1test $filename1test is: $result1 \n";
    #print "diff results between $oldfilename2test $filename2test is: $result2 \n";
    `rm $filename1test $filename2test $oldfilename1test $oldfilename2test`;

    # diff -q prints "Files ... differ" only when the files are different.
    if ($result1 =~ /diff/) {
        $result1 = 1;
        print "real diff in spot files: $oldfilename1test $filename1test\n";
    } else {
        $result1 = 0;
        print "no real diff in spot files: $oldfilename1test $filename1test\n";
    }
    if ($result2 =~ /diff/) {
        $result2 = 1;
        print "real diff in repo files: $oldfilename2test $filename2test\n";
    } else {
        $result2 = 0;
        print "no real diff in repo files: $oldfilename2test $filename2test\n";
    }

    # Build a timestamp suffix such as 03-04-05.14h30 for the archived copies.
    $datestring = `date "+%D.%R"`;
    chomp($datestring);    # drop the newline that date appends
    $datestring =~ s/:/h/g;
    $datestring =~ s/\//-/g;
    print "$datestring\n";

    # Keep a timestamped copy of each page that changed, and advance the counters.
    if ($result1) {
        $count1++;
        $oldcount1++;
        $finalfilename1 = $filename1.$datestring;
        `cp $filename1 $finalfilename1`;
        print "found new spot rate file in $filename1 or $finalfilename1\n";
    }
    if ($result2) {
        $count2++;
        $oldcount2++;
        $finalfilename2 = $filename2.$datestring;
        `cp $filename2 $finalfilename2`;
        print "found new repo rate file in $filename2 or $finalfilename2\n";
    }
    $oldfilename1 = $oldcount1."spot";
    $oldfilename2 = $oldcount2."repo";

    print "sleeping for X minutes.....\n\n\n\n";
    sleep(60);
} while (1);
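The "is this page really new?" check that the script does with grep, diff and temporary files could also be done inside Perl. What follows is only a sketch of that idea, not a drop-in fix for the script: it fingerprints each page with Digest::MD5 after dropping the lines that contain "div align", on the assumption that those lines are filtered out because they change between fetches even when the data does not. The function names page_fingerprint and page_changed and the sample page snippets are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Fingerprint a page while ignoring lines that contain "div align",
# the same lines the script above removes with grep -v.
sub page_fingerprint {
    my ($content) = @_;
    my $filtered = join "\n", grep { !/div align/ } split /\n/, $content;
    # Encode first: md5_hex dies on strings that contain wide characters.
    return md5_hex(encode_utf8($filtered));
}

# Returns 1 if the two pages differ outside the ignored lines, else 0.
sub page_changed {
    my ($old, $new) = @_;
    return page_fingerprint($old) ne page_fingerprint($new) ? 1 : 0;
}

# Made-up page snippets, for illustration only.
my $old       = "<div align=center>10:01</div>\n<td>99.10</td>\n";
my $same_data = "<div align=center>10:05</div>\n<td>99.10</td>\n";
my $new_data  = "<div align=center>10:09</div>\n<td>99.35</td>\n";

print "same data changed? ", page_changed($old, $same_data), "\n";
print "new data changed?  ", page_changed($old, $new_data), "\n";
```

In the polling loop, the fingerprint of the last saved page could be kept in a variable and compared with the fingerprint of each fresh fetch, which avoids the .test files and the shell calls altogether.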