code:shell:fetchpm25

Differences

This shows you the differences between two versions of the page.

code:shell:fetchpm25 [2015/01/21 05:40] – created percy
code:shell:fetchpm25 [2016/05/05 13:07] (current) – external edit 127.0.0.1
Line 1: Line 1:
 ====== Fetch the pm 2.5 for all the cities in China ======
 +Get the PM2.5 readings from http://www.soupm25.com
 +
 +This script collects the PM2.5 readings for every city into a single file, "all_city/pm25_all.txt", and automatically backs up the result in the "history" directory.
 +
 +So have fun.
 +
 +Before you can run it, you need to install "dos2unix":
 +  apt-get install dos2unix
 +
  
 <code  Bash>
Line 11: Line 20:
 OUTPUT_TMP="${TMP}/pm25_all_latest.txt"
 wget http://www.soupm25.com/ -O ${RAW_HTML}
-ALL_URLS=`cat ${RAW_HTML}|grep ".html"|sed "s/<a href=//g"|sed "s/\"//g"|sed "s/\/\///g"|sed "s/>//g"|sed "s/<\/a//g"|cut -d "=" -f4|sed "s/<\/li//g"|grep -v -E "^$|</html|DOCTYPE"|awk -F "www" '{print "http://www"$2}'|awk -F "html" '{print $1"html "$2}'|sed 's/  / /g'|dos2unix`
+ALL_URLS=`cat ${RAW_HTML}|grep ".html"|sed "s/<a href=//g"|sed "s/\"//g"|sed "s/\/\///g"|sed "s/>//g"|sed "s/<\/a//g"|cut -d "=" -f4|sed "s/<\/li//g"|grep -v -E "^$|</html|DOCTYPE"|awk -F "www" '{print "http://www"$2}'|awk -F "html" '{print $1"html "$2}'|sed 's/  / /g'|sort|uniq|dos2unix`
 ALL_CITY=`echo "${ALL_URLS}"|grep "city"`
 echo "${ALL_CITY}" > ${CITY}
Line 25: Line 34:
     i="${line}"
     url=`echo "${i}"|cut -d " " -f1`
-    city=`echo "${i}"|awk -F "city/" '{print $2}'|sed 's/.html /_/g'|dos2unix`
+    city=`echo "${i}"|awk -F "city/" '{print $2}'|sed 's/.html /_/g'`
     pingyin=`echo "${city}"|cut -d "_" -f1`
     #echo "${pingyin}"
Line 42: Line 51:
 cp ${OUTPUT} ${HISTORY}/"`date`.txt"
 </code>
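
The newer revision appends `sort|uniq` to the ALL_URLS pipeline, which should drop repeated links from the scraped list before the per-city loop runs. A minimal sketch of just that step on made-up input follows. The function name `dedupe_urls` and the sample lines are hypothetical, and `tr -d '\r'` stands in for dos2unix so the sketch runs without it installed.

```shell
#!/usr/bin/env bash
# Sketch only: dedupe_urls is a hypothetical helper, not part of the original script.
# It mirrors the tail of the revised ALL_URLS pipeline: sort | uniq, then strip CRs.
dedupe_urls() {
    # sort groups identical lines together so uniq can collapse them;
    # tr -d '\r' removes carriage returns (assumed stand-in for dos2unix).
    sort | uniq | tr -d '\r'
}

# Made-up sample lines in the "url name" shape the script works with.
printf '%s\n' \
    'http://www.soupm25.com/city/beijing.html Beijing' \
    'http://www.soupm25.com/city/shanghai.html Shanghai' \
    'http://www.soupm25.com/city/beijing.html Beijing' \
    | dedupe_urls
```

Without the dedupe step, a city linked more than once on the index page would presumably be fetched more than once.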
 +
 +I have deployed it on my VPS, so you can get it from here:
 +
 +  - http://ef.pjq.me/download/pm25/all_city/pm25_all.txt
 +  - http://ef.pjq.me/download/pm25/history/
 +
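
To pull the published snapshot listed above, something like the following may work. It is a sketch that assumes wget is installed and that the file is still served at that address. The helper names `pm25_all_url` and `fetch_pm25` and the default output name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: pm25_all_url and fetch_pm25 are hypothetical helpers.

# Print the address of the combined file, as listed on this page.
pm25_all_url() {
    echo "http://ef.pjq.me/download/pm25/all_city/pm25_all.txt"
}

# Download the combined file to the path given as $1 (default: pm25_all.txt).
# Assumes wget is installed and the server still serves the file.
fetch_pm25() {
    local out="${1:-pm25_all.txt}"
    wget -q "$(pm25_all_url)" -O "${out}"
}

# Usage (not run here because it needs network access):
#   fetch_pm25 /tmp/pm25_all.txt && head /tmp/pm25_all.txt
```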
/var/www/dokuwiki/wiki/data/attic/code/shell/fetchpm25.1421790007.txt.gz · Last modified: 2016/05/05 13:06 (external edit)