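Open the JD.com website, search for books, pick any book and open its detail page, then click the ranking details link to reach JD's book ranking page.
Press F12 and open the Network tab, refresh with Ctrl+R, and go through the requests one by one; the data URL turns out to be: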
https://gw-e.jd.com/client.action?callback=func&body=%7B%22moduleType%22%3A1%2C%22page%22%3A1%2C%22pageSize%22%3A20%2C%22scopeType%22%3A1%7D&functionId=bookRank&client=e.jd.com&_=1732717114846
URL-decoded, the address reads as follows:
https://gw-e.jd.com/client.action?callback=func&body={"moduleType":1,"page":1,"pageSize":20,"scopeType":1}&functionId=bookRank&client=e.jd.com&_=1732717114846
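The decoding step can also be done in code rather than by hand. Below is a minimal sketch using only the standard library; the helper name `decode_query` is made up for this illustration, and the URL is the one captured above.

```python
import json
from urllib.parse import parse_qs, urlsplit

# The URL exactly as captured in the browser's Network tab.
captured = (
    "https://gw-e.jd.com/client.action?callback=func"
    "&body=%7B%22moduleType%22%3A1%2C%22page%22%3A1%2C%22pageSize%22%3A20%2C%22scopeType%22%3A1%7D"
    "&functionId=bookRank&client=e.jd.com&_=1732717114846"
)


def decode_query(url):
    """Return the query parameters as a dict, with `body` parsed as JSON.

    `decode_query` is a hypothetical helper name, not part of any library.
    """
    # parse_qs percent-decodes every value and returns lists of values.
    query = parse_qs(urlsplit(url).query)
    params = {key: values[0] for key, values in query.items()}
    # The `body` parameter is itself a JSON document.
    params["body"] = json.loads(params["body"])
    return params


params = decode_query(captured)
print(params["functionId"])
print(params["body"])
```

Parsing `body` as JSON makes it easy to see that the interesting parameters live inside it, not in the outer query string.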
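The parameters are:

| Parameter | Meaning |
| --- | --- |
| moduleType | Book sales chart / new-release bestseller chart |
| page | Page number |
| pageSize | Number of books per page |
| scopeType | Daily / weekly / monthly chart |
| _ | Timestamp |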
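With the parameters identified, the request URL can be assembled from them instead of being copied as one long string. The following is a rough sketch, not a definitive implementation: `build_rank_url` and its argument names are invented for this example, and which integer selects which chart or period is not documented here, so the defaults simply repeat the values seen in the captured request. Using a millisecond timestamp for `_` is an assumption based on the captured value having 13 digits.

```python
import json
import time
from urllib.parse import urlencode

BASE_URL = "https://gw-e.jd.com/client.action"


def build_rank_url(module_type=1, page=1, page_size=20, scope_type=1, timestamp_ms=None):
    """Assemble the ranking URL from its parameters (hypothetical helper)."""
    if timestamp_ms is None:
        # Assumption: `_` is a millisecond timestamp, as in the captured URL.
        timestamp_ms = int(time.time() * 1000)
    # Compact separators reproduce the JSON exactly as it appears in the capture.
    body = json.dumps(
        {
            "moduleType": module_type,
            "page": page,
            "pageSize": page_size,
            "scopeType": scope_type,
        },
        separators=(",", ":"),
    )
    query = urlencode(
        {
            "callback": "func",
            "body": body,
            "functionId": "bookRank",
            "client": "e.jd.com",
            "_": timestamp_ms,
        }
    )
    return f"{BASE_URL}?{query}"


# Rebuild the captured request by passing its timestamp explicitly.
print(build_rank_url(timestamp_ms=1732717114846))
```

Changing `page` or `page_size` then becomes a matter of passing a different argument rather than editing percent-encoded text.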
```python
import json

import requests


def get_data():
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
    }

    # Value for the `_` parameter of the request.
    timestamp = '1731340800'
    # Same request as above, with pageSize set to 10.
    url = f'https://gw-e.jd.com/client.action?callback=func&body=%7B%22moduleType%22%3A1%2C%22page%22%3A1%2C%22pageSize%22%3A10%2C%22scopeType%22%3A1%7D&functionId=bookRank&client=e.jd.com&_={timestamp}'
    req = requests.get(url, headers=headers, timeout=10)
    if req.status_code == 200:
        # The response is wrapped in the callback, func(...):
        # drop the leading "func(" and the trailing ")" to get plain JSON.
        req_txt = req.text[5:-1]
        j = json.loads(req_txt)
        book_lst = j['data']['books']
        # Print at most the first 10 books, numbered from 1.
        for rank, book in enumerate(book_lst[:10], start=1):
            book_name = book['bookName']
            book_definePrice = book['definePrice']
            book_sellPrice = book['sellPrice']
            print(f'{rank} Book: {book_name}, list price: {book_definePrice}, sale price: {book_sellPrice}')


if __name__ == '__main__':
    get_data()
```
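The script removes the callback wrapper by slicing a fixed number of characters, which only works while the callback is named `func` and nothing follows the closing parenthesis. A slightly more tolerant alternative is sketched below; `unwrap_jsonp` is a hypothetical helper, and the sample payload is made up for illustration, shaped only after the fields the script reads (`bookName`, `definePrice`, `sellPrice`). It is not real JD data, and the real price fields may have a different type.

```python
import json
import re

# Matches callbackName( ... ) with optional whitespace and a trailing semicolon.
JSONP_PATTERN = re.compile(r"^\s*[\w.$]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def unwrap_jsonp(text):
    """Extract and parse the JSON inside a callback(...) wrapper."""
    match = JSONP_PATTERN.match(text)
    if match is None:
        raise ValueError("response is not in callback(...) form")
    return json.loads(match.group(1))


# Made-up sample response, for illustration only.
sample = (
    'func({"data": {"books": [{"bookName": "Example Book", '
    '"definePrice": "59.00", "sellPrice": "39.90"}]}});'
)

payload = unwrap_jsonp(sample)
for rank, book in enumerate(payload["data"]["books"], start=1):
    print(rank, book["bookName"], book["definePrice"], book["sellPrice"])
```

In `get_data`, this would stand in for the `req.text[5:-1]` slice followed by `json.loads`.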