42
replies
-
<p>So the point is to compare the performance of requests and Scrapy, right?</p>
-
<pre><code>import requests
import time

# URL under test
url = "https://www.baidu.com"

# Fetch the page 100 times
def test_requests_performance():
    success_count = 0
    start_time = time.time()

    for i in range(100):
        try:
            # timeout keeps one stalled request from hanging the whole run
            response = requests.get(url, timeout=10)
            if response.status_code == 200:
                success_count += 1
        except Exception as e:
            print(f"Request {i+1} failed: {e}")

    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Fetched the page successfully {success_count} times, total time: {elapsed_time:.2f} s")

if __name__ == "__main__":
    test_requests_performance()
</code></pre><p>It takes roughly 10 seconds, depending on network conditions.</p>
-
<p>My network is poor; the crawl took 77 seconds.</p>
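<p>Totals such as 10 or 77 seconds for 100 requests are mostly waiting time, because the loop sends one request and waits for its response before sending the next. Scrapy, by contrast, keeps several requests in flight at once. The sketch below illustrates that difference with a plain thread pool, which is a stand-in chosen for brevity and not how Scrapy works internally. The names <code>timed_sequential</code>, <code>timed_concurrent</code> and <code>fake_fetch</code> are hypothetical, and <code>fake_fetch</code> only sleeps instead of touching the network.</p>

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_sequential(fetch, n=100):
    """Call fetch() n times one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fetch()
    return time.perf_counter() - start


def timed_concurrent(fetch, n=100, workers=10):
    """Call fetch() n times through a thread pool; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fetch) for _ in range(n)]
        for future in futures:
            future.result()  # re-raises any exception raised inside fetch()
    return time.perf_counter() - start


def fake_fetch():
    """Hypothetical stand-in for a network call: just waits 10 ms."""
    time.sleep(0.01)
```

<p>To time real requests, pass something like <code>lambda: requests.get(url, timeout=10)</code> as <code>fetch</code>. Be careful with <code>workers</code>: many parallel requests against one site are more likely to be rate-limited.</p>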
-
<p>This is hard.</p>
-
<p>OK.</p>
-
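<p>The address I requested was Tencent News, qq.com. The free IP proxy failed, or more precisely it was rate-limited once I exceeded the number of requests it could handle. In the end I commented out the proxy, and the run took 56.7328731000016 seconds in total. The code is below.<img src="https://mooc-image.nosdn.127.net/efe6e7a202954f1bafd2c13a07636476.png" style="max-width:750px;" ></p>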
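<p>Instead of commenting the proxy out by hand, the request can fall back to a direct connection when the proxy fails. This is a minimal sketch under assumptions: the helper name <code>get_with_proxy_fallback</code> is hypothetical, and <code>get</code> is expected to behave like <code>requests.get</code>, which accepts <code>proxies</code> and <code>timeout</code> keyword arguments.</p>

```python
def get_with_proxy_fallback(get, url, proxies, timeout=10):
    """Try the request through the proxy; on any error, retry directly.

    `get` is expected to behave like requests.get. Returns a tuple
    (response, used_proxy) so the caller knows which path succeeded.
    """
    try:
        return get(url, proxies=proxies, timeout=timeout), True
    except Exception as exc:
        # Broad on purpose so the sketch has no imports; with requests,
        # requests.exceptions.RequestException is the narrower choice.
        print(f"Proxy request failed ({exc!r}); retrying without proxy")
        return get(url, timeout=timeout), False
```

<p>With requests installed, a call could look like <code>get_with_proxy_fallback(requests.get, "https://www.qq.com", {"https": "http://127.0.0.1:8080"})</code>, where the proxy address is a placeholder. Timings that mix proxied and direct requests are not comparable, so it is worth recording <code>used_proxy</code> with each result.</p>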
-
<p>This is hard.</p>
-
<pre><code>import time
import requests

url = 'https://www.google.com'

def test_url(url):
    start_time = time.time()
    cnt = 0
    # Loop until 100 requests have succeeded; failed attempts are retried,
    # so this never terminates if the site is unreachable
    while cnt &lt; 100:
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            cnt += 1
        except Exception:
            continue
    end_time = time.time()
    print(end_time - start_time)

if __name__ == '__main__':
    test_url(url)
</code></pre><p>----------------------------------</p><p>Output:</p><p>83.87145042419434</p><p><br></p><p>Process finished with exit code 0</p>
-
<p>OK.</p>
-
<p><img src="https://mooc-image.nosdn.127.net/635afafb588d456d88b18c73aca3e1a6.png" style="max-width:750px;" ></p><p>The time is different on every run.</p>
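<p>Because a single total varies from run to run, one way to get a steadier number is to repeat the whole 100-request benchmark a few times and report the mean and spread. This is a minimal sketch: <code>benchmark</code> and <code>summarize</code> are hypothetical names, <code>fetch</code> is any zero-argument callable, and <code>time.perf_counter()</code> is used in place of <code>time.time()</code> because it is a monotonic clock intended for measuring intervals.</p>

```python
import statistics
import time


def benchmark(fetch, runs=5, requests_per_run=100):
    """Time `requests_per_run` calls to fetch(), repeated `runs` times.

    Returns a list with the total elapsed seconds of each run.
    """
    totals = []
    for _ in range(runs):
        start = time.perf_counter()
        for _ in range(requests_per_run):
            fetch()
        totals.append(time.perf_counter() - start)
    return totals


def summarize(totals):
    """Return (mean, standard deviation, min, max) of the run totals."""
    spread = statistics.stdev(totals) if len(totals) > 1 else 0.0
    return statistics.mean(totals), spread, min(totals), max(totals)
```

<p>For real measurements, <code>fetch</code> could be <code>lambda: requests.get(url, timeout=10)</code>. A large standard deviation relative to the mean suggests the network, not the library, dominates the result.</p>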
-
<p><img src="https://mooc-image.nosdn.127.net/64f8b9ec6aa5451f84a36bdb22aab855.png" style="max-width:750px;" ></p><p>https://www.asus.com.cn always returns 403</p><p>https://www.baidu.com takes 68-point-something seconds</p>
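<p>A 403 means the server refused the request, so a run made up of 403 responses times refusals, not page fetches. Tallying outcomes alongside the timing makes that visible. The sketch below assumes <code>fetch</code> returns an object with a <code>status_code</code> attribute, as <code>requests.get</code> does; <code>tally_status_codes</code> is a hypothetical name. Why that site answers 403 is not established in this thread; one possibility, stated here only as an assumption, is that it rejects the client's default headers, in which case passing <code>headers={"User-Agent": ...}</code> to <code>requests.get</code> may be worth trying.</p>

```python
from collections import Counter


def tally_status_codes(fetch, n=100):
    """Call fetch() n times and count the outcomes.

    Responses are counted under their integer status code; exceptions
    are counted under the name of their class.
    """
    outcomes = Counter()
    for _ in range(n):
        try:
            outcomes[fetch().status_code] += 1
        except Exception as exc:
            outcomes[type(exc).__name__] += 1
    return outcomes
```

<p>A result such as <code>Counter({403: 100})</code> would indicate that the measured time should not be compared with runs that returned 200.</p>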
-
231.45 seconds <img src='https://mooc-image.nosdn.127.net/26F6776D3F4A4DC43CD5D3C389637E58.jpg' />
-
<pre><code>import requests
import time

def test_requests_performance(url, times=100):
    start_time = time.time()

    for i in range(times):
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()  # check that the request succeeded
        except requests.exceptions.RequestException as e:
            print(f"Request {i+1} failed: {e}")
            return None  # abort the benchmark on the first failure

    end_time = time.time()
    total_time = end_time - start_time
    avg_time = total_time / times

    print(f"Fetched the page successfully {times} times")
    print(f"Total time: {total_time:.2f} s")
    print(f"Average time per request: {avg_time:.4f} s")
    return total_time

# Test URL: the get endpoint of httpbin.org
test_url = "https://httpbin.org/get"
total_time = test_requests_performance(test_url)

# Example output:
# Fetched the page successfully 100 times
# Total time: 12.34 s
# Average time per request: 0.1234 s
</code></pre>
-
<pre><code>import requests
import time

# Example site; swap in a suitable URL that will not block frequent requests
url = "https://www.example.com"
success_count = 0
start_time = time.time()
for _ in range(100):
    try:
        response = requests.get(url, timeout=10)
        if response.status_code == 200:
            # Response handling, such as parsing the page, could go here
            success_count += 1
        else:
            print(f"Request failed, status code: {response.status_code}")
    except requests.RequestException as e:
        print(f"Request error: {e}")
end_time = time.time()
total_time = end_time - start_time
# Report the real success count instead of assuming all 100 succeeded
print(f"Fetched the page successfully {success_count} of 100 times in {total_time} s")
</code></pre>