GSGUNDAM砍柴工

Slowly building up knowledge and experience; sharpening the axe never delays the woodcutting.

Getting Page Links with Python

By GSGundam

Published July 14, 2022

Topic: Python

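Most people are probably familiar with web resource sniffing. Here we try it out in Python by collecting every link from a given web page.

First, install the required modules.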

pip install beautifulsoup4
pip install requests

Building on the previous few exercises, this one is about as simple as it gets, so here is the code.

import requests as rq
from bs4 import BeautifulSoup

url = input("Enter Link: ")
# Prepend a scheme only when the URL does not already start with one.
if not url.startswith(("http://", "https://")):
    url = "https://" + url
data = rq.get(url)
soup = BeautifulSoup(data.text, "html.parser")
links = []
for link in soup.find_all("a"):
    links.append(link.get("href"))

# In practice you might use "a" (append) mode instead of "w" (write) mode.
with open("E:/WindowsDocuments/G7/Desktop/links.txt", 'w') as saved:
    print(links[:10], file=saved)

Entering www.baidu.com produces a links.txt file containing the first ten links found on the page.
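The raw href values collected this way usually include None (for a tags without an href), relative paths, and non-web entries such as javascript: links. As a rough sketch of one way to tidy them up, the helper below uses only the standard library's urllib.parse. The function name clean_links and the sample values are made up for illustration and are not part of the original script.

```python
from urllib.parse import urljoin, urlparse


def clean_links(hrefs, base_url):
    """Return absolute http(s) URLs from raw href values, without duplicates."""
    seen = set()
    cleaned = []
    for href in hrefs:
        # link.get("href") returns None for <a> tags without an href.
        if not href:
            continue
        # Resolve relative paths such as "/about" against the page URL.
        absolute = urljoin(base_url, href.strip())
        # Skip non-web schemes such as javascript: or mailto:.
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:
            seen.add(absolute)
            cleaned.append(absolute)
    return cleaned


# Hypothetical href values, similar to what the loop in the script might collect.
sample = [None, "/about", "https://example.com/about", "javascript:;", "//example.com/news"]
print(clean_links(sample, "https://example.com/"))
```

In the script above, this could be called as clean_links(links, url) just before writing the file, so that the saved entries are all usable absolute URLs.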


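♦ Permalink: https://www.gsgundam.com/archive/2022-07-14-how-to-all-links-webpage-python/

♦ When reposting, please credit: published by GSGundam on July 14, 2022 at GSGUNDAM砍柴工

♦ Copyright of this article belongs to the author. Reposting is welcome, but unless the author agrees otherwise, this notice must be kept and a link to the original must appear in a prominent place on the page.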
