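To extract the images from a web page, you can use Python with the requests and BeautifulSoup libraries (Pillow is only needed if you also want to inspect or process the images afterwards). Use requests to fetch the page content, parse the HTML with BeautifulSoup, find the relevant tags (such as img) and read their attributes (such as src) to collect the image links, then use requests again to download each image and save it locally. Here is a simple example: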
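import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def download_image(url, save_path):
    # Download one image and write its raw bytes to save_path.
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    with open(save_path, 'wb') as f:
        f.write(response.content)


def extract_images(url):
    # Fetch the page and parse its HTML.
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')

    # Collect the src of every <img> tag; relative links are resolved against the page URL.
    img_tags = soup.find_all('img')
    img_urls = [urljoin(url, img['src']) for img in img_tags if img.get('src')]

    # The output directory must exist before files are written into it.
    os.makedirs('images', exist_ok=True)
    for img_url in img_urls:
        image_name = img_url.split('/')[-1]
        save_path = f'images/{image_name}'
        download_image(img_url, save_path)


if __name__ == '__main__':
    url = 'https://www.example.com'
    extract_images(url)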
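This code first defines a function named download_image, which downloads an image from a given URL and saves it to the specified path. It then defines a function named extract_images, which extracts all image links from the given page URL and downloads each one. Finally, the main program calls extract_images with the URL of the target page.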
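Two details are worth handling when collecting image links: src values are often relative to the page (for example img/cat.png or //cdn.example.com/b.gif), and image URLs may carry a query string that should not end up in the local file name. The following is a minimal sketch of two helpers for this, using only the standard library. The function names resolve_image_url and filename_from_url, the default name 'image', and the sample URLs are assumptions made for illustration and are not part of the original example.

```python
import posixpath
from urllib.parse import unquote, urljoin, urlparse


def resolve_image_url(page_url, src):
    """Hypothetical helper: turn a possibly relative src into an absolute URL."""
    # urljoin resolves relative paths, root-relative paths, and
    # scheme-relative URLs ("//host/path") against the page URL.
    return urljoin(page_url, src)


def filename_from_url(img_url, default='image'):
    """Hypothetical helper: derive a local file name from the URL path only."""
    # urlparse separates the path from the query string and fragment,
    # so "a.jpg?v=3" yields "a.jpg" rather than "a.jpg?v=3".
    path = unquote(urlparse(img_url).path)
    name = posixpath.basename(path)
    # Fall back to a default when the path ends with "/" and has no file name.
    return name or default


if __name__ == '__main__':
    page = 'https://www.example.com/gallery/index.html'
    for src in ['img/cat.png', '/static/a.jpg?v=3', '//cdn.example.com/b.gif']:
        absolute = resolve_image_url(page, src)
        print(absolute, filename_from_url(absolute))
```

In extract_images, each src could be passed through resolve_image_url before downloading, and filename_from_url could replace the img_url.split('/')[-1] expression when building save_path.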