Here are the basic steps to use Beautiful Soup for web scraping:
1. **Install Beautiful Soup** 💻📦:
```
# The leading "!" runs a shell command from a Jupyter/Colab cell; omit it in a regular terminal.
!pip install beautifulsoup4
!pip install lxml
```
2. **Import the Necessary Libraries** 📚:
```
from bs4 import BeautifulSoup
import requests
```
3. **Fetch the Web Page** 🌐⬇️:
```
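url = 'http://example.com'
response = requests.get(url, timeout=10)  # avoid hanging indefinitely on a slow server
response.raise_for_status()  # raise an exception on 4xx/5xx responses
html_content = response.content  # raw bytes; Beautiful Soup detects the encoding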
```
4. **Parse the HTML Content** 🗂️🔍:
```
soup = BeautifulSoup(html_content, 'lxml') # or 'html.parser'
```
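To try the parsing step without downloading anything, here is a minimal sketch that parses an inline HTML string. The `sample_html` markup and the variable names are made up for illustration, and it uses the built-in `'html.parser'` so that `lxml` is not required:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
sample_html = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>Hello</h1>
    <p class="intro">First paragraph.</p>
  </body>
</html>
"""

# 'html.parser' ships with Python, so no extra install is needed.
sample_soup = BeautifulSoup(sample_html, "html.parser")

# Tags can be reached as attributes (first match) or through find().
page_title = sample_soup.title.string
heading_text = sample_soup.h1.get_text()
intro_text = sample_soup.find("p", class_="intro").get_text()

print(page_title)
print(heading_text)
print(intro_text)
```

Swapping `"html.parser"` for `"lxml"` should give the same results on well-formed markup like this; the parsers mainly differ in speed and in how they repair broken HTML.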
5. **Extract Data** 📄➡️🔢:
- Extract specific elements like titles, links, tables, etc.
Example - Extracting all the links 🔗:
```
for link in soup.find_all('a'):
    print(link.get('href'))  # None if the <a> tag has no href attribute
```
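Links on real pages are often relative (for example `/about`), so it can help to resolve them against the page URL. The sketch below is one possible way to do that with `urllib.parse.urljoin` from the standard library; the helper name `extract_links` and the sample markup are assumptions for illustration, not part of Beautiful Soup:

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips anchors that have no href attribute at all.
    for a_tag in soup.find_all("a", href=True):
        links.append(urljoin(base_url, a_tag["href"]))
    return links


# Hypothetical markup: one relative link, one anchor without href, one absolute link.
sample = '<a href="/about">About</a><a>No href</a><a href="https://example.org/x">X</a>'
links = extract_links(sample, "http://example.com/index.html")
print(links)
```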
Example - Extracting text from a specific tag 🏷️:
```
title_tag = soup.find('title')  # returns None if the page has no <title>
title = title_tag.get_text() if title_tag else ''
print(title)
```
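Tables, mentioned in step 5, can be handled in a similar way. The following is a rough sketch, assuming a simple table with no merged cells; the helper `table_to_rows` and the sample data are hypothetical:

```python
from bs4 import BeautifulSoup

# Hypothetical table standing in for one found on a real page.
table_html = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.80</td></tr>
</table>
"""


def table_to_rows(html):
    """Return the first table in the markup as a list of rows of cell text."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Collect header and data cells in document order.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


rows = table_to_rows(table_html)
for row in rows:
    print(row)
```

Tables that use `rowspan` or `colspan` would need extra handling that this sketch does not attempt.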