Python Basics

Read an HTML file in Python with BeautifulSoup

BeautifulSoup is a powerful library that makes it easy to scrape even complex HTML websites.

Python + Requests + BeautifulSoup = Successful Web Scraping

To use the BeautifulSoup library, it must be installed on your system. If you are using PyCharm, go to File, then the project settings, and install bs4, which provides the BeautifulSoup library. The example below also uses requests and the lxml parser, so install those packages as well.
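One way to confirm that the required modules are available before running a scraper is to ask Python whether it can find them. The following is only a minimal sketch; the helper name `missing_packages` is hypothetical and not part of any library.

```python
import importlib.util

def missing_packages(names):
    """Return the module names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# bs4 provides BeautifulSoup; requests and lxml are used by the code in this post.
missing = missing_packages(["bs4", "requests", "lxml"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```

If any module is reported as missing, install it through your IDE or package manager before continuing.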

The following code fetches an HTML page and reads it using Python and BeautifulSoup:

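import requests
from bs4 import BeautifulSoup

# Download the page; r.content holds the raw bytes of the response.
r = requests.get("https://gyanol.com/")

# Parse the HTML with the lxml parser (requires the lxml package).
soup = BeautifulSoup(r.content, "lxml")

# Print the parsed document.
print(soup)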

Output: <html><head>……..</html>
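The same approach should also work for an HTML file stored on disk instead of a live URL. The sketch below is an illustration under stated assumptions: the helper `read_html_file`, the file name `sample.html`, and the sample markup are all hypothetical, and it uses Python's built-in `html.parser` so that no extra parser is needed.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample markup, written to a temporary file for the demo.
sample_html = (
    "<html><head><title>Sample page</title></head>"
    "<body><p>Hello</p></body></html>"
)

def read_html_file(path):
    """Parse a local HTML file and return a BeautifulSoup object."""
    with open(path, encoding="utf-8") as f:
        # BeautifulSoup accepts an open file handle as well as a string.
        return BeautifulSoup(f, "html.parser")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample_html)
    soup_from_file = read_html_file(path)

print(soup_from_file.title.string)  # Sample page
```

To use it with your own page, pass the path of your HTML file to `read_html_file`.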

The soup variable in the code above holds the entire parsed HTML page. By targeting individual elements, we can extract the data we need.
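As a rough illustration of targeting individual elements, the sketch below parses a small inline HTML string rather than a live website. The markup, the `main-heading` id, and the link paths are made up for the example. It uses the built-in `html.parser`, so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
html = """
<html>
  <head><title>Example page</title></head>
  <body>
    <h1 id="main-heading">Welcome</h1>
    <a href="/about">About</a>
    <a href="/contact">Contact</a>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# Access the <title> tag directly as an attribute of the soup.
title = soup.title.string

# find() returns the first matching tag; here we match on the id attribute.
heading = soup.find("h1", id="main-heading").get_text()

# find_all() returns every matching tag; collect each link's href.
links = [a["href"] for a in soup.find_all("a")]

# select() takes a CSS selector; collect the visible text of each link.
link_texts = [a.get_text() for a in soup.select("body a")]

print(title)       # Example page
print(heading)     # Welcome
print(links)       # ['/about', '/contact']
print(link_texts)  # ['About', 'Contact']
```

`find()` and `find_all()` match by tag name and attributes, while `select()` is convenient when you already know the CSS selector of the element you want.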