A package to parse raw HTML and return structured information.
html2info
html2info is a Python package that allows you to parse LinkedIn profiles from raw HTML and return structured information in JSON format.
Features
- Extracts profile information such as name, title, location, profile photo, about, experience, and education.
- Returns a JSON object containing the parsed data.
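As a rough illustration of the JSON output mentioned above, the sketch below serializes a parsed-profile dictionary with the standard library. The `profile` dict is a hypothetical stand-in for what `Person.to_dict()` returns; only the `linkedin_url` key is taken from this README, and the helper name `profile_to_json` is an assumption made for the example, not part of html2info.

```python
import json

# Hypothetical stand-in for the dict returned by Person.to_dict().
# Only "linkedin_url" appears in this README; other keys are not shown here.
profile = {"linkedin_url": "https://www.linkedin.com/in/iglovikov/"}


def profile_to_json(data: dict) -> str:
    """Serialize a parsed-profile dict to a JSON string (illustrative helper)."""
    # ensure_ascii=False keeps non-ASCII names and locations readable.
    return json.dumps(data, ensure_ascii=False, indent=2)


json_text = profile_to_json(profile)
print(json_text)
```

The resulting string can be written to a file or sent over the network like any other JSON document.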
Installation
Install html2info using pip:
pip install html2info
Usage
Here's an example of how to use html2info:
from html2info.linkedin import Person
url = "https://www.linkedin.com/in/iglovikov/"
raw_data = "..." # Raw HTML content of the LinkedIn page
person = Person(url, raw_data)
person.parse()
print(person.to_dict())
Output:
{
"linkedin_url": "https://www.linkedin.com/in/iglovikov/",
...
}
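The usage example above expects raw_data to already hold the page's HTML. One possible way to supply it, sketched here under assumptions, is to read a page you have saved to disk. The helper names `load_raw_html` and `parse_saved_profile`, the file path, and the UTF-8 encoding are all assumptions for illustration; only `Person`, `parse()`, and `to_dict()` come from this README.

```python
from pathlib import Path


def load_raw_html(path: str) -> str:
    """Read a saved HTML page as text (assumes the file is UTF-8 encoded)."""
    return Path(path).read_text(encoding="utf-8")


def parse_saved_profile(url: str, path: str) -> dict:
    """Parse a saved LinkedIn page using the calls shown in the usage example."""
    # Imported lazily so load_raw_html can be used without html2info installed.
    from html2info.linkedin import Person

    person = Person(url, load_raw_html(path))
    person.parse()
    return person.to_dict()
```

A call might look like `parse_saved_profile("https://www.linkedin.com/in/iglovikov/", "profile.html")`, where `profile.html` is a hypothetical file name for the saved page.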