A package to parse raw HTML and return structured information.

html2info

html2info is a Python package that parses LinkedIn profiles from raw HTML and returns structured information in JSON format.

Features

  • Extracts profile information such as name, title, location, profile photo, about, experience, and education.
  • Returns a JSON object containing the parsed data.

Installation

Install html2info using pip:

pip install html2info

Usage

Here's an example of how to use html2info:

from html2info.linkedin import Person

url = "https://www.linkedin.com/in/iglovikov/"
raw_data = "..."  # Raw HTML content of the LinkedIn page

person = Person(url, raw_data)
person.parse()
print(person.to_dict())

Output:

{
  "linkedin_url": "https://www.linkedin.com/in/iglovikov/",
  ...
}
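
The example above leaves open where raw_data comes from and how to turn the result into a JSON string. The following is a minimal sketch, assuming the page has already been saved to a local file and that the dictionary returned by to_dict() contains only JSON-serializable values. The helper names (load_raw_html, profile_to_json, parse_saved_profile) and the file name profile.html are hypothetical and not part of html2info; only Person, parse(), and to_dict() come from the usage example.

```python
import json
from pathlib import Path


def load_raw_html(path):
    """Read a saved profile page from disk as text (hypothetical helper)."""
    return Path(path).read_text(encoding="utf-8")


def profile_to_json(profile):
    """Serialize a parsed-profile dictionary to a JSON string.

    Assumes every value in the dictionary is JSON-serializable.
    """
    return json.dumps(profile, indent=2, ensure_ascii=False)


def parse_saved_profile(url, path):
    """Parse a saved page with html2info and return the result as JSON."""
    # Imported here so the two helpers above work without html2info installed.
    from html2info.linkedin import Person

    person = Person(url, load_raw_html(path))
    person.parse()
    return profile_to_json(person.to_dict())
```

With html2info installed, parse_saved_profile("https://www.linkedin.com/in/iglovikov/", "profile.html") would return the parsed profile as a JSON string, provided profile.html holds the raw HTML of that page.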

Download files

Source Distribution

html2info-0.1.1.tar.gz (4.1 kB)

Built Distribution

html2info-0.1.1-py3-none-any.whl (3.5 kB, Python 3)
