extract-emails

Extract email addresses from a given URL.

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Programming Language
- Python
- Python :: 3.6

Project description

Extract emails from a given website

Requirements

- Python 3.6 or later
- requests
- lxml

Installation

pip install extract_emails

Usage

With default browsers

from extract_emails import EmailExtractor
from extract_emails.browsers import ChromeBrowser


with ChromeBrowser() as browser:
    email_extractor = EmailExtractor("http://www.tomatinos.com/", browser, depth=2)
    emails = email_extractor.get_emails()


for email in emails:
    print(email)
    print(email.as_dict())

# Email(email="bakedincloverdale@gmail.com", source_page="http://www.tomatinos.com/")
# {'email': 'bakedincloverdale@gmail.com', 'source_page': 'http://www.tomatinos.com/'}
# Email(email="freshlybakedincloverdale@gmail.com", source_page="http://www.tomatinos.com/")
# {'email': 'freshlybakedincloverdale@gmail.com', 'source_page': 'http://www.tomatinos.com/'}

from extract_emails import EmailExtractor
from extract_emails.browsers import RequestsBrowser


with RequestsBrowser() as browser:
    email_extractor = EmailExtractor("http://www.tomatinos.com/", browser, depth=2)
    emails = email_extractor.get_emails()


for email in emails:
    print(email)
    print(email.as_dict())

# Email(email="bakedincloverdale@gmail.com", source_page="http://www.tomatinos.com/")
# {'email': 'bakedincloverdale@gmail.com', 'source_page': 'http://www.tomatinos.com/'}
# Email(email="freshlybakedincloverdale@gmail.com", source_page="http://www.tomatinos.com/")
# {'email': 'freshlybakedincloverdale@gmail.com', 'source_page': 'http://www.tomatinos.com/'}
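The results above are objects with an as_dict() method, so they can be post-processed with the standard library. The following is a minimal sketch, not part of extract_emails: the Email class here is a hypothetical stand-in that only mirrors the printed output above, and unique_rows and to_csv are illustrative helper names. It deduplicates addresses case-insensitively and renders them as CSV.

```python
import csv
import io
from typing import Dict, Iterable, List


class Email:
    """Hypothetical stand-in that mirrors the printed output shown above.

    With the real library you would use the objects returned by
    get_emails() instead of constructing these by hand.
    """

    def __init__(self, email: str, source_page: str) -> None:
        self.email = email
        self.source_page = source_page

    def as_dict(self) -> Dict[str, str]:
        return {"email": self.email, "source_page": self.source_page}


def unique_rows(emails: Iterable[Email]) -> List[Dict[str, str]]:
    """Return one dict per distinct address, keeping first-seen order."""
    seen = set()
    rows = []
    for item in emails:
        row = item.as_dict()
        key = row["email"].lower()  # treat addresses as case-insensitive
        if key not in seen:
            seen.add(key)
            rows.append(row)
    return rows


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["email", "source_page"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


emails = [
    Email("bakedincloverdale@gmail.com", "http://www.tomatinos.com/"),
    Email("freshlybakedincloverdale@gmail.com", "http://www.tomatinos.com/"),
    Email("bakedincloverdale@gmail.com", "http://www.tomatinos.com/"),
]
print(to_csv(unique_rows(emails)))
```

Keeping the first occurrence of each address preserves the page where it was first seen, which is usually the most useful source_page to report.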

With custom browser

from extract_emails import EmailExtractor
from extract_emails.browsers import BrowserInterface

from selenium import webdriver
from selenium.webdriver.firefox.options import Options


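class FirefoxBrowser(BrowserInterface):
    """Custom browser backed by Selenium's Firefox driver."""

    def __init__(self):
        ff_options = Options()
        # Path to the local geckodriver binary; adjust it for your machine.
        # Selenium 4 deprecated executable_path in favor of a Service object,
        # so this call targets older Selenium releases.
        self._driver = webdriver.Firefox(
            options=ff_options, executable_path="/home/di/geckodriver",
        )

    def close(self):
        # Shut down the browser and the driver process.
        self._driver.quit()

    def get_page_source(self, url: str) -> str:
        # Load the page and return the rendered HTML.
        self._driver.get(url)
        return self._driver.page_source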


with FirefoxBrowser() as browser:
    email_extractor = EmailExtractor("http://www.tomatinos.com/", browser, depth=2)
    emails = email_extractor.get_emails()

for email in emails:
    print(email)
    print(email.as_dict())

# Email(email="bakedincloverdale@gmail.com", source_page="http://www.tomatinos.com/")
# {'email': 'bakedincloverdale@gmail.com', 'source_page': 'http://www.tomatinos.com/'}
# Email(email="freshlybakedincloverdale@gmail.com", source_page="http://www.tomatinos.com/")
# {'email': 'freshlybakedincloverdale@gmail.com', 'source_page': 'http://www.tomatinos.com/'}
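The custom browser above needs only two methods, close and get_page_source. As a rough illustration of the same shape without Selenium, here is a sketch built on the standard library's urllib. UrllibBrowser, its timeout and user_agent parameters, and the data: URL used in the demo are all assumptions made for this example. The class is deliberately written without inheriting from BrowserInterface so it runs without the package installed; to pass it to EmailExtractor you would presumably subclass BrowserInterface as in the Firefox example, which has not been verified here.

```python
from urllib.request import Request, urlopen


class UrllibBrowser:
    """Hypothetical standard-library browser with the same two methods
    as the custom browser above (not part of extract_emails)."""

    def __init__(self, timeout: float = 10.0, user_agent: str = "Mozilla/5.0"):
        self._timeout = timeout
        self._headers = {"User-Agent": user_agent}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def close(self):
        # urllib opens one connection per request, so nothing is held open.
        pass

    def get_page_source(self, url: str) -> str:
        request = Request(url, headers=self._headers)
        with urlopen(request, timeout=self._timeout) as response:
            # Fall back to UTF-8 when the response declares no charset.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")


# Demo with a data: URL so the sketch runs without network access.
with UrllibBrowser() as browser:
    page = browser.get_page_source("data:text/plain,contact%40example.com")

print(page)
```

Unlike a Selenium-driven browser, this returns the raw response body, so addresses that a page injects with JavaScript would not appear in the result.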

Release history

- 5.3.3: Feb 16, 2024
- 5.3.2: Dec 29, 2023
- 5.3.1: May 24, 2022
- 5.3.0: May 16, 2022
- 5.2.0: Apr 3, 2022
- 5.1.3: Jan 20, 2022
- 5.1.2: Dec 27, 2021
- 5.1.1: Oct 11, 2021
- 5.1.0: Oct 11, 2021
- 5.0.2: Oct 7, 2021
- 5.0.0: Oct 4, 2021
- 4.1.0: Nov 25, 2020
- 4.0.3: Nov 5, 2020
- 4.0.2: Oct 16, 2020
- 4.0.1: Aug 13, 2020
- 4.0.0: Aug 13, 2020 (this version)
- 3.0.5: Mar 3, 2020
- 3.0.4: Aug 18, 2019
- 3.0.3: Aug 18, 2019
- 3.0.2: Aug 16, 2019
- 3.0.1: Mar 22, 2019
- 2.0.1: Feb 11, 2018
- 2.0.0: Feb 11, 2018
- 1.0.3: Jul 25, 2017
- 1.0.2: Jul 24, 2017
- 1.0.1: Jul 24, 2017
- 1.0.0: Jul 24, 2017

Download files

Source Distribution

extract_emails-4.0.0.tar.gz (13.7 kB)

Uploaded Aug 13, 2020 Source

Built Distribution

extract_emails-4.0.0-py2.py3-none-any.whl (17.6 kB)

Uploaded Aug 13, 2020 Python 2 Python 3

Hashes for extract_emails-4.0.0.tar.gz

Algorithm	Hash digest
SHA256	`c3580d12b2413773b1e564be1dd7d3c8532c209a7d0fa93cf233a0b0a08bfd60`
MD5	`f956cd1e0b2d3bac792f7fb2ee7994ad`
BLAKE2b-256	`b13cfb6558e564fcf3cfae788d2cfa9645207b9e08ba59db1a62466557f7a5de`

Hashes for extract_emails-4.0.0-py2.py3-none-any.whl

Algorithm	Hash digest
SHA256	`d9850bdb7b2a9c23ee1331391330939a717e73a04bd07f4d5adb704c0b362035`
MD5	`bde75b647c899d3176049e491c1afe98`
BLAKE2b-256	`80242caf77867a0698414ede28ec3cc381eb073954b624cc7b3213c75786d498`
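
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper names sha256_of and matches are illustrative, and the script assumes the files sit in the current directory under the names shown above.

```python
import hashlib
import hmac
import os

# SHA256 digests copied from the tables above.
EXPECTED_SHA256 = {
    "extract_emails-4.0.0.tar.gz":
        "c3580d12b2413773b1e564be1dd7d3c8532c209a7d0fa93cf233a0b0a08bfd60",
    "extract_emails-4.0.0-py2.py3-none-any.whl":
        "d9850bdb7b2a9c23ee1331391330939a717e73a04bd07f4d5adb704c0b362035",
}


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with the expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Check whichever of the two files is present in the current directory.
for filename, expected in EXPECTED_SHA256.items():
    if os.path.isfile(filename):
        status = "OK" if matches(filename, expected) else "MISMATCH"
        print(filename, status)
```

pip can perform the same check itself when a requirements file pins hashes and is installed with the --require-hashes option.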