VADER sentiment classifier updated with financial lexicons
FinVADER
The VADER (Valence Aware Dictionary and sEntiment Reasoner) classifier is a widely used model for sentiment analysis. It relies on a general-language, human-curated lexicon that includes linguistic features common on social media. As a result, it performs worse on texts that use domain-specific language, such as finance or economics.
FinVADER improves VADER's classification accuracy on financial texts by adding two finance lexicons: SentiBigNomics and Henry's word list. SentiBigNomics is a detailed financial lexicon for aspect-based sentiment analysis with approximately 7300 terms, each carrying a polarity score in the range [-1, 1]. Henry's lexicon covers 189 words that appear in company earnings press releases.
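As a rough illustration of how a domain lexicon can extend a general one, the sketch below merges two small word-to-score dictionaries. The sample words and scores, the helper name `merge_lexicons`, and the rescaling from [-1, 1] to a [-4, 4] valence range (the scale on which VADER's own lexicon is rated) are all assumptions made for illustration. This is not necessarily how FinVADER combines its lexicons internally.

```python
def merge_lexicons(general, domain, scale=4.0):
    """Return a copy of `general` updated with rescaled `domain` scores.

    Hypothetical helper: `domain` scores are assumed to lie in [-1, 1] and
    are multiplied by `scale` to match a [-4, 4] valence range.
    """
    merged = dict(general)  # copy so the input dictionary stays untouched
    for term, score in domain.items():
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score for {term!r} is outside [-1, 1]: {score}")
        merged[term] = score * scale  # domain entries override general ones
    return merged


# Made-up toy entries, not taken from any real lexicon
general_lexicon = {"good": 1.9, "drop": -1.1}
finance_lexicon = {"drop": -0.5, "downgrade": -0.75}

merged = merge_lexicons(general_lexicon, finance_lexicon)
print(merged)
```

In this sketch a term present in both dictionaries takes its domain score, which reflects the idea that the finance-specific meaning should win on financial text.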
FinVADER outperforms VADER on the Financial PhraseBank dataset.
The code for this benchmark test is here
Installation
FinVADER requires Python 3.8 to 3.11 and NLTK.
To install using pip, use:
pip install finvader
Data requirements
FinVADER requires complete text data with no NaN values or empty strings. Remove them during pre-processing.
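One possible way to do this clean-up with pandas is sketched below. The toy data frame and the column name `contents` are placeholders, and treating whitespace-only strings as empty is an assumption that may not suit every dataset.

```python
import pandas as pd

# Toy data frame; the rows and the column name "contents" are placeholders
data = pd.DataFrame(
    {"contents": ["Sales rose by 5%.", None, "", "   ", "Profit fell."]}
)

data = data.dropna(subset=["contents"])          # drop NaN / None values
data = data[data["contents"].str.strip() != ""]  # drop empty or whitespace-only strings
data = data.reset_index(drop=True)               # renumber the remaining rows

print(data)
```

Only the rows with actual text remain, so every value later passed to the classifier is a non-empty string.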
Usage
- Import the library:
from finvader import finvader
- Select lexicons:
def finvader(text = 'str',                      # Text to classify (a string)
             indicator = 'str',                 # VADER's indicator: 'pos'/'neg'/'neu'/'compound'
             use_sentibignomics: bool = False,  # Use the SentiBigNomics lexicon
             use_henry: bool = False):          # Use Henry's lexicon
    ...
- Use the classifier:
text = "The period's sales dropped to EUR 30.6 m from EUR 38.3 m, according to the interim report, released today."
scores = finvader(text,
                  use_sentibignomics = True,
                  use_henry = True,
                  indicator = 'compound')
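If a categorical label is needed rather than a number, a compound score can be mapped to one with simple cutoffs. The sketch below uses the ±0.05 thresholds commonly applied to VADER's compound score. The helper `label_from_compound` and the sample scores are hypothetical and not part of finvader, and the sketch assumes the classifier returns a single number in [-1, 1] when `indicator = 'compound'`.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label (hypothetical helper)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Made-up scores for illustration, not real classifier output
for example_score in (-0.34, 0.0, 0.62):
    print(example_score, label_from_compound(example_score))
```

The threshold is exposed as a parameter because the right cutoff depends on the data; it is worth checking against labelled examples before relying on it.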
Documentation, examples and tutorials
Example of using the classifier:
import pandas as pd
from finvader import finvader

data = pd.read_csv("ecb_speeches.csv")  # read data

# Apply FinVADER to each text and store the result in a new column of the data frame
data['finvader'] = data.contents.apply(finvader,
                                       use_sentibignomics = True,  # Use Lexicon 1
                                       use_henry = True,           # Use Lexicon 2
                                       indicator = "compound")     # Use VADER's compound indicator
For coding examples, see these tutorials:
FinVADER: Sentiment Analysis for Financial Applications here
Fine-tuning VADER Classifier with Domain-specific Lexicons here
Please visit here for any questions, issues, bugs, and suggestions.