A configurable, tunable, and reproducible library for CTR prediction

FuxiCTR

This repo is the latest dev version of the official release at huawei-noah/benchmark/FuxiCTR.

Click-through rate (CTR) prediction is a critical task for many industrial applications such as online advertising, recommender systems, and sponsored search. FuxiCTR provides an open-source library for CTR prediction, with strong support for configurability, tunability, and reproducibility. It also underpins BARS-CTR-Benchmark, an open benchmark for CTR prediction.
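
CTR prediction is typically framed as binary classification: given sparse categorical features of a user-ad pair, a model outputs the probability of a click. As a minimal, framework-free sketch of this idea (illustrative only, not FuxiCTR code), a logistic-regression scorer over one-hot feature indices:

```python
import math

def ctr_score(feature_indices, weights, bias=0.0):
    """Logistic-regression CTR scorer over one-hot (sparse) features.

    feature_indices: indices of the active categorical features.
    weights: a weight per feature index (dict or list).
    Returns the predicted click probability in (0, 1).
    """
    logit = bias + sum(weights[i] for i in feature_indices)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical example: two active features
# (e.g., user_country=US and ad_category=sports)
weights = {0: 0.4, 1: -0.2, 2: 1.1}
prob = ctr_score([0, 2], weights, bias=-1.0)  # sigmoid(0.4 + 1.1 - 1.0) = sigmoid(0.5)
```

Models in the list below (FM, DeepFM, DCN, ...) replace this linear logit with learned feature-interaction terms, but the output is the same: a click probability.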

Model List

Publication Model Paper Available
WWW'07 LR Predicting Clicks: Estimating the Click-Through Rate for New Ads ✓
ICDM'10 FM Factorization Machines ✓
CIKM'15 CCPM A Convolutional Click Prediction Model ✓
RecSys'16 FFM Field-aware Factorization Machines for CTR Prediction ✓
RecSys'16 YoutubeDNN Deep Neural Networks for YouTube Recommendations ✓
DLRS'16 Wide&Deep Wide & Deep Learning for Recommender Systems ✓
ICDM'16 IPNN Product-based Neural Networks for User Response Prediction ✓
KDD'16 DeepCross Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features ✓
NIPS'16 HOFM Higher-Order Factorization Machines ✓
IJCAI'17 DeepFM DeepFM: A Factorization-Machine based Neural Network for CTR Prediction ✓
SIGIR'17 NFM Neural Factorization Machines for Sparse Predictive Analytics ✓
IJCAI'17 AFM Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks ✓
ADKDD'17 DCN Deep & Cross Network for Ad Click Predictions ✓
WWW'18 FwFM Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising ✓
KDD'18 xDeepFM xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems ✓
KDD'18 DIN Deep Interest Network for Click-Through Rate Prediction ✓
CIKM'19 FiGNN FiGNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction ✓
CIKM'19 AutoInt/AutoInt+ AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks ✓
RecSys'19 FiBiNET FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction ✓
WWW'19 FGCNN Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction ✓
AAAI'19 HFM/HFM+ Holographic Factorization Machines for Recommendation ✓
Neural Networks'20 ONN Operation-aware Neural Networks for User Response Prediction ✓
AAAI'20 AFN/AFN+ Adaptive Factorization Network: Learning Adaptive-Order Feature Interactions ✓
AAAI'20 LorentzFM Learning Feature Interactions with Lorentzian Factorization ✓
WSDM'20 InterHAt Interpretable Click-through Rate Prediction through Hierarchical Attention ✓
DLP-KDD'20 FLEN FLEN: Leveraging Field for Scalable CTR Prediction ✓
WWW'21 FmFM FM^2: Field-matrixed Factorization Machines for Recommender Systems ✓

Dependency

FuxiCTR requires the following dependencies. While the implementation should work with newer PyTorch versions, we have currently tested only PyTorch v1.0–1.1.

  • python 3.6
  • pytorch v1.0/v1.1
  • pyyaml >=5.1
  • scikit-learn
  • pandas
  • numpy
  • h5py
  • tqdm
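
Under these constraints, a pip requirements file might look like the following (the pins are illustrative; adjust them to your environment):

```text
torch==1.1.0
pyyaml>=5.1
scikit-learn
pandas
numpy
h5py
tqdm
```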

Get Started

1. Run the demo

Please follow the examples in the demo directory to get started. The code workflow is structured as follows:

# Set the data config and model config
feature_cols = [{...}] # define feature columns
label_col = {...} # define label column
params = {...} # set data params and model params

# Set the feature encoding specs
feature_encoder = FeatureEncoder(feature_cols, label_col, ...) # define the feature encoder
feature_encoder.fit(...) # fit and transform the data

# Load data generators
train_gen, valid_gen, test_gen = data_generator(feature_encoder, ...)

# Define a model
model = DeepFM(...)

# Train the model
model.fit_generator(train_gen, validation_data=valid_gen, ...)

# Evaluation
model.evaluate_generator(test_gen)
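
To make the first step above concrete, feature columns and the label column are plain Python dicts. The field names below are hypothetical (a toy user/item dataset) and the exact keys should be checked against the configs in the demo directory:

```python
# Hypothetical data config: field names are made up for illustration;
# follow the demo directory for the actual schema.
feature_cols = [
    {"name": "user_id",     "active": True, "dtype": "str", "type": "categorical"},
    {"name": "item_id",     "active": True, "dtype": "str", "type": "categorical"},
    {"name": "device_type", "active": True, "dtype": "str", "type": "categorical"},
]
label_col = {"name": "click", "dtype": "float"}

# Illustrative hyper-parameter values only.
params = {
    "batch_size": 128,
    "embedding_dim": 10,
    "learning_rate": 1e-3,
    "epochs": 10,
}
```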

2. Run the benchmark with a given experiment_id in the config file

To reproduce an experiment result, run the benchmarking script with the corresponding config files as follows.

  • --config: The config directory of data and model config files.
  • --expid: The specific experiment_id that denotes the detailed data and model settings.
  • --gpu: The GPU index to use for the experiment (-1 for CPU).

In the following example, we create a demo model_config.yaml and dataset_config.yaml in benchmarks/expid_config, and set the experiment_id to FM_test.

cd benchmarks
python run_expid.py --config ./expid_config --expid FM_test --gpu 0
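
A minimal dataset_config.yaml / model_config.yaml pair for such an expid might look like this. The keys and values below are an illustrative sketch, not the library's exact schema; consult the demo configs in the repository for the authoritative format:

```yaml
# dataset_config.yaml (illustrative sketch)
tiny_data:
    data_format: csv
    train_data: ../data/tiny_data/train.csv
    valid_data: ../data/tiny_data/valid.csv
    test_data: ../data/tiny_data/test.csv

# model_config.yaml (illustrative sketch)
FM_test:
    model: FM
    dataset_id: tiny_data
    learning_rate: 1.e-3
    batch_size: 128
    epochs: 1
```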

3. Tune the model hyper-parameters

To tune model hyper-parameters, apply grid search over the specified tuning space with the following script.

  • --config: The config file that defines the tuning space
  • --tag: (optional) Specify the tag to determine which expid to run (e.g. 001 for the first expid). This is useful for rerunning one specific experiment_id that contains the tag.
  • --gpu: The GPU indices available for parameter tuning; multiple GPUs can be used (e.g., --gpu 0 1 for two GPUs).

In the following example, we use the hyper-parameters of FM_test in benchmarks/expid_config as the base setting, and create a tuner config file FM_tuner_config.yaml in benchmarks/tuner_config, which defines the tuning space. In particular, if a key in tuner_space has its values stored in a list, those values will be grid-searched; otherwise, the default value from FM_test is applied. After tuning finishes, all search results can be accessed from FM_tuner_config.csv in the ./benchmarks folder.

cd benchmarks
python run_param_tuner.py --config ./tuner_config/FM_tuner_config.yaml --gpu 0 1
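
The grid expansion described above can be sketched in plain Python: every tuner_space entry whose value is a list is enumerated, while scalar entries stay fixed. This is a sketch of the behavior, not FuxiCTR's internal code:

```python
import itertools

def expand_tuner_space(tuner_space):
    """Expand list-valued entries into a grid of experiment settings."""
    grid_keys = [k for k, v in tuner_space.items() if isinstance(v, list)]
    fixed = {k: v for k, v in tuner_space.items() if k not in grid_keys}
    for combo in itertools.product(*(tuner_space[k] for k in grid_keys)):
        setting = dict(fixed)
        setting.update(dict(zip(grid_keys, combo)))
        yield setting

# Hypothetical tuning space: two list-valued keys are grid-searched,
# batch_size stays fixed in every generated setting.
space = {"embedding_dim": [8, 16], "learning_rate": [1e-3, 1e-4], "batch_size": 128}
settings = list(expand_tuner_space(space))  # 2 x 2 = 4 settings
```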

For more running examples, please refer to the benchmarking results in BARS-CTR-Benchmark.

Code Structure

See the code structure overview for more details on the API design.

Discussion

You are welcome to join our WeChat group for questions and discussions.

Join Us

We have open positions for internships and full-time jobs. If you are interested in research and practice in recommender systems, please send your CV to jamie.zhu@huawei.com.

