pyvine
This package provides algorithms for modeling, sampling and testing regular vine copulas. It also includes routines for several popular bivariate copulas, optimized for a wide parameter range, high precision and good performance.
Regular vine copulas provide rich models for dependence structure
modeling. They combine vine structures and families of bivariate
copulas to construct a large number of multivariate distributions that
can model a wide range of dependence patterns, with different tail
dependence for different pairs. Two special cases of regular vine
copulas, C-vine and D-vine copulas, have been studied extensively.
We propose the Python package pyvine for modeling, sampling and
testing a more general regular vine copula (R-vine for short). The
R-vine modeling algorithm searches sequentially, tree by tree, for the
R-vine structure that maximizes the vine tree dependence, i.e., the
sum of the absolute values of Kendall's tau for the paired variables
on the edges, using Prim's minimum-spanning-tree algorithm. The
maximum likelihood estimation algorithm takes the sequential
estimates as initial values and uses the L-BFGS-B algorithm to
optimize the likelihood. The R-vine sampling algorithm traverses all
the edges of the vine structure recursively, starting from the last
tree, and generates the marginal samples on each edge according to
some nested conditions. The goodness-of-fit testing algorithm first
generates the Rosenblatt-transformed data E, then tests the composite
hypothesis H_0*: E ~ C* using the Anderson-Darling statistic, where C*
is the independence copula. A bootstrap method generates the empirical
distribution of the Anderson-Darling statistic from replications,
which is used to compute an adjusted p-value.
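The structure search for the first vine tree can be illustrated with a small, pure-Python sketch. It is not pyvine's implementation: the function names `kendall_tau` and `first_tree` are hypothetical, the tau estimate assumes data without ties, and only the first tree is built (later trees also need the proximity condition and conditional pseudo-observations, which are not shown). Maximizing the sum of |tau| is equivalent to running a minimum-spanning-tree algorithm on the negated weights; here Prim's algorithm simply picks the heaviest edge at each step.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau for two equal-length sequences without ties."""
    score = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        score += (s > 0) - (s < 0)  # +1 if concordant, -1 if discordant
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs


def first_tree(columns):
    """Grow a spanning tree that maximizes the sum of |tau| (Prim's algorithm).

    `columns` maps a variable name to its list of observations.
    Returns the chosen edges as (node_in_tree, new_node) pairs.
    """
    names = list(columns)
    weight = {
        frozenset((a, b)): abs(kendall_tau(columns[a], columns[b]))
        for a, b in combinations(names, 2)
    }
    in_tree = [names[0]]
    edges = []
    while len(in_tree) < len(names):
        # Heaviest edge joining the current tree to a variable outside it.
        a, b = max(
            ((a, b) for a in in_tree for b in names if b not in in_tree),
            key=lambda e: weight[frozenset(e)],
        )
        edges.append((a, b))
        in_tree.append(b)
    return edges


# Toy data (made up for illustration).
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 3, 4, 5, 6]
z = [3, 1, 2, 6, 4, 5]
print(first_tree({"x": x, "y": y, "z": z}))
```

On the toy data the pair (x, y) has the largest |tau|, so it is selected first, and z is then attached through whichever of x or y it depends on more strongly.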
Computing copula-related functions, such as cumulative distribution
functions, often runs into overflow. We address this problem by
reinvestigating six popular families of bivariate copulas: Normal,
Student t, Clayton, Gumbel, Frank and Joe. Approximations of these
functions are used when overflow would occur in the computation. All
of this is implemented in bvcopula, a subpackage of pyvine, whose
subroutines are written in Fortran and wrapped into Python via f2py
to achieve both good performance and high precision.
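To make the overflow issue concrete, the following sketch evaluates the Clayton copula CDF, C(u, v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta), in log space. This is one generic way to avoid overflow and is an illustration only: it is not the approximation used in bvcopula, and the function name `clayton_cdf` is hypothetical. The direct formula fails for large theta because, for example, 0.3^(-5000) exceeds the largest double-precision value.

```python
import math


def clayton_cdf(u, v, theta):
    """Clayton copula CDF evaluated in log space.

    Assumes 0 < u, v <= 1 and theta > 0.
    """
    a = -theta * math.log(u)  # log(u ** -theta), always >= 0
    b = -theta * math.log(v)  # log(v ** -theta), always >= 0
    hi, lo = max(a, b), min(a, b)
    # log(e^a + e^b - 1) = hi + log(1 + e^(lo - hi) - e^(-hi));
    # every exponent on the right is <= 0, so nothing can overflow.
    log_sum = hi + math.log1p(math.exp(lo - hi) - math.exp(-hi))
    return math.exp(-log_sum / theta)


print(clayton_cdf(0.5, 0.5, 2.0))     # moderate dependence
print(clayton_cdf(0.3, 0.6, 5000.0))  # very strong dependence
```

For very large theta the Clayton copula approaches min(u, v), which gives a quick sanity check on the second call.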
An example of R-vine copula modeling is given below::
# Example
import pandas as pd
import pyvine as pv
## read the data and apply the rank transformation
dat = pd.read_csv("data.csv", index_col=0, parse_dates=True)
cp_dat = dat.rank() / (len(dat) + 1)
## initialize R-vine object named rv
rv = pv.Rvine(cp_dat)
## sequential estimation for rv. 'structure' accepts 'r' for R-vine,
## 'c' for C-vine and 'd' for D-vine; 'familyset' accepts a list of
## integers from 1 to 6; 'threads_num' accepts an integer giving the
## number of threads used to run the MLE on edges of the same vine tree
## simultaneously.
rv.modeling(structure='r', familyset=[1, 2, 3, 4, 5, 6], threads_num=2)
## maximum likelihood estimation for rv. 'disp' controls whether the
## progress of the L-BFGS-B iterations is printed; 'threads_num'
## specifies the number of threads used to compute the log-likelihood
## value for each edge in the same vine tree.
rv.mle(disp=False, threads_num=2)
## plot the R-vine structure of the fitted object rv. All the vine trees
## are plotted by default.
rv.plot()
## display the estimation result on each edge. 'ndigits' controls the
## number of decimal digits in the output.
rv.res(ndigits=3)
## goodness-of-fit testing
rv.test()
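The goodness-of-fit test described earlier relies on an Anderson-Darling statistic. As a rough illustration of that ingredient only, the sketch below computes the statistic for a univariate sample against Uniform(0, 1). How pyvine reduces the Rosenblatt-transformed data E to a univariate sample, and how it runs the bootstrap, is not specified here and is not reproduced; the function name `anderson_darling_uniform` is hypothetical.

```python
import math


def anderson_darling_uniform(sample):
    """Anderson-Darling statistic A^2 of `sample` against Uniform(0, 1).

    Assumes every value lies strictly between 0 and 1.
    """
    u = sorted(sample)
    n = len(u)
    # A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln u_(i) + ln(1 - u_(n+1-i))]
    # (written below with a zero-based index i).
    total = sum(
        (2 * i + 1) * (math.log(u[i]) + math.log1p(-u[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


# Made-up samples: one spread evenly over (0, 1), one clustered near 0.
even = [(i + 0.5) / 20 for i in range(20)]
clustered = [0.01 * (i + 1) for i in range(20)]
print(anderson_darling_uniform(even))
print(anderson_darling_uniform(clustered))
```

The evenly spread sample yields a much smaller statistic than the clustered one, which is the behavior a uniformity test needs.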
To compile and install on Linux (on Windows, replace 'gnu95' with 'mingw32')::
$ python setup.py config_fc --opt="-fopenmp" build --fcompiler=gnu95
$ python setup.py install
taizhonglab.ustc.edu.cn/software/pyvine/pyvine-0.5.0.tar.gz
Zhenfei Yuan, Taizhong Hu
4373a9bc3de658cd0c31711489e3a6621644011f
0.5.0