NaN Quantitavity – Quant Trading, Statistical Learning, Coding and Brainstorming

2013-04-01, 9:00 PM, Monday

Risk of mistakely using empirical correlation

Filed under: Finance, R, Statistics — Tags: , , — weekendsunny @ 9:00 PM

picture1

#---------------------------------------------------------------------------------------------
#| Comparing different copula;
#| Showing why empirical correlation will be risky if your assumption
#| of the join distribution (ex: Multi normal vs. t) is not correct.
#|
#| Created by: Nan Zhou
#| Time: 2012-03-05
#---------------------------------------------------------------------------------------------
rm(list=ls())
#memory.size(max=4000)
require(copula)
n = 1000 # number of simulation of random sample
p =0.95 # quantile line for the plot
df = 1 # d.f. of marginal distribution of t
rho = 0.75 # empirical correlation
dist.margin = "norm" # Marginal distribution
#---------------------------------------------------------------------------------------------

#Normal Marginal

mvd.gaussian <- mvdc(copula = ellipCopula(family = "normal", param = rho),
                     margins = rep(dist.margin,2),
                     paramMargins = list(list(mean = 0,sd = 1), list(mean = 0, sd = 1)))

myMvd2 <- mvdc(copula = ellipCopula(family = "t", param = rho, df = df),
                margins = rep(dist.margin,2),
                paramMargins = list(list(mean = 0,sd = 1), list(mean = 0, sd = 1)))

#windows ( width = 150, height = 200)
#layout(matrix(c(1,2,3,3),ncol=2,byrow=TRUE),heights=c(1,2))

par(mfrow=c(3,2))
contour(mvd.gaussian, dmvdc, xlim = c(-3, 3), ylim = c(-3, 3))
title(paste("Gaussian Copula, Margin Dist =", dist.margin, ", rho =", rho, ", df =",df),
        cex.main=.8)
contour(myMvd2, dmvdc, xlim = c(-3, 3), ylim = c(-3, 3))
title(paste("t Copula, Margin Dist =", dist.margin, ", rho =", rho, ", df =",df),
        cex.main=.8)

for (rho in c(0.75,0.25))
for (df in c(1,5))
{
    seed = 1
    lim.plot = c(-3,3)
    u <- rcopula(normalCopula(param = rho, dim = 2), n)
    v <- rcopula(tCopula(param = rho, dim = 2, df = df), n)
    plot(qnorm(u),xlab='x1',ylab='x2',col='blue',cex=0.6,xlim=lim.plot,ylim=lim.plot)
    title(paste("rho=",rho,"df=",df))
    points(qnorm(v),xlab='x1',ylab='x2',col='red',cex=0.6)
    q = qnorm(p)
    abline(v = q)
    abline(h = q)
    text(-2,q,paste(p*100,"% quantile"))
}

Created by Pretty R at inside-R.org

2013-03-20, 10:04 AM, Wednesday

How many times are expected to try all songs in your itune library?

Filed under: R, Statistics — weekendsunny @ 10:04 AM

Today, when I was very boring with all my songs in iphone libary, I suddenly thought about one quick but useful math puzzle: assuming the iphone library has 100 songs and you use random shuffle (with re-sample) for next song, how many times are expected to listen all these songs?

This is very similar as the classical Coupon Collection problem. Here is a quick dirty R simulations, with two plots in regular scale and log scale:

N.songs = 100
temp <- function(N.songs){
return ( sum(N.songs/(1:N.songs)) )
}

x <- 1:N.songs
y <- sapply(x,temp)

par(mfrow=c(1,2))
plot(x,y,type=’l’,col=3, lty=2, lwd=2,log =’x’, xlab=’Number of New Songs’, ylab = “# Trials Needed”)
plot(x,y,type=’l’,col=3, lty=2, lwd=2, xlab=’Number of New Songs’, ylab = “# Trials Needed”)

songs

2012-10-18, 10:27 PM, Thursday

Pairs Trading – VXX vs. VXZ

Filed under: R, Statistics, Strategy — weekendsunny @ 10:27 PM
################################################################################################
# Quantitative Trading Strategies - Volatility Trading:
# Date Created: 2012-10-18
# Author:       Nan Zhou, zhounan1983@gmail.com
#
# Use volatility futures, shortdate vs medium dated
# VXX iPath S&P 500 VIX Short-Term Futures ETN (VXX)
# VXZ iPath S&P 500 VIX Mid-Term Futures ETN (VXZ)
#
# Plan is to check for trade entry at the open, and exit the trade at the close:
# IF signal < signalLowLim then in contango and do a carry trade
# IF signal > signalUpperLim then in backwardation so do a reverse carry
# ELSE do nothing

################################################################################################
## Install all required packages:
rm(list=ls())
library("quantmod"); library("PerformanceAnalytics")
par(ask=T); memory.size(max = 4000); options(digits=8, digits.secs=4)

## PLEASE CHANGE THE WORKFOLD TO YOUR WORK FOLD FIRST ##
workfold <- "H:\\My References\\Research\\2012_10_Trading Codes";
# workfold <- "C:\\Dropbox\\Research\\2012-10 VIX Trading Strategies";
setwd(workfold);

################################################################################################
# INPUT OF PARAMETERS
################################################################################################

signalLowLim   <- 0.9
signalUpperLim <- 1.1
symbolLst      <- c("VXX", "VXZ", "^GSPC")
dataSrc        <- "yahoo"
dataSrc        <- "csv"
startDate      <- as.Date("2010-01-01") #Specify what date to get the prices from
hedgeTrainingStartDate <- startDate #Start date for training the hedge ratio
hedgeTrainingEndDate   <- as.Date("2011-01-01") #End date for training the hedge ratio
tradingStartDate       <- as.Date("2011-01-02") #Date to run the strategy from
# tradingEndDate         <- as.Date("2011-12-31") #Date to run the strategy from
tradingEndDate         <- Sys.Date()

#--------------------------------------------------------------
### SECTION 1 - Download Data & Calculate Returns ###

## Download the data:
pnlEnv <- new.env() #Make a new environment for quantmod to store data in

importOHLC <- function(symbolLst = NULL, env = .GlobalEnv, src = "yahoo", format = "%m/%d/%Y",
                        return.class = "xts", from = Sys.Date()-365, end = Sys.Date()){
    if (src == "csv"){
        for (i in 1:length(symbolLst)){
            symbolLst <- toupper(gsub("\\^", "", symbolLst))
            symbol <- symbolLst[i]
            namesData <- c("Open", "High", "Low", "Close", "Volume", "Adj.Close")
            fr     <- read.csv(file = paste(symbol,'csv',sep='.'))
            fr     <- xts(fr[, namesData], as.Date(as.character(fr[, "Date"]), format = format))
            colnames(fr) <- paste(symbol, c("Open", "High", "Low", "Close", "Volume", "Adjusted"), sep = ".")
            fr <- quantmod:::convert.time.series(fr = fr, return.class = return.class)
            fr <- window(fr, start = from, end = end, extend = TRUE)
            assign(symbol, fr, env)
        }
    }
    else{
        getSymbols(symbolLst, env = env, src = src, from = from, end = end)
    }

    return(symbolLst)    
}

importOHLC(symbolLst, env = pnlEnv, src = "csv", format = "%m/%d/%Y", from = startDate, end = Sys.Date())

## Calculate returns for VXX and VXZ:
vxxRet <- (Cl(pnlEnv$VXX)/Op(pnlEnv$VXX))-1
colnames(vxxRet) <- "Ret"
pnlEnv$VXX <- cbind(pnlEnv$VXX,vxxRet)
vxzRet <- (Cl(pnlEnv$VXZ)/Op(pnlEnv$VXZ))-1
colnames(vxzRet) <- "Ret"
pnlEnv$VXZ <- cbind(pnlEnv$VXZ,vxzRet)

#--------------------------------------------------------------
### SECTION 2 - Calculating the hedge ratio ###
## Want to work out a hedge ratio, so that we can remain Vega neutral (the futures contact are trading VEGA)

trainVxx <- window(pnlEnv$VXX$Ret,start=hedgeTrainingStartDate ,end=hedgeTrainingEndDate)
trainVxz <- window(pnlEnv$VXZ$Ret,start=hedgeTrainingStartDate ,end=hedgeTrainingEndDate)
trainData = na.omit(cbind(trainVxx,trainVxz))
colnames(trainData) <- c("VXX","VXZ")

## Simply linearly regress the returns of Vxx with Vxz:
regression <- lm(trainData[,"VXZ"] ~ trainData[,"VXX"]) #Linear Regression
hedgeratio <- regression$coefficient[2]
plot(x=as.vector(trainData[,"VXX"]), y=as.vector(trainData[,"VXZ"]),
     main=paste("Hedge Regression: Y =", regression$coefficient[2], " * X + intercept"),
  	 xlab="Vxx Return", ylab="Vxz Return", pch=19)
abline(regression, col = 2 )

#--------------------------------------------------------------
### SECTION 3 - Generate trading signals ###
tSig <- Op(pnlEnv$VXX)/Op(pnlEnv$VXZ)
colnames(tSig) <- "Signal"
vxxSignal <- apply(tSig,1, function(x) {if(x<signalLowLim) { return (-1) } else { if(x>signalUpperLim) { return(1) } else { return (0) }}})
vxzSignal <- -1 * vxxSignal

strategyRet <- ((vxxSignal * pnlEnv$VXX$Ret) + hedgeratio * (vxzSignal * pnlEnv$VXZ$Ret) )
strategyRet <- window(strategyRet,start=tradingStartDate,end=tradingEndDate, extend = FALSE)
#Normalise the amount of money being invested on each trade so that we can compare to the S&P index later
strategyRet <- strategyRet * 1 / (1+abs(hedgeratio))
colnames(strategyRet) <- "StrategyReturns"

#--------------------------------------------------------------
### SECTION 5 - Performance Analysis ###

## Get the S&P 500 index data:
indexData <- pnlEnv
indexRet  <- (Cl(indexData$GSPC)-lag(Cl(indexData$GSPC),1))/lag(Cl(indexData$GSPC),1)
colnames(indexRet) <- "IndexRet"
pnlVec <- cbind(as.zoo(strategyRet),as.zoo(indexRet)) #Convert to zoo object
colnames(pnlVec) <- c("Vol Carry Trade","S&P500")
pnlVec <- na.omit(pnlVec)

charts.PerformanceSummary(pnlVec,main="Performance of Volatility Carry Trade",geometric=FALSE)
#Lets calculate a table of montly returns by year and strategy
cat("Calander Returns - Note 13.5 means a return of 13.5%\n")
print(table.CalendarReturns(pnlVec))

#Calculate the sharpe ratio
cat("Sharpe Ratio")
print(SharpeRatio.annualized(pnlVec))

#Calculate other statistics
cat("Other Statistics")
print(table.CAPM(pnlVec[,"Vol Carry Trade"],pnlVec[,"S&P500"]))
chart.Boxplot(pnlVec)
layout(rbind(c(1,2),c(3,4)))
chart.Histogram(pnlVec, main = "Plain", methods = NULL)
chart.Histogram(pnlVec, main = "Density", breaks=40, methods = c("add.density", "add.normal"))
chart.Histogram(pnlVec, main = "Skew and Kurt", methods = c("add.centered", "add.rug"))
chart.Histogram(pnlVec, main = "Risk Measures", methods = c("add.risk"))
layout(c(1,1))

Created by Pretty R at inside-R.org

Blog at WordPress.com.

Design a site like this with WordPress.com
Get started