marcozanotti / dispositioneffect

R Package to perform behavioral analysis on financial data.

Home Page: https://marcozanotti.github.io/dispositionEffect

License: Other

R 100.00%

behavioral-economics behavioral-sciences econometrics economics finance financial-analysis financial-data financial-markets rstats-package time-series

dispositioneffect's Introduction

Hi 👋, I'm Marco

A Data Scientist from 🇮🇹

📖 About Me

🖥 R and Python developer, passionate about learning and education
💼 Working at Shibumi Group
🎓 PhD Statistics at University of Milan-Bicocca
📑 Have a look at my Curriculum Vitae

Wanna know more? 👇

✔️ Skills

Python

SQLite

MySQL

PostgreSQL

BigQuery

Influx

MongoDB

Elastic

Git

GitHub

GitLab

GCP

AWS

Linux

Bash

🔖 Learning

Python
Shiny
FastAPI
VS Code

🎲 Hobbies

I was a footballer. I played in Italian minor championships until 22.

I also like curling and footgolf.

Now I spend my free time reading manga, playing board games, and PlayStation.

I love listening to folk and country music.

Since 2020 I have practiced kitesurfing, an amazing water-board sport,
and whenever I can I go kitesurfing all around the world.

💻 Dev Stuff

⚙️ Things I use to get stuff done

OS: Ubuntu 22.04
Laptop: Dell
Browser: Chrome
Code Editor: RStudio, PyCharm
To Stay Updated: Twitter, Medium

dispositioneffect's People

Contributors

Stargazers

Watchers

Forkers

rohitpandey13

dispositioneffect's Issues

parallelization

check whether it is possible to parallelize code within portfolio_compute

time series disposition effect

Improve portfolio_compute with the computation of:

time series disposition effect on updated results (absolute time series) using mean or median
time series disposition effect on transaction results (relative time series) using mean or median

Evaluate the possibility of adding the time series for each asset.

closest_historical_price add args

Add argument to control the unit time to round at.

add checks to portfolio_compute

Unable to reproduce with different data

After passing my own investor and marketprices datasets (.csv files) to the portfolio_result function, I receive a series of warnings saying that the price for the specific transaction is missing, even though it is present in the dataset.

Exporting a list of dataframes from R to Excel

I would like to conduct further analysis in Excel, having computed the gains and losses counts with the "dispositionEffect" package in R. The code and its output are shown below.

code

p_res_full <- purrr::map(trx_list, portfolio_compute, market_prices = mkt)
p_res_full

This generates a list of data frames (more than 3000), for example as shown below. Every data frame has the same columns, but the number of rows differs.

[[992]]
  investor asset quantity price   datetime RG_count RL_count PG_count PL_count
1     1932   NSK      223     6 2017-03-17        0        0        0        0

[[993]]
  investor asset quantity price   datetime RG_count RL_count PG_count PL_count
1     1933    MC     7639     8 2016-03-02        0        0        0        0
2     1933    NL     4700    50 2016-02-22       NA       NA       NA       NA
3     1933    RL     3880     2 2016-02-16        0        0        0        0

[[994]]
  investor asset quantity price            datetime RG_count RL_count PG_count PL_count
1     1936    IV      439  10.6 2010-09-15 01:00:00        0        0        1        0

[[995]]
  investor asset quantity price            datetime RG_count RL_count PG_count PL_count
1     1940    PL      272    55 2017-03-27 01:00:00        0        0        1        0

[[997]]
  investor asset quantity price            datetime RG_count RL_count PG_count PL_count
1     1944    FB     9040   6.0 2011-07-14 01:00:00        0        0        1        0
2     1944    MC    21490  3.00 2010-10-20 01:00:00        0        0        1        0
3     1944    RL     9340  1.20 2012-03-13 00:00:00        0        0        0        0
4     1944    NM     6300  2.75 2012-03-22 00:00:00       NA       NA       NA       NA

I would like to export all these data frames into a single Excel sheet for further analysis. Please help. I tried to use write_xlsx() but got error messages saying that I do not have enough memory.

Grateful for your kind feedback.
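
One possible approach, sketched here with base R only, is to stack the list into a single data frame and write it once as a CSV file, which Excel opens directly. The list p_res_toy, its values, and the output file name are made up for illustration and stand in for p_res_full.

```r
# Toy stand-in for p_res_full: a list of data frames sharing the same columns.
# (Hypothetical values, only to make the sketch runnable.)
p_res_toy <- list(
  data.frame(investor = 1932, asset = "NSK", RG_count = 0, RL_count = 0,
             PG_count = 0, PL_count = 0),
  data.frame(investor = c(1933, 1933), asset = c("MC", "NL"),
             RG_count = c(0, NA), RL_count = c(0, NA),
             PG_count = c(0, NA), PL_count = c(0, NA))
)

# Stack all data frames row-wise into a single data frame.
all_results <- do.call(rbind, p_res_toy)

# Write one CSV file, which Excel can open directly.
out_file <- file.path(tempdir(), "portfolio_results.csv")
write.csv(all_results, out_file, row.names = FALSE)
```

With the real list, dplyr::bind_rows(p_res_full) is an alternative way to do the stacking step, provided each column has the same type in every data frame.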

new update_portfolio function

Create a function that updates the portfolio when necessary (same as other update_. functions).

Change functions' names

Most functions have inconsistent names. Rename them.

data testing

clean db, create market prices and create a single db for each investor

indaco

test pkg on indaco cloud servers

unit test

Create unit tests on functions

unit tests on paper_compute (testing all parameters and methods)
unit tests on realized_compute (testing all parameters and methods)
unit tests on evaluate_portfolio
unit tests on difftime_financial, difftime_compare
unit tests on update_expectedvalue
unit tests on update_portfolio
unit tests on update_results
unit tests on gains_and_losses (with different parameters)
unit tests on closest_market_price, aggregate_market_price
unit tests on portfolio_compute (with different parameters)
unit tests on disposition_effect et al.

xaringan presentation

significance test of disposition effect

I would like to test the significance of the disposition effect from the data frame of counts of gains and losses. The code below computes the disposition effect on each portfolio. The code is from CRAN (developed by Marco Zanotti). It works well. My challenge, and where I get errors, is testing for statistical significance. I use the data set that comes with the package (DEanalysis, which covers 10 investors).

de <- purrr::map(p_res_full, disposition_compute) %>%
dplyr::bind_rows()
skimr::skim(de)
de # data frame containing DE_count

Test for significance: I want to find out whether the differences, i.e. the disposition effect computed as (RG/(RG+PG)) - (RL/(RL+PL)), are statistically significant. I test using a t-test as below:

t.test(de, mu =0, alternative = "greater")

Error in if (stderr < 10 * .Machine$double.eps * abs(mx)) stop("data are essentially constant") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In var(x) : NAs introduced by coercion

The second part of my challenge is that the finance literature on the disposition effect also computes the rate at which investors realize gains rather than losses. So, besides computing DE as a difference, it computes DE as the ratio (RG/(RG+PG)) / (RL/(RL+PL)). Is there any code that computes the disposition effect in this manner?
Thanks
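
The error above is what t.test() produces when it receives a whole data frame that includes non-numeric columns, rather than a numeric vector. The sketch below passes only the DE_count column, and adds a ratio version of the measure. The toy data frame de_toy and the helper de_ratio are hypothetical and not part of the package.

```r
# Toy stand-in for the `de` data frame; values are made up for illustration.
de_toy <- data.frame(
  investor = c("A", "B", "C", "D", "E"),
  DE_count = c(0.20, 0.35, NA, 0.10, 0.25)
)

# t.test() needs a numeric vector, so pass only the DE column
# and drop missing values explicitly.
de_values <- de_toy$DE_count[!is.na(de_toy$DE_count)]
test_res <- t.test(de_values, mu = 0, alternative = "greater")
test_res$p.value

# Hypothetical helper: disposition effect as a ratio of the two proportions.
# Returns NA when a denominator is zero.
de_ratio <- function(RG, RL, PG, PL) {
  pgr <- RG / (RG + PG)  # proportion of gains realized
  plr <- RL / (RL + PL)  # proportion of losses realized
  ratio <- pgr / plr
  ratio[!is.finite(ratio)] <- NA
  ratio
}

de_ratio(RG = 2, RL = 1, PG = 2, PL = 3)  # 0.5 / 0.25 = 2
```

Whether a one-sample t-test is the right choice depends on the distribution of the DE values; with few investors a non-parametric alternative such as wilcox.test() may be worth comparing.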

package website

Set up the package website with pkgdown and GitHub Pages

parallel portfolio_compute

create a portfolio_compute that runs in parallel on multiple investors
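
A minimal sketch of the idea using the base parallel package, assuming the transactions are already split into one data frame per investor. The function compute_one is a hypothetical stand-in for portfolio_compute, and the toy data are made up.

```r
library(parallel)

# Hypothetical stand-in for portfolio_compute(): counts transactions per investor.
compute_one <- function(trx) {
  data.frame(investor = trx$investor[1], n_trx = nrow(trx))
}

# Toy list with one transaction data frame per investor (made-up values).
trx_list <- list(
  data.frame(investor = "A", quantity = c(10, -10)),
  data.frame(investor = "B", quantity = c(5, 5, -10))
)

# Start a small cluster, apply the function to each investor, then shut it down.
cl <- makeCluster(2)
res_list <- parLapply(cl, trx_list, compute_one)
stopCluster(cl)

# Combine the per-investor results into one data frame.
res <- do.call(rbind, res_list)
```

With the real function, each worker would also need the package loaded, for example with clusterEvalQ(cl, library(dispositionEffect)), and the market prices passed as an extra argument to parLapply().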

python developments

Translate R code into Python code.

CI/CD

set up continuous integration with GitHub Actions

disposition Effect-failure to pick new created data-Portfolio Compute function

Hi, I would like help with the following issue: the portfolio_compute function shows me the results of the sample code even though I have changed the data set to use my own. For example, please look at the data sets below; they contain my own data, e.g. head(investori) shows the new data set in place of head(investor). The same applies to head(marketpricesi). I renamed them to distinguish them from the data sets that come with the package. When I run portfolio_compute, the code returns the unchanged data set (the one used by the author), even after I change the names in the portfolio_compute call. It appears I have to define the arguments elsewhere, but I cannot tell where exactly they have to be changed. I would be grateful for your feedback.

portfolio transactions

head(investori)

A tibble: 6 x 6

  investor type  asset    quantity price datetime
1    10000 B     NBM           100 241   11/11/2014
2    10000 B     STANDARD      100 400.  03/11/2014
3    10001 S     MPICO       99000 6     19/05/2015
4    10002 S     NBS          5489 10    15/03/2012
5    10003 S     FMBCH       21876 6     25/01/2011
6    10003 S     MPICO       45500 2.6   21/07/2010

#marketprices
head(marketpricesi)

A tibble: 6 x 3

  asset datetime price
1 MPICO 04/01/10   2.6
2 NBM   04/01/10  59
3 NBS   04/01/10  14
4 FMBCH 05/01/10  10
5 ILLOV 05/01/10 110
6 OML   05/01/10 250

Gains and losses

portfolio_results <- portfolio_compute(

portfolio_transactions = investori,
market_prices = marketpricesi,
method = "count"
)
Error: Column type of portfolio_transactions should contain 'B' or 'S' only.

dplyr::select(portfolio_results, -datetime)
investor asset quantity price RG_count RL_count PG_count PL_count RG_total
1 4273N ACO 222 2.840 1 0 6 0 45
2 4273N AST 0 0.000 0 1 0 0 0
3 4273N IT3S 0 0.000 0 1 0 0 0
4 4273N LSUG 0 0.000 2 0 4 0 990
5 4273N TFI 1400 0.284 0 0 0 0 0
RL_total PG_total PL_total RG_value RL_value PG_value PL_value RG_duration
1 0 1054 0 0.008701958 0.00000000 0.03002533 0 439.900
2 430 0 0 0.000000000 -0.07497820 0.00000000 0 0.000
3 230 0 0 0.000000000 -0.03096539 0.00000000 0 0.000
4 0 1800 0 0.010551792 0.00000000 0.13492566 0 1411.617
5 0 0 0 0.000000000 0.00000000 0.00000000 0 0.000
RL_duration PG_duration PL_duration
1 0.0000 431.4500 0
2 165.1167 0.0000 0
3 56.9500 0.0000 0
4 0.0000 154.6167 0
5 0.0000 0.0000 0

Gains and losses

portfolio_resultsi <- portfolio_compute(

portfolio_transactionsi = investori,
market_pricesi = marketpricesi,
method = "count"
)
Error in portfolio_compute(portfolio_transactionsi = investori, market_pricesi = marketpricesi, :
unused arguments (portfolio_transactionsi = investori, market_pricesi = marketpricesi)

dplyr::select(portfolio_resultsi, -datetime)
Error in dplyr::select(portfolio_resultsi, -datetime) :
object 'portfolio_resultsi' not found
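
The "unused arguments" error arises because the argument names themselves were renamed: portfolio_transactions and market_prices must stay as they are, and only the objects passed to them change. The sketch below also prepares the data under two assumptions that the issue does not confirm: that the type column contains values other than exactly "B" or "S" (for example stray spaces or lower case), and that the datetime columns are stored as text and need parsing. The toy rows are based on the data shown above, with the malformed type values invented for illustration.

```r
# Toy stand-ins for investori and marketpricesi, based on rows from the issue.
investori <- data.frame(
  investor = c(10000, 10001),
  type     = c("B ", "s"),   # assumed problem: stray space / lower case
  asset    = c("NBM", "MPICO"),
  quantity = c(100, 99000),
  price    = c(241, 6),
  datetime = c("11/11/2014", "19/05/2015")
)
marketpricesi <- data.frame(
  asset    = c("MPICO", "NBM"),
  datetime = c("04/01/10", "04/01/10"),
  price    = c(2.6, 59)
)

# Normalise the type column so it contains only "B" or "S".
investori$type <- toupper(trimws(investori$type))
stopifnot(all(investori$type %in% c("B", "S")))

# Parse the text dates (day/month/year) into POSIXct date-times.
investori$datetime <- as.POSIXct(investori$datetime,
                                 format = "%d/%m/%Y", tz = "UTC")
marketpricesi$datetime <- as.POSIXct(marketpricesi$datetime,
                                     format = "%d/%m/%y", tz = "UTC")

# Keep the argument names unchanged; only the right-hand sides change.
# (Commented out because it needs the dispositionEffect package.)
# portfolio_resultsi <- portfolio_compute(
#   portfolio_transactions = investori,
#   market_prices = marketpricesi,
#   method = "count"
# )
```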

bug RL_value positive

check on INV025

Initial checks do not work

Debug the initial checks through the checking functions.

C++ difftime_financial

C++ implementation of difftime_financial

optimization

Code profiling and optimization. Evaluate possible parallelizations.

DE function

add function to compute disposition effect
de = (RG / (RG + PG)) - (RL / (RL + PL))
de ranges from -1 to 1

> 0: disposition effect (non-rational)
< 0: inverse disposition effect (rational)
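
The formula above can be sketched as a small R function. The name de_difference is hypothetical and is not the function that was eventually added to the package; zero denominators are left unhandled here.

```r
# Hypothetical helper: difference-based disposition effect computed from
# realized gains (RG), realized losses (RL), paper gains (PG), paper losses (PL).
de_difference <- function(RG, RL, PG, PL) {
  (RG / (RG + PG)) - (RL / (RL + PL))
}

de_difference(RG = 3, RL = 1, PG = 1, PL = 3)  # 0.75 - 0.25 = 0.5
de_difference(RG = 1, RL = 3, PG = 3, PL = 1)  # 0.25 - 0.75 = -0.5
```

A positive value means gains are realized proportionally more often than losses, and a negative value means the opposite.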

disposition_summary improvement

add n

debug investors' failures

vignettes

Create the pkg vignettes on:

get started, simple tutorial on how to use the pkg
more complete tutorial with 10 sample investors (500 trx), all the charts and summaries
simple tutorial on code parallelization (with 10 sample investors)
simple tutorial on time series disposition effect

closest_market_price

consider returning a warning (not an error) if no market prices are available
in this case, no computations have to be performed on those assets whose market prices are not available
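
A rough sketch of that behaviour, assuming both inputs have an asset column as in the package's data sets. The helper keep_priced_assets and the toy data are hypothetical.

```r
# Hypothetical helper: keep only transactions whose asset has a market price,
# warning (rather than stopping) about the assets that are dropped.
keep_priced_assets <- function(portfolio_transactions, market_prices) {
  missing_assets <- setdiff(unique(portfolio_transactions$asset),
                            unique(market_prices$asset))
  if (length(missing_assets) > 0) {
    warning("No market prices for: ",
            paste(missing_assets, collapse = ", "),
            ". These assets are skipped.", call. = FALSE)
  }
  keep <- !portfolio_transactions$asset %in% missing_assets
  portfolio_transactions[keep, , drop = FALSE]
}

# Toy data (made-up values): OML has no market price and is dropped.
trx <- data.frame(asset = c("NBM", "OML", "NBS"), quantity = c(100, 50, 20))
mkt <- data.frame(asset = c("NBM", "NBS"), price = c(59, 14))
priced_trx <- suppressWarnings(keep_priced_assets(trx, mkt))
```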

test "method" argument

method = "count"
Error: Can't subset columns that don't exist.
x Column RG_value doesn't exist.

disposition summary

create function that performs summary statistics of disposition effect based on portfolio results

%>% import

The %>% operator is not imported when the library is loaded.
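
A common way to handle this in a package built with roxygen2 is to re-export the pipe from magrittr, roughly as below. The file name R/utils-pipe.R is an assumption, and magrittr also has to be listed under Imports in the DESCRIPTION file.

```r
# R/utils-pipe.R (assumed file name)

#' Pipe operator
#'
#' @name %>%
#' @rdname pipe
#' @importFrom magrittr %>%
#' @export
NULL
```

After re-running roxygen2, the NAMESPACE file should contain both an importFrom and an export entry for the operator.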

add disposition effect computations

Create the function that allows to compute the disposition effect.

mean-reversion

Implement new functions that allow the analysis of mean reversion

Disposition Effect-portfolio compute function

I am new to R and am running a test for the disposition effect using the data for a single investor before I plug in my own data, to see how the code runs. However, I get errors when I compute the gains and losses using the portfolio_compute function. The line of code is below:
p_res <- portfolio_compute(portfolio_transactions = trx_QZ621, market_prices = mkt_QZ621)
head(p_res)[, -5]

The error is as below>>
Error in check_values(names(portfolio_transactions), trg) :
could not find function "check_values"

head(p_res)[, -5]
Error in head(p_res) : object 'p_res' not found

Please assist
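
One possible cause, offered as an assumption since the issue gives no further detail, is that portfolio_compute was defined by sourcing or copying its code instead of loading the installed package, so internal helpers such as check_values are not visible. The sketch below checks for the package and makes sure the packaged version of the function is the one being called.

```r
# Check whether the package is installed.
pkg_available <- requireNamespace("dispositionEffect", quietly = TRUE)

if (!pkg_available) {
  message("Run install.packages(\"dispositionEffect\") first.")
} else {
  # Remove any copy of portfolio_compute defined in the global environment,
  # so that the packaged version (with its internal helpers) is the one called.
  if (exists("portfolio_compute", envir = globalenv(), inherits = FALSE)) {
    rm("portfolio_compute", envir = globalenv())
  }
  library(dispositionEffect)
}
```

After this, the original call with trx_QZ621 and mkt_QZ621 can be run again unchanged.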