
trr266/discint


Disclosure Intensity Effects of Disclosure Regulations: European Evidence

This is the repository for the work-in-progress study

Euler, Simone, Joachim Gassen, and Jonas Materna (2025): Disclosure Intensity Effects of Disclosure Regulations: European Evidence.

Although the paper relies on commercial Orbis "flat file" data, this repository provides the code to reproduce our simulation results as well as the code that generates our findings and the paper from the Orbis data.

Configuration

Before running the code, you need to configure the repository by copying the file _discint.env to discint.env and editing the copy. It is safe to leave the default settings as they are. If you want, you can change the log file location by setting the variable LOG_FILE.
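As a minimal sketch of this step, the helper below copies the template only if discint.env does not exist yet, so that earlier edits are not overwritten. The function name init_config is hypothetical and not part of the repository; it assumes that you run it from the repository root.

```shell
# Hypothetical helper (not part of the repository): create discint.env
# from the template _discint.env unless a configuration already exists.
init_config() {
  template="_discint.env"
  target="discint.env"
  if [ ! -f "$template" ]; then
    echo "Template $template not found; run this from the repository root." >&2
    return 1
  fi
  if [ -e "$target" ]; then
    echo "$target already exists; leaving it untouched."
    return 0
  fi
  cp "$template" "$target"
  echo "Created $target; edit it to adjust LOG_FILE or the other settings."
}
```

Calling init_config once after cloning is enough. A plain cp _discint.env discint.env does the same job if you do not need the overwrite guard.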

The other options are only relevant if you have access to BvD Orbis data and want to reproduce the full paper. Processing the Orbis data is quite memory intensive. If you run the code in an environment with limited memory (less than 64GB of RAM and/or in a development container, see below), you should set the variable LOW_MEMORY to true (the default).

In this case, you also need to configure the following variables:

  • DUCKDB_FILE: Set this to the path where the DuckDB database file should be stored. The file will be created if it does not exist.
  • DUCKDB_MEMORY_LIMIT: Set this to the maximum amount of memory DuckDB can use. A good first guess is to set this to about 40% of your available RAM when running the code in a development environment.
  • DUCKDB_THREADS: Set this to the number of threads DuckDB can use. Maximum is the number of CPU cores available. Smaller values will reduce memory usage.
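To illustrate the 40% rule of thumb for DUCKDB_MEMORY_LIMIT, here is a small sketch that derives a starting value from the total RAM in whole gigabytes. The function name suggest_duckdb_memory_gb is hypothetical, and the exact value format that discint.env expects is an assumption; check the template _discint.env for the format actually used.

```shell
# Hypothetical helper: print roughly 40% of a RAM size given in whole GB.
# Shell integer arithmetic rounds down, which errs on the safe side.
suggest_duckdb_memory_gb() {
  total_gb="$1"
  echo $(( total_gb * 40 / 100 ))
}

suggest_duckdb_memory_gb 64   # prints 25
suggest_duckdb_memory_gb 32   # prints 12
```

If processing still runs out of memory at the suggested value, lowering DUCKDB_THREADS is the other lever mentioned above.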

The default configuration has been tested to work on an M2 MacBook with 64GB RAM running Docker. If you have more memory available, you can increase the memory limit and/or the number of threads to speed up the processing.

Reproducing the simulation results

We suggest using the development container provided in the repository to reproduce our results. You can open the repository in a container in VS Code by following these instructions.

Our simulated data is provided in the data/precomputed folder. You can create the R objects for the figures and tables presented in the paper by running make results_sim in the terminal, which sources code/res_simulations.R. This will generate and store all result objects in the file data/generated/results_sim.rdata.

If you want to reproduce the simulation data itself, simply delete (or rename) the data file data/precomputed/sim_data_1000.rds and run make results_sim again. This will take several hours (monitor the logs) but should eventually result in the same findings.
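A minimal sketch of the rename variant, assuming you are in the repository root. The function name set_aside_sim_data and the .bak suffix are illustrative choices, not something the repository prescribes.

```shell
# Hypothetical helper: rename the precomputed simulation data so that the
# next `make results_sim` run regenerates it (this takes several hours).
set_aside_sim_data() {
  sim_file="data/precomputed/sim_data_1000.rds"
  if [ -f "$sim_file" ]; then
    mv "$sim_file" "$sim_file.bak"
    echo "Moved $sim_file to $sim_file.bak"
  else
    echo "$sim_file not found; nothing to set aside."
  fi
}
```

Running set_aside_sim_data followed by make results_sim then starts the regeneration. Keeping the renamed file lets you restore the precomputed data later by moving it back.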

Reproducing the full paper

To reproduce the full paper, you need to download the BvD Orbis flat files manually, convert them to parquet format, and place them into the folder data/pulled/. These are the files that we used (downloaded in December 2024):

  • data/pulled/industry_global_financials_and_ratios_eur.parquet
  • data/pulled/legal_info.parquet
  • data/pulled/industry_classifications.parquet

Then, you should be able to run the analysis and build the paper by running make all.
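Before starting the build, it can help to verify that the three parquet files listed above are in place. The following sketch does that. The function name check_orbis_files is hypothetical, and it assumes that you run it from the repository root.

```shell
# Hypothetical helper: report any of the three expected Orbis parquet
# files that are missing and return a non-zero status in that case.
check_orbis_files() {
  missing=0
  for f in \
    data/pulled/industry_global_financials_and_ratios_eur.parquet \
    data/pulled/legal_info.parquet \
    data/pulled/industry_classifications.parquet
  do
    if [ ! -f "$f" ]; then
      echo "Missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical use is check_orbis_files && make all, which starts the build only if every file was found. The check covers file presence only, not file contents.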

Thoughts about the study?

If you have read or even reproduced our study, we would be super interested in hearing your views. Besides reaching out to us via email, you could also start a public discussion by opening up a GitHub Issue here in this repository.

Disclaimer

This project has received financial support from the TRR 266 "Accounting for Transparency".
