⚓️ SonaR
R Back-End for use with OpenCPU
Temporary Building Process
The public data is attached to this package as a public data set in the data/ folder. Using the .onLoad() function in the zzz.R file, the data is loaded on package load to ensure fast access times. Since this package is automatically loaded in the OpenCPU server, the public data is available to all sessions instantly.
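The load-time hook could look roughly like the sketch below. Only .onLoad(), zzz.R and the public_data object are named in this README; the use of utils::data() and the choice to store the object in the package namespace are assumptions, and the real zzz.R may do this differently.

```r
# Sketch of a zzz.R file (assumed contents, not the actual implementation).
.onLoad <- function(libname, pkgname) {
  # The enclosing environment of .onLoad() is the package namespace,
  # which is not yet locked while the hook runs.
  ns <- parent.env(environment())

  # Load the data set shipped in data/ into the namespace once, so every
  # later call can read `public_data` without touching the disk again.
  utils::data(list = "public_data", package = pkgname, envir = ns)
}
```

Because the hook runs once per R process, the cost of reading the data is paid when the OpenCPU server preloads the package rather than on each request.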
Currently, the public data set is rebuilt by running the create_public_data_package_files(public_data_path) function and then rebuilding the package. This needs to be automated later to cope with changing public data sets.
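To illustrate the kind of work such a rebuild step involves, here is a minimal stand-in. It is not the package's create_public_data_package_files(): the function name write_public_data_sketch, the single CSV input, the pkg_root argument and the output file name public_data.rda are all assumptions made for this example.

```r
# Hypothetical stand-in for create_public_data_package_files(); the real
# function may read different inputs or write several files.
write_public_data_sketch <- function(public_data_path, pkg_root = ".") {
  # Assumption: the public data is a single CSV file.
  public_data <- utils::read.csv(public_data_path, stringsAsFactors = FALSE)

  # Make sure the package's data/ folder exists.
  data_dir <- file.path(pkg_root, "data")
  dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)

  # save() stores the object under its own name, so load() or data()
  # later restores a variable called `public_data`.
  out_file <- file.path(data_dir, "public_data.rda")
  save(public_data, file = out_file, compress = "xz")

  invisible(out_file)
}
```

After the file is written, the package still has to be rebuilt and reinstalled (for example with R CMD INSTALL from the package root) before the new data is picked up on load.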
TODO
- Automatically build the package in the Docker image using a cron job
- Include additional functionality distinguishing between private and public data
Problems with Roxygen
Roxygen does not know the public_data object. When running roxygen2::roxygenise() you will get an error like Error in get(name, envir = env) : object 'public_data' not found. A quick fix is to define public_data first, after which Roxygen runs without problems.
# Define a placeholder so Roxygen can resolve the name
public_data <- ''
roxygen2::roxygenise()
File permissions of data folder when editing inside of Docker image
When RStudio is run in the Docker image and tries to change the contents of the sonaR/data folder, you have to give it the proper permissions:
# The OpenCPU user is part of the www-data group
chgrp -R www-data data
# Allow group to modify the data
chmod -R 775 data
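To confirm from within an R session that the change took effect, a small check like the following can help. The helper name can_write is made up for this example; file.access() is base R, and its documentation notes that the result may not be fully reliable on every file system, so treat this as a quick sanity check only.

```r
# Hypothetical helper: TRUE if the current user may write to `path`.
# file.access() returns 0 on success and -1 on failure;
# mode = 2 tests for write permission.
can_write <- function(path) {
  unname(file.access(path, mode = 2) == 0)
}

# Run from the package root, e.g. inside RStudio in the Docker image.
can_write("data")
```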
Principal Component Analysis
I've experimented with a number of principal component analysis implementations: prcomp, fast.prcomp and princomp. Most of them take approximately the same time. Of the three, princomp was the fastest, and therefore the PCA is now created with princomp.
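A comparison along these lines can be reproduced with a rough timing sketch such as the one below. The matrix size, the seed and the helper time_pca are arbitrary choices for illustration, not the setup used for the original measurements, and timings will vary by machine and data shape.

```r
set.seed(1)
# Assumed test data: 2000 observations of 50 variables.
x <- matrix(rnorm(2000 * 50), nrow = 2000, ncol = 50)

# Elapsed seconds for one call of a PCA function on x.
time_pca <- function(f) unname(system.time(f(x))["elapsed"])

timings <- c(
  prcomp   = time_pca(stats::prcomp),
  princomp = time_pca(stats::princomp)
)

# fast.prcomp lives in the gmodels package, which may not be installed.
if (requireNamespace("gmodels", quietly = TRUE)) {
  timings["fast.prcomp"] <- time_pca(gmodels::fast.prcomp)
}

print(sort(timings))

# First two principal components from princomp.
scores <- stats::princomp(x)$scores[, 1:2]
```

Note that princomp() requires more observations than variables, and it uses the divisor n for the covariance matrix where prcomp() uses n - 1, so the two give slightly different variances for the same data.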
See also
- https://stackoverflow.com/questions/8299460/what-is-the-fastest-way-to-calculate-first-two-principal-components-in-r
- https://www.rdocumentation.org/packages/gmodels/versions/2.16.2/topics/fast.prcomp