Description
This issue follows up on #400.
Currently, if you start an R session in an rv project folder, .libPaths() returns two paths:
The first is the local ./rv environment, and the second is the global library, which is defined in .Library.
# setup:
cd /tmp
mkdir rv-test
cd rv-test
rv init
R
.libPaths()
## [1] "/private/tmp/rv-test/rv/library/4.5/arm64"
## [2] "/Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library"
A look at the source code of .libPaths shows why:
> .libPaths
function (new, include.site = TRUE)
{
if (!missing(new)) {
new <- Sys.glob(path.expand(new))
paths <- c(new, if (include.site) .Library.site, .Library)
paths <- paths[dir.exists(paths)]
.lib.loc <<- unique(normalizePath(paths, "/"))
}
else .lib.loc
}
<bytecode: 0x138b77b18>
<environment: 0x138b72ed8>
.lib.loc is a variable in the closure environment of .libPaths, created when .libPaths was defined. The relevant line is:
paths <- c(new, if (include.site) .Library.site, .Library)
This always appends .Library at the end (here as the second path), whereas .Library.site, which is empty by default, is appended only when include.site is TRUE.
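The behaviour described above can be checked in any R session. This is a minimal sketch, assuming R >= 4.1 (where the include.site argument exists); the temporary directory is a hypothetical stand-in for a project library.

```r
# Throwaway directory standing in for a project library (hypothetical).
tmp <- tempfile("lib")
dir.create(tmp)

old <- .libPaths()                      # remember the current search path
.libPaths(tmp, include.site = FALSE)    # ask for tmp only, without site libraries

paths <- .libPaths()
print(paths)

# .Library is still present, as the last entry of the search path.
last_is_library <- identical(paths[length(paths)], normalizePath(.Library, "/"))
print(last_is_library)

.libPaths(old)                          # restore the previous search path
```

Even with include.site = FALSE there is no argument that keeps .Library out of the result.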
What is the problem here?
## [2] "/Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library"
is a global library folder, and anyone can add packages to it with install.packages().
Those packages leak into the rv environment, and with the current definition of .libPaths() there is no way to avoid this "contamination" of the local rv environment by the global library folder.
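To see whether a given installation is affected, one can list what sits in .Library beyond the packages that ship with R. This is a rough sketch: it assumes that shipped packages are exactly those whose Priority field is "base" or "recommended", and the variable names are only illustrative.

```r
# All packages installed in the global library.
ip <- utils::installed.packages(lib.loc = .Library)

# Packages shipped with R carry Priority "base" or "recommended";
# anything else was installed into .Library afterwards.
shipped <- ip[, "Priority"] %in% c("base", "recommended")
leaked <- rownames(ip)[!shipped]

print(leaked)
```

An empty result means the global library is still clean; any name listed here would be visible from inside the rv project.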
Option 1: Manipulate .lib.loc
It is possible, however, to manipulate the closure variable .lib.loc:
e <- environment(.libPaths)  # .lib.loc lives in the enclosing environment of .libPaths
e$.lib.loc
## [1] "/private/tmp/rv-test/rv/library/4.5/arm64"
## [2] "/Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library"
e$.lib.loc <- .libPaths()[1]
e$.lib.loc
## [1] "/private/tmp/rv-test/rv/library/4.5/arm64"
.libPaths()
## [1] "/private/tmp/rv-test/rv/library/4.5/arm64"
This lets us manipulate .libPaths() temporarily.
But this is not a clean solution: core functions can call .libPaths() again, and at least the next library() or require() call will, which re-attaches .Library at the end of the .lib.loc closure variable.
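How short-lived the change is can be seen with a small experiment. This is a sketch under assumptions: a temporary directory stands in for the project library, and it only exercises a direct call to .libPaths() with a new argument, not library() or require().

```r
# Throwaway directory standing in for the project library (hypothetical).
tmp <- tempfile("lib")
dir.create(tmp)
old <- .libPaths()

.libPaths(tmp, include.site = FALSE)
e <- environment(.libPaths)
e$.lib.loc <- .libPaths()[1]                  # Option 1: drop .Library by hand
n_after_hack <- length(.libPaths())

.libPaths(.libPaths(), include.site = FALSE)  # any later call that sets the paths
n_after_reset <- length(.libPaths())

print(c(n_after_hack, n_after_reset))

.libPaths(old)                                # restore the previous search path
```

The second count is larger than the first because the setter rebuilds .lib.loc from scratch and appends .Library again.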
Option 2: Shadow library() and require() to cleanly address only the current library path
This is possible, but core functions could still access .Library or re-attach .Library to .lib.loc.
In addition, .Library may play a role deeper inside R (as far as I know, it does).
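For illustration, such a shadow could look roughly like the following. library_local is a hypothetical name, not something rv or base R provides, and the wrapper takes the package name as a string to sidestep non-standard evaluation.

```r
# Hypothetical wrapper: look packages up in the first library path only.
library_local <- function(pkg, ...) {
  base::library(pkg, character.only = TRUE, lib.loc = .libPaths()[1], ...)
}

# Shadowing would then amount to: library <- library_local
attached <- library_local("stats")   # already attached, so nothing is looked up
print("stats" %in% attached)
```

The wrapper only restricts lookups that go through it. Any other code that consults .Library or .libPaths() directly is unaffected, which is the weakness noted above.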
Option 3: The best solution
If .Library contained only the packages that ship with R, without user-installed packages, keeping .Library in the search path would be fine,
and no other code in base R would need to be redefined.
The best solution would be to keep the global library as it is (with user-installed packages) and instead create a sandbox library folder
that contains only the "base" and "recommended" packages of R, set it to read-only mode, and point .Library to it.
This ensures that the second path, the .Library path, is always available with the full set of base and recommended packages but free
of user-installed packages, so R stays fully functional at any time.
I realized this is exactly the solution renv follows, which was created by the Posit and Tidyverse teams, to keep renv environments clean.
So with my PR #402 I imitate exactly this strategy: when activate.R runs, it generates a sandbox environment, moves all base and recommended packages into it, and points .Library to it (.Library has to be unlocked first and then re-locked at the end).
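A rough sketch of the sandbox idea follows. It is not the code from the PR or from renv: make_sandbox and point_library_to are hypothetical helpers, the sandbox lives in the session's temporary directory, packages are symlinked (with a copy fallback) rather than moved, the read-only step is omitted, and rebinding .Library through .BaseNamespaceEnv is an assumption about how the unlock and re-lock could be done.

```r
# Hypothetical helper: build a sandbox holding only base/recommended packages.
make_sandbox <- function(sandbox = file.path(tempdir(), "rv-sandbox")) {
  dir.create(sandbox, recursive = TRUE, showWarnings = FALSE)
  ip <- utils::installed.packages(lib.loc = .Library,
                                  priority = c("base", "recommended"))
  pkgs <- rownames(ip)
  todo <- pkgs[!file.exists(file.path(sandbox, pkgs))]
  for (pkg in todo) {
    src <- file.path(.Library, pkg)
    dst <- file.path(sandbox, pkg)
    # Prefer a symlink; fall back to copying where links are unavailable.
    linked <- suppressWarnings(file.symlink(src, dst))
    if (!linked) file.copy(src, sandbox, recursive = TRUE)
  }
  sandbox
}

# Hypothetical helper: rebind .Library (a locked binding) to the sandbox.
point_library_to <- function(path) {
  base <- .BaseNamespaceEnv
  unlockBinding(".Library", base)
  assign(".Library", path, envir = base)
  lockBinding(".Library", base)
  invisible(path)
}

sandbox <- make_sandbox()
print(list.files(sandbox))
# point_library_to(sandbox)  # left commented out: it changes the session globally
```

After the rebinding, a call such as .libPaths(.libPaths()[1]) would append the sandbox instead of the global library, so user-installed packages in the global library would no longer be visible.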