<finalization interface>= typedef void (*R_CFinalizer_t)(SEXP); void R_RegisterFinalizer(SEXP s, SEXP fun); void R_RegisterCFinalizer(SEXP s, R_CFinalizer_t fun); void R_RegisterFinalizerEx(SEXP s, SEXP fun, Rboolean onexit); void R_RegisterCFinalizerEx(SEXP s, R_CFinalizer_t fun, Rboolean onexit);
<weak reference interface>= SEXP R_MakeWeakRef(SEXP key, SEXP val, SEXP fin, Rboolean onexit); SEXP R_MakeWeakRefC(SEXP key, SEXP val, R_CFinalizer_t fin, Rboolean onexit); SEXP R_WeakRefKey(SEXP w); SEXP R_WeakRefValue(SEXP w); void R_RunWeakRefFinalizer(SEXP w);
The design of the weak reference system is based on the one used in the Glasgow Haskell system. A weak reference contains a key and a value. Values are reachable if they are reachable directly from roots or through weak references with reachable keys. Whether or not the weak reference itself is reachable does not matter. This recursive definition requires a fixed point calculation to determine the reachable nodes.
When the collector determines that a key in a weak reference is no
longer reachable, the the key and value of the reference are replaced
by R_NilValue
and the finalizer is scheduled to run.
The finalization interface is layered on top of the weak reference
system. For example, the R_RegisterCFinalizerEx
function is just
<finalization implementation>= void R_RegisterCFinalizerEx(SEXP s, R_CFinalizer_t fun, Rboolean onexit) { R_MakeWeakRefC(s, R_NilValue, fun, onexit); }
R does not yet have thread support, so this is not an option. With code written in C there is complete control over where a GC could occur, and hence where a finalizer might be run. This allows safe code to be written in C. In R we cannot control when the collector runs and hence cannot control when finalizers might run.
A possible interim solution might be to allow finalizations to be suspended temporarily from R, for example allowing
<suspend finalizations in R>= without.finalizations(expr)
The main drawback of doing this is that it does not make sense in a threaded context and code that uses it would have to be changed once we add threads. On the other hand very little code will use this and all uses would be easy enough to find.
Another approach would be to borrow from MzScheme's weak boxes the idea that an object ready for finalization be placed on some sort of queue and leave it to the programmer to run the finalizers in that queue periodically. With threads, the default queue could be one that is managed by a system finalization thread, but an alternate queue could be provided if needed.
wfile
.
A simple interface to the fopen
and fclose
calls could be
implemented using external pointer objects to represent file streams
and finalization to insure files are closed. Weak references allow us
to maintain a list of open files without preventing the garbage
collection of unreachable files.
The internal portions of the interface might consist of a file
wfile.c
and the R portions might be in wfile.R
.
<wfile.c>= #include <stdio.h> #include "Rinternals.h" #include "R_ext/Rdynload.h" <wfile.c declarations> <wfile.c globals and macros> <wfile.c functions>
<wfile.R>= <wfile.R public functions> <wfile.R initialization function>
<wfile.c globals and macros>= (<-U) [D->] static SEXP WFILE_type_tag;
DefinesWFILE_type_tag
(links are to index).
<initialize type tag>= (U->) WFILE_type_tag = install("WFILE_TYPE_TAG");
Checking of a file stream argument is done by the macro
CHECK_WFILE_STREAM
:
<wfile.c globals and macros>+= (<-U) [<-D->] #define CHECK_WFILE_STREAM(s) do { \ if (TYPEOF(s) != EXTPTRSXP || \ R_ExternalPtrTag(s) != WFILE_type_tag) \ error("bad file stream"); \ } while (0)
DefinesCHECK_WFILE_STREAM
(links are to index).
An alternative to using a symbol as the type identifier would be to use an arbitrary allocated object, which would then have to be stored in the precious list. The advantage would be complete uniqueness within the session; the drawback is somewhat unclear semantics across save/load.
fopen
passes its file name and mode arguments to
the C function WFILE_open
.
<wfile.R public functions>= (U->) [D->] fopen <- function(name, mode = "r") .Call("WFILE_open", as.character(name), as.character(mode), PACKAGE="wfile")
Definesfopen
(links are to index).
The C function WFILE_open
opens the file and creates a weak
reference to register a finalizer and store the name of the file
stream while it is reachable.
<wfile.c declarations>= (<-U) [D->] static SEXP WFILE_open(SEXP name, SEXP mode);
DefinesWFILE_open
(links are to index).
<wfile.c functions>= (<-U) [D->] static SEXP WFILE_open(SEXP name, SEXP mode) { FILE *f = fopen(CHAR(STRING_ELT(name, 0)), CHAR(STRING_ELT(mode, 0))); if (f == NULL) return R_NilValue; else { SEXP val, ref; PROTECT(val = R_MakeExternalPtr(f, WFILE_type_tag, R_NilValue)); PROTECT(ref = R_MakeWeakRefC(val, name, (R_CFinalizer_t) WFILE_close, TRUE)); AddFileRef(ref); UNPROTECT(2); return val; } }
DefinesWFILE_open
(links are to index).
The R function fclose
just calls the C function WFILE_close
:
<wfile.R public functions>+= (U->) [<-D->] fclose <- function(stream) .Call("WFILE_close", stream, PACKAGE="wfile")
Definesfclose
(links are to index).
The C function WFILE_close
closes the stream and clears the
pointer unless the pointer is already NULL
, which would indicate
that the file has already been closed.
<wfile.c declarations>+= (<-U) [<-D->] static SEXP WFILE_close(SEXP s);
DefinesWFILE_close
(links are to index).
<wfile.c functions>+= (<-U) [<-D->] static SEXP WFILE_close(SEXP s) { FILE *f; CHECK_WFILE_STREAM(s); f = R_ExternalPtrAddr(s); if (f != NULL) { fclose(f); R_ClearExternalPtr(s); } return R_NilValue; }
DefinesWFILE_close
(links are to index).
If a file stream is closed by user code, then there is no longer any
need for finalization. But providing a mechanism for removing
finalizers is more trouble than it is worth, so the finalization
mechanism will eventually call fclose
, but nothing much will
happen since the stream pointer will have been cleared. But this
issue needs to be kept in mind in designing finalizer functions.
fgets
function that uses a fixed size buffer.
<wfile.R public functions>+= (U->) [<-D->] fgets <- function(stream) .Call("WFILE_gets", stream, PACKAGE="wfile")
Definesfgets
(links are to index).
<wfile.c declarations>+= (<-U) [<-D->] static SEXP WFILE_gets(SEXP s);
<wfile.c functions>+= (<-U) [<-D->] static SEXP WFILE_gets(SEXP s) { char buf[512]; FILE *f; CHECK_WFILE_STREAM(s); f = R_ExternalPtrAddr(s); if (f == NULL) error("file pointer is NULL"); if (fgets(buf, sizeof(buf), f) == NULL) return R_NilValue; else { SEXP val; PROTECT(val = allocVector(STRSXP, 1)); SET_STRING_ELT(val, 0, mkChar(buf)); UNPROTECT(1); return val; } }
DefinesWFILE_gets
(links are to index).
FileList
. The
value is a CONS
cell that is registered as a permanent object.
The actual list is stored in the CDR
of the cell.
<wfile.c globals and macros>+= (<-U) [<-D->] static SEXP FileList;
DefinesFileList
(links are to index).
<initialize file list>= (U->) FileList = CONS(R_NilValue, R_NilValue); R_PreserveObject(FileList);
This should probably be in a public header file:
<wfile.c declarations>+= (<-U) [<-D->] extern void R_PreserveObject(SEXP);
DefinesR_PreserveObject
(links are to index).
A new file is added to the list with AddFileRef
.
<wfile.c declarations>+= (<-U) [<-D->] static void AddFileRef(SEXP ref);
DefinesAddFileRef
(links are to index).
<wfile.c functions>+= (<-U) [<-D->] static void AddFileRef(SEXP ref) { SEXP f, files, next = NULL, last = NULL; files = CDR(FileList); for (f = files; f != R_NilValue; f = next) { SEXP ref = CAR(f); SEXP key = R_WeakRefKey(ref); next = CDR(f); if (key == R_NilValue || R_ExternalPtrAddr(key) == NULL) { if (last == NULL) files = next; else SETCDR(last, next); } else last = f; } SETCDR(FileList, CONS(ref, files)); }
DefinesAddFileRef
(links are to index).
The function flist
returns a list of the names, as specified to
fopen
, of the open files.
<wfile.R public functions>+= (U->) [<-D] flist <- function() .Call("WFILE_list", PACKAGE="wfile")
Definesflist
(links are to index).
<wfile.c declarations>+= (<-U) [<-D] static SEXP WFILE_list(void);
DefinesWFILE_list
(links are to index).
<wfile.c functions>+= (<-U) [<-D->] static SEXP WFILE_list(void) { SEXP files, val = R_NilValue; for (files = CDR(FileList); files != R_NilValue; files = CDR(files)) { SEXP ref = CAR(files); SEXP key = R_WeakRefKey(ref); if (key != R_NilValue && R_ExternalPtrAddr(key) != NULL) { PROTECT(key); val = CONS(R_WeakRefValue(ref), val); UNPROTECT(1); } } return PairToVectorList(val); }
DefinesWFILE_list
(links are to index).
The list returned reflect files that were open sometime while this routine was run. It is possible for files as the end of the list to be closed by an allocation needed for adding items to the beginning of the list. A more sophisticated implementation would return a list of the file objects and these objects would provide access to their file names.
<wfile.c globals and macros>+= (<-U) [<-D] static R_CallMethodDef WFILE_CallDefs[] = { {"WFILE_open", (DL_FUNC) WFILE_open, 2}, {"WFILE_close", (DL_FUNC) WFILE_close, 1}, {"WFILE_gets", (DL_FUNC) WFILE_gets, 1}, {"WFILE_list", (DL_FUNC) WFILE_list, 0}, {NULL} };
DefinesWFILE_CallDefs
(links are to index).
The initialization routines are
<wfile.c functions>+= (<-U) [<-D] void R_init_wfile(DllInfo *info) { <initialize type tag> <initialize file list> R_registerRoutines(info, NULL, WFILE_CallDefs, NULL, 0); }
DefinesFILE_init
(links are to index).
<wfile.R initialization function>= (U->) .First.lib <- function(lib, pkg) { library.dynam( "wfile", pkg, lib ) }
Defines.First.lib
(links are to index).
> library(wfile) > f<-fopen("simpleref.nw") > g<-fopen("weakfinex.nw")
The list of open files:
> flist()[[1]]
[1] "simpleref.nw"[[2]]
[1] "weakfinex.nw"
Read a few lines from each:
> fgets(g) [1] "% -*- mode: Noweb; noweb-code-mode: c-mode -*-\n" > fgets(g) [1] "\n" > fgets(f) [1] "% -*- mode: Noweb; noweb-code-mode: c-mode -*-\n" > fgets(f) [1] "\n"
Now drop the reference to f
, run the garbage collector and look at
the new list of open files:
> f<-NULL
> gc()
used (Mb) gc trigger (Mb)
Ncells 194292 5.2 407500 10.9
Vcells 37333 0.3 786432 6.0
> flist()
[[1]]
[1] "weakfinex.nw"
If we open a new file and explisitly close g
, then the result will
also be reflected in the open file list:
> f<-fopen("weakfin.nw")
> fclose(g)
NULL
> flist()
[[1]]
[1] "weakfin.nw"