Anatomy of a module

Modules are defined in a declarative way, using the keywords %requires% and %provides%, and have four main components:

  • A name, which identifies the module.
  • A list of required dependencies, if any.
  • A provider, which expresses the module’s feature(s).
  • Optional docstrings, intended to document the internals of a module.

A typical module looks like the following:

"name_of_module" %requires% list(

  # A list of dependencies.
  dependency_1 = "name_of_dependency_1",
  dependency_2 = "name_of_dependency_2",
  ... = "..."

) %provides% {

  #' A recommended docstring intended to document the internals of the module.

  # A section where add-on packages are loaded and attached.
  library(package_1)
  library(package_2)
  library(...)

  # Some code that uses the objects `dependency_1` and `dependency_2`
  # returned by the modules "name_of_dependency_1" and "name_of_dependency_2".
  object <- { ... }

  # A resulting object, which can be directly consumed or in turn injected as a 
  # dependency.
  return(object)

} 

Once a module is defined, it has to be made for modulr to evaluate the code it provides:

result <- make("name_of_module") 

or with a handy syntactic sugar:

result %<=% "name_of_module" 

or interactively with the hit function:

hit(name_of_module) # or
hit(name_of) # which will prompt the user to choose among all possible matches

The result contains the computed object exposed by the module. Under the hood, the dependencies have been sorted and appropriately made, and their resulting objects injected where required.

A first working example

Let us start by defining some modules and dependencies:

"foo" %provides% "Hello"
#> [2018-12-02T16:14:33 UTC] Defining 'foo' ... OK

"bar" %provides% "World"
#> [2018-12-02T16:14:33 UTC] Defining 'bar' ... OK

"baz" %provides% "!"
#> [2018-12-02T16:14:33 UTC] Defining 'baz' ... OK

"foobar" %requires% list(
  f = "foo", 
  b = "bar",
  z = "baz"
) %provides% { 
  #' Return a concatenated string. 
  paste0(f, ", ", tolower(b), z) 
} 
#> [2018-12-02T16:14:33 UTC] Defining 'foobar' ... OK

Use info to output the docstrings:

info("foobar") 
#> Return a concatenated string.

Use lsmod to list all defined modules and their properties in a data frame:

lsmod(cols = c("name", "type", "dependencies", "uses", "size", "modified")) 
#>     name type dependencies uses      size                modified
#> 1    bar <NA>            0    1 152 bytes 2018-12-02T16:14:33 UTC
#> 2    baz <NA>            0    1 152 bytes 2018-12-02T16:14:33 UTC
#> 3    foo <NA>            0    1 152 bytes 2018-12-02T16:14:33 UTC
#> 4 foobar <NA>            3    0    5.2 Kb 2018-12-02T16:14:33 UTC

In this example, foobar relies on three dependencies, and foo, bar and baz are each injected once. Use plot_dependencies to see these relations:

plot_dependencies(render_engine = chord_engine) 

Use make to get the resulting object provided by the module:

make("foobar")
#> [2018-12-02T16:14:33 UTC] Making 'foobar' ...
#> [2018-12-02T16:14:33 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:33 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:33 UTC] * Sorting 3 dependencies with 3 relations ... on 1 layer, OK
#> [2018-12-02T16:14:33 UTC] * Evaluating new and outdated dependencies ...
#> [2018-12-02T16:14:33 UTC] ** Evaluating #1/3 (layer #1/1): 'baz' ...
#> [2018-12-02T16:14:33 UTC] ** Evaluating #2/3 (layer #1/1): 'bar' ...
#> [2018-12-02T16:14:33 UTC] ** Evaluating #3/3 (layer #1/1): 'foo' ...
#> [2018-12-02T16:14:33 UTC] DONE ('foobar' in 0.056 secs)
#> [1] "Hello, world!"

Voilà! All the dependencies have been evaluated, injected and processed to return the expected "Hello, world!". As a matter of fact, the dependencies form a directed acyclic graph, which is topologically sorted just in time to determine a valid order for their evaluation before injection. Listing the modules again shows that their types are now known:

lsmod(cols = c("name", "type", "dependencies", "uses", "size", "modified")) 
#>     name      type dependencies uses      size                modified
#> 1    bar character            0    1 152 bytes 2018-12-02T16:14:33 UTC
#> 2    baz character            0    1 152 bytes 2018-12-02T16:14:33 UTC
#> 3    foo character            0    1 152 bytes 2018-12-02T16:14:33 UTC
#> 4 foobar character            3    0    5.2 Kb 2018-12-02T16:14:33 UTC
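
The topological sorting mentioned above can be pictured with a few lines of base R. This is only a sketch under stated assumptions, not modulr's implementation: the `dependencies` table and the `topological_order` function are hypothetical names introduced for illustration.

```r
# Hypothetical dependency table: each module maps to the modules it requires.
dependencies <- list(
  foobar = c("foo", "bar", "baz"),
  foo = character(0),
  bar = character(0),
  baz = character(0)
)

# Depth-first topological sort: a module is appended only after all of its
# dependencies, so evaluating in the returned order is always safe.
topological_order <- function(dependencies) {
  ordered <- character(0)
  visiting <- character(0)
  visit <- function(name) {
    if (name %in% ordered) return(invisible(NULL))
    if (name %in% visiting) stop("cyclic dependency involving '", name, "'")
    visiting <<- c(visiting, name)
    for (dependency in dependencies[[name]]) visit(dependency)
    visiting <<- setdiff(visiting, name)
    ordered <<- c(ordered, name)
    invisible(NULL)
  }
  for (name in names(dependencies)) visit(name)
  ordered
}

topological_order(dependencies)
#> [1] "foo"    "bar"    "baz"    "foobar"
```

Any order that places foo, bar and baz before foobar is equally valid, which is why the log above evaluates them in a different order.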

Types of modules

Singletons

All modules are singletons: once evaluated, they always return the same resulting object. This is one of the great advantages of modulr: module evaluation takes place parsimoniously, only when changes are detected or when explicitly required, à la GNU Make.

"timestamp" %provides% {
  #' Return a string containing a timestamp.
  format(Sys.time(), "%H:%M:%OS6")
}
#> [2018-12-02T16:14:33 UTC] Defining 'timestamp' ... OK

Successive make calls on the module will not imply its re-evaluation:

make("timestamp")
#> [2018-12-02T16:14:33 UTC] Making 'timestamp' ...
#> [2018-12-02T16:14:33 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:33 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:33 UTC] DONE ('timestamp' in 0.014 secs)
#> [1] "16:14:33.563820"
with_verbosity(0, make("timestamp")) # temporarily change the verbosity of make
#> [1] "16:14:33.563820"

Notice the with_verbosity wrapper around the call. To force re-evaluation, just touch the module:

touch("timestamp")
#> [2018-12-02T16:14:33 UTC] Touching 'timestamp' ... OK
make("timestamp")
#> [2018-12-02T16:14:33 UTC] Making 'timestamp' ...
#> [2018-12-02T16:14:33 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:33 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:33 UTC] DONE ('timestamp' in 0.015 secs)
#> [1] "16:14:33.611773"

Any change of the module’s definition (even its docstrings) will be detected:

"timestamp" %provides% {
  #' Return a string containing a timestamp with more information.
  format(Sys.time(), "%Y-%m-%d %H:%M:%OS6")
}
#> [2018-12-02T16:14:33 UTC] Re-defining 'timestamp' ... OK

make("timestamp")
#> [2018-12-02T16:14:33 UTC] Making 'timestamp' ...
#> [2018-12-02T16:14:33 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:33 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:33 UTC] DONE ('timestamp' in 0.014 secs)
#> [1] "2018-12-02 16:14:33.645560"
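
The make-like behaviour shown above (evaluate once, re-evaluate after a touch or a change of definition) can be mimicked in base R. The following is a rough sketch, not how modulr works internally: `registry`, `sketch_define`, `sketch_make` and `sketch_touch` are hypothetical names, and change detection here compares deparsed code, so unlike modulr it ignores comments and docstrings.

```r
registry <- new.env()

# Store the unevaluated code; keep the cached value if the code is unchanged.
sketch_define <- function(name, expr) {
  code <- substitute(expr)
  previous <- registry[[name]]
  unchanged <- !is.null(previous) &&
    identical(deparse(previous$code), deparse(code))
  if (!unchanged) {
    entry <- list(code = code, env = parent.frame(), fresh = FALSE)
    assign(name, entry, envir = registry)
  }
  invisible(name)
}

# Evaluate only when the cached value is missing or outdated.
sketch_make <- function(name) {
  entry <- registry[[name]]
  if (!entry$fresh) {
    entry$value <- eval(entry$code, envir = new.env(parent = entry$env))
    entry$fresh <- TRUE
    assign(name, entry, envir = registry)
  }
  entry$value
}

# Mark the cached value as outdated without changing the definition.
sketch_touch <- function(name) {
  entry <- registry[[name]]
  entry$fresh <- FALSE
  assign(name, entry, envir = registry)
  invisible(name)
}

evaluations <- 0L
sketch_define("counter", {
  evaluations <<- evaluations + 1L
  evaluations
})
sketch_make("counter")  # evaluated: returns 1
sketch_make("counter")  # cached: still returns 1
sketch_touch("counter")
sketch_make("counter")  # re-evaluated: returns 2
```

A counter is used instead of a timestamp so that the effect of caching is visible without depending on the clock.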

Prototypes

All modules are singletons. Nonetheless, a module may return any object; in particular, it can return a function (closure) that itself returns a desired object or produces some side effect. In this case, the module behaves like a so-called prototype.

"timestamp" %provides% {
  function() format(Sys.time(), "%H:%M:%OS6")
}
#> [2018-12-02T16:14:33 UTC] Defining 'timestamp' ... OK
make("timestamp")()
#> [2018-12-02T16:14:33 UTC] Making 'timestamp' ...
#> [2018-12-02T16:14:33 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:33 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:33 UTC] DONE ('timestamp' in 0.015 secs)
#> [1] "16:14:33.701388"

with_verbosity(0L, make("timestamp")())
#> [1] "16:14:33.720088"

It is important to emphasize that the module is still a singleton: the second make call doesn’t re-evaluate. But the function that is returned by the module is itself re-evaluated each time it is called.

Memoised prototypes

Singletons produce cached objects at make-time and prototypes produce computed objects at run-time. In a complementary manner, memoised modules produce cached objects at run-time. Memoisation and Hadley Wickham’s memoise package give an elegant solution to this requirement.

To see the essence of what is happening, we decrease the verbosity of modulr and set up a simple starting scenario: foo requires the somewhat resource-consuming timestamp module, defined as a singleton:

set_verbosity(1L) # messages are shown only when changes occur

"timestamp" %provides% {
  # This is a singleton.
  message("'timestamp' is evaluated after a (short) pause...")
  Sys.sleep(1L)
  format(Sys.time(), "%H:%M:%OS6")
}

"foo" %requires% list(
  timestamp = "timestamp"
) %provides% {
  "foo"
}
#> Warning: [2018-12-02T16:14:33 UTC] Possibly unused dependency in 'foo': 'timestamp'.

system.time(make("foo"))
#> [2018-12-02T16:14:33 UTC] Evaluating #1/1 (layer #1/1): 'timestamp' ...
#> 'timestamp' is evaluated after a (short) pause...
#>    user  system elapsed 
#>   0.029   0.000   1.030

In this example, timestamp is evaluated even though it is not explicitly used by foo. Here it merely computes a timestamp after a short pause, but such a dependency could be far more resource-consuming at make-time.

Let us re-define timestamp as a prototype:

"timestamp" %provides% {
  # This is a prototype.
  function() {
    message("'timestamp' is evaluated after a (short) pause...")
    Sys.sleep(1L)
    format(Sys.time(), "%H:%M:%OS6")
  }
}
#> [2018-12-02T16:14:34 UTC] Re-defining 'timestamp' ... OK

system.time(make("foo"))
#> [2018-12-02T16:14:34 UTC] Evaluating #1/1 (layer #1/1): 'timestamp' ...
#>    user  system elapsed 
#>   0.024   0.002   0.026

Here, evaluating the module merely defines a function; the pause and the timestamp computation happen only when that function is explicitly called. Even if the computation encapsulated by the function is very resource-consuming, none of it takes place at make-time.

Finally, let us re-define timestamp as a memoised module:

"timestamp" %provides% {
  # This is a memoised module.
  memoise::memoise(
    function() {
      message("'timestamp' is evaluated after a (short) pause...")
      Sys.sleep(1L)
      format(Sys.time(), "%H:%M:%OS6")
    }
  )
}
#> [2018-12-02T16:14:34 UTC] Re-defining 'timestamp' ... OK

system.time(make("foo"))
#> [2018-12-02T16:14:35 UTC] Evaluating #1/1 (layer #1/1): 'timestamp' ...
#>    user  system elapsed 
#>   0.028   0.000   0.027

The timestamp module returns a function which will be evaluated only when explicitly called at run-time. Let us re-define foo so that it actually uses timestamp.

"foo" %requires% list(
  timestamp = "timestamp"
) %provides% {
  message("It is ", timestamp())
  "foo"
}
#> [2018-12-02T16:14:35 UTC] Re-defining 'foo' ... OK

system.time(make("foo"))
#> 'timestamp' is evaluated after a (short) pause...
#> It is 16:14:36.197722
#>    user  system elapsed 
#>   0.028   0.000   1.029

Here, a timestamped message is output. Let us force the re-evaluation of foo.

touch("foo")
system.time(make("foo"))
#> It is 16:14:36.197722
#>    user  system elapsed 
#>   0.028   0.000   0.029

The memoised version of timestamp is evaluated only at run-time, not at make-time; moreover, the string containing the actual timestamp is computed only once and then cached for future calls, avoiding re-evaluation.

To force re-evaluation of the memoised function exposed by timestamp, use memoise::forget.

memoise::forget(make("timestamp"))
#> [1] TRUE
touch("foo")
system.time(make("foo"))
#> 'timestamp' is evaluated after a (short) pause...
#> It is 16:14:37.437601
#>    user  system elapsed 
#>   0.029   0.001   1.031
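
To make the caching mechanism less opaque, here is a hand-rolled memoisation of a zero-argument function in base R. It is a minimal sketch and not the memoise package's implementation (which also handles arguments); `memoise_once` and its `call`/`forget` members are hypothetical names chosen for this illustration.

```r
# Wrap a zero-argument function so that its result is computed at most once,
# until the cache is explicitly forgotten.
memoise_once <- function(f) {
  cache <- NULL
  computed <- FALSE
  list(
    call = function() {
      if (!computed) {
        cache <<- f()
        computed <<- TRUE
      }
      cache
    },
    forget = function() {
      computed <<- FALSE
      invisible(TRUE)
    }
  )
}

calls <- 0L
slow_counter <- memoise_once(function() {
  calls <<- calls + 1L  # stands in for an expensive computation
  calls
})

slow_counter$call()    # computed: returns 1
slow_counter$call()    # cached: still returns 1
slow_counter$forget()
slow_counter$call()    # recomputed: returns 2
```

The cache lives in the closure created by `memoise_once`, which is why it survives between calls without any global state.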

Lists

It is often useful for a module to expose several (immutable, cf. infra) objects at once by returning a list.

"timestamps" %provides% {
  now <- function() Sys.time()
  list(
    origin = structure(0L, class = "Date"),
    yesterday = function() now() - 86400L,
    now = now,
    tomorrow = function() now() + 86400L
  )
}
#> [2018-12-02T16:14:37 UTC] Defining 'timestamps' ... OK

ts %<=% "timestamps"
#> [2018-12-02T16:14:37 UTC] Making 'timestamps' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('timestamps' in 0.025 secs)

ts$origin
#> [1] "1970-01-01"
ts$yesterday()
#> [1] "2018-12-01 16:14:37 UTC"
ts$now()
#> [1] "2018-12-02 16:14:37 UTC"
ts$tomorrow()
#> [1] "2018-12-03 16:14:37 UTC"

Environments

It is often useful for a module to expose several mutable objects at once by returning an environment.

"configuration" %provides% {
  env <- new.env(parent = emptyenv())
  env$shape <- "circle"
  env$color <- "blue"
  env$size <- 13L
  env
}
#> [2018-12-02T16:14:37 UTC] Defining 'configuration' ... OK

config %<=% "configuration"
#> [2018-12-02T16:14:37 UTC] Making 'configuration' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('configuration' in 0.014 secs)

config$color
#> [1] "blue"
config$color <- "red"
config$color
#> [1] "red"

This kind of module can be used to share mutable data between modules, without polluting the Global Environment.

"widget_A" %requires% list(
  config = "configuration"
) %provides% {
  list(
    switch_color = function()
      config$color <- if (config$color == "blue") "red" else "blue"
  )
}
#> [2018-12-02T16:14:37 UTC] Defining 'widget_A' ... OK

"widget_B" %requires% list(
  config = "configuration"
) %provides% {
  list(
    switch_shape = function()
      config$shape <- if (config$shape == "circle") "square" else "circle"
  )
}
#> [2018-12-02T16:14:37 UTC] Defining 'widget_B' ... OK

widget_A %<=% "widget_A"
#> [2018-12-02T16:14:37 UTC] Making 'widget_A' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('widget_A' in 0.027 secs)
widget_B %<=% "widget_B"
#> [2018-12-02T16:14:37 UTC] Making 'widget_B' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('widget_B' in 0.027 secs)

widget_A$switch_color()
config$color
#> [1] "blue"

widget_B$switch_shape()
config$shape
#> [1] "square"

The modulr package implements the dedicated syntactic sugar %provides_options% for this frequent purpose.

undefine("configuration")
#> [2018-12-02T16:14:37 UTC] Undefining 'configuration' ... OK

"configuration" %provides_options% list(
  shape = "circle",
  color = "blue",
  size = 13L
)
#> [2018-12-02T16:14:37 UTC] Defining 'configuration' ... OK

config %<=% "configuration"
#> [2018-12-02T16:14:37 UTC] Making 'configuration' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('configuration' in 0.014 secs)
widget_A %<=% "widget_A"
#> [2018-12-02T16:14:37 UTC] Making 'widget_A' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('widget_A' in 0.027 secs)

config$color
#> [1] "blue"
widget_A$switch_color()
config$color
#> [1] "red"

It is also possible to use the shared environment associated with every injector:

"widget_B_prime" %provides% {
  list(
    switch_shape = function()
      .SharedEnv$shape <- 
        if (.SharedEnv$shape == "circle") "square" else "circle"
  )
}
#> [2018-12-02T16:14:37 UTC] Defining 'widget_B_prime' ... OK

widget_B_prime <- make()
#> [2018-12-02T16:14:37 UTC] Making 'widget_B_prime' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('widget_B_prime' in 0.014 secs)

.SharedEnv$shape <- "circle"
widget_B_prime$switch_shape()
.SharedEnv$shape
#> [1] "square"
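
The contrast between list-returning and environment-returning modules comes down to base R semantics: lists are copied when modified inside a function, whereas environments are passed by reference. The snippet below illustrates this without modulr; the `bump` helper and the two containers are hypothetical names used only for this example.

```r
# Increment the `size` element of whatever container is passed in.
bump <- function(container) {
  container$size <- container$size + 1L
  invisible(NULL)
}

as_list <- list(size = 13L)
as_env <- new.env()
as_env$size <- 13L

bump(as_list)  # modifies a local copy only
bump(as_env)   # modifies the shared environment

as_list$size
#> [1] 13
as_env$size
#> [1] 14
```

This is why a list suits a set of fixed values or functions, while an environment is needed when several modules must see each other's changes.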

Byte-compiled modules

Since a module is allowed to expose any R object, there is no restriction on byte-compiled code.

"hello" %provides% {
  compiler::compile("Hello, world!")
}
#> [2018-12-02T16:14:37 UTC] Defining 'hello' ... OK

eval(make())
#> [2018-12-02T16:14:37 UTC] Making 'hello' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('hello' in 0.023 secs)
#> [1] "Hello, world!"

"lapply_old" %provides% {
  # Old R version of lapply.
  function(X, FUN, ...) {
    FUN <- match.fun(FUN)
    if (!is.list(X))
      X <- as.list(X)
    rval <- vector("list", length(X))
    for(i in seq(along = X))
      rval[i] <- list(FUN(X[[i]], ...))
    names(rval) <- names(X) # keep `names' !
    return(rval)
  }
}
#> [2018-12-02T16:14:37 UTC] Defining 'lapply_old' ... OK

"lapply_old/compiled" %requires% list(
  lapply_old = "lapply_old"
) %provides% {
  compiler::cmpfun(lapply_old)
}
#> [2018-12-02T16:14:37 UTC] Defining 'lapply_old/compiled' ... OK

lapply_old %<=% "lapply_old"
#> [2018-12-02T16:14:37 UTC] Making 'lapply_old' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('lapply_old' in 0.015 secs)
lapply_old_compiled %<=% "lapply_old/compiled"
#> [2018-12-02T16:14:37 UTC] Making 'lapply_old/compiled' ...
#> [2018-12-02T16:14:37 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:37 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:37 UTC] DONE ('lapply_old/compiled' in 0.053 secs)

system.time(for (i in 1L:10000L) lapply_old(1L:10L, is.null))
#>    user  system elapsed 
#>   0.288   0.000   0.289
system.time(for (i in 1L:10000L) lapply_old_compiled(1L:10L, is.null))
#>    user  system elapsed 
#>   0.183   0.000   0.183
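
Independently of modulr, it can be useful to check that a byte-compiled function returns exactly the same result as its source version. The sketch below uses only the compiler package shipped with R; `square_sum` is a hypothetical function made up for the example. Note that on R 3.4.0 and later the just-in-time compiler is enabled by default, so timing differences such as the one above may be smaller on recent versions.

```r
library(compiler)

# A deliberately loop-based function, the kind that benefits from compilation.
square_sum <- function(x) {
  total <- 0
  for (value in x) total <- total + value * value
  total
}

square_sum_compiled <- cmpfun(square_sum)

# Both versions must agree; compilation changes speed, not results.
identical(square_sum(1:10), square_sum_compiled(1:10))
#> [1] TRUE
```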

Semantic Versioning (SemVer)

The modulr package offers Semantic Versioning capabilities: every module can exist in several versions, with version numbers of the form x.y.z, where x, y, and z are the major, minor, and patch versions, respectively. For instance, foo#1.2.3 designates module foo in version 1.2.3.

Given a version number, increment the:

  • major version when you make incompatible changes,
  • minor version when you refactor and/or add functionality in a backwards-compatible manner, and
  • patch version when you make backwards-compatible bug fixes.

For instance, foo#1.2.3 becomes foo#1.2.4 after a bug fix and foo#1.3.0 after a functionality bump.

Use:

  • ~x.y.z to refer to the most up-to-date available patch version above x.y.z and allow bug fixes, but nothing else,
  • ^x.y.z or ^x.y to refer to the most up-to-date available minor version above x.y and allow bug fixes and new functionalities, but nothing else, and
  • >=x.y.z, >=x.y, or >=x to refer to the most up-to-date version above x.y.z, x.y, or x, and live on the edge of developments.

Here are some examples among foo#1.2.3, foo#1.2.4, foo#1.3.0:

  • foo#~1.2.0 refers to foo#1.2.4,
  • foo#~1.2.5 refers to nothing,
  • foo#^1.2.5 and foo#^1.1 refer to foo#1.3.0,
  • foo#^1.3.1 and foo#^1.4 refer to nothing,
  • foo#>=1.1.0 and foo#>=0 (aka latest) refer to foo#1.3.0, and
  • foo#>=1.5 refers to nothing, since no available version is above 1.5.
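
The resolution rules listed above can be sketched in base R. This is an illustration under assumptions, not modulr's resolver: `resolve_version` is a hypothetical helper, the available versions are assumed to be complete x.y.z strings, and the rule that ^ stays within the same major version is inferred from the examples in this document.

```r
# Return the highest available version matching a specifier such as
# "~1.2.0", "^1.1" or ">=0", or NA when nothing matches.
resolve_version <- function(spec, available) {
  fields <- regmatches(
    spec, regexec("^(~|\\^|>=)([0-9]+(\\.[0-9]+){0,2})$", spec)
  )[[1]]
  if (length(fields) == 0L) stop("unsupported specifier: ", spec)
  operator <- fields[2]

  # Pad missing components with zeros, so that "1.1" is read as "1.1.0".
  lower <- as.integer(strsplit(fields[3], ".", fixed = TRUE)[[1]])
  lower <- c(lower, rep(0L, 3L - length(lower)))

  # One row per available version; columns are major, minor, patch.
  parts <- do.call(
    rbind, lapply(strsplit(available, ".", fixed = TRUE), as.integer)
  )
  versions <- numeric_version(available)

  at_least <- versions >= numeric_version(paste(lower, collapse = "."))
  in_range <- switch(operator,
    "~" = parts[, 1] == lower[1] & parts[, 2] == lower[2],  # same minor
    "^" = parts[, 1] == lower[1],                           # same major
    ">=" = rep(TRUE, length(available))                     # anything newer
  )
  candidates <- versions[at_least & in_range]
  if (length(candidates) == 0L) return(NA_character_)
  as.character(max(candidates))
}

available <- c("1.2.3", "1.2.4", "1.3.0")
resolve_version("~1.2.0", available)
#> [1] "1.2.4"
resolve_version("~1.2.5", available)
#> [1] NA
resolve_version("^1.1", available)
#> [1] "1.3.0"
resolve_version(">=0", available)
#> [1] "1.3.0"
```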

Initial scenario: no versioning

There is a good chance that your initial scenario contains no versioned module.

"great_module" %provides% {
  function() {
    Sys.sleep(1L)
    "great features"
  }
}
#> [2018-12-02T16:14:38 UTC] Defining 'great_module' ... OK
"complex_module" %requires% list(
  great = "great_module"
) %provides% {
  function() cat(paste("complex module using", great()))
}
#> [2018-12-02T16:14:38 UTC] Defining 'complex_module' ... OK
system.time(make("complex_module")())
#> [2018-12-02T16:14:38 UTC] Making 'complex_module' ...
#> [2018-12-02T16:14:38 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:38 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:38 UTC] * Evaluating #1/1 (layer #1/1): 'great_module' ...
#> [2018-12-02T16:14:38 UTC] DONE ('complex_module' in 0.026 secs)
#> complex module using great features
#>    user  system elapsed 
#>   0.029   0.000   1.029

In this scenario, great_module does what it is supposed to do, but clearly not very efficiently. You then decide to work on a new version that improves its performance.

Setting-up versioning

First, we clone great_module with an initial version number.

"great_module#0.1.0" %clones% "great_module"
#> [2018-12-02T16:14:39 UTC] Defining 'great_module#0.1.0' ... OK

We then adapt the requirements where great_module is injected as a dependency: for complex_module, we decide to accept bug fixes, refactorings, and new functionalities, as long as the API does not change in a backward-incompatible manner.

"complex_module" %requires% list(
  great = "great_module#^0.1.0"
) %provides% {
  function() cat(paste("complex module using", great()))
}
#> [2018-12-02T16:14:39 UTC] Re-defining 'complex_module' ... OK
system.time(make("complex_module")())
#> [2018-12-02T16:14:39 UTC] Making 'complex_module' ...
#> [2018-12-02T16:14:39 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:39 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:39 UTC] * Evaluating #1/1 (layer #1/1): 'great_module#0.1.0' ...
#> [2018-12-02T16:14:39 UTC] DONE ('complex_module' in 0.033 secs)
#> complex module using great features
#>    user  system elapsed 
#>   0.035   0.000   1.035

Here is the minor bump of great_module, which is a little bit more efficient.

"great_module#0.2.0" %provides% {
  # Improved internals, same interface
  function() "great optimisd features"
}
#> [2018-12-02T16:14:40 UTC] Defining 'great_module#0.2.0' ... OK
system.time(make("complex_module")())
#> [2018-12-02T16:14:40 UTC] Making 'complex_module' ...
#> [2018-12-02T16:14:40 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:40 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:40 UTC] * Evaluating #1/1 (layer #1/1): 'great_module#0.2.0' ...
#> [2018-12-02T16:14:40 UTC] DONE ('complex_module' in 0.039 secs)
#> complex module using great optimisd features
#>    user  system elapsed 
#>   0.040   0.000   0.041

And here is the latest bug fix correcting the typo.

"great_module#0.2.1" %provides% {
  # Bug fix
  function() "great optimised features"
}
#> [2018-12-02T16:14:41 UTC] Defining 'great_module#0.2.1' ... OK
make("complex_module")()
#> [2018-12-02T16:14:41 UTC] Making 'complex_module' ...
#> [2018-12-02T16:14:41 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:41 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:41 UTC] * Evaluating #1/1 (layer #1/1): 'great_module#0.2.1' ...
#> [2018-12-02T16:14:41 UTC] DONE ('complex_module' in 0.043 secs)
#> complex module using great optimised features

Location of modules

Modules can be defined in several locations: in-memory, on-disk (in their own file or alongside another module in its file), and remotely on GitHub’s Gist or via the HTTP(S) protocol.

In-memory

This is the most direct method to define a module. This is also the most volatile, since the lifespan of the module is limited to the R session.

"foo" %provides% "bar"
#> [2018-12-02T16:14:41 UTC] Defining 'foo' ... OK
lsmod(cols = c("name", "storage", "along", "filepath", "url"))
#>   name   storage along filepath  url
#> 1  foo in-memory  <NA>     <NA> <NA>

On-disk

This is the way to go when a module is intended to be reused. In such a case, the definition takes place in a dedicated R, R Markdown, or R Sweave file, whose path and name are closely related to the module’s name.

For instance, the following module definition is stored in the R file swissknife.R, under the sub-directory vendor/tool of the ./modules directory.

# File: ./modules/vendor/tool/swissknife.R

library(modulr)

"vendor/tool/swissknife" %provides% {
  list(
    large_blade = "Large blade",
    small_blade = "Small blade",
    scissors = "Scissors",
    led_light = "LED light"
  )
}

When a module is invoked, modulr searches for it in-memory first, then on-disk if necessary. There are several default root places where modulr looks for the module’s file: ./modules/, ./module/, ./libs/, ./lib/, and ./. This behaviour can be configured with the help of root_config.

root_config$get_all()
#> [[1]]
#> [1] "../../inst/modules"

This explains why modulr finds the module vendor/tool/swissknife under the file ./modules/vendor/tool/swissknife.R.

my_swissknife %<=% "vendor/tool/swissknife"
#> [2018-12-02T16:14:41 UTC] Making 'vendor/tool/swissknife' ...
#> [2018-12-02T16:14:41 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:41 UTC] ** Defining 'vendor/tool/swissknife' ... OK
#> [2018-12-02T16:14:41 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:41 UTC] DONE ('vendor/tool/swissknife' in 0.033 secs)

This also works with R Markdown .Rmd and R Sweave .Rnw files.

<!--- File: ./modules/vendor/tool/multitool.Rmd -->

# Module `vendor/tool/multitool`

```{r}
library(modulr)

"vendor/tool/multitool" %provides% {
  list(
    pliers = "Pliers",
    keychain = "Keychain"
  )
}
```

load_module("vendor/tool/multitool") # load only, do not make
#> [2018-12-02T16:14:41 UTC] Defining 'vendor/tool/multitool' ... OK
#>                          vendor/tool/multitool 
#> "../../inst/modules/vendor/tool/multitool.Rmd"
lsmod(cols = c("name", "storage", "along", "filepath", "url"))
#>                     name storage along
#> 1  vendor/tool/multitool on-disk  <NA>
#> 2 vendor/tool/swissknife on-disk  <NA>
#>                                       filepath  url
#> 1 ../../inst/modules/vendor/tool/multitool.Rmd <NA>
#> 2  ../../inst/modules/vendor/tool/swissknife.R <NA>
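
The relation between a module's name and its file can be pictured as a list of candidate paths built from the root directories and the supported file extensions. The sketch below is only a guess at the general idea, not modulr's actual lookup: `candidate_paths` is a hypothetical helper, the default roots are the ones listed in the text above, and the order in which extensions are tried is an assumption.

```r
# Build every "<root>/<module name>.<extension>" combination, roots first.
candidate_paths <- function(name,
                            roots = c("./modules", "./module",
                                      "./libs", "./lib", "."),
                            extensions = c("R", "Rmd", "Rnw")) {
  unlist(lapply(roots, function(root) {
    file.path(root, paste0(name, ".", extensions))
  }))
}

paths <- candidate_paths("vendor/tool/swissknife")
paths[1]
#> [1] "./modules/vendor/tool/swissknife.R"

# The first existing candidate, if any, would be the file to load.
Filter(file.exists, paths)
```

With the configuration shown by root_config$get_all() above, the only root would be "../../inst/modules", which can be passed through the `roots` argument of this sketch.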

Alongside a principal module, it is possible to define other related modules, for instance mock-ups and testing modules (cf. infra).

Remote (modulr gears)

Using GitHub’s Gist is a simple way to share modules with others. To illustrate this, let us consider the following remote module, aka modulr gear: https://gist.github.com/aclemen1/3fcc508cb40ddac6c1e3.

"modulr/vault" %imports% "https://gist.github.com/aclemen1/3fcc508cb40ddac6c1e3"
#> [2018-12-02T16:14:41 UTC] Importing 'modulr/vault' from gist ID '3fcc508cb40ddac6c1e3' ...
#> [2018-12-02T16:14:41 UTC] * Found 1 file(s) with R flavour (see https://gist.github.com/3fcc508cb40ddac6c1e3).
#> [2018-12-02T16:14:41 UTC] * Installing gear at '/tmp/RtmpHAFvqv/gears/modulr/vault/d4f0872e6d7a00c9'.
#> [2018-12-02T16:14:41 UTC] DONE ('modulr/vault')
#> [2018-12-02T16:14:41 UTC] Defining 'modulr/vault/example_SECRET_' ... OK
#> [2018-12-02T16:14:41 UTC] Defining 'modulr/vault/example' ... OK
#> [2018-12-02T16:14:41 UTC] Defining 'modulr/vault#0.1.0' ... OK
#> [2018-12-02T16:14:41 UTC] Defining 'modulr/vault#0.1.0/mock' ... OK
#> [2018-12-02T16:14:41 UTC] Defining 'modulr/vault#0.1.0/test' ... OK
#> [2018-12-02T16:14:41 UTC] Digest of 'modulr/vault#0.1.0' is '390032bb2476c24e'.

Notice that specifying only the gist ID, as in "modulr/vault" %imports% "3fcc508cb40ddac6c1e3", has the same effect. It is also possible to import modules from any URL using the HTTP(S) protocol.

Once imported, a remote module appears to be in-memory defined.

lsmod(cols = c("name", "storage", "along", "filepath", "url"))
#>                           name   storage              along
#> 1         modulr/vault/example in-memory modulr/vault#0.1.0
#> 2 modulr/vault/example_SECRET_ in-memory modulr/vault#0.1.0
#> 3           modulr/vault#0.1.0 in-memory               <NA>
#> 4      modulr/vault#0.1.0/mock in-memory modulr/vault#0.1.0
#> 5      modulr/vault#0.1.0/test in-memory modulr/vault#0.1.0
#>           filepath                                                   url
#> 1 modulr-vault.Rmd https://gist.github.com/aclemen1/3fcc508cb40ddac6c1e3
#> 2 modulr-vault.Rmd https://gist.github.com/aclemen1/3fcc508cb40ddac6c1e3
#> 3 modulr-vault.Rmd https://gist.github.com/aclemen1/3fcc508cb40ddac6c1e3
#> 4 modulr-vault.Rmd https://gist.github.com/aclemen1/3fcc508cb40ddac6c1e3
#> 5 modulr-vault.Rmd https://gist.github.com/aclemen1/3fcc508cb40ddac6c1e3

To use a remote module as a dependency, just import it where needed (even in a remote module).

"modulr/vault" %imports% "3fcc508cb40ddac6c1e3"

"module/using/a/gear" %requires% list(
  vault = "modulr/vault"
) %provides% {
  vault$decrypt(
    secret = "TWUnCkRAlP70XvmRlnAFrw==",
    key = "EaJWzAZjjphu9CoA+MPUVCL8mmMAGp0j6Nbga29kV/A=")
}
#> [2018-12-02T16:14:42 UTC] Defining 'module/using/a/gear' ... OK

make()
#> [2018-12-02T16:14:42 UTC] Making 'module/using/a/gear' ...
#> [2018-12-02T16:14:42 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:42 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:42 UTC] * Evaluating #1/1 (layer #1/1): 'modulr/vault#0.1.0' ...
#> Loading required package: base64enc
#> [2018-12-02T16:14:42 UTC] DONE ('module/using/a/gear' in 0.084 secs)
#> [1] "passw0rd"

Notice that sharing a module is as easy as sending this one-liner code snippet:

library(modulr); "modulr/vault#^0.1.0" %imports% "3fcc508cb40ddac6c1e3"

Finally, private Gists and GitHub Enterprise users are also covered, thanks to GitHub’s Personal Access Tokens (PAT). For instance, with the GitHub Enterprise instance of the University of Lausanne:

# Set 'GITHUB_PAT' in your '.Renviron' file or right here:
# Sys.setenv(GITHUB_PAT = "Your Personal Access Token here")

"modulr/private_GitHubEnterprise_module" %imports% 
  "https://github.unil.ch/api/v3/gists/1afa4770670975d70806c2153aac50a9"
#> [2018-12-02T16:14:42 UTC] Importing 'modulr/private_GitHubEnterprise_module' from gist ID '1afa4770670975d70806c2153aac50a9' (endpoint 'https://github.unil.ch/api/v3') ...
#> [2018-12-02T16:14:42 UTC] * Found 1 file(s) with R flavour (see https://github.unil.ch/gist/1afa4770670975d70806c2153aac50a9).
#> [2018-12-02T16:14:42 UTC] * Installing gear at '/tmp/RtmpHAFvqv/gears/modulr/private_GitHubEnterprise_module/a09895f039767119'.
#> [2018-12-02T16:14:42 UTC] DONE ('modulr/private_GitHubEnterprise_module')
#> [2018-12-02T16:14:42 UTC] Defining 'modulr/private_GitHubEnterprise_module' ... OK
#> [2018-12-02T16:14:42 UTC] Digest of 'modulr/private_GitHubEnterprise_module' is 'efbb8ee38858aaef'.

Let us assume that the following module is worth publishing:

"modulr/release_gist_example" %provides% "Hello World!"

To release this module as a modulr gear on GitHub’s Gist, simply use release_gear_as_gist:

#> [2018-12-02T16:14:42 UTC] Updating 'modulr-release_gist_example.Rmd' in gist ID '91ffa1950571b67d476131873a8a069b'.

The module is then publicly available here: https://gist.github.com/91ffa1950571b67d476131873a8a069b.

Special variables

.Last.name

When a module is defined, touched, or made, its name is always assigned to .Last.name. The special variable .Last.name is also used as a default parameter for make, touch, and undefine.

"foo" %provides% "bar"
#> [2018-12-02T16:14:44 UTC] Defining 'foo' ... OK

.Last.name
#> [1] "foo"

make()
#> [2018-12-02T16:14:44 UTC] Making 'foo' ...
#> [2018-12-02T16:14:44 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:44 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:44 UTC] DONE ('foo' in 0.014 secs)
#> [1] "bar"

touch()
#> [2018-12-02T16:14:44 UTC] Touching 'foo' ... OK

undefine()
#> [2018-12-02T16:14:44 UTC] Undefining 'foo' ... OK

Module’s metadata

Every module has access to some of its metadata: name, version, file path (when on-disk), etc. The following module illustrates this feature and is self-explanatory.

# File: ./modules/my/great/module/reflection.R

library(modulr)

"my/great/module/reflection#0.1.0" %provides% {
  list(
    .__name__ = .__name__,
    .__namespace__ = .__namespace__,
    .__initials__ = .__initials__,
    .__final__ = .__final__,
    .__version__ = .__version__,
    .__file__ = .__file__,
    .__path__ = .__path__
  )
}

with_verbosity(0L, make("my/great/module/reflection"))
#> $.__name__
#> [1] "my/great/module/reflection#0.1.0"
#> 
#> $.__namespace__
#> [1] "my/great/module/reflection"
#> 
#> $.__initials__
#> [1] "my/great/module"
#> 
#> $.__final__
#> [1] "reflection"
#> 
#> $.__version__
#> [1] '0.1.0'
#> 
#> $.__file__
#>                                                   my/great/module/reflection#0.1.0 
#> "/home/projects/aclemen1/RStudio/modulr/inst/modules/my/great/module/reflection.R" 
#> 
#> $.__path__
#> [1] "/home/projects/aclemen1/RStudio/modulr/inst/modules/my/great/module"
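
The relationship between these values can be sketched in base R. The helper below, split_module_name, is hypothetical and not part of modulr. It assumes a name of the plain form namespace#version and only mirrors the output shown above, without claiming to reproduce modulr's own parsing.

```r
# Hypothetical helper, not part of modulr: decompose a module name of the
# form "some/namespace#version" into the pieces reported above.
split_module_name <- function(name) {
  parts <- strsplit(name, "#", fixed = TRUE)[[1]]
  namespace <- parts[1]
  list(
    namespace = namespace,
    initials = dirname(namespace),  # for a name containing "/": all but the last component
    final = basename(namespace),    # the last component
    version = if (length(parts) > 1L) numeric_version(parts[2]) else NULL
  )
}

pieces <- split_module_name("my/great/module/reflection#0.1.0")
pieces$initials
pieces$final
pieces$version
```

For a name without any "/", dirname returns ".", so this sketch is only meaningful for nested names such as the one used here.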

The special module modulr

The modulr package defines a special module named modulr that can be injected in any module. The purpose of this special module is to give access to useful helper functions related to the module into which it is injected.

info("modulr")

Messages

TODO

Post-evaluation hook

Some situations call for a post-evaluation hook, that is, an expression run right after a module has been evaluated. Typical uses are defining an ephemeral module that can be evaluated only once, or defining a so-called no-scoped module, which looks like a pure singleton but behaves like a prototype: it is re-evaluated every time it is made.

"ephemeral" %requires% list(
  modulr = "modulr"
) %provides% {
  modulr$post_evaluation_hook(undefine("ephemeral"))
  "A butterfly"
}
#> [2018-12-02T16:14:44 UTC] Defining 'ephemeral' ... OK

make("ephemeral") # returns a butterfly
#> [2018-12-02T16:14:44 UTC] Making 'ephemeral' ...
#> [2018-12-02T16:14:44 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:44 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:44 UTC] * Evaluating #1/1 (layer #1/1): 'modulr#0.1.7.9208' ...
#> [2018-12-02T16:14:44 UTC] * Undefining 'ephemeral' ... OK
#> [2018-12-02T16:14:44 UTC] DONE ('ephemeral' in 0.16 secs)
#> [1] "A butterfly"
try(make("ephemeral"), silent = TRUE) # fails: 'ephemeral' is no longer defined
#> [2018-12-02T16:14:44 UTC] Making 'ephemeral' ...
#> [2018-12-02T16:14:44 UTC] * Visiting and defining dependencies ...
cat(geterrmessage())
#> Error : "ephemeral" is not defined.

"no_scoped" %requires% list(
  modulr = "modulr"
) %provides% {
  modulr$post_evaluation_hook(touch("no_scoped"))
  Sys.time()
}
#> [2018-12-02T16:14:44 UTC] Defining 'no_scoped' ... OK

make("no_scoped")
#> [2018-12-02T16:14:44 UTC] Making 'no_scoped' ...
#> [2018-12-02T16:14:44 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:44 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:44 UTC] * Touching 'no_scoped' ... OK
#> [2018-12-02T16:14:44 UTC] DONE ('no_scoped' in 0.18 secs)
#> [1] "2018-12-02 16:14:44 UTC"
Sys.sleep(1L)
make("no_scoped")
#> [2018-12-02T16:14:45 UTC] Making 'no_scoped' ...
#> [2018-12-02T16:14:45 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:46 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:46 UTC] * Touching 'no_scoped' ... OK
#> [2018-12-02T16:14:46 UTC] DONE ('no_scoped' in 0.23 secs)
#> [1] "2018-12-02 16:14:46 UTC"
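
The difference between a singleton and a prototype can be illustrated with a toy cache in base R. This is only an analogy under simplifying assumptions, not modulr's implementation: toy_make, cache, and next_id are hypothetical names, and removing the cached value stands in for touching the module in a post-evaluation hook.

```r
# Toy cache, not modulr's implementation.
cache <- new.env()

toy_make <- function(name, factory, touch_after = FALSE) {
  if (!exists(name, envir = cache, inherits = FALSE)) {
    assign(name, factory(), envir = cache)  # evaluate only when not cached
  }
  value <- get(name, envir = cache, inherits = FALSE)
  if (touch_after) {
    rm(list = name, envir = cache)          # stands in for the hook: forget the result
  }
  value
}

# A factory whose result changes at every evaluation.
next_id <- local({
  n <- 0L
  function() {
    n <<- n + 1L
    n
  }
})

s1 <- toy_make("singleton", next_id)
s2 <- toy_make("singleton", next_id)        # cached: same value as s1
p1 <- toy_make("no_scoped", next_id, TRUE)
p2 <- toy_make("no_scoped", next_id, TRUE)  # re-evaluated: differs from p1
```

The first name keeps returning its cached value, whereas the second one is recomputed on every call because its result is discarded after each evaluation.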

Note that the expression passed to the hook is evaluated in the environment in which the module is used, not in the one where it is defined. Therefore, referring to .__name__ directly inside the hook would not yield the name of the module that registered the hook. The following example works around this difficulty by substituting the module's name into the expression before the hook is registered.

"no_scoped" %requires% list(
  modulr = "modulr"
) %provides% {
  eval(substitute(modulr$post_evaluation_hook(touch(me)), list(me = .__name__)))
  Sys.time()
}
#> [2018-12-02T16:14:46 UTC] Re-defining 'no_scoped' ... OK

make("no_scoped")
#> [2018-12-02T16:14:46 UTC] Making 'no_scoped' ...
#> [2018-12-02T16:14:46 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:46 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:46 UTC] * Touching 'no_scoped' ... OK
#> [2018-12-02T16:14:46 UTC] DONE ('no_scoped' in 0.17 secs)
#> [1] "2018-12-02 16:14:46 UTC"
Sys.sleep(1L)
make("no_scoped")
#> [2018-12-02T16:14:47 UTC] Making 'no_scoped' ...
#> [2018-12-02T16:14:47 UTC] * Visiting and defining dependencies ...
#> [2018-12-02T16:14:47 UTC] * Constructing dependency graph ... OK
#> [2018-12-02T16:14:47 UTC] * Touching 'no_scoped' ... OK
#> [2018-12-02T16:14:47 UTC] DONE ('no_scoped' in 0.19 secs)
#> [1] "2018-12-02 16:14:47 UTC"
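
The same mechanism can be reproduced in base R, independently of modulr. In the sketch below, capture, definer, and user are hypothetical names: capture stores an expression without evaluating it, much as a hook might, and user evaluates it later in a different environment. This is an analogy for the behaviour described above, not a description of how modulr implements its hook.

```r
# Hypothetical stand-in for a hook: keep an expression unevaluated.
capture <- function(expr) substitute(expr)

# "Definition" site: `name` holds the value the hook should refer to.
definer <- function() {
  name <- "inner"
  list(
    # The symbol `name` is stored as is and resolved only when evaluated.
    late = capture(identity(name)),
    # The current value of `name` is spliced into the expression first.
    early = eval(substitute(capture(identity(me)), list(me = name)))
  )
}

hooks <- definer()

# "Use" site: another variable called `name` is in scope here.
user <- function(hook) {
  name <- "outer"
  eval(hook)
}

user(hooks$late)   # resolves `name` at the use site
user(hooks$early)  # carries the value from the definition site
```

The late variant picks up whatever `name` means where the expression is finally evaluated, while the early variant already contains the intended value.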

Scripting

Turning a set of modules that work well together into a script is a common need. It can be handled with the following boilerplate code:

# filepath: ./script.R
"script" %requires% list(
  dep_1 = "dependency_1",
  ...
) %provides% {
  function() {
    # body of the script here
  }
}

if (.__name__ == "__main__") {
  # execute only if sourced/run as a script (à la Python)
  make()()
}

Coding style