At CBRD, much of our research work runs on remote machines. For various reasons, we like being able to spin these boxes up and wind them down at will, and to auto-configure them at short notice according to a few standard variables. Depending on the installation options chosen, the result is a fully set-up box with all the features we want, built around R and Python, with RStudio and JupyterHub as front ends.
Where StackScripts fit in
There is a wide range of ways to configure boxes (Puppet, Chef cookbooks, Terraform, Dockerfiles and the like), but for ease of use we rely on simple shell files that can be run as Linode StackScripts.
StackScripts are an extremely convenient way to configure a single Linode with the software you need. Unlike more complex systems like Terraform,
The Ares research node generator StackScript does a handful of things:
- Updates the system and adds the CRAN repo as a source
- Installs R and the RStudio version of choice
- Installs Python and the JupyterHub version of choice
- Installs an opinionated set of system-level packages (i.e. available to all users)
- Configures ports and some other settings for the instance
- Creates the root user
- Daemonises RStudio Server and JupyterHub so that they restart automatically on failure and start automatically at reboot
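To make the last step more concrete, here is a minimal sketch of how a service can be told to restart on failure and start at boot on a systemd-based box such as Ubuntu 16.04. It is not the actual Ares code: the unit name, the JupyterHub binary path and the config path are assumptions for illustration, and the demo writes to a temporary directory so it can be run without root.

```shell
#!/bin/bash
# Sketch only: unit name, binary path and config path are illustrative assumptions.

# Write a systemd unit for JupyterHub into the directory given as $1.
write_jupyterhub_unit() {
    mkdir -p "$1"
    cat > "$1/jupyterhub.service" <<'EOF'
[Unit]
Description=JupyterHub
After=network.target

[Service]
ExecStart=/usr/local/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
# Bring the service back up if it exits with an error.
Restart=on-failure
RestartSec=5

[Install]
# Pulled in at normal multi-user boot once the unit is enabled.
WantedBy=multi-user.target
EOF
}

# Demo: write into a throwaway directory unless UNIT_DIR is set.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
write_jupyterhub_unit "$UNIT_DIR"
echo "Wrote $UNIT_DIR/jupyterhub.service"

# On a real box you would use UNIT_DIR=/etc/systemd/system and then run:
#   systemctl daemon-reload
#   systemctl enable --now jupyterhub.service
```

RStudio Server normally ships with its own service definition, so a unit like this is mostly relevant to JupyterHub.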
When deployed on Linode as a StackScript, the generator offers a wide range of configuration options, including the ports for both JupyterHub and RStudio, and a fully configured first user set up on both JupyterHub and RStudio. You can also configure some install settings. A ‘barebones’ install option installs only a minimum set of packages, which is useful for testing or if the desired configuration diverges from the ordinary structure. In addition, OpenCV, deep learning tools and cartography tools can be selectively enabled or disabled, as these are not always required.
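For readers who have not written a StackScript before, the sketch below shows roughly how such options can be declared and consumed. StackScripts declare user-defined fields (UDFs) in comment tags at the top of the script, and Linode passes each one to the script as an upper-cased environment variable. The field names here (rstudio_port, jupyter_port, barebones) are hypothetical and are not necessarily the ones Ares uses; 8787 and 8000 are simply the stock default ports of RStudio Server and JupyterHub.

```shell
#!/bin/bash
# Hypothetical UDF names, for illustration only.
# <UDF name="rstudio_port" label="RStudio Server port" default="8787" />
# <UDF name="jupyter_port" label="JupyterHub port" default="8000" />
# <UDF name="barebones" label="Barebones install?" oneof="yes,no" default="no" />

# Fall back to the defaults when the script is run outside Linode.
RSTUDIO_PORT="${RSTUDIO_PORT:-8787}"
JUPYTER_PORT="${JUPYTER_PORT:-8000}"
BAREBONES="${BAREBONES:-no}"

# Succeeds only for a whole number between 1 and 65535.
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Refuse to continue with a port that cannot work.
for port in "$RSTUDIO_PORT" "$JUPYTER_PORT"; do
    if ! is_valid_port "$port"; then
        echo "Invalid port: $port" >&2
        exit 1
    fi
done

# Pick the package set from the barebones switch.
if [ "$BAREBONES" = "yes" ]; then
    PACKAGE_SET="minimal"
else
    PACKAGE_SET="full"
fi

echo "RStudio on :$RSTUDIO_PORT, JupyterHub on :$JUPYTER_PORT, package set: $PACKAGE_SET"
```

Validating the values early means a typo in the deployment form fails fast, rather than halfway through a long install.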
User administration for JupyterHub and RStudio
In general, user administration is tied by default to PAM, i.e. the built-in Linux authentication structure. JupyterHub has its own administration features, described here. RStudio, on the other hand, authenticates by user group membership. The two share the same user group, specified in the configuration (by default and convention this is jupyter, but you can change it), and because users created by JupyterHub fall into that user group, creating users in JupyterHub automatically grants them access to RStudio. This is generally acceptable, as we tend to use both, but it may be a security concern. If so, you can change the auth-required-user-group=$USERGROUPNAME setting to a defined user group in the
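As a rough illustration of changing that setting from a script, the sketch below sets or replaces the auth-required-user-group line in a config file. It assumes the setting lives in RStudio Server's usual configuration file, /etc/rstudio/rserver.conf; the function name and the group names are made up for the example, and the demo works on a temporary file rather than the real one.

```shell
#!/bin/bash
# Sketch only: set_rstudio_group and the group names are hypothetical.

# Ensure that config file $1 contains exactly one
# auth-required-user-group line, set to group $2.
set_rstudio_group() {
    touch "$1"
    tmp_conf="$(mktemp)"
    # Keep every line except an existing auth-required-user-group entry.
    grep -v '^auth-required-user-group=' "$1" > "$tmp_conf" || true
    echo "auth-required-user-group=$2" >> "$tmp_conf"
    # Copy the contents back so the original file keeps its permissions.
    cat "$tmp_conf" > "$1"
    rm -f "$tmp_conf"
}

# Demo on a temporary file standing in for /etc/rstudio/rserver.conf.
CONF="$(mktemp)"
echo "www-port=8787" > "$CONF"
set_rstudio_group "$CONF" jupyter
set_rstudio_group "$CONF" rstudio-only   # the second call replaces the first value
cat "$CONF"
```

On a real box the group has to exist before users can be added to it, and RStudio Server needs a restart to pick up the change.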
There are some glitches that we’re trying to iron out:
- Cartography and GIS tools glitch a little due to issues with PROJ.4.
- GPU/CUDA support is not implemented as this is not customarily used or provided on Linodes
- Let's Encrypt is not really supported yet, as our boxes are never directly public-facing, but you should most definitely find a way to put your server behind some form of SSL/TLS.
- Currently, only Ubuntu 16.04 LTS is supported, as it’s the most widely used version. CRAN does not yet support more recent versions, but these will be added once CRAN support is available.
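Given the Ubuntu 16.04 restriction above, a script like this could check the release before doing anything else. The following is a minimal sketch of such a guard, not part of the published script: it reads the ID and VERSION_ID fields from an os-release file, and the function name is hypothetical. The demo runs against two fake os-release files so it behaves the same on any machine.

```shell
#!/bin/bash
# Sketch only: is_supported_release is a hypothetical helper.

# Succeeds if the os-release file at $1 (default /etc/os-release)
# describes Ubuntu 16.04.
is_supported_release() {
    os_release_file="${1:-/etc/os-release}"
    [ -r "$os_release_file" ] || return 1
    # Source the file in a subshell so its variables do not leak out.
    id_and_version="$( . "$os_release_file" && echo "${ID}:${VERSION_ID}" )"
    [ "$id_and_version" = "ubuntu:16.04" ]
}

# Demo with fake os-release files.
DEMO_DIR="$(mktemp -d)"
printf 'ID=ubuntu\nVERSION_ID="16.04"\n' > "$DEMO_DIR/xenial"
printf 'ID=ubuntu\nVERSION_ID="18.04"\n' > "$DEMO_DIR/bionic"

for release_file in "$DEMO_DIR/xenial" "$DEMO_DIR/bionic"; do
    if is_supported_release "$release_file"; then
        echo "$(basename "$release_file"): supported"
    else
        echo "$(basename "$release_file"): not supported"
    fi
done
```

In a real deployment you would call is_supported_release with no argument at the top of the script and exit with a clear message if it fails.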
As always, do let me know how things work out for you! Feel free to leave comments and ideas below. If you’re getting value out of using this script, just let me know.