class: center, middle, inverse, title-slide

# Reproducibility in R
## Sharing interactive environments with Binder
### Florencia D'Andrea
### 2020-09-03

---
class: center, middle

<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>

---

# Hello!

Here is some code I am working on that I would like to share with you. Let me know if you have any problems.

[Launch the example in Binder](https://mybinder.org/v2/gh/flor14/reproducibilidad_meetup/master?urlpath=rstudio)

--

**Issue 1** Absolute path

--

**Issue 2** Package version: `tidyr` 0.8.3 does not include the `pivot_longer()` and `pivot_wider()` functions, which were added in `tidyr` 1.0.0.

Use `sessionInfo()` to check which version is loaded.

---
class: inverse, center, middle

# Is my work reproducible if I share only the code and the data?

---
class: center, middle, inverse

# Reproducible environments

---

## There are several tools to capture computational environments

* Package management systems (`packrat`, `renv`)
* Binder
* Virtual machines
* Containers

.footnote[[More details in The Turing Way Handbook](https://www.turing.ac.uk/research/research-projects/turing-way-handbook-reproducible-data-science)]

---
class: center

# Package management systems

**`renv` package**

.footnote[[Reproducible Environments - RStudio](https://environments.rstudio.com/)]

---

# `renv` package

1. `renv::init()` creates a new project library. A library stores installed packages.

--

2. `renv::snapshot()` creates a new file in your project titled `renv.lock`. The file contains all the information you need to communicate your project's dependencies at the moment you call snapshot.

--

3. `renv::restore()` recreates the environment!
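
The three steps above could be wrapped roughly as follows. This is a minimal sketch, not the only way to use `renv`: the function names `capture_environment()` and `recreate_environment()` and the `project_dir` argument are hypothetical, introduced only for illustration, and nothing is installed or modified until one of the functions is called.

```r
# Sketch of the renv workflow. Defining these functions has no side effects;
# renv must be installed before either one is called.
# `project_dir` is a hypothetical path to your project folder.

# Author side: create the project library, then record versions in renv.lock
capture_environment <- function(project_dir = ".") {
  renv::init(project = project_dir, restart = FALSE)     # step 1: new project library
  renv::snapshot(project = project_dir, prompt = FALSE)  # step 2: write renv.lock
}

# Collaborator side: reinstall the package versions recorded in renv.lock
recreate_environment <- function(project_dir = ".") {
  renv::restore(project = project_dir, prompt = FALSE)   # step 3: rebuild the library
}
```

The author would typically call `capture_environment()` once and commit `renv.lock`; a collaborator would then call `recreate_environment()` after cloning the project.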
.footnote[[Read more about `renv` here](https://environments.rstudio.com/snapshot.html#pre-requisite-steps)]

---
class: center

# Binder

[Jupyter Notebooks](https://mybinder.org/v2/gh/binder-examples/r/master?filepath=index.ipynb)

[Shiny](https://mybinder.org/v2/gh/flor14/shiny_reproducibilidad_meetup/master?urlpath=shiny/elipse/)

[RStudio](http://mybinder.org/v2/gh/flor14/shiny_reproducibilidad_meetup/master?urlpath=rstudio)

[Tutorial 1 - Ines Montani](https://noamross.github.io/gams-in-r-course/)

[Tutorial 2 - learnr](https://mybinder.org/v2/gh/syoh/learnr-tutorial/master?urlpath=shiny/test1/)

---

# Binder

Binder is an open source web service that lets users create sharable, interactive, reproducible environments in the cloud.

---

.pull-left[
## Advantages

- Easy to use
- You can access the information with one click
- It is free
]

.pull-right[
## Limitations

- Limited computational power
- Security/privacy (when using the `mybinder.org` BinderHub)
- No FTP access for connecting to some data sources
]

---
class: inverse, center, middle

### Binder's goal is to lower the barrier to interactivity, and to allow users to utilize code that is hosted in repository providers such as GitHub

.footnote[[Binder 2.0 - Reproducible, interactive, sharable environments for science at scale](https://pdfs.semanticscholar.org/c043/bef741a9616d1144e0205ac21ceae881485d.pdf)]

---

# mybinder.org

A free, public BinderHub. Because it is public, you should not use it if your project requires personal or confidential information (such as passwords).
---

## "Binderizing" your project

**1-** Specify the computational environment

- **install.R**
- **runtime.txt**

--

**2-** Upload the project files to a publicly available repository hosting service, such as **GitHub / GitLab**

--

**3-** "Binderize" the project (**mybinder.org**)

--

**4-** Use the correct URL

---
class: middle, center, inverse

# Demo

---

## `install.R`

> This file should list all of the packages to be installed

```r
install.packages("ggplot2")
install.packages("shiny")
```

---

# What is MRAN?

Since September 17th, 2014, the checkpoint server has taken a daily snapshot of the entire CRAN repository at midnight UTC and stored it on the [Microsoft R Application Network (MRAN)](https://mran.microsoft.com/documents/rro/reproducibility#reproducibility)

--

> Non-CRAN packages, such as those available on GitHub, are not part of the snapshot process.

.footnote[[MRAN](https://mran.microsoft.com/)]

---

# EXTRA: `checkpoint` package

The `checkpoint` package allows you to install packages as they existed on CRAN on a specific snapshot date, as if you had a CRAN time machine.

```r
library(checkpoint)
checkpoint("YYYY-MM-DD")
```

.footnote[[`checkpoint` package](https://mran.microsoft.com/documents/rro/reproducibility#checkpointpkg)]

---

# `runtime.txt`

> Specify the R and package versions used

For this you must choose a date for which the versions of your packages are captured in MRAN.

**`r-version-<YYYY>-<MM>-<DD>`**

[*READ HERE - Important information about R versions*](https://github.com/binder-examples/r)

---

# `runtime.txt`

```r
r-3.6-2020-08-20
# r-version-<YYYY>-<MM>-<DD>
```

---

# 2. Upload your code to the repository

---

# 3. "Binderize" your project

a. Go to https://mybinder.org

--

b. Paste the repository URL

`https://github.com/<your-username>/<your-repository>`

--

c. Finally, click the `Launch` button.

---
class: center, middle, inverse

# Patience!
This could take a while

---

# RStudio IDE URL

**`?urlpath=rstudio`**

Open the binderized project with a link that follows this template:

`https://mybinder.org/v2/gh/<user>/<repository>/<branch>?urlpath=rstudio`

Example: http://mybinder.org/v2/gh/flor14/shiny_reproducibilidad_meetup/master?urlpath=rstudio

.footnote[[Examples in the Binder repository](https://github.com/binder-examples/r)]

---

# Shiny app URL

**`?urlpath=shiny/<folder>/`**

Open the binderized project with a link that follows this template:

`https://mybinder.org/v2/gh/<user>/<repository>/<branch>?urlpath=shiny/<folder>/`

Example: https://mybinder.org/v2/gh/flor14/shiny_reproducibilidad_meetup/master?urlpath=shiny/elipse/

.footnote[[Examples in the Binder repository](https://github.com/binder-examples/r)]

---

# Tutorials

* [Ines Montani's framework uses Binder](https://github.com/ines/course-starter-r)
* [Interactive Tutorial with learnr and Binder - Sang-Yun Oh blog post](https://syoh.org/learnr-tutorial/)

---

# Others

[Holepunch Package](https://karthik.github.io/holepunch/articles/getting_started.html)

Faster installation: [r-conda](https://github.com/binder-examples/r-conda)

[More info about Binder](https://mybinder.readthedocs.io/en/latest/faq.html)

--

Docker's terms of service change on November 1. A lack of activity for 6 months could leave links inactive. Source: discourse.jupyter.org

---
class: middle, inverse, center

# Practice

---

# Exercise

Could you modify the code from the first exercise to make it work?
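
As a starting hint, the two issues named at the beginning (an absolute path and an old `tidyr` version) could be approached along these lines. This is only a sketch under assumptions: the folder and file names `data` and `my_data.csv` and the helper `has_pivot_functions()` are hypothetical and do not come from the exercise repository.

```r
# Issue 1: build the path relative to the project root instead of
# hard-coding an absolute path. "data" and "my_data.csv" are hypothetical names.
data_path <- file.path("data", "my_data.csv")

# Issue 2: check whether a tidyr version is recent enough.
# pivot_longer() and pivot_wider() were added in tidyr 1.0.0.
has_pivot_functions <- function(installed_version) {
  package_version(installed_version) >= package_version("1.0.0")
}

has_pivot_functions("0.8.3")  # FALSE: too old for pivot_longer()
has_pivot_functions("1.1.2")  # TRUE

# In a live session, pass the installed version instead:
# has_pivot_functions(as.character(packageVersion("tidyr")))
```

If the check fails inside Binder, choosing a later snapshot date in `runtime.txt` is one way to get a newer `tidyr`.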
---
class: inverse

# Links

* [Binder 2.0 - Reproducible, interactive, sharable environments for science at scale](https://pdfs.semanticscholar.org/c043/bef741a9616d1144e0205ac21ceae881485d.pdf)
* [Reproducibility in Production - Webinar](https://rstudio.com/resources/webinars/reproducibility-in-production/)
* [The Turing Way Book](https://the-turing-way.netlify.app/)
* [Reproducible Environments - RStudio](https://environments.rstudio.com/)
* [renv: Project Environments with R - RStudio blog](https://blog.rstudio.com/2019/11/06/renv-project-environments-for-r/)
* [Putting the R into Reproducible Research - Anna Krystalli](https://annakrystalli.me/talks/r-in-repro-research.html#1)
* [Demo renv package](https://environments.rstudio.com/snapshot.html#watch-a-video-demo-of-snapshot-and-restore-with-renv)

---
class: center, middle

# Thank you!

Web [florencia.netlify.app](https://florencia.netlify.app)

Twitter [@cantoflor87](https://twitter.com/cantoflor_87)