exclude: true --- class: title_bg .title[ Teaching computing<br/>using git and GitHub ] .conference[ .name[ TLMSCO, Sept 2020 ] .bitly[ [bit.ly/tlmsco_rundel](https://bit.ly/tlmsco_rundel) ] ] .author[ .name[ Dr. Colin Rundel ] .school[ Univ of Edinburgh ] ] --- class: middle, center # Teaching Reproducible Workflows --- ## Reproducible vs Replicable <img src="imgs/leek_repro.jpeg" width="50%" style="display: block; margin: auto;" /> .footnote[ Source: Patil, Peng, Leek (2019) A visual tool for defining reproducibility and replicability. <i>Nature Human Behaviour</i> ] --- ## Reproducibility in practice <br/> - Can you recreate the tables and figures reproducible from the code and data? - Does the code actually do what you think it does? - In addition to what was done, is it clear *why* it was done? (e.g. how were hyper / tuning parameters chosen?) - Can the code be used for other data? - Can you hand the code off to someone else and expect it to work? --- ## Core pieces <img src="imgs/repro_pieces.png" width="50%" style="display: block; margin: auto;" /> --- ## Context * I am the course organizer for Math 11176 - Statistical Programming * Course with ~200 Maths MSC students enrolled * 100% coursework, multiple marked assignments (individual and team based) * For each assignment we distribute: * Instruction document * Template `Rmd` for solutions * Data and other support files * Need to collect: * Completed template `Rmd` * Rendered output (`pdf`, `html`, `md`, etc.) --- ## GitHub Organization * 1 organization / course * Students are added (anonymously) members of the organization * 1 template repository / assignment * 1 private repository / assignment / (team | individual) * Automate the distribution, collection, and feedback using GitHub's API (`ghclass`) --- ## GitHub Organization <img src="imgs/github_org.png" width="100%" style="display: block; margin: auto;" /> --- ## Template Example - hw1 <img src="imgs/github_hw1.png" width="80%" style="display: block; margin: auto;" /> --- ## `ghclass` An R package that enables instructors to automate the management of courses on GitHub. Key features: - Repository creation, mirroring, updating, collecting, etc. - Organization management (members, teams, etc.) - Summary statistics (e.g. commits) by repo or over the org - Many other common tasks (issues, PR, etc.) For more details see the package website - https://rundel.github.io/ghclass/ --- ## Creating a team assignment ```r org_create_assignment( org = "ghclass-demo", repo = c("hw01-team01", "hw01-team01", "hw01-team02", "hw01-team02"), user = c("ghclass-anya", "ghclass-bruno", "ghclass-celine", "ghclass-diego"), team = c("hw01-team01", "hw01-team01", "hw01-team02", "hw01-team02"), source_repo = "statprog-s1-2019/hw1" ) ``` ``` ## ✓ Mirrored repo 'statprog-s1-2019/hw1' to repo 'ghclass-demo/hw01-team01'. ``` ``` ## ✓ Mirrored repo 'statprog-s1-2019/hw1' to repo 'ghclass-demo/hw01-team02'. ``` ``` ## ✓ Created team 'hw01-team01' in org 'ghclass-demo'. ``` ``` ## ✓ Created team 'hw01-team02' in org 'ghclass-demo'. ``` ``` ## ✓ Added user 'ghclass-anya' to team 'hw01-team01'. ``` ``` ## ✓ Added user 'ghclass-bruno' to team 'hw01-team01'. ``` ``` ## ✓ Added user 'ghclass-celine' to team 'hw01-team02'. ``` ``` ## ✓ Added user 'ghclass-diego' to team 'hw01-team02'. ``` ``` ## ✓ Added team 'hw01-team01' to repo 'ghclass-demo/hw01-team01' with 'push' access. ``` ``` ## ✓ Added team 'hw01-team02' to repo 'ghclass-demo/hw01-team02' with 'push' access. ``` --- ## Collecting student work ```r local_repo_clone(repo = org_repos(org = "ghclass-demo", "hw01-"), local_path = "hw01") ``` ``` ## ✓ Cloned 'ghclass-demo/hw01-team01'. ``` ``` ## ✓ Cloned 'ghclass-demo/hw01-team02'. ``` -- <img src="imgs/github_clone.png" width="65%" style="display: block; margin: auto;" /> --- ## Contributor statistics ```r repo_contributors(repo = "statprog-s1-2019/hw02-lab01-team03") %>% mutate(username = LETTERS[1:4]) %>% arrange(desc(commits)) ``` ``` ## # A tibble: 4 x 3 ## repo username commits ## <chr> <chr> <int> ## 1 statprog-s1-2019/hw02-lab01-team03 D 8 ## 2 statprog-s1-2019/hw02-lab01-team03 B 5 ## 3 statprog-s1-2019/hw02-lab01-team03 C 5 ## 4 statprog-s1-2019/hw02-lab01-team03 A 3 ``` ```r repo_contributors(repo = "statprog-s1-2019/hw02-lab01-team10") %>% mutate(username = LETTERS[12+1:5]) %>% arrange(desc(commits)) ``` ``` ## # A tibble: 5 x 3 ## repo username commits ## <chr> <chr> <int> ## 1 statprog-s1-2019/hw02-lab01-team10 Q 17 ## 2 statprog-s1-2019/hw02-lab01-team10 P 9 ## 3 statprog-s1-2019/hw02-lab01-team10 O 5 ## 4 statprog-s1-2019/hw02-lab01-team10 M 1 ## 5 statprog-s1-2019/hw02-lab01-team10 N 1 ``` --- ## Automated feedback <img src="imgs/github_actions0.png" width="100%" style="display: block; margin: auto;" /> --- ## Automated feedback <img src="imgs/github_actions1.png" width="100%" style="display: block; margin: auto;" /> --- ## Automated feedback <img src="imgs/github_actions2.png" width="100%" style="display: block; margin: auto;" /> --- ## Related ongoing work * Peer evaluation (Mine Cetinkaya-Rundel and Therese Anders) * Simplifying the automated feedback process: * `checklist` - R package for simplifying automated checks <br/> https://github.com/rundel/checklist * `parsermd` - R package for programmatic interaction with R markdown documents <br/> https://rundel.github.io/parsermd/ --- ## Additional Resources * [Happy Git and GitHyb for the useR](https://happygitwithr.com/) <br/> Jenny Bryan, Jim Hester * [Excuse me, do you have a moment to talk about version control?](https://peerj.com/preprints/3159/) <br/> Jenny Bryan (2018), *The American Statistician*. * [Using GitHub Classroom To Teach Statistics](https://www.tandfonline.com/doi/full/10.1080/10691898.2019.1617089) <br/> Jacob Fiksel, Leah Jager, Johannna Hardin, and Margaret Taub (2019), <br/> *Journal of Statistics Education*. * [Implementing version control with Git as a learning objective in statistics courses](https://arxiv.org/abs/2001.01988) <br/> Matthew Beckman, Mine Çetinkaya-Rundel, Nicholas Horton, Colin Rundel, Adam Sullivan, Maria Tackett (2020), *Journal of Statistics Education (in review)* * [Teaching Statistics and Data Science Online Workshops](https://centreforstatistics.maths.ed.ac.uk/cfs/events/past-events-and-recordings/2020-events/teaching-statistics-and-data-science-online) <br/> Mine Çetinkaya-Rundel, Colin Rundel (2020), *Centre for Statistics Online Workshop* --- # Thank you! .middle[ .center[ <div style="width: 98%"> <table class="contact" style="text-align: left; margin-left:auto; margin-right:auto; width:50%;"> <tbody> <tr><td><br/></td><td> </td></tr> <tr> <td style="vertical-align: middle;"> <i class="fas fa-envelope fa-fw fa-2x"></i> </td> <td></td> <td> <a href="mailto:rundel@gmail.com">rundel@gmail.com</a> <a href="mailto:colin.rundel@ed.ac.uk">colin.rundel@ed.ac.uk</a> </td> </tr> <tr><td><br/></td></tr> <tr> <td style="vertical-align: middle;"> <i class="fab fa-twitter-square fa-fw fa-2x"></i> </td> <td></td> <td> <a href="https://twitter.com/rundel">@rundel</a> </td> </tr> <tr><td><br/></td></tr> <tr> <td style="vertical-align: middle;"> <i class="far fa-file-powerpoint fa-fw fa-2x"></i> </td> <td></td> <td> <a href="http://bit.ly/tlmsco_rundel">bit.ly/tlmsco_rundel</a> </td> </tr> <tr><td><br/></td></tr> <tr> <td style="vertical-align: middle;"> <i class="fas fa-chalkboard-teacher fa-fw fa-2x"></i> </td> <td></td> <td> <a href="https://statprog-s1-2019.github.io/">statprog-s1-2019.github.io/</a> </td> </tr> </tbody> </table> </div> ] ]