Who is this package for?

This package is for everyone! But really, if you’re an instructor who uses GitHub for your class management, e.g. students submit assignments via GitHub repos, this package is definitely for you! The package also assumes that you’re an R user, and you probably teach R as well, though that’s not a requirement since this package is all about setting up repositories with the right permissions, not what your students put in those repositories. (If you’re a Python user, see this post for a Python based alternative.)

What is this vignette about?

This vignette is about the nitty-gritty of setting your class up in GitHub with ghclass. For a higher level discussion of why and how to use Git/GitHub in data science education, see this paper by the package authors.

Structuring your class on GitHub

The general framework is outlined below. This is not the only way to structure a class on GitHub, but it’s a good way, and one that ghclass is optimized to work with.

We outline steps for achieving this structure in the next section. This section is meant to give a high level view of what your course looks like on GitHub.

  • One organization per class: If you teach at a university, this means one semester of a given course. If you teach workshops, this would be one workshop. The instructor and any additional instructional staff, e.g. teaching assistants, are owners. Going forward we will refer to this group of people as “instructors”. The students are members.

  • One repo per student (or team) per assignment: The instructors have admin access to repos, i.e. they can read, clone, push, and add collaborators to assignment repositories as well as delete them. The students have write access to repo, which means that they can read, clone, and push to this repository but they cannot delete them and they cannot add others to them. This can help with minimizing accidents that cannot be undone and makes sure students cannot peek into each others’ repositories unless you explicitly allow them to do so.

If you have a teamwork component to your course, you can also set up teams on GitHub and give them access to repositories for team assignments.

Suppose you have 40 students in your class, and they are in 10 teams of 4 students each. Suppose also that students turn in the following throughout a semester:

  • Individual: 10 homework assignments + 2 exams
  • Teamwork: 8 lab assignments + 1 project

Then, throughout the semester you will need to create total of 570 repositories.

That is a lot of repos to create and set permissions to manually! It’s also a lot of repos to clone when it comes time to grading. ghclass addresses this problem, and more! It does not, however, address the problem that that’s a lot of grading. Sorry, you’re on your own there!

That being said, ghclass does also facilitate setting up continuous integration for students’ assignment repos, which means that some of the checking and feedback giving can be done automatically each time they push to the repo without intervention from the instructors.

Authentication

This package uses tokens for authentication with github, these values can be supplied via environmental variables GITHUB_PAT or GITHUB_TOKEN or saved as text in ~/.github/token.

If this is your first time setting up a personal access token (PAT), generate a token in the browser after logging into Github (Settings > Developer Settings > Personal access tokens) or use usethis::browse_github_token.

You can test that your token is working correctly using the github_test_token() function.

github_test_token()
#> ✔ Your github token is functioning correctly.
github_test_token("bad token")
#> ✖ Your github token failed to authenticate.
#>   GitHub API error (401): 401 Unauthorized
#>   └─ Bad credentials (See https://developer.github.com/v3)

Step-by-step guide

Start with creating an organization on GitHub for the course. We recommend using the course number, semester/quarter, and year in the organization name, e.g. for a course numbered Sta 199 in Spring 18, you can use something like Sta199-Sp18. The exact format is not critical, but being consistent is helpful so that you can keep track of all of your different courses.

Once you have created the course organization you can then go here to apply for GitHub Education discount. This discount will allow you and your organizations to have unlimited private repositories. Once you have been approved as an educator you should be able to see a list of your organizations on the GitHub Education benefits page. Upgrading your organizations should now just be a matter of clicking the upgrade button and answering a couple of quick questions.

All of this is an optional step, but one that many will want to do. GitHub charges for private repositories unless they are used for teaching purposes. Approval is usually pretty quick, but you don’t want to do this the night before classes begin. Give yourself at least a week to be safe.

Adding students to the organization

Next, collect your students’ GitHub user names. You can do this with a Google form and then read the spreadsheet containing their responses into R using the googlesheets package. The resulting data frame should include two columns: one called github which contains students’ GitHub user names and another column called team that contains the team name for each student.

For example, your roster file might look something like the following:

roster = readr::read_csv( system.file("roster.csv", package = "ghclass") )
roster
#> # A tibble: 6 x 5
#>   email              github          hw1        hw2        hw3       
#>   <chr>              <chr>           <chr>      <chr>      <chr>     
#> 1 anya@school.edu    ghclass-anya    hw1-team01 hw2-team01 hw3-team01
#> 2 bruno@school.edu   ghclass-bruno   hw1-team02 hw2-team02 hw3-team02
#> 3 celine@school.edu  ghclass-celine  hw1-team03 hw2-team03 hw3-team03
#> 4 diego@school.edu   ghclass-diego   hw1-team01 hw2-team03 hw3-team02
#> 5 elijah@school.edu  ghclass-elijah  hw1-team02 hw2-team01 hw3-team03
#> 6 francis@school.edu ghclass-francis hw1-team03 hw2-team02 hw3-team01

Using the roster data frame, we can then invite the students to the class’ organzation. This will send an email to each student inviting them to join the GitHub organization.

org_invite("ghclass-vignette", roster$github)
#> ✔ Invited user 'ghclass-anya' to org 'ghclass-vignette'.
#> ✔ Invited user 'ghclass-bruno' to org 'ghclass-vignette'.
#> ✔ Invited user 'ghclass-celine' to org 'ghclass-vignette'.
#> ✔ Invited user 'ghclass-diego' to org 'ghclass-vignette'.
#> ✔ Invited user 'ghclass-elijah' to org 'ghclass-vignette'.
#> ✔ Invited user 'ghclass-francis' to org 'ghclass-vignette'.

We now need to wait for the students to accept these invitations before the will have access to the organization. We can check the status of these acceptances using the org_members() and org_pending() to see which students have accepting and have not accepted the invitation.

org_members("ghclass-vignette")
#> [1] "mine-cetinkaya-rundel" "rundel"                "thereseanders"
org_members("ghclass-vignette", include_admins = FALSE)
#> character(0)
org_pending("ghclass-vignette")
#> [1] "ghclass-anya"    "ghclass-bruno"   "ghclass-diego"   "ghclass-elijah" 
#> [5] "ghclass-francis" "ghclass-celine"

If we give things a bit of time, some of the students will have accepted the invitation.

org_members("ghclass-vignette")
#> [1] "ghclass-anya"          "ghclass-celine"        "ghclass-francis"      
#> [4] "mine-cetinkaya-rundel" "rundel"                "thereseanders"
org_pending("ghclass-vignette")
#> [1] "ghclass-bruno"  "ghclass-diego"  "ghclass-elijah"

We can now see that Anya, Celine, and Francis have accepted the invite and we are still waitin on Bruno, Diego, and Elijah. Gentle prodding and reminder emails are often necessary to get all of the students into the organization.

Creating teams

We will now create teams which will be used for the first two homework assignments in the class. We can either use the hw1 and hw2 columns from the roster data frame directly or we can vectorize the process and add either a common prefix or suffix.

team_create("ghclass-vignette", roster$hw1)
#> ✔ Created team 'hw1-team01' in org 'ghclass-vignette'.
#> ✔ Created team 'hw1-team02' in org 'ghclass-vignette'.
#> ✔ Created team 'hw1-team03' in org 'ghclass-vignette'.
team_create("ghclass-vignette", team = c("team01", "team02", "team03"), prefix = "hw2-")
#> ✔ Created team 'hw2-team01' in org 'ghclass-vignette'.
#> ✔ Created team 'hw2-team02' in org 'ghclass-vignette'.
#> ✔ Created team 'hw2-team03' in org 'ghclass-vignette'.

Once the teams are created we can then add the students to these teams.

team_invite(org = "ghclass-vignette", user = roster$github, team = roster$hw1)
#> ✔ Added 'ghclass-anya' to team 'hw1-team01'.
#> ✔ Added 'ghclass-diego' to team 'hw1-team01'.
#> ✔ Added 'ghclass-bruno' to team 'hw1-team02'.
#> ✔ Added 'ghclass-elijah' to team 'hw1-team02'.
#> ✔ Added 'ghclass-celine' to team 'hw1-team03'.
#> ✔ Added 'ghclass-francis' to team 'hw1-team03'.
team_invite(org = "ghclass-vignette", user = roster$github, team = roster$hw2)
#> ✔ Added 'ghclass-anya' to team 'hw2-team01'.
#> ✔ Added 'ghclass-elijah' to team 'hw2-team01'.
#> ✔ Added 'ghclass-bruno' to team 'hw2-team02'.
#> ✔ Added 'ghclass-francis' to team 'hw2-team02'.
#> ✔ Added 'ghclass-celine' to team 'hw2-team03'.
#> ✔ Added 'ghclass-diego' to team 'hw2-team03'.

Also if we happen to try to invite students to a team that does not exist yet, then helpfully team_invite will create the team for you and then invite the students. This behavior is governed by the create_missing_teams parameter, which is TRUE by default.

team_invite("ghclass-vignette", roster$github, roster$hw3)
#> ✔ Created team 'hw3-team01' in org 'ghclass-vignette'.
#> ✔ Created team 'hw3-team02' in org 'ghclass-vignette'.
#> ✔ Created team 'hw3-team03' in org 'ghclass-vignette'.
#> ✔ Added 'ghclass-anya' to team 'hw3-team01'.
#> ✔ Added 'ghclass-francis' to team 'hw3-team01'.
#> ✔ Added 'ghclass-bruno' to team 'hw3-team02'.
#> ✔ Added 'ghclass-diego' to team 'hw3-team02'.
#> ✔ Added 'ghclass-celine' to team 'hw3-team03'.
#> ✔ Added 'ghclass-elijah' to team 'hw3-team03'.

Creating a team assignment

Now that we have teams within our organization we can now create repos for each team and each assignment as well as add each team to these repositories (which gives then read and write access).

r = repo_create("ghclass-vignette", name = c("team01", "team02","team03"), prefix="hw1-")
#> ✔ Created repo 'ghclass-vignette/hw1-team01'.
#> ✔ Created repo 'ghclass-vignette/hw1-team02'.
#> ✔ Created repo 'ghclass-vignette/hw1-team03'.
repo_add_team(r, roster$hw1)
#> ✔ Added team 'hw1-team01' to repo 'ghclass-vignette/hw1-team01'.
#> ✔ Added team 'hw1-team02' to repo 'ghclass-vignette/hw1-team02'.
#> ✔ Added team 'hw1-team03' to repo 'ghclass-vignette/hw1-team03'.

In the above example we are counting on the fact that the team and repository names are identical, if this were not the case we could do something like the following:

teams = org_teams("ghclass-vignette", filter="hw1-", exclude = TRUE)

repo_create("ghclass-vignette", teams)
#> ✔ Created repo 'ghclass-vignette/hw2-team01'.
#> ✔ Created repo 'ghclass-vignette/hw2-team02'.
#> ✔ Created repo 'ghclass-vignette/hw2-team03'.
#> ✔ Created repo 'ghclass-vignette/hw3-team01'.
#> ✔ Created repo 'ghclass-vignette/hw3-team02'.
#> ✔ Created repo 'ghclass-vignette/hw3-team03'.

repo_add_team(org_repos("ghclass-vignette","hw2-"), roster$hw2)
#> ✔ Added team 'hw2-team01' to repo 'ghclass-vignette/hw2-team01'.
#> ✔ Added team 'hw2-team02' to repo 'ghclass-vignette/hw2-team02'.
#> ✔ Added team 'hw2-team03' to repo 'ghclass-vignette/hw2-team03'.

repo_add_team(org_repos("ghclass-vignette","hw3-"), roster$hw3)
#> ✔ Added team 'hw3-team01' to repo 'ghclass-vignette/hw3-team01'.
#> ✔ Added team 'hw3-team02' to repo 'ghclass-vignette/hw3-team02'.
#> ✔ Added team 'hw3-team03' to repo 'ghclass-vignette/hw3-team03'.

We can check on all of the existing teams and repos using the org_teams and org_repos functions respectively.

org_teams("ghclass-vignette")
#> [1] "hw1-team01" "hw1-team02" "hw1-team03" "hw2-team01" "hw2-team02"
#> [6] "hw2-team03" "hw3-team01" "hw3-team02" "hw3-team03"
org_repos("ghclass-vignette")
#> [1] "ghclass-vignette/hw1-team01" "ghclass-vignette/hw1-team02"
#> [3] "ghclass-vignette/hw1-team03" "ghclass-vignette/hw2-team01"
#> [5] "ghclass-vignette/hw2-team02" "ghclass-vignette/hw2-team03"
#> [7] "ghclass-vignette/hw3-team01" "ghclass-vignette/hw3-team02"
#> [9] "ghclass-vignette/hw3-team03"

If we want to place starter docs / code into our assignment repos we can do this by either adding the files directly or mirroring (copying) and existing repository on top of our assignment repo. Generally we have found it to be a good idea to create a public template repository for each assignment which is then used to populate each team individual repository. In this case we will borrow assignments from a class we’ve previously taught.

repo_mirror(
  source_repo = "Sta323-Sp19/hw1",
  target_repo = org_repos("ghclass-vignette","hw1-")
)
#> ✔ Cloned 'Sta323-Sp19/hw1'.
#> ✔ Pushed from 'hw1' to 'https://github.com/ghclass-vignette/hw1-team01.git/master'.
#> ✔ Pushed from 'hw1' to 'https://github.com/ghclass-vignette/hw1-team02.git/master'.
#> ✔ Pushed from 'hw1' to 'https://github.com/ghclass-vignette/hw1-team03.git/master'.
#> ✔ Removed local copy of 'Sta323-Sp19/hw1'

Creating an individual assignment

The process for creating an individual assignment is very similar to that for creating a team assignment, we will need to create repositories for each student and then add them individually to those repos.

students = org_members("ghclass-vignette", include_admins = FALSE)
repo_create("ghclass-vignette", students, prefix="hw4-")
#> ✔ Created repo 'ghclass-vignette/hw4-ghclass-anya'.
#> ✔ Created repo 'ghclass-vignette/hw4-ghclass-bruno'.
#> ✔ Created repo 'ghclass-vignette/hw4-ghclass-celine'.
#> ✔ Created repo 'ghclass-vignette/hw4-ghclass-diego'.
#> ✔ Created repo 'ghclass-vignette/hw4-ghclass-elijah'.
#> ✔ Created repo 'ghclass-vignette/hw4-ghclass-francis'.

Adding students individually to repos is done using the repo_add_user instead of the repo_add_team function.

repo_add_user(org_repos("ghclass-vignette", "hw4-"), students)
#> ✔ Added user 'ghclass-anya' to repo 'ghclass-vignette/hw4-ghclass-anya'.
#> ✔ Added user 'ghclass-bruno' to repo 'ghclass-vignette/hw4-ghclass-bruno'.
#> ✔ Added user 'ghclass-celine' to repo 'ghclass-vignette/hw4-ghclass-celine'.
#> ✔ Added user 'ghclass-diego' to repo 'ghclass-vignette/hw4-ghclass-diego'.
#> ✔ Added user 'ghclass-elijah' to repo 'ghclass-vignette/hw4-ghclass-elijah'.
#> ✔ Added user 'ghclass-francis' to repo 'ghclass-vignette/hw4-ghclass-francis'.

And once again we can provide template code and documentation using the repo_mirror function.

repo_mirror(
  source_repo = "Sta323-Sp19/hw4",
  target_repo = org_repos("ghclass-vignette","hw4-")
)
#> ✔ Cloned 'Sta323-Sp19/hw4'.
#> ✔ Pushed from 'hw4' to 'https://github.com/ghclass-vignette/hw4-ghclass-anya.git/master'.
#> ✔ Pushed from 'hw4' to 'https://github.com/ghclass-vignette/hw4-ghclass-bruno.git/master'.
#> ✔ Pushed from 'hw4' to 'https://github.com/ghclass-vignette/hw4-ghclass-celine.git/master'.
#> ✔ Pushed from 'hw4' to 'https://github.com/ghclass-vignette/hw4-ghclass-diego.git/master'.
#> ✔ Pushed from 'hw4' to 'https://github.com/ghclass-vignette/hw4-ghclass-elijah.git/master'.
#> ✔ Pushed from 'hw4' to 'https://github.com/ghclass-vignette/hw4-ghclass-francis.git/master'.
#> ✔ Removed local copy of 'Sta323-Sp19/hw4'

Modifying Repos

Mirroring repos is somewhat heavy handed, since it forces the target repo to be identical to the source repo. In some cases we only want to add or modify a single file in the repository.

You can modify repos after they have been created. This will overwrite existing files with the same name in the repo, so you should be careful not to do this if students have already started working on the repos.

file = system.file("README.md", package = "ghclass")
file
#> [1] "/private/var/folders/kx/pvtbsc010kn6hbch7hm5yv040000gn/T/RtmpQFVZUe/temp_libpathbf33ba4ff01/ghclass/README.md"

repo_add_file(
  org_repos("ghclass-vignette","hw1-"),
  message = "Replace README.md with the correct version",
  file = file
)
#> ✖ Failed to add file 'README.md' to repo 'ghclass-vignette/hw1-team01', this file already exists.
#>   File has been modified after initial commit.
#>   If you want to force add this file, re-run the command with `overwrite = TRUE`.
#> ✖ Failed to add file 'README.md' to repo 'ghclass-vignette/hw1-team02', this file already exists.
#>   File has been modified after initial commit.
#>   If you want to force add this file, re-run the command with `overwrite = TRUE`.
#> ✖ Failed to add file 'README.md' to repo 'ghclass-vignette/hw1-team03', this file already exists.
#>   File has been modified after initial commit.
#>   If you want to force add this file, re-run the command with `overwrite = TRUE`.
repo_add_file(
  org_repos("ghclass-vignette","hw1-"),
  message = "Replace README.md with the correct version",
  file = file,
  overwrite = TRUE
)
#> ✔ Added file 'README.md' to repo 'ghclass-vignette/hw1-team01'.
#> ✔ Added file 'README.md' to repo 'ghclass-vignette/hw1-team02'.
#> ✔ Added file 'README.md' to repo 'ghclass-vignette/hw1-team03'.
cat( repo_get_readme("ghclass-vignette/hw1-team01") )
#> ## Homework 01
#> 
#> This is the corrected version of the HW01 Readme

We can also use the function repo_modify_file to make changes to existing files,

repo_modify_file(
  repo = org_repos("ghclass-vignette","hw1-"),
  path = "README.md",
  pattern = "Due by 11:59 pm on Thursday 1/24/2019.\n",
  content = "Due: Tomorrow\n",
  method = "replace"
)
#> ✖ Unable to find pattern '(Due by 11:59 pm on Thursday 1/24/2019.\n)' in 'ghclass-vignette/hw1-team01/README.md'.
#> ✖ Unable to find pattern '(Due by 11:59 pm on Thursday 1/24/2019.\n)' in 'ghclass-vignette/hw1-team02/README.md'.
#> ✖ Unable to find pattern '(Due by 11:59 pm on Thursday 1/24/2019.\n)' in 'ghclass-vignette/hw1-team03/README.md'.
cat( repo_get_readme("ghclass-vignette/hw1-team01") )
#> ## Homework 01
#> 
#> This is the corrected version of the HW01 Readme

Collecting Student Work

Eventually the students will be finished with the work and of the assignment deadline will have passed. ghclass makes it easy to collect all of the student work off of GitHub and make it accessible on your local computer for grading.

local_repo_clone(
  repo = org_repos("ghclass-vignette", "hw1-"),
  local_path = "hw1"
)
#> ✔ Cloned 'ghclass-vignette/hw1-team01'.
#> ✔ Cloned 'ghclass-vignette/hw1-team02'.
#> ✔ Cloned 'ghclass-vignette/hw1-team03'.

fs::dir_tree("hw1/.")
#> hw1/.
#> ├── hw1-team01
#> │   ├── README.md
#> │   ├── fizzbuzz.png
#> │   ├── hw1.Rmd
#> │   ├── hw1.Rproj
#> │   ├── hw1_whitelist.R
#> │   └── wercker.yml
#> ├── hw1-team02
#> │   ├── README.md
#> │   ├── fizzbuzz.png
#> │   ├── hw1.Rmd
#> │   ├── hw1.Rproj
#> │   ├── hw1_whitelist.R
#> │   └── wercker.yml
#> └── hw1-team03
#>     ├── README.md
#>     ├── fizzbuzz.png
#>     ├── hw1.Rmd
#>     ├── hw1.Rproj
#>     ├── hw1_whitelist.R
#>     └── wercker.yml

Other Suggestions

Managing repository permissions

Individual-level permissions can be set via the “People” tab on the organization page. We recommend the course instructor to be the owner of the organization and teaching assistants to receive admin privileges. Students should receive member privileges.

Github allows further permissions for accessing and changing repositories to be set for each individual member or at the organization-level (under Settings > Member Privileges). We suggest the organization-level settings below.

Member repository permissions

  • Base permissions: None
  • Repository creation: Disabled
  • Repository forking: Disabled

Admin repository permissions

  • Repository visibility change: Disabled
  • Repository deletion and transfer: Disabled
  • Issue deletion: Disabled

Member team permissions

  • Allow members to create teams: Disabled

FAQ

  1. Do I really need private repositories for my students’ assignments? I don’t care if they see each others’ work.

You might not care, but the law might. For example, in the United States, FERPA regulations stipulate that student information should be kept private. If you use public repositories, anyone can find out who is enrolled in your course. Additionally, you will likely be using GitHub issues for providing feedback on the students’ work, and potentially even mention their grade in a given assignment. This information should not be publicly available to anyone.

Also, your students may not want their coursework to be publicly available. They are bound to make mistakes as they learn and it should be up to them whether they want those to be a piece of their public profile on GitHub.

  1. Why not use GitHub Classroom?

At some level this is a matter of preference, but there are a few functionalities here that are not present in GitHub Classroom:

  • Pre-defined teams – as opposed to relying on students to pick their team when creating their assignment repo.
  • Command-line interface – if you like writing R code to solve your problems this may be a better fit for you as it provides a greater level of control and more flexibility.
  • Actually you don’t have to choose between ghclass and GitHub Classroom, your workflow might involve using both.