03_HintsAndTests.Rmd
I put a lot of effort into writing sensible automatic tests and hints for RTutor. In most cases problem set tasks without customized hints or tests should yield a decent experience for users.
If no custom hint is provided, RTutor calls automatically either hint.for.assign
(if a variable is assigned), hint.for.call
(function call without assignment), hint.for.function
(if the user generates a function) or hint.for.compute
(for multi-step computations in a compute block).
For an example of an automatic hint, consider a chunk were students shall write a dplyr pipe. The code in the solution file is:
#< task
dat = tibble(a = sample(1:4,100, replace=TRUE), x=runif(100,0,a))
#>
dat %>%
group_by(a) %>%
summarize(mean_x = mean(x)) %>%
arrange(-mean_x)
If the user writes the following wrong code
dat = tibble(a = sample(1:4,100, replace=TRUE), x=runif(100,0,a))
dat %>%
group_by(a) %>%
summarize(mean_x = sum(x)) %>%
arrange(-mean_x)
The test fails with a not very informative message:
Error:
You have not yet entered all correct commands in chunk 1.
For a hint, type hint() in the console and press Enter.
When typing hint()
the automatic hint shows a relatively detailed error analysis:
In your following pipe chain, I detect an error in the 3rd element:
dat %>%
group_by(a) %>%
summarize(mean_x = sum(x)) %>% !! WRONG !!
arrange(-mean_x)
It is correct to call summarize. But: Your argument mean_x = sum(x)
differs from my solution, where I call the function mean.
There are cases were you may want to customize your hints. Sometimes the automatic hints may reveal to much, sometimes they may reveal to little, or sometimes you just want to really polish your problem set and want hints with nicer texts.
There are also (in my experience substantially fewer) cases, in which you want to adapt the automatic correctness tests. We will describe test customization further below.
In general it is hard to predict where and how hints should be customized. If you use your problem sets in a course in which students submit there solution, take a look at the README.md of the companion package RTutorSAGI. It describes how you can analyse log files contained in the submission to detect and improve problematic tasks were many students got stuck. This allows for a nice iterative improvement process for your problem sets.
Consider the following specification in a problem set solution file. The user shall store the mean of a variable x
in the variable m
and then show m
.
m = mean(x)
#< hint
cat("Use Google, e.g. search for 'R compute mean'.")
#>
m
#< hint
cat("Great, you computed m. But read the task exactly, you shall also show it. Just type m in your code.")
#>
The #< hint
blocks describe the code that is run if the user calls hint()
after the expression directly above was not solved correctly.
We have a different hint for each of the two expressions. Such a custom hint block overwrites the automatic hints, which won’t be shown. A motivation to overwrite the first automatic hint is that it makes the solution too simple by already revealing the function name mean
(although not the argument y
). Here the custom hint pushes students to learn the extremely useful method to solve problems when coding: search the internet.
The second custom hint is not really neccessary though, since the automatic hint essentially reveals the same info. We can also set custom hints only for some expressions in a chunk, e.g.
The hints described above always show the same message, independent of the student’s code. Automatic hints already adapt to the student’s input and from RTutor version 2019.07.23 it is also more simple to make custom hints adaptive.
Here is an example, where the task is to assign the first 100 square numbers to the variable z
.
z = 1:100 * 1:100
#< hint
if (true(identical(z,1:10*1:10))) {
cat("Huh, you made a common mistake. Read the task precisely.
You shall assign to z the first 100 square numbers,
not only all square numbers that are less or equal to 100")
} else if (true(length(z)!=100)) {
cat("Your variable z must have 100 elements,
but in your solution z has", length(z),"elements.")
} else {
cat("There is a simple one-line formula to compute the first 100 square numbers.
Just combine what you have learned in exercise 2 f) and in exercise 3 b).
")
}
#>
The code in the hint block will be evaluated in an environment in which all variables defined in earlier solved chunks are known. Also (at least if you use the default tests) the student’s current code has been already be evaluated in the hint environment.
This means whether whether z
exists in the hint environment depends on whether the user has defined it in her solution for the chunk or not. The function true
is a robust version of isTRUE
that never throws an error. This means even if z
does not exist and thus the expression cannot be evaluated, we just get a FALSE
.
There are many situations were you want to keep the automatic hint but just add some additional hint. Then write your hint code in an add_to_hint
block.
Here is an example for an adaptive custom hint in an #< add_to_hint
block from one of my problem sets:
Task: Using the command
cbind
, generate the matrix X of explanatory variable whose first column consists of 1 and the second column consists of p.
X = cbind(1,p)
#< add_to_hint
if (exists("x") & !exists("X")) {
cat("It looks like you assigned the value to 'x' (lowercase), but you shall assign the value to 'X' (uppercase).")
}
#>
Looking at the logs of students’ solution, it became apparent that many mixed up X with x. The automatic hint did not help for this problem, yet in other cases, it was helpful. So I just wanted to add this specific custom hint to the automatic hint.
I tried my best to automatically test whether the student entered a correct solution or not.
A typical reason for adapting the automatic tests or writing custom tests is when you want to allow several correct solutions for a specific task.
Automatic tests either call check.assign
(if a value is assigned to a variable), check.call
(a statement that does not assign a variable), or check.function
(if a function is generated). These test function have a number of arguments, that allow to customize the tests.
A test_arg
blocks allows you to change the arguments of a default test for the preceding statement. Consider the following example:
plot(x=p,y=q,main="A plot", xlab="Prices")
#< test_arg
ignore.arg = c("main","xlab")
allow.extra.arg=TRUE
#>
The #< test_arg
block customizes the parameters ignore.arg
and allow.extra.arg
of the check.call
function. The parameter ignore.arg = c("main","xlab")
means that the student does not have to add these two arguments to the plot function or can use different values. The parameter allow.extra.arg=TRUE
allows the student to specify additional arguments when calling plot, e.g. specifying a ylab
. So essentially, it will now only be tested whether the x and y arguments of the plot are correct and any customization of the plot will still be considered a correct solution. See the help of check.call
for a description of arguments.
Consider the following example
Task: Let x contain the square roots of 4
x = c(-2,2)
#< test_arg
other.sols = list(quote(
x<-c(2,-2)
))
#>
#< add_to_hint
if (true(identical(x,2) | identical(x,-2))) {
cat("Recall that 4 has two square roots: a positive and a negative one.)
}
#>
The argument other.sols
of the default test function check.assign
takes a list of quoted assignments that consitute alternative correct solutions. In the example above, we don’t care about the order of the solution vector. Note that you must assign with <-
instead of =
inside the quote
function.
Consider a data set dat
were we observe the sales quantity q
of all car models for different years and regions. The student has the task to create a new data frame dat1
with an additional column that contains the market share
of each car model in each year and region. We don’t care with which exact commands share
is computed, however.
Below is the specification in the solution file:
dat1 = dat %>%
group_by(year,region) %>%
mutate(
Q = sum(q), # Total sales
share = q / Q # Market share of each car
)
#< test_arg
check.cols = "share"
sort.cols = c("year","market","car_model")
#>
In the sample solution, we compute an intermediate variable Q
, which is not really neccessary to solve the exercise. The test argument check.cols
can specify a subset of columns that will be checked. Here we only want to check the column share
. The argument sort.cols
specifies that both the sample solution and the student’s solution shall be sorted by the given columns. If the sort columns uniquely identify each row, this guarantees that solution is also accepted if the student has ordered the data frame in a different way than the sample solution. (If the student deletes one of the sort columns, the test will return FALSE
.)
While the correctness checks accept many different solutions, the automatic hints when hint()
is typed will guide the student along the concrete given sample solution.
Note that check.cols
and sort.cols
are only used if the sample solution computes a data.frame
or tibble
.
If you need more customization, e.g. because there is a large number of correct solutions, you can use a test
block. Unlike a hint
block you cannot just enter arbitrary code that will be evaluated in an approbriate environment. Instead you should call one or several given test functions like holds.true
, check.variable
, check.expr
, test.H0.rejected
, test.H0
and check.regression
.
Consider the following an example:
#' b) Save in the variable u a vector of 4 different numbers
u = c(3,6,7,99)
#< test
check.variable("u",c(3,6,7,99),values=FALSE)
#>
The automatic test check.assign
would pass if one of the following two conditions is met:
u
has the same value than in the solution. This means u=c(2,5,6,98)+1
would also pass as correct solutionu
is generated by an equivalent call as in the solution (equivalent means the function name should be the same and the arguments should have the same value). This is useful if the solution is a call that generates a random variable like x = runif(1)
.Yet in this example, the automatic test is too restrictive. The student shall just generate some arbitrary vector consisting of 4 numbers. The block
#< test
check.variable("u",c(3,6,7,99),values=FALSE)
#>
replaces the automatic test with a test that just checks whether a variable u
exists, and has the same length and class (numeric or integer) as an example solution c(3,6,7,99)
.
Here is a more complex example for customized tests and hints.
Task: Simulate a vector p of T prices that are correlated (but not perfectly correlated) with the weather w but uncorrelated with the demand shock u.
p = 1.1*c + 0.5*w
#< test
check.variable("p",1.1*c + 0.5*w,values=FALSE)
test.H0.rejected(cor.test(p,w),failure.message="I don't find a significant correlation between p and w (p-value=={{p_value}})")
holds.true(cor(p,w)<1-1e-14, failure.message="p and w shall not be perfectly correlated!")
test.H0(cor.test(p,u),failure.message="I do find a significant correlation between p and u (p-value=={{p_value}}), but they shall be uncorrelated.")
#>
#< hint
display("To make your prices correlated with w, you have to make w appear in the formula of p. To make p not perfectly correlated with p, there must also be other random factors that influence p, like cost c or some newly drawn vector of random price shocks.")
#>
We still provide a sample solution p = 1.1*c + 0.5*w
, but many other solutions are possible here.
A feasible solution must pass all 4 tests in the test block to be accepted. The function holds.true
is a quite general tool that can be used in custom tests.
The functions test.H0.rejected
, and test.H0
are just examples for special purpose convenience test functions.
Note that if you specify a custom test there will be no automatic hint. We thus added a custom hint.
Personally, I find it very valuable if students learn to write own functions. However, it is not easy to nicely test functions and students with little programming experience can get stuck easily.
Assume, you would like students to write an OLS function
ols = function(y,X) {
Xt = t(X)
invXtX = solve(Xt %*% X)
beta.hat = invXtX %*% Xt %*% y
as.numeric(beta.hat)
}
Directly asking them to correctly write such a function will probably fail for many students.
But you could first create one or several chunks, were students first develop the function body for an example. A solved chunk may look like
# Example arguments
x1 = 1:T
y = 100 + 2*x1 + rnorm(1)
X = cbind(1,x1)
# Code that we will put inside the function
Xt = t(X)
invXtX = solve(Xt %*% X)
beta.hat = invXtX %*% Xt %*% y
as.numeric(beta.hat)
Then provide a function stub like
and let students insert their code here.
Here is an example how you might specify the second chunk in your solution file:
#< fill_in
ols = function(y,X) {
# enter code to compute beta.hat here ...
return(as.numeric(beta.hat))
}
#>
ols <- function(y,X) {
beta.hat = solve(t(X) %*% X) %*% t(X) %*% y
as.numeric(beta.hat)
}
#< test_arg
ols(
c(100,50,20,60),
cbind(1,c(20,30,15,20))
)
#>
#< add_to_hint
display("Just insert inside the function ols the code to compute beta.hat from the previous task")
#>
First, we have a #< fill_in
block that specifies an unfinished function that will be shown to the student. Afterward, we have an example of a correct function ols
. Then the #< test_arg
block specifies parameters for the automatic test check.function
. The unnamed parameter
ols(c(100,50,20,60),cbind(1,c(20,30,15,20)))
Specifies a test call. check.function
will run this test call for both the sample solution and the student’s solution. The test will only pass if the both versions of the ols function return the same value. Finally, the #< add_to_hint
add some information to the automatic hint.
If you want to check a function that creates random variables, you can compare the results of the student’s function and the official solution using the same random seed.
Here is an example:
a) Write a function `runif.square` with parameters n, min and max
that generates n random variables that are the square of variables
that are uniform distributed on the interval from min to max.
```{r}
runif.square = function(n,min,max) {
runif(n,min,max)^2
}
#< test_arg
with.random.seed(runif.square(n=20,min=4,max=9), seed=12345)
#>
```
Our test call is now embedded in the function with.random.seed
that calls the function with a fixed random seed. Then the automatically called check.function
only passes if the students function and official solution return the same value when called with the same random seed.
If a function requires simulation of more than one random number, this testing procedure only works if the student draws the random numbers in the same order than the official solution. This means your task should specify already a lot of structure for the function and tell the student not to draw any additional random variables inside the function.
Sometimes you want to ask students to perform computations that will usually require several intermediate steps. Two somewhat opposite ways of implementing such multistep computations in a problem set would be the following:
A #< compute ... #>
block allows an intermediate approach. Here is an example from a problem set of mine that asks a student to compute a matrix of choice probabilities from a conditional logit model.
#< compute P
## Take a look at the formula for the choice probabilities P.nj of the logit discrete choice model in the slides.
## Let exp.V be a matrix that contains the numerators of P.nj (use scale as sigma in your formula)
exp.V = exp(V/scale)
## Let sum.exp.V be a vector that contains for each person n the denominator in P.nj. You can use the function 'rowSums'.
sum.exp.V = rowSums(exp.V)
## Compute P as ratio of the numerator and the denominator
P = exp.V / sum.exp.V
#>#< compute P
## Take a look at the formula for the choice probabilities P.nj of the logit discrete choice model in the slides.
## Let exp.V be a matrix that contains the numerators of P.nj (use scale as sigma in your formula)
exp.V = exp(V/scale)
## Let sum.exp.V be a vector that contains for each person n the denominator in P.nj. You can use the function 'rowSums'.
sum.exp.V = rowSums(exp.V)
## Compute P as ratio of the numerator and the denominator
P = exp.V / sum.exp.V
#>
The solution will pass as correct if the final values of P
are correct. The student is not obliged to perform the particular intermediate computations like exp.V
. Yet, if the student has not yet correctly computed P
and types hint()
, the hint function tries to steer the student step by step through the sample solution described in the block. The comments starting with ##
will be transformed into text that will be shown in the hint.
RTutor is not good at providing sensible hints or automatic tests that allow a user build long ggplot chains from scratch.
Assume you want to show in a problem set the following ggplot:
ggplot(data = counties,aes(x=area, y=population, col=state)) +
geom_point()+
xlim(c(0, 0.1)) +
ylim(c(0, 500000)) +
labs(title="Area vs Population",
y="Population",
x="Area") +
theme_bw()
One option would be to just provide the plotting commands in a task_notest
block and don’t force the student to write anything.
An alternative would be to specify in your solution a fill in block, like:
#< fill_in
# # Fill in the ???
# ggplot(data = counties,aes(x=???, y=population, colour=???)) +
# geom_point()+
# xlim(c(0, 0.1)) +
# ylim(c(0, 500000)) +
# labs(title="Area vs Population",
# y="Population",
# x="Area") +
# theme_bw()
#>
# Sample solution
ggplot(data = counties,aes(x=area, y=population, colour=state)) +
geom_point()+
xlim(c(0, 0.1)) +
ylim(c(0, 500000)) +
labs(title="Area vs Population",
y="Population",
x="Area") +
theme_bw()
#< test_arg
ok.if.same.val = FALSE
#>
Here the student only has to fill in the two mising aesthetics. RTutor accepts the solution only if the student has the exact sample solution, but since only blanks have to be filled in that should be possible.
We put the code in the fill_in
block in comments, since due to the ???
this is not syntactically correct R code and create.ps
would fail otherwise. The student will directly see the code without comments, however:
# Fill in the ???
ggplot(data = counties,aes(x=???, y=population, colour=???)) +
geom_point()+
xlim(c(0, 0.1)) +
ylim(c(0, 500000)) +
labs(title="Area vs Population",
y="Population",
x="Area") +
theme_bw()
The block
is optional. It has the effect that when checking the student’s solution RTutor does not need to evaluate the sample solution or the student’s solution but just compares the arguments of the function call with the student solution. This makes checking the the problem set faster.