For many times, I found that testing a solution could be equally hard as writing a solution, if not harder. Here is an example: how to randomly generate test cases for a HasCycle() function.
HasCycle() is a coding question that I used for many times in the past: write a function to detect whether there is a cycle in a directed graph. Some candidates finished the code pretty quickly, so my follow-up question would be "how would you test it". For those who manually enumerated test cases quickly and thoroughly, my next question was "how would you randomly generate test cases".
A wrong answer would be to write the solution again, say HasCycle2(), and see if HasCycle() produces the same results as HasCycle2() does for any randomly generated directed graph. That's a wrong method because it's circular proof: using this method requires the proof of HasCycle2() being fully correct, which is the same as to prove HasCycle() being fully correct. Someone argued that he could write HasCycle2() in a different algorithm to make it not a circular proof. No, that would be still a circular proof, just in a disguised form.
There wasn't a satisfactory answer by any candidate, as far as I remembered. To be honest, I didn't know the answer in the first place, either. It wasn't about the answer itself, anyway. It was mainly to figure out how the candidate approaches problems. As long as the reasoning is sound, it would be cool if the candidate told me "no, there is no way, because blah blah blah". But soon I started to think: to be fair, maybe I should figure that out myself first, to make sure either there is an answer, or the answer is "no, there isn't an answer".
I did come up with a way to randomly generate test cases for HasCycle(). Here are the steps:
The crux is how to identify which red arcs to add in step (4). Here is how:
Use an example to illustrate how this works:
First, randomly pick N=11, to generate a 11-nodes directed graph with no arcs:
Add the first blue arc: A and K are randomly picked. A blue arc A->K is added. At the same time, a red arc K->A is added. K->A is red because it will form a cycle together with a blue arc A->K:
Add the second blue arc randomly: B->A. This time, not only A->B should be added as a red arc (because A->B and B->A together will be a cycle), but also K->B should be added ad red arc as well, because B->A, A->K and K->B will be a cycle:
Keep doing this. When G->H is added as blue arc, three red arcs need to be added: H->G, H->A, H->B. Because each of them will form a cycle with other blue arc(s):
Now let's examine a more general example of how to add red arcs. This time, a blue arc G->F is randomly added:
In this case, from-Y is F, C, E and to-X is G, A, B. So we should add nice arcs: F->G, F->A, F->B, C->G, C->A, C->B, E->G, E->A, E->B:
So the outcome of step (5) is this:
Now, if we want a random graph without cycle, we just need to remove all the red arcs:
If we want a random graph with cycle, we will randomly convert one red arc to blue and remove all the rest red arcs. For example, we convert C->A to blue. That will give us a cycle A->G->F->C->A:
I believe this method doesn't have any blind spot. Given enough time, it should be able to generate all possible kinds of directed graph, with 0, 1 or multiple cycles.