Handbook - Problems

In the previous two chapters, we learned the basics about designing programming problems for Online Judge systems (such as Sphere Engine Problems and Spoj) and how Sphere Engine system processes submissions. This chapter is devoted to providing deep insight into programming problem construction.

Here we explore the components of a programming problem before discussing them in greater detail:

  • Description:
    • the task,
    • input and output specification,
    • input and output examples,
  • Test cases (possibly many):
    • input file,
    • output file,
    • time limit,
    • test case judge,
  • Master judge.

Before we begin an individual components analysis, let's take a look at the complete submission flow diagram to embed the listed components in the proper context:

Submission processing diagram
Fig. 1. Submission processing diagram

We will recall the Integer Power problem, as we are going to use it for demonstration purposes. The task described in the problem involves computing the value of ab for given integers a and b.


The user must be able to understand the problem to solve it successfully. Description contains complete information needed to solve the problem. Therefore, it is the most important part from the user's point of view.

The task

The formal text that describes the specificity of the problem usually in a mathematical manner.

Tip: To achieve good description quality, a good habit is to:

  • graphically illustrate the problem,
  • support a dry theory with simple examples.

Referring to the Integer Power problem, we could write the task as follows:

Example (part 1)
Write the program which takes two numbers a and b and returns the value of ab. For example for a = 3 and b = 4 the result is 34 = 81.

Input and output specification

The description of the task is complete in the sense of being able to understand the programming problem. However, automatic verification of the program's correctness is only possible when we can make an assumption on the input and output data specifications. At first, we need to specify the input file structure to make it possible for users to implement proper input reading.

Tip: The specification of the input file should be a piece of information that the user can rely on.

On the other hand, we expect the user to produce an output file with a predictable structure. Formal specification of the output file is information that we are going to rely on to verify the correctness of the solution.

Referring to the Integer Power problem let us recall the first solution to the problem. We didn't specify any formal input or output syntax in the previous section. The first attempt was the solution with the human-readable interface. Afterwards, we modified the code to achieve a simplified version. We can use this source code as a base for the following formal specification:

Example (part 2)
In the only line of the input there will be two integer numbers 1 ≤ a ≤ 8 and 0 ≤ b ≤ 10 separated by a single space character.
Program should write a single number which is a value of ab.

Input and output examples

The section with input and output examples is dedicated to illustrate the input and output files structure. In the best-case scenario, they cover every distinct configuration of parameters (up to numbers, letters, etc.) which is essential for more complex problems. Referring to the Integer Power problem we present how we could compose the examples:

Example (part 3)
3 4

Test case(s)

Just as the description was for users only, the test cases are for the machine checker. This is the essence of the Online Judge (e.g., Sphere Engine Problems) idea.

The vast majority of use cases implement the following schema: there is a model input paired to a model output along with the program, which can compare that model output with the user's output to decide whether the user's answer is correct or not.

Input and output files

The input file contains the data that will be provided to the user's program's input stream. The format of the data must be consistent with the input specification defined as a part of the programming problem description. The output file should contain the corresponding correct answer formatted in accordance with the output specification.

Note: It's a good habit to write a model solution to the programming problem. It allows creating the model output files for test cases and it is useful as a reference for the future.

Referring to the Integer Power problem, we present how we could prepare the following test cases:

Example (part 4)

The following table presents test cases in the form of pairs of an input file and an output file.

Test case Input file Output file
(example data)
3 4 81
2 7 0 1
3 2 10 1024

Tip: It is advised to pack multiple input-output pairs into a single test case instead of making multiple test cases for a single input-output pair in each test case. There are many reasons for that approach. For further information please visit the Good test cases design section of the Related topics.

Time limit

We have already pointed out that one of the features of Online Judge systems (e.g., Sphere Engine Problems) is the possibility of estimating the time complexity. To achieve this, the content manager has to adjust the time limit for program execution. We cover this topic in testing time complexity section of the Related topics chapter.

Example (part 5)
Our toy model example problem is much too simple in assumptions to allow us to present an example of time limits that distinguish different algorithms. Therefore, we set a default time limit of 1 second. In the next section, we present a more advanced example where we further discuss the time limit which can help to estimate algorithm quality.

Test case judge

The test case judge is a program which processes the user's output file after execution. Its task is to establish if the submission passes the test case (we call it a verdict) and potentially also returns the score. When the user's program passes the test case the verdict is accepted.

Usually, the test case judge implementation is reduced to compare the reference output file with the user's output file. We support content managers with default test case judges:

  • Ignoring extra white spaces - compares output files up to extra white characters (spaces, tabulations, etc.),
  • Ignoring floating point errors up to a specific position - works analogously to the judge “Ignoring extra white spaces”, but in addition it allows the floating point numbers to be inaccurate i.e., we can accept the errors up to, for example, 0.01,
  • Exact judge - requires output files to be identical.

There is more information about test case judges available in the Judges chapter. In this chapter you will also find information about the score.

Tip: The Ignoring extra white spaces test case judge is the most popular choice. It is more liberal for output formatting errors, which in fact doesn't affect the solution correctness.

It is possible to create custom test case judges. The content manager can implement any kind of verification when they have full access to the input file, reference output file, user's output file, and even user's source code. For more information visit the implementation of a test case judge section of the Custom Judges chapter.

Example (part 6)

For the Integer Power problem, we decide to use default Ignoring extra white spaces test case judge for each test case. Therefore, we allow the user to generate extra white characters before and after the resulting number ab.

For example when the user's solution prints "     81    " as a result for the 3 4 problem instance, it is still a correct answer.

Master judge

We have discussed the individual test cases for the problem and established that each of them returns information, i.e., verdict and score. The master judge is the component which combines all incoming results obtained from test cases to produce the final result, which is the final verdict and the final score. You can look again at the submission flow diagram for better understanding.

There are predefined master judges proper for most situations:

  • Generic master judge - it gathers information from test case judges and requires each of them to achieve accepted as the verdict to establish the final verdict as accepted,
  • Score is % of correctly solved sets - it is a more liberal master judge which allows accepting an incomplete solution with the final score, which is the percentage of correctly solved test cases.

More information about master judges is available in the master judges section of the Judges chapter. If you need a custom approach for establishing the final result, it is possible to create a new master judge. You have access to the source code of default master judges, so you can use it as a base for modifications.

You can find further information about designing master judges in the implementation of the master judge section of the Custom Judges chapter.

Example (part 7)

The last missing part of the example that we are successively developing is the master judge. We have created three test cases. There is no need to implement the specific own master judge. We, therefore, make our choice from among the available by default.

When we need to distinguish the solutions as better or worst (but both correct, at least partially) we should choose Score is % of correctly solved sets. However, in our situation, each test case verifies the correctness (i.e., no performance or border cases are tested). Therefore, we select Generic master judge to force the user's solution to pass all test cases.

Complete example

We have discussed all components of the problem specification. Therefore, we are able to present the whole programming problem ready to be used in the Online Judge system (e.g., Sphere Engine Problems):

The Integer Power


The task

Write the program which takes two numbers a and b and returns the value of ab. For example for a = 3 and b = 4 the result is 34 = 81.

Input specification

In the only input line there will be two integer numbers 1 ≤ a ≤ 8 and 0 ≤ b ≤ 10 separated by a single space character.

Output specification

Your program should write a single number which is a value of ab.


3 4


Test cases

Test case Input file Output file Judge Time limit
(example data)
3 4 81 Ignoring extra white spaces 1s
2 7 0 1 Ignoring extra white spaces 1s
3 2 10 1024 Ignoring extra white spaces 1s

Master judge

Generic master judge