This section describes how to create CUDA-based leaderboards, i.e., leaderboards that expect CUDA submissions. To create leaderboards on a Discord server,
the Discord bot expects you to have a Leaderboard Admin or Leaderboard Creator role. These can be
assigned by admins / owners of the server. Nevertheless, this section is also useful for participants
to understand how their submissions are evaluated.
As we’ve mentioned before, each leaderboard specifies a number of GPUs to evaluate on, of the creator’s
choosing. You can think of each (task, GPU) pair as essentially having its own independent
leaderboard, since, for example, a softmax kernel on an NVIDIA T4 may perform very differently than on an NVIDIA H100. We
give leaderboard creators the option to select which GPUs they care about for their leaderboards –
for example, they may only care about NVIDIA A100 and NVIDIA H100 performance for their leaderboard.
To create a leaderboard, you can run a command of the following form (the task_zip argument is the zipped task folder described below):
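<center>
```
/leaderboard create {leaderboard_name: str} {task_zip: .zip file}
```
</center>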
After running this, similar to leaderboard submissions, a UI window will pop up asking which GPUs
the leaderboard creator wants to enable submissions on. In the remainder of this section, we detail how
the unzipped task_zip folder should be structured. Examples of these folders can be found here.
## task.yml specification

When a user submits a kernel, it is launched inside of a leaderboard-specific evaluation harness, and we provide
several copy-able examples of a leaderboard folder in our GitHub.
The relevant files are defined in a task.yml – for example, in the identity-cuda leaderboard, the YAML looks as follows:
```yaml title="task.yml"
files:
  - {"name": "eval.cu", "source": "eval.cu"}
  - {"name": "task.h", "source": "task.h"}
  - {"name": "utils.h", "source": "utils.h"}
  - {"name": "reference.cuh", "source": "reference.cuh"}
  - {"name": "submission.cu", "source": "@SUBMISSION@"}
lang: "cu"
description: Identity kernel in CUDA.
config:
  # task provided source files to compile
  sources: ["eval.cu", "submission.cu"]
  # additional include directories
  include_dirs: []
templates:
  CUDA: "template.cu"
tests:
  # test cases, shown below
benchmarks:
  # benchmark cases, shown below
```
This config file controls all relevant details about how participants will interact with the leaderboard. We will discuss each parameter in detail. Some of the simpler keys are:

- `lang` controls the language of the leaderboard (`py` or `cu`).
- `config.sources` controls which files are compiled as source files. Most leaderboards will just use `eval.cu` and `submission.cu`.
- `config.include_dirs` specifies extra include directories, e.g. for header-only libraries. By default, this includes ThunderKittens.
- `templates` is an optional way to provide users with an example template for a kernel submission.

## .zip structure

Other than task.yml, the `files` key controls the list of files that the evaluation harness expects. The leaderboard
creator has to include all of these files, but we provide examples to make this a lot easier. The
`name` key is how each file is imported locally, and the `source` key is the name of the actual file in the folder.
- `submission.cu`: This is a special key-value pair that denotes the user-submitted kernel (it should not exist in the .zip).
- `task.h` ⭐: Specifies constants and the input / output types (e.g. arguments) that the leaderboard kernel should expect.
- `utils.h`: Some extra utilities that can be used for leaderboard logic. We also include a `TestReporter` class that is used for reporting kernel correctness.
- `reference.cuh` ⭐: Leaderboard-specific logic for generating input data, the reference kernel, and correctness logic to compare user and reference kernel outputs.
- `eval.cu`: Runs the user and reference kernels to check correctness, then measures the runtime of the user kernel if it passes the correctness checks. Usually does not need to be edited.

In short, most leaderboard creators will only have to edit `task.h` and `reference.cuh`; we will go over how to edit these in more detail below. For intuition, a sketch of a matching user submission is shown next.
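Below is a minimal sketch of what a user submission (or the `template.cu` handed to users) might look like for the identity-cuda leaderboard. This assumes our example `eval.cu`, which expects the submission to define a `custom_kernel` function mapping `input_t` to `output_t`:

```cpp
// Minimal sketch of a submission for identity-cuda (assumes our example
// eval.cu, which calls a function named custom_kernel).
#include "task.h"

output_t custom_kernel(input_t data) {
    // A real task would launch a CUDA kernel here; the identity task
    // only requires returning the input unchanged.
    return data;
}
```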
## task.h and reference.cuh example

To keep this simple, a leaderboard creator really only needs to specify:

1. the input / output types of the kernel task,
2. an input data generator,
3. a reference kernel, and
4. a correctness checker that compares the user and reference kernel outputs.
We recommend following our examples for simplicity, but our task definition allows leaderboard creators to fully modify their evaluation harness. In the remaining sections, we will go over how to use our pre-defined examples.
In all of our examples, the task.h file handles (1) and part of (2), while the reference.cuh file handles (2,3,4). Below, we provide the task.h for the identity-cuda leaderboard.
```cpp title="task.h"
#ifndef POPCORN_TASK_H
#define POPCORN_TASK_H
#include <array>
#include <vector>

using input_t = std::vector<float>;
using output_t = input_t;

constexpr std::array<const char*, 2> ArgumentNames = {"seed", "size"};
#endif
```
The example above specifies aliases for the input (`input_t`) and output (`output_t`) types of the kernel task. It also specifies
an array of strings called `ArgumentNames`, which specifies **what arguments are passed into the input data generator** at runtime. We distinguish
between `tests` cases and `benchmarks` cases, the former being used to check correctness (and to help users
debug their code) and the latter being the cases that are timed for the leaderboard rankings. Using the `ArgumentNames` specification, we provide test cases in the `task.yml` and fill in the arguments by name, as shown below:
```yaml title="task.yml"
...
tests:
- {"size": 128, "seed": 5236}
- {"size": 129, "seed": 1001}
- {"size": 256, "seed": 5531}
benchmarks:
- {"size": 1024, "seed": 54352}
- {"size": 4096, "seed": 6256}
- {"size": 16384, "seed": 6252}
- {"size": 65536, "seed": 125432}
Finally, we fill in details for the input data generator, reference kernel, and correctness checker for identity-cuda below:
```cpp title="reference.cuh"
#ifndef REFERENCE_CUH
#define REFERENCE_CUH
#include <cmath>
#include <random>
#include "task.h"
#include "utils.h"

// Input data generator. Arguments must match ArgumentNames in task.h
static input_t generate_input(int seed, int size) {
    std::mt19937 rng(seed);
    input_t data;
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    data.resize(size);
    for (int j = 0; j < size; ++j) { data[j] = dist(rng); }
    return data;
}

// Reference kernel. Must take input_t and produce output_t
static output_t ref_kernel(input_t data) {
    return (output_t) data;
}

// Compare the user kernel's output against the reference kernel's output
static void check_implementation(TestReporter& reporter, input_t data, output_t user_out, float epsilon = 1e-5) {
    output_t ref = ref_kernel(data);
    if (user_out.size() != ref.size()) {
        if (!reporter.check_equal("size mismatch", user_out.size(), ref.size())) return;
    }
    for (int j = 0; j < ref.size(); ++j) {
        if (std::fabs(ref[j] - user_out[j]) > epsilon) {
            reporter.fail() << "error at " << j << ": " << ref[j] << " " << std::to_string(user_out[j]);
            return;
        }
    }
    reporter.pass();
}
#endif
```
As mentioned earlier, based on `task.yml` and `task.h`, each test case will pass a specified set
of arguments to `generate_input(...)` to produce the input data for that test case. We recommend specifying
a seed argument to properly randomize inputs in a reproducible manner. Furthermore, `check_implementation` uses
a `TestReporter` object defined in `utils.h` to give leaderboard creators the flexibility to provide error messages that help participants debug,
and to easily specify whether test cases pass or fail.
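For example, a hypothetical task that also wanted a `scale` argument would extend `ArgumentNames` and `generate_input` together, and then add a `"scale"` value to every case in `task.yml`:

```cpp
// Hypothetical example: adding a third task argument.
// In task.h, extend the argument list:
constexpr std::array<const char*, 3> ArgumentNames = {"seed", "size", "scale"};

// In reference.cuh, the generator's parameters must match, in order:
static input_t generate_input(int seed, int size, int scale) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    input_t data(size);
    for (int j = 0; j < size; ++j) {
        data[j] = scale * dist(rng);
    }
    return data;
}
```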
**Remark.** Leaderboard creators have the flexibility to edit the logic in `eval.cu`, which uses all of these functions
to evaluate and measure the user-specified kernels. The examples above assume the use of our `eval.cu` implementation, but
this can be modified if desired.
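To make the evaluation flow concrete, here is a self-contained toy program that mimics the structure of our `eval.cu` examples. It inlines a simplified stand-in for `TestReporter` and a trivial stand-in submission; the real harness additionally reads cases from `task.yml`, reports results back to the Discord bot, and times benchmark cases over repeated runs:

```cpp
// toy_eval.cpp -- a simplified sketch of the eval.cu control flow, not the real harness.
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

using input_t = std::vector<float>;
using output_t = input_t;

// Simplified stand-in for the TestReporter in utils.h.
struct ToyReporter {
    void fail(const char* msg) { std::printf("FAIL: %s\n", msg); }
    void pass() { std::printf("PASS\n"); }
};

static input_t generate_input(int seed, int size) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    input_t data(size);
    for (auto& x : data) x = dist(rng);
    return data;
}

static output_t ref_kernel(input_t data) { return data; }

// Stand-in for the user's submitted kernel.
static output_t custom_kernel(input_t data) { return data; }

int main() {
    // Each (seed, size) pair plays the role of one test case from task.yml.
    const int seeds[] = {5236, 1001, 5531};
    const int sizes[] = {128, 129, 256};
    for (int i = 0; i < 3; ++i) {
        input_t data = generate_input(seeds[i], sizes[i]);
        output_t user_out = custom_kernel(data);
        output_t ref = ref_kernel(data);
        ToyReporter reporter;
        if (user_out.size() != ref.size()) { reporter.fail("size mismatch"); continue; }
        bool match = true;
        for (size_t j = 0; j < ref.size(); ++j) {
            if (std::fabs(ref[j] - user_out[j]) > 1e-5f) { match = false; break; }
        }
        if (match) reporter.pass(); else reporter.fail("value mismatch");
        // On benchmark cases, the real eval.cu would now time custom_kernel.
    }
    return 0;
}
```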
## Deleting a Leaderboard
If you have sufficient permissions on the server, you can also delete leaderboards with:
<center>
```
/leaderboard delete {leaderboard_name: str}
```
</center>
This command will display a UI window with a list of available leaderboards. Select the leaderboard you want to delete from the list. Once confirmed, the leaderboard and all associated submissions will be permanently removed. Please use this command with caution, as it also deletes the leaderboard's history.
We provide examples of leaderboards here that can be quickly copied and modified for new leaderboards. Most leaderboard creators should only need to modify these files.