Migrating v1.5 competition to v2
In this example I will be migrating this Iris competition to v2: https://github.com/madclam/m2aic2019
I first made sure the bundle I was working with worked on v1.5 by uploading the bundle produced by make_bundle, then making the submission included with the Iris example. It worked.
Then I downloaded the Pisano period v2 competition from here: https://github.com/codalab/competition-examples/tree/master/v2/pisano_period
I made sure that worked in a similar way, using the task + solution provided.
I went through the old YAML and found that many fields are not implemented in v2 yet, for example:
- force_submission_to_leaderboard
- disallow_leaderboard_modifying
- etc.
I commented these out for now; we can use this competition to test them later!
The HTML section has a new format. Old:
html:
    data: data.html
    evaluation: evaluation.html
    overview: overview.html
    terms: rules.html
    #notebook: README.html
New:
pages:
  - title: Data
    file: data.html
  - title: Evaluation
    file: evaluation.html
  - title: Overview
    file: overview.html
  - title: Rules
    file: rules.html
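The same conversion can be scripted when a bundle has many pages. This is a minimal sketch that works on an already-parsed dictionary; the helper name `html_to_pages` and the `titles` override are hypothetical, and capitalizing the old key to get a title is an assumption that simply matches the example above.

```python
def html_to_pages(html, titles=None):
    """Convert a v1.5 `html` mapping into a v2 `pages` list.

    `titles` optionally overrides the page title for an old key;
    otherwise the key is capitalized (an assumption, not a CodaLab rule).
    """
    titles = titles or {}
    return [
        {"title": titles.get(key, key.capitalize()), "file": filename}
        for key, filename in html.items()
    ]


# The old `html` section from above, as a parsed dictionary.
old_html = {
    "data": "data.html",
    "evaluation": "evaluation.html",
    "overview": "overview.html",
    "terms": "rules.html",
}

# "terms" became "Rules" in the example, so that title is passed explicitly.
pages = html_to_pages(old_html, titles={"terms": "Rules"})
for page in pages:
    print(page)
```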
In v2 we leverage "tasks and solutions" instead of putting data directly on phases. Phases keep their main properties like start_date, end_date although some are named more simply, i.e. start, end.
Old:
phases:
    1:
        phasenumber: 1
        label: Development Phase
        description: 'Development phase: tune your models and submit prediction results, trained model, or untrained model.'
        start_date: 2018-11-15
        is_scoring_only: False
        execution_time_limit: 500
        max_submissions_per_day: 5
        force_best_submission_to_leaderboard: True # Participants will see their best submission on the leaderboard
        starting_kit: starting_kit.zip # The starting kit you prepared
        ingestion_program: ingestion_program.zip # The ingestion program (the same for both phases)
        public_data: input_data.zip # Same as input data (available for download by the participants)
        input_data: input_data.zip # The data used by the ingestion program (and the code of the participants) in both phases
        scoring_program: scoring_program.zip # The scoring program (the same for both phases)
        reference_data: reference_data_1.zip # The truth values (solution) for phase 1 used by the scoring program
        color: green
New:
tasks:
  - index: 0
    name: Iris Development Phase Task
    input_data: input_data.zip
    scoring_program: scoring_program.zip
    reference_data: reference_data_1.zip
# No solutions included in this example, but it's possible to do it like so...
# solutions:
#   - index: 0
#     tasks:
#       - 0
#     path: solution.zip
phases:
  - name: Development Phase
    description: 'Development phase: tune your models and submit prediction results, trained model, or untrained model.'
    start: 2018-11-15
    tasks:
      - 0
    # if we had solutions..
    # solutions:
    #   - 0
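The split shown above (data files move onto a task, the phase keeps its descriptive fields and points at the task by index) can be sketched as a small helper. This is only an illustration working on parsed dictionaries: `split_phases` is a hypothetical name, the task name is derived from the phase label as an assumption, and any old field not listed here (including the ones v2 does not implement yet) is simply dropped.

```python
# Data fields that the example above moves from the phase onto the task.
TASK_FIELDS = ("input_data", "scoring_program", "reference_data")
# Phase fields that the example keeps, mapped to their v2 names.
PHASE_FIELDS = {
    "label": "name",
    "description": "description",
    "start_date": "start",
    "end_date": "end",
}


def split_phases(old_phases):
    """Turn v1.5 phases (dict keyed by phase number) into v2 tasks and phases."""
    tasks, phases = [], []
    for index, number in enumerate(sorted(old_phases)):
        old = old_phases[number]

        # One task per old phase, holding the data that used to live on it.
        task = {"index": index, "name": f"{old['label']} Task"}
        task.update({key: old[key] for key in TASK_FIELDS if key in old})
        tasks.append(task)

        # The phase keeps only the renamed descriptive fields.
        phase = {new: old[key] for key, new in PHASE_FIELDS.items() if key in old}
        phase["tasks"] = [index]  # phases refer to tasks by index
        phases.append(phase)
    return tasks, phases


old_phases = {
    1: {
        "phasenumber": 1,
        "label": "Development Phase",
        "description": "Development phase: tune your models.",
        "start_date": "2018-11-15",
        "max_submissions_per_day": 5,  # dropped by this sketch
        "input_data": "input_data.zip",
        "scoring_program": "scoring_program.zip",
        "reference_data": "reference_data_1.zip",
    }
}

tasks, phases = split_phases(old_phases)
print(tasks)
print(phases)
```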
Old:
leaderboard:
    leaderboards:
        Results: &RESULTS
            label: RESULTS
            rank: 1
    columns:
        set1_score:
            leaderboard: *RESULTS
            label: Prediction score
            numeric_format: 4
            rank: 1
        Duration:
            leaderboard: *RESULTS
            label: Duration
            numeric_format: 2
            rank: 2
New:
leaderboards:
  - title: Results
    key: main
    columns:
      - title: Prediction score
        key: set1_score
        index: 0
        sorting: desc
      - title: Duration
        key: Duration
        index: 1
        sorting: desc
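The leaderboard mapping above can also be sketched in code. This is a rough illustration on parsed dictionaries, not an official converter: `convert_leaderboard` is a hypothetical name, columns are matched to their leaderboard by comparing labels (the YAML anchor resolves to the same mapping), the leaderboard `key` is supplied by the caller, `sorting` defaults to `desc` as in the example, and `numeric_format` is not carried over because the new example shows no equivalent.

```python
def convert_leaderboard(old, keys=None, default_sorting="desc"):
    """Convert a v1.5 `leaderboard` section into a v2 `leaderboards` list."""
    keys = keys or {}
    boards = []
    for name, board in old["leaderboards"].items():
        # Columns point at their leaderboard through the resolved YAML anchor.
        columns = [
            (col_key, col)
            for col_key, col in old["columns"].items()
            if col["leaderboard"]["label"] == board["label"]
        ]
        # The old 1-based rank becomes the new 0-based index.
        columns.sort(key=lambda item: item[1]["rank"])
        boards.append({
            "title": name,
            "key": keys.get(name, name.lower()),
            "columns": [
                {
                    "title": col["label"],
                    "key": col_key,
                    "index": index,
                    "sorting": default_sorting,
                }
                for index, (col_key, col) in enumerate(columns)
            ],
        })
    return boards


results = {"label": "RESULTS", "rank": 1}  # what &RESULTS / *RESULTS resolve to
old_leaderboard = {
    "leaderboards": {"Results": results},
    "columns": {
        "set1_score": {"leaderboard": results, "label": "Prediction score",
                       "numeric_format": 4, "rank": 1},
        "Duration": {"leaderboard": results, "label": "Duration",
                     "numeric_format": 2, "rank": 2},
    },
}

new_leaderboards = convert_leaderboard(old_leaderboard, keys={"Results": "main"})
print(new_leaderboards)
```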
Codalab competitions can be run in 3 styles:
- Code submission (e.g. Python, R, C#)
- Result submission (e.g. JSON)
- Hybrid (submit either variation)
The worker executes the programs in a docker container that the user or organizer can specify, depending on the competition configuration.
v1.5 programs give Codalab some information on how to run the program. Each has a metadata file with a command, in which some strings are replaced with actual file locations, like so:
command: python $program/program.py $input $output
v2 programs tell Codalab how to run them slightly differently: they use a metadata.yaml file with a command (no strings replaced):
command: python program.py
The largest difference here is that files are put in known places. You have /app/input and /app/output instead of some randomly named temporary folder.
For your predictions, write to the ./output folder.
For your scoring programs, write a scores.json to the ./output folder, with entries mapping to your leaderboard.
Old:
# metadata
command: python3 $program/score.py $input $output
description: Compute scores for the competition
New, with default paths inserted:
(note: new style runs with working dir in the program folder)
# metadata.yaml *** NOTE: added .yaml file ending! ***
command: python3 score.py /app/input/ /app/output/
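The rewrite of the command line is mechanical enough to script. A minimal sketch, assuming only the three placeholders shown above need handling (`migrate_command` is a hypothetical helper, and any other v1.5 placeholder is left untouched):

```python
def migrate_command(command, input_dir="/app/input/", output_dir="/app/output/"):
    """Rewrite a v1.5 metadata command to use v2's fixed folders."""
    # v2 runs with the program folder as the working directory, so the
    # "$program/" prefix is dropped rather than replaced with a path.
    command = command.replace("$program/", "")
    command = command.replace("$input", input_dir)
    command = command.replace("$output", output_dir)
    return command


print(migrate_command("python3 $program/score.py $input $output"))
# python3 score.py /app/input/ /app/output/
```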
I tweaked the actual score.py program to store scores in a dictionary and dump that to scores.json.
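A minimal sketch of a scoring program along these lines is given here. It is not the actual Iris score.py: the `ref` and `res` subfolders and the file names `iris_test.solution` and `iris_test.predict` are assumptions made for illustration, and plain accuracy stands in for whatever metric the competition uses. The dictionary keys match the leaderboard column keys `set1_score` and `Duration` from the example above.

```python
import json
import os
import sys


def read_labels(path):
    """Read one label per line, ignoring blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def accuracy(truth, predicted):
    """Fraction of predictions that match the truth values."""
    if not truth or len(truth) != len(predicted):
        raise ValueError("truth and predictions must be non-empty and equal in length")
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def write_scores(output_dir, scores):
    """Dump the scores dictionary to scores.json in the output folder."""
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "scores.json"), "w") as handle:
        json.dump(scores, handle)


def main(input_dir, output_dir):
    # Assumed layout: truth values in ref/, submission results in res/.
    truth = read_labels(os.path.join(input_dir, "ref", "iris_test.solution"))
    predicted = read_labels(os.path.join(input_dir, "res", "iris_test.predict"))
    scores = {
        "set1_score": accuracy(truth, predicted),
        # Placeholder: the elapsed time is not passed to the scoring program yet.
        "Duration": 0.0,
    }
    write_scores(output_dir, scores)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Called as: python3 score.py /app/input/ /app/output/
    main(sys.argv[1], sys.argv[2])
```

Keeping the scores in one dictionary means adding a leaderboard column later only requires adding one more key before the dump.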
Fix this:
- Metadata for scoring programs in CODE competitions should pass in elapsedTime among other missing properties. This should be in the input folder along with the submission prediction results
- Allow specifying the Docker image to use for prediction/scoring on the competition itself. It is passed along to the worker already