diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 86cc1e671d..35f7a91bb3 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -31,7 +31,6 @@ listed here in alphabetical order: * Nicolas Essayan * Justin Faust (One Stop Systems) * Diane Feddema (Red Hat) -* Grigori Fursin (cTuning.org and cKnowledge.org) * Leonid Fursin (United Silicon Carbide) * Anirban Ghosh (Nvidia) * James Goel (Qualcomm) diff --git a/README.md b/README.md index d77b212b74..59e31cec89 100755 --- a/README.md +++ b/README.md @@ -16,7 +16,7 @@ across diverse and continuously changing models, data, software and hardware. CK consists of several ongoing sub-projects: -* [Collective Mind framework (CM)](cm) (*~1MB*) - a very light-weight Python-based framework with minimal dependencies +* [Collective Mind framework (CM)](cm) - a very light-weight Python-based framework with minimal dependencies to help users implement, share and reuse cross-platform automation recipes to build, benchmark and optimize applications on any platform with any software and hardware. CM attempts to extends the `cmake` concept @@ -30,26 +30,16 @@ CK consists of several ongoing sub-projects: and the [ACM REP'23 keynote](https://doi.org/10.5281/zenodo.8105339). - * [CM4MLOPS: CM automation recipes for MLOps, MLPerf and DevOps](https://github.com/mlcommons/cm4mlops) (*~6MB*) - + * [CM4MLOPS](https://github.com/mlcommons/cm4mlops) - a collection of portable, extensible and technology-agnostic automation recipes with a human-friendly interface (aka CM scripts) to unify and automate all the manual steps required to compose, run, benchmark and optimize complex ML/AI applications on diverse platforms with any software and hardware: see [online cKnowledge catalog](https://access.cknowledge.org/playground/?action=scripts), [online MLCommons catalog](https://docs.mlcommons.org/cm4mlops/scripts) and [source code](https://github.com/mlcommons/cm4mlops/blob/master/script). - * [CM automation recipes to reproduce research projects](https://github.com/ctuning/cm4research) (*~1MB*) - a unified CM interface to help researchers - and engineers access, prepare and run diverse research projects and make it easier to validate them in the real world - across rapidly evolving models, data, software and hardware - (see [our reproducibility initatives](https://cTuning.org/ae) - and [motivation](https://www.youtube.com/watch?v=7zpeIVwICa4) behind this project). - - * [CM automation recipes for ABTF](https://github.com/mlcommons/cm4abtf) (*~1MB*) - a unified CM interface and automation recipes + * [CM4ABTF](https://github.com/mlcommons/cm4abtf) - a unified CM interface and automation recipes to run automotive benchmark across different models, data sets, software and hardware from different vendors. 
- * [Modular C++ harness for MLPerf loadgen](https://github.com/mlcommons/cm4mlops/tree/main/script/app-mlperf-inference-mlcommons-cpp) - - * [Modular Python harness for MLPerf loadgen](https://github.com/mlcommons/cm4mlops/tree/main/script/app-loadgen-generic-python) - * [Collective Knowledge Playground](https://access.cKnowledge.org) - an external platform being developed by [cKnowledge](https://cKnowledge.org) to list CM scripts similar to PYPI, aggregate AI/ML Systems benchmarking results in a reproducible format with CM workflows, and organize [public optimization challenges and reproducibility initiatives](https://access.cknowledge.org/playground/?action=challenges) @@ -72,6 +62,12 @@ We are preparing new projects based on user feedback: [Apache 2.0](LICENSE.md) +### Copyright + +* Copyright (c) 2021-2024 MLCommons +* Copyright (c) 2014-2021 cTuning foundation + + ### Documentation **MLCommons is updating the CM documentation based on user feedback - please check stay tuned for more details**. diff --git a/docs/artifact-evaluation/checklist.md b/docs/artifact-evaluation/checklist.md index 709dcac6af..4b3f257229 100644 --- a/docs/artifact-evaluation/checklist.md +++ b/docs/artifact-evaluation/checklist.md @@ -1,245 +1 @@ -[ [Back to index](https://cTuning.org/ae) ] - -# Artifact Checklist - - -Here we provide a few informal suggestions to help you fill in the -[Unified Artifact Appendix with the Reproducibility Checklist](https://github.com/mlcommons/ck/blob/master/docs/artifact-evaluation/template/ae.tex) -for artifact evaluation while avoiding common pitfalls. -We've introduced this appendix to [unify the description of experimental setups and results across different conferences](https://learning.acm.org/techtalks/reproducibility). - - - - -## Abstract - - Briefly and informally describe your artifacts including minimal hardware, software and other requirements, - how they support your paper and what are they key results to be reproduced. - Note that evaluators will use artifact abstracts to bid on artifacts. - The AE chairs will also use it to finalize artifact assignments. - - -## Checklist - - - Together with the artifact abstract, this check-list will help us make sure that evaluators - have appropriate competency and an access to the technology required to evaluate your artifacts. - It can also be used as meta information to find your artifacts in Digital Libraries. - - ![](https://raw.githubusercontent.com/mlcommons/ck/master/docs/artifact-evaluation/image-general-workflow1.png) - - - Fill in whatever is applicable with some informal keywords and remove unrelated items - (please consider questions below just as informal hints - that reviewers are usually concerned about): - - -* **Algorithm:** Are you presenting a new algorithm? 
-* **Program:** Which benchmarks do you use - ([PARSEC](http://parsec.cs.princeton.edu "http://parsec.cs.princeton.edu"), - [NAS](http://www.nas.nasa.gov/publications/npb.html "http://www.nas.nasa.gov/publications/npb.html"), - [EEMBC](https://www.eembc.org "https://www.eembc.org"), - [SPLASH](http://www.capsl.udel.edu/splash/index.html "http://www.capsl.udel.edu/splash/index.html"), - [Rodinia](https://www.cs.virginia.edu/~skadron/wiki/rodinia "https://www.cs.virginia.edu/~skadron/wiki/rodinia"), - [LINPACK](http://www.netlib.org/linpack "http://www.netlib.org/linpack"), - [HPCG](http://hpcg-benchmark.org/ "http://hpcg-benchmark.org/"), - [MiBench](http://wwweb.eecs.umich.edu/mibench "http://wwweb.eecs.umich.edu/mibench"), - [SPEC](https://www.spec.org/cpu2006 "https://www.spec.org/cpu2006"), - [cTuning](http://github.com/ctuning/ctuning-programs "http://github.com/ctuning/ctuning-programs"), etc)? - Are they included or should they be downloaded? Which version? - Are they public or private? If they are private, - is there a public analog to evaluate your artifact? - What is the approximate size? -* **Compilation:** Do you require a specific compiler? Public/private? Is it included? Which version? -* **Transformations:** Do you require a program transformation tool (source-to-source, binary-to-binary, compiler pass, etc)? - Public/private? Is it included? Which version? -* **Binary:** Are binaries included? OS-specific? Which version? -* **Model:** Do you use specific models (GPT-J, BERT, MobileNets ...)? - Are they included? If not, how to download and install? - What is their approximate size? -* **Data set:** Do you use specific data sets? - Are they included? If not, how to download and install? - What is their approximate size? -* **Run-time environment:** Is your artifact OS-specific (Linux, Windows, MacOS, Android, etc) ? - Which version? Which are the main software dependencies (JIT, libs, run-time adaptation frameworks, etc); - Do you need root access? -* **Hardware:** Do you need specific hardware (supercomputer, architecture simulator, CPU, GPU, neural network accelerator, FPGA) - or specific features (hardware counters - to measure power consumption, SUDO access to CPU/GPU frequency, etc)? - Are they publicly available? -* **Run-time state:** Is your artifact sensitive to run-time state (cold/hot cache, network/cache contentions, etc.) -* **Execution:** Any specific conditions should be met during experiments (sole user, process pinning, profiling, adaptation, etc)? How long will it approximately run? -* **Metrics:** Which metrics will be evaluated (execution time, inference per second, Top1 accuracy, power consumption, etc). -* **Output:** What is the output of your key experiments (console, file, table, graph) and what are your key results - (exact output, numerical results, empirical characteristics, etc)? - Are expected results included? -* **Experiments:** How to prepare experiments and reproduce results - (README, scripts, [IPython/Jupyter notebook](https://jupyter.org "https://jupyter.org"), - [MLCommons CM automation language](https://doi.org/10.5281/zenodo.8105339), containers etc)? - Do not forget to mention the maximum allowable variation of empirical results! -* **How much disk space required (approximately)?:** This can help evaluators and end-users to find appropriate resources. -* **How much time is needed to prepare workflow (approximately)?:** This can help evaluators and end-users to estimate resources needed to evaluate your artifact. 
-* **How much time is needed to complete experiments (approximately)?:** This can help evaluators and end-users to estimate resources needed to evaluate your artifact. -* **Publicly available?:** Will your artifact be publicly available? If yes, we may spend an extra effort to help you with the documentation. -* **Code licenses (if publicly available)?:** If you workflows and artifacts will be publicly available, please provide information about licenses. - This will help the community to reuse your components. -* **Code licenses (if publicly available)?:** If you workflows and artifacts will be publicly available, please provide information about licenses. - This will help the community to reuse your components. -* **Workflow frameworks used?** Did authors use any workflow framework which can automate and customize experiments? -* **Archived?:** - Note that the author-created artifacts relevant to this paper - will receive the ACM "artifact available" badge \*only if\* - they have been placed on a publicly - accessible archival repository such as [Zenodo](https://zenodo.org "https://zenodo.org"), - [FigShare](https://figshare.com "https://figshare.com") - or [Dryad](http://datadryad.org "http://datadryad.org"). - A DOI will be then assigned to their artifacts and must be provided here! - Personal web pages, Google Drive, GitHub, GitLab and BitBucket - are not accepted for this badge. - Authors can provide the DOI for their artifacts at the end of the evaluation. - - - - -## Description - - - -### How to access - - - -Describe the way how reviewers will access your artifacts: - -* Clone a repository from GitHub, GitLab or any similar service -* Download a package from a public website -* Download a package from a private website (you will need to send information how to access your artifacts to AE chairs) -* Access artifact via private machine with pre-installed software (only when access to rare or publicly unavailable hardware is required or proprietary - software is used - you will need to send credentials to access your machine to the AE chairs) - - - - Please describe approximate disk space required after unpacking your artifact. - - -### Hardware dependencies - - - - Describe any specific hardware and specific features required to evaluate your artifact - (vendor, CPU/GPU/FPGA, number of processors/cores, interconnect, memory, - hardware counters, etc). - - -### Software dependencies - - - - Describe any specific OS and software packages required to evaluate your - artifact. This is particularly important if you share your source code - and it must be compiled or if you rely on some proprietary software that you - can not include to your package. In such case, we strongly suggest you - to describe how to obtain and to install all third-party software, data sets - and models. - - - - -*Note that we are trying to obtain AE licenses for some commonly used proprietary tools -and benchmarks - you will be informed in case of positive outcome.* - -### Data sets - - - - If third-party data sets are not included in your packages (for example, - they are very large or proprietary), please provide details about how to download - and install them. - - *In case of proprietary data sets, we suggest you provide reviewers - a public alternative subset for evaluation*. - - -### Models - - - - If third-party models are not included in your packages (for example, - they are very large or proprietary), please provide details about how to download - and install them. 
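For example, a minimal Python sketch of such download-and-install instructions, which fetches an archive and verifies its checksum before use (the URL and SHA-256 value below are placeholders, not a real artifact), might look like this:

```python
# Sketch: fetch a third-party data set or model and verify its integrity.
# The URL and expected SHA-256 digest are placeholders for illustration only.
import hashlib
import urllib.request

URL = "https://example.org/artifacts/sample-subset.tar.gz"   # placeholder location
EXPECTED_SHA256 = "0" * 64                                    # placeholder digest
TARGET = "sample-subset.tar.gz"

def download_and_verify(url: str, target: str, expected_sha256: str) -> None:
    urllib.request.urlretrieve(url, target)               # download the archive
    digest = hashlib.sha256()
    with open(target, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MB chunks
            digest.update(chunk)
    if digest.hexdigest() != expected_sha256:
        raise RuntimeError(f"Checksum mismatch for {target}: {digest.hexdigest()}")
    print(f"Downloaded and verified {target}")

if __name__ == "__main__":
    download_and_verify(URL, TARGET, EXPECTED_SHA256)
```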
- - - - -## Installation - - - - Describe the setup procedures for your artifact (even when containers are used). - - - -## Experiment workflow - - - - Describe the experimental workflow and how it is implemented - and executed, i.e. some OS scripts, - [IPython/Jupyter notebook](https://jupyter.org "https://jupyter.org"), - [MLCommons CM automation language](https://github.com/mlcommons/ck/tree/master/docs), etc. - - Check [examples of reproduced papers](https://cknow.io/reproduced-papers "https://cknow.io/reproduced-papers"). - - - - - -## Evaluation and expected result - - - - Describe all the steps necessary to reproduce the key results from your paper. - Describe expected results including maximum allowable variation - of empirical results. - See the [SIGPLAN Empirical Evaluation Guidelines](https://www.sigplan.org/Resources/EmpiricalEvaluation "https://www.sigplan.org/Resources/EmpiricalEvaluation"), - the [NeurIPS reproducibility checklist](https://www.cs.mcgill.ca/~jpineau/ReproducibilityChecklist.pdf "https://www.cs.mcgill.ca/~jpineau/ReproducibilityChecklist.pdf") - and the [AE FAQ](faq.md) for more details. - - - -## Experiment customization - - - - It is optional but can be useful for the community if you describe all the knobs - to customize and tune your experiments and maybe even trying them - with a different data sets, benchmark/applications, - machine learning models, software environment (compilers, libraries, - run-time systems) and hardware. - - -## Reusability - -Please describe your experience if you decided to participate in our pilot project to add -the non-intrusive [MLCommons Collective Mind interface (CM)](https://doi.org/10.5281/zenodo.8105339) -to your artifacts. Note that it will be possible to prepare and run your experiments with -or without this interface! - - - -## Notes - - - - You can add informal notes to draw the attention of evaluators. - - - ----- - -*This document was prepared by [Grigori Fursin](https://cKnowledge.org/gfursin) - with contributions from [Bruce Childers](https://people.cs.pitt.edu/~childers), - [Michael Heroux](https://www.sandia.gov/~maherou), - [Michela Taufer](https://gcl.cis.udel.edu/personal/taufer) and other great colleagues. - It is maintained by the [cTuning foundation](https://cTuning.org/ae) and the - [MLCommons taskforce on automation and reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md).* +***Moved to https://github.com/ctuning/artifact-evaluation/blob/master/docs/checklist.md*** diff --git a/docs/artifact-evaluation/faq.md b/docs/artifact-evaluation/faq.md index d1f92a70b3..48f14c90aa 100644 --- a/docs/artifact-evaluation/faq.md +++ b/docs/artifact-evaluation/faq.md @@ -1,256 +1 @@ -[ [Back to index](https://cTuning.org/ae) ] - -# Artifact Evaluation FAQ - -
-Click here to see the table of contents. - -* [Artifact evaluation](#artifact-evaluation) - * [Frequently Asked Questions](#frequently-asked-questions) - * [What is the difference between Repeatability, Reproducibility and Replicability?](#what-is-the-difference-between-repeatability-reproducibility-and-replicability?) - * [Do I have to open source my software artifacts?](#do-i-have-to-open-source-my-software-artifacts?) - * [Is Artifact evaluation blind or double-blind?](#is-artifact-evaluation-blind-or-double-blind?) - * [How to pack artifacts?](#how-to-pack-artifacts?) - * [Is it possible to provide a remote access to a machine with pre-installed artifacts?](#is-it-possible-to-provide-a-remote-access-to-a-machine-with-pre-installed-artifacts?) - * [Can I share commercial benchmarks or software with evaluators?](#can-i-share-commercial-benchmarks-or-software-with-evaluators?) - * [Can I engage with the community to evaluate my artifacts?](#can-i-engage-with-the-community-to-evaluate-my-artifacts?) - * [How to automate, customize and port experiments?](#how-to-automate-customize-and-port-experiments?) - * [Do I have to make my artifacts public if they pass evaluation?](#do-i-have-to-make-my-artifacts-public-if-they-pass-evaluation?) - * [How to report and compare empirical results?](#how-to-report-and-compare-empirical-results?) - * [How to deal with numerical accuracy and instability?](#how-to-deal-with-numerical-accuracy-and-instability?) - * [How to validate models or algorithm scalability?](#how-to-validate-models-or-algorithm-scalability?) - * [Is there any page limit for my Artifact Evaluation Appendix?](#is-there-any-page-limit-for-my-artifact-evaluation-appendix?) - * [Where can I find a sample HotCRP configuration to set up AE?](#where-can-i-find-a-sample-hotcrp-configuration-to-set-up-ae?) - * [Questions and Feedback](#questions-and-feedback) - -
- -## Frequently Asked Questions - - -**If you have questions or suggestions which are not addressed here, please feel free -to contact the [public MLCommons task force on automation and reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md) -via this [Discord server](https://discord.gg/JjWNWXKxwT) or post them to the dedicated [AE google group](https://groups.google.com/forum/#!forum/artifact-evaluation).** - - -### What is the difference between Repeatability, Reproducibility and Replicability? - -We use the following definitions [adopted by ACM and NISO](https://www.acm.org/publications/policies/artifact-review-badging): - -* *Repeatability (Same team, same experimental setup)* - - The measurement can be obtained with stated precision by the same team using the same measurement procedure, - the same measuring system, under the same operating conditions, in the same location on multiple trials. - For computational experiments, this means that a researcher can reliably repeat her own computation. - -* *Reproducibility (Different team, different experimental setup)* - - The measurement can be obtained with stated precision by a different team using the same measurement procedure, - the same measuring system, under the same operating conditions, in the same or a different location on multiple trials. - For computational experiments, this means that an independent group can obtain the same result using the author's own artifacts. - -* *Replicability (Different team, same experimental setup)* - - The measurement can be obtained with stated precision by a different team, a different measuring system, - in a different location on multiple trials. For computational experiments, this means that an independent group - can obtain the same result using artifacts which they develop completely independently. - - - -### Do I have to open source my software artifacts? - - - -No, it is not strictly necessary and you can -provide your software artifact as a binary. -However, in case of problems, reviewers may not be -able to fix it and will likely give you a negative score. - - -### Is Artifact evaluation blind or double-blind? - - - -AE is a single-blind process, i.e. authors' names are known to the evaluators -(there is no need to hide them since papers are accepted), -but names of evaluators are not known to authors. -AE chairs are usually used as a proxy between authors and evaluators -in case of questions and problems. - - -### How to pack artifacts? - - - -We do not have strict requirements at this stage. You can pack -your artifacts simply in a tar ball, zip file, Virtual Machine or Docker image. -You can also share artifacts via public services including GitHub, GitLab and BitBucket. - -Please see [our submission guide](submission.md) for more details. - - -### Is it possible to provide a remote access to a machine with pre-installed artifacts? - - - -Only in exceptional cases, i.e. when rare hardware or proprietary software/benchmarks are required, -or VM image is too large or when you are not authorized to move artifacts outside your organization. -In such case, you will need to send the access information -to the AE chairs via private email or SMS. -They will then pass this information to the evaluators. - - -### Can I share commercial benchmarks or software with evaluators? - - - -Please check the license of your benchmarks, data sets and software. -In case of any doubts, try to find a free alternative. 
In fact, -we strongly suggest you provide a small subset of free benchmarks -and data sets to simplify the evaluation process. - - -### Can I engage with the community to evaluate my artifacts? - - - -Based on the community feedback, we allow open evaluation -to let the community validate artifacts which are publicly available -at GitHub, GitLab, BitBuckets, etc, report issues and help the authors -to fix them. - -Note, that in the end, these artifacts still go through traditional -evaluation process via the AE committee. We successfully validated -at [ADAPT'16](http://adapt-workshop.org/motivation2016.html) -and CGO/PPoPP'17! - - -### How to automate, customize and port experiments? - - - -From our [past experience reproducing research papers](https://www.reddit.com/r/MachineLearning/comments/ioq8do/n_reproducing_150_research_papers_the_problems), -the major difficulty that evaluators face is the lack of a common and portable workflow framework -in ML and systems research. This means that each year they have -to learn some ad-hoc scripts and formats in nearly -all artifacts without even reusing such knowledge the following year. - -Things get even worse if an evaluator would like to validate experiments -using a different compiler, tool, library, data set, operating systems or hardware -rather than just reproducing quickly outdated results using -VM and Docker images - our experience shows that most of the submitted scripts -are not easy to change, customize or adapt to other platform. - -That is why we collaborate with the [open MLCommons taskforce on automation and reproducibility](https://github.com/mlcommons/ck/blob/master/docs/mlperf-education-workgroup.md) -and [ACM](https://acm.org) to develop a [portable automation framework](https://github.com/mlcommons/ck/tree/master/docs) to make it easier to reproduce experiments -across continuously changing software, hardware and data. - -Please join this [taskforce](https://github.com/mlcommons/ck/blob/master/docs/mlperf-education-workgroup.md) -or get in touch with [the AE community](https://groups.google.com/forum/#!forum/artifact-evaluation) -to discuss how to automate your artifacts and make them more portable and reusable. - - -### Do I have to make my artifacts public if they pass evaluation? - -No, you don't have to and it may be impossible in the case of commercial artifacts. -Nevertheless, we encourage you to make your artifacts publicly available upon publication, -for example, by including them in a permanent repository (required to receive the "artifact available" badge) -to support open science as outlined in [our vision](http://dl.acm.org/citation.cfm?id=2618142). - - -Furthermore, if you make your artifacts publicly available at the time -of submission, you may profit from the "public review" option, where you are engaged -with the community to discuss, evaluate and use your software. See such -examples [here](https://cTuning.org/ae/artifacts.html) (search for "public evaluation"). - - -### How to report and compare empirical results? - - -**News:** Please check the [SIGPLAN Empirical Evaluation Guidelines](https://www.sigplan.org/Resources/EmpiricalEvaluation) -and the [NeurIPS reproducibility checklist](https://www.cs.mcgill.ca/~jpineau/ReproducibilityChecklist.pdf). - - - -First of all, you should undoubtedly run empirical experiments more than once -(we still encounter many cases where researchers measure execution time only once). -and perform statistical analysis. 
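For instance, a minimal Python sketch of such an analysis, which summarizes repeated measurements and automatically compares them against a pre-recorded reference (the reference value and tolerance below are placeholders), might look like this:

```python
# Sketch: summarize repeated measurements and compare them against pre-recorded results.
# The reference mean and the allowed variation are placeholders for illustration only.
import statistics

def summarize(times):
    """Return (mean, run-to-run spread) for a list of repeated measurements."""
    mean = statistics.mean(times)
    spread = (max(times) - min(times)) / mean
    return mean, spread

def compare(new_times, reference_mean, max_allowed_variation=0.10):
    """Check new results against a pre-recorded expected value within a tolerance."""
    mean, spread = summarize(new_times)
    deviation = abs(mean - reference_mean) / reference_mean
    print(f"mean={mean:.4f}s  spread={spread:.1%}  deviation from reference={deviation:.1%}")
    return deviation <= max_allowed_variation and spread <= max_allowed_variation

if __name__ == "__main__":
    # Five repetitions of the same experiment vs. a pre-recorded mean of 1.00 s.
    ok = compare([1.02, 0.98, 1.01, 0.99, 1.03], reference_mean=1.00)
    print("MATCH" if ok else "MISMATCH: check run-time state and environment")
```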
- -There is no universal recipe how many times you should repeat your empirical experiment -since it heavily depends on the type of your experiments, platform and environment. -You should then analyze the distribution of execution times as shown in the figure below: - -![](https://raw.githubusercontent.com/mlcommons/ck/master/docs/artifact-evaluation/image-994e7359d7760ab1-cropped.png) -If you have more than one expected value (b), it means that you have several -run-time states in your system (such as adaptive frequency scaling) -and you can not use average and reliably compare empirical results. - -However, if there is only one expected value for a given experiment (a), -then you can use it to compare multiple experiments. This is particularly -useful when running experiments across different platforms from different -users as described in this [article](https://cknow.io/c/report/rpi3-crowd-tuning-2017-interactive). - - -You should also report the variation of empirical results together with all expected values. -Furthermore, we strongly suggest you to pre-record results from your platform -and provide a script to automatically compare new results with the pre-recorded ones. -Otherwise, evaluators can spend considerable amount of time -digging out and validating results from "stdout". - -For example, see how new results are visualized and compared against the pre-recorded ones -using [some dashboard](https://github.com/SamAinsworth/reproduce-cgo2017-paper/files/618737/ck-aarch64-dashboard.pdf) -in the [CGO'17 artifact](https://github.com/SamAinsworth/reproduce-cgo2017-paper). - - - - -### How to deal with numerical accuracy and instability? - - - -If the accuracy of your results depends on a given machine, environment and optimizations -(for example, when optimizing BLAS, DNN, etc), you should provide a script to automatically -report unexpected loss in accuracy above provided threshold as well as any numerical instability. - - -### How to validate models or algorithm scalability? - - - -If you present a novel parallel algorithm or some predictive model which should scale -across a number of cores/processors/nodes, we suggest you -to provide an experimental workflow that could automatically detect the topology -of a user machine, validate your models or algorithm scalability, -and report any unexpected behavior. - - -### Is there any page limit for my Artifact Evaluation Appendix? - - - -There is no limit for the AE Appendix at the time of the submission for Artifact Evaluation. - - -However, there is currently a 2 page limit for the AE Appendix in the camera-ready CGO, PPoPP, ASPLOS and MLSys papers. -There is no page limit for the AE Appendix in the camera-ready SC paper. We also expect -that there will be no page limits for AE Appendices in the journals willing to participate -in the AE initiative. - - -### Where can I find a sample HotCRP configuration to set up AE? - - - -Please, check out our [PPoPP'19 HotCRP configuration for AE](https://www.linkedin.com/pulse/acm-ppopp19-artifact-evaluation-report-hotcrp-grigori-fursin) -in case you need to set up your own HotCRP instance. - - - - -### Questions and Feedback - - - -If you have any questions, do not hesitate to get in touch with the AE community -using this [public discussion group](https://groups.google.com/forum/#!forum/artifact-evaluation)! 
- +***Moved to https://github.com/ctuning/artifact-evaluation/blob/master/docs/faq.md*** diff --git a/docs/artifact-evaluation/hotcrp-config/README.md b/docs/artifact-evaluation/hotcrp-config/README.md index c79681f9ca..d5d26ef89e 100644 --- a/docs/artifact-evaluation/hotcrp-config/README.md +++ b/docs/artifact-evaluation/hotcrp-config/README.md @@ -1,25 +1 @@ -[ [Back to index](https://cTuning.org/ae) ] - -# HotCRP configuration for Artifact Evaluation - -Here is a typical configuration of HotCRP for Artifact Evaluation -based on our [ACM PPoPP'19 AE report](https://medium.com/@gfursin/acm-ppopp19-artifact-evaluation-report-and-hotcrp-configuration-f529134ab17c): - -* [HotCRP Settings - Basics](HotCRP_Settings__Basics__PPoPP'19_AE.pdf) -* [HotCRP Settings - Messages](HotCRP_Settings__Messages__PPoPP'19_AE.pdf) -* [HotCRP Settings - Submissions](HotCRP_Settings__Submissions__PPoPP'19_AE.pdf) -* [HotCRP Settings - Submission form](HotCRP_Settings__Submission_form__PPoPP'19_AE.pdf) -* [HotCRP Settings - Reviews](HotCRP_Settings__Reviews__PPoPP'19_AE.pdf) -* [HotCRP Settings - Review form](HotCRP_Settings__Review_form__PPoPP'19_AE.pdf) -* [HotCRP Settings - Decisions](HotCRP_Settings__Decisions__PPoPP'19_AE.pdf) - -You can also download this [JSON file](hotcrp-config-acm-ieee-micro-2023-ae.json) -with the latest configuration from the [ACM/IEEE MICRO 2023 Artifact Evaluation](https://cTuning.org/ae/micro2023.html) -and copy/paste it to your own HotCRP configuration in "Advanced" settings. - -If you have further questions, contact us via [Discord server](https://discord.gg/JjWNWXKxwT), -and/or [AE mailing list](https://groups.google.com/g/artifact-evaluation). - -Feel free to join the [MLCommons task force on automation and reproducibility](../../taskforce.md) -to participate in the development of a [common automation language and a public platform](../../README.md) -to facilitate reproducible research and technology transfer. +***Moved to https://github.com/ctuning/artifact-evaluation/blob/master/docs/hotcrp-config/README.md*** \ No newline at end of file diff --git a/docs/artifact-evaluation/reviewing.md b/docs/artifact-evaluation/reviewing.md index ec56a376fc..6c34a0e50b 100644 --- a/docs/artifact-evaluation/reviewing.md +++ b/docs/artifact-evaluation/reviewing.md @@ -1,157 +1 @@ -[ [Back to index](https://cTuning.org/ae) ] - -# Artifact evaluation - -This document provides the guidelines to evaluate artifacts at ACM and IEEE conferences. - -## Overview - -Shortly after the artifact submission deadline, the AE committee members -will bid on artifacts they would like to evaluate based on their competencies -and the information provided in the artifact abstract such as software and hardware dependencies -while avoiding possible conflicts of interest. - -Within a few days, the AE chairs will make the final selection of evaluators -to ensure at least two or more evaluators per artifact. - -Evaluators will then have approximately 1 months to review artifacts via HotCRP, -discuss with the authors about all encountered issues and help them fix all the issues. -Remember that our philosophy of artifact evaluation is not to fail problematic artifacts -but to help the authors improve their public artifacts, pass evaluation -and improve their Artifact Appendix. - -In the end, the AE chairs will decide on a set of the standard ACM reproducibility badges (see below) -to award to a given artifact based on all reviews and the authors' responses. 
-Such badges will be printed on the 1st page of the paper and will be available -as meta information in the [ACM Digital Library](https://dl.acm.org) - -Authors and reviewers are encouraged to check the [AE FAQ](faq.md) -and contact chairs and the community via our [Discord server for automation and reproducibility](https://discord.gg/JjWNWXKxwT) -or the [dedicated AE google group](https://groups.google.com/forum/#!forum/artifact-evaluation) -in case of questions or suggestions. - - -## ACM reproducibility badges - -Reviewers must read a paper and then thoroughly go through the Artifact Appendix -to evaluate shared artifacts. They should then describe their experience -at each stage (success or failure, encountered problems and how they were possibly solved, -and questions or suggestions to the authors), and give a score on scale -1 .. +1: - -- *+1* if exceeded expectations -- *0* if met expectations (or inapplicable) -- *-1* if fell below expectations - -### Artifacts available - -* Are all artifacts related to this paper publicly available? - -*Note that it is not obligatory to make artifacts publicly available!* - -![](https://www.acm.org/binaries/content/gallery/acm/publications/replication-badges/artifacts_available_dl.jpg) - -The author-created artifacts relevant to this paper will receive an ACM "artifact available" badge -**only if** they have been placed on a publicly accessible archival repository -such as [Zenodo](https://zenodo.org), [FigShare](https://figshare.com), -and [Dryad](http://datadryad.org). - -A DOI will be then assigned to their artifacts and must be provided in the Artifact Appendix! - -*Notes:* - -* ACM does not mandate the use of above repositories. However, publisher repositories, - institutional repositories, or open commercial repositories are acceptable - **only** if they have a declared plan to enable permanent accessibility! - **Personal web pages, GitHub, GitLab and BitBucket are not acceptable for this purpose.** -* Artifacts do not need to have been formally evaluated in order for an article - to receive this badge. In addition, they need not be complete in the sense - described above. They simply need to be relevant to the study and add value - beyond the text in the article. Such artifacts could be something as simple - as the data from which the figures are drawn, or as complex as a complete - software system under study. -* The authors can provide the DOI at the very end of the AE process - and use GitHub or any other convenient way to access their artifacts - during AE. - - -### Artifacts functional - -* Are all components relevant to evaluation included in the package? -* Well documented? Enough to understand, install and evaluate artifact? -* Exercisable? Includes scripts and/or software to perform appropriate experiments and generate results? -* Consistent? Artifacts are relevant to the associated paper and contribute in some inherent way to the generation of its main results? - -![](https://www.acm.org/binaries/content/gallery/acm/publications/replication-badges/artifacts_evaluated_functional_dl.jpg) - -*Note that proprietary artifacts need not be included. If they are required -to exercise the package then this should be documented, along with instructions -on how to obtain them. 
Proxies for proprietary data should be included so as to -demonstrate the analysis.* - -The artifacts associated with the paper will receive an -"Artifacts Evaluated - Functional" badge *only if* they are found to be documented, consistent, -complete, exercisable, and include appropriate evidence of verification and validation. - -We usually ask the authors to provide a small/sample data set to validate at least -some results from the paper to make sure that their artifact is functional. - -### Results reproduced - -* Was it possible to validate the key results from the paper using provided artifacts? - -![](https://www.acm.org/binaries/content/gallery/acm/publications/replication-badges/results_reproduced_dl.jpg) - -*You should report any unexpected artifact behavior to the authors (depends on the type of artifact such as unexpected output, scalability issues, crashes, performance variation, etc).* - -The artifacts associated with the paper will receive a "Results reproduced" badge *only if* the key results -of the paper have been obtained in a subsequent study by a person or team other than the authors, using -artifacts provided by the author. - -Some variation of empirical and numerical results is tolerated. -In fact it is often unavoidable in computer systems research - see -"how to report and compare empirical results" in the -[AE FAQ](faq.md) page, the [SIGPLAN Empirical Evaluation Guidelines](https://www.sigplan.org/Resources/EmpiricalEvaluation), -and the [NeurIPS reproducibility checklist](https://www.cs.mcgill.ca/~jpineau/ReproducibilityChecklist.pdf). - -*Since it may take weeks and even months to rerun some complex experiments - such as deep learning model training, we are discussing a staged AE where we will first validate that - artifacts are functional before the camera ready paper deadline, and then - use a separate AE with the full validation of all experimental results - with open reviewing and without strict deadlines. We successfully validated - a similar approach at the [MLCommons open reproducibility and optimization challenges)](https://access.cKnowledge.org) - and there is a related initiative at the [NeurIPS conference](https://openreview.net/group?id=NeurIPS.cc/2019/Reproducibility_Challenge).* - -### Artifacts reusable (pilot project with MLCommons) - -Since the criteria for the ACM "Artifacts Evaluated – Reusable" badge are very vague, we have partnered with -[MLCommons](https://mlcommons.org) to add their [unified and technology-agnostic Collective Mind automation interface (MLCommons CM)](https://doi.org/10.5281/zenodo.8105339) -to the shared artifacts. - -This non-intrusive interface was successfully validated to automate and unify the [Student Cluster Competition at SuperComputing'22](https://github.com/mlcommons/ck/blob/master/docs/tutorials/sc22-scc-mlperf.md) -and diverse [MLPerf benchmark community submissions and recent research papers](https://access.cknowledge.org/playground). - -That is why we would like to test it as a possible criteria to obtain the ACM "Artifacts Evaluated – Reusable" badge. - -Our goal is to help the community access diverse research projects, reproduce results and reuse artifacts -in a unified and automated way across continuously evolving software and hardware. 
- -*Note that it will be possible to prepare and run experiments without this interface too.* - -The authors will get free help from MLCommons and the community via the [public Discord server](https://discord.gg/JjWNWXKxwT) -and/or can try to add the MLCommons CM interface to their artifacts themselves using this [https://github.com/mlcommons/ck/blob/master/docs/tutorials/common-interface-to-reproduce-research-projects.mdtutorial). - - - - -## Distinguished artifact award - -When arranged by the event, an artifact can receive a distinguished artifact award if it is functional, well-documented, portable, easily reproducible and reusable by the community. - ----- - -*This document was prepared by [Grigori Fursin](https://cKnowledge.org/gfursin) - with contributions from [Bruce Childers](https://people.cs.pitt.edu/~childers), - [Michael Heroux](https://www.sandia.gov/~maherou), - [Michela Taufer](https://gcl.cis.udel.edu/personal/taufer) and others. - It is maintained by the [cTuning foundation](https://cTuning.org/ae) and the - [open MLCommons taskforce on automation and reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md).* +***Moved to https://github.com/ctuning/artifact-evaluation/blob/master/docs/reviewing.md*** diff --git a/docs/artifact-evaluation/submission.md b/docs/artifact-evaluation/submission.md index 404cb54b84..e7e92f907b 100644 --- a/docs/artifact-evaluation/submission.md +++ b/docs/artifact-evaluation/submission.md @@ -1,180 +1 @@ -[ [Back to index](https://cTuning.org/ae) ] - -# Artifact submission - -This document provides the guidelines to submit your artifacts for evaluation at ACM and IEEE conferences. - - - -## Motivation - - -It's becoming increasingly difficult to [reproduce results from CS papers](https://learning.acm.org/techtalks/reproducibility). -Voluntarily Artifact Evaluation (AE) was successfully introduced -at program languages, systems and machine learning conferences and tournaments -to validate experimental results by the independent AE Committee, share unified Artifact Appendices, -and assign reproducibility badges. - - -AE promotes the reproducibility of experimental results -and encourages artifact sharing to help the community quickly validate and compare alternative approaches. -Authors are invited to formally describe all supporting material (code, data, models, workflows, results) -using the [unified Artifact Appendix and the Reproducibility Checklist template](checklist.md) -and submit it to the [single-blind AE process](reviewing.md). -Reviewers will then collaborate with the authors to evaluate their artifacts and assign the following -[ACM reproducibility badges](https://www.acm.org/publications/policies/artifact-review-and-badging-current): - - -![](https://www.acm.org/binaries/content/gallery/acm/publications/replication-badges/artifacts_available_dl.jpg) -![](https://www.acm.org/binaries/content/gallery/acm/publications/replication-badges/artifacts_evaluated_functional_dl.jpg) -![](https://www.acm.org/binaries/content/gallery/acm/publications/replication-badges/results_reproduced_dl.jpg) - - - -## Preparing your Artifact Appendix and the Reproducibility Checklist - - -You need to prepare the [Artifact Appendix](https://github.com/mlcommons/ck/blob/master/docs/artifact-evaluation/template/ae.tex) -describing all software, hardware and data set dependencies, key results to be reproduced, and how to prepare, run and validated experiments. 
- -Though it is relatively intuitive and based on our -[past AE experience and your feedback](https://cTuning.org/ae/prior_ae.html), -we strongly encourage you to check the -the [Artifact Appendix guide](checklist.md), -[artifact reviewing guide](reviewing.md), -the [SIGPLAN Empirical Evaluation Guidelines](https://www.sigplan.org/Resources/EmpiricalEvaluation), -the [NeurIPS reproducibility checklist](https://www.cs.mcgill.ca/~jpineau/ReproducibilityChecklist.pdf) -and [AE FAQs](faq.md) before submitting artifacts for evaluation! - -You can find the examples of Artifact Appendices -in the following [reproduced papers](https://cknow.io/reproduced-papers). - - -*Since the AE methodology is slightly different at different conferences, we introduced the unified Artifact Appendix - with the Reproducibility Checklist in 2014 to help readers understand what was evaluated and how! - Furthermore, artifact evaluation often helps to discover some minor mistakes in accepted papers - - in such case you have a chance to add related notes and corrections - in the Artifact Appendix of your camera-ready paper!* - - - -## Preparing your experimental workflow - - -**You can skip this step if you want to share your artifacts without the validation of experimental results - - in such case your paper can still be entitled for the "artifact available" badge!** - -We strongly recommend you to provide at least some automation scripts to build your workflow, -all inputs to run your workflow, and some expected outputs to validate results from your paper. -You can then describe the steps to evaluate your artifact -using README files or [Jupyter Notebooks](https://jupyter.org "https://jupyter.org"). - -Feel free to reuse [portable CM scripts](https://github.com/mlcommons/ck/tree/master/cm-mlops/script) -being developed by the MLCommons to automate common steps to prepare and run various benchmarks -across continously changing software, hardware and data. - - -## Making artifacts available to evaluators - - -Most of the time, the authors make their artifacts available to the evaluators via GitHub, -GitLab, BitBucket or private repositories. Public artifact sharing allows -optional "open evaluation" which we have successfully validated at [ADAPT'16]( https://adapt-workshop.org) -and [ASPLOS-REQUEST'18](https://cknow.io/c/event/request-reproducible-benchmarking-tournament). -It allows the authors to quickly fix encountered issues during evaluation -before submitting the final version to archival repositories. - - -Other acceptable methods include: -* Using zip or tar files with all related code and data, particularly when your artifact - should be rebuilt on reviewers' machines (for example to have a non-virtualized access to a specific hardware). -* Using [Docker](https://www.docker.com "https://www.docker.com"), [Virtual Box](https://www.virtualbox.org "https://www.virtualbox.org") and other containers and VM images. -* Arranging remote access to the authors' machine with the pre-installed software - - this is an exceptional cases when rare or proprietary software and hardware is used. - You will need to privately send the private access information to the AE chairs. - - -Note that your artifacts will receive the ACM "artifact available" badge -**only if** they have been placed on any publicly accessible archival repository -such as [Zenodo](https://zenodo.org "https://zenodo.org"), [FigShare](https://figshare.com "https://figshare.com"), -and [Dryad](http://datadryad.org "http://datadryad.org"). 
-You will need to provide a DOI automatically assigned to your artifact by these repositories -in your final Artifact Appendix! - - - - - -## Submitting artifacts - - - - -Write a brief abstract describing your artifact, the minimal hardware and software requirements, -how it supports your paper, how it can be validated and what the expected result is. -Do not forget to specify if you use any proprietary software or hardware! -This abstract will be used by evaluators during artifact bidding to make sure that -they have an access to appropriate hardware and software and have required skills. - - -Submit the artifact abstract and the PDF of your paper with the Artifact Appendix attached -using the AE submission website provided by the event. - - - - - - -## Asking questions - - If you have questions or suggestions, - do not hesitate to get in touch with the the AE chairs or the community using - the public [Discord server](https://discord.gg/JjWNWXKxwT), - [Artifact Evaluation google group](https://groups.google.com/forum/#!forum/artifact-evaluation) - and weekly conf-calls of the [open MLCommons taskforce on automation and reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md). - -## Preparing your camera-ready paper - -If you have successfully passed AE with at least one reproducibility badge, -you will need to add up to 2 pages of your artifact appendix -to your camera ready paper while removing all unnecessary or confidential information. -This will help readers better understand what was evaluated and how. - - -If your paper is published in the ACM Digital Library, -you do not need to add reproducibility stamps - ACM will add them to your camera-ready paper -and will make this information available for search! - -In other cases, AE chairs will tell you how to add stamps to the first page of your paper. 
- - - - -## Examples of reproduced papers with shared artifacts and Artifact Appendices: - - - -* [Some papers from the past AE](https://cknow.io/?q=%22reproduced-papers%22) (ASPLOS, MICRO, MLSys, Supercomputing, CGO, PPoPP, PACT, IA3, ReQuEST) -* [Dashboards with reproduced results](https://cknow.io/?q=%22reproduced-results%22) -* Paper "Highly Efficient 8-bit Low Precision Inference of Convolutional Neural Networks with IntelCaffe" from ACM ASPLOS-ReQuEST'18 - * [Paper DOI](https://doi.org/10.1145/3229762.3229763) - * [Artifact DOI](https://doi.org/10.1145/3229769) - * [Original artifact](https://github.com/intel/caffe/wiki/ReQuEST-Artifact-Installation-Guide) - * [Portable automation](https://github.com/ctuning/ck-request-asplos18-caffe-intel) - * [Expected results](https://github.com/ctuning/ck-request-asplos18-results-caffe-intel) - * [Public scoreboard](https://cknow.io/result/pareto-efficient-ai-co-design-tournament-request-acm-asplos-2018) -* Paper "Software Prefetching for Indirect Memory Accesses" from CGO'17 - * [Portable automation at GitHub](https://github.com/SamAinsworth/reproduce-cgo2017-paper) - * [CK dashboard snapshot](https://github.com/SamAinsworth/reproduce-cgo2017-paper/files/618737/ck-aarch64-dashboard.pdf) - - - - ----- - -*This document was prepared by [Grigori Fursin](https://cKnowledge.org/gfursin "https://cKnowledge.org/gfursin") - with contributions from [Bruce Childers](https://people.cs.pitt.edu/~childers "https://people.cs.pitt.edu/~childers"), - [Michael Heroux](https://www.sandia.gov/~maherou "https://www.sandia.gov/~maherou"), - [Michela Taufer](https://gcl.cis.udel.edu/personal/taufer/ "https://gcl.cis.udel.edu/personal/taufer/") and others. - It is maintained by the [cTuning foundation](https://cTuning.org/ae) and the - [open MLCommons taskforce on automation and reproducibility](https://github.com/mlcommons/ck/blob/master/docs/mlperf-education-workgroup.md).* +***Moved to https://github.com/ctuning/artifact-evaluation/blob/master/docs/submission.md*** diff --git a/docs/artifact-evaluation/template/ae.tex b/docs/artifact-evaluation/template/ae.tex index d878bae389..7b7c123330 100644 --- a/docs/artifact-evaluation/template/ae.tex +++ b/docs/artifact-evaluation/template/ae.tex @@ -1,120 +1 @@ -% LaTeX template for Artifact Evaluation V20201122 -% -% Prepared by Grigori Fursin with contributions from Bruce Childers, -% Michael Heroux, Michela Taufer and other colleagues. -% -% See examples of this Artifact Appendix in -% * SC'17 paper: https://dl.acm.org/citation.cfm?id=3126948 -% * CGO'17 paper: https://www.cl.cam.ac.uk/~sa614/papers/Software-Prefetching-CGO2017.pdf -% * ACM ReQuEST-ASPLOS'18 paper: https://dl.acm.org/citation.cfm?doid=3229762.3229763 -% -% (C)opyright 2014-2023 -% -% CC BY 4.0 license -% - -\documentclass{sigplanconf} - -\usepackage{hyperref} - -\begin{document} - -\special{papersize=8.5in,11in} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -% When adding this appendix to your paper, -% please remove above part -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -\appendix -\section{Artifact Appendix} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Abstract} - -{\em Obligatory} - -\subsection{Artifact check-list (meta-information)} - -{\em Obligatory. Use just a few informal keywords in all fields applicable to your artifacts -and remove the rest. 
This information is needed to find appropriate reviewers and gradually -unify artifact meta information in Digital Libraries.} - -{\small -\begin{itemize} - \item {\bf Algorithm: } - \item {\bf Program: } - \item {\bf Compilation: } - \item {\bf Transformations: } - \item {\bf Binary: } - \item {\bf Model: } - \item {\bf Data set: } - \item {\bf Run-time environment: } - \item {\bf Hardware: } - \item {\bf Run-time state: } - \item {\bf Execution: } - \item {\bf Metrics: } - \item {\bf Output: } - \item {\bf Experiments: } - \item {\bf How much disk space required (approximately)?: } - \item {\bf How much time is needed to prepare workflow (approximately)?: } - \item {\bf How much time is needed to complete experiments (approximately)?: } - \item {\bf Publicly available?: } - \item {\bf Code licenses (if publicly available)?: } - \item {\bf Data licenses (if publicly available)?: } - \item {\bf Workflow framework used?: } - \item {\bf Archived (provide DOI)?: } -\end{itemize} -} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Description} - -\subsubsection{How to access} - -{\em Obligatory} - -\subsubsection{Hardware dependencies} - -\subsubsection{Software dependencies} - -\subsubsection{Data sets} - -\subsubsection{Models} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Installation} - -{\em Obligatory} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Experiment workflow} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Evaluation and expected results} - -{\em Obligatory} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Experiment customization} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Notes} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Methodology} - -Submission, reviewing and badging methodology: - -\begin{itemize} - \item \url{https://www.acm.org/publications/policies/artifact-review-and-badging-current} - \item \url{http://cTuning.org/ae/submission-20201122.html} - \item \url{http://cTuning.org/ae/reviewing-20201122.html} -\end{itemize} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -% When adding this appendix to your paper, -% please remove below part -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -\end{document} +Moved to https://github.com/ctuning/artifact-evaluation/blob/master/docs/template/ae.tex diff --git a/docs/history.md b/docs/history.md index 81754b405f..b92250b128 100644 --- a/docs/history.md +++ b/docs/history.md @@ -27,4 +27,5 @@ co-led with Arjun Suresh. We continue extending CM to support different MLCommons projects to modularize and unify benchmarking of ML/AI systems as a collaborative engineering effort based on [user feedback](../CONTRIBUTING.md). -You can learn more about the CM concept and motivation from the [keynote at ACM REP'23](https://doi.org/10.5281/zenodo.8105339). +You can learn more about the CM concept and motivation from the [keynote at ACM REP'23](https://doi.org/10.5281/zenodo.8105339) +and this [white paper](https://arxiv.org/abs/2406.16791). 
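To give a more concrete feel for the CM concept described in the keynote and white paper, here is a minimal usage sketch via the `cmind` Python package; it assumes the `mlcommons@cm4mlops` repository and the `detect,os` script tags from the catalogs referenced above, and the exact dictionary keys may differ between CM versions:

```python
# Minimal CM usage sketch (an assumption based on the CM docs above, not an official example).
# Install the framework first with: pip install cmind
import cmind

# Pull the repository with shared automation recipes (CM scripts).
r = cmind.access({'action': 'pull', 'automation': 'repo', 'artifact': 'mlcommons@cm4mlops'})
if r['return'] > 0:
    raise SystemExit(r.get('error', 'failed to pull mlcommons@cm4mlops'))

# Run one of the simplest recipes by its tags.
r = cmind.access({'action': 'run', 'automation': 'script', 'tags': 'detect,os', 'out': 'con'})
if r['return'] > 0:
    raise SystemExit(r.get('error', 'CM script failed'))

print('CM script finished successfully')
```

From the command line, these two steps roughly correspond to `cm pull repo mlcommons@cm4mlops` followed by `cm run script --tags=detect,os`.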
diff --git a/docs/mlperf/inference/README.md b/docs/mlperf/inference/README.md index f3cacaa2ec..fd6f84e296 100644 --- a/docs/mlperf/inference/README.md +++ b/docs/mlperf/inference/README.md @@ -258,7 +258,6 @@ You can see example of this visualization GUI [online](https://access.cknowledge ## Acknowledgments [Collective Mind](https://doi.org/10.5281/zenodo.8105339) is an open community project -led by [Grigori Fursin](https://cKnowledge.org/gfursin) and [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) to modularize AI benchmarks and provide a common interface to run them across diverse models, data sets, software and hardware - we would like to thank all our [great contributors](../../../CONTRIBUTING.md) for their feedback, support and extensions! diff --git a/docs/news-mlperf-v3.1.md b/docs/news-mlperf-v3.1.md index 37dd6c6793..06682b4ca8 100644 --- a/docs/news-mlperf-v3.1.md +++ b/docs/news-mlperf-v3.1.md @@ -32,13 +32,6 @@ We thank all [our contributors](https://access.cknowledge.org/playground/?action for interesting discussions and feedback that helped to improve the open-source MLCommons CM automation workflows for MLPerf benchmarks and [make them available to everyone](https://github.com/mlcommons/ck/tree/master/docs/mlperf)! -Join our public [Discord server](https://discord.gg/JjWNWXKxwT) to learn how to -use our open-source MLPerf automation and submit your results to MLPerf inference v4.0! - -Don't hesitate to contact [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) -and [Grigori Fursin](https://cKnowledge.org/gfursin) for more details about this community project from MLCommons and the cTuning foundation. - - ## New CM capabilities to automate experiments, optimizations and design space exploration diff --git a/docs/news.md b/docs/news.md index caa3e7d1e7..c49dcdd16e 100644 --- a/docs/news.md +++ b/docs/news.md @@ -39,7 +39,7 @@ * [Grigori Fursin](https://cKnowledge.org/gfursin) gave an invited talk at [AVCC'23](https://avcc.org/avcc2023) about our MLCommons CM automation language and how it can help to develop modular, portable and technology-agnostic benchmarks. -* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) and [Grigori Fursin](https://cKnowledge.org/gfursin) +* [Grigori Fursin](https://cKnowledge.org/gfursin) gave an [IISWC'23 tutorial](https://iiswc.org/iiswc2023/#/program/) about our CM workflow automation language and how it can make it easier for researchers to reproduce their projects and validate in the real world across rapidly evolving software and hardware. diff --git a/docs/tutorials/scc23-mlperf-inference-bert.md b/docs/tutorials/scc23-mlperf-inference-bert.md index dc40e47a0c..9f397cece1 100644 --- a/docs/tutorials/scc23-mlperf-inference-bert.md +++ b/docs/tutorials/scc23-mlperf-inference-bert.md @@ -1557,13 +1557,12 @@ We welcome other MLPerf and CM extensions including support for multi-node execu Please join our [Discord server](https://discord.gg/JjWNWXKxwT) to provide your feedback and participate in these community developments! 
+# Authors -## Acknowledgments +* [Grigori Fursin](https://cKnowledge.org/gfursin) (cTuning foundation and cKnowledge.org) +* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) (cTuning foundation and cKnowledge.org) -This tutorial, the MLCommons CM automation language, CM scripts and CM automation workflows -for MLPerf were developed by [Grigori Fursin](https://cKnowledge.org/gfursin) -and [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) ([cTuning foundation](https://cTuning.org) -and [cKnowledge.org](https://cKnowledge.org)) in collaboration with the community and MLCommons. +## Acknowledgments We thank Miro Hodak, Mitchelle Rasquinha, Amiya K. Maji, Ryan T DeRue, Michael Goin, Kasper Mecklenburg, Lior Khermosh, James Goel, Jinho Suh, Thomas Zhu, Peter Mattson, David Kanter, Vijay Janappa Reddi diff --git a/docs/tutorials/scc24-mlperf-inference.md b/docs/tutorials/scc24-mlperf-inference.md new file mode 100644 index 0000000000..951310a8f2 --- /dev/null +++ b/docs/tutorials/scc24-mlperf-inference.md @@ -0,0 +1,11 @@ +[ [Back to index](../README.md) ] + +# Tutorial to run and optimize MLPerf inference benchmark at SCC'24 + +TBD + +# Authors + +* [Arjun Suresh](https://www.linkedin.com/in/arjunsuresh) +* [Miro Hodak](https://www.linkedin.com/in/miroslav-hodak) +* [Grigori Fursin](https://cKnowledge.org/gfursin)