Timed preprocessing #184

Open
DilipSequeira opened this issue Oct 23, 2020 · 10 comments
Labels
WG v1.1 / backlog: WG will track this item for v0.7 for resolution

Comments

@DilipSequeira
Contributor

For the March '21 round (1.0?), we would like to see consideration of more timed preprocessing in the datacenter scenarios, specifically for the image models and for 3D-UNet. For edge, it makes sense that the submitter gets to choose the format, because input is often coming in from a camera pipeline; for datacenter, input will typically be some form of compressed data (e.g., JPEG).
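For concreteness, a minimal sketch of what moving decoding into the timed region could look like. This is illustrative only, not the actual LoadGen API or reference harness: the PIL/NumPy decode path and the `run_inference` callable are assumptions.

```python
import io
import time

import numpy as np
from PIL import Image

def decode_and_preprocess(jpeg_bytes, size=(224, 224)):
    # Hypothetical ResNet50-style preprocessing: decode the JPEG, resize,
    # scale to [0, 1], and lay the result out as a 1x3xHxW float tensor.
    img = Image.open(io.BytesIO(jpeg_bytes)).convert("RGB").resize(size)
    x = np.asarray(img, dtype=np.float32) / 255.0
    return np.transpose(x, (2, 0, 1))[np.newaxis, ...]

def timed_query(jpeg_bytes, run_inference):
    # Under this proposal the measured latency covers decode + preprocessing
    # + inference, rather than starting from an already-decoded tensor.
    start = time.perf_counter()
    tensor = decode_and_preprocess(jpeg_bytes)
    result = run_inference(tensor)
    return result, time.perf_counter() - start
```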

Let's discuss in the WG.

@tjablin
Collaborator

tjablin commented Oct 27, 2020

I am broadly supportive of this change. In a perfect world, we would do this at the same time we switch to loadgen-over-network, but I don't think we have time this round for loadgen-over-network. For consistency, should we time all pre-processing?

@tjablin tjablin linked a pull request Oct 27, 2020 that will close this issue
@christ1ne
Contributor

christ1ne commented Oct 27, 2020

Proposal:

  • What? For datacenter only, and for ResNet50, SSD-ResNet34, and 3D-UNet: the input is now JPEG images; the output is the same as before (see the sketch below).
  • When? TBD
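As a hedged sketch of what that would mean on the data side (the file layout and helper names are hypothetical, not the reference implementation): the sample library would serve compressed bytes instead of pre-decoded tensors, so decoding moves into the measured window.

```python
from pathlib import Path

import numpy as np

# Status quo: samples are tensors decoded and resized offline, so JPEG
# decoding never shows up in the measured latency.
def load_sample_untimed(sample_dir: Path, name: str) -> np.ndarray:
    return np.load(sample_dir / f"{name}.npy")

# Proposal: samples are the raw JPEG bytes; decoding happens in the timed
# path (see the decode_and_preprocess sketch above).
def load_sample_timed(sample_dir: Path, name: str) -> bytes:
    return (sample_dir / f"{name}.jpg").read_bytes()
```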

@TheKanter
Contributor

WG comments:

This is for data center only, under the theory that data center inputs are usually compressed. OTOH, edge inputs are often raw.

Also, timed pre-processing should be added to all benchmarks (it is already present in RNN-T, and many other benchmarks need none).

This is a good topic for more discussion.

@aaronzhongii

As Scott pointed out, this is unfair to inference-only chip vendors. If preprocessing is timed, it is really hard to interpret a result in which a third-party image decompressor plays a huge role. Only GPUs have the capability to handle both at the same time, so if MLPerf promotes this, does that mean the MLPerf WG prefers GPUs over inference-only chips?

@christ1ne
Contributor

  • Need to consider whether this will further narrow the pool of potential submitters.
  • Better to get use cases from the vision advisory board.
  • Will hear from David's survey of the v0.7 submitters and non-submitters.

@tjablin
Collaborator

tjablin commented Nov 3, 2020

Only GPUs have the capability to handle both at the same time, so if MLPerf promotes this, does that mean the MLPerf WG prefers GPUs over inference-only chips?

MLPerf ought to reward good designs. Image decompression is an important part of inference for many workloads. It is appropriate that better architectures with more capabilities have higher performance as measured by MLPerf.

If preprocessing is timed, it is really hard to interpret a result in which a third-party image decompressor plays a huge role.

The only performance that benefits customers is end-to-end performance. If a chip is decompression-limited, it is misleading to publish numbers that ignore this limitation. MLPerf ought to publish performance numbers that most clearly reflect real-world performance. There are already high-performance open-source image decompression libraries, so it is unlikely anyone will gain an advantage by optimizing them. Submitters with dedicated hardware for image decoding ought to be rewarded for their ingenuity.

Hopefully, measuring preprocessing time will guide submitters toward measuring systems with realistic balances of decompression and inference capacity.
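A back-of-envelope illustration with hypothetical numbers: if an accelerator can run inference at 10,000 images/s but its host can only decode JPEGs at 2,000 images/s, the timed end-to-end throughput is bounded by the slower stage, min(10,000, 2,000) = 2,000 images/s, so an inference-only number would overstate deliverable performance by 5x.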

@christ1ne
Contributor

christ1ne commented Nov 10, 2020

WG: will update with survey results next week. If there is no consensus among submitters, we will rely on the future vision advisory board.
Scott suggested a separate end-to-end benchmark that tracks the whole system, including the networking card and graphics, and somehow accounts for their cost.

@christ1ne christ1ne reopened this Nov 10, 2020
@christ1ne
Contributor

@TheKanter will follow up on data center specific submitters.

@christ1ne christ1ne added the WG v1.1 / backlog label and removed the WG v1.0 label Nov 17, 2020
@tjablin
Collaborator

tjablin commented Nov 30, 2020

I think we are out of time to land this for 1.0. I propose merging this with loadgen-over-network and aiming for Inference 1.1. Dilip, what do you think?

@DilipSequeira
Contributor Author

Agreed on all counts.
