-
Notifications
You must be signed in to change notification settings - Fork 127
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
5b93079
commit 22842e1
Showing
3 changed files
with
40 additions
and
17 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things:
As far as I understand, --optimize is currently a command line option in a released version of Marian. Removal of that command line option is a change of the API (well, the CLI, but the idea is the same) and would thus mandate a change in the major version of Marian (due to semantic versioning that Marian has adopted). I would therefore advise to leave that option in (e.g., as an alias for --gemm-precision int16); simply for backwards compatibility.
There's no gemm with float16?
I assume that all the '... shift ...' options relate to the bias. Would it make sense to have --gem-precision {float32 | [float16 |] int16 | int8} and --gemm-int8-shift { none | shift | alpha | all | alpha-all }? If gemm-shift has no effect in certain situations, that's fine; it's options that contradict each other that I don't like.
The whole int-gemm thing should be documented somewhere. Most users will have no idea what it's all about and what all these options mean. Ultimately, this should be in the Marian documentation, but that doesn't seem to have been updated in quite some time. For the time being, the Wiki should do.
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ugermann
--optimize
. In my pull request for intgemm, I have already proposed a change similar to this interface. There's no alpha stuff in that pull request so it makes things a bit simpler. Marcin is supposed to review it at some point.--fp16
mode is pervasive, not just for the GEMM operations. Also, it's only available on GPU. Theoptimize
stuff is quantization before performing gemm and requantization after.22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re 1.: Well then, my biggest concern a code reviewer insisting that --optimize stay in there. If --optimize is not being loved, my point regarding that one is sort-of mute (notwithstanding the semantic versioning issue).
re 3.: I have no strong opinion on that; fine with having the full slew of options as possible values to a single string-valued parameter.
re 4.: I was thinking about the marian-dev wiki https://github.com/marian-nmt/marian-dev/wiki. This is Marian-related, not Bergamot-related.
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re 4: This won't be accepted in the marian wiki until intgemm makes into master, and even then, that would be without the alpha stuff. I propose a wiki entry in the bergamot wiki, which we will then copy + the single string format with the asterisk it might have to change based on how Marcin decides to treat the command line switches.
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oi the joys of being a Marian contributor! Remind me: why again are we doing this?
I don't really care much about where it is as long as it's somewhere reasonable, but bergamot/coordination doesn't strike me as the right place. I don't mind if it's on on the bergamot/mts wiki, or on the wiki of the fork that you created on your user account. It could also just be a markdown file in ./docs in marian-dev in whatever branch we are on these days.
What I do care about is some form of explanation of what all these things mean, because to the uninitiated, the intgemm stuff will require a considerable amount of explanation.
You know what: you could also create a blog post (wherever, as long as it's public and somewhat permanent) where you explain things. That way you will get at least the recognition for implementing and documenting this stuff. So basically I want something where the help message would say: see .... for details. I kind-of get the intgemm stuff (do integer computations instead of floats), but "precomputed alpha" is something that I'd need to research in more detail.
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work for you? If it's not clear, i can expand: https://github.com/browsermt/coordination/wiki/Low-precision-GEMM-and-Intgemm
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for, why are we doing this, because the in master implementation of 8bit CPU GEMM (fbgemm) doesn't allow for architecture agnostic binary format and also requires prior binarzation AND is slower.
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said earlier, I'm not sure if https://github.com/browsermt/coordination/wiki is the right place, but "anywhere" is much, much better than "nowhere", so I'm more than happy to go along with this for the time being. It should be reasonably easy to move documentation to the right place eventually.
So thanks for the documentation! It greatly helps me to understand what's going on. I may comment in more detail later, but for the time being, this is a very good start!
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, let me know if more things are necessary, I can add points and clarifications. It doesn't cost that much more effort after the initial effort.
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Doing this" referred to contribution to Marian in general and was a sigh of despair with respect to getting pull requests into master. ;-)
Let me know when this is ready to be pulled into the current rest server branch.
22842e1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to the reader: back to this thread: browsermt/mts#1