Has the demo gif for text generation been sped up? #12

Open
farazk86 opened this issue Dec 30, 2020 · 1 comment

Comments

@farazk86

Hi,

I cannot achieve the speed demonstrated in the gif: https://github.com/huggingface/tflite-android-transformers/tree/master/gpt2

It takes about 7 seconds to generate a single word on my build. I am even using GpuDelegate to run the interpreter on the GPU rather than the CPU, and it's still slower.

Has the gif been sped up? Am I the only one seeing this poor performance?

Thanks

@pierreduf

pierreduf commented Dec 12, 2022

Hi @farazk86,

I know that this message (and repo) is rather old, but I'm testing this demo and struggling to find a way to make it work with the GPU delegate. Would you mind sharing what you did?

My understanding is that the model is not adapted to run on the GPU, but I can't even start the app without a crash, so I'm curious to know how you did it. Without the modification below, the app runs perfectly and outputs about 1 word/sec.

If anyone else has insights about this, I would be really grateful as well (@Pierrci? @sayakpaul?). Sorry if that's a very noob question!

I had some difficulties related to the Gradle / TF versions, but now I can build a valid APK supporting the GPU with the following modifications:

GPT2Client.kt

    import org.tensorflow.lite.gpu.CompatibilityList
    import org.tensorflow.lite.gpu.GpuDelegate
    .......
    //val opts = Interpreter.Options()
    //opts.setNumThreads(NUM_LITE_THREADS)

    val compatList = CompatibilityList()

    val opts = Interpreter.Options().apply {
        if (compatList.isDelegateSupportedOnThisDevice) {
            // if the device has a supported GPU, add the GPU delegate
            val delegateOptions = compatList.bestOptionsForThisDevice
            this.addDelegate(GpuDelegate(delegateOptions))
        } else {
            // if the GPU is not supported, run on 4 threads
            this.setNumThreads(NUM_LITE_THREADS)
        }
    }

and, of course, adding the following to build.gradle:

    implementation 'org.tensorflow:tensorflow-lite:2.5.0'
    implementation 'org.tensorflow:tensorflow-lite-gpu:2.5.0'

But when I run the app, it crashes on startup with the following error (tflite 2.3):

12-12 10:55:56.204  3214  3214 D Launcher: onStop
12-12 10:55:56.241 15950 15956 I zygote64: Do partial code cache collection, code=59KB, data=38KB
12-12 10:55:56.241 15950 15956 I zygote64: After code cache collection, code=57KB, data=37KB
12-12 10:55:56.241 15950 15956 I zygote64: Increasing code cache capacity to 256KB
12-12 10:55:56.461 15950 15969 D libGLESv3: Successfully load libGLESv2_oneplus.so, this=0x7581a5c008
12-12 10:55:56.463 15950 15969 I tflite  : Created TensorFlow Lite delegate for GPU.
12-12 10:55:56.466 15950 15969 I tflite  : Initialized TensorFlow Lite runtime.
12-12 10:55:56.477 15950 15969 I tflite  : Created 0 GPU delegate kernels.
12-12 10:16:41.414  8335  8335 E AndroidRuntime: FATAL EXCEPTION: main
12-12 10:16:41.414  8335  8335 E AndroidRuntime: Process: co.huggingface.android_transformers.gpt2, PID: 8335
12-12 10:16:41.414  8335  8335 E AndroidRuntime: java.lang.IllegalArgumentException: ByteBuffer is not a valid flatbuffer model
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.createModelWithBuffer(Native Method)
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.<init>(NativeInterpreterWrapper.java:60)
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at org.tensorflow.lite.Interpreter.<init>(Interpreter.java:224)
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at co.huggingface.android_transformers.gpt2.ml.GPT2Client$loadModel$2.invokeSuspend(GPT2Client.kt:137)
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at kotlinx.coroutines.DispatchedTask.run(Dispatched.kt:241)
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:594)
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler.access$runSafely(CoroutineScheduler.kt:60)
12-12 10:16:41.414  8335  8335 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:740)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: FATAL EXCEPTION: main
12-12 10:55:56.488 15950 15950 E AndroidRuntime: Process: co.huggingface.android_transformers.gpt2, PID: 15950
12-12 10:55:56.488 15950 15950 E AndroidRuntime: java.lang.IllegalArgumentException: Internal error: Failed to apply delegate: Following operations are not supported by GPU delegate:
12-12 10:55:56.488 15950 15950 E AndroidRuntime: DEQUANTIZE: 
12-12 10:55:56.488 15950 15950 E AndroidRuntime: DIV: Op can only handle 1 or 2 operand(s).
12-12 10:55:56.488 15950 15950 E AndroidRuntime: GATHER: Operation is not supported.
12-12 10:55:56.488 15950 15950 E AndroidRuntime: PACK: Operation is not supported.
12-12 10:55:56.488 15950 15950 E AndroidRuntime: POW: Op can only handle 1 or 2 operand(s).
12-12 10:55:56.488 15950 15950 E AndroidRuntime: SPLIT: Operation is not supported.
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 106 operations will run on the GPU, and the remaining 2317 operations will run on the CPU.
12-12 10:55:56.488 15950 15950 E AndroidRuntime: TfLiteGpuDelegate Init: SLICE: Output batch don't match
12-12 10:55:56.488 15950 15950 E AndroidRuntime: TfLiteGpuDelegate Prepare: delegate is not initialized
12-12 10:55:56.488 15950 15950 E AndroidRuntime: Node nu
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.applyDelegate(Native Method)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.applyDelegates(NativeInterpreterWrapper.java:351)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.init(NativeInterpreterWrapper.java:82)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.<init>(NativeInterpreterWrapper.java:63)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at org.tensorflow.lite.Interpreter.<init>(Interpreter.java:266)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at co.huggingface.android_transformers.gpt2.ml.GPT2Client$loadModel$2.invokeSuspend(GPT2Client.kt:155)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at kotlinx.coroutines.DispatchedTask.run(Dispatched.kt:241)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:594)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler.access$runSafely(CoroutineScheduler.kt:60)
12-12 10:55:56.488 15950 15950 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:740)
12-12 10:55:56.494 15950 15979 D OSTracker: OS Event: crash
12-12 10:55:56.496  1222  2239 W ActivityManager:   Force finishing activity co.huggingface.android_transformers.gpt2/.MainActivity
12-12 10:55:56.498  1222  1748 I ActivityManager: Showing crash dialog for package co.huggingface.android_transformers.gpt2 u0
12-12 10:55:56.502  1222  1747 D RestartProcessManager: Duration is too short, ignore : 696 in co.huggingface.android_transformers.gpt2
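
To put the summary line in the log above into numbers, here is a minimal Kotlin sketch that parses it and computes the share of operations the delegate accepted. The `OpSplit` class and `parseOpSplit` function are hypothetical helpers written for this comment, not part of the demo app or of TensorFlow Lite. For the 2.3 log, 106 of 2423 operations (roughly 4%) would run on the GPU, which may help explain why the delegate brings little benefit for this model even when it does load.

```kotlin
// Hypothetical helper (not part of the demo app or of TensorFlow Lite):
// holds the GPU/CPU operation counts reported by the GPU delegate.
data class OpSplit(val gpu: Int, val cpu: Int) {
    val total: Int get() = gpu + cpu
    val gpuShare: Double get() = gpu.toDouble() / total
}

// Matches the summary sentence exactly as it appears in the logcat output.
private val SPLIT_REGEX = Regex(
    """(\d+) operations will run on the GPU, and the remaining (\d+) operations will run on the CPU"""
)

// Returns the parsed counts, or null if the line is not a summary line.
fun parseOpSplit(logLine: String): OpSplit? {
    val match = SPLIT_REGEX.find(logLine) ?: return null
    val (gpu, cpu) = match.destructured
    return OpSplit(gpu.toInt(), cpu.toInt())
}

fun main() {
    val line = "12-12 10:55:56.488 15950 15950 E AndroidRuntime: " +
        "106 operations will run on the GPU, and the remaining 2317 operations will run on the CPU."
    val split = parseOpSplit(line)
    if (split != null) {
        println("total ops: ${split.total}, GPU share: ${split.gpuShare}")
    }
}
```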

With tflite 2.4 it's a bit different:

12-12 11:08:18.914 17407 17426 I tflite  : Created TensorFlow Lite delegate for GPU.
12-12 11:08:18.917 17407 17426 I tflite  : Initialized TensorFlow Lite runtime.
12-12 11:08:18.928 17407 17426 I tflite  : Created 0 GPU delegate kernels.
12-12 11:08:18.959 17407 17407 E AndroidRuntime: FATAL EXCEPTION: main
12-12 11:08:18.959 17407 17407 E AndroidRuntime: Process: co.huggingface.android_transformers.gpt2, PID: 17407
12-12 11:08:18.959 17407 17407 E AndroidRuntime: java.lang.IllegalArgumentException: Internal error: Failed to apply delegate: Following operations are not supported by GPU delegate:
12-12 11:08:18.959 17407 17407 E AndroidRuntime: DEQUANTIZE: 
12-12 11:08:18.959 17407 17407 E AndroidRuntime: GATHER: Operation is not supported.
12-12 11:08:18.959 17407 17407 E AndroidRuntime: MEAN: Mean operation supports only HW plane
12-12 11:08:18.959 17407 17407 E AndroidRuntime: SPLIT: Operation is not supported.
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 147 operations will run on the GPU, and the remaining 2276 operations will run on the CPU.
12-12 11:08:18.959 17407 17407 E AndroidRuntime: TfLiteGpuDelegate Init: Tensor "Identity_8" has bad input dims size: 5.
12-12 11:08:18.959 17407 17407 E AndroidRuntime: TfLiteGpuDelegate Prepare: delegate is not initialized
12-12 11:08:18.959 17407 17407 E AndroidRuntime: Node number 2423 (TfLiteGpuDelegateV2) failed to prepare.
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 
12-12 11:08:18.959 17407 17407 E AndroidRuntime: Restored
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.applyDelegate(Native Method)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.applyDelegates(NativeInterpreterWrapper.java:367)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.init(NativeInterpreterWrapper.java:85)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at org.tensorflow.lite.NativeInterpreterWrapper.<init>(NativeInterpreterWrapper.java:63)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at org.tensorflow.lite.Interpreter.<init>(Interpreter.java:277)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at co.huggingface.android_transformers.gpt2.ml.GPT2Client$loadModel$2.invokeSuspend(GPT2Client.kt:155)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at kotlinx.coroutines.DispatchedTask.run(Dispatched.kt:241)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:594)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler.access$runSafely(CoroutineScheduler.kt:60)
12-12 11:08:18.959 17407 17407 E AndroidRuntime: 	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:740)
12-12 11:08:18.966 17407 17436 D OSTracker: OS Event: crash
12-12 11:08:18.967  1222  3147 W ActivityManager:   Force finishing activity co.huggingface.android_transformers.gpt2/.MainActivity
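
Since both stack traces show the `Interpreter` constructor throwing `IllegalArgumentException` when the delegate cannot be applied, one way to at least avoid the startup crash might be to catch that exception and retry on the CPU. The sketch below shows only the fallback pattern, in plain Kotlin without TensorFlow Lite imports: `createWithFallback`, `createGpu`, `createCpu`, and `onGpuFailure` are hypothetical names, and in the app the two factories would presumably wrap `Interpreter(model, opts)` calls built with and without the GPU delegate. This would not make the model run on the GPU; it only falls back to the CPU path.

```kotlin
// Hypothetical fallback helper (an assumption, not code from the demo app):
// tries the GPU-backed factory first and falls back to the CPU-backed one
// if construction fails with IllegalArgumentException, as in the logs above.
fun <T> createWithFallback(
    createGpu: () -> T,
    createCpu: () -> T,
    onGpuFailure: (IllegalArgumentException) -> Unit = {}
): T = try {
    createGpu()
} catch (e: IllegalArgumentException) {
    // Report the delegate error (e.g. log it) before falling back.
    onGpuFailure(e)
    createCpu()
}

fun main() {
    // Stand-in factories: the GPU one fails the way the delegate does,
    // the CPU one succeeds. Strings replace real Interpreter instances.
    val backend = createWithFallback<String>(
        createGpu = { throw IllegalArgumentException("Internal error: Failed to apply delegate") },
        createCpu = { "cpu" },
        onGpuFailure = { println("GPU delegate rejected: ${it.message}") }
    )
    println("running on: $backend")
}
```

Note that the "ByteBuffer is not a valid flatbuffer model" error in the first log is also an `IllegalArgumentException`, so a catch this broad would retry in that case too; checking the exception message before falling back may be preferable.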
