Hi, to finetune on the KoLLAVA Dataset, you can convert its data into a parquet file that pandas can read via 'pandas.read_parquet("parquet path")'. Each row of the loaded DataFrame should contain three columns:
- prompt: the prompt passed to the large language model
- ground_truth: the desired model output
- image: the image stored as a byte string, which can be produced with cv2.imencode
An example of the prepared dataset is shown below:
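Here is a minimal sketch of how such a file could be assembled. The sample prompt, answer, image path, and output file name are placeholders; only the three column names come from the description above, and the exact prompt format expected by the model is not covered here.

```python
import cv2
import pandas as pd

# Placeholder samples; replace with the actual KoLLAVA records and image paths.
samples = [
    {
        "prompt": "Describe this image.",
        "ground_truth": "A dog running on a beach.",
        "image_path": "example.jpg",
    },
]

rows = []
for s in samples:
    img = cv2.imread(s["image_path"])      # load the image as a NumPy array
    ok, buf = cv2.imencode(".jpg", img)    # re-encode it to JPEG bytes in memory
    if not ok:
        raise RuntimeError(f"failed to encode {s['image_path']}")
    rows.append(
        {
            "prompt": s["prompt"],
            "ground_truth": s["ground_truth"],
            "image": buf.tobytes(),         # byte string column expected by the loader
        }
    )

df = pd.DataFrame(rows, columns=["prompt", "ground_truth", "image"])
print(df.head())                            # preview of the prepared dataset
df.to_parquet("kollava_finetune.parquet")   # readable via pandas.read_parquet
```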
Then replace 'data_path' in scripts/pretrain.sh and scripts/finetune.sh with the path containing the parquet file prepared above, and pass '--is_parquet True'. Finally, once you have configured the vision encoder and LLM for your needs, you can train your own VLM.
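Before launching the training scripts, it may be worth a quick round-trip check that the file can be read back as described; this is only a sanity-check sketch, and the file name is the placeholder from the snippet above.

```python
import cv2
import numpy as np
import pandas as pd

# Read the file back the same way the training code would.
df = pd.read_parquet("kollava_finetune.parquet")

# Check that the three expected columns are present.
assert {"prompt", "ground_truth", "image"} <= set(df.columns)

# Decode the first stored byte string back into an image to confirm it round-trips.
buf = np.frombuffer(df.iloc[0]["image"], dtype=np.uint8)
img = cv2.imdecode(buf, cv2.IMREAD_COLOR)
assert img is not None, "stored bytes could not be decoded as an image"
print(df.iloc[0]["prompt"], img.shape)
```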
I want to finetune this VLM on the KoLLAVA Dataset.
How can I do it?
I would really appreciate your help.