Authors | Daniil Dorin, Nikita Kiselev, Ernest Nasyrov, Kirill Semkin |
Advisor | Vadim Strijov, DSc |
Consultant | Andrey Grabovoy, PhD |
- Reconstruct the images that participants viewed during a simultaneous fMRI-EEG recording
- No existing method uses a simultaneous fMRI-EEG signal; current methods use fMRI or EEG separately
- LinkReview
Decoding human vision from neural signals has long attracted interest in neuroscience and machine learning. Modern contrastive learning and generative models have improved the performance of visual decoding and reconstruction based on functional Magnetic Resonance Imaging (fMRI) and electroencephalography (EEG). However, combining these two modalities to decode visual stimuli remains difficult, in part because of a lack of training data. In this study, we present an end-to-end, zero-shot fMRI-EEG visual reconstruction framework. It consists of multiple tailored brain encoders and a fusion module, which projects neural signals from different sources into the same shared subspace as the CLIP embedding, and a two-stage multi-pipe fMRI-EEG-to-image generation strategy. In stage one, fMRI and EEG are embedded to align with the high-level CLIP embedding, and then a prior diffusion model refines the combined embedding into image priors. In stage two, we input this combined embedding to a pre-trained diffusion model. The experimental results indicate that our zero-shot fMRI-EEG visual framework achieves state-of-the-art (SOTA) reconstruction performance, highlighting the portability, low cost, and high temporal and spatial resolution of combined fMRI-EEG, and enabling a wide range of BCI applications.
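The alignment step in stage one can be illustrated with a minimal sketch. It is not the authors' implementation: the linear projections standing in for the tailored brain encoders, the averaging fusion rule, the temperature of 0.07, and all function names (`fuse`, `clip_style_loss`, and the helpers) are assumptions made for illustration. The sketch projects fMRI and EEG features into one shared space, fuses them, and scores the fused embeddings against image embeddings with a symmetric CLIP-style contrastive loss.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def l2_normalize(v, eps=1e-12):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / (norm + eps) for a in v]


def fuse(fmri_feat, eeg_feat, W_fmri, W_eeg):
    """Project both modalities into the shared space and average them.

    Linear projections and averaging are placeholder assumptions;
    the source does not specify the encoders or the fusion rule.
    """
    z_fmri = matvec(W_fmri, fmri_feat)
    z_eeg = matvec(W_eeg, eeg_feat)
    return l2_normalize([(a + b) / 2 for a, b in zip(z_fmri, z_eeg)])


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


def clip_style_loss(brain_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss: brain_emb[i] should match image_emb[i]."""
    n = len(brain_emb)
    logits = [
        [sum(a * b for a, b in zip(brain_emb[i], image_emb[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]
    logits_t = [list(col) for col in zip(*logits)]
    brain_to_image = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    image_to_brain = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n
    return (brain_to_image + image_to_brain) / 2


if __name__ == "__main__":
    # Toy data: 2-D fMRI features, 1-D EEG features, 2-D shared space.
    W_fmri = [[1.0, 0.0], [0.0, 1.0]]
    W_eeg = [[1.0], [-1.0]]
    brain = [
        fuse([1.0, 0.0], [0.5], W_fmri, W_eeg),
        fuse([0.0, 1.0], [-0.5], W_fmri, W_eeg),
    ]
    images = [l2_normalize([1.0, 0.0]), l2_normalize([0.0, 1.0])]
    print("contrastive loss:", clip_style_loss(brain, images))
```

In a real pipeline the image embeddings would come from a pre-trained CLIP image encoder and the projections would be learned by minimizing this loss; the fused embedding would then condition the diffusion stages described above.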