papers.json

{"source-free": {"An Uncertainty-guided Tiered Self-training Framework for Active Source-free Domain Adaptation in Prostate Segmentation": "|**2024-7-3**|**An Uncertainty-guided Tiered Self-training Framework for Active Source-free Domain Adaptation in Prostate Segmentation**|Zihao Luo et.al|[paper](https://arxiv.org/abs/2407.02893)|[code](https://github.com/HiLab-git/UGTST)|-|\n", "Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation": "|**2024-6-26**|**Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation**|Shivang Chopra et.al|[paper](https://arxiv.org/abs/2402.04929)|-|<details><summary>detail</summary>arXiv admin note: substantial text overlap with arXiv:2310</details>|\n", "Advancing UWF-SLO Vessel Segmentation with Source-Free Active Domain Adaptation and a Novel Multi-Center Dataset": "|**2024-6-19**|**Advancing UWF-SLO Vessel Segmentation with Source-Free Active Domain Adaptation and a Novel Multi-Center Dataset**|Hongqiu Wang et.al|[paper](https://arxiv.org/abs/2406.13645)|[code](https://github.com/whq-xxh/SFADA-UWF-SLO.)|<details><summary>detail</summary>MICCAI 2024 Early Accept</details>|\n", "Evidentially Calibrated Source-Free Time-Series Domain Adaptation with Temporal Imputation": "|**2024-6-12**|**Evidentially Calibrated Source-Free Time-Series Domain Adaptation with Temporal Imputation**|Mohamed Ragab et.al|[paper](https://arxiv.org/abs/2406.02635)|-|-|\n", "Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation": "|**2024-6-10**|**Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation**|Dong Zhao et.al|[paper](https://arxiv.org/abs/2406.06813)|-|<details><summary>detail</summary>2024 Conference on Computer Vision and Pattern Recognition</details>|\n", "Source -Free Domain Adaptation for Speaker Verification in Data-Scarce Languages and Noisy Channels": "|**2024-6-9**|**Source -Free Domain Adaptation for Speaker Verification in Data-Scarce Languages and Noisy 
Channels**|Shlomo Salo Elia et.al|[paper](https://arxiv.org/abs/2406.05863)|-|-|\n", "Proxy Denoising for Source-Free Domain Adaptation": "|**2024-6-3**|**Proxy Denoising for Source-Free Domain Adaptation**|Song Tang et.al|[paper](https://arxiv.org/abs/2406.01658)|-|-|\n", "CLIP-Guided Source-Free Object Detection in Aerial Images": "|**2024-5-30**|**CLIP-Guided Source-Free Object Detection in Aerial Images**|Nanqing Liu et.al|[paper](https://arxiv.org/abs/2401.05168)|[code](https://github.com/Lans1ng/SFOD-RS.)|<details><summary>detail</summary>Accepted by IGARSS2024</details>|\n", "Empowering Source-Free Domain Adaptation with MLLM-driven Curriculum Learning": "|**2024-5-28**|**Empowering Source-Free Domain Adaptation with MLLM-driven Curriculum Learning**|Dongjie Chen et.al|[paper](https://arxiv.org/abs/2405.18376)|[code](https://github.com/Dong-Jie-Chen/RCL.)|-|\n", "Reliable Source Approximation: Source-Free Unsupervised Domain Adaptation for Vestibular Schwannoma MRI Segmentation": "|**2024-5-25**|**Reliable Source Approximation: Source-Free Unsupervised Domain Adaptation for Vestibular Schwannoma MRI Segmentation**|Hongye Zeng et.al|[paper](https://arxiv.org/abs/2405.16102)|[code](https://github.com/zenghy96/Reliable-Source-Approximation.)|<details><summary>detail</summary>Early accepted by MICCAI 2024</details>|\n", "Selection, Ensemble, and Adaptation: Advancing Multi-Source-Free Domain Adaptation via Architecture Zoo": "|**2024-5-23**|**Selection, Ensemble, and Adaptation: Advancing Multi-Source-Free Domain Adaptation via Architecture Zoo**|Jiangbo Pei et.al|[paper](https://arxiv.org/abs/2403.01582)|-|-|\n", "SpGesture: Source-Free Domain-adaptive sEMG-based Gesture Recognition with Jaccard Attentive Spiking Neural Network": "|**2024-5-23**|**SpGesture: Source-Free Domain-adaptive sEMG-based Gesture Recognition with Jaccard Attentive Spiking Neural Network**|Weiyu Guo 
et.al|[paper](https://arxiv.org/abs/2405.14398)|[code](https://anonymous.4open.science/r/SpGesture.)|-|\n", "SepRep-Net: Multi-source Free Domain Adaptation via Model Separation And Reparameterization": "|**2024-5-17**|**SepRep-Net: Multi-source Free Domain Adaptation via Model Separation And Reparameterization**|Ying Jin et.al|[paper](https://arxiv.org/abs/2402.08249)|-|-|\n", "Source-Free Domain Adaptation of Weakly-Supervised Object Localization Models for Histology": "|**2024-5-12**|**Source-Free Domain Adaptation of Weakly-Supervised Object Localization Models for Histology**|Alexis Guichemerre et.al|[paper](https://arxiv.org/abs/2404.19113)|-|-|\n", "High-order Neighborhoods Know More: HyperGraph Learning Meets Source-free Unsupervised Domain Adaptation": "|**2024-5-11**|**High-order Neighborhoods Know More: HyperGraph Learning Meets Source-free Unsupervised Domain Adaptation**|Jinkun Jiang et.al|[paper](https://arxiv.org/abs/2405.06916)|-|-|\n", "Source-Free Domain Adaptation for Speaker Verification in Data-Scarce Languages and Noisy Channels": "|**2024-7-2**|**Source-Free Domain Adaptation for Speaker Verification in Data-Scarce Languages and Noisy Channels**|S Salo Elia et.al|[paper](https://ui.adsabs.harvard.edu/abs/2024arXiv240605863S/abstract)|[code](https://paperswithcode.com/paper/source-free-domain-adaptation-for-speaker)|-|\n", "Global self-sustaining and local inheritance for source-free unsupervised domain adaptation": "|**2024-7-1**|**Global self-sustaining and local inheritance for source-free unsupervised domain adaptation**|L Peng et.al|[paper](https://www.sciencedirect.com/science/article/pii/S0031320324004308)|-|<details><summary>detail</summary>Pattern Recognition, 2024 Elsevier</details>|\n", "Unveiling the Unknown: Unleashing the Power of Unknown to Known in Open-Set Source-Free Domain Adaptation": "|**2024-6-30**|**Unveiling the Unknown: Unleashing the Power of Unknown to Known in Open-Set Source-Free Domain Adaptation**|F Wan 
et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/html/Wan_Unveiling_the_Unknown_Unleashing_the_Power_of_Unknown_to_Known_CVPR_2024_paper.html)|[code](https://paperswithcode.com/paper/unveiling-the-unknown-unleashing-the-power-of)|<details><summary>detail</summary>Proceedings of the IEEE\u00a0\u2026, 2024 openaccess.thecvf.com</details>|\n", "Discriminative Pattern Calibration Mechanism for Source-Free Domain Adaptation": "|**2024-6-30**|**Discriminative Pattern Calibration Mechanism for Source-Free Domain Adaptation**|H Xia et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/html/Xia_Discriminative_Pattern_Calibration_Mechanism_for_Source-Free_Domain_Adaptation_CVPR_2024_paper.html)|[code](https://paperswithcode.com/paper/discriminative-pattern-calibration-mechanism)|<details><summary>detail</summary>\u2026\u00a0of the IEEE/CVF Conference on\u00a0\u2026, 2024 openaccess.thecvf.com</details>|\n", "Understanding and Improving Source-free Domain Adaptation from a Theoretical Perspective": "|**2024-6-30**|**Understanding and Improving Source-free Domain Adaptation from a Theoretical Perspective**|Y Mitsuzumi et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/html/Mitsuzumi_Understanding_and_Improving_Source-free_Domain_Adaptation_from_a_Theoretical_Perspective_CVPR_2024_paper.html)|[code](https://paperswithcode.com/paper/understanding-and-improving-source-free)|<details><summary>detail</summary>Proceedings of the IEEE\u00a0\u2026, 2024 openaccess.thecvf.com</details>|\n", "Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation\u2013Supplementary Material\u2013": "|**2024-6-30**|**Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation\u2013Supplementary Material\u2013**|X Zheng 
et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/supplemental/Zheng_Semantics_Distortion_and_CVPR_2024_supplemental.pdf)|-|<details><summary>detail</summary>openaccess.thecvf.com</details>|\n", "LEAD: Learning Decomposition for Source-free Universal Domain Adaptation\u2014Supplementary Material": "|**2024-6-30**|**LEAD: Learning Decomposition for Source-free Universal Domain Adaptation\u2014Supplementary Material**|S Qu et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/supplemental/Qu_LEAD_Learning_Decomposition_CVPR_2024_supplemental.pdf)|-|<details><summary>detail</summary>Integration openaccess.thecvf.com</details>|\n", "Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation (Supplementary Material)": "|**2024-6-30**|**Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation (Supplementary Material)**|D Zhao et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/supplemental/Zhao_Stable_Neighbor_Denoising_CVPR_2024_supplemental.pdf)|-|<details><summary>detail</summary>openaccess.thecvf.com</details>|\n", "EventDance: Unsupervised Source-free Cross-modal Adaptation for Event-based Object Recognition\u2013Supplementray Material\u2013": "|**2024-6-30**|**EventDance: Unsupervised Source-free Cross-modal Adaptation for Event-based Object Recognition\u2013Supplementray Material\u2013**|X Zheng et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/supplemental/Zheng_EventDance_Unsupervised_Source-free_CVPR_2024_supplemental.pdf)|-|<details><summary>detail</summary>openaccess.thecvf.com</details>|\n", "MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection\u2014Supplementary Material": "|**2024-6-30**|**MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection\u2014Supplementary Material**|B Peng 
et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/supplemental/Peng_MAP_MAsk-Pruning_for_CVPR_2024_supplemental.pdf)|-|<details><summary>detail</summary>openaccess.thecvf.com</details>|\n"}, "object detection": {"Cyclic Refiner: Object-Aware Temporal Representation Learning for Multi-View 3D Detection and Tracking": "|**2024-7-3**|**Cyclic Refiner: Object-Aware Temporal Representation Learning for Multi-View 3D Detection and Tracking**|Mingzhe Guo et.al|[paper](https://arxiv.org/abs/2407.03240)|-|<details><summary>detail</summary>Accepted by IJCV</details>|\n", "A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain Adaptive Object Detection": "|**2024-7-3**|**A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain Adaptive Object Detection**|Jie Shao et.al|[paper](https://arxiv.org/abs/2407.02835)|-|<details><summary>detail</summary>has published on IEEE Signal Processing Letters</details>|\n", "Self-supervised co-salient object detection via feature correspondence at multiple scales": "|**2024-7-3**|**Self-supervised co-salient object detection via feature correspondence at multiple scales**|Souradeep Chakraborty et.al|[paper](https://arxiv.org/abs/2403.11107)|-|<details><summary>detail</summary>ECCV 2024</details>|\n", "Similarity Distance-Based Label Assignment for Tiny Object Detection": "|**2024-7-3**|**Similarity Distance-Based Label Assignment for Tiny Object Detection**|Shuohao Shi et.al|[paper](https://arxiv.org/abs/2407.02394)|[code](https://github.com/cszzshi/SimD)|-|\n", "SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection": "|**2024-7-2**|**SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection**|Anay Majee et.al|[paper](https://arxiv.org/abs/2407.02665)|-|<details><summary>detail</summary>ECCV 2024</details>|\n", "GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection": "|**2024-7-2**|**GraphBEV: Towards Robust BEV 
Feature Alignment for Multi-Modal 3D Object Detection**|Ziying Song et.al|[paper](https://arxiv.org/abs/2403.11848)|-|-|\n", "DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection": "|**2024-7-2**|**DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection**|Kaixin Xu et.al|[paper](https://arxiv.org/abs/2407.02098)|-|-|\n", "Tracking Object Positions in Reinforcement Learning: A Metric for Keypoint Detection (extended version)": "|**2024-7-2**|**Tracking Object Positions in Reinforcement Learning: A Metric for Keypoint Detection (extended version)**|Emma Cramer et.al|[paper](https://arxiv.org/abs/2312.00592)|-|-|\n", "Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection": "|**2024-7-1**|**Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection**|Zixing Li et.al|[paper](https://arxiv.org/abs/2407.01894)|-|-|\n", "Formal Verification of Object Detection": "|**2024-7-1**|**Formal Verification of Object Detection**|Avraham Raviv et.al|[paper](https://arxiv.org/abs/2407.01295)|-|-|\n", "Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection": "|**2024-7-1**|**Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection**|Francesco Barbato et.al|[paper](https://arxiv.org/abs/2407.01193)|-|<details><summary>detail</summary>IROS 2024</details>|\n", "No More Potentially Dynamic Objects: Static Point Cloud Map Generation based on 3D Object Detection and Ground Projection": "|**2024-7-1**|**No More Potentially Dynamic Objects: Static Point Cloud Map Generation based on 3D Object Detection and Ground Projection**|Soojin Woo et.al|[paper](https://arxiv.org/abs/2407.01073)|-|-|\n", "SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection": "|**2024-7-1**|**SOOD++: Leveraging Unlabeled Data to Boost Oriented Object 
Detection**|Dingkang Liang et.al|[paper](https://arxiv.org/abs/2407.01016)|-|-|\n", "Multi-Species Object Detection in Drone Imagery for Population Monitoring of Endangered Animals": "|**2024-6-27**|**Multi-Species Object Detection in Drone Imagery for Population Monitoring of Endangered Animals**|Sowmya Sankaran et.al|[paper](https://arxiv.org/abs/2407.00127)|-|-|\n", "Weighted Circle Fusion: Ensembling Circle Representation from Different Object Detection Results": "|**2024-6-27**|**Weighted Circle Fusion: Ensembling Circle Representation from Different Object Detection Results**|Jialin Yue et.al|[paper](https://arxiv.org/abs/2406.19540)|-|-|\n", "Masked Feature Compression for Object Detection": "|**2024-7-3**|**Masked Feature Compression for Object Detection**|C Dai et.al|[paper](https://www.mdpi.com/2227-7390/12/12/1848)|[code](https://github.com/bosszhe/emiff)|<details><summary>detail</summary>Mathematics, 2024 mdpi.com</details>|\n", "Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024": "|**2024-7-2**|**Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024**|P Wu et.al|[paper](https://arxiv.org/abs/2406.09201)|[code](https://paperswithcode.com/paper/enhanced-object-detection-a-study-on-vast)|-|\n", "\u041f\u0420\u0418\u041c\u0415\u041d\u0415\u041d\u0418\u0415 MULTI-LABEL \u041a\u041b\u0410\u0421\u0421\u0418\u0424\u0418\u041a\u0410\u0426\u0418\u0418 \u0418 OBJECT DETECTION \u0414\u041b\u042f \u041a\u0422-\u0421\u041d\u0418\u041c\u041a\u041e\u0412": "|**2024-7-2**|**\u041f\u0420\u0418\u041c\u0415\u041d\u0415\u041d\u0418\u0415 MULTI-LABEL \u041a\u041b\u0410\u0421\u0421\u0418\u0424\u0418\u041a\u0410\u0426\u0418\u0418 \u0418 OBJECT DETECTION \u0414\u041b\u042f \u041a\u0422-\u0421\u041d\u0418\u041c\u041a\u041e\u0412**|\u041f\u0410 \u0421\u0443\u0445\u043e\u0432 
et.al|[paper](https://cyberleninka.ru/article/n/primenenie-multi-label-klassifikatsii-i-object-detection-dlya-kt-snimkov)|-|<details><summary>detail</summary>\u0412\u0435\u0441\u0442\u043d\u0438\u043a \u043d\u0430\u0443\u043a\u0438, 2024 cyberleninka.ru</details>|\n", "Environmentally adaptive fast object detection in UAV images": "|**2024-7-2**|**Environmentally adaptive fast object detection in UAV images**|M Sang et.al|[paper](https://www.sciencedirect.com/science/article/pii/S0262885624002075)|-|<details><summary>detail</summary>Image and Vision Computing, 2024 Elsevier</details>|\n", "Highway Abandoned Objects Recognition Based on Open Vocabulary Object Detection Approach": "|**2024-7-2**|**Highway Abandoned Objects Recognition Based on Open Vocabulary Object Detection Approach**|S Liu et.al|[paper](https://ascelibrary.org/doi/abs/10.1061/9780784485514.045)|-|<details><summary>detail</summary>\u2026\u00a0on Transportation and\u00a0\u2026, 2024 ascelibrary.org</details>|\n", "SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention": "|**2024-7-2**|**SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention**|M Nawfal Meeran et.al|[paper](https://ui.adsabs.harvard.edu/abs/2024arXiv240605802N/abstract)|[code](https://github.com/spidernitt/sam-pm)|-|\n", "OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition": "|**2024-7-2**|**OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition**|K Chen et.al|[paper](https://link.springer.com/article/10.1007/s11263-024-02144-1)|-|<details><summary>detail</summary>International Journal of\u00a0\u2026, 2024 Springer</details>|\n", "Advanced Object Detection and Decision Making in Autonomous Medical Response Systems": "|**2024-7-2**|**Advanced Object Detection and Decision Making in Autonomous Medical Response Systems**|J Needhi - 2024 - preprints.org 
et.al|[paper](https://www.preprints.org/manuscript/202406.0811)|-|<details><summary>detail</summary>2024 preprints.org</details>|\n", "A Three-Stage Model for Camouflaged Object Detection": "|**2024-7-1**|**A Three-Stage Model for Camouflaged Object Detection**|T Chen et.al|[paper](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4862341)|[code](https://github.com/rmcong/fpnet_acmmm23)|<details><summary>detail</summary>Available at SSRN 4862341 papers.ssrn.com</details>|\n", "Multi-Scale Features Extraction and Cross-Stage Features Fusion Network for Small Object Detection (Mcfn)": "|**2024-7-1**|**Multi-Scale Features Extraction and Cross-Stage Features Fusion Network for Small Object Detection (Mcfn)**|D Bian et.al|[paper](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4862981)|-|<details><summary>detail</summary>Available at SSRN\u00a0\u2026 papers.ssrn.com</details>|\n"}, "domain adaptation": {"Mixture-of-Experts for Open Set Domain Adaptation: A Dual-Space Detection Approach": "|**2024-7-3**|**Mixture-of-Experts for Open Set Domain Adaptation: A Dual-Space Detection Approach**|Zhenbang Du et.al|[paper](https://arxiv.org/abs/2311.00285)|-|<details><summary>detail</summary>This work has been submitted to the IEEE for possible publication</details>|\n", "An Uncertainty-guided Tiered Self-training Framework for Active Source-free Domain Adaptation in Prostate Segmentation": "|**2024-7-3**|**An Uncertainty-guided Tiered Self-training Framework for Active Source-free Domain Adaptation in Prostate Segmentation**|Zihao Luo et.al|[paper](https://arxiv.org/abs/2407.02893)|[code](https://github.com/HiLab-git/UGTST)|-|\n", "Multi-Task Domain Adaptation for Language Grounding with 3D Objects": "|**2024-7-3**|**Multi-Task Domain Adaptation for Language Grounding with 3D Objects**|Penglei Sun et.al|[paper](https://arxiv.org/abs/2407.02846)|[code](https://sites.google.com/view/da4lg.)|-|\n", "A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain 
Adaptive Object Detection": "|**2024-7-3**|**A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain Adaptive Object Detection**|Jie Shao et.al|[paper](https://arxiv.org/abs/2407.02835)|-|<details><summary>detail</summary>has published on IEEE Signal Processing Letters</details>|\n", "ECAT: A Entire space Continual and Adaptive Transfer Learning Framework for Cross-Domain Recommendation": "|**2024-7-2**|**ECAT: A Entire space Continual and Adaptive Transfer Learning Framework for Cross-Domain Recommendation**|Chaoqun Hou et.al|[paper](https://arxiv.org/abs/2407.02542)|-|-|\n", "CLIP the Divergence: Language-guided Unsupervised Domain Adaptation": "|**2024-7-1**|**CLIP the Divergence: Language-guided Unsupervised Domain Adaptation**|Jinjing Zhu et.al|[paper](https://arxiv.org/abs/2407.01842)|-|-|\n", "Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision": "|**2024-7-1**|**Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision**|Hao Dong et.al|[paper](https://arxiv.org/abs/2407.01518)|[code](https://github.com/donghao51/MOOSA.)|<details><summary>detail</summary>Accepted by ECCV 2024</details>|\n", "TransferAttn: Transferable-guided Attention Is All You Need for Video Domain Adaptation": "|**2024-7-1**|**TransferAttn: Transferable-guided Attention Is All You Need for Video Domain Adaptation**|Andr\u00e9 Sacilotti et.al|[paper](https://arxiv.org/abs/2407.01375)|-|-|\n", "Gradient-based Class Weighting for Unsupervised Domain Adaptation in Dense Prediction Visual Tasks": "|**2024-7-1**|**Gradient-based Class Weighting for Unsupervised Domain Adaptation in Dense Prediction Visual Tasks**|Roberto Alcover-Couso et.al|[paper](https://arxiv.org/abs/2407.01327)|-|-|\n", "Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo Labeling": "|**2024-6-30**|**Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron 
Subtomogram Segmentation with Denoised Pseudo Labeling**|Haoran Li et.al|[paper](https://arxiv.org/abs/2406.18610)|-|-|\n", "AdaTreeFormer: Few Shot Domain Adaptation for Tree Counting from a Single High-Resolution Image": "|**2024-6-30**|**AdaTreeFormer: Few Shot Domain Adaptation for Tree Counting from a Single High-Resolution Image**|Hamed Amini Amirkolaee et.al|[paper](https://arxiv.org/abs/2402.02956)|[code](https://github.com/HAAClassic/AdaTreeFormer.)|<details><summary>detail</summary>Accepted in ISPRS Journal of Photogrammetry and Remote Sensing</details>|\n", "STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning": "|**2024-6-27**|**STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning**|Yanan Zhang et.al|[paper](https://arxiv.org/abs/2406.19362)|-|<details><summary>detail</summary>Accepted by IEEE-TIV</details>|\n", "Learning Visual Conditioning Tokens to Correct Domain Shift for Fully Test-time Adaptation": "|**2024-6-27**|**Learning Visual Conditioning Tokens to Correct Domain Shift for Fully Test-time Adaptation**|Yushun Tang et.al|[paper](https://arxiv.org/abs/2406.19341)|-|<details><summary>detail</summary>accepted by TMM</details>|\n", "ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation": "|**2024-6-27**|**ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation**|Nazanin Moradinasab et.al|[paper](https://arxiv.org/abs/2406.19225)|-|-|\n", "Physics-informed and Unsupervised Riemannian Domain Adaptation for Machine Learning on Heterogeneous EEG Datasets": "|**2024-6-27**|**Physics-informed and Unsupervised Riemannian Domain Adaptation for Machine Learning on Heterogeneous EEG Datasets**|Apolline Mellot et.al|[paper](https://arxiv.org/abs/2403.15415)|-|-|\n", "Confidence sharing adaptation for out-of-domain human pose and shape 
estimation": "|**2024-7-3**|**Confidence sharing adaptation for out-of-domain human pose and shape estimation**|T Yue et.al|[paper](https://www.sciencedirect.com/science/article/pii/S1077314224001322)|-|<details><summary>detail</summary>Computer Vision and Image\u00a0\u2026, 2024 Elsevier</details>|\n", "\u2026\u00a0Carbon Content and Temperature in Bof Steelmaking Based on Adaptive Balanced Joint Distribution Alignment Domain Adaptation with Variational Autoencoder": "|**2024-7-2**|**\u2026\u00a0Carbon Content and Temperature in Bof Steelmaking Based on Adaptive Balanced Joint Distribution Alignment Domain Adaptation with Variational Autoencoder**|Z Liu et.al|[paper](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4863841)|-|<details><summary>detail</summary>Available at SSRN 4863841 papers.ssrn.com</details>|\n", "POND: Multi-Source Time Series Domain Adaptation with Information-Aware Prompt Tuning": "|**2024-7-2**|**POND: Multi-Source Time Series Domain Adaptation with Information-Aware Prompt Tuning**|J Wang et.al|[paper](https://www.researchgate.net/profile/Junxiang-Wang-3/publication/381225385_POND_Multi-Source_Time_Series_Domain_Adaptation_with_Information-Aware_Prompt_Tuning/links/6663974e85a4ee7261ae011e/POND-Multi-Source-Time-Series-Domain-Adaptation-with-Information-Aware-Prompt-Tuning.pdf)|[code](https://paperswithcode.com/paper/prompt-based-domain-discrimination-for-multi)|<details><summary>detail</summary>2024 researchgate.net</details>|\n", "Continuous Test-time Domain Adaptation for Efficient Fault Detection under Evolving Operating Conditions": "|**2024-7-2**|**Continuous Test-time Domain Adaptation for Efficient Fault Detection under Evolving Operating Conditions**|H Sun et.al|[paper](https://ui.adsabs.harvard.edu/abs/2024arXiv240606607S/abstract)|[code](https://paperswithcode.com/paper/continuous-test-time-domain-adaptation-for)|-|\n", "Source-Free Domain Adaptation for Speaker Verification in Data-Scarce Languages and Noisy Channels": 
"|**2024-7-2**|**Source-Free Domain Adaptation for Speaker Verification in Data-Scarce Languages and Noisy Channels**|S Salo Elia et.al|[paper](https://ui.adsabs.harvard.edu/abs/2024arXiv240605863S/abstract)|[code](https://paperswithcode.com/paper/source-free-domain-adaptation-for-speaker)|-|\n", "Cross-Domain Classification Based on Frequency Component Adaptation for Remote Sensing Images": "|**2024-7-2**|**Cross-Domain Classification Based on Frequency Component Adaptation for Remote Sensing Images**|P Zhu et.al|[paper](https://www.mdpi.com/2072-4292/16/12/2134)|-|<details><summary>detail</summary>Remote Sensing, 2024 mdpi.com</details>|\n", "TSFAN: Tensorized spatial-frequency attention network with domain adaptation for cross-session EEG-based biometric recognition": "|**2024-7-2**|**TSFAN: Tensorized spatial-frequency attention network with domain adaptation for cross-session EEG-based biometric recognition**|X Jin et.al|[paper](https://automatedtest.iopscience.iop.org/article/10.1088/1741-2552/ad5761)|-|<details><summary>detail</summary>Journal of\u00a0\u2026, 2024 automatedtest.iopscience.iop.org</details>|\n", "SE/BN Adapter: Parametric Efficient Domain Adaptation for Speaker Recognition": "|**2024-7-2**|**SE/BN Adapter: Parametric Efficient Domain Adaptation for Speaker Recognition**|T Wang et.al|[paper](https://arxiv.org/abs/2406.07832)|-|-|\n", "Novel Deep Learning Domain Adaptation Approach for Object Detection Using Semi-Self Building Dataset and Modified YOLOv4": "|**2024-7-1**|**Novel Deep Learning Domain Adaptation Approach for Object Detection Using Semi-Self Building Dataset and Modified YOLOv4**|A Gomaa et.al|[paper](https://www.mdpi.com/2032-6653/15/6/255)|-|<details><summary>detail</summary>World Electric Vehicle Journal, 2024 mdpi.com</details>|\n", "Global self-sustaining and local inheritance for source-free unsupervised domain adaptation": "|**2024-7-1**|**Global self-sustaining and local inheritance for source-free unsupervised domain 
adaptation**|L Peng et.al|[paper](https://www.sciencedirect.com/science/article/pii/S0031320324004308)|-|<details><summary>detail</summary>Pattern Recognition, 2024 Elsevier</details>|\n"}, "domain generalization": {"Self-supervised Vision Transformer are Scalable Generative Models for Domain Generalization": "|**2024-7-3**|**Self-supervised Vision Transformer are Scalable Generative Models for Domain Generalization**|Sebastian Doerrich et.al|[paper](https://arxiv.org/abs/2407.02900)|[code](https://github.com/sdoerrich97/vits-are-generative-models)|<details><summary>detail</summary>MICCAI 2024</details>|\n", "Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision": "|**2024-7-1**|**Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision**|Hao Dong et.al|[paper](https://arxiv.org/abs/2407.01518)|[code](https://github.com/donghao51/MOOSA.)|<details><summary>detail</summary>Accepted by ECCV 2024</details>|\n", "Nonlinear Craig Interpolant Generation over Unbounded Domains by Separating Semialgebraic Sets": "|**2024-6-30**|**Nonlinear Craig Interpolant Generation over Unbounded Domains by Separating Semialgebraic Sets**|Hao Wu et.al|[paper](https://arxiv.org/abs/2407.00625)|-|-|\n", "Verifying the Generalization of Deep Learning to Out-of-Distribution Domains": "|**2024-6-30**|**Verifying the Generalization of Deep Learning to Out-of-Distribution Domains**|Guy Amir et.al|[paper](https://arxiv.org/abs/2406.02024)|-|<details><summary>detail</summary>To appear in the Journal of Automated Reasoning (JAR)</details>|\n", "Diverse Intra- and Inter-Domain Activity Style Fusion for Cross-Person Generalization in Activity Recognition": "|**2024-6-28**|**Diverse Intra- and Inter-Domain Activity Style Fusion for Cross-Person Generalization in Activity Recognition**|Junru Zhang et.al|[paper](https://arxiv.org/abs/2406.04609)|-|<details><summary>detail</summary>The 30th ACM SIGKDD Conference on Knowledge Discovery 
and Data Mining (KDD 2024)</details>|\n", "A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation": "|**2024-6-28**|**A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation**|Yongkang Liu et.al|[paper](https://arxiv.org/abs/2406.09881)|-|-|\n", "A synthetic data approach for domain generalization of NLI models": "|**2024-6-28**|**A synthetic data approach for domain generalization of NLI models**|Mohammad Javad Hosseini et.al|[paper](https://arxiv.org/abs/2402.12368)|-|-|\n", "Scalable and Domain-General Abstractive Proposition Segmentation": "|**2024-6-28**|**Scalable and Domain-General Abstractive Proposition Segmentation**|Mohammad Javad Hosseini et.al|[paper](https://arxiv.org/abs/2406.19803)|-|-|\n", "Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation": "|**2024-6-26**|**Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation**|Shivang Chopra et.al|[paper](https://arxiv.org/abs/2402.04929)|-|<details><summary>detail</summary>arXiv admin note: substantial text overlap with arXiv:2310</details>|\n", "Unmasking the Imposters: In-Domain Detection of Human vs. Machine-Generated Tweets": "|**2024-6-25**|**Unmasking the Imposters: In-Domain Detection of Human vs. Machine-Generated Tweets**|Bryan E. 
Tuck et.al|[paper](https://arxiv.org/abs/2406.17967)|-|-|\n", "Utilizing Graph Generation for Enhanced Domain Adaptive Object Detection": "|**2024-6-23**|**Utilizing Graph Generation for Enhanced Domain Adaptive Object Detection**|Mu Wang et.al|[paper](https://arxiv.org/abs/2406.06535)|-|-|\n", "PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images": "|**2024-6-21**|**PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images**|Parastoo Sotoudeh Sharifi et.al|[paper](https://arxiv.org/abs/2406.15685)|[code](https://github.com/ParastooSotoudeh/PathoWAve)|-|\n", "Augmenting Query and Passage for Retrieval-Augmented Generation using LLMs for Open-Domain Question Answering": "|**2024-6-20**|**Augmenting Query and Passage for Retrieval-Augmented Generation using LLMs for Open-Domain Question Answering**|Minsang Kim et.al|[paper](https://arxiv.org/abs/2406.14277)|-|-|\n", "Towards Trustworthy Unsupervised Domain Adaptation: A Representation Learning Perspective for Enhancing Robustness, Discrimination, and Generalization": "|**2024-6-18**|**Towards Trustworthy Unsupervised Domain Adaptation: A Representation Learning Perspective for Enhancing Robustness, Discrimination, and Generalization**|Jia-Li Yin et.al|[paper](https://arxiv.org/abs/2406.13180)|-|-|\n", "Augmenting Biomedical Named Entity Recognition with General-domain Resources": "|**2024-6-18**|**Augmenting Biomedical Named Entity Recognition with General-domain Resources**|Yu Yin et.al|[paper](https://arxiv.org/abs/2406.10671)|[code](https://github.com/qingyu-qc/bioner_gerbera)|<details><summary>detail</summary>We make data</details>|\n", "Entity-centric multi-domain transformer for improving generalization in fake news detection": "|**2024-7-3**|**Entity-centric multi-domain transformer for improving generalization in fake news detection**|P Bazmi 
et.al|[paper](https://www.sciencedirect.com/science/article/pii/S0306457324001663)|-|<details><summary>detail</summary>Information Processing &\u00a0\u2026, 2024 Elsevier</details>|\n", "Fine-Grained Domain Generalization with Feature Structuralization": "|**2024-7-2**|**Fine-Grained Domain Generalization with Feature Structuralization**|W Yu et.al|[paper](https://arxiv.org/abs/2406.09166)|[code](https://github.com/ValeevGroup/tiledarray)|-|\n", "Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification": "|**2024-6-30**|**Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification**|S Addepalli et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/html/Addepalli_Leveraging_Vision-Language_Models_for_Improving_Domain_Generalization_in_Image_Classification_CVPR_2024_paper.html)|[code](https://github.com/val-iisc/VL2V-ADiP)|<details><summary>detail</summary>Proceedings of the\u00a0\u2026, 2024 openaccess.thecvf.com</details>|\n", "Disentangled Prompt Representation for Domain Generalization": "|**2024-6-30**|**Disentangled Prompt Representation for Domain Generalization**|D Cheng et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/html/Cheng_Disentangled_Prompt_Representation_for_Domain_Generalization_CVPR_2024_paper.html)|[code](https://github.com/lpiccinelli-eth/unidepth)|<details><summary>detail</summary>Proceedings of the\u00a0\u2026, 2024 openaccess.thecvf.com</details>|\n", "Supplementary Materials: Unknown Prompt, the only Lacuna: Unveiling CLIP's Potential in Open Domain Generalization": "|**2024-6-30**|**Supplementary Materials: Unknown Prompt, the only Lacuna: Unveiling CLIP's Potential in Open Domain Generalization**|M Singha et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/supplemental/Singha_Unknown_Prompt_the_CVPR_2024_supplemental.pdf)|-|<details><summary>detail</summary>openaccess.thecvf.com</details>|\n", "Supplementary Material for DiPrompT: 
Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning": "|**2024-6-30**|**Supplementary Material for DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning**|B Dataset - openaccess.thecvf.com et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/supplemental/Bai_DiPrompT_Disentangled_Prompt_CVPR_2024_supplemental.pdf)|-|<details><summary>detail</summary>openaccess.thecvf.com</details>|\n", "Supplementary Material for Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization": "|**2024-6-30**|**Supplementary Material for Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization**|K Le et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024/supplemental/Le_Efficiently_Assemble_Normalization_CVPR_2024_supplemental.pdf)|-|<details><summary>detail</summary>Phuoc, KS Wong openaccess.thecvf.com</details>|\n", "Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation": "|**2024-6-30**|**Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation**|S Angarano et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024W/Vision4Ag/html/Angarano_Domain_Generalization_for_Crop_Segmentation_with_Standardized_Ensemble_Knowledge_Distillation_CVPRW_2024_paper.html)|[code](https://paperswithcode.com/paper/domain-generalization-for-crop-segmentation)|<details><summary>detail</summary>Proceedings of the\u00a0\u2026, 2024 openaccess.thecvf.com</details>|\n", "MixStyle-Based Contrastive Test-Time Adaptation: Pathway to Domain Generalization": "|**2024-6-30**|**MixStyle-Based Contrastive Test-Time Adaptation: Pathway to Domain Generalization**|K Yamashita 
et.al|[paper](https://openaccess.thecvf.com/content/CVPR2024W/MAT/html/Yamashita_MixStyle-Based_Contrastive_Test-Time_Adaptation_Pathway_to_Domain_Generalization_CVPRW_2024_paper.html)|-|<details><summary>detail</summary>\u2026\u00a0of the IEEE/CVF Conference on\u00a0\u2026, 2024 openaccess.thecvf.com</details>|\n", "Fault vibration model driven fault-aware domain generalization framework for bearing fault diagnosis": "|**2024-6-30**|**Fault vibration model driven fault-aware domain generalization framework for bearing fault diagnosis**|B Pang et.al|[paper](https://www.sciencedirect.com/science/article/pii/S1474034624002684)|-|<details><summary>detail</summary>Advanced Engineering\u00a0\u2026, 2024 Elsevier</details>|\n"}, "vision language": {"InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output": "|**2024-7-3**|**InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output**|Pan Zhang et.al|[paper](https://arxiv.org/abs/2407.03320)|[code](https://github.com/InternLM/InternLM-XComposer)|<details><summary>detail</summary>Technical Report</details>|\n", "Multi-modal Attribute Prompting for Vision-Language Models": "|**2024-7-3**|**Multi-modal Attribute Prompting for Vision-Language Models**|Xin Liu et.al|[paper](https://arxiv.org/abs/2403.00219)|-|-|\n", "Vision-driven Automated Mobile GUI Testing via Multimodal Large Language Model": "|**2024-7-3**|**Vision-driven Automated Mobile GUI Testing via Multimodal Large Language Model**|Zhe Liu et.al|[paper](https://arxiv.org/abs/2407.03037)|-|-|\n", "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model": "|**2024-7-3**|**EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model**|Yuxuan Zhang et.al|[paper](https://arxiv.org/abs/2406.20076)|[code](https://github.com/hustvl/EVF-SAM)|<details><summary>detail</summary>Preprint</details>|\n", "Can 3D Vision-Language Models Truly 
Understand Natural Language?": "|**2024-7-3**|**Can 3D Vision-Language Models Truly Understand Natural Language?**|Weipeng Deng et.al|[paper](https://arxiv.org/abs/2403.14760)|[code](https://github.com/VincentDENGP/3D-LR)|<details><summary>detail</summary>https://github</details>|\n", "Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective": "|**2024-7-3**|**Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective**|Zhaotian Weng et.al|[paper](https://arxiv.org/abs/2407.02814)|-|<details><summary>detail</summary>ACM Class:I</details>|\n", "Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts": "|**2024-7-2**|**Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts**|Aditya Sharma et.al|[paper](https://arxiv.org/abs/2406.16851)|-|<details><summary>detail</summary>Under review</details>|\n", "MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context": "|**2024-7-2**|**MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context**|Zishan Gu et.al|[paper](https://arxiv.org/abs/2407.02730)|-|-|\n", "Light-weight Fine-tuning Method for Defending Adversarial Noise in Pre-trained Medical Vision-Language Models": "|**2024-7-2**|**Light-weight Fine-tuning Method for Defending Adversarial Noise in Pre-trained Medical Vision-Language Models**|Xu Han et.al|[paper](https://arxiv.org/abs/2407.02716)|-|-|\n", "Commonsense Reasoning for Legged Robot Adaptation with Vision-Language Models": "|**2024-7-2**|**Commonsense Reasoning for Legged Robot Adaptation with Vision-Language Models**|Annie S. 
Chen et.al|[paper](https://arxiv.org/abs/2407.02666)|-|-|\n", "Uplifting Lower-Income Data: Strategies for Socioeconomic Perspective Shifts in Vision-Language Models": "|**2024-7-2**|**Uplifting Lower-Income Data: Strategies for Socioeconomic Perspective Shifts in Vision-Language Models**|Joan Nwatu et.al|[paper](https://arxiv.org/abs/2407.02623)|[code](https://github.com/Anniejoan/Uplifting-Lower-income-data)|<details><summary>detail</summary>ACM Class:K</details>|\n", "Conceptual Codebook Learning for Vision-Language Models": "|**2024-7-2**|**Conceptual Codebook Learning for Vision-Language Models**|Yi Zhang et.al|[paper](https://arxiv.org/abs/2407.02350)|-|-|\n", "Why do LLaVA Vision-Language Models Reply to Images in English?": "|**2024-7-2**|**Why do LLaVA Vision-Language Models Reply to Images in English?**|Musashi Hinck et.al|[paper](https://arxiv.org/abs/2407.02333)|-|<details><summary>detail</summary>Pre-print</details>|\n", "ColPali: Efficient Document Retrieval with Vision Language Models": "|**2024-7-2**|**ColPali: Efficient Document Retrieval with Vision Language Models**|Manuel Faysse et.al|[paper](https://arxiv.org/abs/2407.01449)|-|<details><summary>detail</summary>Under Review</details>|\n", "BiasDora: Exploring Hidden Biased Associations in Vision-Language Models": "|**2024-7-2**|**BiasDora: Exploring Hidden Biased Associations in Vision-Language Models**|Chahat Raj et.al|[paper](https://arxiv.org/abs/2407.02066)|[code](https://github.com/chahatraj/BiasDora)|<details><summary>detail</summary>Under Review</details>|\n", "Towards Vision-Language Geo-Foundation Model: A Survey": "|**2024-7-3**|**Towards Vision-Language Geo-Foundation Model: A Survey**|Y Zhou 
et.al|[paper](https://www.researchgate.net/profile/Yue-Zhou-139/publication/381403816_Towards_Vision-Language_Geo-Foundation_Model_A_Survey/links/666ba71ea54c5f0b9464c544/Towards-Vision-Language-Geo-Foundation-Model-A-Survey.pdf)|[code](https://github.com/zytx121/awesome-vlgfm)|<details><summary>detail</summary>researchgate.net</details>|\n", "VLind-Bench: Measuring Language Priors in Large Vision-Language Models": "|**2024-7-2**|**VLind-Bench: Measuring Language Priors in Large Vision-Language Models**|K Lee et.al|[paper](https://arxiv.org/abs/2406.08702)|[code](https://paperswithcode.com/paper/vlind-bench-measuring-language-priors-in)|-|\n", "How structured are the representations in transformer-based vision encoders? An analysis of multi-object representations in vision-language models": "|**2024-7-2**|**How structured are the representations in transformer-based vision encoders? An analysis of multi-object representations in vision-language models**|T Khajuria et.al|[paper](https://arxiv.org/abs/2406.09067)|[code](https://paperswithcode.com/paper/how-structured-are-the-representations-in)|-|\n", "AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models": "|**2024-7-2**|**AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models**|Y Wu et.al|[paper](https://arxiv.org/abs/2406.09295)|[code](https://paperswithcode.com/paper/alignmmbench-evaluating-chinese-multimodal)|-|\n", "MirrorCheck: Efficient Adversarial Defense for Vision-Language Models": "|**2024-7-2**|**MirrorCheck: Efficient Adversarial Defense for Vision-Language Models**|S Fares et.al|[paper](https://arxiv.org/abs/2406.09250)|[code](https://paperswithcode.com/paper/mirrorcheck-efficient-adversarial-defense-for)|-|\n", "LLAVIDAL: Benchmarking Large Language Vision Models for Daily Activities of Living": "|**2024-7-2**|**LLAVIDAL: Benchmarking Large Language Vision Models for Daily Activities of Living**|R Chakraborty 
et.al|[paper](https://arxiv.org/abs/2406.09390)|[code](https://paperswithcode.com/paper/llavidal-benchmarking-large-language-vision)|-|\n", "OpenVLA: An Open-Source Vision-Language-Action Model": "|**2024-7-2**|**OpenVLA: An Open-Source Vision-Language-Action Model**|MJ Kim et.al|[paper](https://arxiv.org/abs/2406.09246)|[code](https://paperswithcode.com/paper/openvla-an-open-source-vision-language-action)|-|\n", "Generative AI-based Prompt Evolution Engineering Design Optimization With Vision-Language Model": "|**2024-7-2**|**Generative AI-based Prompt Evolution Engineering Design Optimization With Vision-Language Model**|M Wong et.al|[paper](https://arxiv.org/abs/2406.09143)|[code](https://paperswithcode.com/paper/generative-ai-based-prompt-evolution)|-|\n", "VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks": "|**2024-7-1**|**VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks**|J Wu et.al|[paper](https://arxiv.org/abs/2406.08394)|[code](https://github.com/opengvlab/visionllm)|-|\n", "RWKV-CLIP: A Robust Vision-Language Representation Learner": "|**2024-7-1**|**RWKV-CLIP: A Robust Vision-Language Representation Learner**|T Gu et.al|[paper](https://arxiv.org/abs/2406.06973)|[code](https://github.com/deepglint/rwkv-clip)|-|\n"}}