☄️ EgoAlpha releases TrustGPT, which focuses on reasoning. Trust the GPT with the strongest reasoning abilities for authentic and reliable answers. You can click here or visit the Playgrounds directly to experience it.
[2024.9.12]
[2024.9.11]
[2024.9.10]
- Paper: OPAL: Outlier-Preserved Microscaling Quantization Accelerator for Generative Large Language Models
[2024.9.9]
[2024.9.8]
[2024.9.7]
[2024.9.6]
[2024.9.5]
[2024.9.4]
[2024.9.3]
[2024.9.2]
[2024.9.1]
[2024.8.31]
[2024.8.30]
[2024.8.29]
[2024.8.28]
[2024.8.27]
[2024.8.26]
[2024.8.25]
[2024.8.24]
[2024.8.23]
[2024.8.22]
[2024.8.21]
- Paper: AgentMove: Predicting Human Mobility Anywhere Using Large Language Model based Agentic Framework
[2024.8.20]
- Paper: Geo-Llama: Leveraging LLMs for Human Mobility Trajectory Generation with Spatiotemporal Constraints
[2024.8.19]
[2024.8.18]
[2024.8.17]
[2024.8.16]
[2024.8.15]
- Technical Report: Can Large Language Models Understand Symbolic Graphics Programs?
[2024.8.14]
[2024.8.13]
- Paper: When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding
[2024.8.12]
- Survey Paper: Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions
[2024.8.11]
[2024.8.10]
[2024.8.9]
[2024.8.8]
[2024.8.7]
[2024.8.6]
[2024.8.5]
- Paper: Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting
- Paper: Toward Automatic Relevance Judgment using Vision-Language Models for Image-Text Retrieval Evaluation
[2024.8.4]
- Paper: Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs
[2024.8.3]
[2024.8.2]
[2024.8.1]
[2024.7.31]
[2024.7.30]
[2024.7.29]
- 🔥🔥🔥Paper: Wolf: Captioning Everything with a World Summarization Framework
[2024.7.28]
[2024.7.27]
[2024.7.26]
[2024.7.25]
[2024.7.24]
[2024.7.23]
[2024.7.22]
[2024.7.21]
[2024.7.20]
[2024.7.19]
[2024.7.18]
- Mistral AI has just launched its first open-source model based on the Mamba2 architecture, Codestral Mamba (7B), specialising in code generation.
- Paper: Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models
[2024.7.17]
- Paper: The Two Sides of the Coin: Hallucination Generation and Detection with LLMs as Evaluators for LLMs
[2024.7.16]
[2024.7.15]
- 🔥🔥🔥The world's first large open-source model for chip design is here, aiming to reshape the $500 billion semiconductor industry within 5 years!
- Paper: On LLM Wizards: Identifying Large Language Models' Behaviors for Wizard of Oz Experiments
[2024.7.14]
- 🔥🔥🔥An authoritative LLM benchmark has been exposed as flawed: it favours closed-source models such as GPT-4, even treating their prompts differently
- Paper: RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
[2024.7.13]
[2024.7.12]
[2024.7.11]
- 🔥🔥🔥NVIDIA releases Paper: Data, Data Everywhere: A Guide for Pretraining Dataset Construction
[2024.7.10]
[2024.7.9]
[2024.7.8]
[2024.7.7]
[2024.7.6]
- 🔥🔥🔥 Gemma 2 is the strongest open-source model, surpassing Llama 3!
- Paper: IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization
[2024.7.5]
[2024.7.4]
[2024.7.3]
[2024.7.2]
- Technical Paper: Motion meets Attention: Video Motion Prompts
[2024.7.1]
[2024.6.30]
- Paper: LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
- Paper: Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs
[2024.6.29]
[2024.6.28]
[2024.6.27]
[2024.6.26]
- Paper: Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?
- Paper: PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation
[2024.6.25]
[2024.6.24]
[2024.6.23]
[2024.6.22]
- Paper: Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models
[2024.6.21]
[2024.6.20]
- Paper: LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation
- Paper: From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries
[2024.6.19]
- Paper: Adversarial Attacks on Multimodal Agents
- Paper: AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation
[2024.6.18]
[2024.6.17]
- Paper: Unveiling Encoder-Free Vision-Language Models
- Paper: VideoLLM-online: Online Video Large Language Model for Streaming Video
[2024.6.16]
[2024.6.15]
- Paper: VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
- Paper: EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models
[2024.6.14]
[2024.6.13]
- Paper: Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
- Paper: Enhancing End-to-End Autonomous Driving with Latent World Model
[2024.6.12]
- Paper: Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
- Paper: The Impact of Initialization on LoRA Finetuning Dynamics
[2024.6.11]
- Paper: Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
- Paper: Can Language Models Serve as Text-Based World Simulators?【ACL2024】
[2024.6.10]
- Paper: Towards a Personal Health Large Language Model
- Technical Report: Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
[2024.6.9]
- Paper: 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs
- Technical Report: Towards Semantic Equivalence of Tokenization in Multimodal LLM
[2024.6.8]
[2024.6.7]
- Paper: Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences【ACL2024】
[2024.6.6]
- Paper: Verbalized Machine Learning: Revisiting Machine Learning with Language Models
- Paper: PaCE: Parsimonious Concept Engineering for Large Language Models
[2024.6.5]
[2024.6.4]
- Paper: Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks
- Paper: Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
[2024.6.3]
- Paper: Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
- Paper: Graph External Attention Enhanced Transformer【ICML2024】
[2024.6.2]
- Paper: Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality【ICML2024】
- Paper: LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models
[2024.6.1]
- Paper: StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond
- Paper: DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
[2024.5.31]
- Paper: Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training
- Paper: OR-Bench: An Over-Refusal Benchmark for Large Language Models
[2024.5.30]
- Paper: Visualizing the loss landscape of Self-supervised Vision Transformer【NeurIPS2024 workshop】
- Paper: TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models【ACL2024】
[2024.5.29]
- Paper: Cross-Context Backdoor Attacks against Graph Prompt Learning【KDD2024】
- Paper: Yuan 2.0-M32: Mixture of Experts with Attention Router
[2024.5.28]
- Survey Paper: Tool Learning with Large Language Models: A Survey
- Paper: Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models
[2024.5.27]
- Paper: Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models【ACL2024】
- Paper: Exploring Alignment in Shared Cross-lingual Spaces【ACL2024】
[2024.5.26]
- Paper: ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models
- Paper: Disease-informed Adaptation of Vision-Language Models【MICCAI 2024】
[2024.5.25]
- Paper: Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models【ICME2024】
- Paper: Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models
[2024.5.24]
- Paper: Neuromorphic dreaming: A pathway to efficient learning in artificial agents
- Paper: DAGER: Exact Gradient Inversion for Large Language Models
[2024.5.23]
- Paper: Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
- Paper: When Generative AI Meets Workplace Learning: Creating A Realistic & Motivating Learning Experience With A Generative PCA【ECIS2024】
[2024.5.22]
- Paper: MAML-en-LLM: Model Agnostic Meta-Training of LLMs for Improved In-Context Learning【KDD2024】
- Paper: Measuring Impacts of Poisoning on Model Parameters and Embeddings for Large Language Models of Code【1st ACM International Conference on AI-powered Software (AIware 2024), co-located with FSE 2024, Porto de Galinhas, Brazil】
- Paper: Effective In-Context Example Selection through Data Compression【ACL2024】
[2024.5.21]
- Paper: Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion
- Paper: CPS-LLM: Large Language Model based Safe Usage Plan Generator for Human-in-the-Loop Human-in-the-Plant Cyber-Physical System【AAAI2024】
- Paper: DocReLM: Mastering Document Retrieval with Language Model
[2024.5.20]
- Paper: Libra: Building Decoupled Vision System on Large Language Models
- Paper: Unsupervised Image Prior via Prompt Learning and CLIP Semantic Guidance for Low-Light Image Enhancement【CVPR 2024 Workshop NTIRE: New Trends in Image Restoration and Enhancement workshop and Challenges】
- Paper: MarkLLM: An Open-Source Toolkit for LLM Watermarking
[2024.5.19]
- Paper: Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers
- Survey Paper: When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
[2024.5.18]
- Paper: HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models
- Paper: Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
[2024.5.17]
- Paper: UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models
- Paper: Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
[2024.5.16]
- Paper: Text-to-Vector Generation with Neural Path Representation
- Paper: Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model
[2024.5.15]
- 🔥🔥🔥Google strikes back: Project Astra goes head-to-head with GPT-4o, Veo fights Sora, and a new version of Gemini revolutionises search
- Paper: Improving Transformers with Dynamically Composable Multi-Head Attention
- Paper: Learning Multi-Agent Communication from Graph Modeling Perspective
[2024.5.14]
- 🔥🔥🔥OpenAI Turns the World Upside Down: GPT-4o is Completely Free, Real-Time Voice-Video Interaction Rocks the Room
- Paper: Efficient Vision-Language Pre-training by Cluster Masking
- Paper: Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research
[2024.5.13]
- Paper: Linearizing Large Language Models
- Paper: Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark
[2024.5.12]
- Paper: UniDM: A Unified Framework for Data Manipulation with Large Language Models
- Paper: Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling
[2024.5.11]
[2024.5.10]
- Paper: FlockGPT: Guiding UAV Flocking with Linguistic Orchestration
- Paper: An Automatic Prompt Generation System for Tabular Data Tasks
- Paper: Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
[2024.5.9]
- Paper: Probing Multimodal LLMs as World Models for Driving
- Paper: Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning
- Paper: Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
[2024.5.8]
- Paper: ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning
- Paper: QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
- Paper: Unveiling Disparities in Web Task Handling Between Human and Web Agent
[2024.5.7]
- Paper: vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention
- Paper: DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- Survey Paper: Vision Mamba: A Comprehensive Survey and Taxonomy
[2024.5.6]
- Paper: Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models
- Paper: On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning?
- Paper: Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows
[2024.5.5]
- Paper: What matters when building vision-language models?
- Paper: REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs
- Paper: FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems
[2024.5.4]
- Paper: Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo
- Paper: Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT
- Paper: Auto-Encoding Morph-Tokens for Multimodal LLM
[2024.5.3]
- Paper: Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models
- Paper: CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models
- Paper: AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts
[2024.5.2]
- Paper: When Quantization Affects Confidence of Large Language Models?
- Paper: Causal Evaluation of Language Models
- Paper: Investigating Automatic Scoring and Feedback using Large Language Models
[2024.5.1]
- Paper: Self-Play Preference Optimization for Language Model Alignment
- Paper: Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3
- Paper: RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive Summarization
[2024.4.30]
- Paper: EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model
- Paper: NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance
- Paper: Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning
[2024.4.29]
- The "Chinese NVIDIA" thousand-card cluster is now in place.
- XVERSE-V: unconditionally free for commercial use, outperforms Claude 3 Sonnet
[2024.4.28]
- Paper: Learning to Beat ByteRL: Exploitability of Collectible Card Game Agents
- Paper: Unifying Asynchronous Logics for Hyperproperties
- Paper: Lost in Recursion: Mining Rich Event Semantics in Knowledge Graphs
[2024.4.27]
- Paper: AAPL: Adding Attributes to Prompt Learning for Vision-Language Models
- Paper: SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension
- Paper: Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents
[2024.4.26]
- Paper: IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages
- Paper: Make Your LLM Fully Utilize the Context
- Paper: Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning
[2024.4.25]
- Paper: Cantor: Inspiring Multimodal Chain-of-Thought of MLLM
- Paper: The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
- Paper: MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
[2024.4.24]
- Survey Paper: A Survey on Visual Mamba
- Paper: Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach
- Paper: Exploring LLM Prompting Strategies for Joint Essay Scoring and Feedback Generation
[2024.4.23]
[2024.4.22]
- Paper: Unified Scene Representation and Reconstruction for 3D Large Language Models
- Paper: Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs
- Paper: Stronger Random Baselines for In-Context Learning
[2024.4.21]
- Paper: Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs
- Paper: Private Agent-Based Modeling
[2024.4.20]
- 🔥🔥🔥Introducing Meta Llama 3: The most capable openly available LLM to date
- Paper: Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction
[2024.4.19]
- Paper: V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning
- Paper: Point-In-Context: Understanding Point Cloud via In-Context Learning
- Paper: Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models
[2024.4.18]
- Paper: Quantifying Multilingual Performance of Large Language Models Across Languages
- Paper: Moving Object Segmentation: All You Need Is SAM (and Flow)
- Paper: When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes
[2024.4.17]
- Paper: Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding
- Survey Paper: The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey
- Paper: LLMTune: Accelerate Database Knob Tuning with Large Language Models
[2024.4.16]
- Paper: MMInA: Benchmarking Multihop Multimodal Internet Agents
- Paper: Memory Sharing for Large Language Model based Agents
- Paper: LLMorpheus: Mutation Testing using Large Language Models
[2024.4.15]
- Paper: Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation
- Paper: BRAVE: Broadening the visual encoding of vision-language models
- Paper: From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications
[2024.4.14]
- Paper: ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling
- Paper: MetaCheckGPT -- A Multi-task Hallucination Detector Using LLM Uncertainty and Meta-models
[2024.4.13]
- Paper: OpenBias: Open-set Bias Detection in Text-to-Image Generative Models
- Paper: Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding
- Paper: Manipulating Large Language Models to Increase Product Visibility
[2024.4.12]
- Paper: RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
- Paper: Generating consistent PDDL domains with Large Language Models
- Paper: ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
[2024.4.11]
- Paper: ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference
- Paper: InfiCoder-Eval: Systematically Evaluating the Question-Answering Capabilities of Code Large Language Models
- Paper: High-Dimension Human Value Representation in Large Language Models
[2024.4.10]
- Paper: OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
- Paper: EduAgent: Generative Student Agents in Learning
- Paper: Content Knowledge Identification with Multi-Agent Large Language Models (LLMs)
[2024.4.9]
- Paper: Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
- Paper: MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
- Paper: MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise
[2024.4.8]
- Paper: Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models
- Paper: LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding
- Paper: DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model
[2024.4.7]
- Paper: LongVLM: Efficient Long Video Understanding via Large Language Models
- Paper: nicolay-r at SemEval-2024 Task 3: Using Flan-T5 for Reasoning Emotion Cause in Conversations with Chain-of-Thought on Emotion States
- Paper: Do Large Language Models Rank Fairly? An Empirical Study on the Fairness of LLMs as Rankers
[2024.4.6]
- Paper: Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought
- Paper: MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens
- Paper: Scaling Up Video Summarization Pretraining with Large Language Models
[2024.4.5]
- Paper: Evaluating LLMs at Detecting Errors in LLM Responses
- Paper: Laser Learning Environment: A new environment for coordination-critical multi-agent tasks
- Paper: Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models
[2024.4.4]
- Paper: AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
- Paper: Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph
- Paper: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models
[2024.4.3]
- Paper: Topic-based Watermarks for LLM-Generated Text
- Paper: ViTamin: Designing Scalable Vision Models in the Vision-Language Era
- Paper: Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners
[2024.4.2]
- Paper: Segment Any 3D Object with Language
- Paper: Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
- Paper: Iterated Learning Improves Compositionality in Large Vision-Language Models
[2024.4.1]
- Paper: LUQ: Long-text Uncertainty Quantification for LLMs
- Paper: ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models
- Paper: ChatGPT v.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models
[2024.3.31]
- Paper: MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning
- Paper: Convolutional Prompting meets Language Models for Continual Learning
- Paper: Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference
[2024.3.30]
- Paper: Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models
- Paper: ReALM: Reference Resolution As Language Modeling
- Paper: Gecko: Versatile Text Embeddings Distilled from Large Language Models
[2024.3.29]
- Paper: RSMamba: Remote Sensing Image Classification with State Space Model
- Paper: Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning
- Paper: WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models
[2024.3.28]
- Paper: Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
- Paper: 3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation
- Paper: MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model
[2024.3.27]
- 🔥🔥🔥Stability AI open-sources a 3B code generation model that can patch and debug code
- 🔥🔥🔥Suno, the "AI Songwriter", is a hit in the music industry.
- Paper: TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document
[2024.3.26]
- Paper: AIOS: LLM Agent Operating System
- Paper: Bayesian Methods for Trust in Collaborative Multi-Agent Autonomy
- Paper: Multi-Agent Optimization for Safety Analysis of Cyber-Physical Systems: Position Paper
[2024.3.25]
- Paper: DreamLIP: Language-Image Pre-training with Long Captions
- Paper: Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models
- Paper: Comp4D: LLM-Guided Compositional 4D Scene Generation
[2024.3.24]
- 🔥🔥🔥Paper: SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models
- Paper: Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs
- Paper: AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models
[2024.3.23]
- Paper: SAMCT: Segment Any CT Allowing Labor-Free Task-Indicator Prompts
- Paper: Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model
- Paper: Towards Robots That Know When They Need Help: Affordance-Based Uncertainty for Large Language Model Planners
[2024.3.22]
- Paper: MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
- Paper: Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning
- Survey Paper: Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
[2024.3.21]
- Paper: MyVLM: Personalizing VLMs for User-Specific Queries
- Paper: PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model
- Paper: ReAct Meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training
[2024.3.20]
- Paper: A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science
- Survey Paper: The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs)
- Survey Paper: ChatGPT Alternative Solutions: Large Language Models Survey
[2024.3.19]
- 🔥🔥🔥Nvidia GTC 2024
- 🔥🔥🔥Open Release of Grok-1
- Paper: VideoAgent: Long-form Video Understanding with Large Language Model as Agent
- Paper: Few-Shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt
[2024.3.18]
- Paper: The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
- Paper: ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning
- Paper: LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models
[2024.3.17]
- Paper: Unveiling the Generalization Power of Fine-Tuned Large Language Models
- Paper: Towards Proactive Interactions for In-Vehicle Conversational Assistants Utilizing Large Language Models
- Paper: UniCode: Learning a Unified Codebook for Multimodal Large Language Models
[2024.3.16]
- Paper: Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
- Paper: Simple and Scalable Strategies to Continually Pre-train Large Language Models
- Paper: SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents
[2024.3.15]
- Paper: ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models
- Paper: Big City Bias: Evaluating the Impact of Metropolitan Size on Computational Job Market Abilities of Language Models
- Paper: LG-Traj: LLM Guided Pedestrian Trajectory Prediction
[2024.3.14]
- 🔥🔥🔥Speak, See and Act: OpenAI Robotics
- Paper: 3D-VLA: A 3D Vision-Language-Action Generative World Model
- Paper: MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
- Survey Paper: Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey
[2024.3.13]
- 🔥🔥🔥The world's first AGI agent! Introducing Devin, the first AI software engineer
- 🔥🔥🔥Paper: Towards General Computer Control: A Multimodal Agent for Red Dead Redemption II as a Case Study
- Paper: DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning
- 🔥🔥🔥Paper: VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
- Paper: NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
[2024.3.12]
- Paper: Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling
- Paper: VideoMamba: State Space Model for Efficient Video Understanding
- Paper: Naming, Describing, and Quantifying Visual Objects in Humans and LLMs
- Paper: ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis
[2024.3.11]
- 🔥🔥🔥Inflection-2.5 Release: The Ultimate Big Model
- Paper: LLM4Decompile: Decompiling Binary Code with Large Language Models
- Paper: ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models
[2024.3.10]
- Paper: Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
- Paper: Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs
- Paper: VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model
[2024.3.9]
- Paper: Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
- Paper: Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
- Paper: DeepSeek-VL: Towards Real-World Vision-Language Understanding
[2024.3.8]
- Survey Paper: Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation
- Survey Paper: A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods
- Paper: Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
- Paper: Localized Zeroth-Order Prompt Optimization
[2024.3.7]
- Paper: Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models
- Paper: SaulLM-7B: A pioneering Large Language Model for Law
- Paper: On the Origins of Linear Representations in Large Language Models
- Paper: Model Parallelism on Distributed Infrastructure: A Literature Review from Theory to LLM Case-Studies
- Paper: Automatic Bi-modal Question Title Generation for Stack Overflow with Prompt Learning
[2024.3.6]
- Paper: KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection
- Paper: KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents
- Paper: A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives
- Paper: Learning to Use Tools via Cooperative and Interactive Agents
- Paper: OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following
[2024.3.5]
- Paper: RegionGPT: Towards Region Understanding Vision Language Model
- Paper: Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation
- Paper: RIFF: Learning to Rephrase Inputs for Few-shot Fine-tuning of Language Models
- Paper: Non-autoregressive Sequence-to-Sequence Vision-Language Models
- Paper: Using LLMs for the Extraction and Normalization of Product Attribute Values
[2024.3.4]
- 🔥🔥🔥 Paper: Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
- Paper: Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs
- Paper: Chain-of-Thought Unfaithfulness as Disguised Accuracy
- Paper: LLMBind: A Unified Modality-Task Integration Framework
- Paper: Double-I Watermark: Protecting Model Copyright for LLM Fine-tuning
- Paper: Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs
[2024.3.3]
- Paper: Meta-Task Prompting Elicits Embedding from Large Language Models
- Paper: Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization
- Paper: LeMo-NADe: Multi-Parameter Neural Architecture Discovery with LLMs
- Paper: A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models
- Paper: VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models
[2024.3.2]
- Paper: GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning
- Paper: Language Agents as Optimizable Graphs
- Paper: Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
- Paper: OncoGPT: A Medical Conversational Model Tailored with Oncology Domain Expertise on a Large Language Model Meta-AI (LLaMA)
- Paper: Set the Clock: Temporal Alignment of Pretrained Language Models
[2024.3.1]
- Paper: Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
- Paper: The All-Seeing Project V2: Towards General Relation Comprehension of the Open World
- Survey Paper: Retrieval-Augmented Generation for AI-Generated Content: A Survey
- Paper: Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models
- Paper: ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
[2024.2.29]
- Paper: ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
- Paper: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
- Paper: Evaluating Very Long-Term Conversational Memory of LLM Agents
- Paper: Tower: An Open Multilingual Large Language Model for Translation-Related Tasks
- Paper: VRP-SAM: SAM with Visual Reference Prompt
- Paper: Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models
[2024.2.28]
- Survey Paper: Large Language Models for Data Annotation: A Survey
- Paper: Investigating Cultural Alignment of Large Language Models
- Paper: AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
- Paper: LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments
- Paper: Generative Pretrained Hierarchical Transformer for Time Series Forecasting
- Paper: Long-Context Language Modeling with Parallel Context Encoding
- Paper: GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
- Paper: API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs
[2024.2.27]
- Survey Paper: A Survey on Knowledge Distillation of Large Language Models
- Mistral has released Mistral Large: an MMLU score second only to GPT-4, a 32K context window, no Chinese support, and API access via La Plateforme and Azure.
- Paper: Instruct-Imagen: Image Generation with Multi-modal Instruction
[2024.2.26]
[2024.2.25]
[2024.2.24]
[2024.2.23]
[2024.2.22]
[2024.2.21]
[2024.2.20]
[2024.2.19]
[2024.2.18]
[2024.2.17]
[2024.2.16]
[2024.2.15]
- Paper: PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
- Paper: VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks
[2024.2.14]
[2024.2.13]
[2024.2.12]
[2024.2.11]
[2024.2.10]
[2024.2.9]
[2024.2.8]
[2024.2.7]
[2024.2.6]
[2024.2.5]
[2024.2.4]
[2024.2.3]
[2024.2.2]
[2024.2.1]
[2024.1.31]
[2024.1.30]
[2024.1.29]
[2024.1.28]
[2024.1.27]
[2024.1.26]
[2024.1.25]
[2024.1.24]
[2024.1.23]
[2024.1.22]
[2024.1.21]
[2024.1.20]
[2024.1.19]
[2024.1.18]
[2024.1.17]
[2024.1.16]
[2024.1.15]
[2024.1.14]
- Technical Report: DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
[2024.1.13]
[2024.1.12]
- Survey Paper: Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives
[2024.1.11]
[2024.1.10]
- Paper: Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models
[2024.1.9]
[2024.1.8]
[2024.1.7]
[2024.1.6]
[2024.1.5]
[2024.1.4]
[2024.1.3]
- Paper: CogAgent: A Visual Language Model for GUI Agents
- Paper: TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
[2024.1.2]
[2024.1.1]
[2023.12.31]
[2023.12.30]
[2023.12.29]
- KwaiAgents is a series of agent-related works open-sourced by KwaiKEG from Kuaishou Technology 【Paper/Github】
[2023.12.28]
- Paper: EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
- Paper: Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
[2023.12.27]
- Paper: The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
[2023.12.26]
[2023.12.25]
[2023.12.24]
[2023.12.23]
[2023.12.22]
[2023.12.21]
[2023.12.20]
[2023.12.19]
[2023.12.18]
[2023.12.17]
[2023.12.16]
- 🔥🔥🔥Paper: Agent as Cerebrum, Controller as Cerebellum: Implementing an Embodied LMM-based Agent on Drones
[2023.12.15]
[2023.12.14]
- Paper: LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
- Paper: From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3"
- Paper: WonderJourney: Going from Anywhere to Everywhere
[2023.12.13]
[2023.12.12]
[2023.12.11]
- Thesis from CMU (Carnegie Mellon University) by Juncheng Billy Li: Towards Robust Large-scale Audio/Visual Learning
[2023.12.10]
- Survey Paper: Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning
- 🔥🔥🔥The first open-source MoE LLM is released!
[2023.12.9]
[2023.12.8]
[2023.12.7]
- 🔥🔥🔥Gemini: The largest and most capable Google LLM is coming! Gemini Ultra, Gemini Pro, and Gemini Nano!
[2023.12.6]
[2023.12.5]
[2023.12.4]
[2023.12.3]
[2023.12.2]
[2023.12.1]
- Peking University open-sources its newest multimodal LLM: trained on mixed datasets, it can be applied directly to image and video tasks without modification: 【arXiv/Demo/GitHub/HuggingFace】
[2023.11.30]
[2023.11.29]
- 🔥🔥🔥Text-to-video: Pika 1.0 officially released!
- Paper: MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers
[2023.11.28]
[2023.11.27]
[2023.11.26]
[2023.11.25]
[2023.11.24]
[2023.11.23]
[2023.11.22]
[2023.11.21]
[2023.11.20]
[2023.11.19]
- Late-night shakeup at OpenAI: Sam Altman ousted, with CTO Mira Murati stepping in as interim CEO
- Paper: In-Context Learning with Iterative Demonstration Selection
[2023.11.18]
[2023.11.17]
- DeepMind's weather model makes Science: it predicts 10 days of weather in 1 minute, beating the strongest conventional model on 90% of indicators
- Paper: Towards Verifiable Text Generation with Symbolic References
[2023.11.16]
- Microsoft's late-night blockbuster: GPT-4, DALL-E 3, and GPTs for free, plus a self-developed large model and a dedicated AI chip
- DevOps finally has a dedicated large model, jointly released by Ant and BYU
[2023.11.15]
- 🔥🔥🔥Paper: Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
- 🔥🔥🔥Paper: SpectralGPT: Spectral Foundation Model
- Paper: ChatMap: Large Language Model Interaction with Cartographic Data
- Paper: EviPrompt: A Training-Free Evidential Prompt Generation Method for Segment Anything Model in Medical Images
[2023.11.14]
[2023.11.13]
[2023.11.12]
[2023.11.11]
[2023.11.10]
- Paper: Levels of AGI: Operationalizing Progress on the Path to AGI
- Paper: mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
[2023.11.9]
[2023.11.8]
[2023.11.7]
[2023.11.6]
- 🔥🔥🔥 01.Ai releases its first open-source large models, the Yi series: Yi-34B and Yi-6B.
- 🔥🔥🔥 Elon Musk's xAI ships two releases back to back: PromptIDE & Grok
[2023.11.5]
[2023.11.4]
- Paper: RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
[2023.11.3]
[2023.11.2]
[2023.11.1]
- NVIDIA releases ChipNeMo, a large model for semiconductor design that uses AI to accelerate chip design
- Paper: LILO: Learning Interpretable Libraries by Compressing and Documenting Code
[2023.10.31]
[2023.10.30]
[2023.10.29]
[2023.10.28]
[2023.10.27]
[2023.10.26]
[2023.10.25]
[2023.10.24]
[2023.10.23]
- Paper: MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
- Paper: Large Language Models Cannot Self-Correct Reasoning Yet
[2023.10.22]
[2023.10.21]
[2023.10.20]
- Paper: The Foundation Model Transparency Index
- Paper: xVal: A Continuous Number Encoding for Large Language Models
[2023.10.19]
[2023.10.18]
[2023.10.17]
[2023.10.16]
[2023.10.15]
[2023.10.14]
[2023.10.13]
[2023.10.12]
- Paper: Understanding the Effects of RLHF on LLM Generalisation and Diversity
- Paper: Learning Interactive Real-World Simulators
[2023.10.11]
[2023.10.10]
[2023.10.9]
[2023.10.8]
[2023.10.7]
[2023.10.6]
[2023.10.5]
[2023.10.4]
[2023.10.3]
[2023.10.2]
[2023.10.1]
[2023.9.30]
[2023.9.29]
[2023.9.28]
- Survey Paper: Instruction Tuning for Large Language Models: A Survey
[2023.9.27]
- Chinese LLaMA-2 tops the leaderboard, open source and commercially available! With a budget of one thousand yuan and half a day of training, it performs comparably to mainstream large models.
- Lingxin Intelligence releases CharacterGLM for AI role-playing; the 6B model is now open source.
- Paper: Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement
[2023.9.26]
- The biggest bug in large models! Answer accuracy drops to almost zero, and from GPT to Llama none are spared. Paper: The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”
- Paper: Chain-of-Verification Reduces Hallucination in Large Language Models
- Paper: Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models
[2023.9.25]
[2023.9.24]
- Writer's models are open source and commercially available, with 8 models in total.
- Paper: LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
[2023.9.23]
- Defeating GPT-4? The 70-billion-parameter Xwin-LM climbs to the top of Stanford's AlpacaEval
- Paper: End-to-End Speech Recognition Contextualization with Large Language Models
[2023.9.22]
- AgentVerse: A Framework for Multi-LLM Environment Simulation
- A 20-billion-parameter large model performs comparably to Llama2-70B! It is completely open source, with everything from the base model to the toolchain in place.
- Paper: Kosmos-2.5: A Multimodal Literate Model
[2023.9.21]
- A 34B-parameter model surpasses GPT-4! The "universal mathematical large model" MAmmoTH is open-sourced, with average accuracy up 29% (Paper/Project Page)
- Paper: Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
[2023.9.20]
- Optimizing LLMs from a Dataset Perspective
- Google DeepMind predicts 71 million genetic mutations, deciphering the human genetic code; the work is now published in Science and has been open-sourced. (Paper/Science/Dataset)
- Paper: Replacing softmax with ReLU in Vision Transformers
[2023.9.19]
[2023.9.18]
[2023.9.17]
[2023.9.16]
[2023.9.15]
- Microsoft Open Sources EvoDiff: A New Generation of Protein Generative AI : [Paper]
- Paper: DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
[2023.9.14]
[2023.9.13]
- The Chinese multimodal large model VisCPM opens its API! The upgraded version is far more capable than comparable models (Paper/Github)
- Survey Paper: A Survey on Large Language Model based Autonomous Agents
[2023.9.12]
[2023.9.11]
- Paper:Narrator: Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning
[2023.9.10]
[2023.9.9]
- An open-source version of Code Interpreter tops the GitHub trending list; it runs locally and can access the Internet: [Github]
- Peking University proposes the Structured Chain-of-Thought (SCoT): [Paper]
- Paper: Large Language Models as Optimizers
- Survey Paper: RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
[2023.9.8]
[2023.9.7]
- Baichuan Intelligence releases the Baichuan2 large model: comprehensively ahead of Llama2, with training checkpoints also open-sourced: Github/Technical Report
[2023.9.6]
[2023.9.5]
[2023.9.4]
[2023.9.3]
[2023.9.2]
[2023.9.1]
- 8 LLM products are fully open to the public, including iOS and Android apps 【Baidu, 百川智能, SenseChat, 智谱清言, ByteDance, INTERN, CAS, MiniMax】
- Paper: FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
[2023.8.31]
[2023.8.30]
[2023.8.29]
[2023.8.28]
[2023.8.27]
[2023.8.26]
- WizardLM: Open-source! [demo / HuggingFace / github]
[2023.8.25]
[2023.8.24]
- Paper: Giraffe: Adventures in Expanding Context Lengths in LLMs
- Paper: Graph of Thoughts: Solving Elaborate Problems with Large Language Models
[2023.8.23]
- HuggingFace introduces IDEFICS: An Open Reproduction of State-of-the-Art Visual Language Model
- Paper: SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
[2023.8.22]
[2023.8.21]
- Paper: VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
- Paper: Exploring the Intersection of Large Language Models and Agent-Based Modeling via Prompt Engineering
[2023.8.20]
[2023.8.19]
- WizardMath: model checkpoints / project page / Paper
[2023.8.18]
[2023.8.17]
[2023.8.16]
[2023.8.15]
[2023.8.14]
[2023.8.13]
[2023.8.12]
[2023.8.11]
[2023.8.10]
[2023.8.9]
- Stability AI has just announced the release of StableCode, its very first LLM generative AI product for coding
- Paper: Seeing through the Brain: Image Reconstruction of Visual Perception from Human Brain Signals
[2023.8.8]
[2023.8.7]
[2023.8.6]
[2023.8.5]
[2023.8.4]
- The Chinese LLaMA2 model is open source and commercially usable
- Paper: Scientific discovery in the age of artificial intelligence
[2023.8.3]
[2023.8.2]
[2023.8.1]
[2023.7.31]
[2023.7.30]
[2023.7.29]
[2023.7.28]
[2023.7.27]
[2023.7.26]
[2023.7.25]
[2023.7.24]
[2023.7.23]
[2023.7.22]
[2023.7.21]
[2023.7.20]
- New Architecture: RetNet (Retentive Network), beyond Transformer 👉Paper👈
[2023.7.19]
[2023.7.18]
[2023.7.17]
[2023.7.16]
[2023.7.15]
[2023.7.14]
[2023.7.13]
[2023.7.12]
- Claude 2 👉 [Paper] Model Card and Evaluations for Claude Models / [Website](https://claude.ai/)
[2023.7.11]
- Paper: VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
- Paper: Focused Transformer: Contrastive Training for Context Scaling
[2023.7.10]
[2023.7.9]
[2023.7.8]
[2023.7.7]
[2023.7.6]
[2023.7.5]
- MetaGPT: Multi-Role Meta-Programming Framework
- Paper: Conformer LLMs -- Convolution Augmented Large Language Models
[2023.7.4]
[2023.7.3]
[2023.7.2]
[2023.7.1]
[2023.6.30]
[2023.6.29]
[2023.6.28]
[2023.6.27]
[2023.6.26]
[2023.6.25]
[2023.6.24]
[2023.6.23]
[2023.6.22]
- Stanford has released AlpacaEval, an automatic evaluation system for LLMs
- Ocean-1: the world's first contact center foundation model.
[2023.6.21]
[2023.6.20]
[2023.6.19]
- Technical Report: AIGC industry overview article from SEALAND SECURITIES
[2023.6.18]
[2023.6.17]
[2023.6.16]
- The financial large model FinGPT is open-sourced: benchmarked against BloombergGPT, it cuts trainable parameters from 6.17 billion to 3.67 million and can predict stock prices. (Paper/Code)
[2023.6.15]
- Paper: MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models
[2023.6.14]
[2023.6.13]
[2023.6.12]
[2023.6.11]
[2023.6.10]
[2023.6.9]
[2023.6.8]
[2023.6.7]
[2023.6.6]
- Paper: XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters
- Paper: UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
[2023.6.5]
[2023.6.4]
[2023.6.3]
[2023.6.2]
[2023.6.1]
[2023.5.31]
- Intel announces Aurora genAI, a generative AI model with 1 trillion parameters
- HuatuoGPT, towards Taming Language Model to Be a Doctor (Github/Demo/Paper)
[2023.5.30]
[2023.5.29]
[2023.5.28]
- Paper: ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation
- Paper: RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text
[2023.5.27]
- Paper: Gorilla: Large Language Model Connected with Massive APIs
- Paper: ChatCAD+: Towards a Universal and Reliable Interactive CAD using LLMs
[2023.5.26]
[2023.5.25]
[2023.5.24]
[2023.5.23]
- Paper: Symbol tuning improves in-context learning in language models
- Paper: PointGPT: Auto-regressively Generative Pre-training from Point Clouds
[2023.5.22]
[2023.5.21]
[2023.5.20]
[2023.5.19]
- OpenAI introduces the ChatGPT app for iOS
- Paper: VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks
[2023.5.18]
[2023.5.17]
[2023.5.16]
[2023.5.15]
[2023.5.14]
[2023.5.13]
[2023.5.12]
[2023.5.11]
- Google has released PaLM 2, which improves multiple abilities and comes in four sizes. (Paper/Page)
- The open-source healthcare large language models NHS-LLM and OpenGPT are released.
- Paper: Language models can explain neurons in language models
[2023.5.10]
- Meta releases ImageBind, a large-scale model that spans six modalities, now open-source. (Paper/Code)
- HuaTuo: an open-source Chinese medical large model from Harbin Institute of Technology
[2023.5.9]
[2023.5.8]
- PandaLM: the first large model for automated evaluation. (Code)
[2023.5.7]
- Paper: Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
[2023.5.6]
- OpenAI releases the language-to-3D model Shap-E (Paper/Project Page)
[2023.5.5]
[2023.5.4]
[2023.5.3]
[2023.5.2]
- Customize your own LLMs and stop prompt-tuning: Lamini
- Paper: Unlimiformer: Long-Range Transformers with Unlimited Length Input
[2023.5.1]
[2023.4.30]
[2023.4.29]
[2023.4.28]
[2023.4.27]
- AudioGPT: [Project Page/Paper]
- Paper: Answering Questions by Meta-Reasoning over Multiple Chains of Thought
[2023.4.26]
[2023.4.25]
[2023.4.24]
[2023.4.23]
[2023.4.22] Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models [Paper/Project]
[2023.4.21]
- Paper: Progressive-Hint Prompting Improves Reasoning in Large Language Models
- Paper: Pretrained Language Models as Visual Planners for Human Assistance
[2023.4.20]
[2023.4.19]
[2023.4.18]
- HuaTuo: Tuning LLaMA Model with Chinese medical instructions
- MiniGPT-4 [Project Page/Paper]
- Paper: Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text
[2023.4.17]
- Open source democratizes large language models: OpenAssistant supports 35 languages and its RLHF data is free to use [Project Page/Code/Paper]
- Paper: Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
[2023.4.16]
[2023.4.15]
- Paper: CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society
- Visual Med-Alpaca: Bridging Modalities in Biomedical Language Models
[2023.4.14]
[2023.4.13] Three Amazing Works:
- AutoGPT: An Autonomous GPT-4 Experiment 👉Code👈
- Databricks releases Dolly 2.0, the first open, instruction-following LLM for commercial use
[2023.4.12] OpenAGI: When LLM Meets Domain Experts
[2023.4.11] Why think step-by-step? Reasoning emerges from the locality of experience
[2023.4.10] TagGPT: Large Language Models are Zero-shot Multimodal Taggers
[2023.4.9] A new AI model from Meta AI: Segment Anything Model (SAM) (Paper/Code)
[2023.4.8] EleutherAI&Yale et al. proposed a large-scale language model analysis suite that spans training and extension: Pythia (Paper/Code)
[2023.4.6] Effective Theory of Transformers at Initialization
[2023.4.5] REFINER: Reasoning Feedback on Intermediate Representations
[2023.4.4] Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?
[2023.4.3] Self-Refine: Iterative Refinement with Self-Feedback
[2023.4.1] A Survey of Large Language Models
[2023.3.31] BloombergGPT: A Large Language Model for Finance
[2023.3.30] GPTEval: NLG Evaluation using GPT-4 with Better Human Alignment
[2023.3.29] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
[2023.3.28] ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks
[2023.3.27] Scaling Expert Language Models with Unsupervised Domain Discovery
[2023.3.26] CoLT5: Faster Long-Range Transformers with Conditional Computation
[2023.3.23] OpenAI announces 'Plug-ins' for ChatGPT that enable it to perform actions beyond text.
[2023.3.22] GitHub launches Copilot X, aiming at the future of AI-powered software development.
[2023.3.21] Google Bard is now available in the US and UK, with more countries to come.
[2023.3.20] OpenAI's new paper looks at the economic impact of LLMs on the labor market. GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models
[2023.3.17] Microsoft 365 Copilot released. Word, Excel, PowerPoint, Outlook powered by LLMs.
[2023.3.16] Baidu announces its LLM "文心一言" (ERNIE 3.0 + PLATO)
[2023.3.15] Two Breaking News:
- Announcing GPT-4 by OpenAI, backed by Microsoft. Paper🔗
- Announcing the PaLM API by Google.
[2023.3.13] Stanford releases Alpaca, a fine-tuned LLaMA
[2023.3.10] Announcing OpenChatKit by Together
[2023.3.9] GPT-4 is coming next week and it will be multimodal, as announced by Microsoft Germany.
[2023.3.8] Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
[2023.3.7] Larger language models do in-context learning differently
[2023.3.6] Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning