On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability

Kevin Wang * · Junbo Li * · Neel P. Bhatt * · Yihan Xi · Qiang Liu · Ufuk Topcu · Atlas Wang

*Equal contribution and co-first authors

arXiv | Project Page (coming soon) | More examples + evaluations (coming soon)

We evaluate GPT-4 and the o1 models on planning tasks, highlighting their strengths in problem understanding and identifying open challenges in spatial reasoning and generalization.

News

[2025/4] The code and dataset for SPINBench are now available! Check it out here: SPINbench
[2025/3] We introduce SPINBench, a benchmark designed to evaluate the strategic planning and social reasoning capabilities of large language models.
[2025/1] We are currently working on developing the benchmark and plan to release the code and data within a month.

TODO List

We will update the detailed information and share access to more files soon.

Release the detailed experimental evaluation
Release the automated evaluation script (this will take a while; in the meantime, check the SPINbench repo)

The File Hierarchy:

OpenAI's o1 Models
  └─results
     └─barman (one folder per domain)
     ...
     └─tyreworld
        └─p_.pddl.prompt (the prompt used in the experiments, containing the domain and problem in natural language)
        └─p_.pddl.gpt4 (GPT-4 response to the prompt)
        └─p_.pddl.o1-mini (o1-mini response to the prompt)
        └─p_.pddl.o1-preivew (o1-preview response to the prompt)
        └─random.py (only in the randomized examples; encodes the problem with random symbols)
  └─visual (will include more visual examples and graphics)
  └─scripts (scripts used to generate the files; to be updated in the future)
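
The layout above can be walked programmatically to pair each prompt with the model responses stored next to it. The following is a minimal sketch, not one of the repository's scripts: the helper names `index_results` and `load_case` are hypothetical, and it assumes that result files follow the `<problem>.pddl.<suffix>` pattern shown in the hierarchy (for example a file named `p01.pddl.prompt`, where `p01` is an illustrative problem name). Files that do not match the pattern, such as `random.py`, are ignored.

```python
from pathlib import Path


def index_results(results_dir):
    """Build {domain: {problem: {suffix: path}}} from results/<domain>/<problem>.pddl.<suffix>.

    The suffix is whatever follows ".pddl." in the file name (e.g. "prompt",
    "gpt4", "o1-mini"); nothing is hard-coded, so any suffix on disk is indexed.
    """
    index = {}
    for path in sorted(Path(results_dir).glob("*/*.pddl.*")):
        if not path.is_file():
            continue
        problem, _, suffix = path.name.partition(".pddl.")
        index.setdefault(path.parent.name, {}).setdefault(problem, {})[suffix] = path
    return index


def load_case(index, domain, problem):
    """Return (prompt_text, {suffix: response_text}) for one problem in one domain."""
    files = index[domain][problem]
    prompt = files["prompt"].read_text(encoding="utf-8")
    responses = {
        suffix: path.read_text(encoding="utf-8")
        for suffix, path in files.items()
        if suffix != "prompt"  # every other suffix is treated as a model response
    }
    return prompt, responses
```

A typical use would be `index = index_results("results")` followed by `load_case(index, "tyreworld", problem)` for any `problem` key present in `index["tyreworld"]`. Keying on the on-disk suffix rather than a fixed model list keeps the sketch independent of exactly which model outputs a given domain contains.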

The detailed experiment results

Citation

If you find our paper useful or interesting, please consider giving a star ⭐ and citing the following paper 📝.

@misc{wang2024planningabilitiesopenaiso1,
      title={On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability}, 
      author={Kevin Wang and Junbo Li and Neel P. Bhatt and Yihan Xi and Qiang Liu and Ufuk Topcu and Zhangyang Wang},
      year={2024},
      eprint={2409.19924},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2409.19924}, 
}

Acknowledgements

The basic prompts are from llm+p available at this GitHub repository. We thank all the authors for their great work and repos.

There are also some concurrent works that were released recently or will be released soon:
