Skip to content
forked from Suchenl/LAMIC

[AAAI 2026] We present LAMIC, a Layout-Aware Multi-Image Composition framework, that extends single-reference diffusion models to multi-reference scenarios in a training-free manner, with superior abilities in human identity keeping, background preservation and layout control. Now providing Plug-and-play on Flux.1 Kontext. Let's try it !!!

License

Notifications You must be signed in to change notification settings

strato-space/LAMIC

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LAMIC

LAMIC: Layout-Aware Multi-Image Composition via Scalability of Multimodal Diffusion Transformer

License: Apache-2.0 arXiv Hugging Face

This repo contains minimal inference code to run layout-aware multi-image composition & editing with our LAMIC based on Flux.1 Kontext-dev open-source models.

🚀 Updates

  • [2025.07] This work is now open source, and you are free to use it!
  • [2025.08.01] We released a technical report! See our detailed method here.
  • [2025.08.08] LAMIC's Comfyui implementation is coming soon!

An example of LAMIC

Generate using the same input "example_inputs.json" with four different random number seeds.

example_forest example_man example_sea_turtle example_sea_turtle
example example_bboxed

Local install

Install LAMIC

git clone https://github.com/Suchenl/LAMIC.git
cd LAMIC
conda create --name myenv python=3.10.18
pip install -r requirements.txt

Install underlying foundation model

Name HuggingFace repo ModelScope repo
FLUX.1 Kontext [dev] https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev https://www.modelscope.cn/models/black-forest-labs/FLUX.1-Kontext-dev

For example, using modelscope

modelscope download --model black-forest-labs/FLUX.1-Kontext-dev --local_dir ./your_dir

Usage

Run example

python batch_run.py --input_path "./example_inputs.json" --num_img_per_sample 4 --concat_per_sample True --flux_kontext_transformer_path "your_local_kontext_transformer_path" --flux_path "your_local_flux_path (flux or flux kontext, both are acceptable)"

Of course, you can also refer to example_input to quickly build your own input. The number of reference images is flexible—just try it!

More refined control (modulate your custom inputs)

1. Use your predefined masks for layout control instead of margin boxes

Replace the key "bbox" with "mask_path".

For example, "mask_path": "your_mask_path (image format)"

2. Enhance the impact of one or more of your reference images

If you find that in some cases, the reference performance of some reference images is insufficient, you can try increasing the value of "to_area", which means the area to which the original reference image is scaled in proportion to its length and width. Since LAMIC is based on the attention between tokens, increasing the area of some reference images can increase the number of tokens, thereby enhancing the reference performance of the generated image. Vice versa.

3. Reduce the impact of one or more of your reference images on other reference images

In addition to the second method, you can also consider adding the key of "de-overlap" to other reference images, to eliminate the influence of some reference images on other reference images (when the target areas of the two reference images overlap)

For example, "ref_img_4": {"de-overlap": [1, 2]}. This means eliminating the influence of reference image 4 on the overlapping area with reference image 1, and eliminating the influence of reference image 4 on the overlapping area with reference image 2.

Framework of LAMIC

framework

You are free to:

  • ✅ Modify, adapt, or build upon the work
  • ✅ Distribute your modified versions
  • ✅ Use for research, education, or production systems

📄 Citation

If you find this work useful to you, please consider citing our paper and giving the repo a ⭐:

@article{chen2025lamic,
  title={LAMIC: Layout-Aware Multi-Image Composition via Scalability of Multimodal Diffusion Transformer},
  author={Chen, Yuzhuo and Ma, Zehua and Wang, Jianhua and Kang, Kai and Yao, Shunyu and Zhang, Weiming},
  journal={arXiv preprint arXiv:2508.00477},
  year={2025}
}

About

[AAAI 2026] We present LAMIC, a Layout-Aware Multi-Image Composition framework, that extends single-reference diffusion models to multi-reference scenarios in a training-free manner, with superior abilities in human identity keeping, background preservation and layout control. Now providing Plug-and-play on Flux.1 Kontext. Let's try it !!!

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%