Consishoi — Consistent Human-Object Interaction Generation (SDXL extension)

This is the official code repository for ConsisHOI, a framework that enables the generation of consistent Human-Object Interaction (HOI) images adhering to precise layout constraints, built upon Stable Diffusion XL.

Requirements

Python 3.10+
See root requirements.txt for full dependency list.

Quick setup:

conda create -n consishoi python=3.10 -y
conda activate consishoi
pip install -r requirements.txt

🛠️Usage

Inference

Generate one image with the unified pipeline:

python inference.py \
	--consishoi_model_path /path/to/consishoi_model_dir \
	--id_image /path/to/id_image.png \
	--prompt "a person is riding a motorcycle" \
	--out_image ./OUTPUTS/output.png \
	--steps 50 \
    --scale 5.0 \
    --device cuda

Example interaction_cfg :

interaction_cfg = {
	'subject_phrases': ['person'],
	'object_phrases': ['motorcycle'],
	'action_phrases': ['a person is riding a motorcycle'],
	'subject_boxes': [[x0, y0, x1, y1]],
	'object_boxes': [[x0, y0, x1, y1]],
	'scheduled_sampling_beta': 1.0,
}

Training

Train with a single model path for base weights:

python train.py \
	--consishoi_model_path /path/to/consishoi_model_dir \
	--output_dir ./OUTPUTS/ \
	--train_batch_size 4 \
	--device cuda

If GPU memory is tight, consider enabling CPU offload in the pipeline or using lower dtype where supported.

License & Contact

This project is licensed under the MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
assets		assets
diffusers_consishoi		diffusers_consishoi
id_encoder		id_encoder
LICENSE		LICENSE
README.md		README.md
consishoi_unet_2d_condition.py		consishoi_unet_2d_condition.py
inference.py		inference.py
inference.sh		inference.sh
requirements.txt		requirements.txt
train.py		train.py
train.sh		train.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Consishoi — Consistent Human-Object Interaction Generation (SDXL extension)

About

Uh oh!

Releases

Packages

Languages

License

HeisXing/ConsisHOI

Folders and files

Latest commit

History

Repository files navigation

Consishoi — Consistent Human-Object Interaction Generation (SDXL extension)

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages