From Noise to Art

Style and Character Adaptation for Stable Diffusion LoRA Model

Context

Technical Prototype

Role

AI Developer / Researcher

Year

2025

Industry

Generative AI, Digital Art

The Idea

This project trained a LoRA (Low-Rank Adaptation) for Stable Diffusion 1.5 to capture the artistic style and character designs of the "Neon Drive" narrative concept. The goal was a lightweight adapter that, loaded alongside the base SD 1.5 model, lets users generate new images consistent with the Neon Drive aesthetic, including characters such as Kaela Eryndor.
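To make "lightweight" concrete, here is a rough sketch of why a low-rank adapter is small. Instead of updating a full weight matrix of shape d_out x d_in, LoRA learns two thin matrices of shapes d_out x r and r x d_in. The 768 x 768 layer size and the rank of 16 below are illustrative assumptions; the write-up does not state the rank used for the Neon Drive LoRA.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full_params, lora_params) for a single weight matrix.

    full_params: entries in the original d_out x d_in matrix.
    lora_params: entries in the two low-rank factors,
                 B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Illustrative values only: layer size and rank are assumptions.
full, lora = lora_param_counts(768, 768, 16)
print(full, lora, round(lora / full, 4))  # 589824 24576 0.0417
```

Under these assumed sizes the adapter holds about 4% of the parameters of the layer it modifies, which is why a LoRA file is much smaller than the base checkpoint.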

Training Dataset - Generated using Midjourney v6.1

Development

The LoRA development involved several stages outlined in the "From Noise to Art" research:

Dataset Creation: A small dataset of 7 synthetic images (512x512 resolution) was generated on the "Neon Drive" theme, using Midjourney v6.1 for characters and locations and NijiJourney for stylized character illustrations.
Dataset Preparation: Captions for the dataset images were generated automatically with the Florence-2 captioning model to provide textual context during training.
Training Environment: Training ran on both local (NVIDIA RTX 3080) and cloud (NVIDIA A10G) hardware, in environments such as Flux Gym and SD Scripts with Python 3.10.16. Invoke, ForgeWebUI, and ComfyUI were used for generation and testing.
LoRA Training: The LoRA was trained specifically for the Stable Diffusion 1.5 (pruned) base model. Key hyperparameters included a learning rate of 1e-4, a batch size of 1, and 10 training epochs, using the AdamW8bit optimizer. Training was performed at a 512x512 resolution.
Evaluation: Sample outputs were generated after training iterations to assess the model's ability to capture the desired style and character features.
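The hyperparameters above imply a very short training run. The sketch below records them in one place and works out the number of optimizer steps. It is not the actual training configuration: the `TrainingRun` class is a hypothetical helper, and `repeats` (how many times each image is shown per epoch) is an assumption set to 1, since the write-up does not state it. Gradient accumulation is also assumed to be off.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRun:
    """Hyperparameters reported in the write-up, plus one assumed field."""
    num_images: int = 7
    resolution: int = 512
    learning_rate: float = 1e-4
    batch_size: int = 1
    epochs: int = 10
    optimizer: str = "AdamW8bit"
    repeats: int = 1  # assumption: not stated in the write-up

    def steps_per_epoch(self) -> int:
        # Each image is seen `repeats` times per epoch; round up for partial batches.
        return math.ceil(self.num_images * self.repeats / self.batch_size)

    def total_steps(self) -> int:
        return self.steps_per_epoch() * self.epochs


run = TrainingRun()
print(run.steps_per_epoch(), run.total_steps())  # 7 70
```

With one repeat per image, the run amounts to only 70 optimizer steps. A higher repeat count would scale that number linearly, for example 700 steps at 10 repeats.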

Reflection

This project demonstrated the full process of creating a custom LoRA for SD 1.5 from a small, curated, synthetic dataset. The workflow combined several AI tools: Midjourney and NijiJourney for dataset generation, Florence-2 for captioning, and SD Scripts for model training. The resulting LoRA lets users bring the "Neon Drive" aesthetic into their SD 1.5 generations. The broader research also compared SD 1.5 with other models such as SDXL and Flux, noting trade-offs in training time, quality, and prompt consistency.

Evaluation of Different Model Performance

What Worked

Trained a functional SD 1.5 LoRA from a small (7-image) synthetic dataset.
Demonstrated a viable workflow combining multiple AI tools for dataset creation and preparation.
Generated sample images showing the LoRA's ability to influence style and character appearance.

What Did Not Work / Limitations

The LoRA's performance and fidelity are likely limited by the very small training dataset.
The LoRA requires the matching base model: it was trained against SD 1.5 (pruned).
Effective use likely requires specific trigger words and prompting techniques, but these could not be verified from the Civitai page.
Based on the model comparisons in the broader research, SD 1.5 LoRAs may offer lower quality or prompt consistency than those trained for newer base models.
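For readers who want to try the LoRA in an AUTOMATIC1111-style interface such as ForgeWebUI, a LoRA is typically activated with a `<lora:filename:weight>` tag in the prompt. The helper below is a minimal sketch of assembling such a prompt. The file name `neon_drive_v1` and the trigger word `neondrive` are hypothetical placeholders, since the actual trigger words are not documented here, and the 0.8 weight is only a common starting point.

```python
def build_prompt(description: str, lora_file: str,
                 weight: float = 0.8, trigger: str = "") -> str:
    """Assemble a prompt that activates a LoRA via the <lora:name:weight> tag.

    lora_file is the LoRA file name without its extension.
    trigger is an optional trigger word placed at the start of the prompt.
    """
    parts = [p for p in (trigger, description) if p]
    parts.append(f"<lora:{lora_file}:{weight:g}>")
    return ", ".join(parts)


# Hypothetical file name and trigger word, for illustration only.
print(build_prompt("portrait of Kaela Eryndor, neon-lit street",
                   "neon_drive_v1", 0.8, "neondrive"))
# neondrive, portrait of Kaela Eryndor, neon-lit street, <lora:neon_drive_v1:0.8>
```

Other tools load LoRAs through their own mechanisms (ComfyUI, for example, uses loader nodes in the graph), so this prompt tag applies only to interfaces that parse it.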

Civitai

Neon X Drive. LoRA - v1.0 | Stable Diffusion LoRA | Civitai