Underwater Portraits – ARJANE's Portfolio

Underwater portraits

Date

2024.9

Project type

AI Design

Author

倪威 ARJANE

Keywords

LoRA training, training log

This is a photorealistic LoRA model designed to reproduce the weightless, floating look and the dappled light and shadow of underwater photography.

Dataset Preparation

About 80% of the dataset consists of synthetic images generated with Midjourney, processed in much the same way as for my previous model, "Paper Autumn". This time, roughly 20% real-world data is mixed in as well.
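As a rough illustration of the 80/20 mix, the sketch below keeps every synthetic image and samples just enough real images to reach the target share. The file names, the `mix_dataset` helper, and the dataset sizes are hypothetical; the actual dataset size and tooling are not stated here.

```python
import random


def mix_dataset(synthetic, real, real_share=0.2, seed=0):
    """Mix all synthetic items with enough real items that the real
    items make up roughly `real_share` of the returned list."""
    if not 0 <= real_share < 1:
        raise ValueError("real_share must be in [0, 1)")
    # n_real / (n_syn + n_real) = s  =>  n_real = n_syn * s / (1 - s)
    n_real = round(len(synthetic) * real_share / (1 - real_share))
    n_real = min(n_real, len(real))  # cannot use more real items than exist
    rng = random.Random(seed)        # fixed seed keeps the mix reproducible
    mixed = list(synthetic) + rng.sample(list(real), n_real)
    rng.shuffle(mixed)
    return mixed


# Hypothetical file lists, for illustration only.
synthetic = [f"mj_{i:03d}.png" for i in range(80)]
real = [f"photo_{i:03d}.jpg" for i in range(40)]

mixed = mix_dataset(synthetic, real)
real_count = sum(name.startswith("photo_") for name in mixed)
print(len(mixed), real_count)
```

Keeping all synthetic images and subsampling the real ones is only one way to hit the ratio; the reverse (fixing the real set and trimming the synthetic one) works the same way.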

LoRA Training

I made three training attempts, at 30, 50, and 80 epochs. I kept retraining mainly because the first two runs captured the underwater look only weakly: the light-and-shadow effects were missing and there was no sense of floating. I was still not fully satisfied with the third run, but I realized that further tweaking of the training would bring little benefit; the base model and the dataset were probably what limited the final result.

I used the Prodigy optimizer for the first and third runs. For the second run I switched to a conventional Adam optimizer because I wanted the model to learn more slowly, but the results were equally mediocre.
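The three runs can be summarized as a small training log. This is only a sketch: the `TrainingRun` type is hypothetical, it assumes the epoch counts are listed in run order, and learning rates and other hyperparameters are left out because they are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRun:
    index: int       # 1-based run number
    epochs: int      # total epochs trained
    optimizer: str   # optimizer name as described in the text


# Assumption: epochs 30/50/80 correspond to runs 1/2/3 in that order.
RUNS = [
    TrainingRun(index=1, epochs=30, optimizer="Prodigy"),
    TrainingRun(index=2, epochs=50, optimizer="Adam"),
    TrainingRun(index=3, epochs=80, optimizer="Prodigy"),
]


def describe(run: TrainingRun) -> str:
    """Format one run as a single log line."""
    return f"run {run.index}: {run.epochs} epochs, {run.optimizer}"


for run in RUNS:
    print(describe(run))
```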

Choose a base model for inference

After several rounds of comparing LoRA checkpoints against candidate base models for inference, I chose the Epoch=80 checkpoint from the third training run, with LEOSAM Helloworld SDXL as the base model.
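One way to organize such a comparison is a grid of every LoRA checkpoint against every candidate base model, rendered with the same seed so that differences come from the models rather than from sampling noise. The sketch below only enumerates the combinations. The checkpoint labels, the seed, and the two extra base-model names are placeholders, since the other candidates that were tested are not listed here.

```python
from itertools import product

# Labels for the three LoRA checkpoints (hypothetical naming scheme).
LORA_CHECKPOINTS = ["run1-epoch30", "run2-epoch50", "run3-epoch80"]

# Only the first base model is named in the text; the others are placeholders.
BASE_MODELS = ["LEOSAM Helloworld SDXL", "candidate-base-B", "candidate-base-C"]


def comparison_grid(loras, bases, seed=1234):
    """Return one job per (LoRA, base model) pair, all sharing one seed."""
    return [
        {"lora": lora, "base": base, "seed": seed}
        for lora, base in product(loras, bases)
    ]


grid = comparison_grid(LORA_CHECKPOINTS, BASE_MODELS)

# The combination that was finally chosen, according to the text.
chosen = {"lora": "run3-epoch80", "base": "LEOSAM Helloworld SDXL", "seed": 1234}

print(len(grid), chosen in grid)
```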

Text-to-image Results

Image-to-image Results

Identity (ID) preservation in the image-to-image results is handled by the InstantID module. The InstantID model does not seem to be of very high quality: it noticeably shifts colors, and the results are sometimes blurry enough that I need to run super-resolution (SUPIR) to restore image quality. Next time I will try a different ID-preservation approach (so this model will get a 2.0 ovo).