n00mkrad/text2image-gui

Fork: 97 Star: 948 (updated 2024-12-18 15:24:09)

license: GPL-3.0

Language: C#

Somewhat modular text2image GUI, initially just for Stable Diffusion

Latest release: 1.7.1 (2023-01-31 16:49:19)


NMKD Stable Diffusion GUI

Somewhat modular text2image GUI, initially just for Stable Diffusion.

Relies on a slightly customized fork of the InvokeAI Stable Diffusion code: Code Repo

Main Guide:
System Requirements
Features and How to Use Them
Hotkeys (Main Window)

Additional Guides:
AMD GPU Support
Inpainting

System Requirements

OS: Windows 10/11 64-bit

Minimum:

GPU: Nvidia GPU with 4 GB VRAM, Maxwell Architecture (2014) or newer
- Alternatively, with limited feature support: Any DirectML-capable GPU with 8 GB of VRAM
RAM: 8 GB (Note: The pagefile must be enabled, as swapping will occur with only 8 GB!)
Disk: 10 GB (another free 5 GB for temporary files recommended)

Features and How to Use Them

Prompt Input

Multiple prompts at once: Enter each prompt on a new line (newline-separated). Word wrapping does not count towards this.
Negative Prompt: Put words or phrases into this box to tell the AI to exclude those things when generating images.
- Alternatively, you can also put the negative prompt into the regular prompt box by wrapping it in [brackets].
Emphasis: Use + after a word/phrase to make it more impactful, or - to do the opposite. You can also use several in a row to increase the effect. Wrap your phrase in parentheses if you want to apply it to more than one word.
- Each plus/minus applies a multiplier of 1.1. So three pluses (+++) would be 1.1^3 = 1.331, and so on.
- You can also type the strength manually after parentheses, e.g. a (huge)1.33 dog instead of a huge+++ dog
- Syntax Examples: a green++ tree, a (big green)+ tree with orange- leaves (in the woods)++
Wildcards: Fill in words or phrases from a list into the prompt.
- Inline: photo of a ~car,tree,dog~.
- From File: photo of a ~objects loads the entries from objects.txt in the Wildcards folder inside the SD GUI root folder.
- Order: Use ~ for random/shuffled, ~~ for unchanged order, or ~~~ for sorted (A-Z) mode.
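The emphasis arithmetic and the three wildcard ordering modes described above can be sketched in a few lines of Python. This is only an illustration, not the GUI's own code: the function names `plus_weight` and `order_entries` are hypothetical, the handling of `-` is left out because the text does not spell out its exact factor, and how the GUI actually consumes the ordered entries is not shown.

```python
import random


def plus_weight(plus_count: int) -> float:
    """Weight for a run of '+' signs, using the 1.1 factor per sign stated above."""
    return 1.1 ** plus_count


def order_entries(entries, tilde_count, rng=None):
    """Order wildcard entries by mode (hypothetical helper).

    1 tilde  -> random/shuffled
    2 tildes -> unchanged order
    3 tildes -> sorted A-Z
    """
    items = list(entries)  # copy so the caller's list is never modified
    if tilde_count == 1:
        (rng or random).shuffle(items)
        return items
    if tilde_count == 2:
        return items
    if tilde_count == 3:
        return sorted(items, key=str.lower)
    raise ValueError("tilde_count must be 1, 2 or 3")


if __name__ == "__main__":
    print(round(plus_weight(3), 3))
    print(order_entries(["tree", "car", "dog"], 3))
```

Rounded to two decimals, `plus_weight(3)` gives 1.33, which is why a (huge)1.33 dog is close to a huge+++ dog.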

Additional Inputs

Textual Inversion Embeddings: Select a prompt embedding and add it to your prompt (Path can be set in Settings).
LoRA Files: (Hidden if no files are in the folder) Select LoRA models and set the weight.
Base Image: Load an initialization image that will be used together with your text prompt ("img2img"), or for inpainting
- Loading multiple images means that each image will be processed separately.

Stable Diffusion Settings

Steps: More steps can increase detail, but only to a certain extent. Depending on the sampler, 15-50 is a good range.
- Has a linear performance impact: Doubling the step count means each image takes twice as long to generate.
Prompt Guidance (CFG Scale): Lower values are closer to the raw output of the AI, higher values try to respect your prompt more accurately.
- Use low values if you are happy with the AI's representation of your prompt. Use higher values if not - but going too high will degrade quality.
- No performance impact, no matter the value.
Seed: Starting value for the image generation. Allows you to create the exact same image again by using the same seed.
- When using the same seed, the image will only be identical if you also use the same sampler and resolution (and other settings).
- Lock Seed Option: Disable incrementing the seed by 1 for each image. Only useful in combination with wildcards.
Resolution: Adjust image size. Only values that are divisible by 64 are possible. Sizes above 512x512 can lead to repeated patterns.
- Higher resolution images require more VRAM and are slower to generate.
- High-Resolution Fix: Enable this to avoid getting repeated patterns at high resolutions (~768px+).
Sampler: Changes the way images are sampled. DPM++ 2M Karras is the default because it's fast and tends to look good even with 10-20 steps.
Generate Seamless Images: Generates seamless/tileable images, very useful for making game textures or repeating backgrounds.
Generate Symmetric Images: Generates images that are mirrored on one axis.
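As a rough illustration of the Seed and Resolution rules above, the following Python sketch checks the divisible-by-64 constraint and lists the seeds used for a batch. The helper names are hypothetical, and rounding a size down to the nearest multiple of 64 is an assumption of this sketch; the text only says that other values are not possible.

```python
def is_valid_resolution(width: int, height: int) -> bool:
    """Both sides must be divisible by 64, as stated in the Resolution entry."""
    return width % 64 == 0 and height % 64 == 0


def snap_to_64(value: int) -> int:
    """Round down to a multiple of 64, never below 64 (assumed rounding policy)."""
    return max(64, (value // 64) * 64)


def seeds_for_batch(start_seed: int, count: int, lock_seed: bool = False):
    """Seeds for `count` images: +1 per image, or constant when the seed is locked."""
    return [start_seed if lock_seed else start_seed + i for i in range(count)]


if __name__ == "__main__":
    print(is_valid_resolution(512, 768))
    print(snap_to_64(700))
    print(seeds_for_batch(42, 3))
    print(seeds_for_batch(42, 3, lock_seed=True))
```

With a locked seed every image starts from the same value, so the results only differ when something else changes between images, such as a wildcard substitution.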

Image Viewer

Review current images: Use the scroll wheel while hovering over the image to go to the previous/next image.
Slideshow: The image viewer always shows the newest generated image if you haven't manually changed it in the last 3 seconds.
Context Menu: Right-click in the image area to show more options.
Pop-Up Viewer: Click in the image area to open the current image in a floating window.
- Use the mouse wheel to change the window's size (zoom), right-click for more options, double-click to toggle fullscreen.

Settings Button (Top Bar)

Note: Some options might be hidden depending on the selected implementation.

Image Generation Implementation: Choose the AI implementation that's used for image generation.
- Stable Diffusion - InvokeAI: Supports the most features, but struggles with 4 GB or less VRAM and requires an Nvidia GPU.
- Stable Diffusion - ONNX: Lacks some features and is relatively slow, but can utilize AMD GPUs (any DirectML-capable card).
- InstructPix2Pix: For instruction-based image editing. Requires an Nvidia GPU.
Use Full Precision: Use FP32 instead of FP16 math, which requires more VRAM but can fix certain compatibility issues.*
Unload Model After Each Generation: Completely unload Stable Diffusion after images are generated.*
Stable Diffusion Model File: Select the model file to use for image generation.
- Included models are located in Models/Checkpoints. You can add external folder paths by clicking on "Folders...".
Stable Diffusion VAE: Select external VAE (Variational Autoencoder) model. VAEs can improve image quality.*
- Default path is Models/VAEs. You can add external folder paths by clicking on "Folders...".
Textual Inversion Embeddings Folder: Select folder where embeddings (usually .pt files) are loaded from.
LoRA Models Folder: Select folder where LoRA models (.safetensors files) are loaded from.
Cache Models in RAM: When enabled, models are offloaded into RAM when switching to a new one. This makes it very fast to switch back, but takes up 2GB+ per cached model.
Skip Final CLIP Layers (CLIP Skip): Can improve quality on certain models.
CUDA Device: Allows you to specify the GPU to run the AI on, or set it to run on the CPU (very slow).*
Image Output Folder: Set the folder where your generated images will be saved.
Output Subfolder Options:
- Subfolder Per Prompt: Save images in a subfolder for each prompt. Negative prompt is excluded from the folder name.
- Ignore Wildcards: Use wildcard name (as in prompt input) instead of the replaced text in file/folder names.
- Subfolder Per Session: Save images in a subfolder for each session (every time the program is started).
Information to Include in Filename: Specify which information should be included in the filename.
Favorites Folder: Specify your favorites folder, where your favorite images will be copied to (right-click image viewer or use Ctrl+D)
Image Save Mode: Choose whether you want to delete or keep generated images by default.
When Running Multiple Prompts, Use Same Starting Seed for All of Them: If enabled, the seed resets to the starting value for every new prompt. If disabled, the seed will be incremented by 1 after each iteration, being sequential until all prompts/iterations have been generated.
When Post-Processing Is Enabled, Also Save Un-Processed Image: When enabled, both the "raw" and the post-processed image will be saved.
Automatically Set Generation Resolution After Loading an Initialization Image: Automatically sets the image generation to match your image.
Retain Aspect Ratio of Initialization Image (If It Needs Resizing): Use padding (black borders) instead of stretching, in case the init image resolution does not match the image generation resolution.
Advanced Mode: Increases the limits of the sliders in the main window. Not very useful most of the time unless you really need those high values.
Notify When Image Generation Has Finished: Play a sound, show a notification, or do both if image generation finishes in the background.
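The two seed behaviours of the "When Running Multiple Prompts, Use Same Starting Seed for All of Them" option can be made concrete with a small Python sketch. It follows the wording of that option only; the function name `seed_plan` and its parameters are hypothetical and do not come from the application.

```python
def seed_plan(prompts, iterations, start_seed, same_start_per_prompt):
    """Return (prompt, seed) pairs in generation order.

    same_start_per_prompt=True  -> the seed resets to start_seed for every prompt
    same_start_per_prompt=False -> the seed keeps counting up across all prompts
    """
    plan = []
    seed = start_seed
    for prompt in prompts:
        if same_start_per_prompt:
            seed = start_seed  # reset for each new prompt
        for _ in range(iterations):
            plan.append((prompt, seed))
            seed += 1  # incremented by 1 after each iteration
    return plan


if __name__ == "__main__":
    for enabled in (True, False):
        print(enabled, seed_plan(["a cat", "a dog"], 2, 10, enabled))
```

With the option enabled, both prompts use seeds 10 and 11; with it disabled, the second prompt continues with 12 and 13.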

Logs Button (Top Bar)

Open Logs Folder: Opens the log folder of the current session. The application deletes older logs on every startup.
<logname>.txt: Open the log file or copy the text.

Installation Manager & Updater Button (Top Bar)

Manage Installation: Allows you to check if your installation is valid and can repair/reset it.
- Installation Status: Shows which modules are installed (checkboxes are not interactive and only indicate if a module is installed correctly!).
- Re-Install Python Dependencies: Re-installs the Stable Diffusion code from its repository and re-installs all required python packages.
- Re-Install Upscalers: (Re-)Installs upscaling files (RealESRGAN, GFPGAN, CodeFormer, including model files).
- (Re-)Install: Installs everything. Skips already installed components.
- Uninstall: Removes everything except for Conda which is included and needed for a re-installation.

Install Updates: Allows you to update to a new version or re-install the current one.

Developer Tools Button (Top Bar)

Open Stable Diffusion CLI: Use Stable Diffusion through its command-line interface.
Open CMD in Python Environment: Opens a CMD window with the built-in python environment activated.
Merge Models: Allows you to merge/blend two models. The percentage numbers represent their respective weight.
Prune Models: Allows you to reduce the size of models by removing data that's not needed for image generation.
Convert Models: Allows you to convert model weights between Pytorch (ckpt/pt), Diffusers, Diffusers ONNX, and SafeTensors formats.
View Log In Realtime: Opens a separate window that shows all log output, including messages that are not shown in the normal log box.
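To give an idea of what "the percentage numbers represent their respective weight" means for Merge Models, here is a minimal sketch of a weighted average of two sets of weights. It assumes the merge is a plain weighted sum, which is a common approach but is not confirmed by the text above. Plain Python lists stand in for tensors, the name `merge_weights` is hypothetical, and skipping keys that exist in only one model is a choice made for this sketch.

```python
def merge_weights(model_a, model_b, percent_a):
    """Blend two {name: [floats]} dicts; percent_a is model A's share in percent."""
    weight_a = percent_a / 100.0
    weight_b = 1.0 - weight_a  # model B gets the remaining share
    merged = {}
    # Only keys present in both models are blended (assumption of this sketch).
    for key in sorted(model_a.keys() & model_b.keys()):
        merged[key] = [
            weight_a * a + weight_b * b
            for a, b in zip(model_a[key], model_b[key])
        ]
    return merged


if __name__ == "__main__":
    a = {"layer.weight": [0.0, 2.0]}
    b = {"layer.weight": [1.0, 4.0]}
    print(merge_weights(a, b, 75))
```

A 75/25 merge in this sketch means each merged value is 75% of model A's value plus 25% of model B's value.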

Post-Processing Button (Top Bar)

Upscaling: Set RealESRGAN upscaling factor.
Face Restoration: Enable GFPGAN or CodeFormer for face restoration.

Training Button (Top Bar)

Opens the LoRA training window (Guide here)

Bottom Bar Buttons

Generate: Start AI image generation (or cancel if it's already running).
Prompt Queue Button: Right-click to add the current settings to the queue, or left-click to manage the queued entries.
Prompt History Button: View recent prompts, load them into the main window, search or clear history, or disable it.
Image Deletion Button: Delete either the image that is being viewed currently, or all images from the current batch.
Open Folder Button: Opens the (root) image output folder.
Left/Right Buttons: Show the previous or next image from the current batch.

Hotkeys (Main Window)

CTRL+G: Run Image Generation (or Cancel if already running)
CTRL+M: Show Model Quick Switcher (Once it's open, use ESC to Cancel or Enter to confirm)
CTRL+Shift+M: Show VAE Quick Switcher
CTRL+PLUS: Toggle Prompt Textbox Size
CTRL+Shift+PLUS: Toggle Negative Prompt Textbox Size
CTRL+DEL: Delete currently viewed image
CTRL+SHIFT+DEL: Delete all generated images (of the current batch)
CTRL+O: Open currently viewed image
CTRL+SHIFT+O: Show current image in its folder
CTRL+C: Copy currently viewed image to clipboard
CTRL+D: Copy currently viewed image to favorites
CTRL+V: Paste image (If clipboard contains a bitmap)
CTRL+Q: Quit
CTRL+Scroll: Change textbox font size (only works while the textbox is being used)
F1: Open Help (Currently links to GitHub Readme)
F12: Open Settings
ESC: Remove focus from currently focused GUI element (e.g. get out of the prompt textbox)
