I have recently become very interested in running AI large language models (LLMs).
With a view to furthering my research in this area, I have been planning a build for a machine dedicated to AI inference.
My goal is to be able to run 70B models at Q6 or even Q8 quants at 3-6 tokens per second (tk/s) at a minimum, and, hopefully, a 120B model at Q5 or higher at at least 1 tk/s.
The spec that I arrived at is:
2×3090 Asus TUF Gaming GPUs
These have 24 GB of VRAM each, for a total of 48 GB, and they have just two PCIe power connectors, not three, making them easier to power.
Threadripper PRO 3955WX
I went with the PRO Threadripper because of its support for more than 256 GB of RAM and its 128 PCIe lanes. I probably could have gone with the 3945WX, since the clock speeds are similar, and the extra 4 cores of the 3955WX (16 vs 12) probably won't make much difference for inference.
256 GB 3200 MHz DDR4 RAM
3200 MHz DDR4 is not the fastest, but it's the fastest speed that the 3955WX supports, and I don't think that overclocking 8x32 GB sticks is going to work. I need 8 sticks because I want to use 8-channel memory. Memory bandwidth is very important for LLMs, and 8-channel DDR4-3200 has roughly 200 GB/s of theoretical bandwidth, vs roughly 100 GB/s for quad-channel.
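As a rough sanity check on those numbers, here is a back-of-envelope calculation (theoretical peaks only; real-world throughput will be lower, and the 70B-at-Q6 model size is my own rough estimate):

```python
# Back-of-envelope peak memory bandwidth and token-rate estimate.
# DDR4-3200 = 3200 MT/s, with 64-bit (8-byte) channels.

MT_PER_S = 3200          # mega-transfers per second for DDR4-3200
BYTES_PER_TRANSFER = 8   # 64-bit channel width

def peak_bandwidth_gb_s(channels: int) -> float:
    """Theoretical peak bandwidth in GB/s for a given channel count."""
    return channels * MT_PER_S * BYTES_PER_TRANSFER / 1000

print(peak_bandwidth_gb_s(8))   # 204.8 GB/s, 8-channel
print(peak_bandwidth_gb_s(4))   # 102.4 GB/s, quad-channel

# Very rough upper bound on tokens/s if every token had to stream the whole
# model out of system RAM. A 70B model at Q6 is roughly 55-60 GB (my estimate).
model_size_gb = 57
print(round(peak_bandwidth_gb_s(8) / model_size_gb, 1))  # ~3.6 tokens/s
```

That lines up with the low end of my 3-6 tk/s target for a 70B model, and any layers offloaded into the 3090s' much faster VRAM should only push it up.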
WRX80-E SAGE Motherboard
This actually cost more than the CPU, but it has 7 PCIe x16 slots, which I will need if I add more GPUs in the future, and it supports 8-channel memory.
Corsair HX1500
A 1500 watt PSU should be OK for two 3090s, maybe even three if I underclock the cards. If I add any more in the future I will have to get a second PSU and connect the two together.
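To put rough numbers on that (nominal TDPs only; 3090s are known to spike well above their rated board power, so treat this as a sketch rather than a guarantee):

```python
# Rough power budget using nominal TDPs (real 3090 transients can spike higher).
GPU_TDP_W = 350   # RTX 3090 rated board power
CPU_TDP_W = 280   # Threadripper PRO 3955WX rated socket power
REST_W = 150      # rough allowance for motherboard, RAM, SSD, fans

def estimated_draw(num_gpus: int, gpu_limit_w: int = GPU_TDP_W) -> int:
    """Estimated sustained system draw in watts."""
    return num_gpus * gpu_limit_w + CPU_TDP_W + REST_W

print(estimated_draw(2))        # 1130 W - comfortable on a 1500 W PSU
print(estimated_draw(3, 280))   # 1270 W - three cards held to ~280 W each
```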
2 TB M.2 SSD
Noctua Cooler
Mining Case
I went with an open-air mining rig because it is the only setup that would allow me to add more than two GPUs.
It will be some time before I get all of the parts, because most of them are used, and shipping will take time.
I also posted this guide to the reddit sub “/r/StableDiffusion” under the username “spiritusastrum”.
Generating the same character in multiple poses is one of the most common questions for beginners in AI generated art.
Most solutions involve LoRAs, DreamBooth, and so on, and these are the preferred approaches; however, they are complex and require good-quality training sets.
It is possible to generate enough training images using just SD, but this is difficult.
I have, after some research and trial and error, discovered a very simple way to create a unique character entirely in Stable Diffusion and then change that character's pose while keeping most of their likeness (clothing, hair, face, etc) intact.
I am using “DivineAnimeMix” to generate the images used in this guide.
The guide is aimed mainly at simple artwork, such as the kind you would see in visual novels. With complex art, or close-ups of a character's face, this technique may not work as well.
First, using txt2img, create an extremely specific prompt and use it to produce a single image of a character.
This image should be the "main" character image, used as the character's default image in the visual novel.
It is important to make sure that this image matches the prompt as much as possible.
For example, if the colour of the character's jacket is different from the prompt, it should be fixed now, otherwise the jacket will be the wrong colour when the pose changes. It can be fixed later, but it is easier to fix it now.
Also, it can be very difficult (or almost impossible) to get things like tattoos and makeup to match properly when the pose changes, so it can be desirable to avoid them altogether. If a character does have a tattoo or makeup, it is important to specify where it is and what it is.
So, instead of just "with tattoos", say "with a tattoo on their left arm".
The final point with prompt generation is to add a pose or stance with a “weight” modifier, such as:
(Standing: 5.5).
This is used to change the pose, without changing the prompt.
This is the image that I generated at this stage of the process:
With the default image and prompt generated, it is now possible to change the pose.
This is done by just changing the weighted prompt, so:
(Standing: 5.5) could become (Riding a Motorcycle: 5.5), or anything else.
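To make this concrete, here is a hypothetical example; the character description below is made up for illustration, not the exact prompt I used:

```python
# Hypothetical prompt - the character description is made up for illustration.
base_description = (
    "1girl, short red hair, green eyes, black leather jacket with fur collar, "
    "white t-shirt, blue jeans, detailed background"
)

def build_prompt(pose: str, weight: float = 5.5) -> str:
    """Keep the character description fixed; only the weighted pose tag changes."""
    return f"{base_description}, ({pose}:{weight})"

print(build_prompt("standing"))             # the "main" character image
print(build_prompt("riding a motorcycle"))  # same description, new pose
```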
Of course, CFG scale, restore faces, etc, can all be used to improve the quality of the images; however, the prompt should not be changed, apart from the pose.
What will now happen is that many similar, but not identical, characters will be generated.
This is the image that I generated for this step:
Notice that the character is similar, but not identical, to the first image.
Once an image is produced that, firstly, matches the desired pose and, secondly, has relatively few differences from the "main" image, it is time for the next step.
Download the second image and open it in Photoshop, or any other basic image editor (even MS Paint would work fine for this).
Now, with the main image as a guide, roughly mark any areas which the AI has gotten “wrong”.
Is the jacket the wrong colour? Use the eyedropper tool to pick the colour from the first image and paint over the jacket in the second image.
Is the character wearing long pants in the second image, and shorts in the first?
Again, use the eyedropper tool to paint skin colour over the long pants.
Is the hair too long, or too short? Are there tattoos where there shouldn’t be?
Do the same thing.
If something is missing from the second image, simply select and copy it from the first image.
Tattoos, a belt, a style of glove, even an entire face can be very crudely copied, pasted, and scaled onto the second image.
This will result in something that looks awful, but that is perfectly fine. The goal is simply to add visual cues to tell the AI which parts of the image to regenerate; the AI will do the rest.
This is the image that I created here:
Notice that I have painted over her right arm (her sleeves should be short) and her right hand (she should be wearing gloves). I have also copied and pasted the face, right sleeve, and the fur collar from the first image.
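If you would rather script this rough cut-and-paste step than do it by hand, something like the following sketch (using the Pillow library; the file names and pixel coordinates are placeholders, not values from my images) does the same crude job:

```python
# Crude copy/paste/scale of a region from the main image onto the new pose,
# the same job as the rough Photoshop step above. File names and pixel
# coordinates are placeholders - adjust them to your own images.
from PIL import Image

main_img = Image.open("main_character.png")
pose_img = Image.open("new_pose.png").copy()

# Grab the face region from the main image (left, upper, right, lower).
face = main_img.crop((180, 40, 320, 200))

# Roughly scale it to fit the new pose and paste it on - ugly is fine here.
face = face.resize((120, 140))
pose_img.paste(face, (210, 60))

pose_img.save("new_pose_marked.png")  # this goes into img2img next
```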
When this is done, upload the modified image to Stable Diffusion, this time to img2img.
Use the same prompt that was used to generate the second image (not the main image!), and set the "denoising strength" as low as possible. The idea is to JUST regenerate the parts that you painted over in Photoshop, not the rest of the image.
You can use inpainting for this (painting over only the parts of the image that you want to regenerate, leaving the rest), but I found that img2img works as well or better (I seemed to end up with bad-quality faces more often with inpainting).
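For anyone driving Stable Diffusion from code rather than a web UI, the equivalent img2img step looks roughly like this (a sketch using the diffusers library; the model ID, prompt, and strength value are assumptions to tune, not my exact settings):

```python
# img2img pass over the crudely edited image, with a low denoising strength so
# only the marked-up areas change meaningfully. Model ID, prompt, and strength
# are placeholders to tune, not the exact settings from the guide.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("new_pose_marked.png").convert("RGB")

result = pipe(
    # Note: web-UI style (tag:weight) emphasis is not parsed by plain diffusers.
    prompt="1girl, short red hair, black leather jacket with fur collar, "
           "riding a motorcycle",
    image=init_image,
    strength=0.3,        # keep denoising strength low - regenerate as little as possible
    guidance_scale=7.0,
).images[0]

result.save("new_pose_fixed.png")
```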
You may need to generate several images, but you should end up with a character that looks MUCH more like the main character you initially generated, while keeping the pose of the second image.
If there are any minor issues remaining, simply take the best result, open it in Photoshop, and go through the process again; you can repeat this as many times as necessary.
This is one of my final images:
Notice that the right arm is still wrong (she has veins instead of tattoos, and a bracelet instead of gloves), but these issues could be fixed in Photoshop. The major details match the original character, while the pose is different.
Here is another image:
Again, the likeness is not perfect, but it is close enough that I think most people would regard this character as the “same” as the first one.
Here is a side-by-side comparison between the initial character and the final image:
I think that, for a single pass, this is a very good result.
If I did another pass in Photoshop, I could fix the issues with the right glove and the red stitching on her pants, as well as modify the tattoos to help them match up more closely.
I have tried this with "realistic" and anime-style checkpoints and it works very well with both. I suspect it would work even better with illustrated or manga-specific checkpoints, because there would generally be less detail involved, and so differences between the images would be harder to spot.
This solution, of course, does NOT solve the problem of creating the "same character in a different pose": you are generating a new character that just happens to look similar. However, this process seems to work well enough for my purposes at least, and it may work for others as well.
In addition to my main project, I have been working on a simple concept for a system (using the Unity game engine) that I can use to create visual novels and interactive stories.
Initially, this will use simple 2D art, sprites, etc, which means I can generate almost all of the artwork with AI.
However, if it works out, I can add more complex 3D geometry in future, and create point-and-click games, and other projects that are a hybrid of 2D and conventional 3D art.
Using AI-generated art would allow me to develop smaller projects more quickly while still working on my main game, which I would not be able to do if I were working on multiple complex 3D projects.
Most of the basic logic for displaying, selecting, and interacting with sprites is already done for the interactive story system, as well as basic text display, etc.
The main problem now is figuring out how to generate consistent characters with stable diffusion.
I have been working with Stable Diffusion to generate AI art for some time, but the main problem that I am having is generating the same character in multiple poses.
This is, apparently, a common problem with AI generated art in general.
It is possible to generate similar characters using a strong enough prompt, but not identical characters.
I have been working on a simple workflow to address this problem, and the initial results are looking promising.