I'd like to fine-tune a model that does img2img with a text prompt guiding the output. I think [img2img-turbo](https://github.com/GaParmar/img2img-turbo) might be the closest to what I'm after, though by default it uses a fixed prompt, which can be made variable with [some tweaking of the training code](https://github.com/GaParmar/img2img-turbo/issues/41). At the moment I only have access to 24 GB of VRAM, which limits my options. What I'm after is training a model to make specific text-based modifications to images, and I have plenty of before/after image pairs, plus the modification text prompts, to train on. Worst case, I can check whether reducing the image size during training makes it feasible on my setup. Are there any other options available today?

Is there anything that makes training a translation task easy?
    hok
    1y ago 100%

    Thanks for the tips. After doing a bunch of searching, I found that what I needed was BPE, or byte-pair encoding. This allows the token set to contain sub-word sequences, which lets the tokenizer represent a unique constant like 0x0373 as ['__sow', '0x', '03', '73', '__eow'].
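To make the idea concrete, here is a minimal toy sketch of BPE in pure Python. It is not the tokenizer used above: the function names (`learn_bpe`, `merge_pair`, `encode`) and the tiny corpus of hex constants are made up for illustration, and the `__sow`/`__eow` word-boundary markers are left out. It learns merges from frequent adjacent symbol pairs, then splits an unseen constant into sub-word pieces instead of mapping it to a single unknown token.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(words, num_merges):
    """Learn an ordered list of merges from a list of words (toy BPE trainer)."""
    # Each word starts out as a sequence of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the new merge to every word in the working vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(symbols, best))] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Split `word` into sub-word tokens by applying the merges in learned order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Hypothetical training corpus: constants as they might appear in the source code.
corpus = ["0x0301", "0x0302", "0x7303", "0x0473", "0x0300", "0x1073"]
merges = learn_bpe(corpus, num_merges=10)

# "0x0373" never appears in the corpus, yet it is still representable as pieces.
tokens = encode("0x0373", merges)
print(tokens)
```

Because the tokens always concatenate back to the original string, the model can copy or synthesize constants it never saw as whole words. In practice, an existing BPE implementation is a better choice than a hand-rolled one like this.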

  • Is there anything that makes training a translation task easy?
    hok
    1y ago 100%

    Thanks, the quickstart guide was straightforward to follow. Do you have any suggestions on how to handle word splitting for code? For example, on a test run, I found that the model was not able to synthesize unique constants correctly, even though the test run consisted only of obvious "a to b" relationships.

  • I have thousands of side-by-side translations between two computer languages (lower level to higher level), and I would like to train a model that can translate new data with higher accuracy. Do you have any suggestions on what to do? I don't think I want to fine-tune a ChatGPT-style model, since I think the task is more structured than that. Also, I consider myself technically competent, but I would probably fail at designing my own model and pipeline.
