#Text to Mask (Clipseg)

buoyant aspen
#

Input a prompt and an image to generate a mask representing areas of the image matched by the prompt. This mask can be used in the Create Denoise Mask node or many other applications.

The raw output of Clipseg (in this case, clipseg-rd64-refined) is a low resolution grayscale image, so this node includes options to apply smoothing and thresholding to yield a pure black/white image at full size. There are additional options to expand or contract the mask, apply a blur at the end, and invert the mask between black on white and white on black.
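To make the post-processing steps described above concrete, here is a minimal pure-Python sketch of upscaling, thresholding, expanding, and inverting a small grayscale map. It is not the node's actual code: the function names, the nearest-neighbor upscaling, the square dilation window, and the order of operations are all assumptions made for illustration, and the smoothing and final blur steps are left out.

```python
# Illustrative sketch only; names and step order are assumptions, not the node's code.

def upscale_nearest(mask, factor):
    """Enlarge a 2D list by repeating each pixel `factor` times in both axes."""
    out = []
    for row in mask:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def threshold(mask, cutoff=0.5):
    """Turn grayscale values in [0, 1] into pure black (0) or white (255)."""
    return [[255 if value >= cutoff else 0 for value in row] for row in mask]

def dilate(mask, radius=1):
    """Expand white areas: a pixel turns white if any pixel in its window is white."""
    height, width = len(mask), len(mask[0])
    out = []
    for y in range(height):
        ys = range(max(0, y - radius), min(height, y + radius + 1))
        row = []
        for x in range(width):
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            row.append(max(mask[j][i] for j in ys for i in xs))
        out.append(row)
    return out

def invert(mask):
    """Swap black and white."""
    return [[255 - value for value in row] for row in mask]

def postprocess(raw, factor=2, cutoff=0.5, expand=0, inverted=False):
    """Upscale, threshold, optionally expand, optionally invert."""
    mask = threshold(upscale_nearest(raw, factor), cutoff)
    if expand > 0:
        mask = dilate(mask, expand)
    return invert(mask) if inverted else mask

# A 2x2 stand-in for the low-resolution model output.
raw = [[0.1, 0.8],
       [0.2, 0.4]]
for row in postprocess(raw, factor=2, expand=1):
    print(row)
```

A real implementation would normally use an image library with proper interpolation and blur filters. The sketch only shows how each option changes the mask.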

I have included two files: clipseg.py, which contains the Text to Mask (Clipseg) node, and a second file, clipseg_adv.py, containing an advanced version of the node and some other nodes that may be used with it to get the same functionality as found in the standard node, or in other creative combinations.

The extra nodes in clipseg_adv are:

  • Text to Mask Advanced (Clipseg) - Enter up to four prompts, and choose a mask that combines them with logical "and", logical "or", or outputs all four masks as separate channels of an RGBA image.
  • Clipseg Mask Hierarchy - Select objects from foreground-to-background to create a segmentation map of separate distinct areas and use each region mask separately.
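As a rough illustration of the two ideas in the list above, the sketch below combines binary masks with "and"/"or" and builds a foreground-to-background hierarchy. The function names are hypothetical, and using per-pixel minimum and maximum for "and" and "or" is an assumption about how such a combination could work, not a description of the code in clipseg_adv.py.

```python
# Illustrative sketch only; function names and the min/max approach are assumptions.

def combine_and(masks):
    """Per-pixel minimum: white only where every mask is white."""
    return [[min(pixels) for pixels in zip(*rows)] for rows in zip(*masks)]

def combine_or(masks):
    """Per-pixel maximum: white where at least one mask is white."""
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*masks)]

def mask_hierarchy(masks):
    """Given masks ordered front to back, keep in each mask only the pixels
    that no earlier (more foreground) mask has already claimed."""
    height, width = len(masks[0]), len(masks[0][0])
    claimed = [[False] * width for _ in range(height)]
    regions = []
    for mask in masks:
        region = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                if mask[y][x] and not claimed[y][x]:
                    region[y][x] = 255
                    claimed[y][x] = True
        regions.append(region)
    return regions

# Two overlapping one-row masks, e.g. for the prompts "cat" and "sofa".
cat = [[255, 255, 0]]
sofa = [[0, 255, 255]]

print(combine_and([cat, sofa]))
print(combine_or([cat, sofa]))
print(mask_hierarchy([cat, sofa]))
```

In the hierarchy, the overlapping middle pixel goes to the cat because it comes first in the list, so the resulting regions never overlap.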

You can download the files here: https://github.com/dwringer/composition-nodes/

Note:
Currently, the version of the Transformers library (4.46.3) that's pinned for the InvokeAI package has a regression that breaks the Clipseg nodes with the error "ValueError: Input image size (352x352) doesn't match model (224x224)." The regression is fixed in Transformers 4.48.3 (and possibly in earlier releases), which you can install by activating your InvokeAI .venv and running `uv pip install transformers==4.48.3`. This will fix the Clipseg nodes, although upgrading a pinned dependency may have unintended consequences elsewhere.
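To check which Transformers version is installed in the active environment before upgrading, something like the following sketch could be run from the activated .venv. The helper names are hypothetical. Only the two version numbers come from the note above: versions between them are reported as unknown, and treating everything at or after 4.48.3 as fixed is an assumption.

```python
# Illustrative sketch only; helper names are hypothetical.
import re
from importlib import metadata

KNOWN_BROKEN = (4, 46, 3)  # version reported above as failing
KNOWN_FIXED = (4, 48, 3)   # version reported above as working

def parse_version(text):
    """Read the leading major.minor.patch numbers from a version string."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)

def clipseg_status(version_text):
    """Classify a Transformers version using only the two data points above."""
    version = parse_version(version_text)
    if version == KNOWN_BROKEN:
        return "known broken"
    if version >= KNOWN_FIXED:
        return "reported fixed"  # assumes no later regression
    return "unknown"

def installed_transformers_version():
    """Return the installed Transformers version, or None if it is not installed."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None

installed = installed_transformers_version()
if installed is None:
    print("transformers is not installed in this environment")
else:
    print(installed, "->", clipseg_status(installed))
```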


normal tiger
#

What an amazing node!

sly moon
#

finally getting around to playing with this, this is super cool! Wonder how results compare to SAM

buoyant aspen
sly moon
#

based on my understanding, it's possible to do text -> segment with SAM, but its main intended use is segmenting selected parts of the image

#

in any case, I think we want something like this or @normal tiger 's remove background node as part of core (leaning towards this as it's more general and doesn't rely on rembg) and then continue thinking about a SAM node that could function similar to FaceIdentifier

multiple ways to achieve this 🤔

normal tiger
sly moon
sage yew
#

Hi, and thanks for the great work. I was using these nodes (especially clipseg text_mask),
but after the latest upgrade of InvokeAI they no longer seem to work as before.
Could you please update them?
Thank you

steel orchid
#

What is the appropriate content for the `__init__.py` file?

rose vapor
#

Mine says:
```
### composition-nodes/__init__.py

from .clipseg import *
from .clipseg_adv import *
from .cmyk import *
from .image_blend import *
from .image_composite import *
from .image_enhance import *
from .image_offset import *
from .image_rotate import *
from .latent_masked_blend import *
from .latents_offset import *
from .noise_s import *
from .shmmask import *
from .text_mask import *