# How to get worker to save multiple images to S3?

little fractal
#

Hey all - my comfyui workflow is saving multiple images from throughout the workflow......however in the S3 upload, the worker is only saving one image - do you know how I can have it to save the multiple images into the same directory in S3?

untold summit
#

I think you need to change the code for uploading files to s3 in the handler

#

It only uploads 1 result, if I'm not wrong

little fractal
#

oh I see

#

I'll dig into it - thanks!

untold summit
#

Sure, you're welcome!

narrow sky
#

zip it then upload
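
A minimal sketch of the zip approach, assuming the worker's images are `.png` files in a single output directory. `zip_images` and its parameters are hypothetical names, not part of runpod-worker-comfy. It only uses Python's standard library, so no ComfyUI zip node would be needed:

```python
import zipfile
from pathlib import Path


def zip_images(output_dir, archive_path, pattern="*.png"):
    """Bundle every file in output_dir matching pattern into one zip archive."""
    files = sorted(Path(output_dir).glob(pattern))
    # PNG/JPG data is already compressed, so store it as-is instead of deflating.
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for file_path in files:
            # arcname drops the directory prefix so the archive holds bare filenames.
            archive.write(file_path, arcname=file_path.name)
    return [file_path.name for file_path in files]
```

The resulting single archive can then go through whatever single-file upload the handler already performs.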

little fractal
#

Oh that’s a great suggestion - I wasn’t sure if comfyui had a zip node?

untold summit
#

Oh wait, what? Doesn't runpod support multi-file upload?

little fractal
#

I'm using the runpod-worker-comfy repo by blib-la - these are the lines of code that handle the upload:

```
            print(
                "runpod-worker-comfy - the image was generated and uploaded to AWS S3"
            )
```

untold summit
#

Ah, let me check

little fractal
#

FWIW, I mentioned the following in the blib-la support discord too:

I suspect the issue is in line 240 in rp_handler.py where the dict output_images is overwritten in the for statement? I unfortunately have no way to test as I can't set up the docker locally on my M1 Mac (needs a Nvidia I'm guessing?)
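
To illustrate the kind of bug suspected here, a generic sketch follows. It is not the actual code from rp_handler.py; the `outputs` structure and both function names are assumptions made for the example. Assigning to the same variable inside a loop keeps only the last image, while appending to a list keeps them all:

```python
def last_image_only(outputs):
    """Buggy pattern: the variable is overwritten on every iteration."""
    output_image = None
    for node_output in outputs.values():
        for image in node_output.get("images", []):
            output_image = image["filename"]  # previous value is lost
    return output_image


def all_images(outputs):
    """Fixed pattern: every filename is accumulated in a list."""
    output_images = []
    for node_output in outputs.values():
        for image in node_output.get("images", []):
            output_images.append(image["filename"])
    return output_images
```

If the handler follows the first pattern, switching to the second and uploading each collected file would be the change to try.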

untold summit
#

What's local_image_path?

little fractal
#

the path where the output images are stored in the worker's instance of ComfyUI I believe

untold summit
#

so a directory, and it only uploads one file?

little fractal
#

correct

untold summit
#

I'll try to check on this maybe, runpod only uploads one of the files

narrow sky
#

nah you can upload multi

compact skiff
#

It's not magic. It depends on how you code your serverless handler to run. If you need to upload multiple files then call rp_upload.upload_image (included in the runpod Python module) multiple times. If you don't like how rp_upload.upload_image works you can write your own upload routine.

Here is an example of uploading 2 files using rp_upload.upload_image:

from runpod.serverless.utils import rp_upload
import runpod

def handler(job):
    input = job['input']
    JOB_ID = job['id']

    # Each call uploads one file and returns its URL
    imageOne_url = rp_upload.upload_image(JOB_ID, "./imageOne.png")
    imageTwo_url = rp_upload.upload_image(JOB_ID, "./imageTwo.png")

    return {'imageOne_url': imageOne_url, 'imageTwo_url': imageTwo_url}

# Register the handler with the serverless runtime
runpod.serverless.start({"handler": handler})

You will have to modify this for your specific needs.

untold summit
#

Yeah, that's why I'm confused why madiator said this

#

Yeah it receives a file path in args

#

orr... you can use another utility function in the same python file as that upload_image() to upload multiple files

compact skiff
# untold summit Did you mean the upload functions automatically retrieves multiple images or fil...

I do not believe that rp_upload.upload_image handles multiple files with its default code. Here is code for a multi-file upload routine you could add and use to upload multiple files, with a wildcard:

import glob
from runpod.serverless.utils import rp_upload

def uploadMulti(job_id, path_with_wildcard):
    # Get a list of all files matching the wildcard path
    file_list = glob.glob(path_with_wildcard)
    
    # Initialize an empty list to store the URLs of uploaded files
    uploaded_urls = []
    
    # Iterate through each file in the list
    for file_path in file_list:
        # Upload the file and get the URL
        file_url = rp_upload.upload_image(job_id, file_path)
        # Append the URL to the list
        uploaded_urls.append(file_url)
    
    # Return the list of URLs
    return uploaded_urls

Here is how you call such a function:

# Example usage
def handler(job):
    input = job['input']
    JOB_ID = job['id']

    path_with_wildcard = "./images/*.png"
    uploaded_urls = uploadMulti(JOB_ID, path_with_wildcard)

    return {'uploaded_urls': uploaded_urls}
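
If the client needs to tell the images apart, one variation is to return a mapping from filename to URL instead of a bare list. The sketch below rests on assumptions: `upload_outputs` is a hypothetical helper, and the upload function is passed in as a parameter (in a real handler it would be `rp_upload.upload_image`) so the logic can be tried without S3 credentials.

```python
import glob
import os


def upload_outputs(job_id, path_with_wildcard, upload_fn):
    """Upload every matching file and return {filename: url}.

    upload_fn is any callable taking (job_id, file_path) and returning a URL,
    e.g. rp_upload.upload_image in a real worker.
    """
    urls_by_name = {}
    # sorted() gives a stable order; glob.glob makes no ordering guarantee.
    for file_path in sorted(glob.glob(path_with_wildcard)):
        urls_by_name[os.path.basename(file_path)] = upload_fn(job_id, file_path)
    return urls_by_name
```

Inside the handler this could be called as `upload_outputs(JOB_ID, "./images/*.png", rp_upload.upload_image)`.
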
little fractal
#

Thank you @compact skiff - this is super helpful. I’m actually looking to download - rather than upload - multiple images from the output folder. Basically my workflow is producing multiple images and I want to present them all to the user to choose the one they want?

untold summit
#

if you want to download inside serverless, there is boto3, or you can use the rp_download from the runpod's pip package

compact skiff
# little fractal Thank you @compact skiff - this is super helpful. I’m actually looking to...

I think what you are describing is uploading. Your workflow is likely producing image files (PNG, JPG, or similar). They are stored on the worker's disk as files. For the user to get access to them, you will have to upload those files somewhere that will host them on the Internet. My code above uploads them to an S3 bucket. With proper configuration of the S3 bucket, its URLs will be accessible to anyone on the Internet. Your handler needs to return those URLs to the user in JSON. If you are presenting a web interface to the user, then you will need to include the images in <img> tags so they can see them, or use JavaScript so the user can download them.
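
On the web side, a minimal sketch of turning the returned URLs into <img> tags might look like this. `render_gallery` is a hypothetical helper, and it assumes the handler returned a list of URLs (for example under a key such as `uploaded_urls`):

```python
from html import escape


def render_gallery(image_urls):
    """Return an HTML fragment with one <img> tag per URL."""
    tags = [
        # escape() keeps characters such as & and " from breaking the attribute.
        f'<img src="{escape(url, quote=True)}" alt="Generated image {index}">'
        for index, url in enumerate(image_urls, start=1)
    ]
    return "\n".join(tags)
```

The user can then pick the image they want from the rendered gallery.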

untold summit
#

yep correct you're, encryption

#

upload -> serverless worker to s3
download is s3 to serverless worker

compact skiff
#

If you do actually need to download something into your worker you can add this function:

import os
import requests

def download_file(url, local_filename):
    """Downloads a file from a URL to a local path."""
    try:
        print(f'[Enhancer]: Downloading {url}')
        if os.path.exists(local_filename):
            return local_filename, None
        with requests.get(url, stream=True) as r:
            r.raise_for_status()

            with open(local_filename, 'wb') as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)

        return local_filename, None

    except Exception as e:
        return None, e

You can call it like this:

result, error = download_file(url, local_path)
little fractal
#

Ahh thank you both!

#

Btw, from a performance point of view, is it better to go via S3 or bring the images directly to the local device? I.e., export them as a list of base64 strings

untold summit
#

Try both, but if you do many images at one time I guess s3 is better

#

S3 is cleaner. Depending on your provider, it can also be faster if the provider lets you connect a CDN to your S3 buckets

compact skiff
# little fractal Btw from a performance point of view is it better to go via S3 or bring directly...

When you run a serverless worker the only thing that will be returned is JSON (text). Below is an example of a run of one of my ToonCrafter workers:

{
  "delayTime": 491,
  "executionTime": 135484,
  "id": "your-unique-id-will-be-here",
  "output": {
    "output_video_url": "https://mybucket.nyc3.digitaloceanspaces.com/ToonCrafter/2024_06_16_16.20.48.mp4"
  },
  "status": "COMPLETED"
}

You can see how I returned an S3 URL to a video file. Nothing on the serverless worker's disk persists, so any files not uploaded somewhere are lost.

One other option would be to convert your image to BASE64. Since BASE64 is just text, you can return that in the actual JSON response. However, there is a size limit on how much data you can include in a response. You could likely return one image encoded as BASE64, but I wouldn't suggest that route for multiple images.
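
For reference, a minimal sketch of the BASE64 route, assuming the image bytes are already in memory. `encode_images` is a hypothetical helper, and the 300 bytes below are stand-in data rather than a real image. BASE64 turns every 3 bytes into 4 characters, so the payload grows by about a third, which is part of why it scales poorly to multiple images:

```python
import base64
import json


def encode_images(images):
    """Map {filename: raw bytes} to {filename: BASE64 text} for a JSON response."""
    return {
        name: base64.b64encode(data).decode("ascii")
        for name, data in images.items()
    }


raw = {"a.png": b"\x89PNG" + bytes(296)}  # 300 bytes of stand-in image data
payload = json.dumps({"images": encode_images(raw)})
print(len(raw["a.png"]), len(encode_images(raw)["a.png"]))  # 300 400
```
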

But again, nothing persists after an API call to a serverless worker. You have to move the results somewhere that will persist and give a link to it in the JSON that is returned.

little fractal
#

Thank you thank you both!!

untold summit
#

Yep might be slower too if you return base64.. Depending on your network connection

compact skiff
#

I've just finished building a WebSocket proxy... that creates a WebSocket connection between the serverless worker and the user's browser. With that you could send the results directly to the user's browser. I'm not giving that code out yet though... I am considering if I want to run it as a service.

untold summit
#

Wew how's the speed

compact skiff
#

Hard to say.... it depends on a lot of factors: upload/download speed at the given runpod region, and upload/download speed of the user's browser. I have done tests with streaming logs from the worker to the browser. I am currently working on code that will stream webcam video to the worker from the browser and video in the reverse. Transferring media (images, videos) should be no problem in most scenarios.... but if a user was on a slow link it could be.