#Excel file storage size

minor rapids
#

I keep getting this error randomly: "Your request couldn't be completed. Try again later" when manipulating a 5MB Excel file within an internal action. Is there a way to find the underlying cause of an action error with a message like this so I can fix it? Also, should Convex be able to handle such a file?

vivid kindleBOT
#

Hi @minor rapids. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Based on the knowledge sources, I can help explain the issue you're experiencing with your Excel file in Convex.

The error message "Your request couldn't be completed. Try again later" is a generic error that can occur for several reasons. From the information provided, it appears you might be hitting one of Convex's limits when handling your 5MB Excel file in an internal action.

There are a few potential causes for this issue:

  1. Function argument size limits: Convex has an 8MB limit for function arguments, as mentioned in the documentation. While your 5MB file is technically under this limit, there might be additional overhead in how Convex encodes byte arrays that pushes you over an internal limit.

  2. AWS Lambda size limits: As explained in one of the support threads, when using "use node" for internal actions, you might hit AWS Lambda size limits that are actually smaller than your file size due to encoding overhead:

    "This is hitting an AWS Lambda size limit, larger than your actual bytearray size due to the way we encode bytearray arguments." Discord support thread

  3. Memory usage: When processing large files, you might be hitting memory constraints in the Convex environment.

Recommended solutions:

  1. Use file storage instead of passing the file directly: Instead of passing the file as an argument to your function, upload it to Convex's file storage first and then pass the storage ID to your action:

    "You can also read larger data into actions by writing it to file storage, then passing the storage id to the function to read / stream" Discord support thread

#
  2. Avoid "use node" for large files: If possible, try using the standard Convex JS runtime instead of "use node" for your internal action.

  3. Stream the file if possible: If you need to process the file in chunks rather than loading it all at once, consider streaming approaches.

Regarding whether Convex should be able to handle a 5MB file - yes, Convex should be able to handle files of this size, especially through its file storage system which has much higher limits (1 GiB on Starter plan, 100 GiB on Professional plan) Production > State > Limits.
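
The file-storage route recommended above can be sketched roughly as follows. This is a minimal sketch, not the thread author's code: `StorageReader`, `BlobLike`, and `loadFileBytes` are hypothetical names, and the interface is only a structural stand-in for the storage reader on a Convex action context, whose `get` call resolves to a Blob or `null`.

```typescript
// Structural stand-ins (assumptions for this sketch) for the pieces of the
// Convex action context used here: a reader whose `get` resolves to a
// Blob-like object, or null when the storage ID is unknown.
type BlobLike = { arrayBuffer(): Promise<ArrayBuffer> };

interface StorageReader {
  get(storageId: string): Promise<BlobLike | null>;
}

// Hypothetical helper: load the stored file's bytes inside the action, so only
// the small storage ID travels as a function argument.
async function loadFileBytes(
  storage: StorageReader,
  storageId: string,
): Promise<Uint8Array> {
  const blob = await storage.get(storageId);
  if (blob === null) {
    throw new Error(`No file found for storage ID ${storageId}`);
  }
  const bytes = new Uint8Array(await blob.arrayBuffer());
  // Logging the size helps correlate failures with unusually large files.
  console.log(`Loaded ${bytes.byteLength} bytes from storage`);
  return bytes;
}
```

Inside a real action, the first argument would presumably be `ctx.storage`, and the returned bytes could then be handed to whatever parser the action uses.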

For better error diagnostics, you might want to set up log streaming as mentioned in the documentation to get more detailed information about what's happening when these errors occur.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@minor rapids Kapa.ai is still learning and improving, please let me know how I did by reacting below.

minor rapids
#

@vivid kindle I am already using the file storage. I don't pass the file as an argument of the action

vivid kindleBOT
#

Let me check the knowledge base for you again. One moment! :books:

#

Thank you for clarifying. If you're already using file storage and not passing the file as an argument to your action, then the issue is likely different from what I initially suggested.

Since you're getting the "Your request couldn't be completed. Try again later" error while manipulating a 5MB Excel file within an internal action (but not passing it as an argument), here are some other potential causes:

  1. Memory usage during processing: Even when reading from file storage, if your action is loading the entire 5MB file into memory at once for processing, you might still hit memory limits in the Convex environment.

  2. Action timeout: If your processing takes too long, the action might be timing out.

  3. Streaming issues: If you're streaming the file from storage but not handling the stream properly, this could cause issues.

From one of the Discord support threads, a similar issue was reported when handling large API responses:

"I'm having an issue with handling large API responses in Convex. The endpoint returns responses up to 150MB in size. The request works fine via curl, but fails when i run the action from the Convex console with: 'Connection lost while action was in flight'" Discord support thread

For better error diagnostics, you could try:

  1. Add more detailed logging: Add try/catch blocks with detailed error logging to pinpoint where exactly the failure occurs.

  2. Process in smaller chunks: Instead of loading the entire Excel file at once, try processing it in smaller chunks if possible.

  3. Use "use node": If you're not already, you might try using "use node" for your action as it might provide a different runtime environment with different limits.

#
  4. Check for concurrent request limits: If you're making multiple concurrent requests, you might be hitting limits:

    "Your backend is limited to 64 concurrent actions. To get more resources, upgrade to Convex Pro." Discord support thread

Without seeing your specific code and the exact error details, it's difficult to pinpoint the exact cause. If the issue persists, you might want to share your code with Convex support for more targeted assistance.
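
The detailed-logging suggestion above could look something like this. It is a sketch under assumptions: `timedStep` is a hypothetical helper, not part of Convex, and it only wraps one named step so the logs show which step ran last and how long it took.

```typescript
// Hypothetical helper: run one named step, log how long it took, and on
// failure log which step failed before rethrowing the original error.
async function timedStep<T>(
  name: string,
  fn: () => Promise<T> | T,
): Promise<T> {
  const start = Date.now();
  try {
    const result = await fn();
    console.log(`[${name}] ok in ${Date.now() - start} ms`);
    return result;
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    console.error(`[${name}] failed after ${Date.now() - start} ms: ${message}`);
    throw error; // keep the original error so callers still see the failure
  }
}
```

Wrapping each stage (opening the workbook, validating rows, running mutations) in its own `timedStep` call narrows down where a failure happens. If the runtime is terminated outright, for example by running out of memory, the `catch` block may never run, but the last "ok" line in the logs still indicates which step was in flight.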

minor rapids
#

@vivid kindle since the error message is generic, is there a way to investigate further details on it?

vivid kindleBOT
#

Based on the knowledge sources, there are a few ways to investigate the generic "Your request couldn't be completed. Try again later" error message in Convex:

  1. Error handling with try/catch blocks: Implement proper error handling in your code to catch specific errors. Convex errors are typically thrown as ConvexError or ServerError objects that you can catch and examine:

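    import { ConvexError } from "convex/values";

    try {
      // The code that fails intermittently goes here
    } catch (error) {
      if (error instanceof ConvexError) {
        // Application error thrown deliberately; `data` holds its payload
        console.error("ConvexError:", error.data);
      } else {
        // Anything else, including unexpected runtime errors
        console.error("Other error:", error);
      }
    }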
    

    As mentioned in the documentation: "If an error occurs during a call to mutation, it will throw an exception. Typically you may want to catch ConvexError and ServerError and handle them however is appropriate in your application." Client > Android Kotlin

  2. Set up observability tools: Implement dedicated observability tools to get more actionable data about errors:

    "By setting up dedicated tools, you can get actionable data to help understanding errors, performance, user behavior and allow you respond quickly as data changes." Stack > Observing your app in production

  3. Check for memory usage: The error might be related to memory limits. In one Discord thread, a support staff member mentioned: "Sounds like you're hitting the memory limits in the 'use node' environment." Discord support thread

#
  4. Look for import cycles: In some cases, import cycles can cause cryptic errors. As mentioned in a Discord thread: "it was an import cycle, thanks for that" Discord support thread

  5. Check for streaming issues: If you're streaming data, there might be issues with how the response is being handled: "this isn't something i realized would happen... btw we only log the [WARN] line if you've already returned a Response and it's in the middle of streaming the response body" Discord support thread

  6. Implement retry mechanisms: For actions that might fail transiently, consider implementing retry mechanisms with exponential backoff as described in Stack > Automatically Retry Actions

Without seeing your specific implementation, it's difficult to pinpoint the exact cause, but these approaches should help you gather more information about what's causing the generic error message.
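
As an illustration of the retry suggestion above, here is a generic in-process sketch of exponential backoff. It is not the approach from the Stack article and not a Convex API: `retryWithBackoff` and its parameters are hypothetical names chosen for this example.

```typescript
// Hypothetical helper: retry an async operation with exponential backoff,
// giving up after `maxAttempts` tries and rethrowing the last error.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts - 1) break; // no delay after the final try
      const delayMs = baseDelayMs * 2 ** attempt; // doubles on each attempt
      console.warn(`Attempt ${attempt + 1} failed; retrying in ${delayMs} ms`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

Retrying like this is only safe when the wrapped operation can be repeated without side effects, so it suits idempotent steps such as re-reading a file more than steps that write data.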

minor rapids
#

@vivid kindle I am using the workpool component to process the Excel file in scheduled batches. For each batch, I can "open" the workbook with the xlsx package (and nothing else) without issue. After that, I validate the data with reduce functions to format it the way I need, then query for existing records and upsert them later with mutations. All of this runs in a for loop over batches of 500 rows to avoid hitting the Convex limits.

vivid kindleBOT
#

Based on your approach of using the workpool component to process Excel files in scheduled batches, you're on the right track with several best practices for handling large data processing in Convex.

Your current implementation includes:

  1. Using the workpool component for batch processing
  2. Opening the workbook with xlsx package
  3. Validating and formatting data with reduce functions
  4. Querying for existing records and upserting with mutations
  5. Processing in batches of 500 rows to avoid hitting limits

This approach aligns well with recommended patterns for high-throughput operations in Convex. According to the knowledge sources, there are a few considerations that might help optimize your workflow:

Transaction Throughput Optimization

When dealing with high-throughput mutations, it's important to ensure your mutations are only reading necessary documents:

"If you're having problems with mutation throughput, it may be because the mutations are reading unnecessary documents. You can sculpt ctx.db.querys to only look at the essential documents..." Stack > Optimize Transaction Throughput

Batch Size Considerations

Your batch size of 500 rows is a good starting point, but you might need to adjust based on the complexity of your data:

"The component will fetch your data in batches of 100, and call your function on each document in a batch. If you want to change the batch size, you can specify it. This can be useful if your documents are large, to avoid running over the transaction limit, or if your documents are updating frequently and you are seeing OCC conflicts while migrating." Components > Migrations

Handling Concurrent Operations

#

If you're experiencing "Too many concurrent requests" errors despite batching:

"Your backend is limited to 64 concurrent actions. To get more resources, upgrade to Convex Pro." Discord support thread

On Pro plans, this limit is raised based on your workload needs.

Error Handling for Scheduled Functions

Since you're using scheduled batches, be aware of how errors are handled:

"Once scheduled, mutations are guaranteed to be executed exactly once. Convex will automatically retry any internal Convex errors, and only fail on developer errors." Scheduling > Scheduled Functions

Avoiding OCC Conflicts

If you're experiencing "Another call to this mutation changed the document" errors:

"When you look up the group by groupId... that .filter is reading the whole table and filtering in memory, even though it doesn't actually care about docs not related to this group... that will make it slow as well. switch to an index there and you're good" Discord support thread

Make sure you're using indexes properly when querying for existing records before upserting.

Your approach of processing in batches of 500 rows is a good strategy to avoid hitting the 8MB read limit in Convex functions:

"Uncaught Error: Too many bytes read in a single function execution (limit: 8388608 bytes). Consider using smaller limits in your queries, paginating your queries, or using indexed queries with a selective index range expressions." Discord support thread

#

Overall, your approach seems well-structured for handling large Excel files in Convex. If you continue to experience generic errors, implementing detailed logging within your batch processing functions could help identify exactly where issues are occurring.
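
The 500-row batching discussed above can be sketched with two small helpers. Both `chunk` and `approxJsonBytes` are hypothetical names for this example, and the byte count is only a rough JSON-based estimate; Convex's own size accounting may differ.

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Rough estimate of a value's size when serialized as UTF-8 JSON.
// This is an approximation for logging, not an exact limit check.
function approxJsonBytes(value: unknown): number {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}
```

Logging `approxJsonBytes(batch)` for each batch before it is sent to a mutation is one way to notice when rows are larger than expected and the batch size needs to come down.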
