#pdf2html library not working on convex

vernal kindle
#

✖ Error fetching POST https://befitting-gazelle-225.convex.cloud/api/deploy2/start_push 400 Bad Request: InvalidModules: Hit an error while pushing:
Loading the pushed modules encountered the following
error:
Uncaught Failed to analyze functions/fileManagement/morphing.js: Could not load the "sharp" module using the linux-arm64 runtime
Possible solutions:

limber pier BOT
#

Hi @vernal kindle. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

The error message you’re seeing is not directly about the pdf2html library, but rather about the sharp module failing to load in the Convex environment. This is a common issue when using native Node.js modules like sharp, especially when they are not installed with the correct platform-specific binaries for the Convex runtime (which uses linux-arm64).

Here’s what you can do to resolve this:

  1. Mark sharp as an external package in your convex.json:

    {
      "node": {
        "externalPackages": ["sharp"]
      }
    }
    

    This tells Convex to install sharp on the server rather than bundling it, which native modules need in order to work in Convex’s Node.js runtime. Make sure the file that defines your function has the "use node" directive at the top.

  2. Install sharp with the correct platform binaries. Since Convex runs on linux-arm64, you need to ensure that sharp is installed with the appropriate binaries. You can do this by running:

    npm install --os=linux --cpu=arm64 sharp
    

    Or, if you use optional dependencies:

    npm install --include=optional sharp
    

    If you’re using pnpm or another package manager, make sure it supports multi-platform installations. See the sharp installation docs for more details.

  3. Check for GLIBCXX errors. If you see errors about missing GLIBCXX versions, the installed binaries are probably incompatible with the runtime. Use a recent version of sharp (0.33.5 and 0.34.1 have been reported to work) and keep your local environment as close to the Convex runtime as you can. Some users have resolved this by downgrading sharp or by switching package managers (e.g., from pnpm to npm). Keep in mind that external packages are installed on Convex's servers, not on your machine, so the error may not be tied to your local OS or architecture.

  4. If you’re using bun as your package manager, it has been reported to work with sharp as long as you follow the steps above and use the Node.js runtime with externalized modules.
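As a quick sanity check for step 1, here is a minimal sketch that reports which native packages are not yet listed under node.externalPackages. The helper name notExternalized and the ConvexConfig type are made up for illustration and are not part of Convex. Only the node.externalPackages key comes from the config shown above, and the "*" wildcard handling is an assumption about how Convex treats that entry.

```typescript
// Only the part of convex.json that step 1 touches.
interface ConvexConfig {
  node?: { externalPackages?: string[] };
}

// Hypothetical helper (not a Convex API): returns the packages from `needed`
// that are not listed under node.externalPackages.
function notExternalized(config: ConvexConfig, needed: string[]): string[] {
  const listed = config.node?.externalPackages ?? [];
  // Assumption: a "*" entry marks every dependency as external.
  if (listed.indexOf("*") !== -1) return [];
  return needed.filter((name) => listed.indexOf(name) === -1);
}

// The config from step 1, written as an object instead of read from disk.
const exampleConfig: ConvexConfig = { node: { externalPackages: ["sharp"] } };

console.log(notExternalized(exampleConfig, ["sharp"])); // nothing missing
console.log(notExternalized({}, ["sharp"])); // "sharp" is reported as missing
```

In a real project you would parse convex.json from disk and pass in whichever native modules your "use node" files import.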

If you continue to have issues, double-check that all dependencies required by pdf2html (and any of its transitive dependencies like sharp) are installed and externalized as needed. If a library fundamentally relies on native binaries that are not compatible with the Convex runtime, it may not be possible to use it directly in Convex. In that case, consider alternative approaches or libraries.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@vernal kindle Kapa.ai is still learning and improving, please let me know how I did by reacting below.

vernal kindle
#

@limber pier adding sharp to convex.json seems to have worked, but i have a different error now:
✖ Error fetching POST https://befitting-gazelle-225.convex.cloud/api/deploy2/start_push 400 Bad Request: InvalidModules: Hit an error while pushing:
Loading the pushed modules encountered the following
error:
Uncaught Failed to analyze functions/fileManagement/morphing.js: Cannot find module './xhr-sync-worker.js'
Require stack:

  • /tmp/source/f6ca4598-58b3-473e-9e50-2d012927f483/modules/functions/fileManagement/morphing.js
    at <anonymous> (../../../node_modules/jsdom/lib/jsdom/living/xhr/XMLHttpRequest-impl.js:38:2)
    at <anonymous> (convex:/user/functions/fileManagement/morphing.js:14:31)
    at <anonymous> (../../../node_modules/jsdom/lib/jsdom/living/generated/XMLHttpRequest.js:654:13)
    at <anonymous> (convex:/user/functions/fileManagement/morphing.js:14:31)
    at <anonymous> (../../../node_modules/jsdom/lib/jsdom/living/interfaces.js:181:13)
    at <anonymous> (convex:/user/functions/fileManagement/morphing.js:14:31)
    at <anonymous> (../../../node_modules/jsdom/lib/jsdom/browser/Window.js:32:8)
    at <anonymous> (convex:/user/functions/fileManagement/morphing.js:14:31)
    at <anonymous> (../../../node_modules/jsdom/lib/api.js:18:46)
    at <anonymous> (convex:/user/functions/fileManagement/morphing.js:14:31)
    at <anonymous> (../../../node_modules/pdf2html/lib/HTMLParser.js:11:17)
    at <anonymous> (convex:/user/functions/fileManagement/morphing.js:14:31)
    at <anonymous> (../../../node_modules/pdf2html/lib/PDFProcessor.js:11:19)
    at <anonymous> (convex:/user/functions/fileManagement/morphing.js:14:31)
    at <anonymous> (../../../node_modules/pdf2html/index.js:9:7)
    at <anonymous> (convex:/user/functions/fileManagement/morphing.js:14:31)
    at <anonymous> (../../../convex/functions/fileManagement/morphing.ts:5:20)
limber pier BOT
#

Let me check the knowledge base for you again. One moment! :books:

#

The error you are seeing:

Cannot find module './xhr-sync-worker.js'
Require stack:
- .../jsdom/lib/jsdom/living/xhr/XMLHttpRequest-impl.js
...

indicates that the Convex bundler is unable to resolve a file required by the jsdom library, which is a dependency of pdf2html. This is a known limitation with Convex's bundling process: some libraries, especially those like jsdom that rely on dynamic imports or expect access to local files, may not work because esbuild (the bundler used by Convex) cannot always detect and include these files. This is mentioned in the Convex documentation:

Additionally, some libraries rely on local files, which cannot be bundled by esbuild. If bundling is used, irrespective of the choice of runtime, these imports will always fail in Convex.

Examples given include libraries like sharp and others that rely on dynamic dependencies or local files, which matches the issue you're experiencing with jsdom and pdf2html (see "Bundling limitations").
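To see which packages a failure like this actually passes through, you can pull the package names out of the node_modules paths in the require stack. The sketch below is illustrative only: packagesInStack is a hypothetical helper, and the sample lines are copied from the trace posted above.

```typescript
// Hypothetical helper: list the npm packages named in a require stack,
// in order of first appearance.
function packagesInStack(stack: string): string[] {
  // Matches "node_modules/<name>" or "node_modules/@scope/<name>".
  const pattern = /node_modules\/((?:@[^\/\s]+\/)?[^\/\s]+)/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(stack)) !== null) {
    const name = match[1];
    if (found.indexOf(name) === -1) found.push(name);
  }
  return found;
}

// A few frames from the trace in this thread.
const sampleStack = [
  "at <anonymous> (../../../node_modules/jsdom/lib/jsdom/living/xhr/XMLHttpRequest-impl.js:38:2)",
  "at <anonymous> (../../../node_modules/jsdom/lib/api.js:18:46)",
  "at <anonymous> (../../../node_modules/pdf2html/lib/HTMLParser.js:11:17)",
].join("\n");

console.log(packagesInStack(sampleStack)); // ["jsdom", "pdf2html"]
```

Here the failing require sits inside jsdom, which is pulled in by pdf2html, so those are the two packages the bundling limitation applies to.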

In summary:
This error is due to a limitation in Convex's bundling system, and there is no supported workaround for libraries that require dynamic imports of local files (like jsdom and, by extension, pdf2html). If you need this functionality, you may need to run this code outside of Convex or look for alternative libraries that do not have these bundling requirements.

If you have further questions or need help finding alternatives, let me know!

vernal kindle
#

i've been looking for a better pdf2html alternative but can't find one anywhere, do you know any @limber pier?

limber pier BOT
#

Let me check the knowledge base for you again. One moment! :books:

#

I'm sorry, but the knowledge sources do not contain information about alternative libraries to pdf2html that are compatible with Convex. The available documentation and support threads mainly discuss issues with bundling and compatibility for libraries like sharp and jsdom, but do not recommend specific alternatives for PDF-to-HTML conversion within Convex. If you need further assistance, you might consider asking in the Convex Discord Community for up-to-date recommendations from other developers and the Convex team.

cerulean veldt