#Server Times out since update to 1.8

raw aspen
#

Our server times out since the update to 1.8. When I rebuild it works fine for a couple of mins and then it starts to become unresponsive, e.g. a simple /store/products?limit=7 runs for some time and then times out. I can see the request on the server side coming in though...

[2023-04-08 13:04:24] 162.158.102.252 - - [08/Apr/2023:13:04:24 +0000] "GET /store/products?limit=7 HTTP/1.1" - - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"

I do not see anything suspicious in the server logs on Digital Ocean - no error messages or weird logs.

The postgres connection seems to be working just fine as well...

I know this is not super helpful but I do not even know where to start here. It worked just fine before the upgrade to 1.8.

#

When I redeploy the server it works again for a couple of minutes until it starts to become unresponsive and time out

raw aspen
#

I followed the precise update docs for 1.8… is anyone seeing a similar behaviour?

pulsar crag
#

Are specific actions causing the timeout?

raw aspen
#

I am still trying to determine that.

#

The only thing I’m really doing is a call to the products endpoint

#

And it works at first

#

It works locally as well

#

It’s hard to tell since there are no logs 🙈

pulsar crag
#

Hmm. And this leads to a timeout? Also, can you share the logs of the actual timeout?

raw aspen
#

504 Gateway timeout

#

Could it even be a DO issue? Their status page says all clear though

#

I don’t see logs of the timeout on the server. That’s the thing.

pulsar crag
#

Hmm

#

My thinking is that it's related to DO. Do you have access to any monitoring on DO? Just so we have something to start from. Otherwise, it's gonna be super hard to debug.

raw aspen
#

So it seems to affect only some routes, like the products endpoint. Some custom endpoints of mine work just fine. So it is not DO related. Very weird indeed.

#

It just times out... I have not touched anything there afaik. Is there a way I could inject some logging into the product routes to see what is going on? There is literally nothing on the server logs that seems weird. No error, no issues. The request is even shown on the specific route
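
One way to get visibility into a route that hangs without logging anything is a small Express-style timing middleware. The following is a minimal sketch, not Medusa's own API: `requestTimer`, the `log` callback, and the 5-second threshold are hypothetical, and how the middleware is registered in front of `/store/products` depends on the project setup.

```javascript
// Hypothetical helper: logs when a request starts and finishes, and warns
// if the response is still pending after `slowMs` milliseconds.
function requestTimer({ slowMs = 5000, log = console.log } = {}) {
  return function (req, res, next) {
    const startedAt = Date.now();
    const label = `${req.method} ${req.originalUrl || req.url}`;
    log(`--> ${label}`);

    // Fires only if the response has not finished in time.
    const timer = setTimeout(() => {
      log(`!!! ${label} still pending after ${slowMs} ms`);
    }, slowMs);

    // "finish" is emitted by Node's http.ServerResponse once the response is sent.
    res.once("finish", () => {
      clearTimeout(timer);
      log(`<-- ${label} ${res.statusCode} in ${Date.now() - startedAt} ms`);
    });
    // "close" also fires when the client or proxy gives up first.
    res.once("close", () => clearTimeout(timer));

    next();
  };
}
```

In a plain Express app this would be mounted with `app.use(requestTimer())`. A request that logs `-->` and `!!!` but never `<--` is stuck inside the handler rather than being dropped by the gateway.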

raw aspen
#

When I redeploy the server it works for like 5 minutes - then it starts to time out again...

raw aspen
#

I downgraded back to 1.7.8 and everything works again… I am somehow surprised I seem to be the only one with that issue

grizzled thorn
#

I'm currently facing a very similar issue when visiting some product routes in the admin; however, I've got something in my console:

<--- Last few GCs --->

[5528:000002342FC729B0]    38860 ms: Mark-sweep 4044.3 (4137.4) -> 4034.3 (4140.1) MB, 1100.0 / 0.0 ms  (average mu = 0.412, current mu = 0.020) allocation failure; scavenge might not succeed
[5528:000002342FC729B0]    39828 ms: Mark-sweep 4050.2 (4140.1) -> 4040.0 (4146.1) MB, 947.0 / 0.0 ms  (average mu = 0.242, current mu = 0.021) allocation failure; scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
 1: 00007FF6072E9E7F node_api_throw_syntax_error+175967
 2: 00007FF607270C06 SSL_get_quiet_shutdown+65750
 3: 00007FF607271FC2 SSL_get_quiet_shutdown+70802
 4: 00007FF607D0A214 v8::Isolate::ReportExternalAllocationLimitReached+116
 5: 00007FF607CF5572 v8::Isolate::Exit+674
 6: 00007FF607B773CC v8::internal::EmbedderStackStateScope::ExplicitScopeForTesting+124
 7: 00007FF607B745EB v8::internal::Heap::CollectGarbage+3963
 8: 00007FF607B8A823 v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath+2099
 9: 00007FF607B8B0CD v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath+93
10: 00007FF607B9A903 v8::internal::Factory::NewFillerObject+851
11: 00007FF60788BEB5 v8::internal::DateCache::Weekday+1349
12: 00007FF607DA78B1 v8::internal::SetupIsolateDelegate::SetupHeap+558193
13: 00007FF607DFFC0E v8::internal::SetupIsolateDelegate::SetupHeap+919502

This is on a local dev environment
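
The trace above shows the V8 heap sitting at roughly 4 GB right before the crash. A minimal sketch for watching memory grow while reproducing the problem, using Node's built-in `process.memoryUsage()`; the helper names and the 10-second interval are illustrative choices, not something taken from Medusa.

```javascript
// Returns the current memory figures in megabytes, rounded to one decimal.
function memorySnapshot() {
  const toMb = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  return { rssMb: toMb(rss), heapTotalMb: toMb(heapTotal), heapUsedMb: toMb(heapUsed) };
}

// Hypothetical usage: print a snapshot at a fixed interval.
// unref() keeps this timer from holding the process open on its own.
function startMemoryLog(intervalMs = 10000, log = console.log) {
  const timer = setInterval(() => log(JSON.stringify(memorySnapshot())), intervalMs);
  timer.unref();
  return timer;
}
```

If `heapUsedMb` climbs steadily after hitting one specific route, that route is the place to look. Raising the limit with Node's `--max-old-space-size` flag would only delay the crash in that case.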

raw aspen
#

Very interesting.

#

Does this happen occasionally or constantly? Can you specify on which routes?

#

How many products do you have locally?

grizzled thorn
#

Just on some admin routes

#

Like 7 or 8?

#

Maybe it has something to do with my local redis version

raw aspen
#

Ok interesting.

#

I also somehow have redis in mind. But I have no evidence

grizzled thorn
#

Here's my indication from the log

info:    Connection to Redis in module 'event-bus-redis' established
√ Modules initialized – 52ms
√ Database initialized – 41ms
√ Repositories initialized – 34ms
√ Services initialized – 9ms
√ Express intialized – 7ms
\ Initializing plugins
It is highly recommended to use a minimum Redis version of 6.2.0
           Current: 6.0.16
It is highly recommended to use a minimum Redis version of 6.2.0
           Current: 6.0.16
It is highly recommended to use a minimum Redis version of 6.2.0
           Current: 6.0.16
√ Plugins intialized – 785ms
√ Subscribers initialized – 10ms
√ API initialized – 55ms
grizzled thorn
#

Still occurs with an updated redis server

#

But only on 1 specific product route I found

raw aspen
#

Ok I reran the upgrade and it seems to be fine now but this time I added the admin plugin and added admin to /app. I cannot imagine there is a coupling there but could this be related in any way? The server is now running without any issues since multiple hours… really interesting issue

raw aspen
#

And it times out again 😂

#

Agh this is a real bummer. It happens after the product route is hit. At some point it just times out... Really stuck here since there are no error logs, nothing suspicious or anything else...

raw aspen
#

Does anyone have any recommendations what I should be looking into? I reverted back to 1.7.8

pulsar crag
#

Hmm, do you have a cache module configured?

bronze cloak
#

That seems to be a pretty strange one, do you have any memory, cpu etc metrics to share, log files, v8 monitoring or such? We are using the 1.8 on our staging and it doesn’t produce this behavior, everything is loading and the server does not timeout

raw aspen
#

Oliver, I do not know to be honest. Can you elaborate?

#

All metrics are super normal. CPU usage is at 15%, all metrics super low / normal

#

I will actually maybe just try to install the server from scratch

bronze cloak
#

What Olivier means is are you using any specific module for the cache mechanism or the default cache module?

raw aspen
#

Default

bronze cloak
#

And for the event?

#

Default or another one?

raw aspen
#

I am using all default

bronze cloak
#

Would be nice if you could try with the redis event module and the redis cache module

raw aspen
#

Sorry I am using redis. That’s what I believe at least

bronze cloak
#

Could you share your medusa config as well so we can get a better idea of your env

raw aspen
#

```js
const dotenv = require("dotenv");

let ENV_FILE_NAME = "";
switch (process.env.NODE_ENV) {
  case "production":
    ENV_FILE_NAME = ".env.production";
    break;
  case "staging":
    ENV_FILE_NAME = ".env.staging";
    break;
  case "test":
    ENV_FILE_NAME = ".env.test";
    break;
  case "development":
  default:
    ENV_FILE_NAME = ".env";
    break;
}

try {
  dotenv.config({ path: process.cwd() + "/" + ENV_FILE_NAME });
} catch (e) {}

// CORS when consuming Medusa from admin
const ADMIN_CORS =
  process.env.ADMIN_CORS || "http://localhost:7000,http://localhost:7001";

// CORS to avoid issues when consuming Medusa from a client
const STORE_CORS = process.env.STORE_CORS || "http://localhost:8000";

let DATABASE_EXTRA = {};
if (process.env.NODE_ENV === "production") {
  DATABASE_EXTRA = { ssl: { rejectUnauthorized: false } };
}

const DATABASE_URL = process.env.DATABASE_URL;

// Medusa uses Redis, so this needs configuration as well
const REDIS_URL = process.env.REDIS_URL || "redis://localhost:6379";

// Algolia
const ALGOLIA_APP_ID = process.env.ALGOLIA_APP_ID;
const ALGOLIA_ADMIN_API_KEY = process.env.ALGOLIA_ADMIN_API_KEY;
```
#
```js
const STRIPE_API_KEY = process.env.STRIPE_API_KEY || "";
const STRIPE_WEBHOOK_SECRET = process.env.STRIPE_WEBHOOK_SECRET || "";

// This is the place to include plugins. See API documentation for a thorough guide on plugins.
const plugins = [
  `medusa-fulfillment-manual`,
  `medusa-payment-manual`,
  {
    resolve: `medusa-payment-stripe`,
    options: {
      api_key: STRIPE_API_KEY,
      webhook_secret: STRIPE_WEBHOOK_SECRET,
      automatic_payment_methods: true,
      capture: true,
    },
  },
  {
    resolve: `medusa-payment-paypal`,
    options: {
      sandbox: process.env.PAYPAL_SANDBOX,
      client_id: process.env.PAYPAL_CLIENT_ID,
      client_secret: process.env.PAYPAL_CLIENT_SECRET,
      auth_webhook_id: process.env.PAYPAL_AUTH_WEBHOOK_ID,
    },
  },
];

module.exports = {
  projectConfig: {
    database_type: "postgres",
    database_url: DATABASE_URL,
    database_extra: DATABASE_EXTRA,
    store_cors: STORE_CORS,
    admin_cors: ADMIN_CORS,
    redis_url: REDIS_URL,
  },
  plugins,
  featureFlags: {
    tax_inclusive_pricing: true, // Prob need to remove some time
  },
  eventBus: {
    resolve: "@medusajs/event-bus-redis",
    options: {
      redisUrl: REDIS_URL,
    },
  },
  cacheService: {
    resolve: "@medusajs/cache-redis",
    options: {
      redisUrl: REDIS_URL,
    },
  },
};
```
#

(I had added the admin plugin but removed it just now; it shouldn't have any impact though)

bronze cloak
#

The event bus and cache service should be nested under modules. At the top level they are ignored, so they are not being used here. Could you make that change, please?
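
To illustrate the point, a small check like the following could flag module definitions that sit at the top level of the exported config instead of under `modules`. This is a hypothetical helper, not part of Medusa; the key names `eventBus` and `cacheService` are the ones from the config posted above, with options trimmed.

```javascript
// Keys that, per the message above, belong under `modules`.
const MODULE_KEYS = ["eventBus", "cacheService"];

// Returns the module keys found at the top level of the config object.
function findMisplacedModules(config) {
  return MODULE_KEYS.filter(
    (key) => key in config && !(config.modules && key in config.modules)
  );
}

// Shape of the config posted above (options trimmed for brevity).
const posted = {
  projectConfig: { redis_url: "redis://localhost:6379" },
  eventBus: { resolve: "@medusajs/event-bus-redis" },
  cacheService: { resolve: "@medusajs/cache-redis" },
};

// Same definitions nested under `modules`.
const nested = {
  projectConfig: posted.projectConfig,
  modules: { eventBus: posted.eventBus, cacheService: posted.cacheService },
};

console.log(findMisplacedModules(posted)); // [ 'eventBus', 'cacheService' ]
console.log(findMisplacedModules(nested)); // []
```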

raw aspen
#

Omg… let me double check that

#

I’m so sorry. What a crazily stupid mistake

bronze cloak
#

Haha, no worries. Before jumping to conclusions, let's see if that fixes your issue

raw aspen
#

I will get back here tomorrow

#

Thanks so much Adrien 🙏

solemn cedar
#

It looks like I have the same problem, or a very similar one to yours. I am trying to deploy Medusa Backend which has the Admin plugin as well.

I am getting a FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory error

#

I'm on Digital Ocean, and I tried upgrading to Basic GB RAM to see if that had any effect. It did not.

solemn cedar
#

This is my medusa-config.js file:


```js
const dotenv = require("dotenv");

let ENV_FILE_NAME = "";
switch (process.env.NODE_ENV) {
  case "production":
    ENV_FILE_NAME = ".env.production";
    break;
  case "staging":
    ENV_FILE_NAME = ".env.staging";
    break;
  case "test":
    ENV_FILE_NAME = ".env.test";
    break;
  case "development":
  default:
    ENV_FILE_NAME = ".env";
    break;
}

try {
  dotenv.config({ path: process.cwd() + "/" + ENV_FILE_NAME });
} catch (e) { }

// CORS when consuming Medusa from admin
const ADMIN_CORS =
  process.env.ADMIN_CORS || "http://localhost:7000,http://localhost:7001";

// CORS to avoid issues when consuming Medusa from a client
const STORE_CORS = process.env.STORE_CORS || "http://localhost:8000";

const DATABASE_TYPE = process.env.DATABASE_TYPE || "sqlite";

const DB_USERNAME = process.env.DB_USERNAME
const DB_PASSWORD = process.env.DB_PASSWORD
const DB_HOST = process.env.DB_HOST
const DB_PORT = process.env.DB_PORT
const DB_DATABASE = process.env.DB_DATABASE

const DATABASE_URL =
  `postgres://${DB_USERNAME}:${DB_PASSWORD}` +
  `@${DB_HOST}:${DB_PORT}/${DB_DATABASE}`

// const DATABASE_URL = process.env.DATABASE_URL || "postgres://localhost/medusa-store";
const REDIS_URL = process.env.REDIS_URL || "redis://localhost:6379";
```
#
```js
const plugins = [
  `medusa-fulfillment-manual`,
  `medusa-payment-manual`,
  // To enable the admin plugin, uncomment the following lines and run `yarn add @medusajs/admin`
  {
    resolve: "@medusajs/admin",
    /** @type {import('@medusajs/admin').PluginOptions} */
    options: {
      autoRebuild: true,
    },
  },
  // Spaces
  {
    resolve: `medusa-file-spaces`,
    options: {
      spaces_url: process.env.SPACE_URL,
      bucket: process.env.SPACE_BUCKET,
      endpoint: process.env.SPACE_ENDPOINT,
      access_key_id: process.env.SPACE_ACCESS_KEY_ID,
      secret_access_key: process.env.SPACE_SECRET_ACCESS_KEY,
    },
  },
  // Sendgrid
  {
    resolve: `medusa-plugin-sendgrid`,
    options: {
      api_key: process.env.SENDGRID_API_KEY,
      from: process.env.SENDGRID_FROM,
      customer_password_reset_template: process.env.SENDGRID_CUSTOMER_PASSWORD_RESET_ID,
      user_password_reset_template: process.env.SENDGRID_USER_PASSWORD_RESET_ID,
    },
  },
  // Algolia
  {
    resolve: `medusa-plugin-algolia`,
    options: {
      applicationId: process.env.ALGOLIA_APP_ID,
      adminApiKey: process.env.ALGOLIA_ADMIN_API_KEY,
      settings: {
        products: {
          indexSettings: {
            searchableAttributes: ["title", "description"],
            attributesToRetrieve: [
              "id",
              "title",
              "description",
              "handle",
              "thumbnail",
              "variants",
              "variant_sku",
              "options",
              "collection_title",
              "collection_handle",
              "images",
            ],
          },
          transform: (product) => ({
            id: product.id,
            // other attributes...
          }),
        },
      },
    },
  },
];
```
#
```js
const modules = {
  eventBus: {
    resolve: "@medusajs/event-bus-redis",
    options: {
      redisUrl: REDIS_URL
    }
  },
  cacheService: {
    resolve: "@medusajs/cache-redis",
    options: {
      redisUrl: REDIS_URL
    }
  },
}

/** @type {import('@medusajs/medusa').ConfigModule["projectConfig"]} */
const projectConfig = {
  jwtSecret: process.env.JWT_SECRET,
  cookieSecret: process.env.COOKIE_SECRET,
  database_type: DATABASE_TYPE,
  database_url: DATABASE_URL,
  store_cors: STORE_CORS,
  admin_cors: ADMIN_CORS,
  // Uncomment the following lines to enable REDIS
  redis_url: REDIS_URL,
  // Needed when hosting on Digital Ocean
  // database_extra: { ssl: { rejectUnauthorized: false } },
}

if (DATABASE_URL && DATABASE_TYPE === "postgres") {
  projectConfig.database_url = DATABASE_URL;
  delete projectConfig["database_database"];
}


/** @type {import('@medusajs/medusa').ConfigModule} */
module.exports = {
  projectConfig,
  plugins,
  modules,
};
```
pulsar crag
#

Can you try to disable the admin plugin and run a new deployment? When served on the server, the admin requires quite a lot of memory to build, so it might be the culprit

raw aspen
#

So I adjusted my medusa-config according to the docs:

```js
module.exports = {
  featureFlags: {
    product_categories: true,
  },
  projectConfig: {
    database_type: "postgres",
    database_url: DATABASE_URL,
    database_extra: DATABASE_EXTRA,
    store_cors: STORE_CORS,
    admin_cors: ADMIN_CORS,
    redis_url: REDIS_URL,
  },
  plugins,
  modules: {
    eventBus: {
      resolve: "@medusajs/event-bus-redis",
      options: {
        redisUrl: REDIS_URL,
      },
    },
    cacheService: {
      resolve: "@medusajs/cache-redis",
      options: {
        redisUrl: REDIS_URL,
      },
    },
  },
};
```

Again, the problem of timeout on the server persists unfortunately... Here is my updated part of the config file.
pulsar crag
#

And just to be clear, it only happens if you hit the products route? So if you deploy and don't hit that route everything works as expected?

#

If this is the case, then I believe it is related to caching.

raw aspen
#

Yes that is the case from what I can see

#

How can I help debug this in general? I am a bit lost

pulsar crag
#

What Redis app do you have installed on DO?

#

And how powerful is it?

solemn cedar
#

And is it only needed for builds? Or is it good to always have that memory available for performance etc.? I am asking because the Digital Ocean droplets get pretty expensive pretty fast.

pulsar crag
#

Running the admin build during deployment is not recommended, as this requires a minimum of 2 GB RAM.

You should investigate if there's a way for you to build it before deploying it to DO e.g. we have configured a GitHub action for our staging and test environments.

raw aspen
#

I think these two problems are not related, are they? I'm talking about @solemn cedar's issue and mine.

pulsar crag
#

I am afraid that your Redis cache is too weak to handle the caching we do in the product pricing selection

raw aspen
#

Running on 1 CPU + 1 GB + 10 GB Disk Droplets

Haha this is clearly not cutting it...

#

Which option do you recommend here?

#

Version: 7

#

however - none of our metrics indicate anything crazy going on? Or am I reading this wrong?

pulsar crag
#

As you mentioned yourself, your issue is likely not related to admin, as it happens when you hit the product route

raw aspen
#

Is it possible that the redis cluster size is the issue though? It's only 1GB RAM... that's not a lot

solemn cedar
#

@raw aspen instead of running everything off a droplet, try deploying the backend as a DO app, and create and connect a postgres and redis db to the app

#

This is my setup

raw aspen
#

That’s what I did…

solemn cedar
#

A droplet is not the same as an app

raw aspen
#

I am aware. I am quite sure I deployed everything as an app

#

Let me double check that

#

Ahh wait you mean deploy redis as an app?

solemn cedar
#

Yup

raw aspen
#

Would it make sense to upgrade the Redis cluster to e.g. 2GB RAM or even 4GB RAM? Or is this unrelated?

solemn cedar
#

I may have misunderstood the question, I do not mean that you should create a new app that has a redis db in it but rather that a redis db should be created and linked.

If that is what you have done, then I am out of ideas

raw aspen
#

Yes, that is what I have done already I think

#

@solemn cedar what are your Redis Specs if I may ask? Are you running everything on 1GB RAM without issues?

karmic spruce
#

I'm running Redis in a container and with medusa 1.7.15 it's using 8 MB RAM

raw aspen
#

I actually just upgraded it to 2GB RAM... Issue persists once I hit the /store/product route

karmic spruce
#

I'm just saying that I believe 1GB RAM 1 CPU just for Redis will be enough unless you have some huge usage

#

I have a couple thousand users daily and it never exceeds 8-9MB RAM

raw aspen
#

ok well we dont have much usage

raw aspen
#

I am using it on the /store page on nextJS starter to test. Once I scroll down it fetches (12) products in each go and then at some point (after 3 or 4 fetches) it just stops and times out and the skeletons remain.

#
```ts
export const fetchCollectionProducts = async ({
  pageParam = 0,
  id,
  cartId,
}: {
  pageParam?: number
  id: string
  cartId?: string
}) => {
  const { products, count, offset } = await medusaClient.products.list({
    limit: 12,
    offset: pageParam,
    collection_id: [id],
    cart_id: cartId,
  })

  return {
    response: { products, count },
    nextPage: count > offset + 12 ? offset + 12 : null,
  }
}
```
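
The `nextPage` arithmetic in the snippet above can be checked in isolation. A minimal sketch; `nextOffset` is a hypothetical helper and the count of 30 is only an illustrative number.

```javascript
// Offset of the next page, or null when the current page is the last one.
function nextOffset(count, offset, limit = 12) {
  return count > offset + limit ? offset + limit : null;
}

console.log(nextOffset(30, 0));  // 12
console.log(nextOffset(30, 12)); // 24
console.log(nextOffset(30, 24)); // null (24 + 12 >= 30, so no further request)
```

With these numbers the list stops after three requests, so the pagination logic by itself does not produce an unbounded stream of calls to `/store/products`.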
raw aspen
#

Could this be related to the collections endpoint somehow?

pulsar crag
#

Are you able to reproduce this locally? I can't provoke the issue.

raw aspen
#

I am sorry I cannot seem to reproduce this error. I notice that when I directly hit the products backend via curl it seems to be fine. The buggy page is /store in nextjs... I am trying to understand if I am making any crazy calls to the backend there, but actually it is just copied from the nextjs Starter so this should be alright

raw aspen
#

Is there a way I could inject custom logging into the products endpoint? I need to understand what is going on. Otherwise we have to stay on 1.7.8 I guess

#

My custom endpoints all work though. These custom endpoints don’t communicate with the db. The db itself can be queried though without any issues (eg via datagrip). I am clueless but maybe it’s something with typeorm accessing the db given my 2 observations from above

pulsar crag
#

I will investigate this issue further and get back asap 👍

raw aspen
#

Thanks Oliver. I will try to set up the server from scratch in DO on 1.8

bronze cloak
#

Guys, just to check something, could you downgrade to node 16? The v18 compat has not been tested and I just want to take that off the table

crystal flume
#

I'll add that I use v18, with the feature flag for nested categories to false, and I'm not having any timeout issues with 1.8.

#

From a comment above, the issue may be with the Nextjs app, and not with the endpoint itself.

bronze cloak
#

Nice to know that node 18 does not bring any issue 💪 thanks for the feedback

bronze cloak
#

Or hitting the end point directly

karmic spruce
#

Maybe limit and offset issue?

#

Shouldn't matter right?

bronze cloak
#

It shouldn’t, this has not changed in typeorm

#

I'll investigate that tomorrow 👌

crystal flume
#

It might be relevant how many products and how many variants they have. There must be something different about those getting the timeouts versus those who are not, and that might be a clue to the root issue.

raw aspen
#

We don’t have the issue with curl

pulsar crag
#

Yeah, but in the example above the limit is set to 12, and that should not cause any issues.

raw aspen
#

I checked limits as well. Didn’t have an impact

bronze cloak
#

I expect to run my load testing scenarios which i used for perf improvements in 1.7 which creates products with 50 variants/prices

raw aspen
#

Yes

raw aspen
#

We only have 1 variant per product

#

It does not

#

Correct

pulsar crag
#

Interesting - thanks!

bronze cloak
#

So it would come from the client then, thanks a lot

#

As mentioned by pevey

raw aspen
#

Mhm I checked the Nextjs logs. I think it might be due to the fact we build products with ISR.

#

We have a very long build interval though to not tax the server much

crystal flume
#

Yes, that report is what I was thinking of when I thought about number of products. But I also thought I had read (but then couldn't find) in this thread that the issue only occurred with the app, which points to the app or the medusa client library.

karmic spruce
#

I mean as soon as you see the page, the ISR has already happened

karmic spruce
#

Though I don't think that would be an issue here.

raw aspen
#

I don’t feel like this is the issue. Why would Nextjs on vercel work perfectly fine before on 1.7.8 and then fail on 1.8?

bronze cloak
#

We will investigate both issues and come back to you asap guys

karmic spruce
#

So what did the nextjs logs say?

raw aspen
#

It's just a bunch of 504s for the individual get calls on /store

#

no further specifications or error messages

#

However as I said, if I call a custom route it works just fine

#

Btw we have around 3.5k products atm.

pulsar crag
#

@raw aspen We will jump on this first thing tomorrow morning and get back with a resolution 👍

raw aspen
#

Thanks. Can we be sure though that this is something on Medusa's side and not a bug I am producing? I will revert to 1.7.8 for now. Let me know if and how I can support. Thanks 🙏

pulsar crag
#

Thanks - we'll keep you posted as we begin debugging 👍

raw aspen
#

I set up the server from scratch - meaning I cleared out my server repo, cloned 1.8 according to the medusa docs and followed the set up instructions for DO. The server ran again for a couple of minutes. As soon as I start scrolling on /store it times out again...
Maybe this is helpful.

bronze cloak
#

I am starting to investigate on my side 💪

raw aspen
#

Let me know if I should share further info on our backend. We have pretty much 0 customisation yet (as stated above, I deployed straight from the docs this morning), we have 3.5k Products that have a large metadata body (in case it matters). Other than that, nothing crazy: no discounts, no price lists yet, only 1 variant per product, not a huge load of traffic yet.

bronze cloak
#

I am doing my test with 150 products, 7500 variants and 7500 prices

#

50 variants per product

#

So far with: node 18, 1GB RAM, medusa 1.8.0, cache redis ^2.0.0-next-20230323083446, event bus redis ^1.8.0
On Postman and the admin I don't have any issues. I'll try the storefront

raw aspen
#

We use the nextjs storefront

#

try the /store route

#

and scroll down

#
```json
    "@medusajs/cache-redis": "^1.8.0",
    "@medusajs/event-bus-local": "^1.8.0",
    "@medusajs/event-bus-redis": "^1.8.0",
    "@medusajs/medusa": "^1.8.0",
    "@medusajs/medusa-cli": "^1.3.9",
```
#

This is my versions. Seems like your cache-redis is on a different version?

raw aspen
#

Plus I run on node v16.20.0 on prod

#

(DO only supports until node 17 as it seems from the docs)

#

App Platform supports Node versions up to 17.x.

bronze cloak
#

node should not have an impact, just wanted to test with 18

#

I've changed the package versions: no issue with Postman or the admin, though I am struggling to build the storefront at the moment

bronze cloak
#

I can't reproduce so far

raw aspen
#

Thanks for further looking into this.
I loaded all my products into the local db and I can also not reproduce this timeout locally.

#

Is there a way to "wipe" the database elegantly and repopulate it? (We are currently in alpha anyways so it would not really matter)

#

Actually! I CAN reproduce it. I think it might have to do with how I am building the pages on NextJS and my ISR Strategy. I will investigate

bronze cloak
#

No problem, keep us posted 💪

#

If I find anything new I'll let you know. In the meantime we are also waiting for your findings

sharp sierra
#

I've created a thread

sharp sierra
#

ok

raw aspen
#

So to recap what I think the issue is:

  1. on /[handle].tsx we try to build our product pages using ISR. Hence, we use fallback: 'blocking'. The actual product handles are not returned from getStaticPaths but instead just a small subset:
  2. Whenever we hit the handle page and the page is NOT prebuilt (which is the case for every product page) at first, we build and return it.
  3. It seems that my fallback strategy is / was foolish as it caused the server to stop. Probably because too many requests were incoming at once.
  4. I changed my fallback to true now. This seems to solve the issue: I can scroll down "as far as I want" and the pdps build anyways which is great.
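
The setup described in the steps above can be sketched as follows. This is a minimal sketch under assumptions: `PREBUILT_HANDLES` and its values are hypothetical stand-ins for the small subset of handles, and in a real Next.js page the function would be exported from the `[handle]` page file.

```javascript
// Hypothetical subset of product handles to prebuild at build time.
const PREBUILT_HANDLES = ["example-product-a", "example-product-b"];

// Next.js getStaticPaths: only the subset is built up front. With
// `fallback: true`, any other handle gets a fallback render right away
// while the page is generated in the background on first request.
// With `fallback: "blocking"`, the request waits until generation is done.
async function getStaticPaths() {
  return {
    paths: PREBUILT_HANDLES.map((handle) => ({ params: { handle } })),
    fallback: true,
  };
}
```

With `fallback: true` the page component has to handle the loading state itself, typically by checking `router.isFallback` from `useRouter()`.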

My general question now - but I guess this is more of a NextJS / Vercel qn, so feel free to ignore:

  1. What is the ideal setup to handle that many product pages while maintaining speed (looking at SEO) and scalability?

Right now our getStaticProps looks like this:

```ts
export const getStaticProps: GetStaticProps = async ({ params }) => {
  const handle = params?.handle as string
  let props
  const product = await fetchProduct(handle)

  // Prepare Product SEO Page
  if (product !== undefined) {
    const queryClient = new QueryClient()

    await queryClient.prefetchQuery([`get_product`, handle], () =>
      fetchProduct(handle)
    )

    const queryData = await queryClient.getQueryData([`get_product`, handle])

    if (!queryData) {
      return {
        props: {
          notFound: true,
        },
        revalidate: 86400, // In seconds
      }
    }

    props = {
      dehydratedState: dehydrate(queryClient),
      notFound: false,
      source: SOURCE_TYPE.PRODUCT,
    }
  }

  if (!props) {
    return {
      props: {
        notFound: true,
      },
      revalidate: 86400, // In seconds
    }
  }

  return {
    props: props,
    revalidate: 86400, // In seconds
  }
}
```

We try to generate many pages under the root for SEO reasons. So not only product but also category pages
#
  1. I check if the incoming handle belongs to a product (it could be a category or something else, I discarded this part of the code as it is not relevant)
  2. If product actually exists I create the queryclient and prepare the props. The source parameter indicates that this is a product page, so the page renders a product component
  3. I have set the revalidate to a very long period (24 hours) but I feel like my whole ISR strategy is not working out
#

At the end of the day I do not want to build products at build time but I want them to be built once if they are called. They do not even have to rebuild. Our products are super static...
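
For pages that should be generated once on first request and never rebuilt, one option is to leave out `revalidate`: in Next.js a page returned without `revalidate` stays cached until the next build. The sketch below also fetches the product only once, whereas the snippet above calls `fetchProduct` directly and then again through `prefetchQuery`. `fetchProduct` is stubbed here with hypothetical data, and returning `notFound: true` at the top level (Next.js's built-in 404 handling) is a swapped-in alternative to the `notFound` prop used above.

```javascript
// Stub standing in for the real backend call (hypothetical data).
async function fetchProduct(handle) {
  const catalog = {
    "example-product": { id: "prod_1", handle: "example-product" },
  };
  return catalog[handle];
}

// Sketch of getStaticProps: one backend request per page and no
// `revalidate`, so a generated page is kept until the next build.
async function getStaticProps({ params }) {
  const handle = params && params.handle;
  const product = handle ? await fetchProduct(handle) : undefined;

  if (!product) {
    // Next.js renders its 404 page for this result.
    return { notFound: true };
  }

  return { props: { product } };
}
```

If React Query hydration is still wanted, `queryClient.setQueryData` can seed the cache from the already fetched product instead of triggering a second request.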

karmic spruce
#

This is weird because simply loading this product list on the page (with limit, offset) doesn't build the product pages themselves

#

Unless it builds them when using Link prefetch 🤔 Nah, I don't think so

raw aspen
#

Ah could it be that because I have /[handle].tsx and /store.tsx under the same directory level it always calls handle when I access store? And passes "store" as a param to my server to "fetch the page"?

karmic spruce
#

hmm

#

However next.js link prefetching does have some impact on ISR. From their docs – Prefetch the page in the background. Defaults to true. Any <Link /> that is in the viewport (initially or through scroll) will be preloaded. Prefetch can be disabled by passing prefetch = false. When prefetch is set to false, prefetching will still occur on hover. Pages using Static Generation will preload JSON files with the data for faster page transitions. Prefetching is only enabled in production.

#

I'm quite confused now whether it actually builds the non ISR'ed yet pages on link prefetch

raw aspen
#

haha wow that is something

karmic spruce
#

I'm not sure how to interpret this

#

Maybe you can try to disable prefetch on Links where you load this product list on scroll

#
<Link prefetch={false} href=/product ....>
#

And see if this helps

raw aspen
#

Yes I will try this! Right now I am deploying with fallback: true to see if this solves the issue. Then I will optimise

raw aspen
#

Kosek - it turns out that the prefetch={false} in combination with fallback: true indeed solved the problem. So first of all: HUGE thank you to you and the Medusa team for all the time you guys spent on this bug.

However, for future use cases: Isn't it super dangerous that the server times out if <Link/> prefetch is called on it too many times? Could it be that my code in getStaticProps is too inefficient? I don't really see a reason why it would be though?!

tiny sequoia
#

Just following on, I'm getting this exact same issue but I'm running this on an EC2 as opposed to on DO, about to test with prefetch={false} and see if that has any effect

raw aspen
#

This solved it for us actually

tiny sequoia
#

Just tested on my install and with prefetch false i'm still getting timeouts

raw aspen
#

Wow ok

#

Are you revalidating ?

tiny sequoia
#

set it to 24hr

raw aspen
#

OK

tiny sequoia
#

it got my first set of products when querying the category, then loaded a product

raw aspen
#

Are you sure you have added this to all your <Link/> instances and set fallback=true?

tiny sequoia
#

not all instances, just on the productPreview

raw aspen
#

Ok

tiny sequoia
#

Gonna try that next

raw aspen
#

Did you also just have this issue now since 1.8?

tiny sequoia
#

yeah purely since 1.8 upgrade

#

but fundamentally

raw aspen
#

ok same here

#

lets stay connected about this

tiny sequoia
#

medusa should be able to handle a small number of requests like this

raw aspen
#

I would agree

#

I think it is nextjs related though

tiny sequoia
#

i dont think it is

#

the prefetch performs the same api request that visiting a link would

#

it may be the frequency of the requests

#

thats causing the timeout
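
If request frequency is the trigger, capping how many backend calls run at once during page generation is one way to test that theory. A minimal sketch of a promise-based limiter; `createLimiter` and the idea of wrapping backend calls in it are hypothetical, and nothing here is specific to Medusa or Next.js.

```javascript
// Returns a function that runs async tasks with at most `max` in flight.
function createLimiter(max) {
  let active = 0;
  const queue = [];

  const runNext = () => {
    if (active >= max || queue.length === 0) return;
    active += 1;
    const { task, resolve, reject } = queue.shift();
    // Promise.resolve().then(task) also catches synchronous throws in `task`.
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        runNext();
      });
  };

  // Queues the task and resolves or rejects with its eventual result.
  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      runNext();
    });
}
```

A caller would create one limiter, for example `const limit = createLimiter(2)`, and wrap each backend request as `limit(() => fetchSomething())`. If the timeouts disappear with a low limit, concurrency on the backend is the likely cause.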

#

i've got a bit of a unique product setup

#

my main product is linked to multiple accessories,

#

and when the page loads, it pulls in said related products, its prices and allows them to be added to basket as a bundle

#

my issue might be on that page load, its sending a query for multiple products for the details needed on those, that said they're relatively small requests

karmic spruce
#

prefetch={false} only disables prefetching of visible links. By default Next.js uses an Intersection Observer to prefetch every link that is visible in the viewport.

prefetch={false} does not disable prefetching on hover. If you move your cursor over multiple links, it will still prefetch them.

tiny sequoia
#

For context, I've just cloned the nextjs starter and tried to do a fresh build against our medusa running on an EC2 (1.8.0), and it got 15 pages into the static page generation and then timed out

#

~ 60 products / ~600 variants

#

so barely anything there

pulsar crag
#

Are you all using the new inventory module or is this just from upgrading to 1.8?

tiny sequoia
#

Mine was a complete fresh install of 1.8

#

cause we had issues with migrations

pulsar crag
#

Got it thanks

tiny sequoia
#

luckily we weren't far in to populating our products at the time on this install

#

module.exports = {
  projectConfig: {
    redis_url: REDIS_URL,
    database_url: DATABASE_URL,
    database_type: "postgres",
    store_cors: STORE_CORS,
    admin_cors: ADMIN_CORS,
  },
  plugins,
  featureFlags: {
    product_categories: true
  },
  modules: {
    eventBus: {
      resolve: "@medusajs/event-bus-redis",
      options: {
        redisUrl: REDIS_URL
      }
    },
    cacheService: {
      resolve: "@medusajs/cache-redis",
      options: {
        redisUrl: REDIS_URL
      }
    }
  }
}

#

"dependencies": {
  "@medusajs/admin": "^2.0.0",
  "@medusajs/cache-redis": "^1.8.0-rc.3",
  "@medusajs/event-bus-redis": "^1.8.0-rc.4",
  "@medusajs/inventory": "^1.8.0",
  "@medusajs/medusa": "^1.8.0",
  "@medusajs/medusa-cli": "^1.3.5",
  "add": "^2.0.6",
  "medusa-extender": "^1.7.6",
  "medusa-file-s3": "^1.1.5",
  "medusa-fulfillment-manual": "^1.1.31",
  "medusa-interfaces": "^1.3.3",
  "medusa-payment-manual": "^1.0.16",
  "medusa-payment-stripe": "^1.1.42",
  "medusa-plugin-sendgrid": "^1.3.3",
  "medusa-plugin-strapi": "^1.0.7-dev",
  "typeorm": "0.3.11",
  "yarn": "^1.22.19"
},

#

oooh

#

didn't realise those 2 were on rc

#

lemme update those and retest

#

@pulsar crag worth updating to 1.8.1?

pulsar crag
#

I don't think it will resolve it, but yes, I recommend upgrading away from the rc 👍

#

to the latest

tiny sequoia
#

Yeah, upgraded to 1.8.1 and updated cache-redis and event-bus-redis to 1.8.0

#

no different

pulsar crag
#

Quick update, we seem to have identified the culprit. The maximum number of connections to the database is reached during the build step, leading to timing out requests. We are investigating the issue and will get back as soon as we dig out something worth sharing.
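
For anyone hitting this from the storefront side in the meantime: one way to reduce the load during a build is to cap how many requests are in flight at once. The snippet below is only a minimal sketch of that idea, not the fix being worked on in Medusa. `mapWithConcurrency` is a hypothetical helper name, and the limit you pick is an assumption you would tune against your own database connection limit.

```typescript
// Hypothetical helper: run `worker` over `items` with at most `limit`
// promises in flight at any time. Results keep the order of `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  // Each runner claims the next unclaimed index, processes it, and repeats
  // until no items are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start at most `limit` runners (at least one, never more than items).
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage sketch (handles and fetchProduct are placeholders, not real API):
// const products = await mapWithConcurrency(handles, 4, (h) => fetchProduct(h));
```

Compared with a bare Promise.all over every product, this keeps the number of simultaneous API calls, and therefore the database connections they may hold, bounded by `limit`.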

tiny sequoia
#

@pulsar crag 👌 - just been doing some testing and it seems to be multiple concurrent requests for variant prices that trigger the issue, which probably coincides with running out of database connections

pulsar crag
#

@bronze cloak – can you add your findings here? 💪

bronze cloak
#

I've fixed it

#

I'll let you know when it is ready

tiny sequoia
#

👌 will give it a try once the little ones are asleep 😴 😂

bronze cloak
#

Ahah just put mine to sleep 😴

karmic spruce
#

@sharp sierra It was related after all 🙂

sharp sierra
#

Adrien you are genius man

pulsar crag
#

Works on my end - great job @bronze cloak

sharp sierra
#

trying to install it

#

I was actually worried for demo. I was planning to demo using npm run dev lol

pulsar crag
#

When's your demo? 😄

sharp sierra
#

if this works it will be great help

sharp sierra
#

but I had to move my code to demo environment and all

pulsar crag
#

Great

sharp sierra
#

Next dev is so slow compared to the production app.

#

💀

#

Next production has so many performance tweaks like prefetching links json and much more

#

Life saved

tiny sequoia
#

It works, thank you @bronze cloak - saves me writing a "your project might be delayed" email to one of our clients :p

#

genuinely, the support you guys provide is excellent

raw aspen
#

@bronze cloak this is insane! Will try out tomorrow. Thanks so much - this literally saved my week 😊

karmic spruce
# sharp sierra Next production has so many performance tweaks like prefetching links json and m...

With medusa, be careful though with all this json prefetching. It can get pretty heavy.
Imagine you have a collection/category page with 20 products and you want that page statically generated, because why not.
Each of those products can have a lot of variants, and that json will get huge. Like a couple of hundred kb. You might even get warnings during the nextjs build because of it.
Now you have all those links to categories on the main page and it will prefetch all of them. Nextjs is smart about it because it prefetches in idle time, but it might still add up. Your mileage may vary.
For that reason I strip the fetched collection products json in getStaticProps of all properties that I don't need, just before sending it with props.
Or better, use expand with select when fetching, though I don't know if you can select fields on expanded relations.
I just strip it like this in getStaticProps (it's just an example; it's async because I'm doing more there, like generating blur placeholders for thumbnails)

const productsTraversed = await Promise.all(
    productsFiltered.map(async (product) => {
      // Drop product fields the listing page never renders
      delete product.description;
      delete product.options;
      delete product.images;
      delete product.created_at;
      delete product.updated_at;
      delete product.deleted_at;
      delete product.profile_id;
      // delete product.is_giftcard;
      delete product.weight;
      // ... (further product fields elided)
      // careful - below line only for this page, do not copy
      delete product.collection;
      for (const variant of product.variants) {
        // Drop variant fields the listing page never renders
        delete variant.created_at;
        delete variant.updated_at;
        delete variant.deleted_at;
        delete variant.sku;
        delete variant.allow_backorder;
        delete variant.manage_inventory;
        // ... (further variant fields elided)
      }
      // Return the stripped product so productsTraversed holds the results
      return product;
    })
);

Because for a product link on category page you'll just need the title, handle, thumbnail and maybe the variant prices.
I saved a lot of kb's with that.
Just something to think about.
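
An alternative to deleting fields one by one is an allow-list: copy only what the product link needs and drop everything else by default. This is a minimal sketch under assumptions. The `Product`, `Variant` and `Price` shapes are simplified stand-ins rather than the real Medusa types, and `toProductPreview` is a hypothetical helper name.

```typescript
// Assumed, simplified shapes. Real product objects carry many more fields,
// which the index signatures stand in for.
type Price = { amount: number; currency_code: string };
type Variant = { id: string; prices?: Price[]; [key: string]: unknown };
type Product = {
  id: string;
  title: string;
  handle: string;
  thumbnail?: string | null;
  variants?: Variant[];
  [key: string]: unknown;
};

// The only data a product link on a category page is assumed to need.
type ProductPreview = {
  id: string;
  title: string;
  handle: string;
  thumbnail: string | null;
  variants: { id: string; prices: Price[] }[];
};

// Hypothetical helper: build a new, small object instead of mutating the
// fetched product. Anything not listed here never reaches the page props.
function toProductPreview(product: Product): ProductPreview {
  return {
    id: product.id,
    title: product.title,
    handle: product.handle,
    // Use null rather than undefined so the value stays JSON-serializable.
    thumbnail: product.thumbnail ?? null,
    variants: (product.variants ?? []).map((variant) => ({
      id: variant.id,
      prices: (variant.prices ?? []).map((price) => ({
        amount: price.amount,
        currency_code: price.currency_code,
      })),
    })),
  };
}

// Usage sketch inside getStaticProps:
// const previews = productsFiltered.map(toProductPreview);
```

The trade-off is that a newly needed field has to be added to the allow-list explicitly, but fields added to the API response later cannot silently bloat the prefetched json.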

#

Or create an endpoint for building where you select all you need.

dull kindle
#

Hello guys, we did the 1.8.2 update and we still have the memory issue...

Scavenge (reduce) 965.1 (984.0) -> 964.6 (984.2) MB, 6.8 / 0.0 ms (average mu = 0.295, current mu = 0.207) allocation failure
Mark-sweep (reduce) 965.3 (984.2) -> 963.8 (985.2) MB, 411.6 / 0.0 ms (+ 151.5 ms in 42 steps since start of marking, biggest step 12.7 ms, walltime since start of marking 603 ms) (average mu = 0.220, current mu = 0.119) a
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

We have the issue when we call :
GET /store/products/[id]

after a rollback to 1.7.x it's ok.

We are using a dedicated redis and node 16

dull kindle
#

even with a bigger server :
246294 ms: Scavenge (reduce) 1962.6 (1998.8) -> 1962.1 (1999.1) MB, 8.2 / 0.0 ms (average mu = 0.237, current mu = 0.081) allocation failure

247134 ms: Mark-sweep (reduce) 1962.8 (1999.1) -> 1962.4 (1999.8) MB, 782.9 / 0.0 ms (+ 327.0 ms in 75 steps since start of marking, biggest step 22.6 ms, walltime since start of marking 1219 ms) (average mu = 0.241, current mu = 0.24

bronze cloak
#

could you share your medusa-config and package.json please 🙂

dull kindle
#

{
  "name": "xxxxx",
  "version": "0.0.0",
  "license": "MIT",
  "scripts": {
    "seed": "medusa seed -f ./data/seed.json",
    "build": "rm -rf dist && ./node_modules/.bin/tsc -p tsconfig.json",
    "start": "npm run build && NODE_ENV=development node ./dist/main.js",
    "start:watch": "npx nodemon --watch './src//' --ext 'ts,json' --ignore 'src//*.spec.ts' --exec 'npm run build && NODE_ENV=development node ./dist/main.js'",
    "start:prod": "npm run build && NODE_ENV=production node dist/main"
  },
  "private": true,
  "resolutions": {},
  "devDependencies": {
    "@babel/cli": "^7.14.3",
    "@babel/core": "^7.14.3",
    "@babel/preset-typescript": "^7.13.0",
    "@types/express": "^4.17.17",
    "@types/jest": "^27.5.2",
    "@types/jsonwebtoken": "^8.5.9",
    "@types/multer": "^1.4.7",
    "babel-preset-medusa-package": "^1.1.19",
    "cross-env": "^5.2.1",
    "jest": "^29.0.0",
    "nodemon": "^2.0.20",
    "supertest": "^4.0.2",
    "ts-jest": "^29.0.0"
  },
  "dependencies": {
    "@medusajs/cache-redis": "1.8.2",
    "@medusajs/event-bus-redis": "1.8.2",
    "@medusajs/medusa": "1.8.2",
    "@medusajs/medusa-cli": "1.3.8",
    "algoliasearch": "^4.14.3",
    "async-mutex": "^0.4.0",
    "clsx": "^1.1.1",
    "core-js": "^3.6.5",
    "crypto-browserify": "^3.12.0",
    "date-fns": "^2.29.3",
    "eslint": "~8.15.0",
    "immutability-helper": "^3.1.1",
    "ioredis": "^4.17.3",
    "iso8601-duration": "^1.3.0",
    "lodash": "^4.17.21",
    "lodash-es": "^4.17.21",
    "medusa-extender": "1.8.8",
    "medusa-file-s3": "^1.1.12",
    "medusa-fulfillment-manual": "1.1.37",
    "medusa-interfaces": "1.3.7",
    "medusa-payment-manual": "1.0.23",
    "medusa-plugin-auth": "1.4.3",
    "medusa-plugin-sendgrid": "1.3.9",
    "randomatic": "^3.1.1",
    "tslib": "^2.3.0",
    "type-fest": "^2.16.0",
    "typescript": "4.6.2",
    "uuid": "^8.3.2"
  },
  "engines": {
    "node": ">=16.0.0"
  }
}

#

It seems we are pulling 25mb/s from our DB just to get 1 product. That was not the case in 1.7.x.

dull kindle
#

It seems that the issue is coming from the SQL part: SELECT products... the query is too big. Maybe it's linked to the new TypeORM?

bronze cloak
#

I am sorry, I am not at the computer anymore. This seems weird: I have 300 products and 15k variants and prices, and it works very well, in a few seconds. Each product contains 50 variants

#

I don't see your typeorm version in your packages, which one is it?

dull kindle
#

0.3.14 imported from medusajs/utils

#

if i call store/products/[id] without any expands or fields -> timeout & out of memory

#

store/products/[id]?expand=variants,images,options,variants.options -> 2sec for 1 product with 9 variants, 5 options and 14 images
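
For reference, a request like the one above can be assembled in code rather than by hand. This is a minimal sketch: `buildProductUrl` is a hypothetical helper, the base URL in the usage comment is a placeholder, and the only query parameters used are the `expand` and `fields` ones already mentioned in this thread.

```typescript
// Hypothetical helper: build a /store/products/[id] URL that asks only for
// the relations and fields the caller actually needs.
function buildProductUrl(
  baseUrl: string,
  id: string,
  opts: { expand?: string[]; fields?: string[] } = {}
): string {
  const url = new URL(`/store/products/${encodeURIComponent(id)}`, baseUrl);
  // Both parameters are sent as comma-separated lists; empty lists are omitted.
  if (opts.expand?.length) {
    url.searchParams.set("expand", opts.expand.join(","));
  }
  if (opts.fields?.length) {
    url.searchParams.set("fields", opts.fields.join(","));
  }
  return url.toString();
}

// Usage sketch (placeholder base URL and product id):
// buildProductUrl("http://localhost:9000", "prod_123", {
//   expand: ["variants", "images", "options", "variants.options"],
// });
```

Note that URLSearchParams percent-encodes the commas in the output string; they decode back to the same comma-separated value on the server side.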

bronze cloak
#

It seems very odd. For my product it takes around 300ms with 50 variants, a few options and 1 price per variant

#

Do you have any customization? Can you try without any customization if you have any?

#

My daughter is sick i have to go so sorry..

dull kindle
#

Is there a breaking change in the pricing strategy?

bronze cloak
#

Oh yeah, indeed, you might not be able to use it straight away if you have a custom strategy

#

Anyway, it should work with the current 1.8.2, as that was fixed. The only other OOM I've fixed earlier was for a user creating a product with 2.5k metadata

#

Could you first try by deactivating all your customisation just to take that off the table

#

I am on the waiting line on the phone at the moment 😂 seems infinite

sharp sierra
#

Ayo
You guys are giving away a rare diamond for free. The product is highly customisable and easy to code with.

#

Our internal demo people haven't looked at it in depth, but from what I see of the api it's cool

#

Medusa has a ton of potential