Bottlenecks and a couple of questions/issues

Hello,

I prepared a simple model for a daylight study: box geometry, multiple stories, rectangular glazing, extruded-border window shades, and some context (shades). I used the standard modifiers (Generic_Exterior_Solar_Modifier_Set).

When running the analysis with a grid size of 2 (around 5k points), everything ran as expected: fast, with results that seemed reliable (as far as I could tell from the sparse grid).
When I changed the grid size to 1 (around 16k points) and ran the analysis, the calculation is taking very long (running for over 20 hours now) and seems to be stuck (from what I can tell from the Pollination debug tab).
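For context on the point counts: halving the grid spacing roughly quadruples the number of sensors, since the count scales with 1/spacing². A minimal sketch, assuming a square grid and an illustrative floor area that is not taken from the actual model:

```python
# Rough sensor-count estimate for a square analysis grid.
# The floor area below is an illustrative assumption, not the actual model.

def grid_points(floor_area_m2: float, grid_size_m: float) -> int:
    """Approximate number of sensor points at a given grid spacing."""
    return round(floor_area_m2 / grid_size_m ** 2)

area = 20_000  # assumed total floor area in m2 across all stories
print(grid_points(area, 2.0))  # 5000 points at a 2 m grid
print(grid_points(area, 1.0))  # 20000 points at a 1 m grid (4x as many)
```

Edge offsets and wall insets reduce the real counts (here ~5k and ~16k), but the 4x scaling explains most of the jump in runtime.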

Also, for some reason, there seem to be two duplicate jobs for the same run, started at exactly the same time and running for the same amount of time. I am sure I did not set up two jobs by accident; I double-checked that, and previous runs on sparser grids did not have this issue.

You can see the model as viewed in Pollination below:

What could be the problem? Could you take a look at it?
I am posting a screenshot with the project name and job IDs (you can also see the running time there).

Thank you for your time and help,
Wujo

Hi @Wujo :wave:

Thanks for trying out Pollination and sending some feedback our way. We noticed some issues over the weekend where the status of certain jobs was not being updated properly. I have manually triggered an update for your jobs (1 and 2) and you can see that they are now marked as completed and took about 6 minutes to run.

This is an issue on our end; we will look into it this week to understand why our job-reporting system stopped over the weekend and to avoid issues like this arising in the future. Thanks again for giving the platform a try and reporting back any issues you notice!! :raised_hands:

Thank you @antoine!

I made a couple of other test runs and the problem still seems to remain.
I would also like to draw your attention to the duplicated jobs on the Pollination platform: it seems that every time I start a new job (no matter the recipe), two are created instead (with different IDs). I am sure I create only one job, and only one is read from the “Check Job Status” component.

Have a good day,
Wujo

I have the same issue, with two jobs running much longer than they should:
CLOUD:max/demo/e13e02bd-8b8b-446d-9676-5ccecaf7026a
CLOUD:max/demo/3758798d-ac89-4d3a-b62f-9201ebb52cef

Hi @Max thanks for notifying us of this issue on your end. We’re actively looking into what is causing Jobs not to be updated and will keep you in the loop once we’ve resolved it. That being said I manually updated your two jobs just now so you should be able to access their results :raised_hands:

@antoine has this been resolved yet? I appear to still have the issue, e.g. my last job (7df06d54-6768-4862-898a-ffad84f019ab) has been running for over an hour. Are others also experiencing this?

Hi @Max, for some reason I missed this one. Sorry!

The run actually finished successfully in 3 minutes.

@antoine can manually update them so they show up. Meanwhile, feel free to re-run the job if this happens again. This issue is at the top of our list to fully debug. Thank you for reporting!

2 posts were split to a new topic: Running large energy models

Hi @Max (cc @wujo) :wave:

I’m happy to let you know that we are 90% sure that we have resolved the issue that you were encountering where jobs weren’t properly updating their state back to the platform. You should now be able to run your simulations without the time bottlenecks experienced over the last couple of weeks.

Thanks again for taking the time to test out the platform and reporting any issues you encountered. Please do let us know if you encounter a similar issue again. I am officially on “fix the scaling bugs” duty for the next month :raised_hands:

Thanks @antoine! I sent a test off yesterday and it went fine, so fingers crossed! Thanks for staying on top of things.

Hi @antoine, I’m getting this error again (e.g. CLOUD:max/demo/ec322263-6490-43ed-aedc-f1754c255c9a).
Is it just me?

Hi again @antoine, FYI I sent off another batch of runs about 30 minutes ago: 2 annual energy, 2 PMV comfort maps, and 2 annual daylight. There are about 700 points in the grid (sensorcount=200), so they should be done by now; however, only the energy runs have completed, and the others have not.
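If I understand the recipe inputs correctly (I'm assuming sensorcount caps the number of grid points handled per parallel execution), a 700-point grid should split into only a few chunks, so the daylight run itself ought to be quick:

```python
import math

# Assuming sensorcount is the maximum number of grid points handled per
# parallel execution, the grid is subdivided into this many chunks:
def parallel_chunks(total_points: int, sensor_count: int) -> int:
    return math.ceil(total_points / sensor_count)

print(parallel_chunks(700, 200))  # 4 parallel chunks
```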

The daylight and comfort simulations take about 3 minutes and 6 minutes respectively when run locally.

I’m wondering if I’m the only one experiencing this? I haven’t seen others post about it.

Hi @Max :wave:

Thanks again for reporting back to us and sorry for the delayed response. I had a look at the jobs scheduled over the last couple of days and it seems like you are unfortunately the only person experiencing this issue right now :sweat_smile:

I ran a quick query against your demo project and found that the only failing runs are for a specific set of recipes and versions:

  • ladybug-tools/pmv-comfort-map:0.3.0
  • ladybug-tools/annual-daylight:0.6.3

I suspect the issue is that the update payloads sent by those recipes are too big and therefore not being processed by our run state management backend. I will look into it today and get back to you with a better timeline for a fix.
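To illustrate the kind of guard involved here (everything below, from the size limit to the payload shape, is a hypothetical sketch rather than our actual backend code): measure the serialized update, compress it if it is too big, and as a last resort drop the bulky per-step details so the overall run state still gets through.

```python
import gzip
import json

# Hypothetical guard against dropping oversized status updates.
# MAX_PAYLOAD_BYTES and the update-dict shape are assumptions for
# illustration, not Pollination's actual backend values.

MAX_PAYLOAD_BYTES = 256 * 1024  # assumed message-size limit

def prepare_update(update: dict) -> bytes:
    """Serialize a run-state update, shrinking it to fit the size limit."""
    raw = json.dumps(update).encode("utf-8")
    if len(raw) <= MAX_PAYLOAD_BYTES:
        return raw
    packed = gzip.compress(raw)
    if len(packed) <= MAX_PAYLOAD_BYTES:
        return packed
    # Last resort: drop the bulky per-step details, keep the overall state.
    slim = {k: v for k, v in update.items() if k != "steps"}
    return json.dumps(slim).encode("utf-8")
```

Because the payload always shrinks to fit, the run state would still update even when a recipe produces a very large step graph.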

Looking at the logs from our backend, I can see that your runs have finished (whether failed or succeeded). I will manually trigger an update on my end now so you can keep working with these while we fix the issue on our end :raised_hands:

Hi again @Max :wave:,

Just getting back in touch to let you know that I have now updated the status of your Jobs that were stuck in a running state. I have also found the source of the bug and implemented a fix which means future Jobs should work as expected again.

The issue was due to a buggy upgrade in one of our dependencies and I have raised an issue on their end to fix it before we can upgrade to their latest release.

Thanks again for being patient with us during early access and helping us iron out bugs as they appear!

Fantastic, I checked and it works now. Thanks!
