EnergyPlus simulation runs successfully but the run stays unfinished for a long time

Hi folks,

My simulations have been extremely slow lately or have been failing again. Is this the same issue (I’ve read in another thread that this has been addressed)?

Hey @patryk_wozniczka! It should be the same issue. I don’t see any jobs running on production.

@antoinedao or @tyler can help with updating the UI so you can access the results.

We thought it was resolved, but the problem ran deeper than we originally thought: several different combinations could cause the job status not to be updated in the UI. The good news is that the jobs run successfully and we only need to fix the status update. The other good news is that @tyler has made tremendous progress on this issue and should be able to push an update to production soon.

Sorry that this is taking us longer than ideal, and thank you for being patient with us! We will get it fixed for you!

FYI this job for example (from 2 days ago) finished 79 of 144 runs within 1.5 hours (very quickly) but then slowed down and runs started failing. So I cancelled.

Sounds good, @mostapha. Do you have any sense of how long it might take you folks to fix this? The more reliable Pollination is, the easier it will be to integrate it into our workflows.

Thanks! I’m sure @tyler will try to recreate the issue and update you on this. We really appreciate you reporting cases like this. The Job ID is 3ec3d6e4-9185-463e-93a3-bbb6e5ed08d8.

Let me give you an estimate after @tyler has a closer look. If this is the same issue then it should be really fast but that’s to be seen.

433a5b8e-67a8-4592-a8ce-1bca20f0de20

Runs keep failing:

But it says EnergyPlus Completed Successfully

Are there any errors in the .err file for this run? You can find it as one of the outputs.
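If it helps, here is a minimal sketch for triaging an .err file once it has been downloaded. It assumes the usual EnergyPlus message headers (`** Warning **`, `** Severe  **`, `**  Fatal  **`); the function names and the file path are hypothetical, and this is only a quick count, not a full parser.

```python
import re
from collections import Counter

# Matches message headers such as "** Warning **", "** Severe  **", "**  Fatal  **".
# Assumption: the .err file uses these standard EnergyPlus header markers.
SEVERITY_RE = re.compile(r"\*\*\s*(Warning|Severe|Fatal)\s*\*\*")


def summarize_err(text: str) -> Counter:
    """Count message headers in the contents of an .err file, by severity."""
    counts = Counter()
    for line in text.splitlines():
        match = SEVERITY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def summarize_err_file(path: str) -> Counter:
    """Read an .err file from disk (hypothetical path) and summarize it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_err(handle.read())
```

A non-zero `Severe` or `Fatal` count is usually the first thing worth looking at; warnings alone rarely stop a run.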

Hi @patryk_wozniczka! My apologies for the bad experience here.

I just merged the updates I’ve been working on to improve the issue we’ve been having with lost run status updates. Would you be able to try to run the job again and let us know if the issue persists?

In the meantime, I will recreate this job on our test cluster and see if I can find anything else.

That’s good news!

I will run the job again.

@mostapha I did inspect the .err file (only some minor errors) and ran that model on a local machine to double check and it ran successfully.

I ran the job again but increased the number of runs to 144 (which is what a complete study would be) and it is still very slow and failing:

Job ID: 6268445e-d1a9-4ebb-b536-3ed7c5e3070d

The .err file says EnergyPlus completed successfully with some errors.

When I run these exact same models locally, I’m able to read results from the .sql files.
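For anyone who wants to repeat that local check, a minimal sketch follows. It uses only Python's built-in `sqlite3` module and makes no assumption about which tables EnergyPlus writes: it simply lists every table in the file with its row count. The function name and the file name in the usage note are hypothetical.

```python
import sqlite3


def table_row_counts(sql_path: str) -> dict:
    """Return {table_name: row_count} for every table in a SQLite file.

    Note: sqlite3.connect creates an empty file if the path does not exist,
    so check the path first when pointing this at real outputs.
    """
    conn = sqlite3.connect(sql_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from the file itself, so quote them defensively.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling `table_row_counts("eplusout.sql")` on a local output should return a non-empty dictionary if the results were written correctly.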

Thanks for letting us know. This turned out to be a different issue from the missing updates. It seems the output SQLite database files for these runs are very large (~2GB) and are causing issues in the compression step before they are moved to cloud storage.
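In the meantime, a quick way to see whether a study is likely to hit this is to look at output sizes from a local run. The sketch below is only an illustration: the 1 GiB threshold is an assumed value (the only figure we have is that the problem files were around 2GB), and the function and folder names are hypothetical.

```python
from pathlib import Path

# Assumed threshold for illustration only; not a documented platform limit.
LARGE_FILE_BYTES = 1 * 1024**3


def find_large_outputs(folder: str, threshold: int = LARGE_FILE_BYTES) -> list:
    """Return (path, size_in_bytes) for files at or above the threshold, largest first."""
    sized = [(p, p.stat().st_size) for p in Path(folder).rglob("*") if p.is_file()]
    hits = [(p, size) for p, size in sized if size >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

If the SQLite outputs show up in this list, requesting fewer output variables or a coarser reporting frequency is one way to shrink them.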

We’re looking into some options and will let you know when we have a solution.

Hi @patryk_wozniczka :wave:

I wanted to give you an update on the fix we are rolling out to resolve the issue you encountered with your large simulations. As @tyler mentioned, the run is failing because the process executing the run zips the output files before staging them for another process, which then unzips them and pushes them to our persistent storage layer. I know… It’s pretty convoluted :sweat_smile:

We have found a way around this that removes the need to zip and unzip output files and are testing it in our staging environment this week. I expect we will have a fix available for you by mid next week at the latest. This should result in the following outcomes:

  • Runs should no longer fail because of the size of your output files
  • Runs should be slightly faster as a whole zip + unzip step is removed from the process
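The platform code itself is not shown here, so the following is purely an illustration of the general idea of moving a large file without building an intermediate archive: copying it in fixed-size chunks so that memory use stays bounded regardless of file size. The function name, paths, and chunk size are all assumptions.

```python
import shutil

# Assumed chunk size for illustration only.
CHUNK_BYTES = 8 * 1024 * 1024


def stream_copy(src_path: str, dst_path: str, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Copy src to dst in fixed-size chunks and return the number of bytes written."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads and writes chunk_bytes at a time, so a multi-GB
        # file is never held in memory all at once.
        shutil.copyfileobj(src, dst, chunk_bytes)
        return dst.tell()
```

Skipping the archive step also removes the temporary disk space that a zipped copy of a 2GB file would otherwise need.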

Thanks again for pushing the platform to where it needs to be and helping us find bugs and optimise on top of them :raised_hands:

Hi @patryk_wozniczka :wave:,

I’m happy to let you know that we have fixed, tested and pushed an update to the app which allows your larger energy simulations to run to completion as expected. I would be super grateful if you could try to re-run the parametric experiment you had issues with 2 weeks ago and confirm that it all runs as expected.

Thanks again for being patient with us during our early access/beta period and for helping us make Pollination a more reliable product!
