Unable to retry failed studies

Thank you for all the inputs, Mikkel! The model is running now.

I am running a parametric model with several inputs, and Pollination has to run 162 parametric runs. However, a few runs (9 of 162) failed, and no particular input seems to cause the failure: with the same inputs, the model ran successfully for the vertical sensors (a smaller model).

I am also unable to retry the failed model using the retry button. Please let me know how to run the failed models again.

Hi @vanageso - Thank you for reporting this.

Can you tell me what happens when you try to re-run them? Does it give you an error?

We have also recently added an option to re-run all the failed runs in the whole study.

Let me know how it goes. If this doesn’t fix the problem, then my guess is that some of the input artifacts haven’t been uploaded successfully.

I tried the batch option. It still fails. What should the next step be?

I could load the results and skip over the failed runs during post-processing, since not many failed, and rerun them individually later. However, I am unsure how to recognize a failed run as I loop through the runs using the Fly component.
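As a rough illustration of the skip-over idea, here is a minimal Python sketch. It assumes each run is available as a record with a status field; the `id` and `status` keys, the `"Succeeded"` value, and the `split_runs` helper are all hypothetical placeholders, not the actual Pollination or Fly component output.

```python
# Minimal sketch: separate succeeded runs from failed ones before post-processing.
# ASSUMPTION: each run is a dict with hypothetical "id" and "status" keys;
# the real records returned by the platform may be shaped differently.

def split_runs(runs, ok_status="Succeeded"):
    """Return (succeeded, failed) lists based on each run's status field."""
    succeeded = [run for run in runs if run.get("status") == ok_status]
    failed = [run for run in runs if run.get("status") != ok_status]
    return succeeded, failed


# Hypothetical example data standing in for the runs of a study.
runs = [
    {"id": "run-001", "status": "Succeeded"},
    {"id": "run-002", "status": "Failed"},
    {"id": "run-003", "status": "Succeeded"},
]

succeeded, failed = split_runs(runs)

# Post-process only the runs that finished; keep the failed ids to rerun later.
succeeded_ids = [run["id"] for run in succeeded]
failed_ids = [run["id"] for run in failed]
```

Keeping the failed ids in a separate list makes it easy to rerun only those cases once retrying works again.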

Please let me know if there is a better approach! Thank you for your inputs and help!

When I tried re-running them, the status shows "running" for a while, but the run doesn’t seem to go through any steps; the counter always stays at (0/0). After a while it just says "failed" again. When the study ran for the first time with all the runs, a certain number of steps were completed even for the failed runs. That doesn’t happen with the retry button.

Thank you for testing, @vanageso! Let me check on our end and report back.

Cc: @antoinedao

I had a look and this is a bug on our end! Sorry about that, @vanageso! :expressionless: I thought we had it fixed. I will document this for @antoinedao to help you with this issue.

Hi @vanageso :wave: I had a look at why the studies were not retrying correctly and have implemented a fix that should unblock you for the time being. The change is being deployed to our user-facing environment as I type this, so you should be able to retry in 5-10 minutes.

Let us know if you still encounter the same issue.

PS: I tested this by triggering a retry for the run with id c9eb0a30-1174-523d-9dd2-ae4dd8877d7b, and it seems to have retried correctly. You can look for it on your end to see if this is the case. The run failed again, so I suspect retrying won’t change the outcome much, and it might be an issue with the inputs or the recipe you are using.

Thank you! I was able to retry the failed runs.

I am unable to retry again, using either the batch retry button or the retry button on individual runs.

Hi @vanageso, sorry about that! I was worried that it would affect you, but we had to revert some of the recent changes temporarily, and I assume that is why re-running these runs is blocked. @antoinedao is running a few tests on the server to ensure the changes won’t cause any permission issues before we merge them back in the next couple of days. We will keep you posted as soon as the changes are pushed back to production.

Thank you for your response! I will check back in a few days.

It’s working now. Thank you!
