How to submit an energy simulation to Pollination using the API

Hi,
I am pretty new to using Pollination Cloud. I want to automate a feedback-loop workflow so models are sent for simulation as they are generated, and their results are downloaded as soon as they are available. My first attempt to do this using Streamlit was unsuccessful, and I do not know what the problem is. I used this topic as a guide. My questions are the following:

  • What am I doing wrong?
  • Is it possible to send a Honeybee model directly to the cloud without having to save it as an HBJSON on a local drive first? It is not a big deal for my project, but maybe it is possible.
  • What is the process for automatically receiving the simulation results in Python as soon as they are available?

Thanks for your help.
import os
import pathlib
import time

from ladybug.epw import EPW
from pollination_streamlit.api.client import ApiClient
from pollination_streamlit.interactors import NewJob, Recipe
from queenbee.job.job import JobStatusEnum

api_key = 'my_key'
assert api_key is not None, 'You must provide a valid Pollination API key.'

# project owner and project name - change them to your account and project names
owner = 'my_username'
project = 'demo'

api_client = ApiClient(api_token=api_key)

# We assume that the recipe has been added to the project manually
recipe = Recipe('ladybug-tools', 'annual-energy-use', '0.5.3', client=api_client)

# Import EPW and DDY files
# set default values
_percentile_ = 0.4
epw = EPW(f"my/epw/file/address")
out_path = os.path.join(f"my/out/file", 'ddy')
ddy_file = epw.to_ddy(out_path, _percentile_)

# for files and folders we have to provide the relative path to the uploaded file
recipe_inputs = {
    'ddy': ddy_file,
    'epw': epw,
    'model': None, # This changes with iterations
    'add_idf': None,
    'measures': None,
    'sim_par': None,
    'units': None,
    'viz_variables': None
}

# Save models as hbjsons
hbfolder = f"my/hbjson/folder"
for model_nr in range(len(my_models)):
    my_models[model_nr].to_hbjson(name = "model" + str(model_nr), folder=hbfolder)

# create a new study
new_study = NewJob(owner, project, recipe, client=api_client)

new_study.name = 'Parametric study submitted from Python'

study_inputs = []
for model in pathlib.Path(hbfolder).glob('*.hbjson'):
    inputs = dict(recipe_inputs)  # create a copy of the recipe inputs
    # upload this model to the project
    # It is better to upload the files to a subfolder not to overwrite other files in
    # the project. In this case I call it dataset_1.
    # you can find them here: https://app.pollination.cloud/ladybug-tools/projects/demo?tab=files&path=dataset_1
    uploaded_path = new_study.upload_artifact(model, target_folder='dataset_1')
    inputs['model'] = uploaded_path
    inputs['model_id'] = model.stem  # I'm using the file name as the id.
    study_inputs.append(inputs)

# add the inputs to the study
# each set of inputs creates a new run
new_study.arguments = study_inputs

# create the study
running_study = new_study.create()

job_url = f'https://app.pollination.cloud/{running_study.owner}/projects/{running_study.project}/jobs/{running_study.id}'
print(job_url)
time.sleep(5)

status = running_study.status.status

while True:
    status_info = running_study.status
    print(f'\t# ------------------ #')
    print(f'\t# pending runs: {status_info.runs_pending}')
    print(f'\t# running runs: {status_info.runs_running}')
    print(f'\t# failed runs: {status_info.runs_failed}')
    print(f'\t# completed runs: {status_info.runs_completed}')
    if status in [
        JobStatusEnum.pre_processing, JobStatusEnum.running, JobStatusEnum.created,
        JobStatusEnum.unknown
        ]:
        time.sleep(30)
        running_study.refresh()
        status = status_info.status
    else:
        # study is finished
        time.sleep(2)
        break

I receive the following message:

WARNING streamlit.runtime.caching.cache_data_api: No runtime found, using MemoryCacheStorageManager
Traceback (most recent call last):

  File ~\AppData\Local\anaconda3\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File \pollination_all_external.py:2715
    running_study = new_study.create()

  File ~\AppData\Local\anaconda3\lib\site-packages\pollination_streamlit\interactors.py:165 in create
    qb_job = self.generate_qb_job()

  File ~\AppData\Local\anaconda3\lib\site-packages\pollination_streamlit\interactors.py:174 in generate_qb_job
    arguments = self._generate_qb_job_arguments()

  File ~\AppData\Local\anaconda3\lib\site-packages\pollination_streamlit\interactors.py:200 in _generate_qb_job_arguments
    run_args.append(JobPathArgument.parse_obj({

  File pydantic\main.py:526 in pydantic.main.BaseModel.parse_obj

  File pydantic\main.py:341 in pydantic.main.BaseModel.__init__

ValidationError: 7 validation errors for JobPathArgument
source -> type
  string does not match regex "^HTTP$" (type=value_error.str.regex; pattern=^HTTP$)
source -> url
  field required (type=value_error.missing)
source -> type
  string does not match regex "^S3$" (type=value_error.str.regex; pattern=^S3$)
source -> key
  field required (type=value_error.missing)
source -> endpoint
  field required (type=value_error.missing)
source -> bucket
  field required (type=value_error.missing)
source -> path
  str type expected (type=type_error.str)

Hi, @serpoag! Welcome to the forum.

Can you give us a bit more context on what you are trying to achieve? Are you trying this from inside Grasshopper? How are you creating the HBJSON models? Are you using apps because you couldn’t do it from the web interface?

I’m not sure why you would need to convert the models to HBJSON here if they have already been created.

# Save models as hbjsons
hbfolder = f"my/hbjson/folder"
for model_nr in range(len(my_models)):
    my_models[model_nr].to_hbjson(name = "model" + str(model_nr), folder=hbfolder)

The error that you get is a validation error of the object that is being submitted. I should be able to help you more once you give me a bit more context about your project.

Thanks for your reply!
I am working with multi-agent reinforcement learning as an optimization method, so I need constant feedback from a performance evaluator. At the current stage of development, performance = annual energy use, but I have previously tested my workflow successfully in Grasshopper using solar irradiation as the performance metric. Now that I have progressed to Honeybee and multiplied the number of agents, the best choice seems to be using Pollination Cloud to expedite the evaluation of the geometries.

I am still deciding between three options for implementing a communication platform with the cloud. I have tried using the Pollination components available for Grasshopper directly, but it seems to me that the way GH works diminishes the flexibility of my workflow (essentially because solutions have to be expired, which complicates loops). So I reckon my best option is to execute everything directly from Python. For this I have a version using CPython and Rhino.Inside: I generate the geometries with the Rhino library and then use the LBT libraries to get an HBJSON. At this point I only need to upload, wait for the evaluation results, and download those results, repeated several times for every agent. My third option (I think the least promising) implies that all the Grasshopper components get “eaten” by one Python component within GH that does all the loops inside and just gives one output when it is done. This would work similarly to the CPython option, but would have more limitations.

Hi, @serpoag - thank you for providing more information. If you can use Python outside Grasshopper that is definitely preferred.

Now my next question is why would you need to use Streamlit at all? From what I understand your workflow is dependent on Rhino for geometry. So what you need to do is to create those geometries and then you can use the Pollination API directly to submit your studies and get the results. Am I understanding this correctly?

Can you share the whole script with me? Including the modules that you use. The first item that I would clean up is all the additional None inputs here.

recipe_inputs = {
    'ddy': ddy_file,
    'epw': epw,
    'model': None  # This changes with iterations
}
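Looking at the validation error, my guess is that the file inputs also need to be the relative path strings returned by upload_artifact rather than Python objects like the EPW instance. A minimal sketch of what that would look like (placeholder paths; new_study is the NewJob created further down in your script, so the uploads would have to happen after it exists):

# upload the weather files once and keep the returned relative paths
epw_path = new_study.upload_artifact(pathlib.Path('my/epw/file.epw'), target_folder='weather-data')
ddy_path = new_study.upload_artifact(pathlib.Path('my/out/file.ddy'), target_folder='weather-data')

recipe_inputs = {
    'epw': epw_path,  # relative path in the project, not the EPW object
    'ddy': ddy_path,
    'model': None  # replaced in each iteration with the uploaded model path
}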

Hi again. There is no particular need to use Streamlit. I just saw the post and implemented it. I am not really used to REST APIs, so I am quite lost there. Do you have any example of Python code where geometries are locally generated, sent to the cloud with the API and then results downloaded automatically? That would be very helpful.

Oh, and I just read your previous post again. I missed telling you that I generate “Model” objects in the style of the “HB Model” component in Grasshopper, so .to_hbjson gives me a valid input for the recipe. I don’t think it is possible to input “Models” directly to the recipe, is it? I also forgot to tell you that when I had the wrong inputs for epw and ddy, the hbjsons did upload, but the simulation did not run, and there was no printed hint for debugging.

Unfortunately, I cannot share my entire code yet; it is long and part of my thesis that has yet to be made public. I can, however, tell you how I manage to arrive at “Model” objects. As I was working in Grasshopper and got used to the components, I took chunks of code from Honeybee’s geometry handlers and freed them from their Grasshopper dependencies. That way, they can do the same thing they used to do within the canvas, but now directly from CPython. I am privately sharing the modified functions on GitHub with you as a collaborator, if you want to check them out. The code is still messy and needs credits added and such before I feel confident making it public (even more so, considering most of it is your team’s work with minor modifications).

So yes, the pathway is Rhino geometries → Honeybee Rooms (with shades, apertures by ratio, programs, etc.) → HB Model → hbjsons. I am open to suggestions! Thanks again.
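To illustrate, here is a stripped-down sketch of that pathway with plain honeybee-core (identifiers, dimensions and folder are placeholders; shades, programs, etc. omitted):

from honeybee.room import Room
from honeybee.model import Model

# one box-shaped room standing in for a single agent's building
room = Room.from_box('agent_0_room', width=5, depth=8, height=3)
room.wall_apertures_by_ratio(0.4)  # add apertures to all outdoor walls

# wrap the room in a Model and serialize it as an input for the recipe
model = Model('agent_0', rooms=[room])
hbjson_path = model.to_hbjson(name='model_0', folder='my/hbjson/folder')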

Hi @serpoag, this is very helpful and makes it easier to help you. I have two questions, and then I can share a sample file that takes one or several HBJSON files and downloads the results.

  1. Do you want to submit several simulations together as part of one study? Or do you want to submit one simulation at a time?
  2. What are the outputs that you are looking for? I’m thinking that you might only need to download the JSON output from the simulation instead of the SQLite file. Also, it might make sense to run the simulation annually or monthly instead of hourly if you are only looking for annual results.

UPDATE: I saw your code on GitHub. It will be helpful if you can also share a few HBJSON files that are generated from your script so the sample code will represent your case better.

Thanks for the prompt reply.

I submit several simulations together, as shown in the example code of my original post. Each simulation is focused on one agent (or, put plainly, one building), while the rest of the agents become shades. As such, there is one model for each agent. Nevertheless, these simulation batches come in generations, so there are as many models as there are agents in each generation, and there are as many generations as the agents need to reinforce a desired behavior. The workflow profits from Pollination’s capability to simulate several runs (individual models for each agent) in parallel at each generation.

Regarding the outputs, I guess JSON would be enough for intermediate results. I would want graphical outputs for the final result, though; I am not sure yet whether in Grasshopper or somewhere else.

I have uploaded the hbjsons to my account using the code from my original post with the mistaken inputs for epw and ddy. In that folder, there are 9 models corresponding to one generation. I do not need to keep the intermediate steps, so each generation of hbjsons could be replaced by the newest one if needed. Hope this helps.

Hi @serpoag,

Here is sample code that submits all the options, waits for them to finish, and downloads the eui.json files. I tried to make it a bit more modular than the previous example. I also ended up using pollination-streamlit to keep the code shorter.

"""
Sample code for using the API to submit several HBJSON files from a folder to
Pollination, and download the results when ready.
"""
import pathlib
import time
import requests
from requests.exceptions import HTTPError
import zipfile
import tempfile
import shutil

from typing import List

from pollination_streamlit.api.client import ApiClient
from pollination_streamlit.interactors import NewJob, Recipe, Job
from queenbee.job.job import JobStatusEnum


def submit_study(
    study_name: str, api_client: ApiClient, owner: str, project: str, epw: pathlib.Path,
        ddy: pathlib.Path, models_folder: pathlib.Path) -> Job:

    print(f'Creating a new study: {study_name}')
    # Assumption: the recipe has been already added to the project
    recipe = Recipe('ladybug-tools', 'annual-energy-use', '0.5.3', client=api_client)

    input_folder = pathlib.Path(models_folder)

    # create a new study
    new_study = NewJob(owner, project, recipe, client=api_client)
    new_study.name = study_name
    new_study.description = f'Annual Energy Simulation {input_folder.name}'

    # upload the weather files - you only need to upload them once, and you can use
    # the path to them directly
    assert epw.is_file(), f'{epw} is not a valid file path.'
    assert ddy.is_file(), f'{ddy} is not a valid file path.'

    epw_path = new_study.upload_artifact(epw, target_folder='weather-data')
    ddy_path = new_study.upload_artifact(ddy, target_folder='weather-data')

    recipe_inputs = {
        'epw': epw_path,
        'ddy': ddy_path
    }

    study_inputs = []
    for model in input_folder.glob('*.hbjson'):
        inputs = dict(recipe_inputs)  # create a copy of the recipe inputs
        # upload this model to the project
        print(f'Uploading model: {model.name}')
        uploaded_path = new_study.upload_artifact(model, target_folder=input_folder.name)
        inputs['model'] = uploaded_path
        inputs['model_id'] = model.stem  # use model name as the ID.
        study_inputs.append(inputs)

    # add the inputs to the study
    # each set of inputs creates a new run
    new_study.arguments = study_inputs

    # create the study
    running_study = new_study.create()

    job_url = f'https://app.pollination.cloud/{running_study.owner}/projects/{running_study.project}/jobs/{running_study.id}'
    print(job_url)
    time.sleep(5)
    return running_study


def check_study_status(study: Job):
    """"""
    status = study.status.status
    http_errors = 0
    while True:
        status_info = study.status
        print('\t# ------------------ #')
        print(f'\t# pending runs: {status_info.runs_pending}')
        print(f'\t# running runs: {status_info.runs_running}')
        print(f'\t# failed runs: {status_info.runs_failed}')
        print(f'\t# completed runs: {status_info.runs_completed}')
        if status in [
            JobStatusEnum.pre_processing, JobStatusEnum.running, JobStatusEnum.created,
            JobStatusEnum.unknown
        ]:
            time.sleep(15)
            try:
                study.refresh()
            except HTTPError as e:
                status_code = e.response.status_code
                print(str(e))
                if status_code == 500:
                    http_errors += 1
                    if http_errors > 3:
                        # failed more than 3 times with no success
                        raise HTTPError(e)
                    # wait an additional 15 seconds
                    time.sleep(15)
            else:
                http_errors = 0
                status = status_info.status
        else:
            # study is finished
            time.sleep(2)
            break


def _download_results(
    owner: str, project: str, study_id: str, download_folder: pathlib.Path,
    api_client: ApiClient, page: int = 1
        ):
    print(f'Downloading page {page}')
    per_page = 25
    url = f'https://api.pollination.cloud/projects/{owner}/{project}/runs'
    params = {
        'job_id': study_id,
        'status': 'Succeeded',
        'page': page,
        'per-page': per_page
    }
    response = requests.get(url, params=params, headers=api_client.headers)
    response_dict = response.json()
    runs = response_dict['resources']
    # download into a temporary folder and copy only the eui.json files out
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_folder = pathlib.Path(temp_dir)
        for run in runs:
            run_id = run['id']
            # the model_id is hardcoded in submit_study. This is not necessarily good
            # practice and makes the code only useful for this example.
            input_id = [
                inp['value']
                for inp in run['status']['inputs'] if inp['name'] == 'model_id'
            ][0]
            run_folder = temp_folder.joinpath(input_id)
            eui_file = run_folder.joinpath('eui.json')
            out_file = download_folder.joinpath(f'{input_id}.json')
            print(f'downloading {input_id}.json to {out_file.as_posix()}')
            run_folder.mkdir(parents=True, exist_ok=True)
            download_folder.mkdir(parents=True, exist_ok=True)
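            # request a signed URL for this run's 'eui' output and download the
            # zipped output folder through the API client before extracting it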
            url = f'https://api.pollination.cloud/projects/{owner}/{project}/runs/{run_id}/outputs/eui'
            signed_url = requests.get(url, headers=api_client.headers)
            output = api_client.download_artifact(signed_url=signed_url.json())
            with zipfile.ZipFile(output) as zip_folder:
                zip_folder.extractall(run_folder.as_posix())
            # copy the eui.json file to the download folder
            shutil.copy(eui_file.as_posix(), out_file.as_posix())

    next_page = response_dict.get('next_page')
    if next_page is not None:
        time.sleep(1)
        _download_results(
            owner, project, study_id, download_folder, api_client, page=next_page
        )


def download_study_results(
        api_client: ApiClient, study: Job, output_folder: pathlib.Path):
    owner = study.owner
    project = study.project
    study_id = study.id

    _download_results(
        owner=owner, project=project, study_id=study_id, download_folder=output_folder,
        api_client=api_client
    )


if __name__ == '__main__':
    api_key = 'YOUR-API-KEY'
    assert api_key is not None, 'You must provide a valid Pollination API key.'

    # project owner and project name - Change these!
    owner = 'mostapha'
    project = 'agent-based-energy-simulation'

    # change this to where the study folder is
    study_folder = pathlib.Path(__file__).parent
    input_folder = study_folder.joinpath('dataset_1')
    epw = study_folder.joinpath('PER_Arequipa.847520_IWEC.epw')
    ddy = study_folder.joinpath('PER_Arequipa.847520_IWEC.ddy')
    results_folder = study_folder.joinpath('results/dataset_1')
    name = 'YOUR-STUDY-NAME'
    api_client = ApiClient(api_token=api_key)

    study = submit_study(name, api_client, owner, project, epw, ddy, input_folder)
    # wait until the study is finished
    check_study_status(study=study)
    download_study_results(
        api_client=api_client, study=study, output_folder=results_folder
    )
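Once the runs finish and the download is done, pulling the annual results back into Python for your feedback loop is a plain JSON load. A small usage sketch (I'm assuming the 'eui' key here; check the downloaded files for the exact structure):

import json
import pathlib

results_folder = pathlib.Path('results/dataset_1')
for result_file in sorted(results_folder.glob('*.json')):
    data = json.loads(result_file.read_text())
    # 'eui' is assumed to be the key for the end use intensity value
    print(f'{result_file.stem}: {data.get("eui")}')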

Here is a zipped folder with the sample code and all the input files.

submit_simulations.zip (302.5 KB)

Here is the project on Pollination.

Let me know if you have any other questions.


Thanks @mostapha, it works great!
Just one last question. I noticed that the simulations are now taking longer than when I submitted them via the Grasshopper component (on this study). I am assuming this is due to the simulation parameters, and I can only assume this because there is no file for the simulation parameters in the trial I ran directly from Python. Is there anything I am missing to be able to set and get this info?

EDIT: Never mind, it is an input to the recipe in JSON format, isn’t it? I assume that when it is set, the JSON file gets uploaded as well. Thanks again!

Hi @serpoag, No problem! Glad that it works.

The simulation parameters input expects a JSON file. You can save the parameters to a file, upload it as an input just like the weather files, and assign the path to the input.
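For example, a minimal sketch using honeybee-energy's SimulationParameter (the 'sim_par' input name is taken from your original script, and the file location is a placeholder):

import json
import pathlib

from honeybee_energy.simulation.parameter import SimulationParameter

# build the simulation parameters - e.g. only request monthly/annual outputs
sim_par = SimulationParameter()
sim_par.output.reporting_frequency = 'Monthly'

# save the parameters to a JSON file
sim_par_file = pathlib.Path('my/folder/sim_par.json')
sim_par_file.parent.mkdir(parents=True, exist_ok=True)
sim_par_file.write_text(json.dumps(sim_par.to_dict()))

# upload it like the weather files and assign the path to the recipe input
sim_par_path = new_study.upload_artifact(sim_par_file, target_folder='weather-data')
recipe_inputs['sim_par'] = sim_par_path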

Hi again,
I found an issue whilst working with your workflow.
I get an HTTP 500 server error. Any idea how to solve it, or at least how to get past it without halting the code execution? I understand that this may be a temporary issue on the server side (I have sent several runs without problems), so maybe we could retry check_study_status (or any of the other functions) a limited number of times when such an error is raised? Thanks for the help!

https://app.pollination.cloud/centipede-llc/projects/second_tst/jobs/412def2c-8401-4334-bb0b-2919d03e4127
	# ------------------ #
	# pending runs: 0
	# running runs: 4
	# failed runs: 0
	# completed runs: 0
Traceback (most recent call last):

  File ~\AppData\Local\anaconda3\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File c:\users\arr18sep\desktop\pollination\marl_cloud.py:565
    check_study_status(study=study)

  File ~\Desktop\pollination\herp\pollination_interact.py:136 in check_study_status
    study.refresh()

  File ~\AppData\Local\anaconda3\lib\site-packages\pollination_streamlit\interactors.py:95 in refresh
    self._fetch_runs()

  File ~\AppData\Local\anaconda3\lib\site-packages\pollination_streamlit\interactors.py:88 in _fetch_runs
    self._runs = self.run_api.get_runs(self.owner, self.project, self.id)

  File ~\AppData\Local\anaconda3\lib\site-packages\pollination_streamlit\api\runs.py:22 in get_runs
    return self._run_results_request(owner, project, job_id)

  File ~\AppData\Local\anaconda3\lib\site-packages\pollination_streamlit\api\runs.py:10 in _run_results_request
    res = self.client.get(

  File ~\AppData\Local\anaconda3\lib\site-packages\pollination_streamlit\api\client.py:82 in get
    res.raise_for_status()

  File ~\AppData\Local\anaconda3\lib\site-packages\requests\models.py:1021 in raise_for_status
    raise HTTPError(http_error_msg, response=self)

HTTPError: 500 Server Error: Internal Server Error for url: https://api.pollination.cloud/projects/centipede-llc/second_tst/results?job_id=412def2c-8401-4334-bb0b-2919d03e4127&page=1

I’ll provide sample code tomorrow.

Hi @serpoag,

I improved the code and replaced the zipped file with a new one. This is the part of the code that I changed inside the check_study_status function.

            try:
                study.refresh()
            except HTTPError as e:
                status_code = e.response.status_code
                print(str(e))
                if status_code == 500:
                    http_errors += 1
                    if http_errors > 3:
                        # failed more than 3 times with no success
                        raise HTTPError(e)
                    # wait an additional 15 seconds
                    time.sleep(15)
            else:
                http_errors = 0
                status = status_info.status

If an internal server error happens, it waits an additional 15 seconds before continuing. If it fails more than 3 consecutive times, it raises the error.

I ran the study a few times to make sure there are no errors. You can see the last one here: Pollination Cloud App
