Another critical element of the project is to take data feeds from sensors across the building (air quality, motion, energy, HVAC parameters) and use them to manage the building as efficiently as possible. This supports efficient performance and seamless integration with the building management system, giving occupants and building managers better outcomes such as improved air quality and reduced energy usage.
To achieve safe and effective function across the project, it is crucial that the AI model is developed with the correct guard rails and relevant redundancies. For this, we are training and developing a Deep Reinforcement Learning (DRL) model.
The Deep Reinforcement Learning Model
The DRL model is currently being developed and tested using P-Blocks Level 8. To support this, a simulated environment that closely replicates the conditions of P-Blocks Level 8 has been created, enabling the model to learn appropriate responses. The model has been tested in a range of trial scenarios, and the results show that, compared with the conventional rule-based control (RBC) schedule strategy, the DRL agents dynamically increased ventilation, improving indoor air quality during occupied periods.
By learning to heat and cool on demand, the system reduced overall energy consumption by 6.88% to 7.32%, even with the higher ventilation rates. It is now undergoing further testing and optimisation under diverse operational conditions, including varying occupancy levels, seasonal changes, and different times of day, to demonstrate that it can be deployed safely and effectively. The DRL model contains a number of agents, which are explained below.

Figure 1: HVAC simulation for validation of our environment model, AI strategy and energy performance.
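The energy reduction reported above is a relative saving against the RBC baseline. As a minimal sketch, assuming the saving is expressed as a percentage of baseline consumption over the same simulated period, the comparison could be computed as follows. The function name and the kWh figures are hypothetical and are used for illustration only; they are not measured project data.

```python
def energy_saving_percent(baseline_kwh: float, drl_kwh: float) -> float:
    """Percentage reduction in energy use relative to the RBC baseline."""
    if baseline_kwh <= 0:
        raise ValueError("baseline_kwh must be positive")
    return (baseline_kwh - drl_kwh) / baseline_kwh * 100.0


# Hypothetical consumption totals for one simulated period (illustration only).
baseline_kwh = 1000.0  # rule-based control (RBC) schedule
drl_kwh = 930.0        # DRL agents over the same period

saving = energy_saving_percent(baseline_kwh, drl_kwh)
print(f"Energy saving vs RBC: {saving:.2f}%")
```

Comparing both strategies over the same period and weather conditions keeps the percentage meaningful, since it is only as reliable as the baseline it is measured against.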
The Agents
The AI model comprises agents assigned to designated zones within the building, along with chiller and air handling unit agents. An agent can be thought of as a virtual person that acts as an information-processing broker for an individual function within the model. Each agent operates within a strict policy, which allows it to implement zone policies that may collaborate or compete with the needs of other zones.
The agents ingest observation data from various inputs (e.g. ambient temperature, total occupancy, zone temperature), and the learned policies decide how the DRL model acts on this data through the HVAC system.

Figure 2: Example of data inputs for the multi-agent system.
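The observation-to-action flow described above can be sketched roughly as follows. This is an illustrative outline only, not the project's implementation: the class names, field names, safety bounds, and the hand-written `placeholder_policy` (which stands in for a learned policy) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Observation:
    """Inputs an agent observes (names are hypothetical)."""
    ambient_temp_c: float
    total_occupancy: int
    zone_temp_c: float


@dataclass(frozen=True)
class Action:
    """Commands an agent sends to the HVAC system (names are hypothetical)."""
    setpoint_c: float
    ventilation_fraction: float  # 0.0 = minimum airflow, 1.0 = maximum


Policy = Callable[[Observation], Action]


def clamp(value: float, low: float, high: float) -> float:
    """Restrict a value to the closed interval [low, high]."""
    return max(low, min(high, value))


def placeholder_policy(obs: Observation) -> Action:
    """Hand-written stand-in for a learned policy (assumption, not DRL)."""
    ventilation = 0.2 + 0.01 * obs.total_occupancy
    setpoint = 22.0 if obs.total_occupancy > 0 else 26.0
    return Action(setpoint_c=setpoint, ventilation_fraction=ventilation)


@dataclass
class ZoneAgent:
    """One agent per zone; guard rails clamp whatever the policy proposes."""
    name: str
    policy: Policy
    min_setpoint_c: float = 18.0  # assumed safety bounds
    max_setpoint_c: float = 28.0

    def act(self, obs: Observation) -> Action:
        raw = self.policy(obs)
        return Action(
            setpoint_c=clamp(raw.setpoint_c, self.min_setpoint_c, self.max_setpoint_c),
            ventilation_fraction=clamp(raw.ventilation_fraction, 0.0, 1.0),
        )


def step(agents: Dict[str, ZoneAgent],
         observations: Dict[str, Observation]) -> Dict[str, Action]:
    """Run every agent once on its own zone's observation."""
    return {name: agent.act(observations[name]) for name, agent in agents.items()}


agents = {"zone_a": ZoneAgent("zone_a", placeholder_policy)}
observations = {"zone_a": Observation(ambient_temp_c=30.0,
                                      total_occupancy=120,
                                      zone_temp_c=24.5)}
actions = step(agents, observations)
print(actions["zone_a"])
```

Clamping the policy output inside `act` is one simple way to express a guard rail: whatever the policy proposes, the command sent to the HVAC system stays within fixed bounds.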