AI DRIVEN REAL TIME MONITORING AND PREDICTIVE ANALYTICS FOR DISTRIBUTED ENERGY RESOURCES IN MICROGRID

Esteves Bengui | MEL Candidate | Nov 29, 2025

Mentor: Airton Dudzevich

ABSTRACT

This project demonstrates a promising framework for renewable energy forecasting in environments constrained by limited historical data. By fusing physics-based modeling with GAN-powered corrections, we successfully created a workflow that circumvents the ‘cold start’ problem.

While the forecasting models proved sensitive to the severe measurement noise and outages observed in the 2023 dataset, their strong performance on representative low-noise days in 2022 and 2023 supports the underlying methodology. This indicates that blending synthetic data with intelligent correction models is a viable strategy for closing the gap between idealized simulations and operational reality. Future work must focus on enhanced data filtering to handle the unpredictability of real-world microgrid operations.

INTRODUCTION

While Solar PV is a dominant force in global electrification, its inherent intermittency creates significant forecasting challenges. This unpredictability poses a major risk to utilities and microgrid owners, particularly in remote communities lacking central grid backup.

Traditional forecasting methods often suffer from computational bottlenecks or require pristine, multi-year datasets—requirements that typically prevent the deployment of AI models during the initial phase of a project. As the industry shifts toward modular microgrids to relieve strained central grids, the need for agile, autonomous Energy Management Systems (EMS) becomes critical. Consequently, developing AI-driven models capable of immediate, effective operation is key to enabling this transition.

This project addresses the core forecasting component of that challenge. By developing a workflow that does not depend on extensive historical datasets, we aim to bridge the gap between research and live application, making high-quality forecasting scalable for microgrids from their very first day of operation.

METHODOLOGY

This project creates a long-term, realistic PV profile by bridging the gap between theoretical models and real-world unpredictability. The workflow (Fig 3) proceeds in four stages:

  1. Data & Physics: We combine PVDAQ and NSRDB data to build a deterministic System Advisor Model (SAM) spanning 1998–2023.
  2. Generative AI: A conditional GAN learns the “residuals”, that is, dynamics such as clouds and soiling that SAM misses, and uses them to generate 25 years of synthetic, realistic data.
  3. Forecasting: This synthetic data trains XGBoost and LSTM models, which are validated against real test data.
  4. Deployment: The finalized models power a dashboard that replicates real-world forecasting operations.
Fig 3: Overview of project workflow
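
The residual idea behind stage 2 can be illustrated with a minimal sketch. It assumes hourly series that start at midnight and are expressed in MW. The function names are hypothetical, and the bootstrap sampler below is only a simple stand-in for the conditional GAN: it resamples observed residuals by hour of day and is not the project's generator.

```python
import random


def residuals(measured, simulated):
    """Residual = measured power minus the physics-model estimate."""
    return [m - s for m, s in zip(measured, simulated)]


def make_bootstrap_sampler(measured, simulated, seed=0):
    """Stand-in for a learned generator (assumption, not a GAN):
    resample observed residuals grouped by hour of day."""
    rng = random.Random(seed)
    by_hour = {}
    for i, r in enumerate(residuals(measured, simulated)):
        by_hour.setdefault(i % 24, []).append(r)

    def sampler(hour_of_day, base_mw):
        if base_mw <= 0.0:
            return 0.0  # no correction when the model predicts no output
        return rng.choice(by_hour.get(hour_of_day, [0.0]))

    return sampler


def synthesize(simulated, residual_sampler, capacity_mw):
    """Add a sampled residual to each simulated value, clipped to [0, capacity]."""
    out = []
    for i, base in enumerate(simulated):
        value = base + residual_sampler(i % 24, base)
        out.append(min(max(value, 0.0), capacity_mw))
    return out
```

In this sketch a short overlap of measured and simulated data defines the residual pool, and the long simulated record is then passed through `synthesize` to obtain a corrected profile. A conditional GAN would replace the sampler with a generator conditioned on weather and time features.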

RESULTS

Figs 4 and 5 present a zoom-in analysis of the XGBoost Fine-Tuned (FT) model applied to real PV data from 2022 and 2023. Each plot shows the model’s performance on a representative low-noise day. The XGBoost FT model, trained on synthetic data and fine-tuned with real measurements, achieves an MAE of 1.77% of the 4.7 MW capacity on the 2023 day and 2.78% on the 2022 day, demonstrating strong accuracy under stable operating conditions.
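
For reference, an MAE expressed as a percentage of capacity can be computed as in the sketch below. The function name is hypothetical, and the sketch assumes both series are in MW and aligned sample by sample; the project's exact evaluation code is not shown in this summary.

```python
def mae_percent_of_capacity(actual_mw, forecast_mw, capacity_mw):
    """Mean absolute error, expressed as a percentage of plant capacity."""
    if not actual_mw or len(actual_mw) != len(forecast_mw):
        raise ValueError("series must be non-empty and of equal length")
    mae_mw = sum(abs(a - f) for a, f in zip(actual_mw, forecast_mw)) / len(actual_mw)
    return 100.0 * mae_mw / capacity_mw
```

Under this definition, 1.77% of a 4.7 MW plant corresponds to an average absolute error of about 0.083 MW.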

Fig 4: XGBoost FT Model performance on 2022 Real PV data
Fig 5: XGBoost FT Model performance on 2023 Real PV data

Zoom-in analysis of two representative low-noise days (2022-10-18 and 2023-09-08) shows that the LSTM Fine-Tuned model achieves high intrinsic accuracy when operated under stable irradiance conditions. The model yields an MAE of 0.78% of AC capacity on 2022 real data and 1.79% on 2023 real data, demonstrating strong generalization and confirming that most observed full-year errors originate from measurement noise and plant disturbances rather than model limitations.

Fig 6: LSTM FT Model performance on 2022 Real PV data
Fig 7: LSTM FT Model performance on 2023 Real PV data

DISCUSSION

The zoom-in evaluations of both the LSTM and XGBoost Fine-Tuned (FT) models on representative low-noise days provide strong evidence that the forecasting framework achieves high intrinsic accuracy when external disturbances are minimal. Under stable irradiance conditions, the fine-tuned models track the real PV output closely.

These results demonstrate two key insights:

  1. Model quality is high when evaluated on clean operational periods.
    Both architectures, deep learning (LSTM) and gradient boosting (XGBoost), exhibit excellent agreement with measured PV output when noise, curtailment, and sensor irregularities are absent.
  2. Most year-round forecasting error comes from real-world data noise, not from model limitations.
    The models generalize well across years and remain robust even when trained primarily on synthetic data and fine-tuned with limited real measurements. Deviations in full-year evaluations are largely attributable to measurement noise, shading events, tracking anomalies, or operational disturbances inherent to PVDAQ datasets.

CONCLUSION

The combination of physics-based modeling, GAN residual synthesis, and machine-learning fine-tuning produces accurate and generalizable PV forecasting models. Both the LSTM and XGBoost Fine-Tuned approaches exhibit similar behavioral patterns: low error during stable operating periods (roughly 1-3% of capacity), higher error when real data are noisy or disturbed, and strong transferability from synthetic to real measurements. This supports the overall hybrid methodology as a robust solution for forecasting in environments where high-fidelity PV data are limited, noisy, or inconsistent.

This project demonstrates a promising framework for renewable-energy forecasting under data limitations. By integrating physics-based simulations with GAN-enhanced synthetic data and modern machine-learning architectures, it is possible to build forecasting systems that reach high accuracy under stable operating conditions, although measurement noise, curtailment events, shading anomalies, and sensor irregularities still degrade full-year performance.

As the energy sector increasingly depends on accurate solar forecasts for grid stability, storage operation, demand planning, and microgrid optimization, hybrid synthetic-to-real approaches like this may represent a meaningful step forward. By bridging the gap between idealized simulations and actual PV plant behavior, this methodology offers a scalable, resilient, and future-ready foundation for forecasting in the next generation of low-carbon energy systems.

Further research should explore:

  • The minimum amount of real data required for model generalization
  • Testing the workflow with a higher-quality dataset
  • Developing preprocessing methods to mask or remove non-weather-related disturbances
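
As an illustration of the last item, one simple form such a mask could take is sketched below. It is a hypothetical starting point, not a method used in this project. It assumes the simulated series already reflects the weather, so a sample where the model expects substantial output while the measurement is near zero is treated as a likely outage. The function names and both thresholds are assumptions that would need tuning per site.

```python
def outage_mask(measured_mw, simulated_mw, capacity_mw,
                expected_frac=0.2, measured_frac=0.02):
    """Flag samples where the physics model expects clear output
    but the measurement is near zero (a likely non-weather outage)."""
    return [
        s > expected_frac * capacity_mw and m < measured_frac * capacity_mw
        for m, s in zip(measured_mw, simulated_mw)
    ]


def drop_flagged(series, mask):
    """Keep only the samples that were not flagged by the mask."""
    return [x for x, flagged in zip(series, mask) if not flagged]
```

Applying `drop_flagged` to both the measured and forecast series before scoring would exclude the flagged samples from full-year error metrics. Curtailment and tracking anomalies would need additional rules.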

CONTACT

Esteves Bengui

Email: esteves.bengui.biz@outlook.com

Phone: 281 416 3369

Website: www.linkedin.com/in/esteves-bengui
