What Is Energy Forecasting? Definition & Guide

Key Takeaways

Solar energy forecasting predicts how much electricity a PV system will produce over a specific period, from the next hour to the next 25 years
Forecasts rely on three inputs: weather and irradiance data, system design parameters (tilt, azimuth, module specs), and loss models (shading, soiling, wiring, inverter clipping)
P50 forecasts represent the median expected production — there is a 50% chance actual output exceeds this value. P90 forecasts are used by lenders because there is a 90% chance production meets or exceeds the P90 figure
Short-term forecasts (hours to days) support grid operations and trading, while long-term forecasts (annual/lifetime) drive financial models and investment decisions
Forecast accuracy depends on the quality of the weather dataset — TMY (Typical Meteorological Year) data from sources like NREL NSRDB or Meteonorm is standard for long-term production estimates
Modern solar design software integrates forecasting directly into the design workflow, so production estimates update automatically as designers adjust panel layouts and system configurations

What Is Solar Energy Forecasting?

Solar energy forecasting is the process of predicting how much electricity a photovoltaic system will generate over a defined period. Every solar project — residential, commercial, or utility-scale — requires a production forecast before financing, permitting, or construction can proceed. Without a reliable forecast, installers cannot provide accurate savings estimates, lenders cannot underwrite loans, and grid operators cannot plan dispatch schedules.

A solar production forecast answers the most basic question every homeowner and investor asks: “How much electricity will this system actually produce?” The answer determines payback periods, ROI calculations, and whether the project gets built at all.

The forecasting process combines three categories of input data: the solar resource at the site (irradiance, temperature, wind speed), the system design (module type, inverter specifications, tilt, azimuth, string configuration), and loss factors (shading, soiling, wiring resistance, module mismatch, inverter clipping, and degradation over time). A generation and financial tool processes these inputs to produce hourly, monthly, and annual energy estimates.

Types of Solar Energy Forecasts

Short-Term Forecasting

Hours to days ahead — Uses numerical weather prediction (NWP) models, satellite imagery, and sky cameras to predict output in near-real-time. Grid operators and energy traders use short-term forecasts to balance supply and demand, schedule reserves, and optimize day-ahead market bids. Typical forecast error is 5–15% RMSE, depending on cloud variability.

Medium-Term Forecasting

Weeks to months ahead — Combines historical irradiance patterns with seasonal climate models to predict output over the next 1–6 months. Used for maintenance scheduling, energy contract planning, and cash flow projections. Seasonal decomposition and analog year methods are common approaches.

Long-Term Forecasting

Annual and lifetime (25+ years) — Uses TMY (Typical Meteorological Year) datasets or multi-year satellite-derived irradiance records to estimate lifetime production. This is the forecast that appears in proposals, loan applications, and investor reports. Annual degradation (typically 0.4–0.7% per year) is factored into each year’s estimate.
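As a rough illustration of how degradation enters a lifetime estimate, the sketch below applies a constant annual rate to a first-year production figure. The function name, the 15,000 kWh starting value, and the 0.5% rate are illustrative assumptions, and real tools may model degradation differently (for example, with a larger first-year drop).

```python
def lifetime_production_kwh(year1_kwh, degradation_rate, years=25):
    """Return per-year kWh estimates with compounding annual degradation.

    Assumes year 1 produces the full first-year estimate and every later
    year is reduced by the same constant rate (a simplification).
    """
    return [year1_kwh * (1.0 - degradation_rate) ** n for n in range(years)]


# Hypothetical inputs: 15,000 kWh in year 1, 0.5% degradation per year
yearly = lifetime_production_kwh(15_000, 0.005)
print(f"Year 1:  {yearly[0]:,.0f} kWh")
print(f"Year 25: {yearly[-1]:,.0f} kWh")
print(f"Lifetime total: {sum(yearly):,.0f} kWh")
```

Each entry in the returned list can feed the matching year of a cash flow model.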

Probabilistic Forecasting

P50, P75, P90 confidence levels — Instead of a single production number, probabilistic forecasts express output as a distribution. P50 is the median estimate (50% chance of exceeding it). P90 is the conservative estimate used by banks — there is a 90% probability the system produces at least this amount. The spread between P50 and P90 reflects the uncertainty in the weather data and model.

Forecast Comparison by Timeframe

Forecast Type	Timeframe	Primary Data Sources	Typical Accuracy	Primary Use Case
Short-Term	1–72 hours	NWP models, satellite, sky cameras	5–15% RMSE	Grid balancing, energy trading
Medium-Term	1 week – 6 months	Historical irradiance, seasonal models	8–20% RMSE	Maintenance planning, contract bidding
Long-Term (Annual)	1–30 years	TMY data, satellite records (10–20 yr)	3–8% interannual variability	Financial models, loan underwriting
Probabilistic (P50)	Annual/lifetime	Multi-source irradiance, uncertainty models	Median estimate	Investor base case, expected returns
Probabilistic (P90)	Annual/lifetime	Same + uncertainty quantification	90% exceedance probability	Debt sizing, bankability reports

The Energy Forecasting Formula

Forecasted Energy Output

Forecasted Energy (kWh) = POA Irradiance (kWh/m²) × Array Area (m²) × Module Efficiency (%) × (1 − System Losses) × Availability Factor

Breaking down each component:

POA Irradiance — Plane of Array irradiance is the solar energy reaching the module surface after accounting for tilt, azimuth, and ground reflectance (albedo). Transposition models like Perez or Hay-Davies convert GHI/DNI/DHI data to POA values.
Array Area — Total active module area in square meters. For a 10 kW system using 400 W panels (each ~1.92 m²), this is 25 panels × 1.92 = 48 m².
Module Efficiency — The STC-rated conversion efficiency of the module, typically 19–23% for current monocrystalline panels.
System Losses — Combined losses from shading (2–15%), soiling (1–5%), wiring (1–3%), module mismatch (1–2%), inverter conversion (2–4%), and temperature derating (2–8%). Total system losses typically fall between 10% and 25%.
Availability Factor — The fraction of time the system is operational, accounting for downtime due to maintenance, grid outages, or equipment failure. Residential systems typically assume 99%, commercial 97–99%.

Example: A 10 kW residential system in Phoenix, AZ.

POA Irradiance = 2,150 kWh/m² per year. Array area = 48 m². Module efficiency = 21%. System losses = 18%. Availability = 99%.

Forecasted Energy = 2,150 × 48 × 0.21 × (1 − 0.18) × 0.99 = 2,150 × 48 × 0.21 × 0.82 × 0.99 ≈ 17,593 kWh/year

This matches the commonly cited rule of thumb for Phoenix: roughly 1,700–1,800 kWh per kWp per year.
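The formula can be sketched in a few lines of Python. This is a minimal illustration of the simplified formula in this section, not how any particular simulation engine works; the function and variable names are assumptions made for the example.

```python
def forecast_energy_kwh(poa_kwh_m2, area_m2, efficiency, losses, availability):
    """Annual energy (kWh) from the simplified formula in this section."""
    return poa_kwh_m2 * area_m2 * efficiency * (1.0 - losses) * availability


# Phoenix example values from the text
energy = forecast_energy_kwh(
    poa_kwh_m2=2150,    # annual plane-of-array irradiance
    area_m2=48,         # 25 panels x 1.92 m2
    efficiency=0.21,    # STC module efficiency
    losses=0.18,        # combined system losses
    availability=0.99,  # uptime fraction
)
specific_yield = energy / 10.0  # kWh per kWp for the 10 kW system
print(f"{energy:,.0f} kWh/year, {specific_yield:,.0f} kWh/kWp")
```

Dividing by system size gives the specific yield, which should land inside the 1,700–1,800 kWh/kWp range cited for Phoenix.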

P50 vs. P90: What They Mean for Financial Models

P50 and P90 are not different forecasting methods — they are different confidence levels applied to the same underlying forecast model. A P50 estimate of 15,000 kWh/year means there is a 50% chance the system produces more than 15,000 kWh and a 50% chance it produces less. A P90 estimate of 13,200 kWh/year means there is a 90% chance the system produces at least 13,200 kWh. The P90 value is always lower than P50.

Banks and investors use P90 (sometimes P75) to size debt because they need confidence that loan payments can be covered even in below-average solar years. Equity investors and homeowner proposals typically use P50 because it represents the expected long-run average. The gap between P50 and P90 is usually 8–15%, driven by interannual weather variability and model uncertainty. A narrow P50/P90 gap indicates a site with consistent solar resource and a well-validated model.
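One common way to derive P75 and P90 from a P50 figure is to assume annual production is normally distributed around P50 with a known total uncertainty. The sketch below follows that assumption; the function name and the 9.4% uncertainty are hypothetical, chosen so the result lands near the 13,200 kWh example above. Bankable studies may use other distributions or empirical data.

```python
from statistics import NormalDist


def exceedance_value(p50_kwh, relative_sigma, exceedance_prob):
    """Energy exceeded with the given probability.

    Assumes annual production is normally distributed around P50 with a
    relative standard deviation of `relative_sigma` (an assumption).
    """
    # A 90% exceedance level is the 10th percentile of the distribution
    z = NormalDist().inv_cdf(1.0 - exceedance_prob)
    return p50_kwh * (1.0 + z * relative_sigma)


p50 = 15_000
sigma = 0.094  # hypothetical total uncertainty, one standard deviation
p75 = exceedance_value(p50, sigma, 0.75)
p90 = exceedance_value(p50, sigma, 0.90)
print(f"P50: {p50:,.0f} kWh, P75: {p75:,.0f} kWh, P90: {p90:,.0f} kWh")
```

Under this assumption, the P50/P90 gap is about 1.28 times the total uncertainty, so a lower-uncertainty site gives a narrower gap.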

Data Sources for PV Energy Prediction

Forecast accuracy depends directly on the quality of the input irradiance data. The primary sources used in the solar industry:

Data Source	Provider	Coverage	Resolution	Typical Use
NSRDB (National Solar Radiation Database)	NREL	Americas, parts of Asia	4 km, 30-min	U.S. residential and commercial projects
PVGIS	European Commission JRC	Europe, Africa, Asia	0.05°, hourly	European project feasibility
Meteonorm	Meteotest	Global	Interpolated, hourly	Bankable reports, international projects
SolarAnywhere	Clean Power Research	Americas	1 km, 30-min	Utility-scale, high-accuracy studies
Solcast	Solcast (DNV)	Global	1–2 km, 5-min	Real-time operations, short-term forecasting
ERA5	ECMWF (Copernicus)	Global	31 km, hourly	Research, climate trend analysis

TMY datasets synthesize 15–30 years of historical records into a single “typical” year. They are standard for long-term production estimates but cannot capture extreme weather years. For bankable forecasts, developers often supplement TMY data with site-specific measurements or multiple satellite-derived datasets.

Forecasting Accuracy and Uncertainty

No solar production forecast is exact. Understanding the sources of uncertainty is as important as the forecast itself.

Key sources of uncertainty:

Interannual weather variability — Solar irradiance at any location varies by 3–8% year to year. A TMY-based forecast represents the long-term average, but any single year can deviate significantly. This is the largest source of uncertainty for long-term forecasts.
Model uncertainty — Different simulation engines (PVsyst, SAM, SurgePV) use different transposition models, loss algorithms, and cell temperature models. Model-to-model differences of 2–5% in annual production are common even with identical inputs.
Data source bias — Satellite-derived irradiance data can have systematic biases of 2–5% compared to ground measurements, especially in regions with complex terrain or frequent low clouds.
Shading estimation — Shading losses are among the hardest parameters to forecast accurately. Solar design software with integrated 3D modeling and hour-by-hour shade simulation reduces this uncertainty compared to manual shade estimates.
Degradation assumptions — Long-term forecasts are sensitive to the assumed annual degradation rate. A difference of 0.1% per year compounds to roughly a 2.5% difference in annual output by year 25, and about 1.2% in cumulative 25-year production.
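The individual sources are often combined into a single total uncertainty. A minimal sketch, assuming the sources are independent and can be added in quadrature (root-sum-square), is shown below. The function name and the specific one-sigma values are hypothetical, loosely informed by the ranges listed above, and correlated sources would need a different treatment.

```python
import math


def combined_uncertainty(components):
    """Root-sum-square of relative uncertainties (one sigma each).

    Assumes the sources are statistically independent, a simplification.
    """
    return math.sqrt(sum(u ** 2 for u in components.values()))


# Hypothetical one-sigma values for illustration only
sources = {
    "interannual_variability": 0.05,
    "model": 0.03,
    "irradiance_data_bias": 0.03,
    "shading": 0.02,
    "degradation": 0.01,
}
total = combined_uncertainty(sources)
print(f"Combined uncertainty: {total:.1%}")
```

Because the terms are squared, the largest source dominates the total, so reducing a small source has little effect on the combined figure.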

Pro Tip

When comparing production forecasts from different tools or providers, always check three things: which weather dataset was used, what losses were included (some tools exclude soiling or snow losses by default), and whether the number is a P50 or P90 estimate. Most discrepancies between competing proposals come from differences in these assumptions, not from the simulation engine itself.

How Forecasting Fits the Solar Design Workflow

In modern solar design platforms, energy forecasting is not a separate step — it runs continuously as part of the design process. When a designer adjusts panel placement, changes the tilt angle, or swaps an inverter model, the production forecast updates automatically.

This integration matters because it creates a feedback loop. A designer can see how moving a panel row to avoid a chimney shadow affects annual production, compare inverter options based on projected energy yield, and generate a customer proposal with production and savings estimates — all without leaving the design environment.

The generation and financial tool in SurgePV connects the production forecast directly to financial outputs: monthly savings, payback period, ROI, and cash flow projections over the system lifetime. The forecast feeds the financial model, and both appear in the customer proposal.

Forecast Validation and Monitoring

A forecast is only useful if it is validated against actual performance. Post-installation monitoring compares measured production to the original forecast, expressed as the performance ratio (PR) or capacity factor.

Common validation metrics:

Performance Ratio (PR) — Actual output divided by the expected output based on measured irradiance and system capacity. A PR above 80% is typical for well-designed systems.
Mean Bias Error (MBE) — The average difference between forecasted and actual production, showing systematic over- or under-prediction.
Root Mean Square Error (RMSE) — The square root of the mean squared forecast error, capturing both bias and random scatter. It equals the standard deviation of the errors only when the bias is zero.
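The three metrics can be computed with a few helper functions. This is a sketch with hypothetical monthly and annual values; the function names are assumptions, and the PR calculation takes the reference irradiance as 1 kW/m² (STC), so the expected output is simply POA irradiation multiplied by rated capacity.

```python
import math


def mean_bias_error(forecast, actual):
    """Average of (forecast - actual); positive means over-prediction."""
    return sum(f - a for f, a in zip(forecast, actual)) / len(actual)


def rmse(forecast, actual):
    """Square root of the mean squared forecast error."""
    squared = [(f - a) ** 2 for f, a in zip(forecast, actual)]
    return math.sqrt(sum(squared) / len(squared))


def performance_ratio(actual_kwh, poa_kwh_m2, capacity_kwp):
    """Actual output over the irradiance-based expected output.

    Assumes a reference irradiance of 1 kW/m2 (STC).
    """
    return actual_kwh / (poa_kwh_m2 * capacity_kwp)


# Hypothetical monthly production values (kWh)
forecast = [1200, 1350, 1600, 1700]
actual = [1150, 1400, 1500, 1720]
mbe = mean_bias_error(forecast, actual)
error = rmse(forecast, actual)

# Hypothetical annual figures for a 10 kWp system
pr = performance_ratio(actual_kwh=17_200, poa_kwh_m2=2150, capacity_kwp=10)
print(f"MBE: {mbe:.1f} kWh, RMSE: {error:.1f} kWh, PR: {pr:.0%}")
```

A positive MBE here means the forecast ran high on average, while the larger RMSE reflects month-to-month scatter on top of that bias.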

Industry data from IEA PVPS Task 13 shows that well-calibrated long-term forecasts typically achieve annual accuracy within 3–5% of actual production for sites with ground-measured irradiance data, and within 5–10% for sites relying solely on satellite data.

Frequently Asked Questions

How accurate are solar energy forecasts?

Long-term annual forecasts (P50) are typically accurate within 3–8% of actual production when using validated satellite irradiance data and calibrated simulation models. Short-term forecasts (day-ahead) achieve 5–15% RMSE depending on cloud conditions. Accuracy improves with higher-quality irradiance data, site-specific measurements, and validated shading models. Using solar design software with integrated 3D shade analysis reduces one of the largest sources of forecast error.

What is the difference between P50 and P90 in solar forecasting?

P50 is the median production estimate — there is a 50% probability the system produces more than this value in any given year. P90 is the conservative estimate with a 90% probability of being met or exceeded. P90 is typically 8–15% lower than P50. Lenders use P90 to size loans because they need high confidence that debt service will be covered even in below-average solar years. Homeowner proposals and equity models typically use P50 because it represents the expected long-run average.

What data do I need to create a solar production forecast?

At minimum, you need site-specific irradiance data (GHI, DNI, and DHI from a source like NREL NSRDB or Meteonorm), ambient temperature data, the system design specifications (module wattage, quantity, tilt, azimuth, inverter model), and a shade analysis of the site. Additional inputs that improve accuracy include soiling rates for the region, snow loss estimates, wind speed data for temperature modeling, and the specific wiring and conduit layout for electrical loss calculations. A generation and financial tool automates most of these calculations once the system design is complete.