CapTest Workflow
CapTest is a convenient way to keep the measured
data, modeled data, test settings, and comparison plots together for one
capacity test.
The general workflow is the same one described in CapData Workflow: load the
data, review the column groups, filter the measured and modeled data, calculate
reporting conditions, fit regressions, and compare the results. CapTest
helps with the pieces that are repeated from project to project:
Keeping the measured and modeled datasets together as ct.meas and ct.sim.
Applying a named regression setup, such as the standard ASTM E2848 equation or one of the bifacial options.
Storing common test values, such as nameplate capacity, test tolerance, irradiance filter limits, shade-filter settings, and bifaciality.
Creating comparison plots and pass/fail summaries from the same test object.
Reading and writing the test setup from a yaml file, which can be helpful when you want a repeatable project record.
A CapTest object keeps the test-level requirements (e.g., minimum irradiance
and test tolerance) in a single place. The raw measured data and PVsyst
data remain in the associated CapData objects, while the test-level assumptions
are stored on CapTest.
When to use CapTest
Almost always! The goal of the CapTest class is to streamline conducting tests whose regression equation includes regressors that are values calculated from the data available, like \(E_{Total}\) for a bifacial test.
If no available test setup fits the regression equation you are using, you can create your own test setup and, if necessary, functions to create the calculated parameters. See the Custom Test Setups section for details on this process. Also, please open an issue on GitHub to request adding a new test setup!
The examples in CapData Workflow use CapData
directly. That workflow is mostly unchanged and may be helpful while learning pvcaptest or while working through a non-standard analysis.
CapTest becomes more helpful when:
You want to use one of the standard regression setups without manually assigning a regression columns dictionary and then calling process_regression_columns to recursively process it.
You want one place to store project-level assumptions such as AC nameplate, tolerance, bifaciality, and filter settings.
You plan to save the test setup in a yaml file and re-run it later.
You want comparison methods such as captest_results(), overlay_scatters(), and residual_plot() to use the same measured and modeled data automatically.
Choosing a test setup
The test_setup value tells pvcaptest which regression equation and default
measured/model column mappings to use. These setups are intended to cover common
capacity-test cases without requiring users to create one.
A test setup is a named preset that bundles everything needed to configure the regression for a capacity test. Each setup defines:
Regression formula: the model equation, such as the standard ASTM E2848 four-term formula or the bifacial temperature-corrected power formula.
Measured column mappings (reg_cols_meas): which measured data columns map to each regression variable, how multiple sensors are aggregated (sum or mean), and any calculated columns required by the setup (e.g. e_total, power_temp_correct).
Modeled column mappings (reg_cols_sim): the corresponding PVsyst output columns and any calculated columns for the modeled side.
Default reporting conditions: how each regression variable is aggregated to compute reporting conditions (e.g. 60th-percentile POA, mean ambient temperature and wind speed).
Scatter plot function: the plotting callable matched to the regression formula, used by scatter_plots().
See Custom Test Setups for additional details and an example.
The built-in options are:
e2848_default
Standard ASTM E2848 regression:
\[P = E_{POA}\left(a_{1} + a_{2} E_{POA} + a_{3} T_{a} + a_{4} v\right)\]
This is the default setup for monofacial capacity tests.
bifi_e2848_etotal
Uses the standard ASTM E2848 regression form, but replaces front-side POA with total irradiance:
\[E_{Total} = E_{POA} + E_{Rear} \varphi\]
where \(\varphi\) is the bifaciality factor. This is useful for the NREL modified bifacial approach described in Bifacial Tests.
bifi_power_tc
Uses temperature-corrected power as the dependent variable and regresses it against front and rear irradiance. This setup creates a two-panel scatter plot so the front- and rear-side relationships can be reviewed separately.
e2848_spec_corrected_poa
Uses the standard ASTM E2848 regression form, but applies a First Solar spectral correction to POA irradiance before fitting the regression. This setup requires humidity and pressure data on the measured side and precipitable water from the PVsyst output. See Spectrally corrected POA (e2848_spec_corrected_poa) for the additional inputs.
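The two equations above can be evaluated by hand as a sanity check. The sketch below is illustrative only: the function names, input values, and coefficients are made up for this example and are not part of pvcaptest or the result of any fit.

```python
def e_total(e_poa, e_rear, bifaciality):
    """Total irradiance: front POA plus rear POA scaled by the bifaciality factor."""
    return e_poa + e_rear * bifaciality


def e2848_power(irradiance, t_amb, wind, coeffs):
    """Evaluate P = E * (a1 + a2*E + a3*T_a + a4*v) for one set of conditions."""
    a1, a2, a3, a4 = coeffs
    return irradiance * (a1 + a2 * irradiance + a3 * t_amb + a4 * wind)


# Made-up inputs (W/m^2, deg C, m/s); the coefficients are NOT fitted values.
irr = e_total(e_poa=800.0, e_rear=100.0, bifaciality=0.15)
coeffs = (1.0, -0.0001, -0.002, 0.001)
power = e2848_power(irr, t_amb=20.0, wind=2.0, coeffs=coeffs)
print(irr, power)
```

For the bifi_e2848_etotal setup, the only change to the regression is that the total irradiance takes the place of front-side POA, as in the call above.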
Note
The built-in setup names are strings. For example,
test_setup='e2848_default' uses the standard ASTM E2848 setup, and
test_setup='bifi_e2848_etotal' uses the total-irradiance bifacial setup.
Note
Each built-in test setup maps its regression variables to specific
column group IDs, which are the string keys of the column_groups attribute.
The IDs hardcoded into the built-in setups are:
irr_poa: front-side plane-of-array irradiance (all setups)
irr_rpoa: rear-side plane-of-array irradiance (bifi_e2848_etotal, bifi_power_tc)
real_pwr_mtr: AC power meter (all setups)
temp_amb: ambient temperature (all setups)
wind_speed: wind speed (all setups)
humidity: relative humidity (e2848_spec_corrected_poa only)
pressure: station pressure (e2848_spec_corrected_poa only)
Warning
If your data uses different column group IDs (for example because your
column-group YAML template assigns a different name to your irradiance
sensor), the built-in setup will not find the expected groups and the
regression will fail or use incorrect data. In that case you must either
rename the column groups in your data to match the IDs above, or supply
a fully custom reg_cols_meas that references your actual column
group IDs. See Custom Test Setups for details.
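A quick pre-check can catch mismatched IDs before the regression runs. The sketch below is a hypothetical helper, not part of pvcaptest. It assumes the column groups are available as a plain dict keyed by group ID and uses the IDs listed in the note above.

```python
# Column group IDs expected by each built-in setup, per the note above.
COMMON_IDS = {"irr_poa", "real_pwr_mtr", "temp_amb", "wind_speed"}
REQUIRED_IDS = {
    "e2848_default": COMMON_IDS,
    "bifi_e2848_etotal": COMMON_IDS | {"irr_rpoa"},
    "bifi_power_tc": COMMON_IDS | {"irr_rpoa"},
    "e2848_spec_corrected_poa": COMMON_IDS | {"humidity", "pressure"},
}


def missing_group_ids(column_groups, test_setup):
    """Return the sorted group IDs that `test_setup` expects but are absent."""
    return sorted(REQUIRED_IDS[test_setup] - set(column_groups))


# Hypothetical column groups where the rear irradiance group has another name.
groups = {
    "irr_poa": ["poa_1", "poa_2"],
    "rear_irr": ["rpoa_1"],
    "real_pwr_mtr": ["meter_kw"],
    "temp_amb": ["amb_temp_1"],
    "wind_speed": ["wind_1"],
}
print(missing_group_ids(groups, "bifi_e2848_etotal"))  # ['irr_rpoa']
```

An empty list means every expected ID is present. Any ID that is reported needs either a renamed column group or a custom reg_cols_meas.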
Creating a CapTest
A CapTest can be created from file paths, from data
that has already been loaded, or from a yaml file. from_params creates
a CapTest object from file paths and is the recommended option for typical use of
pvcaptest, running a test interactively in a Jupyter notebook.
From data paths
If you provide paths to your data, CapTest will load the data for you.
from captest import CapTest
ct = CapTest.from_params(
    test_setup='bifi_e2848_etotal',
    meas_path='./data/measured/',
    sim_path='./data/pvsyst_results.csv',
    bifaciality=0.15,
    ac_nameplate=6_000_000,
    test_tolerance='- 4',
    meas_load_kwargs={
        'group_columns': './path-to/column_groups.xlsx',
    },
)
Measured data is loaded with load_data(), and modeled data
is loaded with load_pvsyst(). Extra loading options can be
passed with meas_load_kwargs and sim_load_kwargs.
Note
You will likely want to include meas_load_kwargs to load your column grouping from a file. See the examples.
From loaded data
If you have already loaded the measured and modeled data, pass the two
CapData objects to from_params().
from captest import CapTest, load_data, load_pvsyst
meas = load_data(path='./data/measured/')
sim = load_pvsyst(path='./data/pvsyst_results.csv')
ct = CapTest.from_params(
    test_setup='e2848_default',
    meas=meas,
    sim=sim,
    ac_nameplate=6_000_000,
    test_tolerance='- 4',
)
ct.setup()
Note
Note the last line, which calls setup(). When manually constructing a
CapTest object as shown here, this is a necessary step. See What setup does.
The measured data is then available as ct.meas and the modeled data is
available as ct.sim. Both are regular CapData objects, so the filtering,
plotting, reporting-condition, and regression methods used elsewhere in the
User Guide still apply.
From yaml
For repeatable project work, the capacity-test setup can be stored in a yaml
file and loaded with from_yaml().
captest:
  test_setup: bifi_e2848_etotal
  meas_path: ./data/measured/
  sim_path: ./data/pvsyst.csv
  ac_nameplate: 6_000_000
  test_tolerance: "- 4"
  min_irr: 400
  max_irr: 1400
  fshdbm: 1.0
  bifaciality: 0.15
ct = CapTest.from_yaml('./project.yaml')
Relative meas_path and sim_path values are interpreted relative to the
yaml file location. This makes the yaml file portable with the project folder.
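The path rule can be illustrated with pathlib. This is a minimal sketch of the general pattern, not pvcaptest's actual implementation. The helper name and the example project folder are hypothetical.

```python
from pathlib import Path


def resolve_relative_to_yaml(yaml_file, data_path):
    """Resolve `data_path` against the folder that contains `yaml_file`."""
    data_path = Path(data_path)
    if data_path.is_absolute():
        # Absolute paths are assumed to be used as given.
        return data_path
    return Path(yaml_file).parent / data_path


resolved = resolve_relative_to_yaml(
    "projects/site_a/project.yaml", "./data/pvsyst.csv"
)
print(resolved.as_posix())  # projects/site_a/data/pvsyst.csv
```

Because the result depends only on where the yaml file sits, moving the whole project folder keeps the data paths valid.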
One yaml file can also contain more than one capacity-test setup. For example, the same bifacial project may be reviewed with both the total-irradiance E2848 setup and the temperature-corrected power setup.
captest_bifi_etotal:
  test_setup: bifi_e2848_etotal
  meas_path: ./data/measured/
  sim_path: ./data/pvsyst.csv
  ac_nameplate: 6_000_000
  bifaciality: 0.15
captest_bifi_power_tc:
  test_setup: bifi_power_tc
  meas_path: ./data/measured/
  sim_path: ./data/pvsyst.csv
  ac_nameplate: 6_000_000
  bifaciality: 0.15
ct_etotal = CapTest.from_yaml('./project.yaml', key='captest_bifi_etotal')
ct_power_tc = CapTest.from_yaml('./project.yaml', key='captest_bifi_power_tc')
What setup does
When CapTest has both measured and modeled data, it prepares each
CapData object for the selected test setup. This happens automatically when
using from_params() or
from_yaml() with both datasets present.
The setup step:
Assigns the regression equation for the selected test_setup.
Assigns the measured and modeled columns used by that equation.
Aggregates sensors where the setup calls for a sum or average.
Creates calculated columns required by the setup, such as e_total for the total-irradiance bifacial setup or power_temp_correct for the temperature-corrected bifacial setup.
Copies scalar values such as bifaciality, power_temp_coeff, base_temp, and spectral_module_type onto the measured and modeled CapData objects so calculated columns use the intended assumptions.
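To make the aggregation and calculated-column steps concrete, the sketch below mimics them on plain Python lists. The sensor names and readings are hypothetical, and pvcaptest performs the equivalent work on the CapData objects rather than with this helper.

```python
from statistics import mean

# Hypothetical readings: one list per sensor, one value per timestamp.
poa_sensors = {
    "poa_1": [795.0, 802.0, 810.0],
    "poa_2": [805.0, 798.0, 790.0],
}
rear_sensors = {"rpoa_1": [90.0, 100.0, 110.0]}
bifaciality = 0.15  # scalar assumption applied to every timestamp


def aggregate(sensors, how):
    """Combine sensors timestamp by timestamp with `how` (e.g. mean or sum)."""
    return [how(values) for values in zip(*sensors.values())]


poa = aggregate(poa_sensors, mean)
rear = aggregate(rear_sensors, mean)
# Calculated column: total irradiance from the aggregated front and rear POA.
e_total = [front + back * bifaciality for front, back in zip(poa, rear)]
print(poa, e_total)
```

Changing bifaciality changes every value in the calculated column, which is why the setup step has to be repeated after a setup value changes.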
If you change a setup value after creating ct, call
setup() again before continuing.
Note
Calling setup() resets ct.meas.data_filtered and
ct.sim.data_filtered back to the unfiltered data. This is usually what
you want after changing the setup, but it also means filters should be
re-applied after calling setup() again.
Running a capacity test
After ct has been created, use ct.meas and ct.sim in the same way
you would use separate CapData objects.
The example below shows the general pattern. Actual filters should be selected to match the contract and test procedure.
# measured filters
ct.meas.filter_irr(ct.min_irr, ct.max_irr)
ct.meas.filter_outliers()
ct.rep_cond()
ct.meas.fit_regression()
# simulated data filters
ct.sim.filter_time(start='2026-03-26', end='2026-04-12')
ct.sim.filter_irr(ct.min_irr, ct.max_irr)
ct.sim.filter_pvsyst()
ct.sim.fit_regression()
cap_ratio = ct.captest_results_check_pvalues()
rep_cond() calculates reporting conditions
using the selected setup’s defaults. For the standard E2848 setup, POA is
calculated using the 60th percentile of filtered POA, while ambient temperature
and wind speed use the mean.
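As a rough illustration of those defaults, the sketch below computes a 60th-percentile POA and mean temperature and wind speed from a handful of made-up values. The variable names and the linear-interpolation rule are assumptions for this example. pvcaptest's own percentile function may use a different interpolation rule.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile using linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


# Hypothetical filtered data (W/m^2, deg C, m/s).
filtered = {
    "poa": [400.0, 500.0, 600.0, 700.0, 800.0, 900.0],
    "t_amb": [18.0, 20.0, 22.0, 24.0, 26.0, 28.0],
    "w_vel": [1.0, 2.0, 3.0, 2.0, 1.0, 3.0],
}
rep_cond = {
    "poa": percentile(filtered["poa"], 60),
    "t_amb": mean(filtered["t_amb"]),
    "w_vel": mean(filtered["w_vel"]),
}
print(rep_cond)  # {'poa': 700.0, 't_amb': 23.0, 'w_vel': 2.0}
```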
If reporting conditions should be calculated from modeled data instead of measured data, use:
ct.rep_cond(which='sim')
After reporting conditions are calculated, it is common to apply a second,
narrower irradiance filter around the reporting irradiance.
ct.rep_irr_filter_low and ct.rep_irr_filter_high provide the lower and
upper fractional bounds. With the default rep_irr_filter=0.2, these values
are 0.8 and 1.2. Using these CapTest attributes helps apply the same
bounds consistently when filtering the measured and simulated data.
ct.meas.filter_irr(
    ct.rep_irr_filter_low,
    ct.rep_irr_filter_high,
    ref_val='rep_irr',
)
ct.sim.filter_irr(
    ct.rep_irr_filter_low,
    ct.rep_irr_filter_high,
    ref_val='rep_irr',
)
The ref_val='rep_irr' argument uses the CapData.rc attribute if it is set.
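The arithmetic behind the fractional bounds is simple enough to check by hand. The sketch below uses a hypothetical reporting irradiance and POA readings, and assumes the band includes its endpoints. It illustrates the idea and is not pvcaptest's filter code.

```python
rep_irr_filter = 0.2  # default fractional half-width from the text
low = 1 - rep_irr_filter   # about 0.8
high = 1 + rep_irr_filter  # about 1.2

rep_irr = 700.0  # hypothetical reporting irradiance, W/m^2
poa = [450.0, 580.0, 700.0, 820.0, 950.0]  # hypothetical POA readings

# Keep points within the band around the reporting irradiance (560 to 840 here).
kept = [value for value in poa if low * rep_irr <= value <= high * rep_irr]
print(kept)  # [580.0, 700.0, 820.0]
```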
Reviewing results
The main comparison method is
captest_results(). It predicts the measured
and modeled capacities at the reporting conditions, calculates the capacity
ratio, and can print a pass/fail summary using the AC nameplate and test
tolerance stored on your CapTest instance, e.g. ct.
cap_ratio = ct.captest_results()
Additional review methods are available from the same CapTest object:
captest_results_check_pvalues() compares results with and without high-p-value coefficients and highlights coefficients with p-values above 0.05.
overlay_scatters() overlays the measured and modeled regression scatter plots.
residual_plot() compares measured and modeled residuals against the regression variables.
get_summary() combines the filter summaries for ct.meas and ct.sim into one table.
determine_pass_or_fail() applies the stored nameplate and tolerance to a capacity ratio.
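The pass/fail logic can be sketched in a few lines. Everything below is an assumption made for illustration: the capacity ratio is taken as measured over modeled capacity, a tolerance of '- 4' is read as a lower bound of 0.96 only, and '+/- 4' is read as a band from 0.96 to 1.04. The helper is hypothetical and is not the determine_pass_or_fail() method.

```python
def pass_or_fail(cap_ratio, tolerance):
    """Check a capacity ratio against a tolerance string such as '- 4'."""
    sign, percent = tolerance.split()
    error = float(percent) / 100
    if sign == "-":
        # One-sided test: only under-performance can fail.
        return cap_ratio >= 1 - error
    if sign == "+/-":
        # Two-sided test: the ratio must stay within the band around 1.
        return abs(1 - cap_ratio) <= error
    raise ValueError(f"unrecognized tolerance: {tolerance!r}")


measured_capacity = 5_850_000.0  # hypothetical predicted capacities, W
modeled_capacity = 6_000_000.0
cap_ratio = measured_capacity / modeled_capacity
print(round(cap_ratio, 4), pass_or_fail(cap_ratio, "- 4"))
```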
Scatter plots
scatter_plots() creates the scatter plot that
matches the selected test_setup. By default it plots the measured data.
ct.scatter_plots()
ct.scatter_plots(which='sim')
The built-in scatter plots support several options that are useful during data review.
AM/PM split
Use split_day=True to show morning and afternoon points separately.
ct.scatter_plots(split_day=True)
By default pvcaptest tries to determine the split time from modeled clear-sky
GHI, when that information is available. Otherwise it uses "12:30". To set
the split time manually, pass split_time.
ct.scatter_plots(split_day=True, split_time='12:45')
The marker and color styles can be adjusted with am_color, pm_color,
am_marker, and pm_marker.
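Conceptually, the split assigns each point to a morning or afternoon group by comparing its time of day with split_time. The sketch below shows that idea with the standard library. The helper is hypothetical, and placing the boundary timestamp in the afternoon group is an assumption.

```python
from datetime import datetime, time


def split_am_pm(timestamps, split_time="12:30"):
    """Partition timestamps into (before split_time, at or after split_time)."""
    cutoff = time.fromisoformat(split_time)
    am = [ts for ts in timestamps if ts.time() < cutoff]
    pm = [ts for ts in timestamps if ts.time() >= cutoff]
    return am, pm


# Hypothetical timestamps on one test day.
stamps = [
    datetime(2026, 4, 1, hour, minute)
    for hour, minute in [(9, 0), (12, 15), (12, 30), (15, 45)]
]
am, pm = split_am_pm(stamps)
print(len(am), len(pm))  # 2 2
```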
Temperature-corrected power view
Use tc_power=True to view temperature-corrected power on the y-axis for
setups where raw power is used in the regression. This option can be used
to review temperature-corrected power vs. POA irradiance whether or not the
regression includes a temperature-corrected power term. The temperature-corrected
power shown in the plot is calculated independently of the regression.
ct.scatter_plots(tc_power=True)
The layout can be controlled with tc_mode:
'replace' shows one plot with temperature-corrected power on the y-axis.
'add_panel' shows raw power and temperature-corrected power in separate panels.
'overlay' overlays raw power and temperature-corrected power in one plot.
ct.scatter_plots(tc_power=True, tc_mode='add_panel')
Note
The default temperature-corrected power calculation expects measured power,
POA irradiance, and back-of-module temperature column groups. If a project
uses different data, pass an explicit tc_power_calc dictionary that
points to the correct columns or column groups.
Passing a custom tc_power_calc dictionary can be used to calculate
cell temperature from POA irradiance, ambient temperature, wind speed,
module type, and mounting type. The dictionary must include a top-level
power calculation tuple that produces the temperature-corrected power
column, such as {'power': (power_temp_correct, {...})}.
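For orientation, temperature correction of power is commonly written as a linear adjustment to a base temperature. The sketch below uses that common form with a coefficient in percent per degree C. It is an assumption about the general approach, and the function is a hypothetical stand-in rather than pvcaptest's power_temp_correct.

```python
def temp_correct_power(power, cell_temp, power_temp_coeff, base_temp=25.0):
    """Normalize power to `base_temp` using a linear temperature coefficient.

    power_temp_coeff is in percent per degree C (e.g. -0.35).
    """
    return power / (1 + (power_temp_coeff / 100) * (cell_temp - base_temp))


# Hypothetical reading: 5.0 MW at a 45 C cell temperature, -0.35 %/C module.
corrected = temp_correct_power(
    5_000_000.0, cell_temp=45.0, power_temp_coeff=-0.35
)
print(round(corrected))
```

With a negative coefficient and a cell temperature above the base temperature, the corrected power is higher than the raw power, which is the shift seen when comparing the raw and corrected views.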
Linked timeseries
Use timeseries=True to add a timeseries panel below the scatter plot.
Selections in the scatter plot and timeseries plot are linked, which can help
identify when unusual scatter points occurred.
ct.scatter_plots(timeseries=True)
ct.scatter_plots(split_day=True, tc_power=True, tc_mode='overlay',
                 timeseries=True)
Adjusting reporting conditions
The selected test_setup provides default reporting-condition calculations,
but they can be adjusted for a specific project. For example, to use the 55th
percentile POA while leaving the other reporting-condition variables at their
default calculations:
from captest.captest import perc_wrap
ct.rep_cond(func={'poa': perc_wrap(55)})
The same adjustment can be saved in yaml:
captest:
  test_setup: e2848_default
  overrides:
    rep_conditions:
      func:
        poa: perc_55
The perc_55 shorthand is converted to the corresponding percentile
function when the yaml file is loaded.
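The conversion amounts to reading the number out of the shorthand string. The sketch below shows one way such a string could be parsed and validated. The helper name and the accepted pattern are assumptions for this example, not pvcaptest's loader.

```python
import re


def parse_percentile(shorthand):
    """Return the percentile encoded in a string such as 'perc_55'."""
    match = re.fullmatch(r"perc_(\d+)", shorthand)
    if match is None:
        raise ValueError(f"not a percentile shorthand: {shorthand!r}")
    q = int(match.group(1))
    if not 0 <= q <= 100:
        raise ValueError(f"percentile out of range: {q}")
    return q


print(parse_percentile("perc_55"))  # 55
```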
Custom setups
Most users should start with one of the built-in setups. If a project requires a different regression equation, the setup can be customized by overriding the regression formula and the measured/model column mappings.
See Custom Test Setups for additional detail on the options.
For small changes to a built-in setup, use overrides. For example, a yaml
file can change the reporting-condition calculation without redefining the
whole setup.
For a fully custom regression, use test_setup: custom and provide:
reg_cols_meas: how the measured data columns map to the regression terms.
reg_cols_sim: how the modeled data columns map to the regression terms.
reg_fml: the regression formula.
For example:
captest:
  test_setup: custom
  meas_path: ./data/measured/
  sim_path: ./data/pvsyst.csv
  ac_nameplate: 6_000_000
  overrides:
    reg_fml: power ~ poa + t_amb
    reg_cols_meas:
      power: real_pwr_mtr
      poa: irr_poa
      t_amb: temp_amb
    reg_cols_sim:
      power: E_Grid
      poa: GlobInc
      t_amb: T_Amb
Note
Custom formulas and column mappings require more care than the built-in setups. Confirm that the measured and modeled columns use consistent units and represent the same physical quantities before comparing results.
Saving a test setup
to_yaml() writes the main test settings back
to a yaml file. This can be useful after adjusting a setup in a notebook and
wanting to save the settings for a future run.
ct.to_yaml('./project.yaml')
By default, to_yaml updates the selected section of an existing yaml file
and preserves other top-level sections, such as project metadata. It writes
test settings and paths, but it does not write the measured data, modeled data,
fitted regression results, or plots.
Spectrally corrected POA (e2848_spec_corrected_poa)
The e2848_spec_corrected_poa setup applies a First Solar spectral
correction to POA irradiance before running the standard ASTM E2848 regression.
This can be useful when spectral effects are part of the agreed test method.
The calculation uses pvlib’s First Solar spectral-correction method, based on the McCarthy 2024 PVPMC poster and the pvlib.spectrum.spectral_factor_firstsolar reference.
Measured data requirements:
A humidity column group with relative humidity in percent.
A pressure column group with station pressure in hPa / mbar.
A site dictionary on the measured CapData object. This is populated when load_data() is called with the site argument.
Modeled data requirements:
A PrecWat column in the PVsyst output. Configure the PVsyst export to include precipitable water.
Example:
from captest import CapTest, load_data, load_pvsyst
site = {
    'loc': {'latitude': 33.01, 'longitude': -99.56,
            'altitude': 500, 'tz': 'America/Chicago'},
    'sys': {'surface_tilt': 0, 'surface_azimuth': 180, 'albedo': 0.2},
}
meas = load_data(path='./data/measured/', site=site)
sim = load_pvsyst(path='./data/pvsyst_results.csv')
ct = CapTest.from_params(
    test_setup='e2848_spec_corrected_poa',
    meas=meas,
    sim=sim,
    ac_nameplate=6_000_000,
    test_tolerance='- 4',
    spectral_module_type='cdte',
)
The corrected irradiance column is named poa_spec_corrected and is added
to both ct.meas.data and ct.sim.data during setup. The regression then
uses that corrected POA value in place of raw POA irradiance.