CapTest Workflow

CapTest is a convenient way to keep the measured data, modeled data, test settings, and comparison plots together for one capacity test.

The general workflow is the same one described in CapData Workflow: load the data, review the column groups, filter the measured and modeled data, calculate reporting conditions, fit regressions, and compare the results. CapTest helps with the pieces that are repeated from project to project:

  • Keeping the measured and modeled datasets together as ct.meas and ct.sim.

  • Applying a named regression setup, such as the standard ASTM E2848 equation or one of the bifacial options.

  • Storing common test values, such as nameplate capacity, test tolerance, irradiance filter limits, shade-filter settings, and bifaciality.

  • Creating comparison plots and pass/fail summaries from the same test object.

  • Reading and writing the test setup from a yaml file, which can be helpful when you want a repeatable project record.

A CapTest object keeps the test-level requirements (e.g., minimum irradiance and test tolerance) in a single place. The raw measured data and PVsyst data remain in the associated CapData objects, while the test-level assumptions are stored on CapTest.

When to use CapTest

Almost always! The CapTest class is designed to streamline tests whose regression equation includes regressors calculated from the available data, such as \(E_{Total}\) for a bifacial test.

If no available test setup fits the regression equation you are using, you can create your own test setup and, if necessary, functions to create the calculated parameters. See the Custom Test Setups section for details on this process. Also, please open an issue on GitHub to request adding a new test setup!

The examples in CapData Workflow use CapData directly. That workflow is mostly unchanged and may be helpful while learning pvcaptest or while working through a non-standard analysis.

CapTest becomes more helpful when:

  • You want to use one of the standard regression setups without manually assigning a regression columns dictionary and then calling process_regression_columns to recursively process it.

  • You want one place to store project-level assumptions such as AC nameplate, tolerance, bifaciality, and filter settings.

  • You plan to save the test setup in a yaml file and re-run it later.

  • You want comparison methods such as captest_results(), overlay_scatters(), and residual_plot() to use the same measured and modeled data automatically.

Choosing a test setup

The test_setup value tells pvcaptest which regression equation and default measured/model column mappings to use. These setups are intended to cover common capacity-test cases without requiring users to create one.

A test setup is a named preset that bundles everything needed to configure the regression for a capacity test. Each setup defines:

  • Regression formula — the model equation, such as the standard ASTM E2848 four-term formula or the bifacial temperature-corrected power formula.

  • Measured column mappings (reg_cols_meas) — which measured data columns map to each regression variable, how multiple sensors are aggregated (sum or mean), and any calculated columns required by the setup (e.g. e_total, power_temp_correct).

  • Modeled column mappings (reg_cols_sim) — the corresponding PVsyst output columns and any calculated columns for the modeled side.

  • Default reporting conditions — how each regression variable is aggregated to compute reporting conditions (e.g. 60th-percentile POA, mean ambient temperature and wind speed).

  • Scatter plot function — the plotting callable matched to the regression formula, used by scatter_plots().

See Custom Test Setups for additional details and an example.

The built-in options are:

e2848_default

Standard ASTM E2848 regression:

\[P = E_{POA}\left(a_{1} + a_{2} E_{POA} + a_{3} T_{a} + a_{4} v\right)\]

This is the default setup for monofacial capacity tests.
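As a worked illustration of the equation above, the sketch below evaluates the regression at one set of reporting conditions. The helper name e2848_power and the coefficient values are invented for this example; they do not come from pvcaptest or from a real fit.

```python
def e2848_power(poa, t_amb, wind, coeffs):
    """Evaluate P = E_POA * (a1 + a2*E_POA + a3*T_a + a4*v)."""
    a1, a2, a3, a4 = coeffs
    return poa * (a1 + a2 * poa + a3 * t_amb + a4 * wind)


# Hypothetical coefficients, chosen only to keep the arithmetic easy to follow.
coeffs = (6000.0, -0.5, -20.0, 10.0)

# Reporting conditions: 800 W/m^2 POA, 25 degC ambient, 2 m/s wind.
power = e2848_power(poa=800.0, t_amb=25.0, wind=2.0, coeffs=coeffs)
print(power)  # 4096000.0
```

In a capacity test the same evaluation is done twice, once with the measured-data coefficients and once with the modeled-data coefficients, and the two predicted powers are compared.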

bifi_e2848_etotal

Uses the standard ASTM E2848 regression form, but replaces front-side POA with total irradiance:

\[E_{Total} = E_{POA} + E_{Rear} \varphi\]

where \(\varphi\) is the bifaciality factor. This is useful for the NREL modified bifacial approach described in Bifacial Tests.
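The total-irradiance definition above is simple arithmetic, as in the sketch below. The function name total_irradiance and the input values are illustrative only and are not part of pvcaptest.

```python
def total_irradiance(poa_front, poa_rear, bifaciality):
    """E_Total = E_POA + E_Rear * bifaciality factor."""
    return poa_front + poa_rear * bifaciality


# Illustrative values: 800 W/m^2 front, 100 W/m^2 rear, bifaciality factor 0.7.
e_total = total_irradiance(poa_front=800.0, poa_rear=100.0, bifaciality=0.7)
print(round(e_total, 1))  # about 870 W/m^2
```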

bifi_power_tc

Uses temperature-corrected power as the dependent variable and regresses it against front and rear irradiance. This setup creates a two-panel scatter plot so the front- and rear-side relationships can be reviewed separately.

e2848_spec_corrected_poa

Uses the standard ASTM E2848 regression form, but applies a First Solar spectral correction to POA irradiance before fitting the regression. This setup requires humidity and pressure data on the measured side and precipitable water from the PVsyst output. See Spectrally corrected POA (e2848_spec_corrected_poa) for the additional inputs.

Note

The built-in setup names are strings. For example, test_setup='e2848_default' uses the standard ASTM E2848 setup, and test_setup='bifi_e2848_etotal' uses the total-irradiance bifacial setup.

Note

Each built-in test setup maps its regression variables to specific column group IDs — the string keys of the column_groups attribute. The IDs hardcoded into the built-in setups are:

  • irr_poa — front-side plane-of-array irradiance (all setups)

  • irr_rpoa — rear-side plane-of-array irradiance (bifi_e2848_etotal, bifi_power_tc)

  • real_pwr_mtr — AC power meter (all setups)

  • temp_amb — ambient temperature (all setups)

  • wind_speed — wind speed (all setups)

  • humidity — relative humidity (e2848_spec_corrected_poa only)

  • pressure — station pressure (e2848_spec_corrected_poa only)

Warning

If your data uses different column group IDs (for example because your column-group YAML template assigns a different name to your irradiance sensor), the built-in setup will not find the expected groups and the regression will fail or use incorrect data. In that case you must either rename the column groups in your data to match the IDs above, or supply a fully custom reg_cols_meas that references your actual column group IDs. See Custom Test Setups for details.
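A quick pre-flight check can catch mismatched IDs before a regression is attempted. The sketch below works on a plain dictionary standing in for column_groups; the REQUIRED_GROUPS table is transcribed from the list above, and the missing_groups helper and the example column names are hypothetical, not pvcaptest API.

```python
# Required column group IDs per built-in setup, transcribed from the list above.
COMMON = {"irr_poa", "real_pwr_mtr", "temp_amb", "wind_speed"}
REQUIRED_GROUPS = {
    "e2848_default": COMMON,
    "bifi_e2848_etotal": COMMON | {"irr_rpoa"},
    "bifi_power_tc": COMMON | {"irr_rpoa"},
    "e2848_spec_corrected_poa": COMMON | {"humidity", "pressure"},
}


def missing_groups(column_groups, test_setup):
    """Return the required group IDs that are absent from column_groups."""
    return sorted(REQUIRED_GROUPS[test_setup] - set(column_groups))


# Stand-in for a column_groups mapping of group ID -> column names.
column_groups = {
    "irr_poa_pyran": ["met1_poa", "met2_poa"],  # ID does not match 'irr_poa'
    "real_pwr_mtr": ["meter_kw"],
    "temp_amb": ["met1_tamb"],
    "wind_speed": ["met1_wind"],
}

missing = missing_groups(column_groups, "e2848_default")
print(missing)  # ['irr_poa']
```

An empty list means every group ID the setup expects is present; it does not confirm that the columns inside each group hold the right data.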

Creating a CapTest

A CapTest can be created from file paths, from data that has already been loaded, or from a yaml file. Passing file paths to from_params creates the CapTest object and loads the data; this is the recommended option for typical pvcaptest usage, where a test is run interactively in a Jupyter notebook.

From data paths

If you provide paths to your data, CapTest will load the data for you.

ct = CapTest.from_params(
    test_setup='bifi_e2848_etotal',
    meas_path='./data/measured/',
    sim_path='./data/pvsyst_results.csv',
    bifaciality=0.15,
    ac_nameplate=6_000_000,
    test_tolerance='- 4',
    meas_load_kwargs={
        'group_columns': './path-to/column_groups.xlsx',
    },
)

Measured data is loaded with load_data(), and modeled data is loaded with load_pvsyst(). Extra loading options can be passed with meas_load_kwargs and sim_load_kwargs.

Note

You will likely need or want to include meas_load_kwargs to load your column grouping from a file. See the examples.

From loaded data

If you have already loaded the measured and modeled data, pass the two CapData objects to from_params().

from captest import CapTest, load_data, load_pvsyst

meas = load_data(path='./data/measured/')
sim = load_pvsyst(path='./data/pvsyst_results.csv')

ct = CapTest.from_params(
    test_setup='e2848_default',
    meas=meas,
    sim=sim,
    ac_nameplate=6_000_000,
    test_tolerance='- 4',
)
ct.setup()

Note

Note the last line, which calls setup(). This step is required when manually constructing a CapTest object from already-loaded data, as shown here. See What setup does.

The measured data is then available as ct.meas and the modeled data is available as ct.sim. Both are regular CapData objects, so the filtering, plotting, reporting-condition, and regression methods used elsewhere in the User Guide still apply.

From yaml

For repeatable project work, the capacity-test setup can be stored in a yaml file and loaded with from_yaml().

captest:
  test_setup: bifi_e2848_etotal
  meas_path: ./data/measured/
  sim_path: ./data/pvsyst.csv
  ac_nameplate: 6_000_000
  test_tolerance: "- 4"
  min_irr: 400
  max_irr: 1400
  fshdbm: 1.0
  bifaciality: 0.15

ct = CapTest.from_yaml('./project.yaml')

Relative meas_path and sim_path values are interpreted relative to the yaml file location. This makes the yaml file portable with the project folder.
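The path rule described above can be pictured with the standard library alone. This is a sketch of the idea, not pvcaptest's implementation; the function name resolve_relative_to_yaml and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_relative_to_yaml(yaml_path, data_path):
    """Resolve data_path against the folder that contains the yaml file."""
    data_path = PurePosixPath(data_path)
    if data_path.is_absolute():
        # Absolute paths are used as given.
        return data_path
    return PurePosixPath(yaml_path).parent / data_path


resolved = resolve_relative_to_yaml(
    "/projects/site_a/project.yaml", "./data/pvsyst.csv"
)
print(resolved)  # /projects/site_a/data/pvsyst.csv
```

Because the result depends only on where the yaml file sits, moving the whole project folder keeps the relative entries valid.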

One yaml file can also contain more than one capacity-test setup. For example, the same bifacial project may be reviewed with both the total-irradiance E2848 setup and the temperature-corrected power setup.

captest_bifi_etotal:
  test_setup: bifi_e2848_etotal
  meas_path: ./data/measured/
  sim_path: ./data/pvsyst.csv
  ac_nameplate: 6_000_000
  bifaciality: 0.15

captest_bifi_power_tc:
  test_setup: bifi_power_tc
  meas_path: ./data/measured/
  sim_path: ./data/pvsyst.csv
  ac_nameplate: 6_000_000
  bifaciality: 0.15

ct_etotal = CapTest.from_yaml('./project.yaml', key='captest_bifi_etotal')
ct_power_tc = CapTest.from_yaml('./project.yaml', key='captest_bifi_power_tc')

What setup does

When CapTest has both measured and modeled data, it prepares each CapData object for the selected test setup. This happens automatically when using from_params() or from_yaml() with both datasets present.

The setup step:

  • Assigns the regression equation for the selected test_setup.

  • Assigns the measured and modeled columns used by that equation.

  • Aggregates sensors where the setup calls for a sum or average.

  • Creates calculated columns required by the setup, such as e_total for the total-irradiance bifacial setup or power_temp_correct for the temperature-corrected bifacial setup.

  • Copies scalar values such as bifaciality, power_temp_coeff, base_temp, and spectral_module_type onto the measured and modeled CapData objects so calculated columns use the intended assumptions.

If you change a setup value after creating ct, call setup() again before continuing.

Note

Calling setup() resets ct.meas.data_filtered and ct.sim.data_filtered back to the unfiltered data. This is usually what you want after changing the setup, but it also means filters should be re-applied after calling setup() again.

Running a capacity test

After ct has been created, use ct.meas and ct.sim in the same way you would use separate CapData objects.

The example below shows the general pattern. Actual filters should be selected to match the contract and test procedure.

# measured filters
ct.meas.filter_irr(ct.min_irr, ct.max_irr)
ct.meas.filter_outliers()
ct.rep_cond()
ct.meas.fit_regression()

# simulated data filters
ct.sim.filter_time(start='2026-03-26', end='2026-04-12')
ct.sim.filter_irr(ct.min_irr, ct.max_irr)
ct.sim.filter_pvsyst()
ct.sim.fit_regression()

cap_ratio = ct.captest_results_check_pvalues()

rep_cond() calculates reporting conditions using the selected setup’s defaults. For the standard E2848 setup, POA is calculated using the 60th percentile of filtered POA, while ambient temperature and wind speed use the mean.
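To make the default aggregations concrete, the sketch below computes a 60th-percentile POA and mean ambient temperature and wind speed from a few made-up values. The percentile helper uses linear interpolation, which is an assumption; pvcaptest's own percentile function may select values differently. The dictionary keys are illustrative as well.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile using linear interpolation between ranked points."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


# Made-up filtered data; real tests use the filtered measured or modeled data.
poa = [500.0, 600.0, 700.0, 800.0, 900.0, 1000.0]
t_amb = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
wind = [1.0, 2.0, 3.0, 2.0, 1.0, 3.0]

rep_cond = {
    "poa": percentile(poa, 60),   # 60th percentile of POA
    "t_amb": mean(t_amb),         # mean ambient temperature
    "wind": mean(wind),           # mean wind speed
}
print(rep_cond)  # {'poa': 800.0, 't_amb': 25.0, 'wind': 2.0}
```

In this example the 60th-percentile rank lands exactly on a data point, so the result does not depend on the interpolation rule.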

If reporting conditions should be calculated from modeled data instead of measured data, use:

ct.rep_cond(which='sim')

After reporting conditions are calculated, it is common to apply a second, narrower irradiance filter around the reporting irradiance. ct.rep_irr_filter_low and ct.rep_irr_filter_high provide the lower and upper fractional bounds. With the default rep_irr_filter=0.2, these values are 0.8 and 1.2. Using these attributes of the CapTest instance keeps the bounds consistent between the measured and simulated data filters.

ct.meas.filter_irr(
    ct.rep_irr_filter_low,
    ct.rep_irr_filter_high,
    ref_val='rep_irr',
)

ct.sim.filter_irr(
    ct.rep_irr_filter_low,
    ct.rep_irr_filter_high,
    ref_val='rep_irr',
)

The ref_val='rep_irr' argument uses the CapData.rc attribute if it is set.
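The arithmetic behind the fractional bounds can be sketched without pvcaptest. The example assumes a reporting irradiance of 800 W/m^2 and treats the bounds as inclusive; both are assumptions made for illustration, and the variable names are not library attributes.

```python
rep_irr_filter = 0.2   # default fraction described above
rep_irr = 800.0        # assumed reporting irradiance, W/m^2

# Fractional bounds: 0.8 and 1.2 for the default setting.
low_frac = 1 - rep_irr_filter
high_frac = 1 + rep_irr_filter

# Absolute irradiance window around the reporting irradiance.
lower = low_frac * rep_irr
upper = high_frac * rep_irr

poa = [600.0, 650.0, 800.0, 950.0, 1000.0]
kept = [value for value in poa if lower <= value <= upper]
print(kept)  # [650.0, 800.0, 950.0]
```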

Reviewing results

The main comparison method is captest_results(). It predicts the measured and modeled capacities at the reporting conditions, calculates the capacity ratio, and can print a pass/fail summary using the AC nameplate and test tolerance stored on your CapTest instance (e.g. ct).

cap_ratio = ct.captest_results()
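The pass/fail logic can be pictured as follows. This is a sketch under stated assumptions: a tolerance string such as '- 4' is read as "pass when the capacity ratio is at least 96%", and '+/- 4' as "pass when the ratio is within 4% of 1.0". The passes helper and the capacity values are hypothetical; check the captest_results() documentation for the exact rule pvcaptest applies.

```python
def passes(cap_ratio, tolerance):
    """Evaluate a capacity ratio against a tolerance string like '- 4'."""
    sign, percent = tolerance.split(" ")
    error = float(percent) / 100
    if sign == "-":
        # One-sided test: only under-performance can fail.
        return cap_ratio >= 1 - error
    if sign in ("+/-", "-/+"):
        # Two-sided test: the ratio must stay inside the band.
        return abs(1 - cap_ratio) <= error
    raise ValueError(f"Unrecognized tolerance sign: {sign!r}")


# Made-up capacities predicted at the reporting conditions, in watts.
measured_capacity = 5_820_000.0
modeled_capacity = 6_000_000.0
cap_ratio = measured_capacity / modeled_capacity

print(round(cap_ratio, 3))        # 0.97
print(passes(cap_ratio, "- 4"))   # True
```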

Additional review methods are available from the same CapTest object:

Scatter plots

scatter_plots() creates the scatter plot that matches the selected test_setup. By default it plots the measured data.

ct.scatter_plots()
ct.scatter_plots(which='sim')

The built-in scatter plots support several options that are useful during data review.

AM/PM split

Use split_day=True to show morning and afternoon points separately.

ct.scatter_plots(split_day=True)

By default pvcaptest tries to determine the split time from modeled clear-sky GHI, when that information is available. Otherwise it uses "12:30". To set the split time manually, pass split_time.

ct.scatter_plots(split_day=True, split_time='12:45')

The marker and color styles can be adjusted with am_color, pm_color, am_marker, and pm_marker.
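Conceptually, the AM/PM split partitions points by time of day. The sketch below shows that partitioning on a handful of timestamps; the split_am_pm helper is hypothetical, and assigning a point exactly at the split time to the afternoon group is an assumption.

```python
from datetime import datetime, time


def split_am_pm(timestamps, split_time="12:30"):
    """Partition timestamps into before and at-or-after the split time."""
    cutoff = time.fromisoformat(split_time)
    am = [t for t in timestamps if t.time() < cutoff]
    pm = [t for t in timestamps if t.time() >= cutoff]
    return am, pm


timestamps = [
    datetime(2026, 3, 26, 9, 0),
    datetime(2026, 3, 26, 12, 15),
    datetime(2026, 3, 26, 12, 40),
    datetime(2026, 3, 26, 15, 0),
]

am_default, pm_default = split_am_pm(timestamps)           # split at 12:30
am_late, pm_late = split_am_pm(timestamps, "12:45")        # manual split time
print(len(am_default), len(pm_default))  # 2 2
print(len(am_late), len(pm_late))        # 3 1
```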

Temperature-corrected power view

Use tc_power=True to view temperature-corrected power on the y-axis for setups where raw power is used in the regression. This option can be used to review temperature-corrected power vs POA irradiance whether or not the regression includes a temperature-corrected power term. The plot uses its own independent calculation of temperature-corrected power.

ct.scatter_plots(tc_power=True)

The layout can be controlled with tc_mode:

  • 'replace' shows one plot with temperature-corrected power on the y-axis.

  • 'add_panel' shows raw power and temperature-corrected power in separate panels.

  • 'overlay' overlays raw power and temperature-corrected power in one plot.

ct.scatter_plots(tc_power=True, tc_mode='add_panel')

Note

The default temperature-corrected power calculation expects measured power, POA irradiance, and back-of-module temperature column groups. If a project uses different data, pass an explicit tc_power_calc dictionary that points to the correct columns or column groups.

Passing a custom tc_power_calc dictionary can be used to calculate cell temperature from POA irradiance, ambient temperature, wind speed, module type, and mounting type. The dictionary must include a top-level power calculation tuple that produces the temperature-corrected power column, such as {'power': (power_temp_correct, {...})}.
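For orientation, a commonly used form of temperature correction divides power by a linear temperature term. The sketch below assumes that form, with the temperature coefficient given in percent per degree Celsius and a base temperature of 25 degC. The function name and parameters are illustrative and should not be read as the signature of pvcaptest's power_temp_correct.

```python
def temp_corrected_power(power, temp_coeff_pct, cell_temp, base_temp=25.0):
    """Correct power to base_temp using a linear temperature coefficient.

    temp_coeff_pct is in percent per degC, e.g. -0.4 for -0.4 %/degC.
    """
    return power / (1 + (temp_coeff_pct / 100) * (cell_temp - base_temp))


# Made-up operating point: 950 kW measured with cells at 50 degC.
corrected = temp_corrected_power(power=950.0, temp_coeff_pct=-0.4, cell_temp=50.0)
print(round(corrected, 1))  # 1055.6
```

With a negative coefficient, power measured above the base temperature is corrected upward, and power measured at the base temperature is unchanged.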

Linked timeseries

Use timeseries=True to add a timeseries panel below the scatter plot. Selections in the scatter plot and timeseries plot are linked, which can help identify when unusual scatter points occurred.

ct.scatter_plots(timeseries=True)
ct.scatter_plots(split_day=True, tc_power=True, tc_mode='overlay',
                 timeseries=True)

Adjusting reporting conditions

The selected test_setup provides default reporting-condition calculations, but they can be adjusted for a specific project. For example, to use the 55th percentile POA while leaving the other reporting-condition variables at their default calculations:

from captest.captest import perc_wrap

ct.rep_cond(func={'poa': perc_wrap(55)})

The same adjustment can be saved in yaml:

captest:
  test_setup: e2848_default
  overrides:
    rep_conditions:
      func:
        poa: perc_55

The perc_55 shorthand is converted to the corresponding percentile function when the yaml file is loaded.
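One way to picture the shorthand is as a name that encodes the percentile value. The parser below is a hypothetical sketch of that idea, not the code pvcaptest uses when loading yaml.

```python
import re


def parse_percentile_shorthand(name):
    """Return the percentile encoded in a name such as 'perc_55'."""
    match = re.fullmatch(r"perc_(\d+)", name)
    if match is None:
        raise ValueError(f"Not a percentile shorthand: {name!r}")
    value = int(match.group(1))
    if not 0 <= value <= 100:
        raise ValueError(f"Percentile out of range: {value}")
    return value


print(parse_percentile_shorthand("perc_55"))  # 55
```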

Custom setups

Most users should start with one of the built-in setups. If a project requires a different regression equation, the setup can be customized by overriding the regression formula and the measured/model column mappings.

See Custom Test Setups for additional detail on the options.

For small changes to a built-in setup, use overrides. For example, a yaml file can change the reporting-condition calculation without redefining the whole setup.

For a fully custom regression, use test_setup: custom and provide:

  • reg_cols_meas: how the measured data columns map to the regression terms.

  • reg_cols_sim: how the modeled data columns map to the regression terms.

  • reg_fml: the regression formula.

For example:

captest:
  test_setup: custom
  meas_path: ./data/measured/
  sim_path: ./data/pvsyst.csv
  ac_nameplate: 6_000_000
  overrides:
    reg_fml: power ~ poa + t_amb
    reg_cols_meas:
      power: real_pwr_mtr
      poa: irr_poa
      t_amb: temp_amb
    reg_cols_sim:
      power: E_Grid
      poa: GlobInc
      t_amb: T_Amb

Note

Custom formulas and column mappings require more care than the built-in setups. Confirm that the measured and modeled columns use consistent units and represent the same physical quantities before comparing results.
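A simple consistency check is to confirm that every variable named in the formula has an entry in both column mappings. The sketch below does this with plain strings and dictionaries; the helper names are hypothetical, and the name extraction is deliberately naive (it only strips the I() wrapper used in formula strings).

```python
import re


def formula_variables(reg_fml):
    """Return the variable names in a formula string, ignoring I()."""
    names = set(re.findall(r"[A-Za-z_]\w*", reg_fml))
    return names - {"I"}


def unmapped_variables(reg_fml, reg_cols):
    """Return formula variables that have no entry in a column mapping."""
    return sorted(formula_variables(reg_fml) - set(reg_cols))


reg_fml = "power ~ poa + t_amb"
reg_cols_meas = {"power": "real_pwr_mtr", "poa": "irr_poa", "t_amb": "temp_amb"}
reg_cols_sim = {"power": "E_Grid", "poa": "GlobInc"}  # t_amb left out on purpose

print(unmapped_variables(reg_fml, reg_cols_meas))  # []
print(unmapped_variables(reg_fml, reg_cols_sim))   # ['t_amb']
```

This catches missing mappings only; it cannot tell whether the mapped columns share units or represent the same physical quantity.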

Saving a test setup

to_yaml() writes the main test settings back to a yaml file. This can be useful after adjusting a setup in a notebook and wanting to save the settings for a future run.

ct.to_yaml('./project.yaml')

By default, to_yaml updates the selected section of an existing yaml file and preserves other top-level sections, such as project metadata. It writes test settings and paths, but it does not write the measured data, modeled data, fitted regression results, or plots.
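The update-and-preserve behavior described above can be modeled on plain dictionaries. The sketch below does not read or write yaml and is not pvcaptest's implementation; the update_section helper and the project metadata are made up for illustration.

```python
import copy


def update_section(document, key, settings):
    """Return a copy of document with one top-level section replaced."""
    updated = copy.deepcopy(document)
    updated[key] = copy.deepcopy(settings)
    return updated


# Stand-in for the parsed contents of an existing yaml file.
existing = {
    "project": {"name": "Example Site", "owner": "Example Owner"},
    "captest": {"test_setup": "e2848_default", "ac_nameplate": 6_000_000},
}
new_settings = {
    "test_setup": "e2848_default",
    "ac_nameplate": 6_000_000,
    "min_irr": 400,
}

result = update_section(existing, "captest", new_settings)
print(sorted(result))                # ['captest', 'project']
print(result["captest"]["min_irr"])  # 400
```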

Spectrally corrected POA (e2848_spec_corrected_poa)

The e2848_spec_corrected_poa setup applies a First Solar spectral correction to POA irradiance before running the standard ASTM E2848 regression. This can be useful when spectral effects are part of the agreed test method.

The calculation uses pvlib’s First Solar spectral-correction method, based on the McCarthy 2024 PVPMC poster and the pvlib.spectrum.spectral_factor_firstsolar reference.

Measured data requirements:

  • A humidity column group with relative humidity in percent.

  • A pressure column group with station pressure in hPa / mbar.

  • A site dictionary on the measured CapData object. This is populated when load_data() is called with the site argument.

Modeled data requirements:

  • A PrecWat column in the PVsyst output. Configure the PVsyst export to include precipitable water.

Example:

from captest import CapTest, load_data, load_pvsyst

site = {
    'loc': {'latitude': 33.01, 'longitude': -99.56,
            'altitude': 500, 'tz': 'America/Chicago'},
    'sys': {'surface_tilt': 0, 'surface_azimuth': 180, 'albedo': 0.2},
}
meas = load_data(path='./data/measured/', site=site)
sim = load_pvsyst(path='./data/pvsyst_results.csv')

ct = CapTest.from_params(
    test_setup='e2848_spec_corrected_poa',
    meas=meas,
    sim=sim,
    ac_nameplate=6_000_000,
    test_tolerance='- 4',
    spectral_module_type='cdte',
)

The corrected irradiance column is named poa_spec_corrected and is added to both ct.meas.data and ct.sim.data during setup. The regression then uses that corrected POA value in place of raw POA irradiance.