captest package

Submodules

captest.capdata module

Provides the CapData class and supporting functions.

The CapData class provides methods for loading, filtering, and regressing solar data. A capacity test following the ASTM standard can be performed using a CapData object for the measured data and a separate CapData object for the modeled data. The get_summary and captest_results functions accept two CapData objects as arguments and provide a summary of the data filtering steps and the results of the capacity test, respectively.

class captest.capdata.CapData(name)

Bases: object

Class to store capacity test data and column grouping.

CapData objects store a pandas dataframe of measured or simulated data and a dictionary grouping columns by type of measurement.

The column_groups dictionary allows maintaining the original column names while also grouping measurements of the same type from different sensors. Many of the methods for plotting and filtering data rely on the column groupings.

Parameters:
  • name (str) – Name for the CapData object.

  • data (pandas dataframe) – Used to store measured or simulated data imported from csv.

  • data_filtered (pandas dataframe) – Holds filtered data. Filtering methods act on and write to this attribute.

  • column_groups (dictionary) – Assigned by the group_columns method, which attempts to infer the type of measurement recorded in each column of the dataframe stored in the data attribute. For each inferred measurement type, group_columns creates an abbreviated name and a list of columns that contain measurements of that type. The abbreviated names are the keys and the corresponding values are the lists of columns.

  • regression_cols (dictionary) – Dictionary identifying which columns in data or groups of columns as identified by the keys of column_groups are the independent variables of the ASTM Capacity test regression equation. Set using set_regression_cols or by directly assigning a dictionary.

  • summary_ix (list of tuples) – Holds the row index data modified by the update_summary decorator function.

  • summary (list of dicts) – Holds the data modified by the update_summary decorator function.

  • rc (DataFrame) – Dataframe for the reporting conditions (poa, t_amb, and w_vel).

  • regression_results (statsmodels linear regression model) – Holds the linear regression model object.

  • regression_formula (str) – Regression formula to be fit to measured and simulated data. Must follow the requirements of statsmodels’ use of patsy.

  • tolerance (str) – String representing error band, e.g. ‘+ 3’, ‘+/- 3’, ‘- 5’. There must be a space between the sign and the number. The number is interpreted as a percent. For example, 5 percent is 5 not 0.05.
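The tolerance format described above can be illustrated with a small parser. This is a minimal sketch, not captest's implementation: `parse_tolerance` is a hypothetical helper, and it assumes only the rules stated above (a sign, one space, and a number read as a percent).

```python
def parse_tolerance(tolerance):
    """Split a tolerance string such as '+/- 3' into (sign, fraction).

    Hypothetical helper for illustration only.
    """
    parts = tolerance.split(" ")
    if len(parts) != 2:
        raise ValueError("expected a sign, one space, and a number, like '+/- 3'")
    sign, number = parts
    if sign not in ("+", "-", "+/-"):
        raise ValueError(f"unrecognized sign: {sign!r}")
    # The number is a percent: 5 means 5 percent, so convert it to a fraction.
    return sign, float(number) / 100


print(parse_tolerance("+/- 3"))
print(parse_tolerance("- 5"))
```

A string without the space, such as '+3', is rejected by this sketch, which mirrors the requirement that the sign and number be separated.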

agg_sensors(agg_map=None)

Aggregate measurements of the same variable from different sensors.

Parameters:

agg_map (dict, default None) – Dictionary specifying aggregations to be performed on the specified groups from the column_groups attribute. The dictionary keys should be keys from the column_groups attribute. The dictionary values should be aggregation functions. See pandas API documentation of Computations / descriptive statistics for a list of all options. By default the groups of columns assigned to the ‘power’, ‘poa’, ‘t_amb’, and ‘w_vel’ keys in the regression_cols attribute are aggregated:

  • sum of power

  • mean of poa, t_amb, w_vel

Returns:

Acts in place on the data, data_filtered, and regression_cols attributes.

Return type:

None

Notes

This method is intended to be used before any filtering methods are applied. Any filtering steps already applied when this method is called will be lost.
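The shape of an agg_map and the default aggregation (sum of power, mean of the other groups) can be sketched without pandas. This is an illustration under stated assumptions, not how agg_sensors is implemented: the group keys, column names, and readings are made up, and plain Python callables stand in for the aggregation functions pandas would accept.

```python
from statistics import mean

# Hypothetical column groups: group key -> columns holding that measurement.
column_groups = {
    "power": ["inv1_kw", "inv2_kw"],
    "poa": ["poa_east", "poa_west"],
}

# Same shape as agg_map: group key -> aggregation applied across the sensors.
agg_map = {"power": sum, "poa": mean}

# One timestamp of made-up readings.
row = {"inv1_kw": 480.0, "inv2_kw": 495.0, "poa_east": 802.0, "poa_west": 798.0}

# Aggregate each group's columns into a single value for this timestamp.
aggregated = {
    group: func([row[col] for col in column_groups[group]])
    for group, func in agg_map.items()
}
print(aggregated)  # {'power': 975.0, 'poa': 800.0}
```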

column_groups_to_excel(save_to='./column_groups.xlsx')

Export the column groups attribute to an excel file.

Parameters:

save_to (str) – File path to save column groups to. Should include .xlsx.

copy()

Create and return a copy of self.

data_columns_to_excel(sort_by_reversed_names=True)

Write the columns of data to an excel file as a template for a column grouping.

Parameters:

sort_by_reversed_names (bool, default True) – If true, sort column names after reversing them.

Returns:

Writes to excel file at self.data_loader.path / ‘column_groups.xlsx’.

Return type:

None

drop_cols(columns)

Drop columns from CapData data and column_groups.

Parameters:

columns (list) – List of columns to drop.

empty()

Return a boolean indicating if the CapData object contains data.

expanded_uncert(grp_to_term, k=1.96)

Calculate expanded uncertainty of the predicted power.

Adds instrument uncertainty and spatial uncertainty in quadrature and passes the result through the regression to calculate the Systematic Standard Uncertainty, which is then added in quadrature with the Random Standard Uncertainty of the regression and multiplied by the k factor, k.

1. Combine by adding in quadrature the spatial and instrument uncertainties for each measurand.

2. Add the absolute uncertainties from step 1 to each of the respective reporting conditions to determine a value for the reporting condition plus the uncertainty.

3. Calculate the predicted power using the RCs plus uncertainty three times, i.e. once for each RC plus its uncertainty. For example, to estimate the impact of the uncertainty of the reporting irradiance, calculate expected power using the irradiance RC plus irradiance uncertainty together with the original temperature and wind reporting conditions that have not had any uncertainty added to them.

4. Calculate the percent difference between each of the three new expected power values that include uncertainty of the RCs and the expected power with the unmodified RCs.

5. Take the square root of the sum of the squares of those three percent differences to obtain the Systematic Standard Uncertainty (bY).

Expects CapData to have instrument_uncert and spatial_uncerts attributes with matching keys.

Parameters:
  • grp_to_term (dict) – Map the groups of measurement types to the term in the regression formula that was regressed against an aggregated value (typically mean) from that group.

  • k (numeric) – Coverage factor.

Return type:

Expanded uncertainty as a decimal value.
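The quadrature steps described for expanded_uncert can be sketched in plain Python. This is a hedged illustration, not the method's implementation. It assumes the ASTM E2848 regression form P = E(a1 + a2·E + a3·Ta + a4·v), and it uses made-up coefficients, reporting conditions, uncertainties, and random standard uncertainty. The helper names are hypothetical.

```python
import math


def predict_power(coef, poa, t_amb, w_vel):
    """ASTM E2848 form: P = E * (a1 + a2*E + a3*Ta + a4*v)."""
    a1, a2, a3, a4 = coef
    return poa * (a1 + a2 * poa + a3 * t_amb + a4 * w_vel)


def quadrature(*values):
    """Square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in values))


# All numbers below are made up for illustration.
coef = (0.9, -1e-4, -2e-3, 1e-3)
rc = {"poa": 800.0, "t_amb": 25.0, "w_vel": 3.0}
instrument = {"poa": 16.0, "t_amb": 0.5, "w_vel": 0.1}  # absolute uncertainties
spatial = {"poa": 8.0, "t_amb": 0.3, "w_vel": 0.2}      # absolute uncertainties

base = predict_power(coef, **rc)
pct_diffs = []
for key in rc:
    combined = quadrature(instrument[key], spatial[key])  # combine per measurand
    shifted = dict(rc, **{key: rc[key] + combined})       # one RC plus its uncertainty
    power = predict_power(coef, **shifted)                # other RCs left unmodified
    pct_diffs.append((power - base) / base)               # difference as a fraction

systematic = quadrature(*pct_diffs)   # Systematic Standard Uncertainty (bY)
random_std = 0.004                    # assumed Random Standard Uncertainty
k = 1.96                              # coverage factor
expanded = k * quadrature(systematic, random_std)
print(round(expanded, 4))
```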

filter_clearsky(window_length=20, ghi_col=None, inplace=True, keep_clear=True, **kwargs)

Use pvlib detect_clearsky to remove periods with unstable irradiance.

The pvlib detect_clearsky function compares modeled clear sky ghi against measured ghi to detect periods of clear sky. Refer to the pvlib documentation for additional information.

By default uses data identified by the column_groups dictionary as ghi and modeled ghi. Issues warning if there is no modeled ghi data, or the measured ghi data has not been aggregated.

Parameters:
  • window_length (int, default 20) – Length of sliding time window in minutes. Must be greater than 2 periods. Default of 20 works well for 5 minute data intervals. pvlib default of 10 minutes works well for 1 minute data.

  • ghi_col (str, default None) – The name of a column of measured GHI data. Overrides default attempt to automatically identify a column of GHI data.

  • inplace (bool, default True) – When true removes periods with unstable irradiance. When false returns pvlib detect_clearsky results, which by default is a series of booleans.

  • keep_clear (bool, default True) – Set to False to keep cloudy periods.

  • **kwargs – Passed to pvlib detect_clearsky. See pvlib documentation for details.

filter_custom(func, *args, **kwargs)

Apply update_summary decorator to passed function.

Parameters:
  • func (function) – Any function that takes a dataframe as the first argument and returns a dataframe. Many pandas dataframe methods meet this requirement, like pd.DataFrame.between_time.

  • *args – Additional positional arguments passed to func.

  • **kwargs – Additional keyword arguments passed to func.

Examples

Example use of the pandas dropna method to remove rows with missing data.

>>> das.filter_custom(pd.DataFrame.dropna, axis=0, how='any')
>>> summary = das.get_summary()
>>> summary['pts_after_filter'][0]
1424
>>> summary['pts_removed'][0]
16

Example use of the pandas between_time method to remove time periods.

>>> das.reset_filter()
>>> das.filter_custom(pd.DataFrame.between_time, '9:00', '13:00')
>>> summary = das.get_summary()
>>> summary['pts_after_filter'][0]
245
>>> summary['pts_removed'][0]
1195
>>> das.data_filtered.index[0].hour
9
>>> das.data_filtered.index[-1].hour
13

filter_days(days, drop=False, inplace=True)

Select or drop timestamps for days passed.

Parameters:
  • days (list) – List of days to select or drop.

  • drop (bool, default False) – Set to true to drop the timestamps for the days passed instead of keeping only those days.

  • inplace (bool, default True) – If inplace is true, then function overwrites the filtered dataframe. If false returns a DataFrame.

filter_irr(low, high, ref_val=None, col_name=None, inplace=True)

Filter on irradiance values.

Parameters:
  • low (float or int) – Minimum value as fraction (0.8) or absolute 200 (W/m^2).

  • high (float or int) – Max value as fraction (1.2) or absolute 800 (W/m^2).

  • ref_val (float or int or ‘self_val’) – Required when low and high are fractions. Pass the string ‘self_val’ to use the value in self.rc.

  • col_name (str, default None) – Column name of irradiance data to filter. By default uses the POA irradiance set in regression_cols attribute or average of the POA columns.

  • inplace (bool, default True) – When true, writes the filtered data back to data_filtered. When false, returns the filtered dataframe.

Returns:

Filtered dataframe if inplace is False.

Return type:

DataFrame
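The two ways of specifying low and high for filter_irr (fractions of a reference value, or absolute W/m^2) can be sketched as follows. This is a minimal sketch with hypothetical helper names. It assumes both limits are inclusive, which the description above does not state.

```python
def irradiance_bounds(low, high, ref_val=None):
    """Return absolute (low, high) limits in W/m^2.

    When ref_val is given, low and high are treated as fractions of it.
    Hypothetical helper for illustration only.
    """
    if ref_val is not None:
        return low * ref_val, high * ref_val
    return low, high


def keep_in_band(values, low, high, ref_val=None):
    """Keep irradiance values inside the band (inclusive limits assumed)."""
    lower, upper = irradiance_bounds(low, high, ref_val)
    return [v for v in values if lower <= v <= upper]


# Fractions around a reference irradiance of 800 W/m^2.
print(keep_in_band([600, 700, 800, 900, 1000], 0.8, 1.2, ref_val=800))
# Absolute limits in W/m^2.
print(keep_in_band([150, 400, 900], 200, 800))
```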

filter_missing(columns=None)

Drops time intervals with missing data for specified columns.

By default drops intervals which have missing data in the columns defined by regression_cols.

Parameters:

columns (list, default None) – Subset of columns to check for missing data.

filter_op_state(op_state, mult_inv=None, inplace=True)

NOT CURRENTLY IMPLEMENTED - Filter on inverter operation state.

This filter is rarely useful in practice, but will be re-implemented if requested.

Parameters:
  • data (str) – ‘sim’ or ‘das’ determines if filter is on sim or das data

  • op_state (int) – integer inverter operating state to keep

  • mult_inv (list of tuples, [(start, stop, op_state), ...]) – List of tuples where start is the first column of a type of inverter, stop is the last column and op_state is the operating state for the inverter type.

  • inplace (bool, default True) – When True writes over current filtered dataframe. When False returns CapData object.

Returns:

Returns filtered CapData object when inplace is False.

Return type:

CapData

filter_outliers(inplace=True, **kwargs)

Apply elliptic envelope from scikit-learn to remove outliers.

Parameters:
  • inplace (bool) – Default of true writes filtered dataframe back to data_filtered attribute.

  • **kwargs – Passed to sklearn EllipticEnvelope. Contamination keyword is useful to adjust proportion of outliers in dataset. Default is 0.04.

filter_pf(pf, inplace=True)

Filter data on the power factor.

Parameters:
  • pf (float) – 0.999 or similar to remove timestamps with lower power factor values. Values greater than or equal to pf are kept.

  • inplace (bool) – Default of true writes filtered dataframe back to data_filtered attribute.

Return type:

Dataframe when inplace is False.

filter_power(power, percent=None, columns=None, inplace=True)

Remove data above the specified power threshold.

Parameters:
  • power (numeric) – If percent is none, all data equal to or greater than power is removed. If percent is not None, then power should be the nameplate power.

  • percent (None, or numeric, default None) – Data greater than or equal to percent of power is removed. Specify percentage as decimal i.e. 1% is passed as 0.01.

  • columns (None or str, default None) – By default filter is applied to the power data identified in the regression_cols attribute. Pass a column name or column group to filter on. When passing a column group the power filter is applied to each column in the group.

  • inplace (bool, default True) – Default of true writes filtered dataframe back to data_filtered attribute.

Return type:

Dataframe when inplace is false.
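How the power and percent arguments of filter_power combine into a removal threshold can be sketched as below. The helper names are hypothetical and the values are made up. The comparison follows the description above: data equal to or greater than the threshold is removed.

```python
def power_threshold(power, percent=None):
    """Threshold at or above which data is removed.

    With percent given, power is treated as the nameplate rating and
    percent as a decimal fraction of it (1% is 0.01).
    """
    return power if percent is None else power * percent


def drop_high_power(values, power, percent=None):
    """Keep only values strictly below the threshold."""
    limit = power_threshold(power, percent)
    return [v for v in values if v < limit]


# Remove data at or above 99% of a 1000 kW nameplate.
print(drop_high_power([900, 985, 995, 1000], 1000, percent=0.99))
# Remove data at or above an absolute 950 kW.
print(drop_high_power([900, 950, 1000], 950))
```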

filter_pvsyst(inplace=True)

Filter pvsyst data for off max power point tracking operation.

This function is only applicable to simulated data generated by PVsyst. Filters the ‘IL Pmin’, ‘IL Vmin’, ‘IL Pmax’, ‘IL Vmax’ values if they are greater than 0.

Parameters:

inplace (bool, default True) – If inplace is true, then function overwrites the filtered data. If false returns a CapData object.

Return type:

CapData object if inplace is set to False.

filter_sensors(perc_diff=None, inplace=True, row_filter=<function check_all_perc_diff_comb>)

Drop suspicious measurements by comparing values from different sensors.

This method ignores columns generated by the agg_sensors method.

Parameters:
  • perc_diff (dict) – Dictionary to specify a different threshold for each group of sensors. Dictionary keys should be translation dictionary keys and values are floats, like {‘irr-poa-’: 0.05}. By default the poa sensors as set by the regression_cols dictionary are filtered with a 5% difference threshold.

  • inplace (bool, default True) – If True, writes over current filtered dataframe. If False, returns a DataFrame.

Returns:

Returns filtered dataframe if inplace is False.

Return type:

DataFrame

filter_shade(fshdbm=1.0, query_str=None, inplace=True)

Remove data during periods of array shading.

The default behavior assumes the filter is applied to data output from PVsyst and removes all periods where values in the column ‘FShdBm’ are less than 1.0.

Use the query_str parameter when shading losses (power) rather than a shading fraction are available.

Parameters:
  • fshdbm (float, default 1.0) – The value for fractional shading of beam irradiance as given by the PVsyst output parameter FShdBm. Data is removed when the shading fraction is less than the value passed to fshdbm. By default all periods of shading are removed.

  • query_str (str) – Query string to pass to pd.DataFrame.query method. The query string should be a boolean expression comparing a column name to a numeric filter value, like ‘ShdLoss<=50’. The column name must not contain spaces.

  • inplace (bool, default True) – If inplace is true, then function overwrites the filtered dataframe. If false returns a DataFrame.

Returns:

If inplace is false returns a dataframe.

Return type:

pd.DataFrame

filter_time(start=None, end=None, drop=False, days=None, test_date=None, inplace=True, wrap_year=False)

Select data for a specified time period.

Parameters:
  • start (str or pd.Timestamp or None, default None) – Start date for data to be returned. If a string is passed it must be in format that can be converted by pandas.to_datetime. Not required if test_date and days arguments are passed.

  • end (str or pd.Timestamp or None, default None) – End date for data to be returned. If a string is passed it must be in format that can be converted by pandas.to_datetime. Not required if test_date and days arguments are passed.

  • drop (bool, default False) – Set to true to drop time period between start and end rather than keep it. Must supply start and end and wrap_year must be false.

  • days (int or None, default None) – Days in time period to be returned. Not required if start and end are specified.

  • test_date (str or pd.Timestamp or None, default None) – Must be format that can be converted by pandas.to_datetime. Not required if start and end are specified. Requires days argument. Time period returned will be centered on this date.

  • inplace (bool, default True) – If inplace is true, then function overwrites the filtered dataframe. If false returns a DataFrame.

  • wrap_year (bool, default False) – If true calls the wrap_year_end function. See wrap_year_end docstring for details. wrap_year_end was cntg_eoy prior to v0.7.0.
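When test_date and days are passed to filter_time instead of start and end, the returned period is centered on test_date. One way to picture that is the sketch below. It assumes "centered" means half of days on either side of test_date. The exact boundaries captest uses may differ, and `centered_period` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def centered_period(test_date, days):
    """Return (start, end) spanning `days` days centered on test_date.

    Assumes an even split of the period around the test date.
    """
    center = datetime.fromisoformat(test_date)
    half = timedelta(days=days / 2)
    return center - half, center + half


start, end = centered_period("2023-06-15", 14)
print(start.date(), end.date())  # 2023-06-08 2023-06-22
```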

fit_regression(filter=False, inplace=True, summary=True)

Perform a regression with statsmodels on filtered data.

Parameters:
  • filter (bool, default False) – When true removes timestamps where the residuals are greater than two standard deviations. When false just calculates ordinary least squares regression.

  • inplace (bool, default True) – If filter is true and inplace is true, then function overwrites the filtered data for sim or das. If false returns a CapData object.

  • summary (bool, default True) – Set to false to not print regression summary.

Returns:

Returns a filtered CapData object if filter is True and inplace is False.

Return type:

CapData
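The residual filter applied when filter is true can be illustrated with a one-variable least squares fit. This is only a sketch. The real method fits the multi-term regression formula with statsmodels, while this example fits a straight line by hand and assumes the sample standard deviation of the residuals is the yardstick. The data and helper names are made up.

```python
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_mean, y_mean = mean(x), mean(y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def filter_residuals(x, y, n_std=2):
    """Keep points whose residual is within n_std standard deviations."""
    slope, intercept = fit_line(x, y)
    residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]
    limit = n_std * stdev(residuals)
    return [(a, b) for a, b, r in zip(x, y, residuals) if abs(r) <= limit]


# Made-up data: y = 2x with one outlier at x = 5.
x = [float(v) for v in range(1, 11)]
y = [2 * v for v in x]
y[4] = 40.0

kept = filter_residuals(x, y)
print(len(kept))  # 9, the outlier at x = 5 is removed
```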

get_filtering_table()

Returns DataFrame showing which filter removed each filtered time interval.

Time intervals removed are marked with a “1”. Time intervals kept are marked with a “0”. Time intervals removed by a previous filter are np.nan/blank. Columns/filters are in order they are run from left to right. The last column labeled “all_filters” is True for intervals that were not removed by any of the filters.

get_length_test_period()

Get length of test period.

Uses length of data unless filter_time has been run, then uses length of the kept data after filter_time was run the first time. Subsequent uses of filter_time are ignored.

Rounds up to a period of full days.

Returns:

Days in test period.

Return type:

int

get_pts_required(hrs_req=12.5)

Set number of data points required for complete test attribute.

Parameters:

hrs_req (numeric, default 12.5) – Number of hours to be represented by final filtered test data set. Default of 12.5 hours is dictated by ASTM E2848 and corresponds to 750 1-minute data points, 150 5-minute, or 50 15-minute points.
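The point counts quoted above follow directly from the hours required and the data interval. A short check is sketched below. `points_required` is a hypothetical helper, and the interval is passed explicitly here as an assumption rather than read from the data.

```python
def points_required(hrs_req=12.5, interval_minutes=1):
    """Number of data points needed to cover hrs_req hours."""
    return hrs_req * 60 / interval_minutes


for minutes in (1, 5, 15):
    print(minutes, points_required(interval_minutes=minutes))
# 1 750.0
# 5 150.0
# 15 50.0
```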

get_reg_cols(reg_vars=None, filtered_data=True)

Get regression columns renamed with keys from regression_cols.

Parameters:
  • reg_vars (list or str, default None) – By default returns all columns identified in regression_cols. A list with any combination of the keys of regression_cols is valid or pass a single key as a string.

  • filtered_data (bool, default true) – Return filtered or unfiltered data.

Return type:

DataFrame

get_summary()

Print a summary of filtering applied to the data_filtered attribute.

The summary dataframe shows the history of the filtering steps applied to the data including the timestamps remaining after each step, the timestamps removed by each step and the arguments used to call each filtering method.

If the filter arguments are cut off, the max column width can be increased by setting pd.options.display.max_colwidth.

Parameters:

None

Return type:

Pandas DataFrame

plot(combine={'ghi_csky': '(?=.*ghi)(?=.*irr)', 'inv_sum_mtr_pwr': ['(?=.*real)(?=.*pwr)(?=.*mtr)', '(?=.*pwr)(?=.*agg)'], 'poa_csky': '(?=.*poa)(?=.*irr)', 'poa_ghi': 'irr.*(poa|ghi)$', 'temp_amb_bom': '(?=.*temp)((?=.*amb)|(?=.*bom))'}, default_groups=['inv_sum_mtr_pwr', '(?=.*real)(?=.*pwr)(?=.*inv)', '(?=.*real)(?=.*pwr)(?=.*mtr)', 'poa_ghi', 'poa_csky', 'ghi_csky', 'temp_amb_bom'], width=1500, height=250, **kwargs)

Create a dashboard to explore timeseries plots of the data.

The dashboard contains three tabs: Groups, Layout, and Overlay. The first tab, Groups, presents a column of plots with a separate plot overlaying the measurements for each group of the column_groups. The groups plotted are defined by the default_groups argument.

The second tab, Layout, allows manually selecting groups to plot. The button on this tab can be used to replace the column of plots on the Groups tab with the current figure on the Layout tab. Rerun this method after clicking the button to see the new plots in the Groups tab.

The third tab, Overlay, allows picking a group or any combination of individual tags to overlay on a single plot. The list of groups and tags can be filtered using regular expressions. Adding a text id in the box and clicking Update will add the current overlay to the list of groups on the Layout tab.

Parameters:
  • combine (dict, optional) – Dictionary of group names and regex strings to use to identify groups from column groups and individual tags (columns) to combine into new groups. See the parse_combine function for more details.

  • default_groups (list of str, optional) – List of regex strings to use to identify default groups to plot. See the plotting.find_default_groups function for more details.

  • width (int, optional) – The width of the plots on the Groups tab.

  • height (int, optional) – The height of the plots on the Groups tab.

  • **kwargs (optional) – Additional keyword arguments are passed to the options of the scatter plot.

Return type:

Panel tabbed layout

predict_capacities(irr_filter=True, percent_filter=20, **kwargs)

Calculate expected capacities.

Parameters:
  • irr_filter (bool, default True) – When true will filter each group of data by a percentage around the reporting irradiance for that group. The data groups are determined from the reporting irradiance attribute.

  • percent_filter (float or int or tuple, default 20) – Percentage or tuple of percentages used to filter around reporting irradiance in the irr_rc_balanced function. Required argument when irr_bal is True. Tuple option allows specifying different percentage for above and below reporting irradiance. (below, above)

  • **kwargs – NOTE: Should match kwargs used to calculate reporting conditions. Passed to filter_grps which passes on to pandas Grouper to control label and closed side of intervals. See pandas Grouper documentation for details. Default is left labeled and left closed.

print_points_summary(hrs_req=12.5)

Print summary data on the number of points collected.

reg_scatter_matrix()

Create pandas scatter matrix of regression variables.

rep_cond(irr_bal=False, percent_filter=20, w_vel=None, inplace=True, func={'poa': <function perc_wrap.<locals>.numpy_percentile>, 't_amb': 'mean', 'w_vel': 'mean'}, freq=None, grouper_kwargs={}, rc_kwargs={})

Calculate reporting conditions.

Parameters:
  • irr_bal (boolean, default False) – If true, uses the irr_rc_balanced function to determine the reporting conditions. Replaces the calculations specified by func with or without freq.

  • percent_filter (int, default 20) – Percentage as integer used to filter around reporting irradiance in the irr_rc_balanced function.

  • func (callable, string, dictionary, or list of string/callables) – Determines how the reporting condition is calculated. Default is a dictionary:

    poa - 60th numpy_percentile
    t_amb - mean
    w_vel - mean

    Can pass a string function (‘mean’) to calculate each reporting condition the same way.

  • freq (str) – String pandas offset alias to specify aggregation frequency for reporting condition calculation. Ex ‘60D’ for 60 Days or ‘MS’ for months start.

  • w_vel (int) – If w_vel is not none, then wind reporting condition will be set to value specified for predictions. Does not affect output unless pred is True and irr_bal is True.

  • inplace (bool, True by default) – When true updates the object’s rc attribute; when false returns a dictionary of reporting conditions.

  • grouper_kwargs (dict) – Passed to pandas Grouper to control label and closed side of intervals. See pandas Grouper documentation for details. Default is left labeled and left closed.

  • rc_kwargs (dict) – Passed to the irr_rc_balanced function if irr_bal is set to True.

Returns:

  • dict – Returns a dictionary of reporting conditions if inplace=False otherwise returns None.

  • pandas DataFrame – If pred=True, then returns a pandas dataframe of results.
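The default func for rep_cond (60th percentile of poa, mean of t_amb and w_vel) can be reproduced by hand on a small sample. This is a sketch with made-up data. The `percentile` helper is hypothetical and uses linear interpolation, which is the default method of numpy.percentile.

```python
from statistics import mean


def percentile(values, q):
    """Percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Made-up filtered measurements.
data = {
    "poa": [500.0, 600.0, 700.0, 800.0, 900.0],
    "t_amb": [20.0, 22.0, 24.0, 26.0, 28.0],
    "w_vel": [1.0, 2.0, 3.0, 4.0, 5.0],
}

# Same shape as the default func dictionary described above.
func = {"poa": lambda v: percentile(v, 60), "t_amb": mean, "w_vel": mean}

rc = {key: f(data[key]) for key, f in func.items()}
print(rc)  # poa is approximately 740, t_amb is 24.0, w_vel is 3.0
```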

reset_agg()

Remove aggregation columns from data and data_filtered attributes.

Does not reset filtering of data or data_filtered.

reset_filter()

Set data_filtered to data and reset filtering summary.

Parameters:

data (str) – ‘sim’ or ‘das’ determines if filter is on sim or das data.

review_column_groups()

Print column_groups with nice formatting.

scatter(filtered=True)

Create scatter plot of irradiance vs power.

Parameters:

filtered (bool, default true) – Plots filtered data when true and all data when false.

scatter_filters()

Returns an overlay of scatter plots of intervals removed for each filter.

A scatter plot of power vs irradiance is generated for the time intervals removed for each filtering step. Each of these plots is labeled and overlaid.

scatter_hv(timeseries=False, all_reg_columns=False)

Create holoviews scatter plot of irradiance vs power.

Use holoviews opts magics in notebook cell before calling method to adjust height and width of plots:

%%opts Scatter [height=200, width=400]
%%opts Curve [height=200, width=400]

Parameters:
  • timeseries (boolean, default False) – True adds timeseries plot of the data linked to the scatter plot. Points selected in the scatter plot will be highlighted in the timeseries plot.

  • all_reg_columns (boolean, default False) – Set to True to include the data used in the regression in addition to poa irradiance and power in the hover tooltip.

set_regression_cols(power='', poa='', t_amb='', w_vel='')

Create a dictionary linking the regression variables to data.

Links the independent regression variables to the appropriate translation keys. A column name may be used instead to specify a single column of data.

Sets attribute and returns nothing.

Parameters:
  • power (str) – Translation key for the power variable.

  • poa (str) – Translation key for the plane of array (poa) irradiance variable.

  • t_amb (str) – Translation key for the ambient temperature variable.

  • w_vel (str) – Translation key for the wind velocity variable.

set_test_complete(pts_required)

Sets test_complete attribute.

Parameters:

pts_required (int) – Number of points required to remain after filtering for a complete test.

spatial_uncert(column_groups)

Spatial uncertainties of the independent regression variables.

Parameters:

column_groups (list) – Measurement groups to calculate spatial uncertainty.

Return type:

None, stores dictionary of spatial uncertainties as an attribute.

timeseries_filters()

Returns an overlay of scatter plots of intervals removed for each filter.

A scatter plot of power vs irradiance is generated for the time intervals removed for each filtering step. Each of these plots is labeled and overlaid.

uncertainty()

Calculate random standard uncertainty of the regression.

(SEE times the square root of the leverage of the reporting conditions).

Not fully implemented yet. Need to review and determine what actual variable should be.

class captest.capdata.FilteredLocIndexer(_capdata)

Bases: object

Class to implement __getitem__ for indexing the CapData.data_filtered dataframe.

Allows passing a column_groups key, a list of column_groups keys, or a column or list of columns of the CapData.data_filtered dataframe.

class captest.capdata.LocIndexer(_capdata)

Bases: object

Class to implement __getitem__ for indexing the CapData.data dataframe.

Allows passing a column_groups key, a list of column_groups keys, or a column or list of columns of the CapData.data dataframe.

class captest.capdata.ReportingIrradiance(df, irr_col, **param)

Bases: Parameterized

params(df=DataFrame, irr_col=String, irr_rc=Number, max_percent_above=Integer, max_ref_irradiance=Integer, min_percent_below=Integer, min_ref_irradiance=Integer, percent_band=Integer, poa_flt=DataFrame, points_required=Integer, rc_irr_60th_perc=Number, total_pts=Number, name=String)

Parameters of ‘ReportingIrradiance’
===================================

Parameters changed from their default values are marked in red. Soft bound values are marked in cyan. C/V = Constant/Variable, RO/RW = ReadOnly/ReadWrite, AN = Allow None.

Name                Value      Type       Bounds   Mode
df                  None       DataFrame           V RW AN
irr_col             ‘GlobInc’  String              V RW
irr_rc              0.0        Number              V RW
poa_flt             None       DataFrame           V RW AN
total_pts           0.0        Number              V RW
rc_irr_60th_perc    0.0        Number              V RW
percent_band        20         Integer    (2, 50)  V RW
min_percent_below   40         Integer             V RW
max_percent_above   60         Integer             V RW
min_ref_irradiance  None       Integer             V RW AN
max_ref_irradiance  None       Integer             V RW AN
points_required     750        Integer             V RW

Parameter docstrings:
=====================

df: Data to use to calculate reporting irradiance.
irr_col: Name of column in df containing irradiance data.
irr_rc: < No docstring available >
poa_flt: < No docstring available >
total_pts: < No docstring available >
rc_irr_60th_perc: < No docstring available >
percent_band: < No docstring available >
min_percent_below: Minimum number of points as a percentage allowed below the reporting irradiance.
max_percent_above: Maximum number of points as a percentage allowed above the reporting irradiance.
min_ref_irradiance: Minimum value allowed for the reference irradiance.
max_ref_irradiance: Maximum value allowed for the reference irradiance. By default this maximum is calculated by dividing the highest irradiance value in df by high.
points_required: This value is only used in the plot to overlay a horizontal line on the plot of the total points.

dashboard()

df = None

get_rep_irr()

Calculates the reporting irradiance.

Returns:

Float reporting irradiance and filtered dataframe.

Return type:

Tuple

irr_col = 'GlobInc'
irr_rc = 0.0
max_percent_above = 60
max_ref_irradiance = None
min_percent_below = 40
min_ref_irradiance = None
name = 'ReportingIrradiance'
percent_band = 20

plot()

poa_flt = None
points_required = 750
rc_irr_60th_perc = 0.0

save_csv(output_csv_path)

Save possible reporting irradiance data to csv file at given path.

save_plot(output_plot_path=None)

Save a plot of the possible reporting irradiances and time intervals.

Saves plot as an html file at path given.

Parameters:

output_plot_path (str or Path) – Path to save plot to.

total_pts = 0.0

captest.capdata.abs_diff_from_average(series, threshold)

Check that each value in series is within threshold of the average of the other values.

Drops NaNs from series before calculating difference from average for each value.

Returns True if there is only one value in the series.

Parameters:
  • series (pd.Series) – Pandas series of values to check.

  • threshold (numeric) – Threshold value for absolute difference from average.

Return type:

bool
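The check described for abs_diff_from_average can be sketched on a plain list. This is an illustration of the described behavior, not the library's code. The function name `within_threshold_of_others` is hypothetical, and treating an empty input the same as a single value is an added assumption.

```python
import math


def within_threshold_of_others(values, threshold):
    """True if every value is within threshold of the mean of the others."""
    clean = [v for v in values if not math.isnan(v)]  # drop NaNs first
    if len(clean) <= 1:
        # A single value passes. An empty input also passes here (assumption).
        return True
    for i, value in enumerate(clean):
        others = clean[:i] + clean[i + 1:]
        if abs(value - sum(others) / len(others)) > threshold:
            return False
    return True


print(within_threshold_of_others([25.0, 25.4, 24.8], 1))      # True
print(within_threshold_of_others([25.0, 25.4, 30.0], 1))      # False
print(within_threshold_of_others([25.0, float("nan")], 1))    # True
```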

captest.capdata.captest_results(sim, das, nameplate, tolerance, check_pvalues=False, pval=0.05, print_res=True)

Print a summary indicating if system passed or failed capacity test.

NOTE: Method will try to adjust for 1000x differences in units.

Parameters:
  • sim (CapData) – CapData object for simulated data.

  • das (CapData) – CapData object for measured data.

  • nameplate (numeric) – Nameplate rating of the PV plant.

  • tolerance (str) – String representing error band, e.g. ‘+/- 3’, ‘- 5’. There must be a space between the sign and the number. The number is interpreted as a percent. For example, 5 percent is 5 not 0.05.

  • check_pvalues (boolean, default False) – Set to true to check p values for each coefficient. If a p value is greater than pval, then the coefficient is set to zero.

  • pval (float, default 0.05) – p value to use as cutoff. Regression coefficients with a p value greater than pval will be set to zero.

  • print_res (boolean, default True) – Set to False to prevent printing results.

Returns:

Capacity test ratio: the capacity calculated from the reporting conditions and the measured data divided by the capacity calculated from the reporting conditions and the simulated data.
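The capacity test ratio and the way a tolerance might be applied to it can be sketched as follows. The pass or fail rules here are an assumption for illustration: ‘+/-’ is read as a band around 1, ‘-’ as a lower limit only, and ‘+’ as an upper limit only. The helper names and capacities are made up.

```python
def capacity_ratio(cap_measured, cap_simulated):
    """Measured capacity divided by simulated capacity."""
    return cap_measured / cap_simulated


def passes(ratio, sign, percent):
    """Apply a tolerance to the ratio (interpretation of signs is assumed)."""
    error = percent / 100  # 5 means 5 percent
    if sign == "+/-":
        return abs(1 - ratio) <= error
    if sign == "-":
        return ratio >= 1 - error
    if sign == "+":
        return ratio <= 1 + error
    raise ValueError(f"unrecognized sign: {sign!r}")


ratio = capacity_ratio(970.0, 1000.0)
print(ratio)                     # 0.97
print(passes(ratio, "-", 5))     # True: within 5 percent below
print(passes(ratio, "+/-", 2))   # False: more than 2 percent from 1
```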

captest.capdata.captest_results_check_pvalues(sim, das, nameplate, tolerance, print_res=False, **kwargs)

Print a summary of the capacity test results.

Capacity ratio is the capacity calculated from the reporting conditions and the measured data divided by the capacity calculated from the reporting conditions and the simulated data.

The tolerance is applied to the capacity test ratio to determine if the test passes or fails.

Parameters:
  • sim (CapData) – CapData object for simulated data.

  • das (CapData) – CapData object for measured data.

  • nameplate (numeric) – Nameplate rating of the PV plant.

  • tolerance (str) – String representing error band. Ex. ‘+ 3’, ‘+/- 3’, ‘- 5’ There must be space between the sign and number. Number is interpreted as a percent. For example, 5 percent is 5 not 0.05.

  • print_res (boolean, default False) – Set to True to print results.

  • **kwargs – kwargs are passed to captest_results. See documentation for captest_results for options. check_pvalues is set in this method, so do not pass again.

Prints:

  • Capacity ratio after setting parameters with high p-values to zero.

  • P-values for simulated and measured regression coefficients.

  • Regression coefficients (parameters) for simulated and measured data.

captest.capdata.check_all_perc_diff_comb(series, perc_diff)

Check series for pairs of values with percent difference above perc_diff.

Calculates the percent difference between all combinations of two values in the passed series and checks if all of them are below the passed perc_diff.

Parameters:
  • series (pd.Series) – Pandas series of values to check.

  • perc_diff (float) – Percent difference threshold value as decimal i.e. 5% is 0.05.

Return type:

bool
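
The pairwise check described above can be sketched in plain Python. This is an illustration only, not the library's code: the helper names are hypothetical, it works on a list rather than a pandas Series, and it assumes the percent difference is the absolute difference divided by the mean of the two values.

```python
from itertools import combinations


def perc_difference_sketch(x, y):
    """Symmetric percent difference as a decimal (assumed definition)."""
    if x == y:
        return 0.0  # also covers the 0, 0 case without dividing by zero
    # Assumes x + y is not zero, which holds for positive sensor readings.
    return abs(x - y) / ((x + y) / 2)


def all_pairs_within(values, perc_diff):
    """Return True if every pair of values differs by less than perc_diff."""
    return all(
        perc_difference_sketch(x, y) < perc_diff
        for x, y in combinations(values, 2)
    )


print(all_pairs_within([100, 102, 104], 0.05))  # True
print(all_pairs_within([100, 110], 0.05))       # False
```

A row of redundant sensor readings passes only when no two sensors disagree by the threshold or more, which is how sensor_filter is described as using this check.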

captest.capdata.csky(time_source, loc=None, sys=None, concat=True, output='both')

Calculate clear sky poa and ghi.

Parameters:
  • time_source (dataframe or DatetimeIndex) – If passing a dataframe the index of the dataframe will be used. If the index does not have a timezone the timezone will be set using the timezone in the passed loc dictionary. If passing a DatetimeIndex with a timezone it will be returned directly. If passing a DatetimeIndex without a timezone the timezone in the loc dictionary will be used.

  • loc (dict) –

    Dictionary of values required to instantiate a pvlib Location object.

    loc = {‘latitude’: float, ‘longitude’: float, ‘altitude’: float/int, ‘tz’: str, int, float, or pytz.timezone, default ‘UTC’}

    See http://en.wikipedia.org/wiki/List_of_tz_database_time_zones for a list of valid time zones. pytz.timezone objects will be converted to strings. ints and floats must be in hours from UTC.

  • sys (dict) –

    Dictionary of keywords required to create a pvlib SingleAxisTrackerMount or FixedMount.

    Example dictionaries:

    fixed_sys = {‘surface_tilt’: 20, ‘surface_azimuth’: 180, ‘albedo’: 0.2}

    tracker_sys1 = {‘axis_tilt’: 0, ‘axis_azimuth’: 0, ‘max_angle’: 90, ‘backtrack’: True, ‘gcr’: 0.2, ‘albedo’: 0.2}

    Refer to pvlib documentation for details.

  • concat (bool, default True) – If concat is True, returns the columns selected by the output argument added to the passed dataframe; otherwise returns only the clear sky data.

  • output (str, default 'both') – Selects which clear sky columns are returned:

    both - returns only total poa and ghi

    poa_all - returns all components of poa

    ghi_all - returns all components of ghi

    all - returns all components of poa and ghi

captest.capdata.determine_pass_or_fail(cap_ratio, tolerance, nameplate)

Determine a pass/fail result from a capacity ratio and test tolerance.

Parameters:
  • cap_ratio (float) – Ratio of the measured data regression result to the simulated data regression result.

  • tolerance (str) – String representing error band. Ex. ‘+/- 3’ or ‘- 5’ There must be space between the sign and number. Number is interpreted as a percent. For example, 5 percent is 5 not 0.05.

  • nameplate (numeric) – Nameplate rating of the PV plant.

Returns:

True for a passing test and false for a failing test. Limits for passing and failing test.

Return type:

tuple of boolean and string
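
As a rough illustration of how a tolerance string could be turned into a pass/fail result, here is a minimal sketch. It is not the library's implementation: the function name is hypothetical, only the ‘+/-’ and ‘-’ forms are handled, and the comparison rules and the format of the limits string are assumptions.

```python
def pass_or_fail_sketch(cap_ratio, tolerance, nameplate):
    """Return (passed, limits) for a tolerance such as '+/- 3' or '- 5'."""
    # The space between sign and number is what makes this split work.
    sign, number = tolerance.split(" ")
    error = float(number) / 100  # '5' means 5 percent
    lower = nameplate * (1 - error)
    upper = nameplate * (1 + error)
    if sign == "+/-":
        # Two-sided band (assumed): the ratio must stay within +/- error of 1.
        return abs(1 - cap_ratio) <= error, f"{lower}, {upper}"
    if sign == "-":
        # One-sided band (assumed): only a shortfall can fail the test.
        return cap_ratio >= 1 - error, f"{lower}, None"
    raise ValueError(f"Unsupported tolerance sign: {sign!r}")


passed, limits = pass_or_fail_sketch(0.98, "+/- 3", 100)
print(passed, limits)
```

The sketch shows why the documented format matters: the number is read as a percent, so ‘- 5’ means the measured capacity may fall up to 5 percent below the simulated capacity.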

captest.capdata.filter_grps(grps, rcs, irr_col, low, high, freq, **kwargs)

Apply irradiance filter around passed reporting irradiances to groupby.

For each group in the grps argument the irradiance is filtered by a percentage around the reporting irradiance provided in rcs.

Parameters:
  • grps (pandas groupby) – Groupby object with time groups (months, seasons, etc.).

  • rcs (pandas DataFrame) – Dataframe of reporting conditions. Use the rep_cond method to generate a dataframe for this argument.

  • irr_col (str) – String that is the name of the column with the irradiance data.

  • low (float) – Minimum value as fraction e.g. 0.8.

  • high (float) – Max value as fraction e.g. 1.2.

  • freq (str) – Frequency to groupby e.g. ‘MS’ for month start.

  • **kwargs – Passed to pandas Grouper to control label and closed side of intervals. See pandas Grouper documentation for details. Default is left labeled and left closed.

Return type:

pandas groupby

captest.capdata.filter_irr(df, irr_col, low, high, ref_val=None)

Top level filter on irradiance values.

Parameters:
  • df (DataFrame) – Dataframe to be filtered.

  • irr_col (str) – String that is the name of the column with the irradiance data.

  • low (float or int) – Minimum value as fraction (0.8) or absolute 200 (W/m^2)

  • high (float or int) – Max value as fraction (1.2) or absolute 800 (W/m^2)

  • ref_val (float or int) – Must provide arg when low/high are fractions

Return type:

DataFrame
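
The relationship between fractional and absolute bounds can be sketched without pandas. The helper names below are hypothetical, rows are plain dictionaries instead of a DataFrame, and treating both bounds as inclusive is an assumption.

```python
def irr_bounds(low, high, ref_val=None):
    """Scale fractional bounds to absolute W/m^2 when ref_val is given."""
    if ref_val is not None:
        return low * ref_val, high * ref_val
    return low, high


def filter_irr_sketch(rows, irr_col, low, high, ref_val=None):
    """Keep rows whose irradiance lies within the bounds (inclusive, assumed)."""
    lower, upper = irr_bounds(low, high, ref_val)
    return [row for row in rows if lower <= row[irr_col] <= upper]


rows = [{"poa": 350}, {"poa": 450}, {"poa": 550}, {"poa": 650}]

# Fractions of a 500 W/m^2 reference value, i.e. roughly 400 to 600 W/m^2.
kept = filter_irr_sketch(rows, "poa", 0.8, 1.2, ref_val=500)
print([row["poa"] for row in kept])  # [450, 550]

# Absolute bounds, no reference value needed.
kept = filter_irr_sketch(rows, "poa", 200, 500)
print([row["poa"] for row in kept])  # [350, 450]
```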

captest.capdata.fit_model(df, fml='power ~ poa + I(poa * poa) + I(poa * t_amb) + I(poa * w_vel) - 1')

Fits linear regression using statsmodels to dataframe passed.

Dataframe must be first argument for use with pandas groupby object apply method.

Parameters:
  • df (pandas dataframe) –

  • fml (str) – Formula to fit; refer to statsmodels and patsy documentation for format. Default is the formula in ASTM E2848.

Return type:

Statsmodels linear model regression results wrapper object.
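
Read term by term, the default formula has no intercept (the trailing "- 1") and four coefficients, so the fitted model can be written as power = poa * (a1 + a2 * poa + a3 * t_amb + a4 * w_vel). The sketch below evaluates that expression directly. The function name and the coefficient values are hypothetical and chosen only for illustration; real coefficients come from the regression results.

```python
def astm_e2848_power(poa, t_amb, w_vel, coeffs):
    """Evaluate power = poa * (a1 + a2*poa + a3*t_amb + a4*w_vel)."""
    a1, a2, a3, a4 = coeffs
    return poa * (a1 + a2 * poa + a3 * t_amb + a4 * w_vel)


# Illustrative coefficients in the order of the formula terms:
# poa, poa * poa, poa * t_amb, poa * w_vel.
coeffs = (0.9, -1e-4, -2e-3, 1e-3)

# Predicted power at reporting conditions of 1000 W/m^2, 20 degrees C, 2 m/s.
print(astm_e2848_power(1000, 20, 2, coeffs))
```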

captest.capdata.get_summary(*args)

Return summary dataframe of filtering steps for multiple CapData objects.

See documentation for the CapData.get_summary method for additional details.

captest.capdata.get_tz_index(time_source, loc)

Create DatetimeIndex with timezone aligned with location dictionary.

Handles generating a DatetimeIndex with a timezone for use as an argument to pvlib ModelChain prepare_inputs method or pvlib Location get_clearsky method.

Parameters:

time_source (dataframe or DatetimeIndex) – If passing a dataframe the index of the dataframe will be used. If the index does not have a timezone the timezone will be set using the timezone in the passed loc dictionary. If passing a DatetimeIndex with a timezone it will be returned directly. If passing a DatetimeIndex without a timezone the timezone in the loc dictionary will be used.

Return type:

DatetimeIndex with timezone

captest.capdata.highlight_pvals(s)

Highlight vals greater than or equal to 0.05 in a Series yellow.

captest.capdata.index_capdata(capdata, label, filtered=True)

Like Dataframe.loc but for CapData objects.

Pass a single label or list of labels to select the columns from the data or data_filtered DataFrames. The label can be a column name, a column group key, or a regression column key.

The special label regcols will return the columns identified in regression_cols.

Parameters:
  • capdata (CapData) – The CapData object to select from.

  • label (str or list) – The label or list of labels to select from the data or data_filtered DataFrames. The label can be a column name, a column group key, or a regression column key. The special label regcols will return the columns identified in regression_cols.

  • filtered (bool, default True) – By default the method will return columns from the data_filtered DataFrame. Set to False to return columns from the data DataFrame.

Return type:

DataFrame

captest.capdata.overlay_scatters(measured, expected, expected_label='PVsyst')

Plot labeled overlay scatter of final filtered measured and simulated data.

Parameters:
  • measured (Overlay) – Holoviews overlay scatter plot produced from CapData object used to calculate reporting conditions.

  • expected (Overlay) – Holoviews overlay scatter plot produced from CapData object not used to calculate reporting conditions.

  • rcs_from_meas (bool) – If test was run calculating reporting conditions from measured or simulated data.

Returns:

Overlay scatter plot of remaining data after filtering from measured and simulated data.

captest.capdata.perc_bounds(percent_filter)

Convert +/- percentage to decimals to be used to determine bounds.

Parameters:

percent_filter (float or tuple, default None) – Percentage or tuple of percentages used to filter around reporting irradiance in the irr_rc_balanced function. Required argument when irr_bal is True.

Returns:

Decimal versions of the percent irradiance filter. 0.8 and 1.2 would be returned when passing 20 to the input.

Return type:

tuple
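
The conversion can be sketched as follows. The function name is hypothetical, and reading a tuple as (percent below, percent above) is an assumption; only the single-number case (20 giving 0.8 and 1.2) is stated in the documentation above.

```python
def perc_bounds_sketch(percent_filter):
    """Convert a +/- percentage (or a pair of percentages) to decimal bounds."""
    if isinstance(percent_filter, tuple):
        # Assumed order: (percent below, percent above) the reporting irradiance.
        perc_low, perc_high = (p / 100 for p in percent_filter)
    else:
        perc_low = perc_high = percent_filter / 100
    return 1 - perc_low, 1 + perc_high


low, high = perc_bounds_sketch(20)
print(low, high)

low, high = perc_bounds_sketch((15, 25))
print(low, high)
```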

captest.capdata.perc_difference(x, y)

Calculate percent difference of two values.

captest.capdata.perc_wrap(p)

Wrap numpy percentile function for use in rep_cond method.

captest.capdata.pick_attr(sim, das, name)

Check for conflict between attributes of two CapData objects.

captest.capdata.pred_summary(grps, rcs, allowance, **kwargs)

Summarize reporting conditions, predicted cap, and guaranteed cap.

This method does not calculate reporting conditions.

Parameters:
  • grps (pandas groupby object) – Solar data grouped by season or month used to calculate reporting conditions. This argument is used to fit models for each group.

  • rcs (pandas dataframe) – Dataframe of reporting conditions used to predict capacities.

  • allowance (float) – Percent allowance to calculate guaranteed capacity from predicted capacity.

Returns:

Dataframe of reporting conditions, model coefficients, predicted capacities, guaranteed capacities, and points in each grouping.

captest.capdata.predict(regs, rcs)

Calculate predicted values for given linear models and predictor values.

Evaluates the first linear model in the iterable with the first row of the predictor values in the dataframe. Passed arguments must be aligned.

Parameters:
  • regs (iterable of statsmodels regression results wrappers) –

  • rcs (pandas dataframe) – Dataframe of predictor values used to evaluate each linear model. The column names must match the strings used in the regression formula.

Return type:

Pandas series of predicted values.

captest.capdata.print_results(test_passed, expected, actual, cap_ratio, capacity, bounds)

Print formatted results of capacity test.

captest.capdata.pvlib_location(loc)

Create a pvlib location object.

Parameters:

loc (dict) –

Dictionary of values required to instantiate a pvlib Location object.

loc = {‘latitude’: float, ‘longitude’: float, ‘altitude’: float/int, ‘tz’: str, int, float, or pytz.timezone, default ‘UTC’}

See http://en.wikipedia.org/wiki/List_of_tz_database_time_zones for a list of valid time zones. pytz.timezone objects will be converted to strings. ints and floats must be in hours from UTC.

Return type:

pvlib location object.

captest.capdata.pvlib_system(sys)

Create a pvlib PVSystem object.

The PVSystem will have either a FixedMount or a SingleAxisTrackerMount depending on the keys of the passed dictionary.

Parameters:

sys (dict) –

Dictionary of keywords required to create a pvlib SingleAxisTrackerMount or FixedMount, plus albedo.

Example dictionaries:

fixed_sys = {‘surface_tilt’: 20, ‘surface_azimuth’: 180, ‘albedo’: 0.2}

tracker_sys1 = {‘axis_tilt’: 0, ‘axis_azimuth’: 0, ‘max_angle’: 90, ‘backtrack’: True, ‘gcr’: 0.2, ‘albedo’: 0.2}

Refer to pvlib documentation for details.

Return type:

pvlib PVSystem object.
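
Since the mount type depends on the keys of the passed dictionary, the choice can be illustrated with a small sketch. The helper name is hypothetical and the exact keys the library inspects are an assumption; the two dictionaries are the examples given above.

```python
def mount_kind(sys):
    """Guess which pvlib mount a sys dictionary describes from its keys."""
    tracker_keys = {"axis_tilt", "axis_azimuth", "max_angle", "backtrack", "gcr"}
    fixed_keys = {"surface_tilt", "surface_azimuth"}
    keys = set(sys)
    if keys & tracker_keys:
        return "SingleAxisTrackerMount"
    if keys & fixed_keys:
        return "FixedMount"
    raise ValueError("sys has neither fixed-tilt nor tracker keys")


fixed_sys = {"surface_tilt": 20, "surface_azimuth": 180, "albedo": 0.2}
tracker_sys1 = {
    "axis_tilt": 0,
    "axis_azimuth": 0,
    "max_angle": 90,
    "backtrack": True,
    "gcr": 0.2,
    "albedo": 0.2,
}

print(mount_kind(fixed_sys))     # FixedMount
print(mount_kind(tracker_sys1))  # SingleAxisTrackerMount
```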

captest.capdata.round_kwarg_floats(kwarg_dict, decimals=3)

Round float values in a dictionary.

Parameters:
  • kwarg_dict (dict) –

  • decimals (int, default 3) – Number of decimal places to round to.

Returns:

Dictionary with rounded floats.

Return type:

dict

captest.capdata.run_test(cd, steps)

Apply a list of capacity test steps to a given CapData object.

A list of CapData methods is applied sequentially with the passed parameters. This method allows succinctly defining a capacity test, which facilitates parametric and automatic testing.

Parameters:
  • cd (CapData) – The CapData methods will be applied to this instance of the pvcaptest CapData class.

  • steps (list of tuples) – A list of the methods to be applied and the arguments to be used. Each item in the list should be a tuple of the CapData method followed by a tuple of arguments and a dictionary of keyword arguments. If there are no args or kwargs, an empty tuple or dict should be included. Example: [(CapData.filter_irr, (400, 1500), {})]
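
The structure of the steps list can be illustrated with a stand-in class so the example runs without pvcaptest installed. ToyCapData, its two methods, and run_steps are all hypothetical; the loop is one plausible way to apply the steps, not necessarily how run_test is written.

```python
class ToyCapData:
    """Stand-in for CapData, used only so this sketch runs on its own."""

    def __init__(self, values):
        self.data_filtered = list(values)

    def filter_irr(self, low, high):
        # Keep values inside the closed interval [low, high].
        self.data_filtered = [v for v in self.data_filtered if low <= v <= high]

    def keep_first(self, n=1):
        # Keep only the first n remaining values.
        self.data_filtered = self.data_filtered[:n]


def run_steps(cd, steps):
    """Apply each (method, args, kwargs) tuple to cd in order."""
    for method, args, kwargs in steps:
        method(cd, *args, **kwargs)


steps = [
    (ToyCapData.filter_irr, (400, 1500), {}),  # positional args, no kwargs
    (ToyCapData.keep_first, (), {"n": 2}),     # no positional args, one kwarg
]

cd = ToyCapData([100, 450, 800, 1200, 1600])
run_steps(cd, steps)
print(cd.data_filtered)  # [450, 800]
```

Because the methods are referenced unbound (ToyCapData.filter_irr rather than cd.filter_irr), the same steps list can be reused on several objects, which is what makes parametric testing convenient.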

captest.capdata.sensor_filter(df, threshold, row_filter=<function check_all_perc_diff_comb>)

Check dataframe for rows with inconsistent values.

Applies check_all_perc_diff_comb function along rows of passed dataframe.

Parameters:
  • df (pandas DataFrame) –

  • threshold (float) – Percent difference as decimal.

captest.capdata.spans_year(start_date, end_date)

Determine if dates passed are in the same year.

Parameters:
  • start_date (pandas Timestamp) –

  • end_date (pandas Timestamp) –

Return type:

bool

captest.capdata.tstamp_kwarg_to_strings(kwarg_dict)

Convert timestamp values in dictionary to strings.

Parameters:

kwarg_dict (dict) –

Return type:

dict

captest.capdata.update_summary(func)

Decorates the CapData class filter methods.

Updates the CapData.summary and CapData.summary_ix attributes, which are used to generate summary data by the CapData.get_summary method.

captest.capdata.wrap_seasons(df, freq)

Rearrange an 8760 so a quarterly groupby will result in seasonal groups.

Parameters:
  • df (DataFrame) – Dataframe to be rearranged.

  • freq (str) – String pandas offset alias to specify aggregation frequency for reporting condition calculation.

Return type:

DataFrame

captest.capdata.wrap_year_end(df, start, end)

Shifts data before or after new year to form a contiguous time period.

This function shifts data from the end of the year a year back or data from the beginning of the year a year forward, to create a contiguous time period. Intended to be used on historical typical year data.

If start date is in dataframe, then data at the beginning of the year will be moved ahead one year. If end date is in dataframe, then data at the end of the year will be moved back one year.

cntg (contiguous); eoy (end of year)

Parameters:
  • df (pandas DataFrame) – Dataframe to be adjusted.

  • start (pandas Timestamp) – Start date for time period.

  • end (pandas Timestamp) – End date for time period.

captest.columngroups module

class captest.columngroups.ColumnGroups(dict=None, /, **kwargs)

Bases: UserDict

captest.columngroups.group_columns(data)

Create a dict of raw column names paired to categorical column names.

Uses multiple type_def formatted dictionaries to determine the type, sub-type, and equipment type for data series of a dataframe. The determined types are concatenated to a string used as a dictionary key with a list of one or more original column names as the paired value.

Parameters:

data (DataFrame) – Data with columns to group.

Returns:

cg

Return type:

ColumnGroups

captest.columngroups.series_type(series, type_defs)

Assign columns to a category by analyzing the column names.

The type_defs parameter is a dictionary which defines search strings for each key, where the key is a categorical name and the search strings are possible related names. For example an irradiance sensor has the key ‘irr’ with search strings ‘irradiance’ ‘plane of array’, ‘poa’, etc.

Parameters:
  • series (pandas series) – Row or column of dataframe passed by pandas.df.apply.

  • type_defs (dictionary) – Dictionary with the following structure. See type_defs {‘category abbreviation’: [category search strings]}

Returns:

Returns a string representing the category for the series.

Return type:

string
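
A minimal sketch of matching a column name against search strings follows. The function name, the example type_defs, the case-insensitive substring match, and the empty-string result for unmatched names are all assumptions made for illustration.

```python
def series_type_sketch(column_name, type_defs):
    """Return the first category whose search strings appear in the name."""
    name = column_name.lower()
    for category, search_strings in type_defs.items():
        if any(search in name for search in search_strings):
            return category
    return ""  # no category matched


# Hypothetical definitions in the documented shape:
# {'category abbreviation': [category search strings]}
type_defs = {
    "irr": ["irradiance", "plane of array", "poa"],
    "temp": ["temperature", "temp"],
}

print(series_type_sketch("POA Irradiance W/m2", type_defs))  # irr
print(series_type_sketch("Ambient Temp C", type_defs))       # temp
print(repr(series_type_sketch("Meter kW", type_defs)))       # ''
```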

captest.io module

class captest.io.DataLoader(path: str = './data/', loc: dict | None = None, sys: dict | None = None, file_reader: object = <function file_reader>, files_to_load: list | None = None, failed_to_load: list | None = None)

Bases: object

Class to load SCADA data and return a CapData object.

drop_duplicate_rows()
failed_to_load: list | None = None
file_reader(**kwargs)

Read measured solar data from a csv file.

Utilizes pandas read_csv to import measured solar data from a csv file. Attempts a few different encodings, tries to determine the header end by looking for a date in the first column, and concatenates column headings to a single string.

Parameters:
  • path (Path) – Path to file to import.

  • **kwargs – Use to pass additional kwargs to pandas read_csv.

Return type:

pandas DataFrame

files_to_load: list | None = None
load(extension='csv', verbose=True, print_errors=False, **kwargs)

Load file(s) of timeseries data from SCADA / DAS systems.

Set path to the path to a file to load a single file. Set path to the path to a directory of files to load all the files in the directory ending in “csv”. Or, set files_to_load to a list of specific files to load.

Multiple files will be joined together and may include files with different column headings. When multiple files with matching column headings are loaded, the individual files will be reindexed and then joined.

Missing time intervals within the individual files will be filled, but missing time intervals between the individual files will not be filled.

When loading multiple files they will be stored in loaded_files, a dictionary, mapping the file names to a dataframe for each file.

Parameters:
  • extension (str, default "csv") – Change the extension to allow loading different filetypes. Must also set the file_reader attribute to a function that will read that type of file. Do not include a period “.”.

  • verbose (bool, default True) – By default prints path of each file attempted to load and then confirmation it was loaded or states it failed to load. Is only relevant if path is set to a directory not a file. Set to False to not print out any file loading status.

  • print_errors (bool, default False) – Set to true to print error if file fails to load.

  • **kwargs – Are passed through to the file_reader callable, which by default will pass them on to pandas.read_csv.

Returns:

Resulting DataFrame of data is stored to the data attribute.

Return type:

None

loc: dict | None = None
path: str = './data/'
reindex()
set_files_to_load(extension='csv')

Set files_to_load attribute to a list of filepaths.

sort_data()
sys: dict | None = None
captest.io.file_reader(path, **kwargs)

Read measured solar data from a csv file.

Utilizes pandas read_csv to import measured solar data from a csv file. Attempts a few different encodings, tries to determine the header end by looking for a date in the first column, and concatenates column headings to a single string.

Parameters:
  • path (Path) – Path to file to import.

  • **kwargs – Use to pass additional kwargs to pandas read_csv.

Return type:

pandas DataFrame

captest.io.flatten_multi_index(columns)
captest.io.load_data(path, group_columns=<function group_columns>, file_reader=<function file_reader>, name='meas', sort=True, drop_duplicates=True, reindex=True, site=None, column_groups_template=False, verbose=False, **kwargs)

Load file(s) of timeseries data from SCADA / DAS systems.

This is a convenience function to generate an instance of DataLoader and call the load method.

A single file or multiple files can be loaded. Multiple files will be joined together and may include files with different column headings.

Parameters:
  • path (str) – Path to either a single file to load or a directory of files to load.

  • group_columns (function or str, default columngroups.group_columns) – Function to use to group the columns of the loaded data. Function should accept a DataFrame and return a dictionary with keys that are ids and values that are lists of column names. Will be set to the group_columns attribute of the CapData.DataLoader object. Provide a string to load column grouping from a json, yaml, or excel file. The json or yaml file should parse to a dictionary and the excel file should have two columns with the first column containing the group ids and the second column the column names. The first column may have missing values. See function load_excel_column_groups for more details.

  • file_reader (function, default io.file_reader) – Function to use to load an individual file. By default will use the built in file_reader function to try to load csv files. If passing a function to read other filetypes, the kwargs should include the filetype extension e.g. ‘parquet’.

  • name (str) – Identifier that will be assigned to the returned CapData instance.

  • sort (bool, default True) – By default sorts the data by the datetime index from old to new.

  • drop_duplicates (bool, default True) – By default drops rows of the joined data where all the columns are duplicates of another row. Keeps the first instance of the duplicated values. This is helpful if individual data files have overlapping rows with the same data.

  • reindex (bool, default True) – By default will create a new index for the data using the earliest datetime, latest datetime, and the most frequent time interval ensuring there are no missing intervals.

  • site (dict or str, default None) – Pass a dictionary or path to a json or yaml file containing site data, which will be used to generate modeled clear sky ghi and poa values. The clear sky irradiance values are added to the data and the column_groups attribute is updated to include these two irradiance columns. The site data dictionary should be {sys: {system data}, loc: {location data}}. See the capdata.csky documentation for the format of the system data and location data.

  • column_groups_template (bool, default False) – If True, will call CapData.data_columns_to_excel to save a file to use to manually create column groupings at path.

  • verbose (bool, default False) – Set to True to print status of file loading.

  • **kwargs – Passed to DataLoader.load, which passes them to the file_reader function. The default file_reader function passes them to pandas.read_csv.

captest.io.load_excel_column_groups(path)

Load column groups from an excel file.

The excel file should have two columns with no header. The first column contains the group names and the second column contains the column names of the data. The first column may have blanks rather than repeating the group name for each column in the group.

For example:

    group1, col1
          , col2
          , col3
    group2, col4
          , col5

Parameters:

path (str) – Path to file to import.

Returns:

Dictionary mapping column group names to lists of column names.

Return type:

dict
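
The blank-filling behaviour described above amounts to carrying the last seen group name forward. A minimal sketch on plain tuples, assuming blanks arrive as None or an empty string (the function name is hypothetical and no excel file is read here):

```python
def rows_to_column_groups(rows):
    """Build {group: [columns]} from (group, column) rows with blank groups."""
    groups = {}
    current = None
    for group, column in rows:
        if group:  # a non-blank first cell starts a new group
            current = group
        if current is None:
            raise ValueError("first row must name a group")
        groups.setdefault(current, []).append(column)
    return groups


rows = [
    ("group1", "col1"),
    (None, "col2"),
    (None, "col3"),
    ("group2", "col4"),
    (None, "col5"),
]
print(rows_to_column_groups(rows))
# {'group1': ['col1', 'col2', 'col3'], 'group2': ['col4', 'col5']}
```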

captest.io.load_pvsyst(path, name='pvsyst', egrid_unit_adj_factor=None, set_regression_columns=True, **kwargs)

Load data from a PVsyst energy production model.

Will load day first or month first dates. Expects files that use a comma as a separator rather than a semicolon.

Parameters:
  • path (str) – Path to file to import.

  • name (str, default pvsyst) – Name to assign to returned CapData object.

  • egrid_unit_adj_factor (numeric, default None) – E_Grid will be divided by the value passed.

  • set_regression_columns (bool, default True) – By default sets power to E_Grid, poa to GlobInc, t_amb to T Amb, and w_vel to WindVel. Set to False to not set regression columns on load.

  • **kwargs – Use to pass additional kwargs to pandas read_csv. Pass sep=’;’ to load files that use semicolons instead of commas as the separator.

Return type:

CapData

Notes

Standardizes the ambient temperature column name to T_Amb. v6.63 of PVsyst used “T Amb”, v.6.87 uses “T_Amb”, and v7.2 uses “T_Amb”. Will change ‘T Amb’ or ‘TAmb’ to ‘T_Amb’ if found in the column names.

captest.prtest module

class captest.prtest.PrResults(**params)

Bases: Parameterized

params(dc_nameplate=Number, expected_pr=Number, input_data=ClassSelector, pr=Number, results_data=ClassSelector, timestep=Tuple, name=String)

Results from a PR calculation.

Parameters of ‘PrResults’

C/V = Constant/Variable, RO/RW = ReadOnly/ReadWrite, AN = Allow None

Name          Value   Type           Bounds     Mode
dc_nameplate  0.0     Number         (0, None)  V RW
pr            0.0     Number                    V RW
timestep      (0, 0)  Tuple                     V RW
expected_pr   0.0     Number         (0, 1)     V RW
input_data    None    ClassSelector             V RW AN
results_data  None    ClassSelector             V RW AN

Parameter docstrings:

  • dc_nameplate: Summation of nameplate ratings (W) for all installed modules of system.

  • pr: Performance ratio result decimal fraction.

  • timestep: Timestep of series.

  • expected_pr: Expected Performance ratio result decimal fraction.

  • input_data: < No docstring available >

  • results_data: < No docstring available >

dc_nameplate = 0.0
expected_pr = 0.0
input_data = None
name = 'PrResults'
pr = 0.0
print_pr_result()

Print summary of PR result - passing / failing and by how much

results_data = None
timestep = (0, 0)
captest.prtest.avg_typ_cell_temp(poa, cell_temp)

Calculate irradiance weighted cell temperature.

Parameters:
  • poa (Series) – POA irradiance (W/m^2).

  • cell_temp (Series) – Cell temperature for each interval (degrees C).

Returns:

Average irradiance-weighted cell temperature.

Return type:

float
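
An irradiance-weighted average gives more weight to intervals with high POA irradiance. A minimal sketch on plain lists, assuming the weighting is sum(poa * cell_temp) / sum(poa) (the function name is hypothetical):

```python
def avg_typ_cell_temp_sketch(poa, cell_temp):
    """Irradiance-weighted average of cell temperature."""
    weighted_sum = sum(p * t for p, t in zip(poa, cell_temp))
    return weighted_sum / sum(poa)


poa = [0, 500, 1000]       # W/m^2
cell_temp = [10, 30, 50]   # degrees C

# The night-time interval (0 W/m^2) does not contribute to the average.
print(avg_typ_cell_temp_sketch(poa, cell_temp))
```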

captest.prtest.back_of_module_temp(poa, temp_amb, wind_speed, module_type='glass_cell_poly', racking='open_rack')

Calculate back of module temperature from measured weather data.

Calculate back of module temperature from POA irradiance, ambient temperature, wind speed (at height of 10 meters), and empirically derived heat transfer coefficients.

Equation from NREL Weather Corrected Performance Ratio Report.

Parameters:
  • poa (numeric or Series) – POA irradiance in W/m^2.

  • temp_amb (numeric or Series) – Ambient temperature in degrees C.

  • wind_speed (numeric or Series) – Measured wind speed (m/sec) corrected to measurement height of 10 meters.

  • module_type (str, default 'glass_cell_poly') – Any of ‘glass_cell_poly’, ‘glass_cell_glass’, or ‘poly_tf_steel’.

  • racking (str, default 'open_rack') – Any of ‘open_rack’, ‘close_roof_mount’, or ‘insulated_back’

Returns:

Back of module temperatures.

Return type:

numeric or Series

captest.prtest.cell_temp(bom, poa, module_type='glass_cell_poly', racking='open_rack')

Calculate cell temp from BOM temp, POA, and heat transfer coefficient.

Equation from NREL Weather Corrected Performance Ratio Report.

Parameters:
  • bom (numeric or Series) –

    Back of module temperature (degrees C). Strictly following the NREL procedure this value would be obtained from the back_of_module_temp function.

    Alternatively, a measured BOM temperature may be used.

    Refer to p.7 of NREL Weather Corrected Performance Ratio Report.

  • poa (numeric or Series) – POA irradiance in W/m^2.

  • module_type (str, default 'glass_cell_poly') – Any of ‘glass_cell_poly’, ‘glass_cell_glass’, or ‘poly_tf_steel’.

  • racking (str, default 'open_rack') – Any of ‘open_rack’, ‘close_roof_mount’, or ‘insulated_back’

Returns:

Cell temperature(s).

Return type:

numeric or Series
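
The two temperature steps can be sketched together for a single module and racking combination. The equations are the commonly cited form from the NREL report, T_bom = poa * exp(a + b * wind_speed) + temp_amb and T_cell = T_bom + (poa / 1000) * delta_T. The coefficient values below are quoted from memory for a glass/cell/polymer module on an open rack and should be checked against the report; the function names are hypothetical.

```python
import math

# Assumed empirical coefficients (glass/cell/polymer sheet, open rack).
A = -3.56      # dimensionless
B = -0.075     # s/m
DELTA_T = 3.0  # degrees C at 1000 W/m^2


def bom_temp_sketch(poa, temp_amb, wind_speed):
    """Back of module temperature from POA, ambient temp, and wind speed."""
    return poa * math.exp(A + B * wind_speed) + temp_amb


def cell_temp_sketch(bom, poa):
    """Cell temperature from back of module temperature and POA."""
    return bom + (poa / 1000) * DELTA_T


bom = bom_temp_sketch(poa=1000, temp_amb=25, wind_speed=1)
cell = cell_temp_sketch(bom, poa=1000)
print(round(bom, 2), round(cell, 2))
```

A measured back of module temperature can be passed to the second step in place of the modeled value, as noted in the bom parameter description.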

captest.prtest.get_common_timestep(data, units='m', string_output=True)

Get the most commonly occurring timestep of data as frequency string.

Parameters:
  • data (Series or DataFrame) – Data with a DateTimeIndex.

  • units (str, default 'm') – String representing date/time unit, such as (D)ay, (M)onth, (Y)ear, (h)ours, (m)inutes, or (s)econds.

  • string_output (bool, default True) – Set to False to return a numeric value.

Returns:

frequency string

Return type:

str

captest.prtest.perf_ratio(ac_energy, dc_nameplate, poa, unit_adj=1, degradation=0, year=1, availability=1)

Calculate performance ratio.

Parameters:
  • ac_energy (Series) – Measured energy production (Wh) from system meter.

  • dc_nameplate (numeric) – Summation of nameplate ratings (W) for all installed modules of system under test.

  • poa (Series) – POA irradiance (W/m^2) for each time interval of the test.

  • unit_adj (numeric, default 1) – Scale factor to adjust units of ac_energy. For example, pass 1000 to convert measured energy from kWh to Wh within PR calculation.

  • degradation (numeric, default None) – Apply a derate (percent, Ex: 0.5%) for degradation to the expected power (denominator). Must also specify a value for the year argument. NOTE: Percent is divided by 100 to convert to decimal within function.

  • year (numeric) – Year of operation to use in degradation calculation.

  • availability (numeric or Series, default 1) – Apply an adjustment for plant availability to the expected power (denominator).

Returns:

Instance of class PrResults.

Return type:

PrResults

captest.prtest.perf_ratio_inputs_ok(ac_energy, dc_nameplate, poa, availability=1)

Check types of perf_ratio arguments.

Parameters:
  • ac_energy (Series) – Measured energy production (Wh) from system meter.

  • dc_nameplate (numeric) – Summation of nameplate ratings (W) for all installed modules of system under test.

  • poa (Series) – POA irradiance (W/m^2) for each time interval of the test.

  • availability (numeric or Series, default 1) – Apply an adjustment for plant availability to the expected power (denominator).

captest.prtest.perf_ratio_temp_corr_nrel(ac_energy, dc_nameplate, poa, power_temp_coeff=None, temp_amb=None, wind_speed=None, base_temp=25, module_type='glass_cell_poly', racking='open_rack', unit_adj=1, degradation=None, year=None, availability=1)

Calculate performance ratio.

Parameters:
  • ac_energy (Series) – Measured energy production (kWh) from system meter.

  • dc_nameplate (numeric) – Summation of nameplate ratings (W) for all installed modules of system under test.

  • poa (Series) – POA irradiance (W/m^2) for each time interval of the test.

  • power_temp_coeff (numeric, default None) – Module power temperature coefficient as percent per degree celsius. Ex. -0.36

  • temp_amb (Series) – Ambient temperature (degrees C) measurements.

  • wind_speed (Series) – Measured wind speed (m/sec) corrected to measurement height of 10 meters.

  • base_temp (numeric, default 25) – Base temperature (in Celsius) to correct power to. Default is the STC of 25 degrees Celsius. The NREL Weather-Corrected Performance Ratio technical report uses the term ‘Tcell_typ_avg’ for this value.

  • module_type (str, default 'glass_cell_poly') – Any of ‘glass_cell_poly’, ‘glass_cell_glass’, or ‘poly_tf_steel’.

  • racking (str, default 'open_rack') – Any of ‘open_rack’, ‘close_roof_mount’, or ‘insulated_back’

  • unit_adj (numeric, default 1) – Scale factor to adjust units of ac_energy. For example, pass 1000 to convert measured energy from kWh to Wh within PR calculation.

  • degradation (numeric, default None) – NOT IMPLEMENTED Apply a derate for degradation to the expected power (denominator). Must also specify a value for the year argument.

  • year (numeric) – NOT IMPLEMENTED Year of operation to use in degradation calculation.

  • availability (numeric or Series, default 1) – NOT IMPLEMENTED Apply an adjustment for plant availability to the expected power (denominator).

captest.prtest.temp_correct_power(power, power_temp_coeff, cell_temp, base_temp=25)

Apply temperature correction to PV power.

Divides power by the temperature correction, so low power values that are above base_temp will be increased and high power values that are below the base_temp will be decreased.

Parameters:
  • power (numeric or Series) – PV power (in watts) to correct to the base_temp.

  • power_temp_coeff (numeric) – Module power temperature coefficient as percent per degree celsius. Ex. -0.36

  • cell_temp (numeric or Series) – Cell temperature (in Celsius) used to calculate temperature differential from the base_temp.

  • base_temp (numeric, default 25) – Base temperature (in Celsius) to correct power to. Default is the STC of 25 degrees Celsius.

Returns:

Power corrected for temperature.

Return type:

type matches power
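
The correction can be sketched as a division by 1 + (power_temp_coeff / 100) * (cell_temp - base_temp). The division and the percent-per-degree coefficient come from the description above; the exact form of the correction factor and the function name are assumptions.

```python
def temp_correct_power_sketch(power, power_temp_coeff, cell_temp, base_temp=25):
    """Correct power to base_temp using a percent-per-degree coefficient."""
    correction = 1 + (power_temp_coeff / 100) * (cell_temp - base_temp)
    return power / correction


# A hot module (35 C) with a -0.4 %/C coefficient: power is corrected upward.
print(temp_correct_power_sketch(100, -0.4, 35))

# A cold module (15 C): power is corrected downward.
print(temp_correct_power_sketch(100, -0.4, 15))
```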

captest.util module

captest.util.append_tags(sel_tags, tags, regex_str)
captest.util.generate_irr_distribution(lowest_irr, highest_irr, rng=Generator(PCG64))

Create a list of increasing values similar to POA irradiance data.

Default parameters result in increasing values where the difference between each subsequent value is randomly chosen from the typical range of steps for a POA tracker.

Parameters:
  • lowest_irr (numeric) – Lowest value in the list of values returned.

  • highest_irr (numeric) – Highest value in the list of values returned.

  • rng (Numpy Random Generator) – Instance of the default Generator.

Returns:

irr_values

Return type:

list
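
One way to build such a list is sketched below. The function name irr_distribution_sketch and the step_range bounds are hypothetical; the actual range of steps used by the library is not stated here, so the values are placeholders.

```python
import numpy as np


def irr_distribution_sketch(lowest_irr, highest_irr, rng=None, step_range=(2, 12)):
    """Illustrative only: build an increasing list with random step sizes."""
    # step_range is a hypothetical (min, max) step, not the library's values.
    rng = np.random.default_rng() if rng is None else rng
    irr_values = [lowest_irr]
    while True:
        # Draw an integer step from the inclusive range.
        step = int(rng.integers(step_range[0], step_range[1], endpoint=True))
        next_val = irr_values[-1] + step
        if next_val >= highest_irr:
            break
        irr_values.append(next_val)
    # Close the list with the requested highest value.
    irr_values.append(highest_irr)
    return irr_values


values = irr_distribution_sketch(400, 800, rng=np.random.default_rng(seed=42))
```

Passing a seeded Generator, as in the usage line, makes the list reproducible between runs.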

captest.util.get_common_timestep(data, units='m', string_output=True)

Get the most commonly occurring timestep of data as a frequency string.

Parameters:
  • data (Series or DataFrame) – Data with a DateTimeIndex.

  • units (str, default 'm') – String representing date/time unit, such as (D)ay, (M)onth, (Y)ear, (h)ours, (m)inutes, or (s)econds.

  • string_output (bool, default True) – Set to False to return a numeric value.

Returns:

If string_output is True and the most common timestep is an integer in the specified units, then a valid pandas frequency or offset alias is returned. If string_output is False, then a numeric value is returned.

Return type:

str or numeric
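
The idea of a most common timestep can be sketched with plain pandas. This is a minimal illustration, not the library's implementation; the helper name common_timestep_sketch is hypothetical and it returns a Timedelta rather than a frequency string.

```python
import pandas as pd


def common_timestep_sketch(data):
    """Illustrative only: return the most frequent index spacing as a Timedelta."""
    # Differences between consecutive timestamps; the leading NaT is dropped.
    steps = data.index.to_series().diff().dropna()
    return steps.mode().iloc[0]


index = pd.to_datetime(
    [
        "2024-01-01 00:00",
        "2024-01-01 00:05",
        "2024-01-01 00:10",
        "2024-01-01 00:25",  # a 15 minute gap
        "2024-01-01 00:30",
    ]
)
series = pd.Series(range(5), index=index)
step = common_timestep_sketch(series)
# Express the step in minutes, comparable to units='m' with string_output=False.
minutes = step / pd.Timedelta(minutes=1)
```

Here three of the four intervals are 5 minutes long, so the single 15 minute gap does not affect the result.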

captest.util.read_json(path)
captest.util.read_yaml(path)
captest.util.reindex_datetime(data, report=False)

Find dataframe index frequency and reindex to add any missing intervals.

Sorts index of passed dataframe before reindexing.

Parameters:

data (DataFrame) – DataFrame to be reindexed.

Return type:

Reindexed DataFrame
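
The sort-then-reindex behavior can be sketched as follows. Unlike reindex_datetime, this hypothetical helper (reindex_sketch) takes the frequency as an explicit argument instead of inferring it, which is a simplification made for the example.

```python
import pandas as pd


def reindex_sketch(data, freq):
    """Illustrative only: sort, then add NaN rows for missing intervals."""
    data = data.sort_index()
    # Build the complete index between the first and last timestamps.
    full_index = pd.date_range(data.index[0], data.index[-1], freq=freq)
    return data.reindex(full_index)


df = pd.DataFrame(
    {"power": [3.0, 1.0, 2.0]},
    index=pd.to_datetime(
        ["2024-01-01 00:15", "2024-01-01 00:00", "2024-01-01 00:05"]
    ),
)
filled = reindex_sketch(df, freq="5min")
```

The unsorted input is missing the 00:10 interval; after reindexing, that row is present and holds NaN.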

captest.util.tags_by_regex(tag_list, regex_str)

captest.plotting module

captest.plotting.add_custom_plot(name, column_groups, group_tags, column_tags)

Append a new custom group to column groups for plotting.

captest.plotting.filter_list(text_input, ms_to_filter, names, event=None)

Filter a multi-select widget by a regex string.

Parameters:
  • text_input (pn.widgets.TextInput) – The text input widget to get the regex string from.

  • ms_to_filter (pn.widgets.MultiSelect) – The multi-select widget to update.

  • names (list of str) – The list of names to filter.

  • event (pn.widgets.event, optional) – Passed by the param.watch method. Not used.

Return type:

None
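
The filtering step, without the Panel widgets, might look like the sketch below. The helper name filter_names_sketch is hypothetical, and matching anywhere in the name with re.search is an assumption; the widget update itself is omitted.

```python
import re


def filter_names_sketch(regex_str, names):
    """Illustrative only: keep the names that the regex matches anywhere."""
    pattern = re.compile(regex_str)
    return [name for name in names if pattern.search(name)]


names = ["irr_poa_1", "irr_ghi_1", "temp_amb", "real_pwr_mtr"]
matches = filter_names_sketch("irr", names)
```

In filter_list the resulting list would presumably become the options of the multi-select widget passed as ms_to_filter.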

captest.plotting.find_default_groups(groups, default_groups)

Find the default groups in the list of groups.

Parameters:
  • groups (list of str) – The list of groups to search for the default groups.

  • default_groups (list of str) – List of regex strings to use to identify default groups.

Returns:

The default groups found in the list of groups.

Return type:

list of str
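
A minimal sketch of matching group names against a list of regex strings follows. The helper name find_default_groups_sketch, the use of re.search, and returning matches in the order of the regex list are assumptions for illustration.

```python
import re


def find_default_groups_sketch(groups, default_groups):
    """Illustrative only: collect groups matching each regex, in regex order."""
    found = []
    for regex_str in default_groups:
        for group in groups:
            # Skip groups already picked up by an earlier regex.
            if re.search(regex_str, group) and group not in found:
                found.append(group)
    return found


groups = ["irr_poa", "real_pwr_inv", "temp_amb", "real_pwr_mtr"]
defaults = ["(?=.*real)(?=.*pwr)(?=.*mtr)", "(?=.*real)(?=.*pwr)(?=.*inv)", "poa_ghi"]
result = find_default_groups_sketch(groups, defaults)
print(result)  # ['real_pwr_mtr', 'real_pwr_inv']
```

Each lookahead such as (?=.*mtr) requires its term to appear somewhere in the group name, so the terms can occur in any order. The plain string 'poa_ghi' matches nothing here because no group contains that exact text.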

captest.plotting.group_tag_overlay(group_tags, column_tags)

Overlay curves of groups and individually selected columns.

Parameters:
  • group_tags (list of str) – The tags to plot from the groups selected.

  • column_tags (list of str) – The tags to plot from the individually selected columns.

captest.plotting.msel_from_column_groups(column_groups, groups=True)

Create a multi-select widget from a column groups object.

Parameters:
  • column_groups (ColumnGroups) – The column groups object.

  • groups (bool, default True) – By default, creates a list of groups, i.e. the keys of column_groups; otherwise creates a list of individual columns, i.e. the values of column_groups concatenated together.
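
The two option lists described by the groups parameter can be sketched with a plain dictionary standing in for the ColumnGroups object. The group and column names are hypothetical, and the widget creation is left out.

```python
# A plain dict used as a stand-in for a ColumnGroups object.
column_groups = {
    "irr_poa": ["POA 1", "POA 2"],
    "temp_amb": ["Ambient Temp"],
}

# groups=True: the options are the group names (the keys).
group_options = list(column_groups.keys())

# groups=False: the options are every column, with the value lists concatenated.
column_options = [col for cols in column_groups.values() for col in cols]
```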

captest.plotting.parse_combine(combine, column_groups=None, data=None, cd=None)

Parse regex strings for identifying groups of columns or tags to combine.

Parameters:
  • combine (dict) – Dictionary of group names and regex strings to use to identify groups from column groups and individual tags (columns) to combine into new groups. Keys should be strings for names of new groups. Values should be either a string or a list of two strings. If a string, the string is used as a regex to identify groups to combine. If a list, the first string is used to identify groups to combine and the second is used to identify individual tags (columns) to combine.

  • column_groups (ColumnGroups, optional) – The column groups object to add new groups to. Required if cd is not provided.

  • data (pd.DataFrame, optional) – The data to use to identify groups and columns to combine. Required if cd is not provided.

  • cd (captest.CapData, optional) – The captest.CapData object with the data and column_groups attributes set. Required if column_groups and data are not provided.

Returns:

New column groups object with new groups added.

Return type:

ColumnGroups
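
The two accepted value forms of combine, a single regex string or a list of two regex strings, can be sketched as below. This is an illustration under assumptions: combine_sketch is a hypothetical helper, a plain dict stands in for ColumnGroups, and individual tags are searched among the columns already in the groups rather than in a separate DataFrame.

```python
import re


def combine_sketch(combine, column_groups):
    """Illustrative only: add new groups built from regex matches."""
    new_groups = dict(column_groups)
    all_columns = [col for cols in column_groups.values() for col in cols]
    for name, regexes in combine.items():
        # A string holds only a group regex; a list holds [group regex, tag regex].
        if isinstance(regexes, str):
            group_regex, tag_regex = regexes, None
        else:
            group_regex, tag_regex = regexes
        tags = []
        for group, cols in column_groups.items():
            if re.search(group_regex, group):
                tags.extend(cols)
        if tag_regex is not None:
            for col in all_columns:
                if re.search(tag_regex, col) and col not in tags:
                    tags.append(col)
        new_groups[name] = tags
    return new_groups


column_groups = {
    "irr_poa": ["poa_1", "poa_2"],
    "irr_ghi": ["ghi_1"],
    "real_pwr_mtr": ["meter_kw"],
    "real_pwr_inv": ["inv1_kw", "inv2_kw"],
}
combine = {
    "poa_ghi": "irr.*(poa|ghi)$",
    "mtr_inv1": ["(?=.*real)(?=.*pwr)(?=.*mtr)", "inv1"],
}
combined = combine_sketch(combine, column_groups)
```

With this input, 'poa_ghi' gathers the columns of both irradiance groups, while 'mtr_inv1' gathers the meter group plus the single inverter column matched by the second regex.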

captest.plotting.plot(cd=None, cg=None, data=None, combine={'ghi_csky': '(?=.*ghi)(?=.*irr)', 'inv_sum_mtr_pwr': ['(?=.*real)(?=.*pwr)(?=.*mtr)', '(?=.*pwr)(?=.*agg)'], 'poa_csky': '(?=.*poa)(?=.*irr)', 'poa_ghi': 'irr.*(poa|ghi)$', 'temp_amb_bom': '(?=.*temp)((?=.*amb)|(?=.*bom))'}, default_groups=['inv_sum_mtr_pwr', '(?=.*real)(?=.*pwr)(?=.*inv)', '(?=.*real)(?=.*pwr)(?=.*mtr)', 'poa_ghi', 'poa_csky', 'ghi_csky', 'temp_amb_bom'], group_width=1500, group_height=250, **kwargs)

Create plotting dashboard.

NOTE: If a ‘plot_defaults.json’ file exists in the same directory as the file this function is called from, then the default groups will be read from that file instead of using the default_groups argument. Delete or manually edit the file to change the default groups. Use the default_groups argument or manually edit the file to control the order of the plots.

Parameters:
  • cd (captest.CapData, optional) – The captest.CapData object.

  • cg (captest.ColumnGroups, optional) – The captest.ColumnGroups object. data must also be provided.

  • data (pd.DataFrame, optional) – The data to plot. cg must also be provided.

  • combine (dict, optional) – Dictionary of group names and regex strings to use to identify groups from column groups and individual tags (columns) to combine into new groups. See the parse_combine function for more details.

  • default_groups (list of str, optional) – List of regex strings to use to identify default groups to plot. See the find_default_groups function for more details.

  • group_width (int, optional) – The width of the plots on the Groups tab.

  • group_height (int, optional) – The height of the plots on the Groups tab.

  • **kwargs (optional) – Pass additional keyword arguments to the holoviews options of the scatter plot on the ‘Scatter’ tab.

captest.plotting.plot_group_tag_overlay(data, group_tags, column_tags, width=1500, height=400)

Overlay curves of groups and individually selected columns.

Parameters:
  • data (pd.DataFrame) – The data to plot.

  • group_tags (list of str) – The tags to plot from the groups selected.

  • column_tags (list of str) – The tags to plot from the individually selected columns.

captest.plotting.plot_tag(data, tag, width=1500, height=250)
captest.plotting.plot_tag_groups(data, tags_to_plot, width=1500, height=250)

Plot groups of tags, one plot of overlaid curves per group.

Parameters:
  • data (pd.DataFrame) – The data to plot.

  • tags_to_plot (list) – List of lists of strings. One plot for each inner list.

captest.plotting.scatter_dboard(data, **kwargs)

Create a dashboard to plot any two columns of data against each other.

Parameters:
  • data (pd.DataFrame) – The data to plot.

  • **kwargs (optional) – Pass additional keyword arguments to the holoviews options of the scatter plot.

Returns:

The dashboard with a scatter plot of the data.

Return type:

pn.Column