captest.capdata.CapData

class captest.capdata.CapData(name)

Class to store capacity test data and column grouping.

CapData objects store a pandas dataframe of measured or simulated data and a dictionary grouping columns by type of measurement.

The column_groups dictionary allows maintaining the original column names while also grouping measurements of the same type from different sensors. Many of the methods for plotting and filtering data rely on the column groupings.
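As a rough illustration, the grouping can be pictured as a plain dictionary that maps an abbreviated measurement type to the original column names. The column names, group keys, and the `columns_in_group` helper below are hypothetical and are not produced by captest; they only sketch the shape of the structure.

```python
# Hypothetical column names as they might appear in a measured-data file.
columns = [
    "met1_poa_pyranometer",
    "met2_poa_pyranometer",
    "met1_amb_temp",
    "met2_amb_temp",
    "meter_power",
]

# Hypothetical grouping: abbreviated measurement type -> original column names.
# The keys are made up for this sketch; group_columns infers its own names.
column_groups = {
    "irr_poa": ["met1_poa_pyranometer", "met2_poa_pyranometer"],
    "temp_amb": ["met1_amb_temp", "met2_amb_temp"],
    "real_pwr": ["meter_power"],
}


def columns_in_group(groups, group_id):
    """Return a copy of the original column names stored under one group key."""
    return list(groups[group_id])


# Flatten the groups to confirm the original column names are preserved.
grouped = sorted(col for cols in column_groups.values() for col in cols)
```

Because the values are the untouched column names, a method that works on a group can look up every sensor of that type without renaming anything in the dataframe.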

Parameters:

name (str) – Name for the CapData object.
data (pandas dataframe) – Used to store measured or simulated data imported from csv.
data_filtered (pandas dataframe) – Holds filtered data. Filtering methods act on and write to this attribute.
column_groups (dictionary) – Assigned by the group_columns method, which attempts to infer the type of measurement recorded in each column of the dataframe stored in the data attribute. For each inferred measurement type, group_columns creates an abbreviated name and a list of columns that contain measurements of that type. The abbreviated names are the keys and the corresponding values are the lists of columns.
regression_cols (dictionary) – Dictionary identifying which columns in data, or which groups of columns identified by the keys of column_groups, are the independent variables of the ASTM capacity test regression equation. Set using set_regression_cols or by directly assigning a dictionary.
summary_ix (list of tuples) – Holds the row index data modified by the update_summary decorator function.
summary (list of dicts) – Holds the data modified by the update_summary decorator function.
rc (DataFrame) – Dataframe for the reporting conditions (poa, t_amb, and w_vel).
regression_results (statsmodels linear regression model) – Holds the linear regression model object.
regression_formula (str) – Regression formula to be fit to measured and simulated data. Must follow the formula syntax that statsmodels uses through patsy.
tolerance (str) – String representing the error band, e.g. ‘+ 3’, ‘+/- 3’, or ‘- 5’. There must be a space between the sign and the number. The number is interpreted as a percent; for example, 5 percent is written as 5, not 0.05.
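The tolerance format described above (a sign, a space, then a number read as a percent) can be sketched with a small parser. The `parse_tolerance` function is a hypothetical helper written for this illustration; it is not part of captest and does not show how the library itself reads the string.

```python
def parse_tolerance(tolerance):
    """Split a tolerance string such as '+/- 3' into (sign, percent).

    Hypothetical helper for illustration only; not a captest function.
    """
    parts = tolerance.split()
    # The sign and the number must be separated by whitespace.
    if len(parts) != 2 or parts[0] not in ("+", "-", "+/-"):
        raise ValueError("expected a sign, a space, and a number, e.g. '+/- 3'")
    sign, number = parts
    # The number is a percent: 5 means 5 percent, not 0.05.
    return sign, float(number)


sign, percent = parse_tolerance("+/- 3")
```

A string without the space, such as `'+/-3'`, is rejected by this sketch with a `ValueError`, which mirrors the spacing requirement stated for the attribute.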

__init__(name)

Methods

`__init__`(name)
`agg_group`(group_id, agg_func[, verbose, ...])	Aggregate columns in a group.
`agg_sensors`([agg_map, verbose])	Aggregate measurements of the same variable from different sensors.
`column_groups_to_excel`([save_to])	Export the column groups attribute to an excel file.
`copy`()	Create and return a copy of self.
`create_agg_attributes`()	Create callable attributes for each aggregated column that return data views.
`create_column_group_attributes`()	Create callable attributes for each column group that return data views.
`custom_param`(func, *args, **kwargs)	Applies the function func with kwargs and adds result as new column to data.
`data_columns_to_excel`([sort_by_reversed_names])	Write the columns of data to an excel file as a template for a column grouping.
`drop_cols`(columns)	Drop columns from CapData data, data_filtered, and column_groups.
`empty`()	Return a boolean indicating if the CapData object contains data.
`expand_agg_map`(agg_map)	Traverses, expands, and sorts the agg_map.
`expanded_uncert`(grp_to_term[, k])	Calculate expanded uncertainty of the predicted power.
`filter_clearsky`([ghi_col, inplace, keep_clear])	Use pvlib detect_clearsky to remove periods with unstable irradiance.
`filter_custom`(func, *args, **kwargs)	Apply update_summary decorator to passed function.
`filter_days`(days[, drop, inplace])	Select or drop timestamps for days passed.
`filter_irr`(low, high[, ref_val, col_name, ...])	Filter on irradiance values.
`filter_missing`([columns])	Removes any rows where the regression columns contain missing data (NaNs).
`filter_op_state`(op_state[, mult_inv, inplace])	NOT CURRENTLY IMPLEMENTED - Filter on inverter operation state.
`filter_outliers`([inplace])	Apply elliptic envelope from scikit-learn to remove outliers.
`filter_pf`(pf[, inplace])	Filter data on the power factor.
`filter_power`(power[, percent, columns, inplace])	Remove data above the specified power threshold.
`filter_pvsyst`([inplace])	Filter pvsyst data for off max power point tracking operation.
`filter_sensors`([perc_diff, inplace, row_filter])	Drop suspicious measurements by comparing values from different sensors.
`filter_shade`([fshdbm, query_str, inplace])	Remove data during periods of array shading.
`filter_time`([start, end, drop, days, ...])	Select data for a specified time period.
`fit_regression`([filter, inplace, summary])	Perform a regression with statsmodels on filtered data.
`get_filtering_table`()	Returns DataFrame showing which filter removed each filtered time interval.
`get_length_test_period`()	Get length of test period.
`get_pts_required`([hrs_req])	Set number of data points required for complete test attribute.
`get_reg_cols`([reg_vars, filtered_data])	Get regression columns renamed with keys from regression_cols.
`get_summary`()	Print a summary of filtering applied to the data_filtered attribute.
`plot`([combine, default_groups, width, ...])	Create a dashboard to explore timeseries plots of the data.
`predict_capacities`([irr_filter, percent_filter])	Calculate expected capacities.
`print_points_summary`([hrs_req])	Print summary data on the number of points collected.
`process_regression_columns`([verbose])	Walk the regression column dictionary and calculate parameters.
`reg_scatter_matrix`()	Create pandas scatter matrix of regression variables.
`rename_cols`(column_map)	Rename columns in data, data_filtered, and column_groups.
`rep_cond`([func, w_vel, irr_bal, ...])	Calculate reporting conditions for the current regression formula.
`rep_cond_freq`([irr_bal, percent_filter, ...])	Calculate frequency-grouped reporting conditions.
`reset_agg`()	Remove aggregation columns from data and data_filtered attributes.
`reset_filter`()	Set data_filtered to data and reset filtering summary.
`review_column_groups`()	Print column_groups with nice formatting.
`scatter`([filtered])	Create a matplotlib scatter plot of regression lhs vs.
`scatter_filters`()	Returns an overlay of scatter plots of intervals removed for each filter.
`scatter_hv`([timeseries, all_reg_columns])	Create a holoviews scatter plot of regression lhs vs.
`set_regression_cols`([power, poa, t_amb, w_vel])	Create a dictionary linking the regression variables to data.
`set_test_complete`(pts_required)	Sets test_complete attribute.
`spatial_uncert`(column_groups)	Spatial uncertainties of the independent regression variables.
`timeseries_filters`()	Returns an overlay of scatter plots of intervals removed for each filter.
`uncertainty`()	Calculate random standard uncertainty of the regression.
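
Several entries above describe one pattern: filtering methods write to data_filtered, a decorator records what each filter removed in summary, and reset_filter restores data_filtered from data. The following is a minimal pure-Python sketch of that bookkeeping pattern. `TinyCapData` and `record_summary` are hypothetical stand-ins invented for this illustration; they are not the captest implementation, and the real class stores pandas dataframes rather than lists of dicts.

```python
import functools


def record_summary(func):
    """Hypothetical stand-in for the role update_summary plays: log point counts."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        before = len(self.data_filtered)
        func(self, *args, **kwargs)
        after = len(self.data_filtered)
        self.summary.append({
            "filter": func.__name__,
            "pts_before": before,
            "pts_after": after,
            "pts_removed": before - after,
        })
    return wrapper


class TinyCapData:
    """Toy container, not captest's CapData; rows are plain dicts."""

    def __init__(self, name, rows):
        self.name = name
        self.data = list(rows)           # original data, never modified
        self.data_filtered = list(rows)  # filters act on and write to this
        self.summary = []

    @record_summary
    def filter_irr(self, low, high):
        # Keep rows whose irradiance lies in [low, high] (assumed inclusive here).
        self.data_filtered = [
            r for r in self.data_filtered if low <= r["poa"] <= high
        ]

    def reset_filter(self):
        # Restore the filtered view from the original data and clear the log.
        self.data_filtered = list(self.data)
        self.summary = []


cd = TinyCapData("demo", [{"poa": 150.0}, {"poa": 620.0}, {"poa": 980.0}, {"poa": 1250.0}])
cd.filter_irr(400, 1000)
```

Keeping the original data untouched is what makes a reset cheap in this sketch: undoing every filter is just a fresh copy plus an empty summary.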