Title: | Tools for Actuarial Experience Studies |
---|---|
Description: | Experience studies are an integral component of the actuarial control cycle. Regardless of the decrement or policyholder behavior of interest, the analyses conducted are often the same. Ultimately, this package aims to reduce time spent writing the same code for different experience studies, thereby increasing the time available to uncover new insights inherent in the relevant experience. |
Authors: | Cody Buehler [aut, cre] |
Maintainer: | Cody Buehler <[email protected]> |
License: | GPL (>= 3) |
Version: | 2.0.0.9000 |
Built: | 2024-10-30 05:01:03 UTC |
Source: | https://github.com/cb12991/expstudy |
There often are situations where an industry table is used for an assumed rate due to a company lacking sufficient credibility to write their own assumption. However, as experience becomes more available, a company would likely want to incorporate this experience into the industry assumption because it provides valuable insight into their own policyholders. A common industry approach is to apply "factor adjustments" developed using company experience to the industry assumption.
    compute_fct_adjs(
      .data,
      expected_rate,
      measure_sets = guess_measure_sets(.data),
      amount_scalar = NULL,
      method = c("simultaneous", "sequential"),
      cred_wt_adjs = FALSE,
      balance_adjs = FALSE,
      na.rm = FALSE
    )
.data |
A |
expected_rate |
The underlying expected rate in the experience study for which factor adjustments are being generated. |
measure_sets |
A (potentially named) list of measure sets. Only need to specify once if
chaining multiple |
amount_scalar |
A numeric vector to use when determining amount-weighted expecteds and variances. The function will determine whether or not the new expecteds/variances are amount-weighted if the corresponding actuals in the study have values greater than 1 (actuals that are not amount-weighted, i.e., counts, should only be 0 or 1). |
method |
String indicating the method of determining factor adjustments: * `simultaneous` calculates factor adjustments for all combinations of group values in one iteration. * `sequential` calculates a factor adjustment for each grouping variable individually and applies it to the underlying expected rate before continuing with the next grouping variable's factor computation. |
cred_wt_adjs |
Logical indicating if factor adjustments should be credibility-weighted using partial credibility scores. |
balance_adjs |
Logical indicating if credibility-weighted adjustments should be scaled to
produce a 100% A/E ratio in aggregate (has no effect if
|
na.rm |
logical. Should missing values (including |
This function piggy-backs off of `measure_sets` defined in other `expstudy` functions to quickly produce factor adjustments under a variety of methods. Providing a `dplyr::grouped_df()` will generate factors for each group according to the method specified. If two or more grouping variables are provided, an additional "composite" factor adjustment will also be generated, which is the product of the individual adjustments.
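The sequential idea described above can be illustrated with a small base R sketch. It assumes that a factor adjustment is the group-level ratio of actuals to expecteds, which is a common industry choice but is not confirmed here as what `compute_fct_adjs()` does internally. The data, the column names, and the helper `ae_factor()` are hypothetical.

```r
# Hypothetical aggregated experience; values are illustrative only.
study <- data.frame(
  gender   = c("F", "F", "M", "M"),
  smoker   = c("N", "Y", "N", "Y"),
  actual   = c(8, 6, 12, 14),
  expected = c(10, 5, 10, 10),
  stringsAsFactors = FALSE
)

# Assumed definition: factor = sum(actuals) / sum(expecteds) within a group,
# returned at row level so it lines up with the study records.
ae_factor <- function(actual, expected, group) {
  ave(actual, group, FUN = sum) / ave(expected, group, FUN = sum)
}

# Step 1: factor for the first grouping variable.
study$gender_fct <- ae_factor(study$actual, study$expected, study$gender)

# Step 2: apply it to the expecteds before computing the next factor.
study$smoker_fct <- ae_factor(
  study$actual,
  study$expected * study$gender_fct,
  study$smoker
)

# Composite adjustment: the product of the individual adjustments.
study$composite_fct <- study$gender_fct * study$smoker_fct
```

In this sketch, applying the composite factor to the original expecteds reproduces total actuals, because the last factor is computed against expecteds that were already adjusted.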
A list of data frames that house factor adjustments for each measure set provided in `measure_sets`.
    mortexp |>
      dplyr::group_by(GENDER, SMOKING_STATUS) |>
      compute_fct_adjs(
        EXPECTED_MORTALITY_RT,
        amount_scalar = FACE_AMOUNT
      )
Attempt to guess the names of a measure set using regular expressions (or regexs).
    guess_measure_sets(
      data,
      measure_regexs = getOption("expstudy.default_measure_regexs"),
      measure_set_prefixes = getOption("expstudy.default_measure_set_prefixes"),
      measure_set_suffixes = getOption("expstudy.default_measure_set_suffixes")
    )
data |
A |
measure_regexs |
A named list of patterns to use as regexs when guessing columns in the
study dataset to be used for one study measure in each measure set. There
must be one column for each measure in a measure set (actuals, expecteds,
exposures, and variances). Defaults to
|
measure_set_prefixes , measure_set_suffixes
|
Character vectors that will be used to differentiate the same measure in
one measure set from another measure set. Using
If the experience study has columns that follow a consistent naming
structure, this function can seamlessly provide other |
A named list of measure sets that identify common variables used for `expstudy` analyses.
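As a rough illustration of how regex-based guessing can work, the base R sketch below matches column names against one pattern per measure within each suffix. The patterns, the suffixes, and the helper `guess_sets()` are assumptions for this example; the package's actual defaults live in the `expstudy.default_*` options and are not reproduced here.

```r
# Column names taken from the mortexp dataset description.
columns <- c(
  "POLICY_HOLDER", "EXPECTED_MORTALITY_RT",
  "MORT_EXPOSURE_CNT", "MORT_EXPOSURE_AMT",
  "MORT_ACTUAL_CNT", "MORT_ACTUAL_AMT",
  "MORT_EXPECTED_CNT", "MORT_EXPECTED_AMT",
  "MORT_VARIANCE_CNT", "MORT_VARIANCE_AMT"
)

# Hypothetical patterns and suffixes (not the package defaults).
measure_patterns <- c(
  actuals   = "_ACTUAL_",
  expecteds = "_EXPECTED_",
  exposures = "_EXPOSURE_",
  variances = "_VARIANCE_"
)
set_suffixes <- c("_CNT", "_AMT")

# For each suffix, keep the first column matching each measure pattern.
guess_sets <- function(columns, patterns, suffixes) {
  sets <- lapply(suffixes, function(suffix) {
    candidates <- columns[endsWith(columns, suffix)]
    vapply(
      patterns,
      function(p) grep(p, candidates, value = TRUE)[1],
      character(1)
    )
  })
  names(sets) <- suffixes
  sets
}

sets <- guess_sets(columns, measure_patterns, set_suffixes)
```

Each element of `sets` is a named character vector mapping a measure to a column, which mirrors the shape of a measure set as described for the metric functions.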
guess_measure_sets(mortexp)
A collection of common metrics used in an actuarial environment is provided. Two versions of each metric function have been developed: one that takes a measure set for an experience study as its primary argument, and one that accepts vectors instead.
    avg_observed(measure_set, ...)
    avg_observed_vec(actuals, exposures, ...)
    avg_expected(measure_set, ...)
    avg_expected_vec(expecteds, exposures, ...)
    ci_fctr(measure_set, se_conf = 0.95, two_tailed = TRUE, ...)
    ci_fctr_vec(exposures, variances, se_conf = 0.95, two_tailed = TRUE, ...)
    ae_ratio(measure_set, ...)
    ae_ratio_vec(actuals, expecteds, ...)
    credibility(measure_set, distance_from_mean = 0.05, cred_conf = 0.95, ...)
    credibility_vec(
      expecteds,
      variances,
      distance_from_mean = 0.05,
      cred_conf = 0.95,
      ...
    )
measure_set |
A named character vector or list with each element mapping a column in
the experience study to one of the following measures: |
... |
Not used directly and should be left blank. |
actuals , expecteds , exposures , variances
|
Columns in experience study that correspond to individual measures for vector versions of metric functions. |
se_conf |
A number between 0 and 1 corresponding to the confidence level surrounding the standard error calculation. |
two_tailed |
A boolean indicating whether or not a two-tailed hypothesis test should be utilized. |
distance_from_mean |
A number between 0 and 1 representing the precision of the credibility estimate. |
cred_conf |
A number between 0 and 1 corresponding to the confidence level surrounding the credibility calculation. |
Metric functions that use a measure set as their primary argument are intended to be used with `mutate_metrics()` and return a quosure (see `rlang::quo()`). Use the vector versions (those ending in `_vec`) if a numeric vector result is desired instead.
Measure set versions return a quosure (see `rlang::quo()`) to be evaluated in `mutate_metrics()`. Vector versions return a numeric vector of the same length as the measures used in the calculation per group (if grouping is applied).
* `avg_observed()`: Calculates the average actual decrements observed per unit of exposure.
* `avg_expected()`: Calculates the average expected decrements per unit of exposure.
* `ci_fctr()`: Calculates the additive factor which constructs a confidence interval around the expected decrement rate for a given level of confidence.
* `ae_ratio()`: Calculates the ratio of actual decrements to expected decrements, also referred to as the AE ratio.
* `credibility()`: Calculates the credibility score according to limited fluctuation credibility theory.
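The metric descriptions above can be sketched in base R as follows. This is an illustration under a normal approximation, not the package's implementation: the function `sketch_metrics()` is hypothetical, and the confidence interval and credibility formulas in particular are assumptions based on standard limited fluctuation credibility.

```r
# Hypothetical helper computing all five metrics from summed measures.
sketch_metrics <- function(actuals, expecteds, exposures, variances,
                           se_conf = 0.95, two_tailed = TRUE,
                           distance_from_mean = 0.05, cred_conf = 0.95) {
  # Normal quantiles for the requested confidence levels.
  se_z <- if (two_tailed) qnorm((1 + se_conf) / 2) else qnorm(se_conf)
  cred_z <- qnorm((1 + cred_conf) / 2)

  c(
    avg_observed = sum(actuals) / sum(exposures),
    avg_expected = sum(expecteds) / sum(exposures),
    # Half-width added to or subtracted from the expected rate.
    ci_fctr = se_z * sqrt(sum(variances)) / sum(exposures),
    ae_ratio = sum(actuals) / sum(expecteds),
    # Partial credibility, capped at full credibility of 1.
    credibility = min(
      1,
      distance_from_mean * sum(expecteds) / (cred_z * sqrt(sum(variances)))
    )
  )
}

# Illustrative inputs only.
m <- sketch_metrics(
  actuals   = c(3, 5),
  expecteds = c(4, 4),
  exposures = c(1000, 1000),
  variances = c(3.9, 3.9)
)
```

With so few expected decrements the credibility score in this example is well below 1, which is the behavior the limited fluctuation approach is meant to capture.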
A dataset containing an example of a mortality experience study for 1000 fictional whole life insurance policyholders.
mortexp
A `data.frame()` with over 175,000 rows and 24 columns:
AS_OF_DATE
This indicates which point in time a record encompasses.
POLICY_HOLDER
An index used to distinguish policyholders. In this example, the policyholder is also the (only) insured.
GENDER, SMOKING_STATUS, UNDERWRITING_CLASS, INSURED_DOB, ISSUE_DATE, ISSUE_AGE
Various characteristics of an insured at time of issue.
FACE_AMOUNT
Face amount of insurance for a corresponding policy.
TERMINATION_DATE
If terminated, the effective date of termination. An `NA` value will be listed for policies that are still in-force.
ATTAINED_AGE
The age of the insured at the record's `AS_OF_DATE`.
EXPECTED_MORTALITY_RT
An expected mortality rate for an insured. The rate is calculated according to De Moivre's Law (also known as uniform distribution of deaths, or UDD) with limiting age $\omega$.
POLICY_DURATION_MNTH, POLICY_DURATION_YR
Temporal indices describing how long a policy has been in-force at the `AS_OF_DATE`. For example, when a policy is first issued, it is in policy duration year one and policy duration month one.
POLICY_STATUS
The current status of the policy, either in-force, surrendered, or death. The value will be listed for each policy record even though a decrement only occurs at the end of the policy's duration (for policies which are no longer in-force).
MORT_EXPOSURE_CNT, MORT_EXPOSURE_AMT
Measures how many policyholders or how much face amount of insurance is exposed to the risk of decrement for an associated observation.
MORT_ACTUAL_CNT, MORT_ACTUAL_AMT
Measures the decrement occurrence on a policy count or face amount of insurance basis.
MORT_EXPECTED_CNT, MORT_EXPECTED_AMT
Measures the expected decrement value for an associated observation on a policy count or face amount of insurance basis.
MORT_VARIANCE_CNT, MORT_VARIANCE_AMT
Measures the variance of the decrement expectation, also on a policy count or face amount of insurance basis. Used to calculate credibility scores and confidence intervals.
All policy record detail is randomly generated. See the Society of Actuaries' publication on experience study calculations for additional information.
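For reference, the textbook statement of De Moivre's Law mentioned under `EXPECTED_MORTALITY_RT` is sketched below, with $\omega$ denoting the limiting age and $x$ the attained age. Whether the dataset applies the rate on an annual or a monthly basis is not stated here, so the annual form is an assumption.

```latex
S(x) = 1 - \frac{x}{\omega},
\qquad
{}_{t}p_{x} = \frac{\omega - x - t}{\omega - x},
\qquad
q_{x} = \frac{1}{\omega - x},
\qquad 0 \le x < \omega .
```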
`mutate_expecvar()` uses a new expected rate for a decrement of interest and adds a corresponding expected decrements column and a corresponding variance of expected decrements column. If expecteds and variances measures already exist within the study dataset, either new, prefixed columns will be added or the current expecteds and variances can be overwritten.
    mutate_expecvar(
      .data,
      new_expected_rates,
      new_expecvar_prefix = "auto",
      measure_sets = guess_measure_sets(.data),
      amount_scalar = NULL,
      .by = NULL,
      .keep = c("all", "used", "unused", "none"),
      .before = NULL,
      .after = NULL
    )
.data |
A |
new_expected_rates |
A numeric vector to use as the expected probability for the study's event
of interest (e.g., policy lapse or insured death). This can be a column
in the dataset or a new numeric vector of length 1 or |
new_expecvar_prefix |
A string to distinguish the new expecteds and variances columns in the
dataset. To overwrite existing expecteds and variances columns, use an
argument value of |
measure_sets |
A (potentially named) list of measure sets. Only need to specify once if
chaining multiple |
amount_scalar |
A numeric vector to use when determining amount-weighted expecteds and variances. The function will determine whether or not the new expecteds/variances are amount-weighted if the corresponding actuals in the study have values greater than 1 (actuals that are not amount-weighted, i.e., counts, should only be 0 or 1). |
.by |
< |
.keep |
Control which columns from
|
.before , .after
|
< |
An object of the same type as `.data`. The output has the following properties:
* Columns from `.data` will be preserved according to the `.keep` argument.
* Existing columns that are modified by `...` will always be returned in their original location.
* New columns created through `...` will be placed according to the `.before` and `.after` arguments.
* The number of rows is not affected.
* Columns given the value `NULL` will be removed.
* Groups will be recomputed if a grouping variable is mutated.
* Data frame attributes are preserved.
This function was developed according to current industry practice relating to experience study calculations. Some of the assumptions incorporated are briefly outlined below.
The experience study data is at a seriatim level where repeated observations of multiple units can exist. For example, the study data can contain experience for multiple policies over multiple calendar or policy years.
Each decrement event can be described as a Bernoulli random variable with expected rate of decrement equal to $p$. Furthermore, combining multiple observation units with equal rates of decrement $p$ can be considered a Binomial random variable with $n$ equal to the number of observation units.
Decrements are considered to be uniform between observations.
With these assumptions, new expecteds that are not amount-weighted are calculated as the product of exposures and the expected decrement rate. New variances are calculated as the product of the new expecteds and 1 minus the new expecteds. Amount-weighted expecteds and variances follow the prior calculations and are additionally multiplied by the amount scalar and the amount scalar squared, respectively.
For a more detailed explanation of the methods used, please refer to the Society of Actuaries' publication on experience study calculations.
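The calculation described above can be followed step by step in a small base R sketch. It mirrors the wording of the text rather than the package source, and the data frame and its column names are hypothetical. Note that when exposure equals 1, the count-basis variance reduces to the familiar Bernoulli form $p(1 - p)$.

```r
# Hypothetical seriatim records: exposure, expected rate, and face amount.
study <- data.frame(
  exposure = c(1, 1, 0.5),
  rate     = c(0.01, 0.02, 0.04),
  face     = c(100000, 250000, 50000)
)

# Count basis: expecteds are exposures times the expected decrement rate.
study$expected_cnt <- study$exposure * study$rate

# Count basis: variances are expecteds times (1 - expecteds), as described.
study$variance_cnt <- study$expected_cnt * (1 - study$expected_cnt)

# Amount basis: scale expecteds by the amount and variances by its square.
study$expected_amt <- study$expected_cnt * study$face
study$variance_amt <- study$variance_cnt * study$face^2
```

Scaling the variance by the squared amount follows from $\mathrm{Var}(aX) = a^2 \, \mathrm{Var}(X)$ for a constant $a$.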
`expstudy` uses a naming convention where some functions are prefixed by the underlying `dplyr` verb. The purpose of this is to associate the resulting structure of the `expstudy` function with output very similar to what the `dplyr` function would produce. Note that the intention here is not to replace all `dplyr` use cases but instead to add specific functionality to streamline routine experience study analyses.
    mortexp |>
      dplyr::mutate(
        NEW_EXPECTED_MORT_RT = runif(n = nrow(mortexp))
      ) |>
      mutate_expecvar(
        new_expected_rates = NEW_EXPECTED_MORT_RT,
        new_expecvar_prefix = 'ADJ_',
        amount_scalar = FACE_AMOUNT
      )
`mutate_metrics()` calculates metrics for an experience study using common measures associated with the data. These measures are identified via the `measure_sets` argument, which can be provided directly or guessed using regular expressions (regexs). See `guess_measure_sets()` for additional detail on how this guessing is implemented.
    mutate_metrics(
      .data,
      measure_sets = guess_measure_sets(.data),
      metrics = list(
        AVG_OBSRV = avg_observed,
        AVG_EXPEC = avg_expected,
        CI_FCTR = ci_fctr,
        AE_RATIO = ae_ratio,
        CREDIBILITY = credibility
      ),
      ...,
      .by = NULL,
      .keep = c("all", "used", "unused", "none"),
      .before = NULL,
      .after = NULL
    )
.data |
A |
measure_sets |
A (potentially named) list of measure sets. Only need to specify once if
chaining multiple |
metrics |
A named list of functions to calculate metrics. Each function will be
applied to each set identified in |
... |
Additional (optional) arguments passed along to each metric function (see metrics). |
.by |
< |
.keep |
Control which columns from
|
.before , .after
|
< |
This function is structured so that each set of measures within the study is passed as the first argument of each metric function. The default argument uses a set of metric functions, provided by `expstudy`, which are commonly requested metrics in actuarial analyses. For convenience, vectorized versions of these default metric functions have also been provided; see metrics for more information.
An object of the same type as `.data`. The output has the following properties:
* Columns from `.data` will be preserved according to the `.keep` argument.
* Existing columns that are modified by `...` will always be returned in their original location.
* New columns created through `...` will be placed according to the `.before` and `.after` arguments.
* The number of rows is not affected.
* Columns given the value `NULL` will be removed.
* Groups will be recomputed if a grouping variable is mutated.
* Data frame attributes are preserved.
`expstudy` uses a naming convention where some functions are prefixed by the underlying `dplyr` verb. The purpose of this is to associate the resulting structure of the `expstudy` function with output very similar to what the `dplyr` function would produce. Note that the intention here is not to replace all `dplyr` use cases but instead to add specific functionality to streamline routine experience study analyses.
    # Metrics can be added at a seriatim level, but often are
    # calculated after some aggregation is applied to a cohort:
    mortexp |>
      dplyr::group_by(GENDER) |>
      summarise_measures() |>
      mutate_metrics()
`summarise_measures()` functions the same as `dplyr::summarise()` and returns a new data frame with one row per combination of grouping variables. However, this function is streamlined to return the sum of an experience study's measures instead of any arbitrary summary function. These measures are identified via the `measure_sets` argument, which can be provided directly or guessed using regular expressions (regexs). See `guess_measure_sets()` for additional detail on how this guessing is implemented.
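The kind of aggregation described above can be sketched in base R with `aggregate()`, summing each measure column within a group. This only illustrates the idea and is not how `summarise_measures()` is implemented; the records and the `UNDERWRITING_CLASS` values are hypothetical.

```r
# Hypothetical seriatim records using column names from mortexp.
study <- data.frame(
  UNDERWRITING_CLASS = c("PREFERRED", "STANDARD", "PREFERRED", "STANDARD"),
  MORT_ACTUAL_CNT    = c(0, 1, 0, 0),
  MORT_EXPECTED_CNT  = c(0.01, 0.03, 0.02, 0.04),
  MORT_EXPOSURE_CNT  = c(1, 1, 1, 0.5)
)

# Sum every measure within each underwriting class (one row per group).
summed <- aggregate(
  cbind(MORT_ACTUAL_CNT, MORT_EXPECTED_CNT, MORT_EXPOSURE_CNT) ~ UNDERWRITING_CLASS,
  data = study,
  FUN = sum
)
```

Summing is the natural aggregation here because actuals, expecteds, exposures, and variances are all additive across records, whereas rates and ratios are not.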
    summarise_measures(
      .data,
      measure_sets = guess_measure_sets(.data),
      na.rm = TRUE,
      .groups = "drop",
      .by = NULL
    )
.data |
A |
measure_sets |
A (potentially named) list of measure sets. Only need to specify once if
chaining multiple |
na.rm |
logical. Should missing values (including |
.groups |
Grouping structure of the result.
When
In addition, a message informs you of that choice, unless the result is ungrouped,
the option "dplyr.summarise.inform" is set to |
.by |
< |
An object usually of the same type as `.data`.
* The rows come from the underlying `group_keys()`.
* The columns are a combination of the grouping keys and the summary expressions that you provide.
* The grouping structure is controlled by the `.groups=` argument; the output may be another grouped_df, a tibble or a rowwise data frame.
* Data frame attributes are not preserved, because `summarise()` fundamentally creates a new data frame.
`expstudy` uses a naming convention where some functions are prefixed by the underlying `dplyr` verb. The purpose of this is to associate the resulting structure of the `expstudy` function with output very similar to what the `dplyr` function would produce. Note that the intention here is not to replace all `dplyr` use cases but instead to add specific functionality to streamline routine experience study analyses.
    mortexp |>
      dplyr::group_by(UNDERWRITING_CLASS) |>
      summarise_measures()