CompStats.bootstrap

class StatisticSamples

Apply the statistic to num_samples bootstrap samples, each drawn with replacement from the population passed as arguments.

Parameters:
  • statistic (Callable) – Function that computes the statistic on each sample, default=numpy.mean.

  • num_samples (int) – Number of bootstrap samples, default=500.

  • n_jobs (int) – Number of jobs to run in parallel, default=1.

>>> from CompStats import StatisticSamples
>>> from sklearn.metrics import accuracy_score
>>> import numpy as np
>>> statistic = StatisticSamples(num_samples=10, statistic=np.mean)
>>> empirical_distribution = np.r_[[3, 4, 5, 2, 4]]
>>> statistic(empirical_distribution)
array([2.8, 3.6, 3.6, 3.6, 2.6, 4. , 2.8, 3. , 3.8, 3.6])
>>> labels = np.r_[[0, 0, 0, 0, 0, 1, 1, 1, 1, 1]]
>>> pred   = np.r_[[0, 0, 1, 0, 0, 1, 1, 1, 0, 1]]
>>> acc = StatisticSamples(num_samples=15, statistic=accuracy_score)
>>> acc(labels, pred)
array([0.9, 0.8, 0.7, 1. , 0.6, 1. , 0.7, 0.9, 0.9, 0.8, 0.9, 0.8, 0.8, 0.8, 0.8])
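The resampling described above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the CompStats implementation: bootstrap_statistic and accuracy are hypothetical helpers, and the sketch assumes that every argument is resampled with the same indices so that paired observations (such as a label and its prediction) stay aligned.

```python
import numpy as np


def bootstrap_statistic(statistic, *args, num_samples=500, seed=0):
    """Hypothetical helper: apply `statistic` to bootstrap resamples of `args`."""
    rng = np.random.default_rng(seed)
    arrays = [np.asarray(a) for a in args]
    n = arrays[0].shape[0]
    # One row of indices per bootstrap sample, drawn with replacement.
    index = rng.integers(0, n, size=(num_samples, n))
    # Assumption: the same indices are applied to every argument so that
    # paired observations remain aligned.
    return np.array([statistic(*(a[row] for a in arrays)) for row in index])


def accuracy(y_true, y_pred):
    """Fraction of positions where the two arrays agree."""
    return float(np.mean(y_true == y_pred))


labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
pred = np.array([0, 0, 1, 0, 0, 1, 1, 1, 0, 1])
scores = bootstrap_statistic(accuracy, labels, pred, num_samples=15)
print(scores.shape)  # (15,)
```

The result holds one value of the statistic per bootstrap sample, which matches the length of the arrays shown in the doctest above; the values themselves depend on the random draws.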

__init__(statistic: Callable[[numpy.ndarray], float] = numpy.mean, num_samples: int = 500, n_jobs: int = 1, BiB: bool = True)

property info

Information about the samples

get_params()

Return the parameters of the instance.

property calls

Dictionary holding the output of each call for which a name was given.

property n_jobs

Number of jobs to run in parallel.

property statistic

Statistic function.

property num_samples

Number of bootstrap samples.

property statistic_samples

Statistic samples from the latest call.

samples(N)

Samples drawn with replacement for a population of size N.

Parameters:

N (int) – Population size.
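As a rough illustration of what sampling for a population of size N can look like, the sketch below builds a matrix of indices with one row per bootstrap sample. The return type of samples(N) is not stated here, so the index-matrix layout, the variable names, and the fixed seed are all assumptions made for the example.

```python
import numpy as np

num_samples, N = 4, 6  # illustrative values only
rng = np.random.default_rng(0)

# Assumed layout: each row holds N indices drawn with replacement
# from range(N), one row per bootstrap sample.
index = rng.integers(0, N, size=(num_samples, N))

population = np.array([10, 20, 30, 40, 50, 60])
# Indexing with the matrix yields one resampled population per row.
resamples = population[index]
print(resamples.shape)  # (4, 6)
```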

keys()

Keys of the calls dictionary.

melt(var_name='Algorithm', value_name='Score')

Return the results as a long-format DataFrame, with columns named by var_name and value_name.
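The long format can be pictured with pandas alone. The sketch below is not the library's code: the calls dictionary, its keys, and the random scores are hypothetical stand-ins, and it assumes the stored results map a name to an array of statistic samples.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical per-name bootstrap scores; the values are random placeholders.
calls = {
    "model_a": rng.uniform(0.7, 0.9, size=5),
    "model_b": rng.uniform(0.6, 0.8, size=5),
}

wide = pd.DataFrame(calls)  # one column per name
# Stack the columns into (name, value) rows.
long_df = wide.melt(var_name="Algorithm", value_name="Score")
print(long_df.shape)  # (10, 2)
```

Each row of long_df pairs a name with a single score, a layout that plotting and group-by operations usually expect.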