utlvce.generators
The utlvce.generators module contains functions to generate random UT-LVCE models from given or random DAG adjacencies.
- utlvce.generators.chain_graph_model(p, I, num_latent, e, var_lo, var_hi, int_var_lo, int_var_hi, psi_lo, psi_hi, int_psi_lo, int_psi_hi, B_lo, B_hi, sparse_latents=False, obs=True, random_state=42, verbose=0)
Generate a random model from a chain graph with p nodes.
- Parameters
p (int) – The number of observed variables in the model.
I (set) – The set of intervention targets.
num_latent (int) – The number of latent variables in the model.
e (int) – The number of environments.
var_lo (float) – The lower bound for the variances of the noise terms of the observed variables.
var_hi (float) – The upper bound for the variances of the noise terms of the observed variables.
int_var_lo (float) – The lower bound for the intervention variances on the observed variables.
int_var_hi (float) – The upper bound for the intervention variances on the observed variables.
psi_lo (float) – The lower bound for the variances of the latent variables.
psi_hi (float) – The upper bound for the variances of the latent variables.
int_psi_lo (float) – The lower bound for the intervention variances on the latent variables.
int_psi_hi (float) – The upper bound for the intervention variances on the latent variables.
B_lo (float) – The lower bound for the edge weights between observed variables.
B_hi (float) – The upper bound for the edge weights between observed variables.
sparse_latents (bool, default=False) – Whether the gamma matrix of latent effects should be sparse (see source).
obs (bool, default=True) – Whether the first environment should be “observational”, i.e. whether the variances of its noise terms and latents are lower (variable-wise) than in the other environments. With obs=True, the variances for the first environment are sampled from [var_lo, var_hi], and those for the remaining environments from [var_lo + int_var_lo, var_hi + int_var_hi]; the same holds for the sampling of psi. If obs=False, the latter interval is used for all environments. Note that this is not a necessary assumption for the UT-LVCE estimator, but it makes the actual intervention strength less sensitive to the random sampling of parameters.
random_state (int, default=42) – Sets the random state for reproducibility. Successive calls with the same random state return the same model.
verbose (int, default=0) – Whether debug and execution traces should be printed. 0 corresponds to no traces; higher values correspond to higher verbosity.
- Returns
model – An instance of the model with the sampled parameters.
- Return type
utlvce.model.Model
- Raises
ValueError : – If the intervention targets are not a subset of the variable indices, i.e. [0,…,p-1].
Examples
>>> chain_graph_model(20,{2},2,5,0.5,0.6,3,6,0.2,0.4,1,5,0.7,0.8,False,True,42,0)
<utlvce.model.Model object at 0x...>
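The effect of obs on the sampling intervals can be sketched as follows. This is a minimal illustration of the intervals described for the obs parameter, not the library's code; the helper name variance_intervals is hypothetical, and how individual values are drawn inside each interval is not shown.

```python
def variance_intervals(lo, hi, int_lo, int_hi, e, obs=True):
    """Return one (lower, upper) sampling interval per environment.

    Hypothetical helper: it mirrors the documented intervals only and is
    not part of utlvce.
    """
    if e < 1:
        raise ValueError("e must be at least 1")
    # Interval used for the non-observational environments.
    shifted = (lo + int_lo, hi + int_hi)
    # With obs=True the first environment keeps the unshifted interval.
    first = (lo, hi) if obs else shifted
    return [first] + [shifted] * (e - 1)


# Same bounds as in the example above: var_lo=0.5, var_hi=0.6,
# int_var_lo=3, int_var_hi=6, e=5.
intervals = variance_intervals(0.5, 0.6, 3, 6, e=5)
for env, (lower, upper) in enumerate(intervals):
    print(env, lower, upper)
```

The same function applies to the latent variances by passing psi_lo, psi_hi, int_psi_lo and int_psi_hi instead.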
- utlvce.generators.intervention_targets(p, num_targets, random_state=42)
Sample a set of intervention targets.
- Parameters
p (int) – The number of variables, i.e. targets will be sampled from [0,p-1].
num_targets (int or tuple) – Specifies the number of targets. If a two-element tuple, the number of targets is sampled uniformly at random from [num_targets[0], num_targets[1]].
random_state (int, default=42) – Sets the random state for reproducibility.
- Returns
targets – A set with the indices of the intervention targets.
- Return type
set
- Raises
ValueError : – If the given number of targets is invalid.
Examples
>>> intervention_targets(20, 3)
{1, 13, 14}
>>> intervention_targets(20, (1,10), random_state=1)
{0, 2, 8, 12, 17}
>>> intervention_targets(20, (1,10), random_state=2)
{1, 3, 4, 6, 8, 13, 18, 19}
>>> intervention_targets(10, 10)
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> intervention_targets(10, 0)
set()
Requesting an inappropriate (>p) number of targets yields a ValueError:
>>> intervention_targets(10, 11)
Traceback (most recent call last):
  ...
ValueError: Invalid number of targets.
>>> intervention_targets(10, (0,11))
Traceback (most recent call last):
  ...
ValueError: Invalid number of targets.
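The handling of num_targets (a fixed count or a two-element range) can be sketched with the standard library alone. The function sample_targets below is a hypothetical stand-in written for illustration; it follows the documented behaviour but uses Python's random module, so it will not reproduce the sets shown in the examples above.

```python
import random


def sample_targets(p, num_targets, seed=42):
    """Sample a set of targets from range(p) (illustrative sketch only)."""
    rng = random.Random(seed)
    if isinstance(num_targets, tuple):
        lo, hi = num_targets
        # Both ends of the range must be valid counts.
        if not (0 <= lo <= hi <= p):
            raise ValueError("Invalid number of targets.")
        # The count itself is drawn uniformly from [lo, hi], inclusive.
        n = rng.randint(lo, hi)
    else:
        n = num_targets
        if not (0 <= n <= p):
            raise ValueError("Invalid number of targets.")
    # Draw n distinct indices without replacement.
    return set(rng.sample(range(p), n))


print(len(sample_targets(20, 3)))        # a set with exactly 3 targets
print(sample_targets(10, 10))            # all indices 0..9
print(sample_targets(10, 0))             # the empty set
```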
- utlvce.generators.random_graph_model(p, k, I, num_latent, e, var_lo, var_hi, int_var_lo, int_var_hi, psi_lo, psi_hi, int_psi_lo, int_psi_hi, B_lo, B_hi, sparse_latents=False, obs=True, random_state=42, verbose=0)
Generate a random model from a random Erdős–Rényi graph with p nodes and average degree k.
- Parameters
p (int) – The number of observed variables in the model.
k (float) – The average degree of the underlying Erdős–Rényi graph.
I (set) – The set of intervention targets.
num_latent (int) – The number of latent variables in the model.
e (int) – The number of environments.
var_lo (float) – The lower bound for the variances of the noise terms of the observed variables.
var_hi (float) – The upper bound for the variances of the noise terms of the observed variables.
int_var_lo (float) – The lower bound for the intervention variances on the observed variables.
int_var_hi (float) – The upper bound for the intervention variances on the observed variables.
psi_lo (float) – The lower bound for the variances of the latent variables.
psi_hi (float) – The upper bound for the variances of the latent variables.
int_psi_lo (float) – The lower bound for the intervention variances on the latent variables.
int_psi_hi (float) – The upper bound for the intervention variances on the latent variables.
B_lo (float) – The lower bound for the edge weights between observed variables.
B_hi (float) – The upper bound for the edge weights between observed variables.
sparse_latents (bool, default=False) – Whether the gamma matrix of latent effects should be sparse (see source).
obs (bool, default=True) – Whether the first environment should be “observational”, i.e. whether the variances of its noise terms and latents are lower (variable-wise) than in the other environments. With obs=True, the variances for the first environment are sampled from [var_lo, var_hi], and those for the remaining environments from [var_lo + int_var_lo, var_hi + int_var_hi]; the same holds for the sampling of psi. If obs=False, the latter interval is used for all environments. Note that this is not a necessary assumption for the UT-LVCE estimator, but it makes the actual intervention strength less sensitive to the random sampling of parameters.
random_state (int, default=42) – Sets the random state for reproducibility. Successive calls with the same random state return the same model.
verbose (int, default=0) – Whether debug and execution traces should be printed. 0 corresponds to no traces; higher values correspond to higher verbosity.
- Returns
model – An instance of the model with the sampled parameters.
- Return type
utlvce.model.Model
- Raises
ValueError : – If the intervention targets are not a subset of the variable indices, i.e. [0,…,p-1].
Examples
>>> random_graph_model(20,2.1,{2},2,5,0.5,0.6,3,6,0.2,0.4,1,5,0.7,0.8,False,True,42,0)
<utlvce.model.Model object at 0x...>
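To make the relation between p and the average degree k concrete, here is one common way to draw an Erdős–Rényi DAG. It is a sketch under assumptions: the edge probability k / (p - 1) and the random causal ordering are a standard construction, not something this page states about the library, and the name erdos_renyi_dag is hypothetical.

```python
import random


def erdos_renyi_dag(p, k, seed=42):
    """Return (A, order): a p x p adjacency (lists of 0/1) and a causal order.

    A[i][j] == 1 means i -> j. Illustrative only; not the utlvce sampler.
    """
    rng = random.Random(seed)
    # Each node has p - 1 possible neighbours, so an edge probability of
    # k / (p - 1) gives an expected degree of k.
    q = min(1.0, k / (p - 1)) if p > 1 else 0.0
    # A random ordering of the nodes; edges only point forward in it,
    # which rules out cycles by construction.
    order = list(range(p))
    rng.shuffle(order)
    A = [[0] * p for _ in range(p)]
    for a in range(p):
        for b in range(a + 1, p):
            if rng.random() < q:
                A[order[a]][order[b]] = 1
    return A, order


A, order = erdos_renyi_dag(20, 2.1)
num_edges = sum(map(sum, A))
print("average degree:", 2 * num_edges / 20)
```

The printed average degree fluctuates around k from one seed to the next; only its expectation equals k.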
- utlvce.generators.sample_parameters(A, I, num_latent, e, var_lo, var_hi, int_var_lo, int_var_hi, psi_lo, psi_hi, int_psi_lo, int_psi_hi, B_lo, B_hi, sparse_latents=False, obs=True, random_state=42, verbose=0)
Generate a random model given an adjacency matrix A and intervention targets I.
- Parameters
A (numpy.ndarray) – The adjacency matrix of the DAG underlying the model, where A[i,j] != 0 implies i -> j.
I (set) – The set of intervention targets.
num_latent (int) – The number of latent variables in the model.
e (int) – The number of environments.
var_lo (float) – The lower bound for the variances of the noise terms of the observed variables.
var_hi (float) – The upper bound for the variances of the noise terms of the observed variables.
int_var_lo (float) – The lower bound for the intervention variances on the observed variables.
int_var_hi (float) – The upper bound for the intervention variances on the observed variables.
psi_lo (float) – The lower bound for the variances of the latent variables.
psi_hi (float) – The upper bound for the variances of the latent variables.
int_psi_lo (float) – The lower bound for the intervention variances on the latent variables.
int_psi_hi (float) – The upper bound for the intervention variances on the latent variables.
B_lo (float) – The lower bound for the edge weights between observed variables.
B_hi (float) – The upper bound for the edge weights between observed variables.
sparse_latents (bool, default=False) – Whether the gamma matrix of latent effects should be sparse (see source).
obs (bool, default=True) – Whether the first environment should be “observational”, i.e. whether the variances of its noise terms and latents are lower (variable-wise) than in the other environments. With obs=True, the variances for the first environment are sampled from [var_lo, var_hi], and those for the remaining environments from [var_lo + int_var_lo, var_hi + int_var_hi]; the same holds for the sampling of psi. If obs=False, the latter interval is used for all environments. Note that this is not a necessary assumption for the UT-LVCE estimator, but it makes the actual intervention strength less sensitive to the random sampling of parameters.
random_state (int, default=42) – Sets the random state for reproducibility. Successive calls with the same random state return the same model.
verbose (int, default=0) – Whether debug and execution traces should be printed. 0 corresponds to no traces; higher values correspond to higher verbosity.
- Returns
model – An instance of the model with the sampled parameters.
- Return type
utlvce.model.Model
- Raises
ValueError : – If the given adjacency is not a DAG or the intervention targets are not a subset of the variable indices, i.e. [0,…,p-1].
Examples
>>> import numpy as np
>>> A = np.array([[0, 0, 1], [0, 0, 1], [0, 0, 0]])
>>> sample_parameters(A,{2},2,5,0.5,0.6,3,6,0.2,0.4,1,5,0.7,0.8,False,True,42,0)
<utlvce.model.Model object at 0x...>
Passing intervention targets that are not a subset of the variable indices [0,…,p-1] yields a ValueError:
>>> sample_parameters(A,{3},2,5,0.5,0.6,3,6,0.2,0.4,1,5,0.7,0.8,False,True,42,0)
Traceback (most recent call last):
  ...
ValueError: The intervention targets must be a subset of [0,...,p-1].
A ValueError is raised if the given adjacency does not correspond to a DAG (e.g. it contains cycles):
>>> A = np.array([[0, 0, 1], [0, 0, 1], [1, 0, 0]])
>>> sample_parameters(A,{2},2,5,0.5,0.6,3,6,0.2,0.4,1,5,0.7,0.8,False,True,42,0)
Traceback (most recent call last):
  ...
ValueError: The given adjacency does not correspond to a DAG.
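The two input conditions behind these errors can be checked before calling the generator. The sketch below uses Kahn's algorithm for the acyclicity test and plain nested lists instead of NumPy arrays; is_dag and check_inputs are hypothetical helpers, and the order in which the library performs its own checks is an assumption.

```python
from collections import deque


def is_dag(A):
    """Return True if adjacency A (A[i][j] != 0 means i -> j) has no cycle."""
    p = len(A)
    # Number of incoming edges for every node.
    indegree = [sum(1 for i in range(p) if A[i][j] != 0) for j in range(p)]
    queue = deque(j for j in range(p) if indegree[j] == 0)
    visited = 0
    while queue:
        i = queue.popleft()
        visited += 1
        # Removing node i frees every child whose other parents are done.
        for j in range(p):
            if A[i][j] != 0:
                indegree[j] -= 1
                if indegree[j] == 0:
                    queue.append(j)
    # Nodes on a cycle never reach indegree 0 and are never visited.
    return visited == p


def check_inputs(A, I):
    """Raise ValueError using the messages shown in the examples above."""
    if not is_dag(A):
        raise ValueError("The given adjacency does not correspond to a DAG.")
    if not set(I) <= set(range(len(A))):
        raise ValueError(
            "The intervention targets must be a subset of [0,...,p-1].")


check_inputs([[0, 0, 1], [0, 0, 1], [0, 0, 0]], {2})  # passes silently
```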