SOFA

class sofa.models.SOFA.SOFA(Xmdata: Union[None, mudata._core.mudata.MuData] = None, num_factors: Union[None, int] = None, Ymdata: Union[None, mudata._core.mudata.MuData] = None, design: Union[None, numpy.ndarray] = None, device: Optional[Literal['cuda', 'cpu']] = 'cpu', horseshoe: bool = True, update_freq: int = 200, subsample: int = 0, metadata: Union[None, pandas.core.frame.DataFrame] = None, verbose: bool = True, horseshoe_scale_feature: float = 1, horseshoe_scale_factor: float = 1, horseshoe_scale_global: float = 1, disp_mean: float = 3.0, seed: Union[None, int] = None)

Initializes a SOFA model instance.

Parameters
  • Xmdata (MuData) – Input data views. Each view should be centered and scaled.

  • num_factors (int) – Number of latent factors.

  • Ymdata (MuData, optional) – Guide data. The default is None.

  • design (numpy.ndarray, optional) – Design matrix for supervised factors. The default is None.

  • device (str, optional) – Device on which to fit the model (“cuda” or “cpu”). The default is “cpu”.

  • horseshoe (bool, optional) – Whether to use horseshoe priors on the loadings. The default is True.

  • update_freq (int, optional) – Number of training steps between ELBO printouts. The default is 200.

  • subsample (int, optional) – Number of samples to use for each minibatch. The default is 0 (use all samples).

  • metadata (pandas.DataFrame, optional) – Dataframe with sample metadata. The default is None.

  • verbose (bool, optional) – Whether to print fitting progress. The default is True.

  • horseshoe_scale_feature (float, optional) – Scale for feature-specific horseshoe prior. The default is 1.

  • horseshoe_scale_factor (float, optional) – Scale for factor-specific horseshoe prior. The default is 1.

  • horseshoe_scale_global (float, optional) – Scale for global horseshoe prior. The default is 1.

  • seed (int, optional) – Random seed. The default is None.
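
The requirement that each view be centered and scaled can be met before the data are wrapped in a MuData object. The following is a minimal sketch, assuming a view is a dense samples-by-features NumPy array without missing values. The helper name center_and_scale is hypothetical and not part of sofa.

```python
import numpy as np


def center_and_scale(X):
    """Return a copy of X (samples x features) with zero-mean, unit-variance columns.

    Hypothetical helper for illustration; not part of the sofa package.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant features: avoid division by zero
    return (X - mean) / std


# Toy view: 3 samples, 2 features (the second feature is constant).
X = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
Z = center_and_scale(X)
```

Constant features are mapped to zero here rather than dropped; whether to remove them instead is a preprocessing choice the source does not address.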

fit(n_steps=3000, lr=0.005, refit=False, predict=True)

Fits the SOFA model.

Parameters
  • n_steps (int, optional) – Number of iterations for fitting. The default is 3000.

  • lr (float, optional) – Learning rate for the Adam optimizer. The default is 0.005.

  • refit (bool, optional) – Whether to refit the model. With refit=False, calling fit a second time does not reinitialize the model. The default is False.

Returns

None.
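
A typical call sequence can be sketched as follows. The import path and keyword arguments are taken from the signatures documented here. The wrapper fit_sofa itself is a hypothetical convenience function, and the sketch assumes Xmdata is an already prepared MuData object.

```python
def fit_sofa(Xmdata, num_factors, n_steps=3000, lr=0.005, seed=0):
    """Build a SOFA model and fit it, using only the documented arguments.

    Hypothetical wrapper for illustration; not part of the sofa package.
    """
    # Imported lazily so this helper can be defined without sofa installed.
    from sofa.models.SOFA import SOFA

    model = SOFA(Xmdata=Xmdata, num_factors=num_factors, device="cpu", seed=seed)
    # fit returns None; the fitted state lives on the model object.
    model.fit(n_steps=n_steps, lr=lr)
    return model
```

Going by the description of refit, a second call such as model.fit(n_steps=1000) on the returned object would not reinitialize the model, whereas model.fit(n_steps=1000, refit=True) would refit it.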

predict(site, num_samples=25, num_split=1024, verbose=False)

Samples from the approximate posterior distribution.

Parameters
  • site (list) – Name of parameter site to predict.

  • num_samples (int, optional) – Number of samples to draw from the approximate posterior distribution. The default is 25.

  • num_split (int, optional) – Chunk size: local samples are predicted in chunks of num_split. The default is 1024.

  • verbose (bool, optional) – Whether to show prediction progress. The default is False.

Returns

Predicted parameter values

Return type

numpy.ndarray
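
The effect of num_split can be illustrated with a small sketch of how a set of observations might be divided into chunks. This only shows the index arithmetic; chunk_bounds is a hypothetical helper, and how the library actually partitions and recombines the chunks is an assumption not confirmed by the description above.

```python
def chunk_bounds(n_obs, num_split=1024):
    """Return (start, stop) index pairs covering range(n_obs) in chunks of num_split.

    Hypothetical helper for illustration; not part of the sofa package.
    """
    if num_split < 1:
        raise ValueError("num_split must be a positive integer")
    # The last chunk is shorter when n_obs is not a multiple of num_split.
    return [
        (start, min(start + num_split, n_obs))
        for start in range(0, n_obs, num_split)
    ]


bounds = chunk_bounds(2500, num_split=1024)
```

For 2500 observations and the default num_split of 1024, this gives three chunks, the last covering the remaining 452 observations. A smaller num_split would presumably lower peak memory use at the cost of more passes.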