The ARMED framework for a generic neural network. The conventional model (blue area) predicts y_F from the data sample x. Cluster membership of the sample is one-hot encoded into z. The fixed effects subnetwork (blue + gray areas) is constructed by adding an adversarial classifier (gray area) to predict cluster membership z. The original model is penalized through the generalization loss for learning features that allow cluster membership prediction. The random effects subnetwork (orange area) uses Bayesian layers to learn cluster-specific weights, dependent on z, that follow zero-mean multivariate normal distributions. These weights can be formulated as nonlinear slopes multiplied by the fixed effects latent representation h_F(X; β), linear slopes multiplied by X, and/or intercepts. The fixed and random effects are combined with a mixing function m(...). For prediction on data from clusters unseen during training, z is inferred with a classifier (Z-predictor) trained on data from seen clusters.
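The data flow in this figure can be illustrated with a minimal NumPy sketch of a single forward pass. This is not the authors' implementation; everything in it is an assumption made for illustration: the layer sizes, the `tanh` hidden layer, the additive form of the mixing function `mix`, and the form of the generalization loss (prediction error minus a weighted adversary cross-entropy). The Bayesian layers are replaced by a single draw from a zero-mean normal prior, only linear slopes and intercepts are shown, and the Z-predictor is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: samples, input features, clusters, hidden units.
n, d, k, hidden = 8, 4, 3, 5

X = rng.normal(size=(n, d))                 # data samples
y = rng.normal(size=(n, 1))                 # regression targets (synthetic)
cluster = rng.integers(0, k, size=n)        # cluster index per sample
Z = np.eye(k)[cluster]                      # one-hot cluster membership z


def fixed_effects(X, beta, w_out):
    """Conventional model: latent representation h_F(X; beta) and prediction y_F."""
    h_F = np.tanh(X @ beta)
    return h_F, h_F @ w_out


def adversary(h_F, W_adv):
    """Adversarial classifier: softmax prediction of cluster membership from h_F."""
    logits = h_F @ W_adv
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def random_effects(X, Z, U_slope, u_int):
    """Cluster-specific linear slopes times X, plus cluster-specific intercepts."""
    slopes = Z @ U_slope                    # select each sample's cluster slopes
    return np.sum(slopes * X, axis=1, keepdims=True) + Z @ u_int


def mix(y_F, y_R):
    """Assumed mixing function m(...): simple addition."""
    return y_F + y_R


# Fixed effects weights (random initial values, no training in this sketch).
beta = rng.normal(size=(d, hidden))
w_out = rng.normal(size=(hidden, 1))
W_adv = rng.normal(size=(hidden, k))

# Stand-in for the Bayesian layers: one draw from a zero-mean normal prior.
sigma = 0.1
U_slope = rng.normal(0.0, sigma, size=(k, d))
u_int = rng.normal(0.0, sigma, size=(k, 1))

h_F, y_F = fixed_effects(X, beta, w_out)
z_hat = adversary(h_F, W_adv)
y_R = random_effects(X, Z, U_slope, u_int)
y_M = mix(y_F, y_R)

# Assumed generalization loss: the fixed effects model is rewarded when the
# adversary fails, so the adversary's cross-entropy enters with a minus sign.
lam = 1.0
mse_F = np.mean((y - y_F) ** 2)
ce_adv = -np.mean(np.sum(Z * np.log(z_hat + 1e-12), axis=1))
loss_gen = mse_F - lam * ce_adv
```

In an actual training loop the adversary would be updated to minimize `ce_adv` while the fixed effects weights are updated to minimize `loss_gen`, and the random effects weights would be learned as distributions rather than drawn once.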