Max-Fidelity Linear Discriminant Analysis ========================================= Documentation for :py:mod:`iq_readout.two_state_classifiers.MaxFidLinearClassifier` Characteristics --------------- - 2-state classifier for 2D data - Uses threshold to classify (projected) 2D data - Decision boundary is a straight line (hence *linear*) - Does not assume any probability density function for the 2D data, thus - the PDFs will correspond to the histograms of the calibration data - the accuracy in the threshold is limited by the bin separation of the histogram of the calibration data Example ------- .. plot:: import matplotlib.pyplot as plt import numpy as np from iq_readout.two_state_classifiers import MaxFidLinearClassifier from iq_readout.plots.shots1d import plot_two_pdfs_projected from iq_readout.plots.shots2d import plot_shots_2d, plot_boundaries_2d shots_0, shots_1 = np.load("data_two_state_calibration.npy") classifier = MaxFidLinearClassifier.fit(shots_0, shots_1) fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 4.5)) axes[0] = plot_shots_2d(axes[0], shots_0, shots_1) axes[1] = plot_two_pdfs_projected( axes[1], classifier, shots_0, shots_1, ) axes[0] = plot_boundaries_2d(axes[0], classifier) axes[0].set_title("Decision boundaries") axes[1].set_title("PDFs") plt.show() Threshold classifier -------------------- The threshold classifiers are two-state linear classifiers that use a threshold line (hence *linear*) to divide the 2D plane into two regions (hence *two-state classifier*). Given a threshold :math:`z_{thr}` and a projection function :math:`\vec{z} \rightarrow z_{\parallel}=\vec{z}\cdot \hat{e}_{\parallel}`, it outputs 0 if :math:`z_{\parallel} \leq z_{thr}` and 1 otherwise. *Note: the threshold can be updated based on the prior probabilities of the classes, but it requires some computation* Linearity --------- The linearity of this classifier comes from its definition, as the decision boundary is a line perpendicular to :math:`\hat{e}_{\parallel}` that crosses the point :math:`\vec{A} = z_{thr}\hat{e}_{\parallel}`. Maximum fidelity in a threshold classifier ------------------------------------------ The assignment infidelity :math:`\epsilon` is defined as :math:`\epsilon = (p(m=0|s=1) + p(m=1|s=0))/2`, where :math:`p(m|s)` is the probability of measuring :math:`m` given that the qubit was in state :math:`s`. Note that we have assumed that the probability of seeing state 0 and 1 are the same (i.e. p(s=0)=p(s=1)=1/2), but it can be generalized to any :math:`p(s)` using :math:`\epsilon = p(m=1|s=0)p(s=0) + p(m=0|s=1)p(s=1)`. Given the cumulative density functions :math:`CDF(z_{\parallel}|s)`, we can rewrite the assingment infidelity as .. math:: \epsilon = \frac{1}{2} [CDF(z_{thr}|1) + 1 - CDF(z_{thr}|0)], because the probability of incorrectly assigning :math:`m=0` to state :math:`s=1` is the probability that the projected data is less than the threshold, and equivalently for the other case. Note that we have assumed that (1) the 0 blob is on the left of the 1 blob in the projected axis, and (2) the density functions "behave as expected", meaning that :math:`CDF(z_{\parallel}|0) \geq CDF(z_{\parallel}|1) \;\forall z_{\parallel}`. This last assumption can be broken if the PDFs exhibit more than one maximum. The threshold that maximizes the assingment fidelity :math:`F = 1 - \epsilon` (i.e. minimizes the assignment infidelity) is given by the point that maximizes the distance between the cumulative density functions, .. math:: z^*_{thr} = \mathrm{argmax}_{z_{thr}} F = \mathrm{argmax}_{z_{thr}} CDF(z_{thr}|0) - CDF(z_{thr}|1), where we have omited the constants and factors as we are only interested in the :math:`\mathrm{argmax}`, not the :math:`\mathrm{max}` value. In the general case where the priors are not equal, we have .. math:: z^*_{thr} = \mathrm{argmax}_{z_{thr}} F = \mathrm{argmax}_{z_{thr}} CDF(z_{thr}|0)p(s=0) - CDF(z_{thr}|1)p(s=1), Notes on the algorithm ---------------------- As the classifier is linear, the data can be projected to the axis orthogonal to the decision boundary. The projection axis corresponds to the line with direction :math:`\vec{\mu}_1 - \vec{\mu}_0` that crosses these two means. The direction is chosen this way to have the *blob* from state 0 on the left and the *blob* from state 1 on the right. The projection axis can be estimated from the means of the data for each class :math:`c`, :math:`\{\vec{z}^{(i)}_c\}_i`, given by .. math:: \vec{\nu}_c = \frac{1}{N}\sum_{i=1}^N \vec{z}^{(i)}_c, because :math:`\vec{\mu}_1 - \vec{\mu}_0 \propto \vec{\nu}_1 - \vec{\nu}_0`. The justification is that, given :math:`\vec{z}_c \sim p(\vec{z}|c)`, the estimator of the mean is :math:`\vec{\nu}_c = \sin^2(\theta_c) \vec{\mu}_0 + \cos^2(\theta_c) \vec{\mu}_1`, thus :math:`\vec{\nu}_1 - \vec{\nu}_0 = (\sin^2(\theta_1) - \sin^2(\theta_0)) (\vec{\mu}_1 - \vec{\mu}_0)`. The algorithm uses the following tricks: #. work with projected data (to have more samples in each bin of the histogram)