PyF2F: a robust and simplified fluorophore-to-fluorophore distance measurement tool for Protein interactions from Imaging Complexes after Translocation experiments

Abstract Structural knowledge of protein assemblies in their physiological environment is paramount to understanding cellular functions at the molecular level. Protein interactions from Imaging Complexes after Translocation (PICT) is a live-cell imaging technique for the structural characterization of macromolecular assemblies in living cells. PICT relies on measuring the separation between labelled molecules using fluorescence microscopy and cell engineering. Unfortunately, extracting molecular distances requires a variety of sophisticated software programs, which challenges reproducibility and limits the technique to highly specialized researchers. Here we introduce PyF2F, Python-based software that provides a workflow for measuring molecular distances from PICT data and requires minimal programming expertise from the user. We used a published dataset to validate PyF2F's performance.

• Compute the log-likelihood given an initial estimate of µ and σ:
$$\log \mathcal{L}(\mu, \sigma) = \sum_{d \in D} \log p(d; \mu, \sigma)$$
where the sum runs over the distances d in the dataset D and p is the Rice probability density.
• The data point that is least likely to belong to the dataset is the one whose rejection gives the worst log-likelihood. This data point is rejected and a new estimate of µ and σ is computed.
• The process is iterated until a given percentage of the dataset, defined by the user, has been sampled for rejection, starting from the largest distances (which define the tail of the distribution, where outliers, if present, are most problematic). We do not expect to reject that many data points, but ⅓ of the dataset is a safe parameter to ensure sufficient sampling.
• Two subsequent rejections i and i+1 give two estimates of µ. Their difference, $\delta\mu_i = \mu_{i+1} - \mu_i$, decreases once most outliers have been rejected, and the score is then maximal (scores are computed for each iteration).
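The rejection loop described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the PyF2F implementation: the function names (`log_i0`, `rice_log_likelihood`, `fit_rice`, `reject_largest`, `simulate`) are hypothetical, the maximum-likelihood fit uses a plain compass search chosen for brevity, the largest remaining distance is rejected at each iteration as a simplification, and the data are synthetic. The sketch stops at the differences δµ and does not attempt to reproduce the score.

```python
import math
import random
import statistics


def log_i0(z):
    """Natural log of the modified Bessel function I0(z) for z >= 0."""
    if z > 20.0:
        # Large-argument expansion: I0(z) ~ e^z / sqrt(2 pi z) * (1 + 1/(8z) + 9/(128 z^2)).
        series = 1.0 + 1.0 / (8.0 * z) + 9.0 / (128.0 * z * z)
        return z - 0.5 * math.log(2.0 * math.pi * z) + math.log(series)
    # Power series: I0(z) = sum_k (z^2 / 4)^k / (k!)^2.
    q, term, total, k = 0.25 * z * z, 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return math.log(total)


def rice_log_likelihood(distances, mu, sigma):
    """log L(mu, sigma) = sum over d of log p(d; mu, sigma), with p the Rice density."""
    s2 = sigma * sigma
    return sum(
        math.log(d) - math.log(s2) - (d * d + mu * mu) / (2.0 * s2) + log_i0(d * mu / s2)
        for d in distances
    )


def fit_rice(distances, mu0, sigma0, tol=1e-2):
    """Maximise the log-likelihood with a compass search (an assumed, simple optimiser)."""
    mu, sigma = mu0, sigma0
    best = rice_log_likelihood(distances, mu, sigma)
    step = 0.25 * sigma0
    while step > tol:
        improved = False
        for d_mu, d_sigma in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            m, s = mu + d_mu, sigma + d_sigma
            if m < 0.0 or s <= 0.0:
                continue
            ll = rice_log_likelihood(distances, m, s)
            if ll > best:
                mu, sigma, best, improved = m, s, ll, True
        if not improved:
            step *= 0.5  # no neighbour is better: refine the search step
    return mu, sigma


def reject_largest(distances, fraction=1 / 3):
    """Reject the largest remaining distance one at a time, refitting after each rejection.

    Returns the successive (mu, sigma) estimates and delta_mu[i] = mu[i + 1] - mu[i].
    """
    data = sorted(distances)
    n_reject = round(len(data) * fraction)
    mu, sigma = fit_rice(data, statistics.mean(data), statistics.stdev(data))
    estimates = [(mu, sigma)]
    for _ in range(n_reject):
        data.pop()  # drop the current largest distance
        mu, sigma = fit_rice(data, mu, sigma)  # warm start from the previous fit
        estimates.append((mu, sigma))
    delta_mu = [b[0] - a[0] for a, b in zip(estimates, estimates[1:])]
    return estimates, delta_mu


def simulate(n_inliers=60, n_outliers=6, mu=30.0, sigma=8.0, seed=1):
    """Synthetic distances (nm): Rice-distributed inliers plus uniformly spread far outliers."""
    rng = random.Random(seed)
    inliers = [
        math.hypot(mu + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        for _ in range(n_inliers)
    ]
    outliers = [rng.uniform(80.0, 120.0) for _ in range(n_outliers)]
    return inliers + outliers


estimates, delta_mu = reject_largest(simulate())
```

Plotting `delta_mu` against the iteration index gives a quick visual check of when the µ estimate stabilises; on real data the rejected fraction and the stopping criterion should follow the procedure described in this note.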

Note S2. Interpreting the bootstrap method for the outlier rejection
The bootstrap method aims to find the distance distribution that maximises the log-likelihood when estimating the parameters µ and σ of a Rice distribution (2). We achieve this by iteratively rejecting the data points that are least likely to belong to the dataset (false positives, i.e. outliers) and that consequently lower the log-likelihood. As shown in Figure 2B, the bootstrap method scores the estimate of µ for the distribution resulting from each outlier rejection (black dots), which indicates how likely that distribution is to contain outliers: the lower the score, the more likely it is to contain outliers (Figure 2B). Although this approach is suitable for detecting outliers (it selects the rejection that maximises the scoring function), it has some limitations that need to be considered when interpreting the results:
1) Two-parameter estimation: two parameters (µ and σ) are estimated simultaneously by maximum likelihood throughout a minimisation process (see Note S1). When convergence is not achieved equally for µ and σ (because of a large number of outliers or a low number of observations in the distance distribution), one may find two or more "peaks" (relative score maxima) for the µ and σ estimates.
2) Dataset quality: the contamination of the dataset (number of outliers) is directly influenced by the quality of the images analysed and by the number of paired anchor-RFP-FKBP and prey-GFP spots that are suitable for the distance measurement (number of true positives). We assumed that the contamination of the dataset with false positives should not exceed 20% of the data points. Larger contaminations due to poor-quality images (out-of-focus spots, inefficient anchoring, inaccurate cell segmentation, etc.) might lower the performance of the bootstrap method and the accuracy of the estimates.
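For reference, the Rice density invoked in this note has the standard form below. We assume here that µ and σ play the roles of its non-centrality and scale parameters; the symbol d for a measured distance is our notation.

```latex
p(d;\mu,\sigma) \;=\; \frac{d}{\sigma^{2}}
  \exp\!\left(-\frac{d^{2}+\mu^{2}}{2\sigma^{2}}\right)
  I_{0}\!\left(\frac{d\,\mu}{\sigma^{2}}\right),
  \qquad d \ge 0
```

Here I_0 is the modified Bessel function of the first kind of order zero. When µ is much larger than σ the density approaches a Gaussian centred near µ, whereas for µ comparable to σ it is markedly skewed, which is one reason why the two parameters can be difficult to estimate jointly from small or contaminated datasets.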

Note S3. A practical case to illustrate PICT assay assessment with PyF2F
As a proof of concept, we analysed two PICT datasets from the same yeast strain (i.e. OGY1322 cells expressing the anchor-RFP-FKBP, Exo70-FRB as bait and Sec5-mNeonGreen (Sec5-mNG) as prey; mNG is a green fluorescent protein equivalent to 3xGFP for the purposes of this example). Each PICT dataset was acquired by treating cell samples with a different batch of latrunculin A (LatA): a fresh batch (batch 1) and a diluted batch that had been stored in solution for more than 6 months (batch 2).
LatA is used to inhibit the polymerization of the actin cables, thereby facilitating the static anchoring of Sla2-based platforms on the plasma membrane (1). Diluted LatA might allow partial actin polymerization, which would affect the properties of the anchor-RFP-FKBP platforms (structure and dynamism). As shown in Figure S2A, the fluorescent spots corresponding to the anchor-RFP-FKBP imaged in different biological replicates can be grouped in the 2D space defined by the mean logarithmic intensity and the mean second moment. Although the datasets acquired in cells treated with different LatA batches do not show a significantly different second moment on average, the statistical dispersion of their distributions suggests that the underlying structure or dynamism of the anchor-RFP-FKBP platforms is different. Indeed, the mean logarithmic intensity of the anchor-RFP-FKBP spots is significantly higher for the PICT dataset treated with LatA batch 1 than for the dataset obtained with the diluted batch 2, suggesting that subtle differences in the LatA batch might affect the structure of the anchor-RFP-FKBP platform (i.e. the copy number that populates the anchor-RFP-FKBP platforms).
The consequences that deficient anchor-RFP-FKBP platforms might have on the distance estimations are likely to depend on many factors, such as whether the interactions are intra-assembly or inter-assembly. The latter are probably the more sensitive to slight defects in the PICT assay because of their intrinsically transient behaviour. We further compared the estimates of the distance (µ) and standard deviation (σ) for the same yeast strain (i.e. cells expressing anchor-RFP-FKBP, Exo70-FRB as bait and Sec5-mNG as prey) in four biological replicates for each LatA batch (Figure S2B). No significant differences were observed between the estimates across replicates imaged with the different LatA batches (Figure S2B). However, the estimates of σ in cells treated with LatA batch 1 are comparatively smaller than those in cells treated with LatA batch 2.

Note S4. Imaging recommendation
Users must carefully monitor several factors to minimise the registration error when aligning the two-channel PICT images: changes of focus during image acquisition, temperature variation, mechanical drift, changes in the light path, sample dynamics, etc. (3, 4, 5). To minimise the likelihood of aberrations we suggest: 1) controlling the axial drift with an automated focus correction system (we use the Perfect Focus System (PFS), Nikon), 2) adding a delay between the movement to a new FOV and the image acquisition to minimise X-Y mechanical drift (we use a 2.5-second waiting time), 3) using an objective with apochromatic lenses, 4) imaging the fiducial markers and the PICT sample on the same holder, support and media, and 5) using a dual-band bandpass filter cube that does not require any filter change during the whole experiment, to minimise light path variations.

Figure S1. Accuracy of image registration
A) The distribution of centroid-to-centroid distances between the bead coordinates in the two channels is depicted before (grey) and after (cyan) two-channel alignment. B, C) After alignment, we can assess the shift distribution (deviation of bead coordinates in one channel with respect to the other) along B) the x-axis and C) the y-axis across the field of view. Each dot shows a single bead, colour-coded to indicate the coordinate shift of that bead between the two channels. D, E) The average deviation between the beads' x and y positions in the two channels after alignment can be derived by fitting each lateral distribution to a Gaussian function. F) Registration accuracy (target registration error, TRE). We proceed to the subsequent steps of the image analysis only when the registration accuracy is less than 1 nanometer.
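The quantities shown in panels A to E can be illustrated with a short standard-library Python sketch. It is only an assumption-laden example: the function name `registration_summary` and the toy bead coordinates are hypothetical, coordinates are assumed to be in nanometres, and the Gaussian fit is taken as the maximum-likelihood fit (sample mean and population standard deviation). The TRE of panel F is deliberately not computed here.

```python
import math
import statistics


def registration_summary(beads_ch1, beads_ch2):
    """Summarise the residual shift between paired bead centroids in two channels.

    Both inputs are equally long lists of (x, y) coordinates for the same beads.
    """
    dx = [x2 - x1 for (x1, _), (x2, _) in zip(beads_ch1, beads_ch2)]
    dy = [y2 - y1 for (_, y1), (_, y2) in zip(beads_ch1, beads_ch2)]
    return {
        # Centroid-to-centroid distance per bead (the quantity plotted in panel A).
        "distances": [math.hypot(a, b) for a, b in zip(dx, dy)],
        # Maximum-likelihood Gaussian fit of each lateral shift distribution:
        # fitted mean = sample mean, fitted width = population standard deviation.
        "mean_dx": statistics.fmean(dx),
        "sd_dx": statistics.pstdev(dx),
        "mean_dy": statistics.fmean(dy),
        "sd_dy": statistics.pstdev(dy),
    }


# Toy example: every bead in channel 2 is displaced by (3, 4) relative to channel 1.
ch1 = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
ch2 = [(3.0, 4.0), (103.0, 4.0), (3.0, 104.0), (103.0, 104.0)]
summary = registration_summary(ch1, ch2)
```

Running the same summary on the bead coordinates before and after alignment gives the two distance distributions compared in panel A and the per-axis mean deviations of panels D and E.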

Figure S2. Monitoring the performance of the anchor-RFP-FKBP

Figure S3. Image registration used in the former approach

Figure S5. Comparison of the yeast cell segmentation