Other research themes
This page introduces our other research themes.
- Development of a deep learning-based method to identify “good” regions of a cryo-electron microscopy grid
- Development of PyHoleFinder
Development of a deep learning-based method to identify “good” regions of a cryo-electron microscopy grid
A cryo-EM specimen is prepared by applying a protein solution to a metal grid, which is then rapidly frozen to embed the protein molecules in vitreous ice. The thickness of the ice is a critical factor for high-resolution structure determination (Fig. 1). Until now, expert researchers have manually selected “good” holes, that is, holes whose ice films have an appropriate thickness, before acquiring hole images at high magnification. In this study, to reduce this burden and to acquire high-magnification images more efficiently, we developed a deep learning-based method to identify “good” holes in low-magnification EM images.
We constructed a classifier by combining two deep learning-based methods: YOLOv3 for detecting holes and Xception for classifying them (Fig. 2). We acquired images of a soluble protein (β-galactosidase) sample and a membrane protein (cytochrome c oxidase) sample. To create a dataset, the hole images of each sample were manually classified as good or bad according to the ice thickness. The dataset was divided into a training sub-dataset and a test sub-dataset. The classifier was trained on the training sub-dataset, and its performance was assessed on the test sub-dataset.
The accuracy, precision, and recall were 0.938, 0.956, and 0.952, respectively, for the soluble protein sample and 0.953, 0.951, and 0.986, respectively, for the membrane protein sample. Thus, high prediction performance was obtained for both samples (Fig. 3a). In contrast, when the classifier was trained on the dataset of the soluble protein sample and applied to the dataset of the membrane protein sample, the prediction performance was markedly degraded, suggesting that the classifier depends on the sample used for training (Fig. 3b). Fig. 4 shows an example of the output of the classifier applied to an image of the soluble protein sample.
In addition, we found that a training dataset containing approximately 2,100 hole images was sufficient to obtain good accuracy. This study was conducted in collaboration with Prof. Tani at Nagoya University (present affiliation: University of Tsukuba).
Yokoyama et al., Biophys. Rev. 12, 349–354 (2020).
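For readers unfamiliar with the three scores reported above, the following minimal Python sketch shows how accuracy, precision, and recall are conventionally computed from manual and predicted labels. It assumes that “good” is treated as the positive class, which the text does not state explicitly. The function name and the toy labels are hypothetical and are not data from the study.

```python
def classification_metrics(true_labels, predicted_labels, positive="good"):
    """Return (accuracy, precision, recall) for a binary good/bad labelling.

    Assumption: `positive` names the class counted as a "hit" ("good" holes here).
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0  # predicted good that are truly good
    recall = tp / (tp + fn) if tp + fn else 0.0     # truly good that were found
    return accuracy, precision, recall


# Toy labels invented for illustration only.
truth = ["good", "good", "good", "bad", "bad", "good"]
predicted = ["good", "good", "bad", "bad", "good", "good"]
accuracy, precision, recall = classification_metrics(truth, predicted)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

In this context, a high precision means that few bad holes are selected for high-magnification imaging, and a high recall means that few usable holes are discarded.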
Fig. 1: (a) A 200-mesh grid with a diameter of 3 mm (top), an image of a square in the mesh (middle), and an enlarged view of the holes of the carbon film with a sample (bottom). (b) Schematic illustrations (top) and corresponding EM images (bottom) of holes with thick, good, and thin ice films. In each image, the 2-μm-diameter hole at the center of the view is enclosed in a red dashed circle.
Fig. 2: Outline of the developed system.
Fig. 3: ROC curves of the classifier that was trained with the soluble protein sample training set and then applied to (a) the test sub-dataset of the soluble protein sample and (b) the test sub-dataset of the membrane protein sample.
Fig. 4: An example of the output of the developed system applied to an image of the soluble protein sample.
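The two-stage design described in this section (detect holes first, then classify each detected hole) can be sketched in plain Python as follows. This is only an illustration of the control flow, not the published implementation: `toy_detector` and `toy_classifier` are hypothetical stand-ins for YOLOv3 and Xception, and the `Box` type, the score threshold of 0.5, and the toy image are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # grayscale image as rows of pixel values


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (right and bottom are exclusive)."""
    left: int
    top: int
    right: int
    bottom: int


def crop(image: Image, box: Box) -> List[List[float]]:
    """Cut the region covered by `box` out of the image."""
    return [list(row[box.left:box.right]) for row in image[box.top:box.bottom]]


def select_good_holes(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[List[float]]], float],
    threshold: float = 0.5,
) -> List[Tuple[Box, float]]:
    """Stage 1: detect holes. Stage 2: score each cropped hole, keep the good ones."""
    good = []
    for box in detect(image):
        score = classify(crop(image, box))
        if score >= threshold:
            good.append((box, score))
    return good


def toy_detector(image: Image) -> List[Box]:
    """Hypothetical stand-in for the hole detector: returns two fixed boxes."""
    return [Box(0, 0, 2, 2), Box(2, 0, 4, 2)]


def toy_classifier(patch: List[List[float]]) -> float:
    """Hypothetical stand-in for the hole classifier: mid-gray patches are 'good'."""
    pixels = [value for row in patch for value in row]
    mean = sum(pixels) / len(pixels)
    return 1.0 if 0.4 <= mean <= 0.6 else 0.0


toy_image = [
    [0.5, 0.5, 0.9, 0.9],
    [0.5, 0.5, 0.9, 0.9],
]
good_holes = select_good_holes(toy_image, toy_detector, toy_classifier)
print(good_holes)
```

Passing the detector and the classifier as separate functions mirrors the separation of the two tasks in the text. It would also allow the classifier to be replaced by one trained on a different sample without changing the detection step.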
Development of PyHoleFinder
We developed a program, PyHoleFinder, that finds holes in the holey carbon film on a cryo-electron microscopy (cryo-EM) grid. The program can be used together with SerialEM. Detected holes can be grouped into 3×3 or 5×5 groups, which makes it possible to efficiently acquire high-magnification images of 9 or 25 holes without moving the cryo-EM stage (Fig. 4).
Fig. 4: Holes are detected and grouped into 3×3 groups with PyHoleFinder.
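As an illustration of the grouping idea, the sketch below groups detected hole centers into complete 3×3 blocks. It is not PyHoleFinder's actual algorithm, which is not described here. It assumes that the holes lie on an axis-aligned square lattice with a known spacing `pitch`; the function name `group_holes` and the toy coordinates are hypothetical.

```python
from collections import defaultdict


def group_holes(centers, pitch, size=3):
    """Group hole centers on a square lattice into size x size blocks.

    centers: list of (x, y) hole centers; pitch: lattice spacing in the same units.
    Returns a list of (center_hole, members), one entry per complete block.
    Assumption: the lattice is axis-aligned (no rotation) and regularly spaced.
    """
    if not centers:
        return []
    x0 = min(x for x, _ in centers)
    y0 = min(y for _, y in centers)
    blocks = defaultdict(dict)
    for x, y in centers:
        # Snap each center to integer lattice indices.
        col = round((x - x0) / pitch)
        row = round((y - y0) / pitch)
        blocks[(row // size, col // size)][(row % size, col % size)] = (x, y)
    mid = size // 2
    groups = []
    for key in sorted(blocks):
        members = blocks[key]
        if len(members) == size * size:  # keep only blocks with no missing hole
            ordered = [members[k] for k in sorted(members)]
            groups.append((members[(mid, mid)], ordered))
    return groups


# Toy lattice: 3 rows x 7 columns of holes with a spacing of 4.0 (arbitrary units).
centers = [(4.0 * c, 4.0 * r) for r in range(3) for c in range(7)]
groups = group_holes(centers, pitch=4.0)
for center_hole, members in groups:
    print(center_hole, len(members))
```

In this sketch the middle hole of each block is returned separately, on the assumption that it is a natural position for the stage while the surrounding holes are imaged. The seventh column forms an incomplete block and is therefore skipped.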