The membrane module orients proteins across the membrane to identify residue-level subcellular compartments of the E. coli proteome.
(a) Based on metadata from the EcoCyc (cellular compartment), Gene Ontology (pathway, function, compartment), iML1515 (metabolic subsystem), and UniProt (topological and transmembrane domains) databases, 1,777 protein structures are mapped to at least one gene associated with the E. coli membrane. (b) Membrane-crossing residues are identified from the amino acid sequence annotations provided by UniProt, predicted by DeepTMHMM, and calculated by OPM. A plane of best fit is calculated from these residues. (c) Structures with two calculated membrane planes pass the QCQA analysis if i) the angle between the planes is less than 35°, ii) the thickness of the membrane-embedded region is between 12 and 45 Å, and iii) the cross-sectional area of the membrane-embedded region is less than 10,000 Å². (d) Membrane proteins are oriented using the topological information provided by UniProt (if available) or manually using common protein motifs (see Datasets S6-S7) such that (e) the subcellular compartment of every amino acid of the E. coli proteome can be determined.
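
The plane fit in (b) and the three thresholds in (c) can be illustrated with a minimal sketch. It is not the pipeline's actual implementation: the function names (`fit_plane`, `angle_between_planes`, `passes_qcqa`) are hypothetical, the least-squares fit via singular value decomposition is one common choice rather than a confirmed one, the thickness is assumed to be the separation of the two plane centroids along their averaged normal, the thickness bounds are assumed to be inclusive, and the cross-sectional area is taken as a precomputed input because the caption does not state how it is measured.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through 3D points; returns (centroid, unit normal)."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    return centroid, normal / np.linalg.norm(normal)

def angle_between_planes(n1, n2):
    """Acute angle in degrees between two planes given their unit normals."""
    cos = min(1.0, abs(float(np.dot(n1, n2))))
    return float(np.degrees(np.arccos(cos)))

def passes_qcqa(plane_a, plane_b, area,
                max_angle=35.0, thickness_range=(12.0, 45.0), max_area=10_000.0):
    """Apply the three caption thresholds (angle, thickness in Å, area in Å²)."""
    (c1, n1), (c2, n2) = plane_a, plane_b
    angle = angle_between_planes(n1, n2)
    # Assumption: flip n2 if needed so both normals point the same way,
    # then measure thickness along their average.
    if np.dot(n1, n2) < 0:
        n2 = -n2
    mean_normal = (n1 + n2) / np.linalg.norm(n1 + n2)
    thickness = abs(float(np.dot(c2 - c1, mean_normal)))
    low, high = thickness_range
    return bool(angle < max_angle and low <= thickness <= high and area < max_area)

# Toy coordinates (Å): two parallel leaflet boundaries 30 Å apart.
lower = fit_plane([(0, 0, 0), (10, 0, 0), (0, 10, 0), (10, 10, 0)])
upper = fit_plane([(0, 0, 30), (10, 0, 30), (0, 10, 30), (10, 10, 30)])
print(passes_qcqa(lower, upper, area=100.0))
```

In this sketch a structure fails as soon as any one criterion is violated, which matches the caption's wording that all three conditions must hold.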