Research Related Resources

Note: I would like to make this page a compendium of computational research-related resources! Please help me keep this information updated; email me if sites listed here are outdated, redundant, erroneously categorized, or if you want to share additional useful sites: zuritalopez (at) calstatela.edu. Thank you!


Find journal articles and more via PubMed

Science Librarian Tiffanie Ford-Baxter - Chemistry Library Guide: https://calstatela.libguides.com/chemistry

Prepare a research report

Here’s an article I found that contains good tips on how to read papers: https://www.sciencemag.org/careers/2016/03/how-seriously-read-scientific-paper

Proper Reference Citations: also see the articles section in Ch. 14 of the ACS Style Guide for a very detailed version and a good summary.

Reference citation through our library resources: Click on the How To tab on the library's main page and see Cite Sources

BRENDA: the comprehensive enzyme information system. https://www.brenda-enzymes.org/


Genomes


Saccharomyces Genome Database (SGD) provides comprehensive integrated biological information for the budding yeast Saccharomyces cerevisiae along with search and analysis tools to explore these data, enabling the discovery of functional relationships between sequence and gene products in fungi and higher organisms. http://www.yeastgenome.org/

Plant Genome Database (PGD) http://www.plantgdb.org/

TAIR (The Arabidopsis Information Resource) www.arabidopsis.org


DNA and/or protein multiple sequence alignment programs


BLAST (Basic Local Alignment Search Tool) finds regions of similarity between DNA or protein sequences. The program compares sequences to sequence databases and calculates the statistical significance of the matches. With BLAST, you can align two known sequences or compare your sequences to sequence databases. https://blast.ncbi.nlm.nih.gov/Blast.cgi

Pairwise Sequence Alignment to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two sequences (protein or nucleic acid). http://www.ebi.ac.uk/Tools/psa/
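To make the idea of pairwise alignment concrete, here is a minimal sketch of global alignment (Needleman-Wunsch) in Python. It is for illustration only and is not the code behind the EBI tools; the scoring values (match = 1, mismatch = -1, gap = -2) are assumptions chosen for the example, whereas real tools use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, or left.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "ACT"))  # (1, 'ACGT', 'AC-T')
```

For real work, use the web tools listed here; this sketch only shows how the dynamic-programming table and traceback produce an alignment.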

ClustalW2 a general purpose DNA or protein multiple sequence alignment program for three or more sequences. http://www.genome.jp/tools/clustalw/

Clustal Omega a multiple sequence alignment program that uses seeded guide trees and HMM profile-profile techniques to generate alignments between three or more sequences. http://www.ebi.ac.uk/Tools/msa/clustalo/

Gene Infinity Many different types of resources including nucleotide to protein conversions. http://www.geneinfinity.org/
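As an illustration of what a nucleotide-to-protein conversion does, the following Python sketch translates a DNA (or RNA) sequence in reading frame 1 using the standard genetic code. It is not Gene Infinity's implementation; the function name and the choice to ignore a trailing incomplete codon are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, to_stop=False):
    """Translate a nucleotide sequence in frame 1; '*' marks a stop codon."""
    seq = seq.upper().replace("U", "T")  # accept RNA as well as DNA
    protein = []
    # Step through complete codons; a trailing partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAATGA"))                # MAK*
print(translate("ATGGCCAAATGA", to_stop=True))  # MAK
```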

Benchling R&D cloud lab software with molecular biology tools and lab notebook capabilities. https://www.benchling.com

CloneRanger: https://clones.thermofisher.com/cloneranger.php


Cuffdiff - find significant changes in transcript expression, splicing, and promoter use.

Where in the human body is your protein expressed? The Human Proteome Map, analyzed by total mass spectrometry. Human Proteome Map

KEGG: Kyoto Encyclopedia of Genes and Genomes, a database resource for using molecular-level information (large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies) for understanding living systems.


Protein Structural Analysis


 

NextProt Exploring the universe of human proteins - includes links to many other useful databases and tools. https://www.nextprot.org/

Protein Data Bank (PDB) a database of experimentally determined protein structures. http://www.rcsb.org/pdb/home/home.do

UniProt provides protein sequence and functional information. http://www.uniprot.org/

COBALT a multiple protein sequence alignment using conserved domain and local sequence similarity information. Align sequences using domain and protein constraints. https://www.ncbi.nlm.nih.gov/tools/cobalt/cobalt.cgi?LINK_LOC=BlastHomeLink

ExPASy access to scientific databases and software tools including proteomics, genomics, etc. https://www.expasy.org/

The Cambridge Crystallographic Data Centre (CCDC) Comprehensive database of crystal structures (not proteins). 

ChemDraw (use Cal State LA site license): http://sitesubscription.cambridgesoft.com/sitelicense.cfm?sid=423

ImageJ, an open source image processing program: https://imagej.net/Welcome or https://imagej.nih.gov/ij/links.html

BIOMechanic http://www.biomechanic.org


Protein Motifs (Consensus Sequences and Structural Motifs), Cleaving Sites and Metal Binding


MOTIF Search http://www.genome.jp/tools/motif/

CMA KFERQ Finder http://ec2-18-188-198-152.us-east-2.compute.amazonaws.com:3838/kferq/

ScanProsite tool http://prosite.expasy.org/scanprosite/
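Motif scanning of this kind can be sketched in a few lines of Python by converting a PROSITE-style pattern into a regular expression. This is only an illustration and not how ScanProsite itself is implemented; the helper names are hypothetical, and the converter covers only the basic pattern syntax (x for any residue, [..] for allowed residues, {..} for forbidden residues, (n) or (n,m) for repeats, and the < and > terminus anchors). The example pattern N-{P}-[ST]-{P} is the PROSITE N-glycosylation site pattern (PS00001).

```python
import re

# Character-by-character mapping from PROSITE syntax to regex syntax.
_PROSITE_TO_REGEX = str.maketrans({
    "-": None,   # element separator
    "x": ".",    # any residue
    "{": "[^",   # forbidden residues
    "}": "]",
    "(": "{",    # repeat counts
    ")": "}",
    "<": "^",    # N-terminal anchor
    ">": "$",    # C-terminal anchor
})


def prosite_to_regex(pattern):
    """Convert a basic PROSITE pattern such as 'N-{P}-[ST]-{P}' to a regex."""
    return pattern.rstrip(".").translate(_PROSITE_TO_REGEX)


def scan(sequence, pattern):
    """Return (1-based start position, matched text) for every match.

    A lookahead is used so that overlapping matches are all reported.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


print(prosite_to_regex("N-{P}-[ST]-{P}"))    # N[^P][ST][^P]
print(scan("MNGSAANPSK", "N-{P}-[ST]-{P}"))  # [(2, 'NGSA')]
```

In the example sequence (made up for this sketch), the second N is followed by P, so it is correctly not reported as a match.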

Cleave Predict PMAP protease specificity database

Prosper: https://prosper.erc.monash.edu.au/home.html

Degradome (info on all main proteases) http://degradome.uniovi.es/dindex.html

Cysteine Oxidation Prediction Program (COPP): http://copa.calstatela.edu/

Does a metal bind to your protein? CheckMyMetal (CMM): Metal Binding Site Validation Server: http://csgid.org/csgid/metal_sites/


Protein Post-Translational Modification Prediction

A post-translational modification is a covalent processing event resulting from a proteolytic cleavage or from the addition of a modifying group to one amino acid. So far, more than 350 PTMs have been characterized. They modulate the function of most eukaryote proteins by altering their activity state, localization, turnover, and interactions with other proteins. Although proteins can be modified pre-, co- or post-translationally, all protein modifications are generally referred to as PTMs, because a majority of them are made post-translationally, after the protein is folded.


How many PTMs does your protein have?

It is believed that there are over 170 databases and computational tools for PTM analysis. The Cuckoo Workgroup strives to maintain an up-to-date resource list: http://www.biocuckoo.org/link.php This site also has good PTM prediction tools.

PhosphoSitePlus

dbPTM http://dbptm.mbc.nctu.edu.tw/

http://www.uniprot.org/help/mod_res

https://www.expasy.org/proteomics/post-translational_modification

FindMod is a tool that can predict potential protein post-translational modifications (PTMs) and find potential single amino acid substitutions in peptides. http://web.expasy.org/findmod/?_ga=1.127429003.764446468.1489540422

N-Terminal PTM prediction http://terminus.unige.ch/

PHOSIDA, PTM Database http://141.61.102.18/phosida/index.aspx

PhosphoNET - PhosphoNET is an open-access, online resource developed by Kinexus Bioinformatics Corporation to foster the study of cell signalling systems to advance biomedical research in academia and industry.

Prediction of lysine methylation and lysine acetylation: PLMLA http://bioinfo.ncu.edu.cn/inquiries_PLMLA.aspx

How Common Are PTMs? PTM Statistics Curator: Automated Curation and Population of PTM Statistics from the Swiss-Prot Knowledgebase. An automated computational method for quantifying the number of each post-translational modification reported experimentally and non-experimentally in the Swiss-Prot Knowledgebase. Non-experimental qualifiers (Potential, Probable, By Similarity) are consistent with those defined in Swiss-Prot. This method will automatically load the new PTM IDs, curate them, and quantify and even categorize by organism this information every month as the Swiss-Prot Knowledgebase is updated. http://selene.princeton.edu/PTMCuration/

PTM Structural Database http://www.dsimb.inserm.fr/dsimb_tools/PTM-SD/

RESID Database of Protein Modifications a comprehensive collection of annotations and structures for protein modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link post-translational modifications. http://pir.georgetown.edu/resid/

Databases for major PTMs http://www.geneinfinity.org/sp/sp_proteinptmodifs.html

What happens if there is a mutation on an amino acid that is supposed to be phosphorylated? MIMP http://mimp.baderlab.org/ Predicting the impact of mutations on kinase-substrate phosphorylation. 

NetSurfP server predicts the surface accessibility and secondary structure of amino acids in an amino acid sequence. The method also simultaneously predicts the reliability for each prediction, in the form of a Z-score. http://www.cbs.dtu.dk/services/NetSurfP-1.1/

PTM Structure Modeling: http://vienna-ptm.univie.ac.at/

Arginine Methylation Prediction Tools
 

MeMo: (Chen, et al., 2006)

BPB-PPMS(Shao, et al., 2009)

MASA(Shien, et al., 2009)

PLMLA: PLMLA is an in silico online tool for prediction of potential lysine methylation and lysine acetylation from protein sequences. (Shi, et al., 2012)

PMeS: PMeS is a web tool for identifying protein methylation sites based on an enhanced feature encoding scheme and a support vector machine. (Shi, et al., 2012)

iMethyl-PseAAC: iMethyl-PseAAC is a web server that predicts methylation sites in proteins. (Qiu, et al., 2014)

iLM-2L: A two-level predictor for identifying protein lysine methylation sites and their methylation degrees by incorporating K-gap amino acid pairs into Chou's general PseAAC. (Ju, et al., 2015)

GPS-MSP: In this work, we adopted GPS 3.0 algorithm and built GPS-MSP (Methyl-group Specific Predictor) for the prediction of general or type-specific methyllysine and methylarginine residues in proteins. (Deng, et al., 2016)

iPTM-mLys: The web server iPTM-mLys identifies lysine PTM sites and their modification type(s) in proteins. (Qiu, et al., 2016)

PSSMe: PSSMe is a tool for identifying species-specific methylation sites based on information gain (IG) feature optimization method. (Wen, et al., 2016)

MePred-RF: To predict methylation sites, we develop a machine learning based predictor called MePred-RF. (Wei, et al., 2017)

PRmePRed: PRmePRed is an SVM-based prediction tool to predict arginine methylation sites in proteins. (Kumar, et al., 2017)

Met-predictor: This predictor was developed to predict lysine and arginine methylation sites based on a support vector machine (SVM) classifier. It is supplied in source code form along with the required data files and runs under Linux. The input is a protein sequence file (FASTA format). (Zheng, et al., 2020)

 


 

Protein Structure Prediction


SWISS-MODEL is a fully automated protein structure homology-modelling server, accessible via the ExPASy web server, or from the program DeepView (Swiss Pdb-Viewer). The purpose of this server is to make Protein Modelling accessible to all biochemists and molecular biologists worldwide. https://swissmodel.expasy.org/?_ga=1.26984061.1337889699.1489539539

Rosetta Commons information on the Rosetta software suite - algorithms for computational modeling and analysis of protein structures. https://www.rosettacommons.org/

http://www.rosettadesigngroup.com/rosettacon/

A compendium of many protein analysis tools by Robert B. Russell, Bioinformatics, Research & Development, SmithKline Beecham Pharmaceuticals: https://molbiol-tools.ca/Protein_Chemistry.htm

EvoDesign an evolutionary profile based approach to de novo protein design. Starting from a scaffold of target protein structure, EvoDesign first identifies protein families with similar folds from the PDB library by TM-align. http://zhanglab.ccmb.med.umich.edu/EvoDesign/


Protein-Protein Interactions


STRING Protein-Protein Interaction Networks.

The ConSurf Server - server for the identification of functional regions in proteins

Human Protein Atlas


Small Molecules and Drugs


Drug Report was developed to collect available scientific data on marketed drugs. The information can help scientists, doctors and patients stay current with the latest knowledge about drugs. http://drug.report/

RxList. Medications and prescription drug information for consumers and medical health professionals. https://www.rxlist.com/script/main/hp.asp

About Herbs. Expert advice and information on supplements, integrative medicine treatments, and more. https://www.mskcc.org/cancer-care/diagnosis-treatment/symptom-management/integrative-medicine/herbs


Lab Materials and Equipment


Science Exchange

ABRF Core Market Place


Collecting and Presenting Data


 

Useful Guidelines on Collecting Data from the Journal of Biological Chemistry here.