Note: I would like to make this page a compendium of computational research-related resources! Please help me keep this information updated; email me if sites listed here are outdated, redundant, erroneously categorized, or if you want to share additional useful sites: zuritalopez (at) calstatela.edu. Thank you!
Find journal articles and more via PubMed
Science Librarian Tiffanie Ford-Baxter - Chemistry Library Guide: https://calstatela.libguides.com/chemistry
Here’s an article I found that contains good tips on how to read papers: https://www.sciencemag.org/careers/2016/03/how-seriously-read-scientific-paper
BRENDA: the comprehensive enzyme information system. https://www.brenda-enzymes.org/
Saccharomyces Genome Database (SGD) provides comprehensive integrated biological information for the budding yeast Saccharomyces cerevisiae along with search and analysis tools to explore these data, enabling the discovery of functional relationships between sequence and gene products in fungi and higher organisms. http://www.yeastgenome.org/
Plant Genome Database (PGD) http://www.plantgdb.org/
TAIR: The Arabidopsis Information Resource www.arabidopsis.org
SAGA: Software for the Analysis of Genetic Architecture. Explore genes: find out how genes behave and where they are located. Note: requires software download. https://rdrr.io/cran/SAGA/
DNA and/or protein multiple sequence alignment programs
BLAST (Basic Local Alignment Search Tool) finds regions of similarity between DNA or protein sequences. The program compares sequences to sequence databases and calculates the statistical significance of the matches. With BLAST, you can align two known sequences or compare your sequences to sequence databases. https://blast.ncbi.nlm.nih.gov/Blast.cgi
Pairwise Sequence Alignment to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two sequences (protein or nucleic acid). http://www.ebi.ac.uk/Tools/psa/
ClustalW2 a general purpose DNA or protein multiple sequence alignment program for three or more sequences. http://www.genome.jp/tools/clustalw/
Clustal Omega a multiple sequence alignment program that uses seeded guide trees and HMM profile-profile techniques to generate alignments between three or more sequences. http://www.ebi.ac.uk/Tools/msa/clustalo/
Gene Infinity Many different types of resources including nucleotide to protein conversions. http://www.geneinfinity.org/
Benchling R&D cloud lab software with molecular biology tools and lab notebook capabilities. https://www.benchling.com
Cuffdiff - find significant changes in transcript expression, splicing, and promoter use.
Where in the human body is your protein expressed? The Human Proteome Map, analyzed by mass spectrometry.
KEGG: Kyoto Encyclopedia of Genes and Genomes, a database resource for using molecular-level information (large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies) for understanding living systems.
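To make "regions of similarity" concrete, here is a minimal sketch of local alignment scoring in the style of Smith-Waterman dynamic programming. It is an illustration only: BLAST and the EBI tools listed above use their own faster, heuristic or more elaborate methods, and the function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example, not settings taken from any of these servers.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Illustrative Smith-Waterman scoring with a linear gap penalty.
    The default scores are arbitrary example values.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so never drop below 0
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Made-up example sequences sharing the short region "ACG"
    print(local_alignment_score("TTACGTTT", "GGACGGG"))
```

A higher score means a longer or cleaner shared region. Real tools additionally report where the region lies and how likely such a score is to arise by chance.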
Protein Structural Analysis
neXtProt: exploring the universe of human proteins. Includes links to many other useful databases and tools. https://www.nextprot.org/
Protein Data Bank (PDB) a database of experimentally determined protein structures. http://www.rcsb.org/pdb/home/home.do
UniProt provides protein sequence and functional information. http://www.uniprot.org/
COBALT a multiple protein sequence alignment using conserved domain and local sequence similarity information. Align sequences using domain and protein constraints. https://www.ncbi.nlm.nih.gov/tools/cobalt/cobalt.cgi?LINK_LOC=BlastHomeLink
ExPASy: access to scientific databases and software tools including proteomics, genomics, etc. https://www.expasy.org/
The Cambridge Crystallographic Data Centre (CCDC) Comprehensive database of crystal structures (not proteins).
ChemDraw (use Cal State LA site license): http://sitesubscription.cambridgesoft.com/sitelicense.cfm?sid=423
Protein Motifs (Consensus Sequences and Structural Motifs), Cleaving Sites and Metal Binding
MOTIF Search http://www.genome.jp/tools/motif/
ScanProsite tool http://prosite.expasy.org/scanprosite/
Cleave Predict PMAP protease specificity database
Degradome (info on all main proteases) http://degradome.uniovi.es/dindex.html
Cysteine Oxidation Prediction Program (COPP): http://copa.calstatela.edu/
Does a metal bind to your protein? CheckMyMetal (CMM): Metal Binding Site Validation Server: http://csgid.org/csgid/metal_sites/
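Motif scanners such as ScanProsite search a sequence for consensus patterns. As a rough illustration of the idea, the sketch below converts a simplified subset of the PROSITE pattern syntax (x for any residue, [..] for alternatives, {..} for exclusions, (n) or (n,m) for repeats, < and > for terminal anchors) into a Python regular expression and reports matches. The function names and the test sequence are made up for this example, and the sketch does not reproduce how ScanProsite itself is implemented.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simplified PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional repeat, e.g. x(2,4)
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":                  # any amino acid
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # any amino acid except these
        # [..] alternatives and single letters are already valid regex
        parts.append(prefix + core + ("{" + repeat + "}" if repeat else "") + suffix)
    return "".join(parts)


def find_motifs(sequence, pattern):
    """Return (1-based start position, matched text) for every match, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-glycosylation consensus pattern applied to a made-up sequence
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_motifs("MKNGSAANPTQ", "N-{P}-[ST]-{P}"))
```

A pattern match only flags a candidate site. Short patterns match by chance often, so hits should be checked against the curated annotations on the servers listed above.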
Protein Post-Translational Modification Prediction
A post-translational modification is a covalent processing event resulting from a proteolytic cleavage or from the addition of a modifying group to one amino acid. So far, more than 350 PTMs have been characterized. They modulate the function of most eukaryote proteins by altering their activity state, localization, turnover, and interactions with other proteins. Although proteins can be modified pre-, co- or post-translationally, all protein modifications are generally referred to as PTMs, because a majority of them are made post-translationally, after the protein is folded.
How many PTMs does your protein have?
It is believed that there are over 170 databases and computational tools for PTM analysis. The Cuckoo Workgroup strives to maintain an up-to-date resource list: http://www.biocuckoo.org/link.php This site also has good PTM prediction tools.
FindMod is a tool that can predict potential protein post-translational modifications (PTMs) and find potential single amino acid substitutions in peptides. http://web.expasy.org/findmod/?_ga=1.127429003.764446468.1489540422
N-Terminal PTM prediction http://terminus.unige.ch/
PHOSIDA, PTM Database http://22.214.171.124/phosida/index.aspx
PhosphoNET - PhosphoNET is an open-access, online resource developed by Kinexus Bioinformatics Corporation to foster the study of cell signalling systems to advance biomedical research in academia and industry.
Prediction of lysine methylation and lysine acetylation: PLMLA http://bioinfo.ncu.edu.cn/inquiries_PLMLA.aspx
How Common Are PTMs? PTM Statistics Curator: Automated Curation and Population of PTM Statistics from the Swiss-Prot Knowledgebase. An automated computational method for quantifying the number of each post-translational modification reported experimentally and non-experimentally in the Swiss-Prot Knowledgebase. Non-experimental qualifiers (Potential, Probable, By Similarity) are consistent with those defined in Swiss-Prot. This method will automatically load the new PTM IDs, curate them, and quantify and even categorize by organism this information every month as the Swiss-Prot Knowledgebase is updated. http://selene.princeton.edu/PTMCuration/
PTM Structural Database http://www.dsimb.inserm.fr/dsimb_tools/PTM-SD/
RESID Database of Protein Modifications a comprehensive collection of annotations and structures for protein modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link post-translational modifications. http://pir.georgetown.edu/resid/
Databases for major PTMs http://www.geneinfinity.org/sp/sp_proteinptmodifs.html
What happens if there is a mutation on an amino acid that is supposed to be phosphorylated? MIMP http://mimp.baderlab.org/ Predicting the impact of mutations on kinase-substrate phosphorylation.
NetSurfP server predicts the surface accessibility and secondary structure of amino acids in an amino acid sequence. The method also simultaneously predicts the reliability for each prediction, in the form of a Z-score. http://www.cbs.dtu.dk/services/NetSurfP-1.1/
PTM Structure Modeling: http://vienna-ptm.univie.ac.at/
GPS-MSP: GPS-MSP (Methyl-group Specific Predictor) adopts the GPS 3.0 algorithm for the prediction of general or type-specific methyllysine and methylarginine residues in proteins. (Deng, et al., 2016)
Met-predictor: This predictor was developed to predict lysine and arginine methylation sites based on a support vector machine (SVM) classifier. It is supplied as source code along with the required data files and runs under Linux. The input is a protein sequence file (FASTA format). (Zheng, et al., 2020)
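Many of the prediction tools above take a protein sequence in FASTA format as input. The sketch below shows what that format looks like to a program: it reads FASTA text and lists the positions of lysine (K) and arginine (R) residues, the residues that methylation predictors evaluate. It only enumerates candidate residues and makes no prediction; the function names and the example record are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()     # header line without the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(chunks) for name, chunks in records.items()}


def residue_positions(sequence, residues="KR"):
    """Return (1-based position, residue) for each residue of interest."""
    return [(i, aa) for i, aa in enumerate(sequence, start=1) if aa in residues]


if __name__ == "__main__":
    # Made-up record for illustration only
    example = ">demo_protein hypothetical example\nMKTR\nAAK\n"
    for name, seq in parse_fasta(example).items():
        print(name, seq, residue_positions(seq))
```

Checking that a file parses cleanly like this before submitting it can save a failed run on servers that reject malformed input.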
Protein Structure Prediction
SWISS-MODEL is a fully automated protein structure homology-modelling server, accessible via the ExPASy web server, or from the program DeepView (Swiss Pdb-Viewer). The purpose of this server is to make Protein Modelling accessible to all biochemists and molecular biologists worldwide. https://swissmodel.expasy.org/?_ga=1.26984061.1337889699.1489539539
Rosetta Commons information on the Rosetta software suite - algorithms for computational modeling and analysis of protein structures. https://www.rosettacommons.org/
A compendium of many protein analysis tools by Robert B. Russel, Bioinformatics, Research & Development, SmithKline Beecham Pharmaceuticals: https://molbiol-tools.ca/Protein_Chemistry.htm
SCOP: Structural Classification of Proteins provides a broad survey of all known protein folds, detailed information about the close relatives of any particular protein, and a framework for future research and classification.
EvoDesign an evolutionary profile based approach to de novo protein design. Starting from a scaffold of target protein structure, EvoDesign first identifies protein families with similar folds from the PDB library by TM-align. http://zhanglab.ccmb.med.umich.edu/EvoDesign/
The ConSurf Server - server for the identification of functional regions in proteins
Small Molecules and Drugs
Drug Report was developed to collect available scientific data on marketed drugs. The information can help scientists, doctors and patients stay up to date with the latest knowledge about drugs. http://drug.report/
RxList. Medications and prescription drug information for consumers and medical health professionals. https://www.rxlist.com/script/main/hp.asp
About Herbs. Expert advice and information on supplements, integrative medicine treatments, and more. https://www.mskcc.org/cancer-care/diagnosis-treatment/symptom-management/integrative-medicine/herbs
Lab Materials and Equipment
Collecting and Presenting Data
Useful guidelines on collecting data from the Journal of Biological Chemistry.