What is new:
ProtoNet v6.1, November 2012
- Sequences: ProtoNet 6.1 supports the Swiss-Prot and TrEMBL databases, with over 9 million proteins from UniProt 15.4.
- Clustering: A new efficient algorithm was developed and implemented to cluster proteins using all pairwise BLAST E-scores, based on UniRef50.
- External Sources: Each protein and cluster is linked to annotations and features originating from Gene3D, Superfamily, UniProt, ENZYME, InterPro, NCBI Taxonomy, SCOP, GO and more.
- Searching mode: Proteins can be searched by any keyword from the major external sources, and by combinations of keywords.
- Protein features: Proteins are associated with a graphic domain representation according to the individual resources from InterPro, and with detailed features according to UniProtKB.
- Cluster Definition: Each cluster is associated with a major family name determined automatically according to integration of external resources. Clusters can be searched by their dominant family names.
- Cluster Sub-tree: Each cluster is presented as a sub-tree with its neighboring clusters. Only significant merges are presented.
- Cluster Statistics: At any level of the tree hierarchy, clusters are assigned statistical measures that indicate their level of purity.
- Cluster Taxonomical genome view: At any level of the tree hierarchy, clusters can be viewed by preselected genomes and genome combinations.
- PANDORA analysis: For any cluster, a PANDORA view is provided that integrates all annotations associated with the proteins in the cluster (www.pandora.cs.huji.ac.il).
- Cluster Analysis: Each cluster can be viewed with the "cluster-browser" option, which provides the BLAST results for all protein pairs within the cluster.
- Clustering merging rules: ProtoNet will support a few modes of clustering and trees based on a subset of sequences such as UniRef50 and complete proteomes.
- Clustering sub-tree: User selection of the tree condensation and resolution.
- A detailed tour is presented to allow a new user to become acquainted with many of the ProtoNet options and capabilities.
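To illustrate the "Cluster Statistics" and "Cluster Definition" items above, the following minimal sketch shows one common way to quantify purity: the fraction of annotated proteins in a cluster that share the most frequent annotation. This definition is an assumption made for illustration and is not necessarily the measure ProtoNet computes; the function name `cluster_purity` and the label list it takes are hypothetical.

```python
from collections import Counter


def cluster_purity(annotations):
    """Return (dominant_label, purity) for one cluster.

    `annotations` holds one label per protein in the cluster (hypothetical
    input format). Proteins without an annotation are passed as None and
    are ignored when computing purity.
    """
    labels = [a for a in annotations if a is not None]
    if not labels:
        # No annotated proteins: no dominant label, purity defined as 0.
        return None, 0.0
    # The most frequent label plays the role of the dominant family name.
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)
```

For example, a cluster with three proteins labeled "kinase", one labeled "protease" and one unannotated protein gives a dominant label of "kinase" with purity 0.75 under this definition.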
- Rappoport, N., Karsenty, S., Stern, A., Linial, N., and Linial, M. (2011). ProtoNet 6.0: organizing 10 million protein sequences in a compact hierarchical family tree. Nucleic Acids Res, gkr1027.
- Rappoport, N., Linial, N., and Linial, M. (2013). ProtoNet: charting the expanding universe of protein sequences. Nature biotechnology, 31(4), 290-292.
- Loewenstein, Y., and Linial, M. (2008). Connect the dots: exposing hidden protein family connections from the entire sequence tree. Bioinformatics 24, i193-199.
- Loewenstein, Y., Portugaly, E., Fromer, M., and Linial, M. (2008). Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space. Bioinformatics 24, i41-49.
- Portugaly, E., Linial, N., and Linial, M. (2007). EVEREST: a collection of evolutionary conserved protein domains. Nucleic Acids Res 35, D241-246.
- Sasson, O., Kaplan, N., and Linial, M. (2006). Functional annotation prediction: all for one and one for all. Protein Sci 15, 1557-1562.
- Kaplan, N., Vaaknin, A., and Linial, M. (2003). PANDORA: keyword-based analysis of protein sets by integration of annotation sources. Nucleic Acids Res 31, 5617-5626.