---------------------------------------------
The current file for the Pfam-to-SCOP mapping is based on a minimum agreement of 0.75 between SCOP and Pfam (intersection/union), excluding multi-chain SCOP entries. SCOP domains covering complete chains were assumed to always be above the threshold (if on the right chain).

The current file for the Pfam-to-Superfam mapping is
~/p/uniref90_keywords/uniprot_xml_keywords/pfam__annotation_stats/pfam2ssf_agreement_thresh0.5.txt
It is based on a minimum agreement of 0.5 between Pfam and SSF (intersection/union). Agreement was calculated for each protein having both the Pfam and the SSF signature (based on the protein2ipr.dat file downloaded in Feb 2008) and averaged across all Pfam-SSF pairs.

Legend:
1. relation [ XXXXXXX ] : YYYYYYY %LINKAGE (=|existing_edges|/(|cluster1|*|cluster2|)) ProtoLevel
   where relation is the tree relatedness of YYYYYYY to the best cluster XXXXXXX (sibling, parent, or given), and %LINKAGE is the proportion of existing BLAST edges out of all possible ones (i.e. the sparsity level of the cluster). The size of the cluster is |cluster1| + |cluster2|.
2. best keyword for cluster C is K with Jaccard = J [ TP FP TN FN ] Specificity Sensitivity
---------------------------------------------

-------------------====== ( 1 ) 6690234_PF00186_PF00303 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00303 is 6548420 with Jaccard = 1.0000 |PF00303|=265 [ 265 0 1099946 0 ]
parent [ 6548420 ] : 6690234 0.0974543 (=8709/(293*305)) 90.3238
given [ 6548420 ] : 6548420 0.645833 (=930/(5*288)) 37.7485
best keyword for cluster 6548420 is PF00303 with Jaccard = 1.0000 [ 265 0 1099946 0 ] 1.0000 1.0000
sibling [ 6548420 ] : 6635680 0.299342 (=91/(1*304)) 75.9263
best keyword for cluster 6635680 is PF00186 with Jaccard = 0.8836 [ 281 0 1099893 37 ] 1.0000 0.8836
SUGGESTING RELATEDNESS OF:
A> PF00303 ( PF00303 Thymidylate synthase )
B> PF00186 ( PF00186 Dihydrofolate reductase )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00186| = 318 , |PF00303| = 265 , |PF00186^PF00303| = 32 ( 10.1% and 12.1% )
both PF00303 and PF00186 have PDB structures
PF00186 c.71.1.1
SUPERFAM mapping significantly overlapping:
1 PF00303 SSF55831 0.980 (average over 946 mutual instances, PF00303 949 appearances, SSF55831 1043 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 2 ) 6734619_PF00342_PF00923 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00342 is 6723612 with Jaccard = 1.0000 |PF00342|=340 [ 340 0 1099871 0 ]
parent [ 6723612 ] : 6734619 0.0281211 (=3859/(364*377)) 97.2335
given [ 6723612 ] : 6723612 0.0415525 (=182/(365*12)) 95.9425
best keyword for cluster 6723612 is PF00342 with Jaccard = 1.0000 [ 340 0 1099871 0 ] 1.0000 1.0000
sibling [ 6723612 ] : 6692021 0.0958333 (=138/(360*4)) 90.6723
best keyword for cluster 6692021 is PF00923 with Jaccard = 0.9701 [ 325 0 1099876 10 ] 1.0000 0.9701
SUGGESTING RELATEDNESS OF:
A> PF00342 ( PF00342 Phosphoglucose isomerase )
B> PF00923 ( PF00923 Transaldolase )
Only A has a clan ( CL0067.7 ).
the two keywords coincide on UniRef90 proteins: |PF00342| = 340 , |PF00923| = 335 , |PF00342^PF00923| = 10 ( 2.9% and 3.0% )
both PF00342 and PF00923 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 3 ) 6561262_PF00434_PF05868 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00434 is 6528497 with Jaccard = 1.0000 |PF00434|=45 [ 45 0 1100166 0 ]
parent [ 6528497 ] : 6561262 0.613043 (=141/(5*46)) 47.982
given [ 6528497 ] : 6528497 0.755556 (=34/(1*45)) 25.0067
best keyword for cluster 6528497 is PF00434 with Jaccard = 1.0000 [ 45 0 1100166 0 ] 1.0000 1.0000
sibling [ 6528497 ] : 6284839 1 (=4/(1*4)) 1.00235e-10
best keyword for cluster 6284839 is PF05868 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00434 ( PF00434 Glycoprotein VP7 )
B> PF05868 ( PF05868 Rotavirus major outer capsid protein VP7 )
they come from the same clan: CL0217.4 : PF05868 PF00434
the two keywords do not coincide on UniRef90 proteins
Neither PF00434 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 4 ) 6735955_PF00509_PF04369 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00509 is 6533060 with Jaccard = 1.0000 |PF00509|=86 [ 86 0 1100125 0 ]
parent [ 6533060 ] : 6735955 0.0492424 (=26/(88*6)) 97.3689
given [ 6533060 ] : 6533060 0.737255 (=188/(3*85)) 27.7465
best keyword for cluster 6533060 is PF00509 with Jaccard = 1.0000 [ 86 0 1100125 0 ] 1.0000 1.0000
sibling [ 6533060 ] : 6698780 0.125 (=1/(2*4)) 92
best keyword for cluster 6698780 is PF04369 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00509 ( PF00509 Hemagglutinin )
B> PF04369 ( PF04369 Lactococcin-like family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF00509 has a PDB structure (may not be up to date)
PF00509 b.19.1.2 h.3.1.1 j.79.1.1
SUPERFAM mapping significantly overlapping:
1 PF00509 SSF49818 0.772 (average over 12960 mutual instances, PF00509 12960 appearances, SSF49818 13846 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 5 ) 6746863_PF00527_PF02703 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00527 is 6670369 with Jaccard = 1.0000 |PF00527|=114 [ 114 0 1100097 0 ]
parent [ 6670369 ] : 6746863 0.0252395 (=137/(118*46)) 98.3783
given [ 6670369 ] : 6670369 0.157895 (=72/(114*4)) 85.6465
best keyword for cluster 6670369 is PF00527 with Jaccard = 1.0000 [ 114 0 1100097 0 ] 1.0000 1.0000
sibling [ 6670369 ] : 6720998 0.0444444 (=2/(1*45)) 95.5556
best keyword for cluster 6720998 is PF02703 with Jaccard = 0.9744 [ 38 0 1100172 1 ] 1.0000 0.9744
SUGGESTING RELATEDNESS OF:
A> PF00527 ( PF00527 E7 protein, Early protein )
B> PF02703 ( PF02703 Early E1A protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF00527 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 6 ) 6764184_PF00677_PF06534 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00677 is 6469508 with Jaccard = 1.0000 |PF00677|=292 [ 292 0 1099919 0 ]
parent [ 6469508 ] : 6764184 0.00781752 (=73/(322*29)) 99.4743
given [ 6469508 ] : 6469508 0.967607 (=926/(3*319)) 3.6831
best keyword for cluster 6469508 is PF00677 with Jaccard = 1.0000 [ 292 0 1099919 0 ] 1.0000 1.0000
sibling [ 6469508 ] : 6759748 0.0357143 (=1/(1*28)) 99.25
best keyword for cluster 6759748 is PF06534 with Jaccard = 1.0000 [ 19 0 1100192 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00677 ( PF00677 Lumazine binding domain )
B> PF06534 ( PF06534 Repulsive guidance molecule (RGM) C-terminus )
Only A has a clan ( CL0076.7 ).
the two keywords do not coincide on UniRef90 proteins
only PF00677 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 7 ) 6737306_PF00023_PF00710 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00710 is 6733339 with Jaccard = 1.0000 |PF00710|=234 [ 234 0 1099977 0 ]
parent [ 6733339 ] : 6737306 0.0271625 (=37869/(263*5301)) 97.5146
given [ 6733339 ] : 6733339 0.0350195 (=54/(257*6)) 97.0906
best keyword for cluster 6733339 is PF00710 with Jaccard = 1.0000 [ 234 0 1099977 0 ] 1.0000 1.0000
sibling [ 6733339 ] : 6735540 0.0283899 (=40585/(285*5016)) 97.3283
best keyword for cluster 6735540 is PF00023 with Jaccard = 0.6616 [ 3381 1032 1095101 697 ] 0.7661 0.8291
SUGGESTING RELATEDNESS OF:
A> PF00710 ( PF00710 Asparaginase )
B> PF00023 ( PF00023 Ankyrin repeat )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00023| = 4078 , |PF00710| = 234 , |PF00023^PF00710| = 17 ( 0.4% and 7.3% )
both PF00710 and PF00023 have PDB structures
PF00710 c.88.1.1
PF00023 d.211.1.1 i.11.1.1
SUPERFAM mapping significantly overlapping:
1 PF00710 SSF53774 0.964 (average over 850 mutual instances, PF00710 892 appearances, SSF53774 893 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 8 ) 6739214_PF00747_PF06261 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00747 is 6546329 with Jaccard = 1.0000 |PF00747|=37 [ 37 0 1100174 0 ]
parent [ 6546329 ] : 6739214 0.0304054 (=9/(37*8)) 97.7051
given [ 6546329 ] : 6546329 0.694444 (=25/(1*36)) 36.0094
best keyword for cluster 6546329 is PF00747 with Jaccard = 1.0000 [ 37 0 1100174 0 ] 1.0000 1.0000
sibling [ 6546329 ] : 6714599 0.0666667 (=1/(3*5)) 94.6667
best keyword for cluster 6714599 is PF06261 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00747 ( PF00747 ssDNA binding protein )
B> PF06261 ( PF06261 Actinobacillus actinomycetemcomitans leukotoxin activator LktC )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF00747 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 9 ) 6715673_PF00815_PF01502 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00815 is 6472857 with Jaccard = 1.0000 |PF00815|=275 [ 275 0 1099936 0 ]
parent [ 6472857 ] : 6715673 0.0524554 (=5815/(298*372)) 94.8236
given [ 6472857 ] : 6472857 0.959596 (=285/(1*297)) 4.23058
best keyword for cluster 6472857 is PF00815 with Jaccard = 1.0000 [ 275 0 1099936 0 ] 1.0000 1.0000
sibling [ 6472857 ] : 6597974 0.396393 (=12090/(250*122)) 60.5218
best keyword for cluster 6597974 is PF01502 with Jaccard = 0.6435 [ 231 110 1099852 18 ] 0.6774 0.9277
SUGGESTING RELATEDNESS OF:
A> PF00815 ( PF00815 Histidinol dehydrogenase )
B> PF01502 ( PF01502 Phosphoribosyl-AMP cyclohydrolase )
Only A has a clan ( CL0099.8 ).
the two keywords coincide on UniRef90 proteins: |PF00815| = 275 , |PF01502| = 249 , |PF00815^PF01502| = 18 ( 6.5% and 7.2% )
both PF00815 and PF01502 have PDB structures
PF00815 c.82.1.2
SUPERFAM mapping significantly overlapping:
1 PF00815 SSF53720 0.961 (average over 870 mutual instances, PF00815 872 appearances, SSF53720 10501 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 10 ) 6676446_PF00252_PF00826 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00826 is 6512604 with Jaccard = 1.0000 |PF00826|=81 [ 81 0 1100130 0 ]
parent [ 6512604 ] : 6676446 0.145481 (=3818/(81*324)) 87.3159
given [ 6512604 ] : 6512604 0.835443 (=132/(2*79)) 16.6684
best keyword for cluster 6512604 is PF00826 with Jaccard = 1.0000 [ 81 0 1100130 0 ] 1.0000 1.0000
sibling [ 6512604 ] : 6536208 0.762422 (=491/(2*322)) 29.7211
best keyword for cluster 6536208 is PF00252 with Jaccard = 0.9967 [ 298 0 1099912 1 ] 1.0000 0.9967
SUGGESTING RELATEDNESS OF:
A> PF00826 ( )
B> PF00252 ( PF00252 Ribosomal protein L16p/L10e )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF00826 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF00252 SSF54686 0.963 (average over 1521 mutual instances, PF00252 1523 appearances, SSF54686 1907 appearances)
2 PF00826 SSF54686 0.973 (average over 383 mutual instances, PF00826 384 appearances, SSF54686 1907 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 11 ) 6753863_PF00576_PF01014 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01014 is 6397254 with Jaccard = 1.0000 |PF01014|=56 [ 56 0 1100155 0 ]
parent [ 6397254 ] : 6753863 0.0110909 (=110/(57*174)) 98.8959
given [ 6397254 ] : 6397254 1 (=260/(5*52)) 0.00586315
best keyword for cluster 6397254 is PF01014 with Jaccard = 1.0000 [ 56 0 1100155 0 ] 1.0000 1.0000
sibling [ 6397254 ] : 6727672 0.0365636 (=246/(58*116)) 96.444
best keyword for cluster 6727672 is PF00576 with Jaccard = 0.9904 [ 103 0 1100107 1 ] 1.0000 0.9904
SUGGESTING RELATEDNESS OF:
A> PF01014 ( PF01014 Uricase )
B> PF00576 ( PF00576 HIUase/Transthyretin family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF01014 and PF00576 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF00576 SSF49472 0.961 (average over 322 mutual instances, PF00576 322 appearances, SSF49472 324 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 12 ) 6760276_PF01019_PF01112 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01019 is 6640942 with Jaccard = 1.0000 |PF01019|=345 [ 345 0 1099866 0 ]
parent [ 6640942 ] : 6760276 0.0118467 (=787/(384*173)) 99.2777
given [ 6640942 ] : 6640942 0.228947 (=348/(4*380)) 77.3229
best keyword for cluster 6640942 is PF01019 with Jaccard = 1.0000 [ 345 0 1099866 0 ] 1.0000 1.0000
sibling [ 6640942 ] : 6754308 0.0116279 (=2/(1*172)) 98.9256
best keyword for cluster 6754308 is PF01112 with Jaccard = 0.9872 [ 154 0 1100055 2 ] 1.0000 0.9872
SUGGESTING RELATEDNESS OF:
A> PF01019 ( PF01019 Gamma-glutamyltranspeptidase )
B> PF01112 ( PF01112 Asparaginase )
they come from the same clan: CL0052.11 : PF00227 PF03577 PF01804 PF01019 PF00310 PF02275 PF01112 PF03417
the two keywords do not coincide on UniRef90 proteins
only PF01019 has a PDB structure (may not be up to date)
PF01112 d.153.1.5
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 13 ) 6733689_PF01117_PF03318 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01117 is 6713880 with Jaccard = 1.0000 |PF01117|=22 [ 22 0 1100189 0 ]
parent [ 6713880 ] : 6733689 0.0365226 (=71/(27*72)) 97.1339
given [ 6713880 ] : 6713880 0.0545455 (=6/(22*5)) 94.5455
best keyword for cluster 6713880 is PF01117 with Jaccard = 1.0000 [ 22 0 1100189 0 ] 1.0000 1.0000
sibling [ 6713880 ] : 6729888 0.0378378 (=49/(37*35)) 96.7126
best keyword for cluster 6729888 is PF03318 with Jaccard = 0.7143 [ 5 2 1100204 0 ] 0.7143 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01117 ( PF01117 Aerolysin toxin )
B> PF03318 ( PF03318 Clostridium epsilon toxin ETX/Bacillus mosquitocidal toxin MTX2 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF01117 and PF03318 have PDB structures
PF01117 f.8.1.1
PF03318 f.8.1.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 14 ) 6735800_PF01194_PF05864 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01194 is 6638420 with Jaccard = 1.0000 |PF01194|=56 [ 56 0 1100155 0 ]
parent [ 6638420 ] : 6735800 0.05 (=30/(10*60)) 97.3523
given [ 6638420 ] : 6638420 0.336207 (=39/(2*58)) 76.5962
best keyword for cluster 6638420 is PF01194 with Jaccard = 1.0000 [ 56 0 1100155 0 ] 1.0000 1.0000
sibling [ 6638420 ] : 6249523 1 (=16/(2*8)) 2.3064e-13
best keyword for cluster 6249523 is PF05864 with Jaccard = 1.0000 [ 10 0 1100201 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01194 ( PF01194 RNA polymerases N / 8 kDa subunit )
B> PF05864 ( PF05864 Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF01194 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF01194 SSF46924 0.927 (average over 147 mutual instances, PF01194 149 appearances, SSF46924 149 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 15 ) 6749383_PF01219_PF01569 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01219 is 6610135 with Jaccard = 1.0000 |PF01219|=164 [ 164 0 1100047 0 ]
parent [ 6610135 ] : 6749383 0.0174943 (=3306/(186*1016)) 98.5736
given [ 6610135 ] : 6610135 0.367568 (=68/(1*185)) 66.8222
best keyword for cluster 6610135 is PF01219 with Jaccard = 1.0000 [ 164 0 1100047 0 ] 1.0000 1.0000
sibling [ 6610135 ] : 6737017 0.0366534 (=1189/(33*983)) 97.486
best keyword for cluster 6737017 is PF01569 with Jaccard = 0.8878 [ 831 2 1099275 103 ] 0.9976 0.8897
SUGGESTING RELATEDNESS OF:
A> PF01219 ( PF01219 Prokaryotic diacylglycerol kinase )
B> PF01569 ( PF01569 PAP2 superfamily )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01219| = 164 , |PF01569| = 934 , |PF01219^PF01569| = 14 ( 8.5% and 1.5% )
only PF01219 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF01569 SSF48317 0.620 (average over 2481 mutual instances, PF01569 2537 appearances, SSF48317 2657 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 16 ) 6703534_PF00539_PF01254 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01254 is 6350147 with Jaccard = 1.0000 |PF01254|=8 [ 8 0 1100203 0 ]
parent [ 6350147 ] : 6703534 0.0990763 (=665/(8*839)) 92.8234
given [ 6350147 ] : 6350147 1 (=12/(2*6)) 5.00009e-06
best keyword for cluster 6350147 is PF01254 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 6350147 ] : 6699902 0.11223 (=468/(5*834)) 92.1622
best keyword for cluster 6699902 is PF00539 with Jaccard = 0.9987 [ 743 0 1099467 1 ] 1.0000 0.9987
SUGGESTING RELATEDNESS OF:
A> PF01254 ( PF01254 Nuclear transition protein 2 )
B> PF00539 ( PF00539 Transactivating regulatory protein (Tat) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF01254 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 17 ) 6754889_PF01102_PF01401 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01401 is 6713308 with Jaccard = 1.0000 |PF01401|=62 [ 62 0 1100149 0 ]
parent [ 6713308 ] : 6754889 0.0124533 (=30/(73*33)) 98.9624
given [ 6713308 ] : 6713308 0.0555556 (=4/(1*72)) 94.456
best keyword for cluster 6713308 is PF01401 with Jaccard = 1.0000 [ 62 0 1100149 0 ] 1.0000 1.0000
sibling [ 6713308 ] : 6747216 0.03125 (=1/(1*32)) 98.4062
best keyword for cluster 6747216 is PF01102 with Jaccard = 0.9643 [ 27 0 1100183 1 ] 1.0000 0.9643
SUGGESTING RELATEDNESS OF:
A> PF01401 ( PF01401 Angiotensin-converting enzyme )
B> PF01102 ( PF01102 Glycophorin A )
Only A has a clan ( CL0126.12 ).
the two keywords do not coincide on UniRef90 proteins
both PF01401 and PF01102 have PDB structures
PF01102 j.35.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 18 ) 6737568_PF00844_PF01489 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01489 is 6607303 with Jaccard = 1.0000 |PF01489|=72 [ 72 0 1100139 0 ]
parent [ 6607303 ] : 6737568 0.0346004 (=365/(77*137)) 97.5423
given [ 6607303 ] : 6607303 0.394737 (=30/(1*76)) 65.4621
best keyword for cluster 6607303 is PF01489 with Jaccard = 1.0000 [ 72 0 1100139 0 ] 1.0000 1.0000
sibling [ 6607303 ] : 6729233 0.0882353 (=12/(1*136)) 96.6397
best keyword for cluster 6729233 is PF00844 with Jaccard = 1.0000 [ 116 0 1100095 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01489 ( PF01489 Geminivirus nuclear export factor BR1 )
B> PF00844 ( PF00844 Geminivirus coat protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01489 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 19 ) 6619151_PF00429_PF01611 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01611 is 6038248 with Jaccard = 1.0000 |PF01611|=10 [ 10 0 1100201 0 ]
parent [ 6038248 ] : 6619151 0.365891 (=472/(10*129)) 70.0204
given [ 6038248 ] : 6038248 1 (=16/(2*8)) 8.59438e-31
best keyword for cluster 6038248 is PF01611 with Jaccard = 1.0000 [ 10 0 1100201 0 ] 1.0000 1.0000
sibling [ 6038248 ] : 6611450 0.343548 (=213/(124*5)) 67.2264
best keyword for cluster 6611450 is PF00429 with Jaccard = 0.7042 [ 100 14 1100069 28 ] 0.8772 0.7812
SUGGESTING RELATEDNESS OF:
A> PF01611 ( PF01611 Filovirus glycoprotein )
B> PF00429 ( PF00429 ENV polyprotein (coat polyprotein) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF01611 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 20 ) 6707990_PF00693_PF01712 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01712 is 6700567 with Jaccard = 1.0000 |PF01712|=135 [ 135 0 1100076 0 ]
parent [ 6700567 ] : 6707990 0.0793176 (=716/(51*177)) 93.6278
given [ 6700567 ] : 6700567 0.0976331 (=132/(169*8)) 92.2806
best keyword for cluster 6700567 is PF01712 with Jaccard = 1.0000 [ 135 0 1100076 0 ] 1.0000 1.0000
sibling [ 6700567 ] : 6515153 0.848837 (=292/(8*43)) 17.8121
best keyword for cluster 6515153 is PF00693 with Jaccard = 0.8600 [ 43 7 1100161 0 ] 0.8600 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01712 ( PF01712 Deoxynucleoside kinase )
B> PF00693 ( PF00693 Thymidine kinase from herpesvirus )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF01712 and PF00693 have PDB structures
PF01712 c.37.1.1
PF00693 c.37.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 21 ) 6775595_PF01785_PF04269 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01785 is 6761627 with Jaccard = 1.0000 |PF01785|=37 [ 37 0 1100174 0 ]
parent [ 6761627 ] : 6775595 0.00214707 (=8/(69*54)) 99.8706
given [ 6761627 ] : 6761627 0.00649351 (=5/(14*55)) 99.3507
best keyword for cluster 6761627 is PF01785 with Jaccard = 1.0000 [ 37 0 1100174 0 ] 1.0000 1.0000
sibling [ 6761627 ] : 6765753 0.00601504 (=4/(19*35)) 99.5457
best keyword for cluster 6765753 is PF04269 with Jaccard = 0.9333 [ 14 1 1100196 0 ] 0.9333 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01785 ( PF01785 Closterovirus coat protein )
B> PF04269 ( PF04269 Protein of unknown function, DUF440 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF01785 has a PDB structure (may not be up to date)
PF04269 d.17.7.1
SUPERFAM mapping significantly overlapping:
1 PF04269 SSF102816 0.992 (average over 88 mutual instances, PF04269 88 appearances, SSF102816 88 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 22 ) 6751324_PF01819_PF07279 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01819 is 6668358 with Jaccard = 1.0000 |PF01819|=8 [ 8 0 1100203 0 ]
parent [ 6668358 ] : 6751324 0.0153846 (=2/(10*13)) 98.72
given [ 6668358 ] : 6668358 0.16 (=4/(5*5)) 85.0497
best keyword for cluster 6668358 is PF01819 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 6668358 ] : 6733487 0.047619 (=2/(7*6)) 97.1072
best keyword for cluster 6733487 is PF07279 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01819 ( PF01819 Levivirus coat protein )
B> PF07279 ( PF07279 Protein of unknown function (DUF1442) )
Only B has a clan ( CL0102.14 ).
the two keywords do not coincide on UniRef90 proteins
only PF01819 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF01819 SSF55405 0.991 (average over 31 mutual instances, PF01819 31 appearances, SSF55405 34 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 23 ) 6700595_PF01874_PF03802 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01874 is 6637228 with Jaccard = 1.0000 |PF01874|=102 [ 102 0 1100109 0 ]
parent [ 6637228 ] : 6700595 0.0785047 (=210/(25*107)) 92.2838
given [ 6637228 ] : 6637228 0.276003 (=674/(33*74)) 76.3039
best keyword for cluster 6637228 is PF01874 with Jaccard = 1.0000 [ 102 0 1100109 0 ] 1.0000 1.0000
sibling [ 6637228 ] : 6464752 0.974026 (=150/(11*14)) 2.96668
best keyword for cluster 6464752 is PF03802 with Jaccard = 0.7419 [ 23 0 1100180 8 ] 1.0000 0.7419
SUGGESTING RELATEDNESS OF:
A> PF01874 ( PF01874 ATP:dephospho-CoA triphosphoribosyl transferase )
B> PF03802 ( PF03802 Apo-citrate lyase phosphoribosyl-dephospho-CoA transferase )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01874| = 102 , |PF03802| = 31 , |PF01874^PF03802| = 8 ( 7.8% and 25.8% )
Neither PF01874 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 24 ) 6708998_PF01907_PF06869 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01907 is 6473568 with Jaccard = 1.0000 |PF01907|=66 [ 66 0 1100145 0 ]
parent [ 6473568 ] : 6708998 0.0963365 (=71/(67*11)) 93.7936
given [ 6473568 ] : 6473568 0.959574 (=902/(47*20)) 4.39396
best keyword for cluster 6473568 is PF01907 with Jaccard = 1.0000 [ 66 0 1100145 0 ] 1.0000 1.0000
sibling [ 6473568 ] : 6541382 0.666667 (=20/(5*6)) 33.3334
best keyword for cluster 6541382 is PF06869 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01907 ( PF01907 Ribosomal protein L37e )
B> PF06869 ( PF06869 Protein of unknown function (DUF1258) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF01907 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF01907 SSF57829 0.973 (average over 178 mutual instances, PF01907 178 appearances, SSF57829 837 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 25 ) 6734323_PF01910_PF07615 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01910 is 6549099 with Jaccard = 1.0000 |PF01910|=106 [ 106 0 1100105 0 ]
parent [ 6549099 ] : 6734323 0.038214 (=95/(113*22)) 97.204
given [ 6549099 ] : 6549099 0.643151 (=1878/(40*73)) 38.1115
best keyword for cluster 6549099 is PF01910 with Jaccard = 1.0000 [ 106 0 1100105 0 ] 1.0000 1.0000
sibling [ 6549099 ] : 6703912 0.142857 (=16/(14*8)) 92.8925
best keyword for cluster 6703912 is PF07615 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01910 ( PF01910 Domain of unknown function DUF77 )
B> PF07615 ( PF07615 YKOF-related Family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF01910 and PF07615 have PDB structures
PF01910 d.58.48.1
SUPERFAM mapping significantly overlapping:
1 PF01910 SSF89957 0.936 (average over 259 mutual instances, PF01910 259 appearances, SSF89957 271 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 26 ) 6663981_PF01775_PF01911 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01911 is 6491337 with Jaccard = 1.0000 |PF01911|=17 [ 17 0 1100194 0 ]
parent [ 6491337 ] : 6663981 0.206437 (=186/(17*53)) 84.0597
given [ 6491337 ] : 6491337 0.939394 (=62/(6*11)) 8.66675
best keyword for cluster 6491337 is PF01911 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
sibling [ 6491337 ] : 6625242 0.286667 (=43/(50*3)) 72.577
best keyword for cluster 6625242 is PF01775 with Jaccard = 1.0000 [ 49 0 1100162 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01911 ( PF01911 Ribosomal LX protein )
B> PF01775 ( PF01775 Ribosomal L18ae protein family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01911 nor PF01775 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 27 ) 6676966_PF01917_PF04975 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01917 is 6652451 with Jaccard = 1.0000 |PF01917|=60 [ 60 0 1100151 0 ]
parent [ 6652451 ] : 6676966 0.135742 (=139/(16*64)) 87.4912
given [ 6652451 ] : 6652451 0.220275 (=176/(47*17)) 80.81
best keyword for cluster 6652451 is PF01917 with Jaccard = 1.0000 [ 60 0 1100151 0 ] 1.0000 1.0000
sibling [ 6652451 ] : 6555978 0.615385 (=24/(3*13)) 43.5642
best keyword for cluster 6555978 is PF04975 with Jaccard = 1.0000 [ 13 0 1100198 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01917 ( PF01917 Archaebacterial flagellin )
B> PF04975 ( PF04975 Archaeal flagellar protein G )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01917 nor PF04975 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 28 ) 6760000_PF01960_PF03576 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01960 is 6316795 with Jaccard = 1.0000 |PF01960|=204 [ 204 0 1100007 0 ]
parent [ 6316795 ] : 6760000 0.0108387 (=267/(218*113)) 99.2638
given [ 6316795 ] : 6316795 1 (=217/(1*217)) 2.30416e-08
best keyword for cluster 6316795 is PF01960 with Jaccard = 1.0000 [ 204 0 1100007 0 ] 1.0000 1.0000
sibling [ 6316795 ] : 6759753 0.00892857 (=1/(1*112)) 99.25
best keyword for cluster 6759753 is PF03576 with Jaccard = 1.0000 [ 96 0 1100115 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01960 ( PF01960 ArgJ family )
B> PF03576 ( PF03576 Peptidase family S58 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF01960 and PF03576 have PDB structures
PF01960 d.154.1.2
SUPERFAM mapping significantly overlapping:
1 PF03576 SSF56266 0.924 (average over 295 mutual instances, PF03576 295 appearances, SSF56266 883 appearances)
2 PF01960 SSF56266 0.979 (average over 568 mutual instances, PF01960 572 appearances, SSF56266 883 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 29 ) 6618498_PF01343_PF01972 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01972 is 6465557 with Jaccard = 1.0000 |PF01972|=33 [ 33 0 1100178 0 ]
parent [ 6465557 ] : 6618498 0.332616 (=4634/(36*387)) 69.8682
given [ 6465557 ] : 6465557 0.969697 (=96/(3*33)) 3.09155
best keyword for cluster 6465557 is PF01972 with Jaccard = 1.0000 [ 33 0 1100178 0 ] 1.0000 1.0000
sibling [ 6465557 ] : 6606401 0.377922 (=291/(2*385)) 64.8742
best keyword for cluster 6606401 is PF01343 with Jaccard = 0.9508 [ 348 0 1099845 18 ] 1.0000 0.9508
SUGGESTING RELATEDNESS OF:
A> PF01972 ( PF01972 Protein of unknown function DUF114 )
B> PF01343 ( PF01343 Peptidase family S49 )
they come from the same clan: CL0127.6 : PF03255 PF01039 PF00574 PF01972 PF00378 PF06833 PF03572 PF01343
the two keywords do not coincide on UniRef90 proteins
Neither PF01972 nor PF01343 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 30 ) 6760043_PF01998_PF05805 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01998 is 6645338 with Jaccard = 1.0000 |PF01998|=25 [ 25 0 1100186 0 ]
parent [ 6645338 ] : 6760043 0.00788177 (=8/(29*35)) 99.2662
given [ 6645338 ] : 6645338 0.230769 (=18/(26*3)) 78.5426
best keyword for cluster 6645338 is PF01998 with Jaccard = 1.0000 [ 25 0 1100186 0 ] 1.0000 1.0000
sibling [ 6645338 ] : 6717386 0.06 (=9/(30*5)) 95.0668
best keyword for cluster 6717386 is PF05805 with Jaccard = 1.0000 [ 29 0 1100182 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01998 ( PF01998 Protein of unknown function DUF131 )
B> PF05805 ( PF05805 L6 membrane protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01998 nor PF05805 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 31 ) 6561248_PF02031_PF05547 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02031 is 6164337 with Jaccard = 1.0000 |PF02031|=7 [ 7 0 1100204 0 ]
parent [ 6164337 ] : 6561248 0.62406 (=166/(7*38)) 47.9608
given [ 6164337 ] : 6164337 1 (=6/(1*6)) 3.63333e-20
best keyword for cluster 6164337 is PF02031 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 6164337 ] : 6509585 0.858238 (=224/(29*9)) 15.11
best keyword for cluster 6509585 is PF05547 with Jaccard = 0.6444 [ 29 2 1100166 14 ] 0.9355 0.6744
SUGGESTING RELATEDNESS OF:
A> PF02031 ( PF02031 Streptomyces extracellular neutral proteinase (M7) family )
B> PF05547 ( PF05547 Immune inhibitor A peptidase M6 )
they come from the same clan: CL0126.12 : PF08325 PF01421 PF01752 PF01457 PF02031 PF09471 PF05299 PF05547 PF05572 PF01434 PF01447 PF02128 PF02102 PF02074 PF01432 PF01742 PF01401 PF01431 PF05548 PF00413 PF01433 PF01863 PF07998 PF01400
the two keywords do not coincide on UniRef90 proteins
only PF02031 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 32 ) 6616117_PF02055_PF02057 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02057 is 6536926 with Jaccard = 1.0000 |PF02057|=11 [ 11 0 1100200 0 ]
parent [ 6536926 ] : 6616117 0.359566 (=265/(67*11)) 68.9668
given [ 6536926 ] : 6536926 0.7 (=7/(1*10)) 30.0036
best keyword for cluster 6536926 is PF02057 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000
sibling [ 6536926 ] : 6486390 0.942353 (=801/(50*17)) 7.20224
best keyword for cluster 6486390 is PF02055 with Jaccard = 0.8689 [ 53 5 1100150 3 ] 0.9138 0.9464
SUGGESTING RELATEDNESS OF:
A> PF02057 ( PF02057 Glycosyl hydrolase family 59 )
B> PF02055 ( PF02055 O-Glycosyl hydrolase family 30 )
they come from the same clan: CL0058.10 : PF07971 PF02446 PF03198 PF02324 PF02057 PF01630 PF07745 PF02449 PF01229 PF01301 PF01055 PF02055 PF00933 PF02836 PF02156 PF01183 PF00728 PF00704 PF00332 PF01373 PF00331 PF00232 PF02638 PF00150 PF00128 PF02065
the two keywords do not coincide on UniRef90 proteins
only PF02057 has a PDB structure (may not be up to date)
PF02055 c.1.8.3
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 33 ) 6767530_PF02083_PF03303 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02083 is 6734368 with Jaccard = 1.0000 |PF02083|=17 [ 17 0 1100194 0 ]
parent [ 6734368 ] : 6767530 0.00408163 (=5/(25*49)) 99.6199
given [ 6734368 ] : 6734368 0.0441176 (=6/(17*8)) 97.211
best keyword for cluster 6734368 is PF02083 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
sibling [ 6734368 ] : 6757855 0.0117647 (=6/(15*34)) 99.1449
best keyword for cluster 6757855 is PF03303 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02083 ( PF02083 Urotensin II )
B> PF03303 ( PF03303 WTF protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF02083 nor PF03303 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 34 ) 6746226_PF00705_PF02144 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02144 is 6724671 with Jaccard = 1.0000 |PF02144|=23 [ 23 0 1100188 0 ]
parent [ 6724671 ] : 6746226 0.0206774 (=116/(110*51)) 98.3284
given [ 6724671 ] : 6724671 0.0415225 (=24/(34*17)) 96.0645
best keyword for cluster 6724671 is PF02144 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000
sibling [ 6724671 ] : 6649498 0.247706 (=27/(1*109)) 79.9756
best keyword for cluster 6649498 is PF00705 with Jaccard = 0.9515 [ 98 5 1100108 0 ] 0.9515 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02144 ( PF02144 Repair protein Rad1/Rec1/Rad17 )
B> PF00705 ( PF00705 Proliferating cell nuclear antigen, N-terminal domain )
Only B has a clan ( CL0060.7 ).
the two keywords do not coincide on UniRef90 proteins
only PF02144 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 35 ) 6568023_PF00096_PF02200 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02200 is 6030065 with Jaccard = 1.0000 |PF02200|=27 [ 27 0 1100184 0 ]
parent [ 6030065 ] : 6568023 0.505667 (=55771/(28*3939)) 50.5762
given [ 6030065 ] : 6030065 1 (=27/(1*27)) 1.63293e-31
best keyword for cluster 6030065 is PF02200 with Jaccard = 1.0000 [ 27 0 1100184 0 ] 1.0000 1.0000
sibling [ 6030065 ] : 6565745 0.579583 (=13677/(6*3933)) 50.1386
best keyword for cluster 6565745 is PF00096 with Jaccard = 0.7430 [ 3636 8 1095317 1250 ] 0.9978 0.7442
SUGGESTING RELATEDNESS OF:
A> PF02200 ( PF02200 STE like transcription factor )
B> PF00096 ( PF00096 Zinc finger, C2H2 type )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00096| = 4886 , |PF02200| = 27 , |PF00096^PF02200| = 15 ( 0.3% and 55.6% )
only PF02200 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 36 ) 6632161_PF02263_PF05879 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02263 is 6600235 with Jaccard = 1.0000 |PF02263|=91 [ 91 0 1100120 0 ]
parent [ 6600235 ] : 6632161 0.279375 (=1341/(96*50)) 75.1732
given [ 6600235 ] : 6600235 0.398148 (=215/(90*6)) 61.7589
best keyword for cluster 6600235 is PF02263 with Jaccard = 1.0000 [ 91 0 1100120 0 ] 1.0000 1.0000
sibling [ 6600235 ] : 6618709 0.330357 (=111/(42*8)) 69.9763
best keyword for cluster 6618709 is PF05879 with Jaccard = 0.8936 [ 42 0 1100164 5 ] 1.0000 0.8936
SUGGESTING RELATEDNESS OF:
A> PF02263 ( PF02263 Guanylate-binding protein, N-terminal domain )
B> PF05879 ( PF05879 Root hair defective 3 GTP-binding protein (RHD3) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF02263 has a PDB structure (may not be up to date)
PF02263 c.37.1.8
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 37 ) 6745265_PF01160_PF02315 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02315 is 6425638 with Jaccard = 1.0000 |PF02315|=12 [ 12 0 1100199 0 ]
parent [ 6425638 ] : 6745265 0.0311355 (=34/(12*91)) 98.2522
given [ 6425638 ] : 6425638 1 (=11/(1*11)) 0.188364
best keyword for cluster 6425638 is PF02315 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 6425638 ] : 6706795 0.0666667 (=6/(1*90)) 93.4249
best keyword for cluster 6706795 is PF01160 with Jaccard = 1.0000 [ 38 0 1100173 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02315 ( PF02315 Methanol dehydrogenase beta subunit )
B> PF01160 ( PF01160 Vertebrate endogenous opioids neuropeptide )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF02315 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF02315 SSF48666 0.766 (average over 20 mutual instances, PF02315 20 appearances, SSF48666 20 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 38 ) 6749298_PF02350_PF04007 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02350 is 6621592 with Jaccard = 1.0000 |PF02350|=268 [ 268 0 1099943 0 ]
parent [ 6621592 ] : 6749298 0.0184441 (=202/(296*37)) 98.5682
given [ 6621592 ] : 6621592 0.29932 (=176/(2*294)) 71.0835
best keyword for cluster 6621592 is PF02350 with Jaccard = 1.0000 [ 268 0 1099943 0 ] 1.0000 1.0000
sibling [ 6621592 ] : 6738192 0.0314685 (=9/(26*11)) 97.6077
best keyword for cluster 6738192 is PF04007 with Jaccard = 0.9259 [ 25 1 1100184 1 ] 0.9615 0.9615
SUGGESTING RELATEDNESS OF:
A> PF02350 ( PF02350 UDP-N-acetylglucosamine 2-epimerase )
B> PF04007 ( PF04007 Protein of unknown function (DUF354) )
they come from the same clan: CL0113.8 : PF06925 PF02684 PF04464 PF04101 PF01075 PF03033 PF00982 PF00534 PF05693 PF02350 PF04007 PF06722 PF05159 PF08660 PF00343 PF00201
the two keywords do not coincide on UniRef90 proteins
only PF02350 has a PDB structure (may not be up to date)
PF02350 c.87.1.3
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 39 ) 6749540_PF02386_PF03814 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02386 is 6559302 with Jaccard = 1.0000 |PF02386|=326 [ 326 0 1099885 0 ]
parent [ 6559302 ] : 6749540 0.0191244 (=747/(372*105)) 98.5865
given [ 6559302 ] : 6559302 0.567523 (=19470/(203*169)) 46.1176
best keyword for cluster 6559302 is PF02386 with Jaccard = 1.0000 [ 326 0 1099885 0 ] 1.0000 1.0000
sibling [ 6559302 ] : 6743108 0.026 (=13/(100*5)) 98.0632
best keyword for cluster 6743108 is PF03814 with Jaccard = 1.0000 [ 94 0 1100117 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02386 ( PF02386 Cation transport protein )
B> PF03814 ( PF03814 Potassium-transporting ATPase A subunit )
Only A has a clan ( CL0030.10 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF02386 nor PF03814 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 40 ) 6637846_PF02439_PF05393 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02439 is 6414549 with Jaccard = 1.0000 |PF02439|=17 [ 17 0 1100194 0 ]
parent [ 6414549 ] : 6637846 0.270588 (=69/(17*15)) 76.4309
given [ 6414549 ] : 6414549 1 (=72/(9*8)) 0.0556366
best keyword for cluster 6414549 is PF02439 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
sibling [ 6414549 ] : 6546736 0.694444 (=25/(3*12)) 36.4043
best keyword for cluster 6546736 is PF05393 with Jaccard = 0.7500 [ 3 1 1100207 0 ] 0.7500 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02439 ( PF02439 Adenovirus E3 region protein CR2 )
B> PF05393 ( PF05393 Human adenovirus early E3A glycoprotein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF02439 nor PF05393 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 41 ) 6726696_PF02443_PF07305 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02443 is 6506052 with Jaccard = 1.0000 |PF02443|=21 [ 21 0 1100190 0 ]
parent [ 6506052 ] : 6726696 0.046875 (=9/(24*8)) 96.3226
given [ 6506052 ] : 6506052 0.861111 (=93/(6*18)) 13.9414
best keyword for cluster 6506052 is PF02443 with Jaccard = 1.0000 [ 21 0 1100190 0 ] 1.0000 1.0000
sibling [ 6506052 ] : 6672808 0.142857 (=1/(1*7)) 86.2857
best keyword for cluster 6672808 is PF07305 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02443 ( PF02443 Circovirus ORF-2 protein )
B> PF07305 ( PF07305 Protein of unknown function (DUF1454) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF02443 nor PF07305 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 42 ) 6775898_PF00482_PF02529 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02529 is 6512765 with Jaccard = 1.0000 |PF02529|=27 [ 27 0 1100184 0 ]
parent [ 6512765 ] : 6775898 0.00229885 (=54/(27*870)) 99.8776
given [ 6512765 ] : 6512765 0.9 (=45/(2*25)) 16.6988
best keyword for cluster 6512765 is PF02529 with Jaccard = 1.0000 [ 27 0 1100184 0 ] 1.0000 1.0000
sibling [ 6512765 ] : 6773386 0.00243759 (=79/(831*39)) 99.8154
best keyword for cluster 6773386 is PF00482 with Jaccard = 0.9794 [ 712 3 1099484 12 ] 0.9958 0.9834
SUGGESTING RELATEDNESS OF:
A> PF02529 ( PF02529 Cytochrome B6-F complex subunit 5 )
B> PF00482 ( PF00482 Bacterial type II secretion system protein F domain )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF02529 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF02529 SSF103446 0.807 (average over 170 mutual instances, PF02529 170 appearances, SSF103446 172 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 43 ) 6729102_PF02632_PF07155 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02632 is 6574799 with Jaccard = 1.0000 |PF02632|=156 [ 156 0 1100055 0 ]
parent [ 6574799 ] : 6729102 0.0455752 (=2336/(172*298)) 96.6228
given [ 6574799 ] : 6574799 0.485294 (=165/(2*170)) 52.1508
best keyword for cluster 6574799 is PF02632 with Jaccard = 1.0000 [ 156 0 1100055 0 ] 1.0000 1.0000
sibling [ 6574799 ] : 6712034 0.0725709 (=717/(38*260)) 94.2517
best keyword for cluster 6712034 is PF07155 with Jaccard = 0.8462 [ 22 4 1100185 0 ] 0.8462 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02632 ( PF02632 BioY family )
B> PF07155 ( PF07155 Protein of unknown function (DUF1393) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF02632 nor PF07155 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 44 ) 6666590_PF02621_PF02642 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02642 is 6529533 with Jaccard = 1.0000 |PF02642|=45 [ 45 0 1100166 0 ]
parent [ 6529533 ] : 6666590 0.178286 (=491/(51*54)) 84.6206
given [ 6529533 ] : 6529533 0.747826 (=172/(5*46)) 25.607
best keyword for cluster 6529533 is PF02642 with Jaccard = 1.0000 [ 45 0 1100166 0 ] 1.0000 1.0000
sibling [ 6529533 ] : 6598018 0.413462 (=43/(2*52)) 60.5863
best keyword for cluster 6598018 is PF02621 with Jaccard = 0.9762 [ 41 0 1100169 1 ] 1.0000 0.9762
SUGGESTING RELATEDNESS OF:
A> PF02642 ( PF02642 Uncharacterized ACR, COG2107 )
B> PF02621 ( PF02621 Uncharacterized ACR, COG1427 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF02642 has a PDB structure (may not be up to date)
PF02642 c.94.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 45 ) 6756280_PF02659_PF03596 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02659 is 6560928 with Jaccard = 1.0000 |PF02659|=78 [ 78 0 1100133 0 ]
parent [ 6560928 ] : 6756280 0.013824 (=71/(107*48)) 99.0487
given [ 6560928 ] : 6560928 0.57875 (=1389/(32*75)) 47.5867
best keyword for cluster 6560928 is PF02659 with Jaccard = 1.0000 [ 78 0 1100133 0 ] 1.0000 1.0000
sibling [ 6560928 ] : 6717408 0.0592334 (=17/(41*7)) 95.0708
best keyword for cluster 6717408 is PF03596 with Jaccard = 0.9667 [ 29 0 1100181 1 ] 1.0000 0.9667
SUGGESTING RELATEDNESS OF:
A> PF02659 ( PF02659 Domain of unknown function DUF )
B> PF03596 ( PF03596 Cadmium resistance transporter )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF02659 nor PF03596 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 46 ) 6560926_PF02667_PF03806 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02667 is 6536414 with Jaccard = 1.0000 |PF02667|=37 [ 37 0 1100174 0 ]
parent [ 6536414 ] : 6560926 0.557653 (=1625/(62*47)) 47.5842
given [ 6536414 ] : 6536414 0.717391 (=33/(1*46)) 29.927
best keyword for cluster 6536414 is PF02667 with Jaccard = 1.0000 [ 37 0 1100174 0 ] 1.0000 1.0000
sibling [ 6536414 ] : 6482805 0.937853 (=166/(59*3)) 6.3391
best keyword for cluster 6482805 is PF03806 with Jaccard = 0.9286 [ 52 4 1100155 0 ] 0.9286 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02667 ( PF02667 Short chain fatty acid transporter )
B> PF03806 ( PF03806 AbgT putative transporter family )
they come from the same clan: CL0182.8 : PF06450 PF00939 PF03553 PF07158 PF02652 PF02447 PF04165 PF07854 PF07399 PF03606 PF03605 PF06808 PF03600 PF02040 PF00873 PF03806 PF02667
the two keywords do not coincide on UniRef90 proteins
Neither PF02667 nor PF03806 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF03806 SSF103473 0.746 (average over 1 mutual instances, PF03806 1 appearances, SSF103473 39293 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 47 ) 6757239_PF00696_PF02670 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02670 is 6456649 with Jaccard = 1.0000 |PF02670|=230 [ 230 0 1099981 0 ]
parent [ 6456649 ] : 6757239 0.0127026 (=5532/(249*1749)) 99.1056
given [ 6456649 ] : 6456649 0.985095 (=727/(3*246)) 1.92559
best keyword for cluster 6456649 is PF02670 with Jaccard = 1.0000 [ 230 0 1099981 0 ] 1.0000 1.0000
sibling [ 6456649 ] : 6750126 0.0200229 (=35/(1*1748)) 98.6267
best keyword for cluster 6750126 is PF00696 with Jaccard = 0.8169 [ 1298 265 1098622 26 ] 0.8305 0.9804
SUGGESTING RELATEDNESS OF:
A> PF02670 ( PF02670 1-deoxy-D-xylulose 5-phosphate reductoisomerase )
B> PF00696 ( PF00696 Amino acid kinase family )
Only A has a clan ( CL0063.17 ).
the two keywords do not coincide on UniRef90 proteins
both PF02670 and PF00696 have PDB structures
PF02670 c.2.1.3
PF00696 c.73.1.1 c.73.1.2 c.73.1.3
SUPERFAM mapping significantly overlapping:
1 PF02670 SSF51735 0.860 (average over 693 mutual instances, PF02670 693 appearances, SSF51735 164772 appearances)
2 PF00696 SSF53633 0.922 (average over 4687 mutual instances, PF00696 5933 appearances, SSF53633 7277 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 48 ) 6707567_PF02118_PF02688 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02688 is 6360931 with Jaccard = 1.0000 |PF02688|=62 [ 62 0 1100149 0 ]
parent [ 6360931 ] : 6707567 0.0743728 (=664/(62*144)) 93.5584
given [ 6360931 ] : 6360931 1 (=561/(11*51)) 2.72794e-05
best keyword for cluster 6360931 is PF02688 with Jaccard = 1.0000 [ 62 0 1100149 0 ] 1.0000 1.0000
sibling [ 6360931 ] : 6683987 0.130022 (=466/(32*112)) 89.0859
best keyword for cluster 6683987 is PF02118 with Jaccard = 0.6087 [ 42 26 1100142 1 ] 0.6176 0.9767
SUGGESTING RELATEDNESS OF:
A> PF02688 ( PF02688 Domain of unknown function DUF215 )
B> PF02118 ( PF02118 C.elegans Srg family integral membrane protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF02688 nor PF02118 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 49 ) 6645831_PF02701_PF05344 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02701 is 6301493 with Jaccard = 1.0000 |PF02701|=95 [ 95 0 1100116 0 ]
parent [ 6301493 ] : 6645831 0.253506 (=235/(103*9)) 78.7358
given [ 6301493 ] : 6301493 1 (=396/(4*99)) 1.9965e-09
best keyword for cluster 6301493 is PF02701 with Jaccard = 1.0000 [ 95 0 1100116 0 ] 1.0000 1.0000
sibling [ 6301493 ] : 6606497 0.5 (=7/(2*7)) 64.995
best keyword for cluster 6606497 is PF05344 with Jaccard = 0.8000 [ 4 0 1100206 1 ] 1.0000 0.8000
SUGGESTING RELATEDNESS OF:
A> PF02701 ( PF02701 Dof domain, zinc finger )
B> PF05344 ( PF05344 Domain of Unknown Function (DUF746) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 50 ) 6690965_PF01375_PF02917 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02917 is 6488710 with Jaccard = 1.0000 |PF02917|=8 [ 8 0 1100203 0 ]
parent [ 6488710 ] : 6690965 0.1125 (=9/(8*10)) 90.4783
given [ 6488710 ] : 6488710 1 (=12/(2*6)) 7.89477
best keyword for cluster 6488710 is PF02917 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 6488710 ] : 6685327 0.222222 (=2/(1*9)) 89.3333
best keyword for cluster 6685327 is PF01375 with Jaccard = 1.0000 [ 4 0 1100207 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02917 ( PF02917 Pertussis toxin, subunit 1 )
B> PF01375 ( PF01375 Heat-labile enterotoxin alpha chain )
they come from the same clan: CL0084.8 : PF02917 PF00644 PF01375 PF02763 PF03496 PF01129
the two keywords do not coincide on UniRef90 proteins
both PF02917 and PF01375 have PDB structures
PF01375 d.166.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 51 ) 6754260_PF02935_PF04762 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02935 is 6604280 with Jaccard = 1.0000 |PF02935|=20 [ 20 0 1100191 0 ]
parent [ 6604280 ] : 6754260 0.0117845 (=14/(27*44)) 98.9231
given [ 6604280 ] : 6604280 0.5 (=13/(1*26)) 63.7
best keyword for cluster 6604280 is PF02935 with Jaccard = 1.0000 [ 20 0 1100191 0 ] 1.0000 1.0000
sibling [ 6604280 ] : 6751636 0.0232558 (=1/(1*43)) 98.7442
best keyword for cluster 6751636 is PF04762 with Jaccard = 1.0000 [ 32 0 1100179 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02935 ( PF02935 Cytochrome c oxidase subunit VIIc )
B> PF04762 ( PF04762 IKI3 family )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF02935 has a PDB structure (may not be up to date)
PF02935 f.23.6.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 52 ) 6496729_PF02977_PF06801 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02977 is 6101122 with Jaccard = 1.0000 |PF02977|=2 [ 2 0 1100209 0 ]
parent [ 6101122 ] : 6496729 1 (=8/(2*4)) 10.2116
given [ 6101122 ] : 6101122 1 (=1/(1*1)) 2e-25
best keyword for cluster 6101122 is PF02977 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
sibling [ 6101122 ] : 6387427 1 (=3/(1*3)) 0.0014
best keyword for cluster 6387427 is PF06801 with Jaccard = 1.0000 [ 4 0 1100207 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02977 ( PF02977 Carboxypeptidase A inhibitor )
B> PF06801 ( PF06801 Protein of unknown function, DUF1532 )
Only A has a clan ( CL0096.7 ).
the two keywords do not coincide on UniRef90 proteins
only PF02977 has a PDB structure (may not be up to date)
PF02977 g.3.2.1
SUPERFAM mapping significantly overlapping:
1 PF02977 SSF57027 0.991 (average over 3 mutual instances, PF02977 3 appearances, SSF57027 43 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 53 ) 6545097_PF03019_PF05744 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03019 is 5862882 with Jaccard = 1.0000 |PF03019|=2 [ 2 0 1100209 0 ]
parent [ 5862882 ] : 6545097 0.75 (=3/(2*2)) 35.25
given [ 5862882 ] : 5862882 1 (=1/(1*1)) 2e-47
best keyword for cluster 5862882 is PF03019 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
sibling [ 5862882 ] : 5848995 1 (=1/(1*1)) 7e-49
best keyword for cluster 5848995 is PF05744 with Jaccard = 0.6667 [ 2 0 1100208 1 ] 1.0000 0.6667
SUGGESTING RELATEDNESS OF:
A> PF03019 ( PF03019 Furovirus P26 )
B> PF05744 ( PF05744 Benyvirus P25 protein )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03019| = 2 , |PF05744| = 3 , |PF03019^PF05744| = 1 ( 50.0% and 33.3% )
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 54 ) 6753592_PF01784_PF03091 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03091 is 6576312 with Jaccard = 1.0000 |PF03091|=146 [ 146 0 1100065 0 ]
parent [ 6576312 ] : 6753592 0.0164445 (=808/(155*317)) 98.876
given [ 6576312 ] : 6576312 0.477124 (=146/(2*153)) 52.6496
best keyword for cluster 6576312 is PF03091 with Jaccard = 1.0000 [ 146 0 1100065 0 ] 1.0000 1.0000
sibling [ 6576312 ] : 6724392 0.0411765 (=210/(300*17)) 96.0361
best keyword for cluster 6724392 is PF01784 with Jaccard = 0.9913 [ 229 0 1099980 2 ] 1.0000 0.9913
SUGGESTING RELATEDNESS OF:
A> PF03091 ( PF03091 CutA1 divalent ion tolerance protein )
B> PF01784 ( PF01784 NIF3 (NGG1p interacting factor 3) )
Only A has a clan ( CL0089.8 ).
the two keywords do not coincide on UniRef90 proteins
both PF03091 and PF01784 have PDB structures
PF03091 d.58.5.2
SUPERFAM mapping significantly overlapping:
1 PF01784 SSF102705 0.943 (average over 691 mutual instances, PF01784 822 appearances, SSF102705 692 appearances)
2 PF03091 SSF54913 0.947 (average over 389 mutual instances, PF03091 390 appearances, SSF54913 2763 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 55 ) 6721673_PF00793_PF03102 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03102 is 6497825 with Jaccard = 1.0000 |PF03102|=146 [ 146 0 1100065 0 ]
parent [ 6497825 ] : 6721673 0.0524902 (=4437/(158*535)) 95.6569
given [ 6497825 ] : 6497825 0.901075 (=419/(3*155)) 10.8848
best keyword for cluster 6497825 is PF03102 with Jaccard = 1.0000 [ 146 0 1100065 0 ] 1.0000 1.0000
sibling [ 6497825 ] : 6655682 0.227403 (=16167/(246*289)) 81.9951
best keyword for cluster 6655682 is PF00793 with Jaccard = 0.9895 [ 473 0 1099733 5 ] 1.0000 0.9895
SUGGESTING RELATEDNESS OF:
A> PF03102 ( PF03102 NeuB family )
B> PF00793 ( PF00793 DAHP synthetase I family )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF03102 and PF00793 have PDB structures
PF00793 c.1.10.4
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 56 ) 6703844_PF03014_PF03115 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03115 is 6662633 with Jaccard = 1.0000 |PF03115|=23 [ 23 0 1100188 0 ]
parent [ 6662633 ] : 6703844 0.094086 (=35/(31*12)) 92.8751
given [ 6662633 ] : 6662633 0.161905 (=34/(21*10)) 83.813
best keyword for cluster 6662633 is PF03115 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000
sibling [ 6662633 ] : 6668281 0.15 (=3/(2*10)) 85.0074
best keyword for cluster 6668281 is PF03014 with Jaccard = 0.9000 [ 9 1 1100201 0 ] 0.9000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03115 ( PF03115 Astrovirus capsid protein precursor )
B> PF03014 ( PF03014 Structural protein 2 )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 57 ) 6612540_PF01671_PF03158 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03158 is 6323662 with Jaccard = 1.0000 |PF03158|=11 [ 11 0 1100200 0 ]
parent [ 6323662 ] : 6612540 0.35 (=77/(11*20)) 67.5518
given [ 6323662 ] : 6323662 1 (=18/(2*9)) 7.22961e-08
best keyword for cluster 6323662 is PF03158 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000
sibling [ 6323662 ] : 6604515 0.361111 (=13/(2*18)) 63.9222
best keyword for cluster 6604515 is PF01671 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03158 ( PF03158 Multigene family 530 protein )
B> PF01671 ( PF01671 African swine fever virus multigene family 360 protein )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF03158 SSF48403 0.558 (average over 1 mutual instances, PF03158 1 appearances, SSF48403 17044 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 58 ) 6766171_PF03192_PF03628 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03192 is 6660260 with Jaccard = 1.0000 |PF03192|=25 [ 25 0 1100186 0 ]
parent [ 6660260 ] : 6766171 0.00725953 (=4/(29*19)) 99.5626
given [ 6660260 ] : 6660260 0.191667 (=23/(24*5)) 83.399
best keyword for cluster 6660260 is PF03192 with Jaccard = 1.0000 [ 25 0 1100186 0 ] 1.0000 1.0000
sibling [ 6660260 ] : 6721474 0.0641026 (=5/(6*13)) 95.6315
best keyword for cluster 6721474 is PF03628 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03192 ( PF03192 Pyrococcus protein of unknown function, DUF257 )
B> PF03628 ( PF03628 PapG chaperone-binding domain )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 59 ) 6739274_PF03194_PF04659 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03194 is 6493721 with Jaccard = 1.0000 |PF03194|=60 [ 60 0 1100151 0 ]
parent [ 6493721 ] : 6739274 0.0302154 (=94/(61*51)) 97.7126
given [ 6493721 ] : 6493721 0.940678 (=111/(2*59)) 9.36073
best keyword for cluster 6493721 is PF03194 with Jaccard = 1.0000 [ 60 0 1100151 0 ] 1.0000 1.0000
sibling [ 6493721 ] : 6726960 0.0464286 (=26/(16*35)) 96.3518
best keyword for cluster 6726960 is PF04659 with Jaccard = 0.9500 [ 19 1 1100191 0 ] 0.9500 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03194 ( PF03194 LUC7 N_terminus )
B> PF04659 ( PF04659 Archaeal flagella protein )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 60 ) 6718989_PF02153_PF03201 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03201 is 6353238 with Jaccard = 1.0000 |PF03201|=14 [ 14 0 1100197 0 ]
parent [ 6353238 ] : 6718989 0.0564436 (=226/(14*286)) 95.2773
given [ 6353238 ] : 6353238 1 (=48/(6*8)) 8.34744e-06
best keyword for cluster 6353238 is PF03201 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
sibling [ 6353238 ] : 6705140 0.0912281 (=26/(1*285)) 93.13
best keyword for cluster 6705140 is PF02153 with Jaccard = 0.8755 [ 232 20 1099946 13 ] 0.9206 0.9469
SUGGESTING RELATEDNESS OF:
A> PF03201 ( PF03201 H2-forming N5,N10-methylenetetrahydromethanopterin dehydrogenase )
B> PF02153 ( PF02153 Prephenate dehydrogenase )
Only B has a clan ( CL0063.17 ).
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF02153 SSF51735 0.580 (average over 737 mutual instances, PF02153 915 appearances, SSF51735 164772 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 61 ) 6716237_PF03220_PF08095 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03220 is 5890004 with Jaccard = 1.0000 |PF03220|=12 [ 12 0 1100199 0 ]
parent [ 5890004 ] : 6716237 0.0769231 (=6/(13*6)) 94.9231
given [ 5890004 ] : 5890004 1 (=12/(1*12)) 1.1491e-44
best keyword for cluster 5890004 is PF03220 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 5890004 ] : 6697977 0.125 (=1/(4*2)) 91.875
best keyword for cluster 6697977 is PF08095 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03220 ( PF03220 Tombusvirus P19 core protein )
B> PF08095 ( PF08095 Hefutoxin family )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF03220 and PF08095 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF03220 SSF103145 0.842 (average over 54 mutual instances, PF03220 54 appearances, SSF103145 54 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 62 ) 6740829_PF03240_PF06981 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03240 is 6685134 with Jaccard = 1.0000 |PF03240|=123 [ 123 0 1100088 0 ]
parent [ 6685134 ] : 6740829 0.0287168 (=762/(145*183)) 97.859
given [ 6685134 ] : 6685134 0.120567 (=68/(141*4)) 89.3038
best keyword for cluster 6685134 is PF03240 with Jaccard = 1.0000 [ 123 0 1100088 0 ] 1.0000 1.0000
sibling [ 6685134 ] : 6729729 0.0408685 (=64/(9*174)) 96.6955
best keyword for cluster 6729729 is PF06981 with Jaccard = 0.9813 [ 105 2 1100104 0 ] 0.9813 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03240 ( )
B> PF06981 ( )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 63 ) 6765561_PF02237_PF03309 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03309 is 6592249 with Jaccard = 1.0000 |PF03309|=164 [ 164 0 1100047 0 ]
parent [ 6592249 ] : 6765561 0.00467695 (=367/(190*413)) 99.5374
given [ 6592249 ] : 6592249 0.428571 (=81/(1*189)) 58.0669
best keyword for cluster 6592249 is PF03309 with Jaccard = 1.0000 [ 164 0 1100047 0 ] 1.0000 1.0000
sibling [ 6592249 ] : 6731294 0.0325159 (=419/(34*379)) 96.8599
best keyword for cluster 6731294 is PF02237 with Jaccard = 0.6176 [ 210 128 1099871 2 ] 0.6213 0.9906
SUGGESTING RELATEDNESS OF:
A> PF03309 ( PF03309 Bordetella pertussis Bvg accessory factor family )
B> PF02237 ( PF02237 Biotin protein ligase C terminal domain )
Only B has a clan ( CL0206.5 ).
the two keywords coincide on UniRef90 proteins: |PF02237| = 212 , |PF03309| = 164 , |PF02237^PF03309| = 1 ( 0.5% and 0.6% )
only PF03309 has a PDB structure (may not be up to date)
PF02237 b.34.1.1
SUPERFAM mapping significantly overlapping:
1 PF02237 SSF50037 0.995 (average over 385 mutual instances, PF02237 387 appearances, SSF50037 2023 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 64 ) 6718725_PF03314_PF05637 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03314 is 6116985 with Jaccard = 1.0000 |PF03314|=18 [ 18 0 1100193 0 ]
parent [ 6116985 ] : 6718725 0.0723514 (=112/(18*86)) 95.248
given [ 6116985 ] : 6116985 1 (=56/(4*14)) 4.56494e-24
best keyword for cluster 6116985 is PF03314 with Jaccard = 1.0000 [ 18 0 1100193 0 ] 1.0000 1.0000
sibling [ 6116985 ] : 6677423 0.152792 (=145/(73*13)) 87.5603
best keyword for cluster 6677423 is PF05637 with Jaccard = 0.9706 [ 66 1 1100143 1 ] 0.9851 0.9851
SUGGESTING RELATEDNESS OF:
A> PF03314 ( PF03314 Protein of unknown function, DUF273 )
B> PF05637 ( PF05637 galactosyl transferase GMA12/MNN10 family )
Only B has a clan ( CL0110.6 ).
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 65 ) 6698651_PF03331_PF07977 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03331 is 6468662 with Jaccard = 1.0000 |PF03331|=135 [ 135 0 1100076 0 ]
parent [ 6468662 ] : 6698651 0.0818869 (=5265/(152*423)) 91.9929
given [ 6468662 ] : 6468662 0.966887 (=146/(1*151)) 3.51342
best keyword for cluster 6468662 is PF03331 with Jaccard = 1.0000 [ 135 0 1100076 0 ] 1.0000 1.0000
sibling [ 6468662 ] : 6683841 0.143705 (=121/(2*421)) 89.05
best keyword for cluster 6683841 is PF07977 with Jaccard = 0.8889 [ 312 0 1099860 39 ] 1.0000 0.8889
SUGGESTING RELATEDNESS OF:
A> PF03331 ( PF03331 UDP-3-O-acyl N-acetylglycosamine deacetylase )
B> PF07977 ( PF07977 FabA-like domain )
Only B has a clan ( CL0050.7 ).
the two keywords coincide on UniRef90 proteins: |PF03331| = 135 , |PF07977| = 351 , |PF03331^PF07977| = 12 ( 8.9% and 3.4% )
both PF03331 and PF07977 have PDB structures
PF07977 d.38.1.2 d.38.1.6
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 66 ) 6730442_PF03337_PF07868 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03337 is 6203794 with Jaccard = 1.0000 |PF03337|=12 [ 12 0 1100199 0 ]
parent [ 6203794 ] : 6730442 0.0384615 (=1/(13*2)) 96.7692
given [ 6203794 ] : 6203794 1 (=22/(2*11)) 5.95465e-17
best keyword for cluster 6203794 is PF03337 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 6203794 ] : 6398744 1 (=1/(1*1)) 0.007
best keyword for cluster 6398744 is PF07868 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03337 ( PF03337 Poxvirus F12L protein )
B> PF07868 ( PF07868 Protein of unknown function (DUF1655) )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF03337 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 67 ) 6769752_PF03369_PF04759 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03369 is 6379454 with Jaccard = 1.0000 |PF03369|=19 [ 19 0 1100192 0 ]
parent [ 6379454 ] : 6769752 0.00679117 (=4/(19*31)) 99.7029
given [ 6379454 ] : 6379454 1 (=18/(1*18)) 0.000444445
best keyword for cluster 6379454 is PF03369 with Jaccard = 1.0000 [ 19 0 1100192 0 ] 1.0000 1.0000
sibling [ 6379454 ] : 6764022 0.0333333 (=1/(1*30)) 99.4667
best keyword for cluster 6764022 is PF04759 with Jaccard = 1.0000 [ 24 0 1100187 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03369 ( PF03369 Herpesvirus UL3 protein )
B> PF04759 ( PF04759 Protein of unknown function, DUF617 )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 68 ) 6738682_PF01681_PF03380 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03380 is 6395617 with Jaccard = 1.0000 |PF03380|=16 [ 16 0 1100195 0 ]
parent [ 6395617 ] : 6738682 0.0263158 (=8/(16*19)) 97.6541
given [ 6395617 ] : 6395617 1 (=63/(7*9)) 0.00457322
best keyword for cluster 6395617 is PF03380 with Jaccard = 1.0000 [ 16 0 1100195 0 ] 1.0000 1.0000
sibling [ 6395617 ] : 6695651 0.0857143 (=6/(14*5)) 91.4874
best keyword for cluster 6695651 is PF01681 with Jaccard = 0.6667 [ 8 2 1100199 2 ] 0.8000 0.8000
SUGGESTING RELATEDNESS OF:
A> PF03380 ( PF03380 Caenorhabditis protein of unknown function, DUF282 )
B> PF01681 ( PF01681 C6 domain )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 69 ) 6714930_PF00115_PF03390 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03390 is 6545427 with Jaccard = 1.0000 |PF03390|=36 [ 36 0 1100175 0 ]
parent [ 6545427 ] : 6714930 0.075415 (=11459/(41*3706)) 94.7136
given [ 6545427 ] : 6545427 0.648649 (=96/(37*4)) 35.5371
best keyword for cluster 6545427 is PF03390 with Jaccard = 1.0000 [ 36 0 1100175 0 ] 1.0000 1.0000
sibling [ 6545427 ] : 6702546 0.110931 (=411/(1*3705)) 92.6389
best keyword for cluster 6702546 is PF00115 with Jaccard = 0.9964 [ 3288 0 1096911 12 ] 1.0000 0.9964
SUGGESTING RELATEDNESS OF:
A> PF03390 ( PF03390 Bacterial sodium:citrate symporter )
B> PF00115 ( PF00115 Cytochrome C and Quinol oxidase polypeptide I )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF03390 has a PDB structure (may not be up to date)
PF00115 f.24.1.1
SUPERFAM mapping significantly overlapping:
1 PF00115 SSF81442 0.952 (average over 60836 mutual instances, PF00115 60939 appearances, SSF81442 60941 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 70 ) 6682295_PF03418_PF06866 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03418 is 6104243 with Jaccard = 1.0000 |PF03418|=23 [ 23 0 1100188 0 ]
parent [ 6104243 ] : 6682295 0.148026 (=135/(24*38)) 88.7803
given [ 6104243 ] : 6104243 1 (=23/(1*23)) 3.91787e-25
best keyword for cluster 6104243 is PF03418 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000
sibling [ 6104243 ] : 6450664 0.986111 (=71/(2*36)) 1.39873
best keyword for cluster 6450664 is PF06866 with Jaccard = 1.0000 [ 36 0 1100175 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03418 ( PF03418 Germination protease )
B> PF06866 ( PF06866 Protein of unknown function (DUF1256) )
they come from the same clan: CL0095.8 : PF01750 PF06866 PF03418
the two keywords do not coincide on UniRef90 proteins
only PF03418 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 71 ) 6726044_PF00290_PF03437 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03437 is 6441796 with Jaccard = 1.0000 |PF03437|=37 [ 37 0 1100174 0 ]
parent [ 6441796 ] : 6726044 0.0521127 (=592/(40*284)) 96.2404
given [ 6441796 ] : 6441796 0.9925 (=397/(20*20)) 0.750139
best keyword for cluster 6441796 is PF03437 with Jaccard = 1.0000 [ 37 0 1100174 0 ] 1.0000 1.0000
sibling [ 6441796 ] : 6467471 0.971467 (=2145/(8*276)) 3.33629
best keyword for cluster 6467471 is PF00290 with Jaccard = 0.9059 [ 260 2 1099924 25 ] 0.9924 0.9123
SUGGESTING RELATEDNESS OF:
A> PF03437 ( PF03437 BtpA family )
B> PF00290 ( PF00290 Tryptophan synthase alpha chain )
they come from the same clan: CL0036.17 : PF05690 PF01680 PF00834 PF01729 PF00697 PF03740 PF01884 PF00724 PF00215 PF03060 PF04095 PF04131 PF00478 PF00218 PF00977 PF01645 PF04309 PF01070 PF01207 PF04481 PF04476 PF01180 PF00701 PF01791 PF03932 PF03437 PF01081 PF00121 PF09370 PF02581 PF00290
the two keywords do not coincide on UniRef90 proteins
only PF03437 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF03437 SSF51366 0.881 (average over 36 mutual instances, PF03437 64 appearances, SSF51366 8168 appearances)
2 PF00290 SSF51366 0.965 (average over 971 mutual instances, PF00290 1015 appearances, SSF51366 8168 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 72 ) 6624418_PF03531_PF08512 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03531 is 6505640 with Jaccard = 1.0000 |PF03531|=46 [ 46 0 1100165 0 ]
parent [ 6505640 ] : 6624418 0.301333 (=226/(15*50)) 72.328
given [ 6505640 ] : 6505640 0.87234 (=123/(3*47)) 13.6458
best keyword for cluster 6505640 is PF03531 with Jaccard = 1.0000 [ 46 0 1100165 0 ] 1.0000 1.0000
sibling [ 6505640 ] : 6388117 1 (=56/(7*8)) 0.00158046
best keyword for cluster 6388117 is PF08512 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03531 ( PF03531 Structure-specific recognition protein (SSRP1) )
B> PF08512 ( PF08512 Histone chaperone Rttp106-like )
they come from the same clan: CL0215.5 : PF03531 PF08512
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 73 ) 6718707_PF03554_PF05702 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03554 is 6477431 with Jaccard = 1.0000 |PF03554|=29 [ 29 0 1100182 0 ]
parent [ 6477431 ] : 6718707 0.0598911 (=33/(29*19)) 95.2443
given [ 6477431 ] : 6477431 0.957143 (=201/(15*14)) 5.16296
best keyword for cluster 6477431 is PF03554 with Jaccard = 1.0000 [ 29 0 1100182 0 ] 1.0000 1.0000
sibling [ 6477431 ] : 6699135 0.153846 (=12/(6*13)) 92.0256
best keyword for cluster 6699135 is PF05702 with Jaccard = 1.0000 [ 13 0 1100198 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03554 ( PF03554 UL73 viral envelope glycoprotein )
B> PF05702 ( PF05702 Herpesvirus UL49.5 envelope/tegument protein )
they come from the same clan: CL0146.7 : PF05702 PF03554
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 74 ) 6607735_PF03569_PF07255 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03569 is 5663698 with Jaccard = 1.0000 |PF03569|=1 [ 1 0 1100210 0 ]
parent [ 5663698 ] : 6607735 0.5 (=4/(4*2)) 65.625
given [ 5663698 ] : 5663698 1 (=3/(1*3)) 4.66667e-71
best keyword for cluster 5663698 is PF03569 with Jaccard = 1.0000 [ 1 0 1100210 0 ] 1.0000 1.0000
sibling [ 5663698 ] : 6235623 1 (=1/(1*1)) 2e-14
best keyword for cluster 6235623 is PF07255 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03569 ( PF03569 Peptidase family C8 )
B> PF07255 ( PF07255 Benyvirus 14KDa protein )
Only A has a clan ( CL0125.9 ).
the two keywords do not coincide on UniRef90 proteins Neither PF03569 nor PF07255 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 75 ) 6705040_PF03584_PF04664 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03584 is 6596709 with Jaccard = 1.0000 |PF03584|=20 [ 20 0 1100191 0 ] parent [ 6596709 ] : 6705040 0.0785414 (=56/(23*31)) 93.11 given [ 6596709 ] : 6596709 0.4 (=36/(18*5)) 60 best keyword for cluster 6596709 is PF03584 with Jaccard = 1.0000 [ 20 0 1100191 0 ] 1.0000 1.0000 sibling [ 6596709 ] : 6641416 0.238095 (=20/(3*28)) 77.4404 best keyword for cluster 6641416 is PF04664 with Jaccard = 1.0000 [ 26 0 1100185 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03584 ( PF03584 Herpesvirus ICP4-like protein N-terminal region ) B> PF04664 ( PF04664 Opioid growth factor receptor (OGFr) conserved region ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF03584 nor PF04664 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 76 ) 6733155_PF03616_PF05684 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF03616 is 6556795 with Jaccard = 1.0000 |PF03616|=60 [ 60 0 1100151 0 ] parent [ 6556795 ] : 6733155 0.0441067 (=119/(71*38)) 97.0673 given [ 6556795 ] : 6556795 0.597727 (=526/(16*55)) 44.0769 best keyword for cluster 6556795 is PF03616 with Jaccard = 1.0000 [ 60 0 1100151 0 ] 1.0000 1.0000 sibling [ 6556795 ] : 6651156 0.234375 (=45/(32*6)) 80.3941 best keyword for cluster 6651156 is PF05684 with Jaccard = 1.0000 [ 30 0 1100181 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03616 ( PF03616 Sodium/glutamate symporter ) B> PF05684 ( PF05684 Protein of unknown function (DUF819) ) they come from the same clan: CL0064.7 : PF06826 PF03547 PF03601 PF05684 PF05982 PF03616 PF06965 PF00999 PF03977 PF01758 the two keywords do not coincide on UniRef90 proteins Neither PF03616 nor PF05684 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 77 ) 6717710_PF03644_PF05903 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03644 is 6706567 with Jaccard = 1.0000 |PF03644|=39 [ 39 0 1100172 0 ] parent [ 6706567 ] : 6717710 0.0498732 (=236/(91*52)) 95.1135 given [ 6706567 ] : 6706567 0.078125 (=15/(4*48)) 93.3974 best keyword for cluster 6706567 is PF03644 with Jaccard = 1.0000 [ 39 0 1100172 0 ] 1.0000 1.0000 sibling [ 6706567 ] : 6534232 0.75129 (=1456/(34*57)) 28.38 best keyword for cluster 6534232 is PF05903 with Jaccard = 0.8462 [ 88 0 1100107 16 ] 1.0000 0.8462 SUGGESTING RELATEDNESS OF: A> PF03644 ( PF03644 Glycosyl hydrolase family 85 ) B> PF05903 ( PF05903 PPPDE putative peptidase domain ) Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03644| = 39 , |PF05903| = 104 , |PF03644^PF05903| = 3 ( 7.7% and 2.9% ) Neither PF03644 nor PF05903 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 78 ) 6722841_PF03666_PF06218 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03666 is 6575662 with Jaccard = 1.0000 |PF03666|=32 [ 32 0 1100179 0 ] parent [ 6575662 ] : 6722841 0.0614187 (=71/(34*34)) 95.8374 given [ 6575662 ] : 6575662 0.5 (=32/(2*32)) 52.4362 best keyword for cluster 6575662 is PF03666 with Jaccard = 1.0000 [ 32 0 1100179 0 ] 1.0000 1.0000 sibling [ 6575662 ] : 6664821 0.2 (=24/(30*4)) 84.248 best keyword for cluster 6664821 is PF06218 with Jaccard = 1.0000 [ 30 0 1100181 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03666 ( PF03666 Uncharacterised protein family (UPF0171) ) B> PF06218 ( PF06218 Nitrogen permease regulator 2 ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF03666 nor PF06218 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 79 ) 6558522_PF00429_PF03708 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF03708 is 6497025 with Jaccard = 1.0000 |PF03708|=14 [ 14 0 1100197 0 ] parent [ 6497025 ] : 6558522 0.569444 (=984/(16*108)) 45.5788 given [ 6497025 ] : 6497025 0.904762 (=57/(7*9)) 10.381 best keyword for cluster 6497025 is PF03708 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000 sibling [ 6497025 ] : 6556826 0.570874 (=294/(5*103)) 44.105 best keyword for cluster 6556826 is PF00429 with Jaccard = 0.7500 [ 96 0 1100083 32 ] 1.0000 0.7500 SUGGESTING RELATEDNESS OF: A> PF03708 ( PF03708 Avian retrovirus envelope protein, gp85 ) B> PF00429 ( PF00429 ENV polyprotein (coat polyprotein) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF03708 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 80 ) 6752068_PF02130_PF03740 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03740 is 6479539 with Jaccard = 1.0000 |PF03740|=146 [ 146 0 1100065 0 ] parent [ 6479539 ] : 6752068 0.0123457 (=582/(162*291)) 98.7739 given [ 6479539 ] : 6479539 0.944099 (=152/(1*161)) 5.5978 best keyword for cluster 6479539 is PF03740 with Jaccard = 1.0000 [ 146 0 1100065 0 ] 1.0000 1.0000 sibling [ 6479539 ] : 6751948 0.0206897 (=6/(1*290)) 98.7655 best keyword for cluster 6751948 is PF02130 with Jaccard = 0.9850 [ 263 0 1099944 4 ] 1.0000 0.9850 SUGGESTING RELATEDNESS OF: A> PF03740 ( PF03740 Pyridoxal phosphate biosynthesis protein PdxJ ) B> PF02130 ( PF02130 Uncharacterized protein family UPF0054 ) Only A has a clan ( CL0036.17 ).
the two keywords coincide on UniRef90 proteins: |PF02130| = 267 , |PF03740| = 146 , |PF02130^PF03740| = 1 ( 0.4% and 0.7% ) both PF03740 and PF02130 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF03740 SSF63892 0.984 (average over 495 mutual instances, PF03740 495 appearances, SSF63892 498 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 81 ) 6682244_PF03775_PF03961 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03775 is 6481608 with Jaccard = 1.0000 |PF03775|=122 [ 122 0 1100089 0 ] parent [ 6481608 ] : 6682244 0.151884 (=1572/(138*75)) 88.7541 given [ 6481608 ] : 6481608 0.945726 (=4426/(60*78)) 6.05531 best keyword for cluster 6481608 is PF03775 with Jaccard = 1.0000 [ 122 0 1100089 0 ] 1.0000 1.0000 sibling [ 6481608 ] : 6549351 0.630137 (=92/(2*73)) 38.3585 best keyword for cluster 6549351 is PF03961 with Jaccard = 1.0000 [ 65 0 1100146 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03775 ( PF03775 Septum formation inhibitor MinC, C-terminal domain ) B> PF03961 ( PF03961 Protein of unknown function (DUF342) ) Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins only PF03775 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF03775 SSF63848 0.956 (average over 426 mutual instances, PF03775 426 appearances, SSF63848 651 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 82 ) 6719091_PF03601_PF03812 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03812 is 6066077 with Jaccard = 1.0000 |PF03812|=24 [ 24 0 1100187 0 ] parent [ 6066077 ] : 6719091 0.0627907 (=270/(25*172)) 95.2908 given [ 6066077 ] : 6066077 1 (=156/(12*13)) 2.11809e-28 best keyword for cluster 6066077 is PF03812 with Jaccard = 1.0000 [ 24 0 1100187 0 ] 1.0000 1.0000 sibling [ 6066077 ] : 6626729 0.368421 (=63/(1*171)) 73.2716 best keyword for cluster 6626729 is PF03601 with Jaccard = 0.9935 [ 152 0 1100058 1 ] 1.0000 0.9935 SUGGESTING RELATEDNESS OF: A> PF03812 ( PF03812 2-keto-3-deoxygluconate permease ) B> PF03601 ( PF03601 Conserved hypothetical protein 698 ) Only B has a clan ( CL0064.7 ). the two keywords do not coincide on UniRef90 proteins Neither PF03812 nor PF03601 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 83 ) 6769312_PF03816_PF07349 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF03816 is 6748358 with Jaccard = 1.0000 |PF03816|=268 [ 268 0 1099943 0 ] parent [ 6748358 ] : 6769312 0.00348211 (=77/(351*63)) 99.6875 given [ 6748358 ] : 6748358 0.0176587 (=89/(336*15)) 98.4974 best keyword for cluster 6748358 is PF03816 with Jaccard = 1.0000 [ 268 0 1099943 0 ] 1.0000 1.0000 sibling [ 6748358 ] : 6765690 0.00598086 (=5/(44*19)) 99.5427 best keyword for cluster 6765690 is PF07349 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03816 ( PF03816 Cell envelope-related transcriptional attenuator domain ) B> PF07349 ( PF07349 Protein of unknown function (DUF1478) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF03816 nor PF07349 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 84 ) 6740868_PF03870_PF05404 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03870 is 6522437 with Jaccard = 1.0000 |PF03870|=38 [ 38 0 1100173 0 ] parent [ 6522437 ] : 6740868 0.0277778 (=19/(18*38)) 97.864 given [ 6522437 ] : 6522437 0.805556 (=58/(2*36)) 21.3546 best keyword for cluster 6522437 is PF03870 with Jaccard = 1.0000 [ 38 0 1100173 0 ] 1.0000 1.0000 sibling [ 6522437 ] : 6410460 1 (=17/(1*17)) 0.0336493 best keyword for cluster 6410460 is PF05404 with Jaccard = 1.0000 [ 18 0 1100193 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03870 ( PF03870 RNA polymerase Rpb8 ) B> PF05404 ( PF05404 Translocon-associated protein, delta subunit precursor (TRAP-delta) ) Only A has a clan ( CL0021.12 ).
the two keywords do not coincide on UniRef90 proteins only PF03870 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF03870 SSF50249 0.950 (average over 88 mutual instances, PF03870 88 appearances, SSF50249 52669 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 85 ) 6714365_PF01598_PF03897 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03897 is 6428067 with Jaccard = 1.0000 |PF03897|=41 [ 41 0 1100170 0 ] parent [ 6428067 ] : 6714365 0.0755003 (=1494/(51*388)) 94.6228 given [ 6428067 ] : 6428067 1 (=650/(26*25)) 0.236971 best keyword for cluster 6428067 is PF03897 with Jaccard = 1.0000 [ 41 0 1100170 0 ] 1.0000 1.0000 sibling [ 6428067 ] : 6689106 0.10733 (=246/(6*382)) 90.0997 best keyword for cluster 6689106 is PF01598 with Jaccard = 0.9712 [ 202 4 1100003 2 ] 0.9806 0.9902 SUGGESTING RELATEDNESS OF: A> PF03897 ( ) B> PF01598 ( ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF03897 nor PF01598 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 86 ) 6699580_PF02430_PF03993 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF03993 is 6685662 with Jaccard = 1.0000 |PF03993|=23 [ 23 0 1100188 0 ] parent [ 6685662 ] : 6699580 0.0833333 (=91/(28*39)) 92.1105 given [ 6685662 ] : 6685662 0.111111 (=12/(36*3)) 89.3977 best keyword for cluster 6685662 is PF03993 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000 sibling [ 6685662 ] : 6681421 0.130435 (=15/(5*23)) 88.5474 best keyword for cluster 6681421 is PF02430 with Jaccard = 0.9333 [ 14 1 1100196 0 ] 0.9333 1.0000 SUGGESTING RELATEDNESS OF: A> PF03993 ( PF03993 Domain of Unknown Function (DUF349) ) B> PF02430 ( PF02430 Apical membrane antigen 1 ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF03993 has a PDB structure (may not be up to date) PF02430 g.61.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 87 ) 6706703_PF04000_PF07493 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04000 is 6638412 with Jaccard = 1.0000 |PF04000|=46 [ 46 0 1100165 0 ] parent [ 6638412 ] : 6706703 0.0772201 (=220/(77*37)) 93.4166 given [ 6638412 ] : 6638412 0.24487 (=358/(43*34)) 76.5882 best keyword for cluster 6638412 is PF04000 with Jaccard = 1.0000 [ 46 0 1100165 0 ] 1.0000 1.0000 sibling [ 6638412 ] : 6680814 0.128788 (=17/(33*4)) 88.3827 best keyword for cluster 6680814 is PF07493 with Jaccard = 0.9286 [ 26 0 1100183 2 ] 1.0000 0.9286 SUGGESTING RELATEDNESS OF: A> PF04000 ( PF04000 Sas10/Utp3/C1D family ) B> PF07493 ( ) Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins Neither PF04000 nor PF07493 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 88 ) 6766946_PF03621_PF04019 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04019 is 6592319 with Jaccard = 1.0000 |PF04019|=25 [ 25 0 1100186 0 ] parent [ 6592319 ] : 6766946 0.00694444 (=15/(27*80)) 99.596 given [ 6592319 ] : 6592319 0.42 (=21/(25*2)) 58.1465 best keyword for cluster 6592319 is PF04019 with Jaccard = 1.0000 [ 25 0 1100186 0 ] 1.0000 1.0000 sibling [ 6592319 ] : 6682588 0.125 (=38/(76*4)) 88.8624 best keyword for cluster 6682588 is PF03621 with Jaccard = 0.9620 [ 76 0 1100132 3 ] 1.0000 0.9620 SUGGESTING RELATEDNESS OF: A> PF04019 ( PF04019 Protein of unknown function (DUF359) ) B> PF03621 ( PF03621 MbtH-like protein ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF04019 nor PF03621 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 89 ) 6776165_PF01924_PF04029 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF04029 is 6600910 with Jaccard = 1.0000 |PF04029|=58 [ 58 0 1100153 0 ] parent [ 6600910 ] : 6776165 0.0018528 (=18/(67*145)) 99.883 given [ 6600910 ] : 6600910 0.407143 (=171/(7*60)) 62.0183 best keyword for cluster 6600910 is PF04029 with Jaccard = 1.0000 [ 58 0 1100153 0 ] 1.0000 1.0000 sibling [ 6600910 ] : 6775481 0.00694444 (=1/(1*144)) 99.8681 best keyword for cluster 6775481 is PF01924 with Jaccard = 1.0000 [ 121 0 1100090 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF04029 ( PF04029 2-phosphosulpholactate phosphatase ) B> PF01924 ( PF01924 Hydrogenase formation hypA family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF04029 has a PDB structure (may not be up to date) PF04029 c.148.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 90 ) 6718005_PF04035_PF06093 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04035 is 6466949 with Jaccard = 1.0000 |PF04035|=25 [ 25 0 1100186 0 ] parent [ 6466949 ] : 6718005 0.0618182 (=51/(25*33)) 95.1466 given [ 6466949 ] : 6466949 0.973333 (=146/(10*15)) 3.27405 best keyword for cluster 6466949 is PF04035 with Jaccard = 1.0000 [ 25 0 1100186 0 ] 1.0000 1.0000 sibling [ 6466949 ] : 6510434 0.866667 (=78/(3*30)) 15.744 best keyword for cluster 6510434 is PF06093 with Jaccard = 1.0000 [ 30 0 1100181 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF04035 ( PF04035 Archaeal DNA-directed RNA polymerase subunit E'' (RpoE'' or RpoE2) ) B> PF06093 ( PF06093 Transcription elongation protein Spt4 ) Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins only PF04035 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 91 ) 6651698_PF04045_PF05452 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04045 is 6553244 with Jaccard = 1.0000 |PF04045|=36 [ 36 0 1100175 0 ] parent [ 6553244 ] : 6651698 0.203947 (=31/(38*4)) 80.5116 given [ 6553244 ] : 6553244 0.609524 (=64/(3*35)) 41.3344 best keyword for cluster 6553244 is PF04045 with Jaccard = 1.0000 [ 36 0 1100175 0 ] 1.0000 1.0000 sibling [ 6553244 ] : 6592303 0.5 (=2/(2*2)) 58.125 best keyword for cluster 6592303 is PF05452 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF04045 ( PF04045 Arp2/3 complex, 34 kD subunit p34-Arc ) B> PF05452 ( PF05452 Clavanin ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF04045 nor PF05452 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 92 ) 6681927_PF04099_PF04628 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF04099 is 6582101 with Jaccard = 1.0000 |PF04099|=75 [ 75 0 1100136 0 ] parent [ 6582101 ] : 6681927 0.134615 (=630/(78*60)) 88.6667 given [ 6582101 ] : 6582101 0.480263 (=73/(2*76)) 54.4222 best keyword for cluster 6582101 is PF04099 with Jaccard = 1.0000 [ 75 0 1100136 0 ] 1.0000 1.0000 sibling [ 6582101 ] : 6659161 0.209821 (=47/(4*56)) 83.1383 best keyword for cluster 6659161 is PF04628 with Jaccard = 0.9565 [ 22 1 1100188 0 ] 0.9565 1.0000 SUGGESTING RELATEDNESS OF: A> PF04099 ( PF04099 Sybindin-like family ) B> PF04628 ( PF04628 Sedlin, N-terminal conserved region ) they come from the same clan: CL0212.4 : PF01217 PF04628 PF04099 the two keywords do not coincide on UniRef90 proteins only PF04099 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF04628 SSF64356 0.915 (average over 153 mutual instances, PF04628 154 appearances, SSF64356 1711 appearances) 2 PF04099 SSF64356 0.954 (average over 167 mutual instances, PF04099 170 appearances, SSF64356 1711 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 93 ) 6503008_PF04120_PF07300 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF04120 is 5840337 with Jaccard = 1.0000 |PF04120|=7 [ 7 0 1100204 0 ] parent [ 5840337 ] : 6503008 0.920168 (=219/(7*34)) 12.5888 given [ 5840337 ] : 5840337 1 (=12/(3*4)) 8.33752e-50 best keyword for cluster 5840337 is PF04120 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000 sibling [ 5840337 ] : 6462708 0.984375 (=63/(2*32)) 2.66949 best keyword for cluster 6462708 is PF07300 with Jaccard = 0.8571 [ 24 0 1100183 4 ] 1.0000 0.8571 SUGGESTING RELATEDNESS OF: A> PF04120 ( PF04120 Low affinity iron permease ) B> PF07300 ( PF07300 Protein of unknown function (DUF1452) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF04120 nor PF07300 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 94 ) 6729510_PF04132_PF04157 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04132 is 6607654 with Jaccard = 1.0000 |PF04132|=35 [ 35 0 1100176 0 ] parent [ 6607654 ] : 6729510 0.0415293 (=63/(37*41)) 96.6686 given [ 6607654 ] : 6607654 0.35625 (=57/(32*5)) 65.5231 best keyword for cluster 6607654 is PF04132 with Jaccard = 1.0000 [ 35 0 1100176 0 ] 1.0000 1.0000 sibling [ 6607654 ] : 6709014 0.0722222 (=13/(36*5)) 93.7974 best keyword for cluster 6709014 is PF04157 with Jaccard = 0.9375 [ 30 1 1100179 1 ] 0.9677 0.9677 SUGGESTING RELATEDNESS OF: A> PF04132 ( ) B> PF04157 ( PF04157 EAP30/Vps36 family ) Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins only PF04132 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 95 ) 6771880_PF00909_PF04143 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04143 is 6641466 with Jaccard = 1.0000 |PF04143|=255 [ 255 0 1099956 0 ] parent [ 6641466 ] : 6771880 0.00381098 (=905/(328*724)) 99.7714 given [ 6641466 ] : 6641466 0.270415 (=5752/(89*239)) 77.4524 best keyword for cluster 6641466 is PF04143 with Jaccard = 1.0000 [ 255 0 1099956 0 ] 1.0000 1.0000 sibling [ 6641466 ] : 6770675 0.0055325 (=4/(1*723)) 99.7331 best keyword for cluster 6770675 is PF00909 with Jaccard = 0.9448 [ 531 28 1099649 3 ] 0.9499 0.9944 SUGGESTING RELATEDNESS OF: A> PF04143 ( PF04143 YeeE/YedE family (DUF395) ) B> PF00909 ( PF00909 Ammonium Transporter Family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF04143 has a PDB structure (may not be up to date) PF00909 f.44.1.1 SUPERFAM mapping significantly overlapping: 1 PF00909 SSF111352 0.924 (average over 1549 mutual instances, PF00909 1587 appearances, SSF111352 1628 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 96 ) 6741418_PF04062_PF04189 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF04189 is 6489650 with Jaccard = 1.0000 |PF04189|=45 [ 45 0 1100166 0 ] parent [ 6489650 ] : 6741418 0.0208333 (=37/(37*48)) 97.9167 given [ 6489650 ] : 6489650 0.935652 (=538/(23*25)) 8.15847 best keyword for cluster 6489650 is PF04189 with Jaccard = 1.0000 [ 45 0 1100166 0 ] 1.0000 1.0000 sibling [ 6489650 ] : 6442028 0.992424 (=131/(33*4)) 0.769978 best keyword for cluster 6442028 is PF04062 with Jaccard = 0.9714 [ 34 0 1100176 1 ] 1.0000 0.9714 SUGGESTING RELATEDNESS OF: A> PF04189 ( PF04189 Gcd10p family ) B> PF04062 ( PF04062 P21-ARC (ARP2/3 complex 21 kDa subunit) ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF04062| = 35 , |PF04189| = 45 , |PF04062^PF04189| = 1 ( 2.9% and 2.2% ) Neither PF04189 nor PF04062 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF04062 SSF69060 0.984 (average over 81 mutual instances, PF04062 81 appearances, SSF69060 81 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 97 ) 6736052_PF04206_PF06587 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF04206 is 5874250 with Jaccard = 1.0000 |PF04206|=11 [ 11 0 1100200 0 ] parent [ 5874250 ] : 6736052 0.0272727 (=3/(11*10)) 97.3776 given [ 5874250 ] : 5874250 1 (=10/(1*10)) 3.00006e-46 best keyword for cluster 5874250 is PF04206 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000 sibling [ 5874250 ] : 6708293 0.08 (=2/(5*5)) 93.68 best keyword for cluster 6708293 is PF06587 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF04206 ( PF04206 Tetrahydromethanopterin S-methyltransferase, subunit E ) B> PF06587 ( PF06587 Protein of unknown function (DUF1137) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF04206 nor PF06587 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 98 ) 6726703_PF02302_PF04215 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04215 is 6296028 with Jaccard = 1.0000 |PF04215|=63 [ 63 0 1100148 0 ] parent [ 6296028 ] : 6726703 0.0394571 (=500/(64*198)) 96.3236 given [ 6296028 ] : 6296028 1 (=828/(46*18)) 7.25879e-10 best keyword for cluster 6296028 is PF04215 with Jaccard = 1.0000 [ 63 0 1100148 0 ] 1.0000 1.0000 sibling [ 6296028 ] : 6701543 0.101166 (=989/(94*104)) 92.4494 best keyword for cluster 6701543 is PF02302 with Jaccard = 0.6496 [ 178 0 1099937 96 ] 1.0000 0.6496 SUGGESTING RELATEDNESS OF: A> PF04215 ( PF04215 Putative sugar-specific permease, SgaT/UlaA ) B> PF02302 ( PF02302 PTS system, Lactose/Cellobiose specific IIB subunit ) Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF02302| = 274 , |PF04215| = 63 , |PF02302^PF04215| = 7 ( 2.6% and 11.1% ) only PF04215 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 99 ) 6623162_PF02550_PF04223 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04223 is 6480810 with Jaccard = 1.0000 |PF04223|=32 [ 32 0 1100179 0 ] parent [ 6480810 ] : 6623162 0.319368 (=2325/(35*208)) 71.7707 given [ 6480810 ] : 6480810 0.941176 (=32/(1*34)) 5.88235 best keyword for cluster 6480810 is PF04223 with Jaccard = 1.0000 [ 32 0 1100179 0 ] 1.0000 1.0000 sibling [ 6480810 ] : 6579868 0.502415 (=104/(1*207)) 53.6536 best keyword for cluster 6579868 is PF02550 with Jaccard = 1.0000 [ 182 0 1100029 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF04223 ( PF04223 Citrate lyase, alpha subunit (CitF) ) B> PF02550 ( PF02550 Acetyl-CoA hydrolase/transferase N-terminal domain ) Only B has a clan ( CL0246.3 ). the two keywords do not coincide on UniRef90 proteins only PF04223 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 100 ) 6699268_PF03441_PF04244 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04244 is 6537683 with Jaccard = 1.0000 |PF04244|=62 [ 62 0 1100149 0 ]
parent [ 6537683 ] : 6699268 0.0910146 (=2822/(74*419)) 92.0514
given [ 6537683 ] : 6537683 0.69863 (=51/(1*73)) 30.7167
best keyword for cluster 6537683 is PF04244 with Jaccard = 1.0000 [ 62 0 1100149 0 ] 1.0000 1.0000
sibling [ 6537683 ] : 6685874 0.124699 (=207/(415*4)) 89.4406
best keyword for cluster 6685874 is PF03441 with Jaccard = 0.9658 [ 367 7 1099831 6 ] 0.9813 0.9839
SUGGESTING RELATEDNESS OF:
A> PF04244 ( PF04244 Deoxyribodipyrimidine photolyase-related protein )
B> PF03441 ( PF03441 FAD binding domain of DNA photolyase )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03441| = 373 , |PF04244| = 62 , |PF03441^PF04244| = 2 ( 0.5% and 3.2% )
only PF04244 has a PDB structure (may not be up to date)
PF03441 a.99.1.1
SUPERFAM mapping significantly overlapping:
1 PF03441 SSF48173 0.874 (average over 1104 mutual instances, PF03441 2070 appearances, SSF48173 2238 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 101 ) 6626515_PF04272_PF05366 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04272 is 6357188 with Jaccard = 1.0000 |PF04272|=3 [ 3 0 1100208 0 ]
parent [ 6357188 ] : 6626515 0.333333 (=3/(3*3)) 73.09
given [ 6357188 ] : 6357188 1 (=2/(1*2)) 1.51e-05
best keyword for cluster 6357188 is PF04272 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
sibling [ 6357188 ] : 6344397 1 (=2/(1*2)) 2e-06
best keyword for cluster 6344397 is PF05366 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04272 ( PF04272 Phospholamban )
B> PF05366 ( PF05366 Sarcolipin )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF04272 and PF05366 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 102 ) 6698473_PF04315_PF08401 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04315 is 6686003 with Jaccard = 1.0000 |PF04315|=30 [ 30 0 1100181 0 ]
parent [ 6686003 ] : 6698473 0.112251 (=591/(135*39)) 91.9559
given [ 6686003 ] : 6686003 0.105263 (=4/(1*38)) 89.4737
best keyword for cluster 6686003 is PF04315 with Jaccard = 1.0000 [ 30 0 1100181 0 ] 1.0000 1.0000
sibling [ 6686003 ] : 6683025 0.131016 (=441/(33*102)) 88.944
best keyword for cluster 6683025 is PF08401 with Jaccard = 0.7978 [ 71 13 1100122 5 ] 0.8452 0.9342
SUGGESTING RELATEDNESS OF:
A> PF04315 ( PF04315 Protein of unknown function, DUF462 )
B> PF08401 ( PF08401 Domain of unknown function (DUF1738) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04315 nor PF08401 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 103 ) 6738557_PF04335_PF04585 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04335 is 6671095 with Jaccard = 1.0000 |PF04335|=70 [ 70 0 1100141 0 ]
parent [ 6671095 ] : 6738557 0.0286987 (=116/(86*47)) 97.6416
given [ 6671095 ] : 6671095 0.15261 (=38/(3*83)) 85.8243
best keyword for cluster 6671095 is PF04335 with Jaccard = 1.0000 [ 70 0 1100141 0 ] 1.0000 1.0000
sibling [ 6671095 ] : 6695417 0.116279 (=20/(43*4)) 91.4119
best keyword for cluster 6695417 is PF04585 with Jaccard = 1.0000 [ 36 0 1100175 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04335 ( PF04335 VirB8 protein )
B> PF04585 ( PF04585 Conjugal transfer protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04335 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 104 ) 6756909_PF04362_PF05683 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04362 is 6339235 with Jaccard = 1.0000 |PF04362|=60 [ 60 0 1100151 0 ]
parent [ 6339235 ] : 6756909 0.0138408 (=208/(68*221)) 99.0862
given [ 6339235 ] : 6339235 1 (=1107/(27*41)) 9.30839e-07
best keyword for cluster 6339235 is PF04362 with Jaccard = 1.0000 [ 60 0 1100151 0 ] 1.0000 1.0000
sibling [ 6339235 ] : 6636984 0.254545 (=56/(1*220)) 76.2277
best keyword for cluster 6636984 is PF05683 with Jaccard = 0.7000 [ 140 60 1100011 0 ] 0.7000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04362 ( PF04362 Bacterial Fe(2+) trafficking )
B> PF05683 ( PF05683 Fumarase C-terminus )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF04362 and PF05683 have PDB structures
PF04362 d.279.1.1
SUPERFAM mapping significantly overlapping:
1 PF04362 SSF111148 0.897 (average over 269 mutual instances, PF04362 269 appearances, SSF111148 269 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 105 ) 6614466_PF01989_PF04412 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04412 is 6266441 with Jaccard = 1.0000 |PF04412|=34 [ 34 0 1100177 0 ]
parent [ 6266441 ] : 6614466 0.319444 (=299/(36*26)) 68.2913
given [ 6266441 ] : 6266441 1 (=320/(16*20)) 4.52681e-12
best keyword for cluster 6266441 is PF04412 with Jaccard = 1.0000 [ 34 0 1100177 0 ] 1.0000 1.0000
sibling [ 6266441 ] : 6511975 0.84 (=21/(1*25)) 16.2514
best keyword for cluster 6511975 is PF01989 with Jaccard = 0.6857 [ 24 0 1100176 11 ] 1.0000 0.6857
SUGGESTING RELATEDNESS OF:
A> PF04412 ( PF04412 Protein of unknown function (DUF521) )
B> PF01989 ( PF01989 Protein of unknown function DUF126 )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01989| = 35 , |PF04412| = 34 , |PF01989^PF04412| = 11 ( 31.4% and 32.4% )
Neither PF04412 nor PF01989 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 106 ) 6753963_PF03231_PF04461 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04461 is 6614442 with Jaccard = 1.0000 |PF04461|=104 [ 104 0 1100107 0 ]
parent [ 6614442 ] : 6753963 0.0159774 (=34/(112*19)) 98.9024
given [ 6614442 ] : 6614442 0.405405 (=45/(1*111)) 68.2712
best keyword for cluster 6614442 is PF04461 with Jaccard = 1.0000 [ 104 0 1100107 0 ] 1.0000 1.0000
sibling [ 6614442 ] : 6724826 0.047619 (=4/(12*7)) 96.0875
best keyword for cluster 6724826 is PF03231 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04461 ( PF04461 Protein of unknown function (DUF520) )
B> PF03231 ( PF03231 Bunyavirus non-structural protein NS-S )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04461 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 107 ) 6701721_PF04472_PF07783 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04472 is 6500238 with Jaccard = 1.0000 |PF04472|=94 [ 94 0 1100117 0 ]
parent [ 6500238 ] : 6701721 0.103061 (=202/(98*20)) 92.4861
given [ 6500238 ] : 6500238 0.88996 (=1108/(83*15)) 11.6781
best keyword for cluster 6500238 is PF04472 with Jaccard = 1.0000 [ 94 0 1100117 0 ] 1.0000 1.0000
sibling [ 6500238 ] : 6659324 0.176471 (=9/(17*3)) 83.2219
best keyword for cluster 6659324 is PF07783 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04472 ( PF04472 Protein of unknown function (DUF552) )
B> PF07783 ( PF07783 Protein of unknown function (DUF1621) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04472 nor PF07783 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 108 ) 6757417_PF04284_PF04474 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04474 is 6593323 with Jaccard = 1.0000 |PF04474|=55 [ 55 0 1100156 0 ]
parent [ 6593323 ] : 6757417 0.0111111 (=36/(60*54)) 99.1166
given [ 6593323 ] : 6593323 0.422414 (=49/(2*58)) 58.5054
best keyword for cluster 6593323 is PF04474 with Jaccard = 1.0000 [ 55 0 1100156 0 ] 1.0000 1.0000
sibling [ 6593323 ] : 6749918 0.0206677 (=13/(37*17)) 98.6137
best keyword for cluster 6749918 is PF04284 with Jaccard = 1.0000 [ 34 0 1100177 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04474 ( PF04474 Protein of unknown function (DUF554) )
B> PF04284 ( PF04284 Protein of unknown function (DUF441) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04474 nor PF04284 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 109 ) 6727414_PF02250_PF04491 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04491 is 6382237 with Jaccard = 1.0000 |PF04491|=6 [ 6 0 1100205 0 ]
parent [ 6382237 ] : 6727414 0.0438596 (=5/(6*19)) 96.4088
given [ 6382237 ] : 6382237 1 (=9/(3*3)) 0.000666756
best keyword for cluster 6382237 is PF04491 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
sibling [ 6382237 ] : 6647837 0.25 (=22/(8*11)) 79.2808
best keyword for cluster 6647837 is PF02250 with Jaccard = 0.9444 [ 17 0 1100193 1 ] 1.0000 0.9444
SUGGESTING RELATEDNESS OF:
A> PF04491 ( PF04491 Poxvirus T4 protein, N terminus )
B> PF02250 ( PF02250 35kD major secreted virus protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04491 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF02250 SSF49889 0.911 (average over 94 mutual instances, PF02250 94 appearances, SSF49889 94 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 110 ) 6624840_PF04497_PF04638 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04497 is 6177740 with Jaccard = 1.0000 |PF04497|=12 [ 12 0 1100199 0 ]
parent [ 6177740 ] : 6624840 0.293706 (=42/(13*11)) 72.4799
given [ 6177740 ] : 6177740 1 (=22/(2*11)) 4.55e-19
best keyword for cluster 6177740 is PF04497 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 6177740 ] : 6479393 0.944444 (=17/(2*9)) 5.55618
best keyword for cluster 6479393 is PF04638 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04497 ( PF04497 Poxvirus E2 protein )
B> PF04638 ( PF04638 Pox virus protein O1 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04497 nor PF04638 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 111 ) 6657100_PF04523_PF05765 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04523 is 5785944 with Jaccard = 1.0000 |PF04523|=7 [ 7 0 1100204 0 ]
parent [ 5785944 ] : 6657100 0.202381 (=17/(7*12)) 82.3439
given [ 5785944 ] : 5785944 1 (=10/(2*5)) 8.09003e-56
best keyword for cluster 5785944 is PF04523 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 5785944 ] : 6286516 1 (=27/(3*9)) 1.48519e-10
best keyword for cluster 6286516 is PF05765 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04523 ( PF04523 Herpes virus tegument protein U30 )
B> PF05765 ( PF05765 Tegument protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04523 nor PF05765 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 112 ) 6714693_PF04529_PF05830 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04529 is 6307902 with Jaccard = 1.0000 |PF04529|=7 [ 7 0 1100204 0 ]
parent [ 6307902 ] : 6714693 0.0714286 (=8/(16*7)) 94.6688
given [ 6307902 ] : 6307902 1 (=10/(2*5)) 5.08e-09
best keyword for cluster 6307902 is PF04529 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 6307902 ] : 6084671 1 (=60/(6*10)) 8.5e-27
best keyword for cluster 6084671 is PF05830 with Jaccard = 1.0000 [ 16 0 1100195 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04529 ( PF04529 Herpesvirus U59 protein )
B> PF05830 ( PF05830 Nodulation protein Z (NodZ) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04529 nor PF05830 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 113 ) 5990058_PF03043_PF04532 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04532 is 5482787 with Jaccard = 1.0000 |PF04532|=7 [ 7 0 1100204 0 ]
parent [ 5482787 ] : 5990058 1 (=98/(7*14)) 4.12245e-35
given [ 5482787 ] : 5482787 1 (=10/(2*5)) 4.50211e-99
best keyword for cluster 5482787 is PF04532 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 5482787 ] : 5652310 1 (=33/(3*11)) 1.21333e-72
best keyword for cluster 5652310 is PF03043 with Jaccard = 0.6087 [ 14 0 1100188 9 ] 1.0000 0.6087
SUGGESTING RELATEDNESS OF:
A> PF04532 ( PF04532 Protein of unknown function (DUF587) )
B> PF03043 ( PF03043 Herpesvirus UL87 family )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03043| = 23 , |PF04532| = 7 , |PF03043^PF04532| = 7 ( 30.4% and 100.0% )
Neither PF04532 nor PF03043 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 114 ) 6741166_PF04528_PF04537 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04537 is 6334332 with Jaccard = 1.0000 |PF04537|=12 [ 12 0 1100199 0 ]
parent [ 6334332 ] : 6741166 0.0484496 (=25/(12*43)) 97.8936
given [ 6334332 ] : 6334332 1 (=27/(3*9)) 4.02894e-07
best keyword for cluster 6334332 is PF04537 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 6334332 ] : 6738690 0.0238095 (=1/(1*42)) 97.6548
best keyword for cluster 6738690 is PF04528 with Jaccard = 1.0000 [ 19 0 1100192 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04537 ( PF04537 Herpesvirus UL55 protein )
B> PF04528 ( PF04528 Adenovirus early E4 34 kDa protein conserved region )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04537 nor PF04528 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 115 ) 6468227_PF04541_PF05900 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04541 is 6025537 with Jaccard = 1.0000 |PF04541|=7 [ 7 0 1100204 0 ]
parent [ 6025537 ] : 6468227 1 (=91/(13*7)) 3.47301
given [ 6025537 ] : 6025537 1 (=10/(2*5)) 6.00022e-32
best keyword for cluster 6025537 is PF04541 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 6025537 ] : 5976267 1 (=30/(3*10)) 2.22451e-36
best keyword for cluster 5976267 is PF05900 with Jaccard = 1.0000 [ 13 0 1100198 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04541 ( PF04541 Herpesvirus virion protein U34 )
B> PF05900 ( PF05900 Gammaherpesvirus BFRF1 protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04541 nor PF05900 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 116 ) 6539036_PF01664_PF04582 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04582 is 6460898 with Jaccard = 1.0000 |PF04582|=19 [ 19 0 1100192 0 ]
parent [ 6460898 ] : 6539036 0.719298 (=123/(9*19)) 31.7089
given [ 6460898 ] : 6460898 0.983333 (=59/(15*4)) 2.41249
best keyword for cluster 6460898 is PF04582 with Jaccard = 1.0000 [ 19 0 1100192 0 ] 1.0000 1.0000
sibling [ 6460898 ] : 6089875 1 (=20/(4*5)) 2.27011e-26
best keyword for cluster 6089875 is PF01664 with Jaccard = 0.7500 [ 9 0 1100199 3 ] 1.0000 0.7500
SUGGESTING RELATEDNESS OF:
A> PF04582 ( PF04582 Reovirus sigma C capsid protein )
B> PF01664 ( PF01664 Reovirus viral attachment protein sigma 1 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF04582 and PF01664 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 117 ) 6732872_PF04637_PF06106 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04637 is 6455903 with Jaccard = 1.0000 |PF04637|=15 [ 15 0 1100196 0 ]
parent [ 6455903 ] : 6732872 0.0333333 (=6/(15*12)) 97.0384
given [ 6455903 ] : 6455903 0.981481 (=53/(6*9)) 1.87769
best keyword for cluster 6455903 is PF04637 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000
sibling [ 6455903 ] : 6703747 0.0857143 (=3/(5*7)) 92.8572
best keyword for cluster 6703747 is PF06106 with Jaccard = 0.8571 [ 6 1 1100204 0 ] 0.8571 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04637 ( PF04637 Herpesvirus phosphoprotein 85 (HHV6-7 U14/HCMV UL25) )
B> PF06106 ( PF06106 Staphylococcus protein of unknown function (DUF950) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04637 nor PF06106 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 118 ) 6687298_PF00071_PF04670 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04670 is 6489581 with Jaccard = 1.0000 |PF04670|=66 [ 66 0 1100145 0 ]
parent [ 6489581 ] : 6687298 0.124608 (=27125/(67*3249)) 89.7127
given [ 6489581 ] : 6489581 0.923214 (=1034/(32*35)) 8.12218
best keyword for cluster 6489581 is PF04670 with Jaccard = 1.0000 [ 66 0 1100145 0 ] 1.0000 1.0000
sibling [ 6489581 ] : 6686756 0.122059 (=2770/(7*3242)) 89.6045
best keyword for cluster 6686756 is PF00071 with Jaccard = 0.6831 [ 2108 930 1097125 48 ] 0.6939 0.9777
SUGGESTING RELATEDNESS OF:
A> PF04670 ( PF04670 Gtr1/RagA G protein conserved region )
B> PF00071 ( PF00071 Ras family )
Only B has a clan ( CL0017.14 ).
the two keywords do not coincide on UniRef90 proteins
only PF04670 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 119 ) 6666441_PF04677_PF05011 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04677 is 6562020 with Jaccard = 1.0000 |PF04677|=61 [ 61 0 1100150 0 ]
parent [ 6562020 ] : 6666441 0.168783 (=505/(44*68)) 84.5916
given [ 6562020 ] : 6562020 0.533566 (=612/(31*37)) 48.5922
best keyword for cluster 6562020 is PF04677 with Jaccard = 1.0000 [ 61 0 1100150 0 ] 1.0000 1.0000
sibling [ 6562020 ] : 6559297 0.604651 (=26/(1*43)) 46.1052
best keyword for cluster 6559297 is PF05011 with Jaccard = 0.6818 [ 30 13 1100167 1 ] 0.6977 0.9677
SUGGESTING RELATEDNESS OF:
A> PF04677 ( PF04677 Protein similar to CwfJ C-terminus 1 )
B> PF05011 ( PF05011 Lariat debranching enzyme, C-terminal domain )
Only A has a clan ( CL0265.2 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF04677 nor PF05011 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF04677 SSF54197 0.709 (average over 38 mutual instances, PF04677 38 appearances, SSF54197 2604 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 120 ) 6744017_PF00314_PF04681 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04681 is 6593425 with Jaccard = 1.0000 |PF04681|=6 [ 6 0 1100205 0 ]
parent [ 6593425 ] : 6744017 0.0242131 (=120/(21*236)) 98.1473
given [ 6593425 ] : 6593425 0.447368 (=17/(2*19)) 58.6105
best keyword for cluster 6593425 is PF04681 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
sibling [ 6593425 ] : 6720066 0.0662393 (=31/(2*234)) 95.4238
best keyword for cluster 6720066 is PF00314 with Jaccard = 0.9447 [ 188 2 1100012 9 ] 0.9895 0.9543
SUGGESTING RELATEDNESS OF:
A> PF04681 ( PF04681 Blastomyces yeast-phase-specific protein )
B> PF00314 ( PF00314 Thaumatin family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04681 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF00314 SSF49870 0.924 (average over 562 mutual instances, PF00314 579 appearances, SSF49870 583 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 121 ) 6672813_PF04691_PF05778 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04691 is 6602369 with Jaccard = 1.0000 |PF04691|=8 [ 8 0 1100203 0 ]
parent [ 6602369 ] : 6672813 0.144444 (=13/(9*10)) 86.2901
given [ 6602369 ] : 6602369 0.380952 (=8/(3*7)) 62.9733
best keyword for cluster 6602369 is PF04691 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 6602369 ] : 6420996 1 (=8/(1*8)) 0.1142
best keyword for cluster 6420996 is PF05778 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04691 ( PF04691 Apolipoprotein C-I (ApoC-1) )
B> PF05778 ( PF05778 Apolipoprotein CIII (Apo-CIII) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04691 has a PDB structure (may not be up to date)
PF04691 j.39.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 122 ) 6661979_PF04533_PF04743 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04743 is 5977189 with Jaccard = 1.0000 |PF04743|=12 [ 12 0 1100199 0 ]
parent [ 5977189 ] : 6661979 0.197917 (=19/(12*8)) 83.6965
given [ 5977189 ] : 5977189 1 (=35/(5*7)) 2.94783e-36
best keyword for cluster 5977189 is PF04743 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 5977189 ] : 6403455 1 (=7/(1*7)) 0.0132857
best keyword for cluster 6403455 is PF04533 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04743 ( PF04743 BSRF1-like protein )
B> PF04533 ( PF04533 Herpes virus U44 protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04743 nor PF04533 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 123 ) 6758924_PF00046_PF04770 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04770 is 6455778 with Jaccard = 1.0000 |PF04770|=38 [ 38 0 1100173 0 ]
parent [ 6455778 ] : 6758924 0.0116433 (=2265/(43*4524)) 99.2069
given [ 6455778 ] : 6455778 0.981481 (=424/(27*16)) 1.85337
best keyword for cluster 6455778 is PF04770 with Jaccard = 1.0000 [ 38 0 1100173 0 ] 1.0000 1.0000
sibling [ 6455778 ] : 6758416 0.00928587 (=42/(1*4523)) 99.1778
best keyword for cluster 6758416 is PF00046 with Jaccard = 0.7971 [ 3284 750 1096091 86 ] 0.8141 0.9745
SUGGESTING RELATEDNESS OF:
A> PF04770 ( PF04770 ZF-HD protein dimerisation region )
B> PF00046 ( PF00046 Homeobox domain )
Only B has a clan ( CL0123.12 ).
the two keywords coincide on UniRef90 proteins: |PF00046| = 3370 , |PF04770| = 38 , |PF00046^PF04770| = 1 ( 0.0% and 2.6% )
only PF04770 has a PDB structure (may not be up to date)
PF00046 a.4.1.1 j.92.1.1
SUPERFAM mapping significantly overlapping:
1 PF00046 SSF46689 0.773 (average over 9143 mutual instances, PF00046 9568 appearances, SSF46689 68153 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 124 ) 6655288_PF02957_PF04861 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04861 is 5994512 with Jaccard = 1.0000 |PF04861|=2 [ 2 0 1100209 0 ]
parent [ 5994512 ] : 6655288 0.240854 (=79/(2*164)) 81.7596
given [ 5994512 ] : 5994512 1 (=1/(1*1)) 1e-34
best keyword for cluster 5994512 is PF04861 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
sibling [ 5994512 ] : 6622080 0.314815 (=102/(2*162)) 71.3512
best keyword for cluster 6622080 is PF02957 with Jaccard = 0.9627 [ 155 0 1100050 6 ] 1.0000 0.9627
SUGGESTING RELATEDNESS OF:
A> PF04861 ( PF04861 Circovirus VP2 protein )
B> PF02957 ( PF02957 TT viral ORF2 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04861 nor PF02957 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 125 ) 6720061_PF01778_PF04874 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04874 is 6434848 with Jaccard = 1.0000 |PF04874|=43 [ 43 0 1100168 0 ]
parent [ 6434848 ] : 6720061 0.0570248 (=138/(44*55)) 95.4213
given [ 6434848 ] : 6434848 0.995614 (=227/(6*38)) 0.438605
best keyword for cluster 6434848 is PF04874 with Jaccard = 1.0000 [ 43 0 1100168 0 ] 1.0000 1.0000
sibling [ 6434848 ] : 6703803 0.0925926 (=5/(1*54)) 92.8704
best keyword for cluster 6703803 is PF01778 with Jaccard = 1.0000 [ 51 0 1100160 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04874 ( PF04874 Mak16 protein )
B> PF01778 ( PF01778 Ribosomal L28e protein family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04874 nor PF01778 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 126 ) 6518288_PF04903_PF07140 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04903 is 6502709 with Jaccard = 1.0000 |PF04903|=6 [ 6 0 1100205 0 ]
parent [ 6502709 ] : 6518288 0.8125 (=39/(6*8)) 19.3681
given [ 6502709 ] : 6502709 0.875 (=14/(4*4)) 12.5
best keyword for cluster 6502709 is PF04903 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
sibling [ 6502709 ] : 5934272 1 (=8/(4*2)) 2.88001e-40
best keyword for cluster 5934272 is PF07140 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04903 ( PF04903 Poxvirus interferon gamma receptor )
B> PF07140 ( PF07140 Interferon gamma receptor alpha chain (IFNGR1) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04903 has a PDB structure (may not be up to date)
PF07140 b.1.2.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 127 ) 6708831_PF04953_PF06857 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04953 is 6361150 with Jaccard = 1.0000 |PF04953|=29 [ 29 0 1100182 0 ]
parent [ 6361150 ] : 6708831 0.0807692 (=63/(30*26)) 93.7691
given [ 6361150 ] : 6361150 1 (=161/(23*7)) 2.89541e-05
best keyword for cluster 6361150 is PF04953 with Jaccard = 1.0000 [ 29 0 1100182 0 ] 1.0000 1.0000
sibling [ 6361150 ] : 6416824 1 (=165/(11*15)) 0.0728859
best keyword for cluster 6416824 is PF06857 with Jaccard = 0.9286 [ 26 0 1100183 2 ] 1.0000 0.9286
SUGGESTING RELATEDNESS OF:
A> PF04953 ( PF04953 Citrate lyase, gamma subunit )
B> PF06857 ( PF06857 Malonate decarboxylase delta subunit (MdcD) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04953 nor PF06857 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 128 ) 6679715_PF04956_PF06921 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04956 is 6644429 with Jaccard = 1.0000 |PF04956|=39 [ 39 0 1100172 0 ]
parent [ 6644429 ] : 6679715 0.143476 (=232/(49*33)) 88.0998
given [ 6644429 ] : 6644429 0.268116 (=37/(3*46)) 78.3097
best keyword for cluster 6644429 is PF04956 with Jaccard = 1.0000 [ 39 0 1100172 0 ] 1.0000 1.0000
sibling [ 6644429 ] : 6645473 0.255556 (=23/(3*30)) 78.6423
best keyword for cluster 6645473 is PF06921 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04956 ( PF04956 TrbC/VIRB2 family )
B> PF06921 ( )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04956 nor PF06921 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 129 ) 6652360_PF04962_PF06845 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04962 is 6292252 with Jaccard = 1.0000 |PF04962|=33 [ 33 0 1100178 0 ]
parent [ 6292252 ] : 6652360 0.229475 (=450/(37*53)) 80.755
given [ 6292252 ] : 6292252 1 (=36/(1*36)) 3.94403e-10
best keyword for cluster 6292252 is PF04962 with Jaccard = 1.0000 [ 33 0 1100178 0 ] 1.0000 1.0000
sibling [ 6292252 ] : 6427211 1 (=240/(5*48)) 0.218321
best keyword for cluster 6427211 is PF06845 with Jaccard = 0.9804 [ 50 0 1100160 1 ] 1.0000 0.9804
SUGGESTING RELATEDNESS OF:
A> PF04962 ( PF04962 5-keto 4-deoxyuronate isomerase )
B> PF06845 ( PF06845 Myo-inositol catabolism protein IolB )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04962 has a PDB structure (may not be up to date)
PF04962 b.82.1.13
SUPERFAM mapping significantly overlapping:
1 PF06845 SSF51182 0.902 (average over 165 mutual instances, PF06845 165 appearances, SSF51182 14255 appearances)
2 PF04962 SSF51182 0.952 (average over 115 mutual instances, PF04962 115 appearances, SSF51182 14255 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 130 ) 6675133_PF04965_PF07115 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04965 is 6664362 with Jaccard = 1.0000 |PF04965|=77 [ 77 0 1100134 0 ]
parent [ 6664362 ] : 6675133 0.167224 (=200/(13*92)) 86.9705
given [ 6664362 ] : 6664362 0.172285 (=46/(3*89)) 84.1489
best keyword for cluster 6664362 is PF04965 with Jaccard = 1.0000 [ 77 0 1100134 0 ] 1.0000 1.0000
sibling [ 6664362 ] : 6612793 0.404762 (=17/(6*7)) 67.6918
best keyword for cluster 6612793 is PF07115 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04965 ( PF04965 Gene 25-like lysozyme )
B> PF07115 ( PF07115 Protein of unknown function (DUF1371) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04965 nor PF07115 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 131 ) 6737909_PF01917_PF04974 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04974 is 6715771 with Jaccard = 1.0000 |PF04974|=15 [ 15 0 1100196 0 ]
parent [ 6715771 ] : 6737909 0.0282258 (=70/(80*31)) 97.578
given [ 6715771 ] : 6715771 0.0672269 (=16/(14*17)) 94.844
best keyword for cluster 6715771 is PF04974 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000
sibling [ 6715771 ] : 6676966 0.135742 (=139/(16*64)) 87.4912
best keyword for cluster 6676966 is PF01917 with Jaccard = 0.8219 [ 60 13 1100138 0 ] 0.8219 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04974 ( PF04974 Archaeal flagellar protein F )
B> PF01917 ( PF01917 Archaebacterial flagellin )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04974 nor PF01917 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 132 ) 6693292_PF00113_PF05034 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05034 is 6615589 with Jaccard = 1.0000 |PF05034|=17 [ 17 0 1100194 0 ]
parent [ 6615589 ] : 6693292 0.113277 (=1046/(18*513)) 90.9547
given [ 6615589 ] : 6615589 0.34375 (=11/(16*2)) 68.6997
best keyword for cluster 6615589 is PF05034 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
sibling [ 6615589 ] : 6629605 0.285039 (=724/(508*5)) 74.3969
best keyword for cluster 6629605 is PF00113 with Jaccard = 0.9665 [ 462 15 1099733 1 ] 0.9686 0.9978
SUGGESTING RELATEDNESS OF:
A> PF05034 ( PF05034 Methylaspartate ammonia-lyase N-terminus )
B> PF00113 ( PF00113 Enolase, C-terminal TIM barrel domain )
A and B come from different clans ( CL0227.3 , CL0256.2 ).
the two keywords do not coincide on UniRef90 proteins
both PF05034 and PF00113 have PDB structures
PF00113 c.1.11.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 133 ) 6672094_PF02948_PF05111 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05111 is 6597213 with Jaccard = 1.0000 |PF05111|=8 [ 8 0 1100203 0 ]
parent [ 6597213 ] : 6672094 0.15625 (=45/(8*36)) 86.0401
given [ 6597213 ] : 6597213 0.428571 (=3/(1*7)) 60.4286
best keyword for cluster 6597213 is PF05111 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 6597213 ] : 6626788 0.294118 (=20/(2*34)) 73.3307
best keyword for cluster 6626788 is PF02948 with Jaccard = 0.9677 [ 30 1 1100180 0 ] 0.9677 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05111 ( PF05111 Ameloblastin precursor (Amelin) )
B> PF02948 ( PF02948 Amelogenin )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05111 nor PF02948 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 134 ) 6748081_PF01970_PF05145 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05145 is 6594580 with Jaccard = 1.0000 |PF05145|=65 [ 65 0 1100146 0 ]
parent [ 6594580 ] : 6748081 0.0188554 (=255/(84*161)) 98.4737
given [ 6594580 ] : 6594580 0.414634 (=68/(2*82)) 59.2019
best keyword for cluster 6594580 is PF05145 with Jaccard = 1.0000 [ 65 0 1100146 0 ] 1.0000 1.0000
sibling [ 6594580 ] : 6654642 0.224522 (=141/(157*4)) 81.5858
best keyword for cluster 6654642 is PF01970 with Jaccard = 0.9923 [ 129 0 1100081 1 ] 1.0000 0.9923
SUGGESTING RELATEDNESS OF:
A> PF05145 ( PF05145 Putative ammonia monooxygenase )
B> PF01970 ( PF01970 Integral membrane protein DUF112 )
Only A has a clan ( CL0142.6 ).
the two keywords coincide on UniRef90 proteins: |PF01970| = 130 , |PF05145| = 65 , |PF01970^PF05145| = 1 ( 0.8% and 1.5% )
Neither PF05145 nor PF01970 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 135 ) 6667157_PF05206_PF05253 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05206 is 6560464 with Jaccard = 1.0000 |PF05206|=34 [ 34 0 1100177 0 ]
parent [ 6560464 ] : 6667157 0.236035 (=300/(41*31)) 84.757
given [ 6560464 ] : 6560464 0.574324 (=85/(4*37)) 47.0756
best keyword for cluster 6560464 is PF05206 with Jaccard = 1.0000 [ 34 0 1100177 0 ] 1.0000 1.0000
sibling [ 6560464 ] : 6588721 0.547619 (=92/(7*24)) 56.9234
best keyword for cluster 6588721 is PF05253 with Jaccard = 0.9600 [ 24 0 1100186 1 ] 1.0000 0.9600
SUGGESTING RELATEDNESS OF:
A> PF05206 ( PF05206 Methyltransferase TRM13 )
B> PF05253 ( PF05253 Uncharacterised protein family (UPF0224) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05206 nor PF05253 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 136 ) 6753721_PF02320_PF05254 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05254 is 6522746 with Jaccard = 1.0000 |PF05254|=24 [ 24 0 1100187 0 ]
parent [ 6522746 ] : 6753721 0.0168 (=21/(25*50)) 98.8861
given [ 6522746 ] : 6522746 0.80303 (=53/(22*3)) 21.6183
best keyword for cluster 6522746 is PF05254 with Jaccard = 1.0000 [ 24 0 1100187 0 ] 1.0000 1.0000
sibling [ 6522746 ] : 6743172 0.0204082 (=1/(1*49)) 98.0694
best keyword for cluster 6743172 is PF02320 with Jaccard = 1.0000 [ 34 0 1100177 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05254 ( PF05254 Uncharacterised protein family (UPF0203) )
B> PF02320 ( PF02320 Ubiquinol-cytochrome C reductase hinge protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF05254 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 137 ) 6651126_PF05271_PF08112 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05271 is 6015658 with Jaccard = 1.0000 |PF05271|=5 [ 5 0 1100206 0 ]
parent [ 6015658 ] : 6651126 0.266667 (=8/(5*6)) 80.3787
given [ 6015658 ] : 6015658 1 (=6/(3*2)) 8.3937e-33
best keyword for cluster 6015658 is PF05271 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000
sibling [ 6015658 ] : 6593440 0.444444 (=4/(3*3)) 58.6222
best keyword for cluster 6593440 is PF08112 with Jaccard = 0.7500 [ 3 1 1100207 0 ] 0.7500 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05271 ( PF05271 Tobravirus 2B protein )
B> PF08112 ( PF08112 ATP synthase epsilon subunit )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05271 nor PF08112 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 138 ) 6749058_PF02713_PF05274 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05274 is 6132517 with Jaccard = 1.0000 |PF05274|=16 [ 16 0 1100195 0 ]
parent [ 6132517 ] : 6749058 0.0208333 (=10/(20*24)) 98.5477
given [ 6132517 ] : 6132517 1 (=84/(6*14)) 9.29527e-23
best keyword for cluster 6132517 is PF05274 with Jaccard = 1.0000 [ 16 0 1100195 0 ] 1.0000 1.0000
sibling [ 6132517 ] : 6706568 0.0875 (=7/(20*4)) 93.3975
best keyword for cluster 6706568 is PF02713 with Jaccard = 1.0000 [ 16 0 1100195 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05274 ( PF05274 Occlusion-derived virus envelope protein E25 )
B> PF02713 ( PF02713 Domain of unknown function DUF220 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05274 nor PF02713 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 139 ) 6722425_PF00096_PF05281 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05281 is 6438894 with Jaccard = 1.0000 |PF05281|=14 [ 14 0 1100197 0 ]
parent [ 6438894 ] : 6722425 0.0447213 (=3349/(14*5349)) 95.7748
given [ 6438894 ] : 6438894 1 (=49/(7*7)) 0.602526
best keyword for cluster 6438894 is PF05281 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
sibling [ 6438894 ] : 6722123 0.0569141 (=6368/(21*5328)) 95.7328
best keyword for cluster 6722123 is PF00096 with Jaccard = 0.8219 [ 4237 269 1095056 649 ] 0.9403 0.8672
SUGGESTING RELATEDNESS OF:
A> PF05281 ( PF05281 Neuroendocrine protein 7B2 precursor (Secretogranin V) )
B> PF00096 ( PF00096 Zinc finger, C2H2 type )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00096| = 4886 , |PF05281| = 14 , |PF00096^PF05281| = 1 ( 0.0% and 7.1% )
only PF05281 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 140 ) 6650171_PF05289_PF06238 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05289 is 6317166 with Jaccard = 1.0000 |PF05289|=5 [ 5 0 1100206 0 ]
parent [ 6317166 ] : 6650171 0.2 (=6/(5*6)) 80.0339
given [ 6317166 ] : 6317166 1 (=4/(1*4)) 2.50018e-08
best keyword for cluster 6317166 is PF05289 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000
sibling [ 6317166 ] : 6548644 0.625 (=5/(2*4)) 37.991
best keyword for cluster 6548644 is PF06238 with Jaccard = 1.0000 [ 4 0 1100207 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05289 ( PF05289 Borrelia hemolysin accessory protein )
B> PF06238 ( PF06238 Borrelia burgdorferi BBR25 lipoprotein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05289 nor PF06238 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 141 ) 6704853_PF00184_PF05294 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05294 is 6516382 with Jaccard = 1.0000 |PF05294|=13 [ 13 0 1100198 0 ]
parent [ 6516382 ] : 6704853 0.119048 (=130/(13*84)) 93.0722
given [ 6516382 ] : 6516382 0.833333 (=10/(1*12)) 18.2556
best keyword for cluster 6516382 is PF05294 with Jaccard = 1.0000 [ 13 0 1100198 0 ] 1.0000 1.0000
sibling [ 6516382 ] : 6691264 0.127458 (=188/(59*25)) 90.5119
best keyword for cluster 6691264 is PF00184 with Jaccard = 0.7910 [ 53 14 1100144 0 ] 0.7910 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05294 ( PF05294 Scorpion short toxin )
B> PF00184 ( PF00184 Neurohypophysial hormones, C-terminal Domain )
Only A has a clan ( CL0054.8 ).
the two keywords do not coincide on UniRef90 proteins
both PF05294 and PF00184 have PDB structures
PF00184 b.9.1.1
SUPERFAM mapping significantly overlapping:
1 PF00184 SSF49606 0.854 (average over 98 mutual instances, PF00184 98 appearances, SSF49606 180 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 142 ) 6540203_PF05307_PF05946 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05307 is 6035886 with Jaccard = 1.0000 |PF05307|=3 [ 3 0 1100208 0 ]
parent [ 6035886 ] : 6540203 0.711111 (=32/(3*15)) 32.7208
given [ 6035886 ] : 6035886 1 (=2/(1*2)) 5.01e-31
best keyword for cluster 6035886 is PF05307 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
sibling [ 6035886 ] : 6186738 1 (=36/(12*3)) 2.42136e-18
best keyword for cluster 6186738 is PF05946 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05307 ( PF05307 Bundlin )
B> PF05946 ( PF05946 Toxin-coregulated pilus subunit TcpA )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF05307 and PF05946 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 143 ) 6635708_PF05310_PF07375 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05310 is 6340435 with Jaccard = 1.0000 |PF05310|=6 [ 6 0 1100205 0 ]
parent [ 6340435 ] : 6635708 0.3 (=9/(6*5)) 75.9464
given [ 6340435 ] : 6340435 1 (=5/(1*5)) 1.012e-06
best keyword for cluster 6340435 is PF05310 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
sibling [ 6340435 ] : 6345958 1 (=4/(1*4)) 2.62735e-06
best keyword for cluster 6345958 is PF07375 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05310 ( PF05310 Tenuivirus NS-3 Protein )
B> PF07375 ( PF07375 Tenuivirus PV2 protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05310 nor PF07375 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 144 ) 6689042_PF05332_PF05752 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05332 is 6450143 with Jaccard = 1.0000 |PF05332|=7 [ 7 0 1100204 0 ]
parent [ 6450143 ] : 6689042 0.131579 (=20/(8*19)) 90.0725
given [ 6450143 ] : 6450143 1 (=12/(2*6)) 1.32342
best keyword for cluster 6450143 is PF05332 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 6450143 ] : 6467496 0.966667 (=58/(15*4)) 3.34094
best keyword for cluster 6467496 is PF05752 with Jaccard = 1.0000 [ 16 0 1100195 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05332 ( PF05332 Protein of unknown function (DUF743) )
B> PF05752 ( PF05752 Calicivirus minor structural protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05332 nor PF05752 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 145 ) 6759213_PF05336_PF06271 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05336 is 6667257 with Jaccard = 1.0000 |PF05336|=53 [ 53 0 1100158 0 ]
parent [ 6667257 ] : 6759213 0.00814672 (=195/(68*352)) 99.2219
given [ 6667257 ] : 6667257 0.159091 (=21/(2*66)) 84.7903
best keyword for cluster 6667257 is PF05336 with Jaccard = 1.0000 [ 53 0 1100158 0 ] 1.0000 1.0000
sibling [ 6667257 ] : 6752146 0.016197 (=50/(343*9)) 98.7779
best keyword for cluster 6752146 is PF06271 with Jaccard = 0.9350 [ 302 0 1099888 21 ] 1.0000 0.9350
SUGGESTING RELATEDNESS OF:
A> PF05336 ( PF05336 Protein of unknown function (DUF718) )
B> PF06271 ( PF06271 RDD family )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF05336| = 53 , |PF06271| = 323 , |PF05336^PF06271| = 1 ( 1.9% and 0.3% )
only PF05336 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 146 ) 6743231_PF05394_PF07420 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05394 is 6650416 with Jaccard = 1.0000 |PF05394|=6 [ 6 0 1100205 0 ]
parent [ 6650416 ] : 6743231 0.0246914 (=2/(9*9)) 98.0741
given [ 6650416 ] : 6650416 0.2 (=4/(5*4)) 80.1801
best keyword for cluster 6650416 is PF05394 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
sibling [ 6650416 ] : 6724602 0.0555556 (=1/(3*6)) 96.0556
best keyword for cluster 6724602 is PF07420 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05394 ( PF05394 Avirulence protein )
B> PF07420 ( PF07420 Protein of unknown function (DUF1509) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF05394 has a PDB structure (may not be up to date)
PF05394 e.45.1.1
SUPERFAM mapping significantly overlapping:
1 PF05394 SSF103383 0.859 (average over 16 mutual instances, PF05394 16 appearances, SSF103383 16 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 147 ) 6726011_PF05395_PF05781 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05395 is 6518685 with Jaccard = 1.0000 |PF05395|=15 [ 15 0 1100196 0 ]
parent [ 6518685 ] : 6726011 0.0431373 (=11/(17*15)) 96.2356
given [ 6518685 ] : 6518685 0.826923 (=43/(4*13)) 19.7202
best keyword for cluster 6518685 is PF05395 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000
sibling [ 6518685 ] : 6693097 0.111111 (=4/(3*12)) 90.8972
best keyword for cluster 6693097 is PF05781 with Jaccard = 0.9000 [ 9 1 1100201 0 ] 0.9000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05395 ( PF05395 Protein phosphatase inhibitor 1/DARPP-32 )
B> PF05781 ( PF05781 MRVI1 protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05395 nor PF05781 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 148 ) 6617884_PF05412_PF06460 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05412 is 6392080 with Jaccard = 1.0000 |PF05412|=12 [ 12 0 1100199 0 ]
parent [ 6392080 ] : 6617884 0.320513 (=100/(12*26)) 69.5986
given [ 6392080 ] : 6392080 1 (=27/(3*9)) 0.0028837
best keyword for cluster 6392080 is PF05412 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 6392080 ] : 6575475 0.48 (=12/(1*25)) 52.36
best keyword for cluster 6575475 is PF06460 with Jaccard = 0.8824 [ 15 2 1100194 0 ] 0.8824 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05412 ( PF05412 Equine arterivirus Nsp2-type cysteine proteinase )
B> PF06460 ( PF06460 Coronavirus NSP13 )
A and B come from different clans ( CL0125.9 , CL0102.14 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF05412 nor PF06460 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 149 ) 6744887_PF04736_PF05434 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05434 is 6703363 with Jaccard = 1.0000 |PF05434|=20 [ 20 0 1100191 0 ]
parent [ 6703363 ] : 6744887 0.0237154 (=6/(23*11)) 98.2253
given [ 6703363 ] : 6703363 0.0789474 (=6/(19*4)) 92.7895
best keyword for cluster 6703363 is PF05434 with Jaccard = 1.0000 [ 20 0 1100191 0 ] 1.0000 1.0000
sibling [ 6703363 ] : 6722383 0.0666667 (=2/(6*5)) 95.7667
best keyword for cluster 6722383 is PF04736 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05434 ( PF05434 TMEM9 )
B> PF04736 ( PF04736 Eclosion hormone )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05434 nor PF04736 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 150 ) 6681380_PF03045_PF05463 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05463 is 6049508 with Jaccard = 1.0000 |PF05463|=6 [ 6 0 1100205 0 ]
parent [ 6049508 ] : 6681380 0.149254 (=70/(7*67)) 88.5322
given [ 6049508 ] : 6049508 1 (=10/(5*2)) 8.10209e-30
best keyword for cluster 6049508 is PF05463 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
sibling [ 6049508 ] : 6644402 0.241667 (=261/(27*40)) 78.2854
best keyword for cluster 6644402 is PF03045 with Jaccard = 0.8293 [ 34 6 1100170 1 ] 0.8500 0.9714
SUGGESTING RELATEDNESS OF:
A> PF05463 ( PF05463 Sclerostin (SOST) )
B> PF03045 ( PF03045 DAN domain )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05463 nor PF03045 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 151 ) 6776705_PF05305_PF05480 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05480 is 6556842 with Jaccard = 1.0000 |PF05480|=14 [ 14 0 1100197 0 ]
parent [ 6556842 ] : 6776705 0.00113636 (=1/(16*55)) 99.8948
given [ 6556842 ] : 6556842 0.563636 (=31/(5*11)) 44.1308
best keyword for cluster 6556842 is PF05480 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
sibling [ 6556842 ] : 6709645 0.084 (=21/(5*50)) 93.8837
best keyword for cluster 6709645 is PF05305 with Jaccard = 0.9556 [ 43 0 1100166 2 ] 1.0000 0.9556
SUGGESTING RELATEDNESS OF:
A> PF05480 ( PF05480 Staphylococcus haemolytic protein )
B> PF05305 ( PF05305 Protein of unknown function (DUF732) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05480 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 152 ) 6719944_PF03607_PF05517 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05517 is 6684704 with Jaccard = 1.0000 |PF05517|=32 [ 32 0 1100179 0 ]
parent [ 6684704 ] : 6719944 0.0510204 (=90/(36*49)) 95.4035
given [ 6684704 ] : 6684704 0.121212 (=12/(3*33)) 89.2238
best keyword for cluster 6684704 is PF05517 with Jaccard = 1.0000 [ 32 0 1100179 0 ] 1.0000 1.0000
sibling [ 6684704 ] : 6703704 0.0897436 (=35/(39*10)) 92.8475
best keyword for cluster 6703704 is PF03607 with Jaccard = 0.7193 [ 41 0 1100154 16 ] 1.0000 0.7193
SUGGESTING RELATEDNESS OF:
A> PF05517 ( PF05517 p25-alpha )
B> PF03607 ( PF03607 Doublecortin )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF05517 and PF03607 have PDB structures
PF03607 d.15.11.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 153 ) 6682824_PF05535_PF08138 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05535 is 6413746 with Jaccard = 1.0000 |PF05535|=10 [ 10 0 1100201 0 ]
parent [ 6413746 ] : 6682824 0.194805 (=15/(11*7)) 88.8987
given [ 6413746 ] : 6413746 1 (=10/(1*10)) 0.0505
best keyword for cluster 6413746 is PF05535 with Jaccard = 1.0000 [ 10 0 1100201 0 ] 1.0000 1.0000
sibling [ 6413746 ] : 6433919 1 (=6/(1*6)) 0.402167
best keyword for cluster 6433919 is PF08138 with Jaccard = 0.7000 [ 7 0 1100201 3 ] 1.0000 0.7000
SUGGESTING RELATEDNESS OF:
A> PF05535 ( PF05535 Chromadorea ALT protein )
B> PF08138 ( PF08138 Sex peptide (SP) family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05535 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 154 ) 6710840_PF02567_PF05544 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05544 is 6538449 with Jaccard = 1.0000 |PF05544|=68 [ 68 0 1100143 0 ]
parent [ 6538449 ] : 6710840 0.0746374 (=1477/(77*257)) 94.0709
given [ 6538449 ] : 6538449 0.753333 (=113/(75*2)) 31.0755
best keyword for cluster 6538449 is PF05544 with Jaccard = 1.0000 [ 68 0 1100143 0 ] 1.0000 1.0000
sibling [ 6538449 ] : 6641224 0.249012 (=252/(4*253)) 77.3737
best keyword for cluster 6641224 is PF02567 with Jaccard = 0.9913 [ 229 0 1099980 2 ] 1.0000 0.9913
SUGGESTING RELATEDNESS OF:
A> PF05544 ( PF05544 Proline racemase )
B> PF02567 ( PF02567 Phenazine biosynthesis-like protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF05544 and PF02567 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 155 ) 6708421_PF05571_PF06775 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05571 is 6150558 with Jaccard = 1.0000 |PF05571|=9 [ 9 0 1100202 0 ]
parent [ 6150558 ] : 6708421 0.0634921 (=20/(9*35)) 93.7051
given [ 6150558 ] : 6150558 1 (=14/(2*7)) 2.85714e-21
best keyword for cluster 6150558 is PF05571 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000
sibling [ 6150558 ] : 6641201 0.227273 (=15/(2*33)) 77.3516
best keyword for cluster 6641201 is PF06775 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05571 ( PF05571 Protein of unknown function (DUF766) )
B> PF06775 ( PF06775 Putative adipose-regulatory protein (Seipin) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05571 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 156 ) 6566290_PF05576_PF05577 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05576 is 5974417 with Jaccard = 1.0000 |PF05576|=8 [ 8 0 1100203 0 ]
parent [ 5974417 ] : 6566290 0.552174 (=508/(8*115)) 50.2355
given [ 5974417 ] : 5974417 1 (=15/(3*5)) 1.536e-36
best keyword for cluster 5974417 is PF05576 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 5974417 ] : 6554294 0.59292 (=134/(2*113)) 42.1368
best keyword for cluster 6554294 is PF05577 with Jaccard = 0.9823 [ 111 0 1100098 2 ] 1.0000 0.9823
SUGGESTING RELATEDNESS OF:
A> PF05576 ( PF05576 PS-10 peptidase S37 )
B> PF05577 ( PF05577 Serine carboxypeptidase S28 )
they come from the same clan: CL0028.14 : PF05728 PF00975 PF07519 PF06850 PF07819 PF00326 PF05576 PF05577 PF02129 PF00450 PF02089 PF03403 PF03096 PF01764 PF01674 PF00151 PF03583 PF02450 PF03959 PF00756 PF06028 PF05990 PF05677 PF05057 PF04301 PF08538 PF07176 PF06821 PF06500 PF06342 PF06259 PF01738 PF01083 PF00135 PF07224 PF08840 PF05448 PF02273 PF08386 PF07859 PF02230 PF00561 PF06057
the two keywords do not coincide on UniRef90 proteins
Neither PF05576 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 157 ) 6731085_PF05630_PF06101 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05630 is 6526015 with Jaccard = 1.0000 |PF05630|=45 [ 45 0 1100166 0 ]
parent [ 6526015 ] : 6731085 0.0451945 (=79/(46*38)) 96.842
given [ 6526015 ] : 6526015 0.8 (=36/(1*45)) 23.6626
best keyword for cluster 6526015 is PF05630 with Jaccard = 1.0000 [ 45 0 1100166 0 ] 1.0000 1.0000
sibling [ 6526015 ] : 6715143 0.0540541 (=2/(1*37)) 94.7459
best keyword for cluster 6715143 is PF06101 with Jaccard = 0.8182 [ 9 0 1100200 2 ] 1.0000 0.8182
SUGGESTING RELATEDNESS OF:
A> PF05630 ( PF05630 Necrosis inducing protein (NPP1) )
B> PF06101 ( PF06101 Plant protein of unknown function (DUF946) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05630 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 158 ) 6706019_PF01253_PF05634 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05634 is 5960817 with Jaccard = 1.0000 |PF05634|=8 [ 8 0 1100203 0 ]
parent [ 5960817 ] : 6706019 0.0712366 (=106/(8*186)) 93.3152
given [ 5960817 ] : 5960817 1 (=12/(2*6)) 8.75e-38
best keyword for cluster 5960817 is PF05634 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 5960817 ] : 6669782 0.151351 (=28/(1*185)) 85.4912
best keyword for cluster 6669782 is PF01253 with Jaccard = 0.8255 [ 175 0 1099999 37 ] 1.0000 0.8255
SUGGESTING RELATEDNESS OF:
A> PF05634 ( PF05634 Arabidopsis thaliana protein of unknown function (DUF794) )
B> PF01253 ( PF01253 Translation initiation factor SUI1 )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01253| = 212 , |PF05634| = 8 , |PF01253^PF05634| = 1 ( 0.5% and 12.5% )
only PF05634 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF01253 SSF55159 0.759 (average over 596 mutual instances, PF01253 693 appearances, SSF55159 617 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 159 ) 6723742_PF00004_PF05673 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05673 is 6372123 with Jaccard = 1.0000 |PF05673|=91 [ 91 0 1100120 0 ]
parent [ 6372123 ] : 6723742 0.0576556 (=44628/(103*7515)) 95.9603
given [ 6372123 ] : 6372123 1 (=102/(1*102)) 0.000150049
best keyword for cluster 6372123 is PF05673 with Jaccard = 1.0000 [ 91 0 1100120 0 ] 1.0000 1.0000
sibling [ 6372123 ] : 6721874 0.0629668 (=73193/(158*7357)) 95.6861
best keyword for cluster 6721874 is PF00004 with Jaccard = 0.6307 [ 4005 2206 1093861 139 ] 0.6448 0.9665
SUGGESTING RELATEDNESS OF:
A> PF05673 ( PF05673 Protein of unknown function (DUF815) )
B> PF00004 ( PF00004 ATPase family associated with various cellular activities (AAA) )
Only B has a clan ( CL0023.26 ).
the two keywords coincide on UniRef90 proteins: |PF00004| = 4144 , |PF05673| = 91 , |PF00004^PF05673| = 9 ( 0.2% and 9.9% )
only PF05673 has a PDB structure (may not be up to date)
PF00004 c.37.1.1 c.37.1.20
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 160 ) 6605432_PF05733_PF06606 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05733 is 6285061 with Jaccard = 1.0000 |PF05733|=7 [ 7 0 1100204 0 ]
parent [ 6285061 ] : 6605432 0.380952 (=24/(7*9)) 64.286
given [ 6285061 ] : 6285061 1 (=6/(1*6)) 1.02505e-10
best keyword for cluster 6285061 is PF05733 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 6285061 ] : 6518751 0.875 (=7/(1*8)) 19.7705
best keyword for cluster 6518751 is PF06606 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05733 ( PF05733 Tenuivirus nucleocapsid protein )
B> PF06606 ( PF06606 Phlebovirus nucleocapsid (N) protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05733 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 161 ) 6736942_PF01345_PF05753 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05753 is 6632333 with Jaccard = 1.0000 |PF05753|=21 [ 21 0 1100190 0 ]
parent [ 6632333 ] : 6736942 0.0327277 (=263/(28*287)) 97.4789
given [ 6632333 ] : 6632333 0.29932 (=44/(21*7)) 75.2152
best keyword for cluster 6632333 is PF05753 with Jaccard = 1.0000 [ 21 0 1100190 0 ] 1.0000 1.0000
sibling [ 6632333 ] : 6712902 0.0714431 (=1409/(114*173)) 94.3948
best keyword for cluster 6712902 is PF01345 with Jaccard = 0.6698 [ 71 11 1100105 24 ] 0.8659 0.7474
SUGGESTING RELATEDNESS OF:
A> PF05753 ( PF05753 Translocon-associated protein beta (TRAPB) )
B> PF01345 ( PF01345 Domain of unknown function DUF11 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05753 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 162 ) 6745007_PF01016_PF05775 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05775 is 6485662 with Jaccard = 1.0000 |PF05775|=9 [ 9 0 1100202 0 ]
parent [ 6485662 ] : 6745007 0.0332551 (=85/(9*284)) 98.2352
given [ 6485662 ] : 6485662 0.944444 (=17/(6*3)) 7.0526
best keyword for cluster 6485662 is PF05775 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000
sibling [ 6485662 ] : 6734529 0.0363636 (=90/(275*9)) 97.2276
best keyword for cluster 6734529 is PF01016 with Jaccard = 1.0000 [ 250 0 1099961 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05775 ( PF05775 Enterobacteria AfaD invasin protein )
B> PF01016 ( PF01016 Ribosomal L27 protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF05775 and PF01016 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF01016 SSF110324 0.753 (average over 898 mutual instances, PF01016 898 appearances, SSF110324 904 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 163 ) 6552098_PF02723_PF05780 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05780 is 6502172 with Jaccard = 1.0000 |PF05780|=6 [ 6 0 1100205 0 ]
parent [ 6502172 ] : 6552098 0.75 (=18/(4*6)) 40.4417
given [ 6502172 ] : 6502172 0.888889 (=8/(3*3)) 12.1179
best keyword for cluster 6502172 is PF05780 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
sibling [ 6502172 ] : 6429978 1 (=3/(1*3)) 0.281333
best keyword for cluster 6429978 is PF02723 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05780 ( PF05780 Coronavirus nonstructural protein 4 )
B> PF02723 ( PF02723 Non-structural protein NS3/Small envelope protein E )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05780 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 164 ) 6680950_PF05796_PF07341 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05796 is 6352508 with Jaccard = 1.0000 |PF05796|=12 [ 12 0 1100199 0 ]
parent [ 6352508 ] : 6680950 0.145833 (=7/(12*4)) 88.4254
given [ 6352508 ] : 6352508 1 (=27/(3*9)) 7.40753e-06
best keyword for cluster 6352508 is PF05796 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 6352508 ] : 6557024 0.666667 (=2/(1*3)) 44.3333
best keyword for cluster 6557024 is PF07341 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05796 ( PF05796 Chordopoxvirus protein G2 )
B> PF07341 ( PF07341 Protein of unknown function (DUF1473) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05796 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 165 ) 6729369_PF05799_PF06212 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05799 is 6512394 with Jaccard = 1.0000 |PF05799|=8 [ 8 0 1100203 0 ]
parent [ 6512394 ] : 6729369 0.0603448 (=14/(8*29)) 96.6599
given [ 6512394 ] : 6512394 0.857143 (=6/(1*7)) 16.6013
best keyword for cluster 6512394 is PF05799 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 6512394 ] : 6636798 0.269231 (=21/(3*26)) 76.1737
best keyword for cluster 6636798 is PF06212 with Jaccard = 1.0000 [ 25 0 1100186 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05799 ( PF05799 Cytochrome c oxidase subunit Vc (COX5C) )
B> PF06212 ( PF06212 GRIM-19 protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05799 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 166 ) 6647322_PF05851_PF07401 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05851 is 5997824 with Jaccard = 1.0000 |PF05851|=6 [ 6 0 1100205 0 ]
parent [ 5997824 ] : 6647322 0.266667 (=24/(6*15)) 79.1272
given [ 5997824 ] : 5997824 1 (=8/(4*2)) 2.0005e-34
best keyword for cluster 5997824 is PF05851 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
sibling [ 5997824 ] : 6545520 0.692308 (=18/(2*13)) 35.6278
best keyword for cluster 6545520 is PF07401 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05851 ( PF05851 Lentivirus virion infectivity factor (VIF) )
B> PF07401 ( PF07401 Bovine Lentivirus VIF protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05851 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 167 ) 6749957_PF05873_PF08181 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05873 is 6713494 with Jaccard = 1.0000 |PF05873|=20 [ 20 0 1100191 0 ]
parent [ 6713494 ] : 6749957 0.0165094 (=14/(53*16)) 98.6159
given [ 6713494 ] : 6713494 0.0602837 (=17/(6*47)) 94.4918
best keyword for cluster 6713494 is PF05873 with Jaccard = 1.0000 [ 20 0 1100191 0 ] 1.0000 1.0000
sibling [ 6713494 ] : 6729805 0.0333333 (=2/(6*10)) 96.7035
best keyword for cluster 6729805 is PF08181 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05873 ( PF05873 ATP synthase D chain, mitochondrial (ATP5H) )
B> PF08181 ( PF08181 DegQ (SacQ) family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05873 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 168 ) 6509819_PF05722_PF05920 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05920 is 6349015 with Jaccard = 1.0000 |PF05920|=13 [ 13 0 1100198 0 ]
parent [ 6349015 ] : 6509819 0.912593 (=616/(25*27)) 15.2979
given [ 6349015 ] : 6349015 1 (=84/(4*21)) 4.13007e-06
best keyword for cluster 6349015 is PF05920 with Jaccard = 1.0000 [ 13 0 1100198 0 ] 1.0000 1.0000
sibling [ 6349015 ] : 6426982 1 (=92/(4*23)) 0.21135
best keyword for cluster 6426982 is PF05722 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05920 ( PF05920 Coprinus cinereus mating-type protein )
B> PF05722 ( PF05722 Ustilago B locus mating-type protein )
Only A has a clan ( CL0123.12 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF05920 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 169 ) 6757902_PF05947_PF06996 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05947 is 6479870 with Jaccard = 1.0000 |PF05947|=94 [ 94 0 1100117 0 ]
parent [ 6479870 ] : 6757902 0.00860086 (=86/(101*99)) 99.149
given [ 6479870 ] : 6479870 0.945578 (=278/(3*98)) 5.65956
best keyword for cluster 6479870 is PF05947 with Jaccard = 1.0000 [ 94 0 1100117 0 ] 1.0000 1.0000
sibling [ 6479870 ] : 6756624 0.0102041 (=1/(1*98)) 99.0684
best keyword for cluster 6756624 is PF06996 with Jaccard = 0.9878 [ 81 0 1100129 1 ] 1.0000 0.9878
SUGGESTING RELATEDNESS OF:
A> PF05947 ( PF05947 Bacterial protein of unknown function (DUF879) )
B> PF06996 ( PF06996 Protein of unknown function (DUF1305) )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF05947| = 94 , |PF06996| = 82 , |PF05947^PF06996| = 1 ( 1.1% and 1.2% )
Neither PF05947 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 170 ) 6672236_PF00600_PF05993 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05993 is 6182456 with Jaccard = 1.0000 |PF05993|=7 [ 7 0 1100204 0 ]
parent [ 6182456 ] : 6672236 0.236607 (=53/(7*32)) 86.1161
given [ 6182456 ] : 6182456 1 (=12/(4*3)) 1.00034e-18
best keyword for cluster 6182456 is PF05993 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 6182456 ] : 6603505 0.677419 (=21/(1*31)) 63.4839
best keyword for cluster 6603505 is PF00600 with Jaccard = 0.6829 [ 28 0 1100170 13 ] 1.0000 0.6829
SUGGESTING RELATEDNESS OF:
A> PF05993 ( PF05993 Reovirus major virion structural protein Mu-1/Mu-1C (M2) )
B> PF00600 ( PF00600 Influenza non-structural protein (NS1) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF05993 and PF00600 have PDB structures
PF00600 a.16.1.1
SUPERFAM mapping significantly overlapping:
1 PF05993 SSF69908 0.988 (average over 30 mutual instances, PF05993 30 appearances, SSF69908 30 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 171 ) 6597138_PF05994_PF07159 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05994 is 6577545 with Jaccard = 1.0000 |PF05994|=18 [ 18 0 1100193 0 ]
parent [ 6577545 ] : 6597138 0.45614 (=182/(19*21)) 60.3451
given [ 6577545 ] : 6577545 0.5 (=10/(1*20)) 52.9755
best keyword for cluster 6577545 is PF05994 with Jaccard = 1.0000 [ 18 0 1100193 0 ] 1.0000 1.0000
sibling [ 6577545 ] : 6340607 1 (=48/(3*16)) 1.04167e-06
best keyword for cluster 6340607 is PF07159 with Jaccard = 0.9500 [ 19 0 1100191 1 ] 1.0000 0.9500
SUGGESTING RELATEDNESS OF:
A> PF05994 ( PF05994 Cytoplasmic Fragile-X interacting family )
B> PF07159 ( PF07159 Protein of unknown function (DUF1394) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05994 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 172 ) 6749622_PF05996_PF06405 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05996 is 6486543 with Jaccard = 1.0000 |PF05996|=52 [ 52 0 1100159 0 ]
parent [ 6486543 ] : 6749622 0.0196078 (=18/(54*17)) 98.5927
given [ 6486543 ] : 6486543 0.935185 (=606/(18*36)) 7.28387
best keyword for cluster 6486543 is PF05996 with Jaccard = 1.0000 [ 52 0 1100159 0 ] 1.0000 1.0000
sibling [ 6486543 ] : 6728289 0.0416667 (=3/(9*8)) 96.5167
best keyword for cluster 6728289 is PF06405 with Jaccard = 0.8750 [ 7 1 1100203 0 ] 0.8750 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05996 ( PF05996 Ferredoxin-dependent bilin reductase )
B> PF06405 ( PF06405 Red chlorophyll catabolite reductase (RCC reductase) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF05996 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 173 ) 6747907_PF06075_PF08528 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06075 is 6639802 with Jaccard = 1.0000 |PF06075|=14 [ 14 0 1100197 0 ]
parent [ 6639802 ] : 6747907 0.0208333 (=4/(16*12)) 98.4599
given [ 6639802 ] : 6639802 0.230769 (=9/(13*3)) 76.9887
best keyword for cluster 6639802 is PF06075 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
sibling [ 6639802 ] : 6703520 0.0909091 (=1/(1*11)) 92.8182
best keyword for cluster 6703520 is PF08528 with Jaccard = 0.8182 [ 9 0 1100200 2 ] 1.0000 0.8182
SUGGESTING RELATEDNESS OF:
A> PF06075 ( PF06075 Plant protein of unknown function (DUF936) )
B> PF08528 ( PF08528 Whi5 like )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06075 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 174 ) 6774560_PF03257_PF06099 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06099 is 6604123 with Jaccard = 1.0000 |PF06099|=19 [ 19 0 1100192 0 ]
parent [ 6604123 ] : 6774560 0.00241109 (=4/(21*79)) 99.8451
given [ 6604123 ] : 6604123 0.5 (=10/(1*20)) 63.5127
best keyword for cluster 6604123 is PF06099 with Jaccard = 1.0000 [ 19 0 1100192 0 ] 1.0000 1.0000
sibling [ 6604123 ] : 6756664 0.0102041 (=15/(30*49)) 99.0709
best keyword for cluster 6756664 is PF03257 with Jaccard = 0.6087 [ 14 9 1100188 0 ] 0.6087 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06099 ( PF06099 Phenol hydroxylase subunit )
B> PF03257 ( PF03257 Mycoplasma adhesin P1 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06099 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 175 ) 6733367_PF00015_PF06103 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06103 is 6726004 with Jaccard = 1.0000 |PF06103|=33 [ 33 0 1100178 0 ]
parent [ 6726004 ] : 6733367 0.042611 (=12364/(72*4030)) 97.0938
given [ 6726004 ] : 6726004 0.0471154 (=49/(20*52)) 96.2351
best keyword for cluster 6726004 is PF06103 with Jaccard = 1.0000 [ 33 0 1100178 0 ] 1.0000 1.0000
sibling [ 6726004 ] : 6732672 0.0379158 (=1980/(13*4017)) 97.0101
best keyword for cluster 6732672 is PF00015 with Jaccard = 0.8611 [ 2735 412 1097035 29 ] 0.8691 0.9895
SUGGESTING RELATEDNESS OF:
A> PF06103 ( PF06103 Bacterial protein of unknown function (DUF948) )
B> PF00015 ( PF00015 Methyl-accepting chemotaxis protein (MCP) signaling domain )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF06103 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF06103 SSF47954 0.819 (average over 1 mutual instances, PF06103 1 appearances, SSF47954 3885 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 176 ) 6612532_PF06147_PF06914 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06147 is 6557065 with Jaccard = 1.0000 |PF06147|=14 [ 14 0 1100197 0 ]
parent [ 6557065 ] : 6612532 0.344444 (=62/(12*15)) 67.5446
given [ 6557065 ] : 6557065 0.571429 (=8/(1*14)) 44.3738
best keyword for cluster 6557065 is PF06147 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
sibling [ 6557065 ] : 6499254 0.888889 (=24/(9*3)) 11.1112
best keyword for cluster 6499254 is PF06914 with Jaccard = 1.0000 [ 10 0 1100201 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06147 ( PF06147 Protein of unknown function (DUF968) )
B> PF06914 ( PF06914 Protein of unknown function (DUF1277) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06147 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 177 ) 6732946_PF00543_PF06153 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06153 is 6462482 with Jaccard = 1.0000 |PF06153|=31 [ 31 0 1100180 0 ]
parent [ 6462482 ] : 6732946 0.0434851 (=550/(34*372)) 97.0461
given [ 6462482 ] : 6462482 0.975 (=117/(4*30)) 2.61774
best keyword for cluster 6462482 is PF06153 with Jaccard = 1.0000 [ 31 0 1100180 0 ] 1.0000 1.0000
sibling [ 6462482 ] : 6679748 0.14462 (=1301/(26*346)) 88.1122
best keyword for cluster 6679748 is PF00543 with Jaccard = 0.9724 [ 317 0 1099885 9 ] 1.0000 0.9724
SUGGESTING RELATEDNESS OF:
A> PF06153 ( PF06153 Protein of unknown function (DUF970) )
B> PF00543 ( PF00543 Nitrogen regulatory protein P-II )
they come from the same clan: CL0089.8 : PF08029 PF06153 PF02641 PF03091 PF00543
the two keywords coincide on UniRef90 proteins: |PF00543| = 326 , |PF06153| = 31 , |PF00543^PF06153| = 1 ( 0.3% and 3.2% )
only PF06153 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF00543 SSF54913 0.911 (average over 1190 mutual instances, PF00543 1203 appearances, SSF54913 2763 appearances)
2 PF06153 SSF54913 0.994 (average over 106 mutual instances, PF06153 106 appearances, SSF54913 2763 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 178 ) 6740956_PF06157_PF06195 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06157 is 6704726 with Jaccard = 1.0000 |PF06157|=23 [ 23 0 1100188 0 ]
parent [ 6704726 ] : 6740956 0.0324074 (=28/(36*24)) 97.8744
given [ 6704726 ] : 6704726 0.0774194 (=12/(31*5)) 93.0321
best keyword for cluster 6704726 is PF06157 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000
sibling [ 6704726 ] : 6718524 0.0555556 (=6/(6*18)) 95.2176
best keyword for cluster 6718524 is PF06195 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06157 ( PF06157 Protein of unknown function (DUF973) )
B> PF06195 ( PF06195 Protein of unknown function (DUF996) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06157 nor PF06195 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 179 ) 6662790_PF01470_PF06162 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06162 is 6347970 with Jaccard = 1.0000 |PF06162|=11 [ 11 0 1100200 0 ]
parent [ 6347970 ] : 6662790 0.176033 (=213/(11*110)) 83.8457
given [ 6347970 ] : 6347970 1 (=28/(4*7)) 3.59592e-06
best keyword for cluster 6347970 is PF06162 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000
sibling [ 6347970 ] : 6571430 0.522885 (=377/(7*103)) 51.3244
best keyword for cluster 6571430 is PF01470 with Jaccard = 0.9888 [ 88 0 1100122 1 ] 1.0000 0.9888
SUGGESTING RELATEDNESS OF:
A> PF06162 ( PF06162 Caenorhabditis elegans protein of unknown function (DUF976) )
B> PF01470 ( PF01470 Pyroglutamyl peptidase )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF06162 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF06162 SSF53182 0.639 (average over 8 mutual instances, PF06162 8 appearances, SSF53182 285 appearances)
2 PF01470 SSF53182 0.943 (average over 276 mutual instances, PF01470 277 appearances, SSF53182 285 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 180 ) 6673923_PF01903_PF06180 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06180 is 6343254 with Jaccard = 1.0000 |PF06180|=28 [ 28 0 1100183 0 ]
parent [ 6343254 ] : 6673923 0.170699 (=889/(31*168)) 86.6456
given [ 6343254 ] : 6343254 1 (=150/(6*25)) 1.79778e-06
best keyword for cluster 6343254 is PF06180 with Jaccard = 1.0000 [ 28 0 1100183 0 ] 1.0000 1.0000
sibling [ 6343254 ] : 6671378 0.186503 (=152/(5*163)) 85.8541
best keyword for cluster 6671378 is PF01903 with Jaccard = 0.9313 [ 149 0 1100051 11 ] 1.0000 0.9313
SUGGESTING RELATEDNESS OF:
A> PF06180 ( PF06180 Cobalt chelatase (CbiK) )
B> PF01903 ( PF01903 CbiX )
they come from the same clan: CL0043.7 : PF06180 PF01903 PF00762
the two keywords do not coincide on UniRef90 proteins
both PF06180 and PF01903 have PDB structures
PF06180 c.92.1.2
PF01903 c.92.1.3
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 181 ) 6607810_PF06193_PF06909 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords -
are the keywords related? ======================
best cluster for keyword PF06193 is 5955101 with Jaccard = 1.0000 |PF06193|=3 [ 3 0 1100208 0 ]
parent [ 5955101 ] : 6607810 0.416667 (=10/(3*8)) 65.703
given [ 5955101 ] : 5955101 1 (=2/(1*2)) 2.5005e-38
best keyword for cluster 5955101 is PF06193 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
sibling [ 5955101 ] : 6545869 0.75 (=9/(2*6)) 35.9787
best keyword for cluster 6545869 is PF06909 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06193 ( PF06193 Orthopoxvirus A5L protein )
B> PF06909 ( PF06909 Protein of unknown function (DUF1274) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06193 nor PF06909 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 182 ) 6762134_PF06210_PF07300 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06210 is 6512231 with Jaccard = 1.0000 |PF06210|=53 [ 53 0 1100158 0 ]
parent [ 6512231 ] : 6762134 0.00907945 (=36/(61*65)) 99.3776
given [ 6512231 ] : 6512231 0.846698 (=359/(53*8)) 16.4585
best keyword for cluster 6512231 is PF06210 with Jaccard = 1.0000 [ 53 0 1100158 0 ] 1.0000 1.0000
sibling [ 6512231 ] : 6759162 0.015625 (=1/(1*64)) 99.2188
best keyword for cluster 6759162 is PF07300 with Jaccard = 0.8000 [ 28 7 1100176 0 ] 0.8000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06210 ( PF06210 Protein of unknown function (DUF1003) )
B> PF07300 ( PF07300 Protein of unknown function (DUF1452) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06210 nor PF07300 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 183 ) 6707568_PF04195_PF06217 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06217 is 6192842 with Jaccard = 1.0000 |PF06217|=12 [ 12 0 1100199 0 ]
parent [ 6192842 ] : 6707568 0.0710744 (=559/(13*605)) 93.5585
given [ 6192842 ] : 6192842 1 (=40/(5*8)) 7.63296e-18
best keyword for cluster 6192842 is PF06217 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
sibling [ 6192842 ] : 6690466 0.0978441 (=118/(2*603)) 90.3542
best keyword for cluster 6690466 is PF04195 with Jaccard = 0.8333 [ 305 58 1099845 3 ] 0.8402 0.9903
SUGGESTING RELATEDNESS OF:
A> PF06217 ( PF06217 GAGA binding protein-like family )
B> PF04195 ( PF04195 Putative gypsy type transposon )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06217 nor PF04195 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 184 ) 6669712_PF06229_PF06268 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06229 is 6538480 with Jaccard = 1.0000 |PF06229|=22 [ 22 0 1100189 0 ]
parent [ 6538480 ] : 6669712 0.19496 (=147/(26*29)) 85.4409
given [ 6538480 ] : 6538480 0.715152 (=118/(15*11)) 31.1103
best keyword for cluster 6538480 is PF06229 with Jaccard = 1.0000 [ 22 0 1100189 0 ] 1.0000 1.0000
sibling [ 6538480 ] : 6645488 0.275362 (=38/(6*23)) 78.6531
best keyword for cluster 6645488 is PF06268 with Jaccard = 0.9524 [ 20 0 1100190 1 ] 1.0000 0.9524
SUGGESTING RELATEDNESS OF:
A> PF06229 ( PF06229 FRG1-like family )
B> PF06268 ( PF06268 Fascin domain )
they come from the same clan: CL0066.9 : PF00652 PF02815 PF00197 PF00340 PF06229 PF00167 PF06268 PF04601 PF03498 PF05588 PF07468 PF05270 PF07951
the two keywords do not coincide on UniRef90 proteins
only PF06229 has a PDB structure (may not be up to date)
PF06268 b.42.5.1
SUPERFAM mapping significantly overlapping:
1 PF06229 SSF50405 0.549 (average over 42 mutual instances, PF06229 42 appearances, SSF50405 195 appearances)
2 PF06268 SSF50405 0.748 (average over 53 mutual instances, PF06268 53 appearances, SSF50405 195 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 185 ) 6764362_PF00223_PF06234 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06234 is 6551601 with Jaccard = 1.0000 |PF06234|=10 [ 10 0 1100201 0 ]
parent [ 6551601 ] : 6764362 0.0135895 (=29/(11*194)) 99.4822
given [ 6551601 ] : 6551601 0.6 (=6/(1*10)) 40.0001
best keyword for cluster 6551601 is PF06234 with Jaccard = 1.0000 [ 10 0 1100201 0 ] 1.0000 1.0000
sibling [ 6551601 ] : 6760531 0.00903614 (=42/(28*166)) 99.2922
best keyword for cluster 6760531 is PF00223 with Jaccard = 1.0000 [ 116 0 1100095 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06234 ( PF06234 Toluene-4-monooxygenase system protein B (TmoB) )
B> PF00223 ( PF00223 Photosystem I psaA/psaB protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF06234 and PF00223 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF06234 SSF110814 0.975 (average over 21 mutual instances, PF06234 21 appearances, SSF110814 21 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 186 ) 6750698_PF05682_PF06243 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06243 is 6527124 with Jaccard = 1.0000 |PF06243|=37 [ 37 0 1100174 0 ]
parent [ 6527124 ] : 6750698 0.0200846 (=38/(43*44)) 98.6696
given [ 6527124 ] : 6527124 0.841667 (=101/(3*40)) 24.2872
best keyword for cluster 6527124 is PF06243 with Jaccard = 1.0000 [ 37 0 1100174 0 ] 1.0000 1.0000
sibling [ 6527124 ] : 6748255 0.0232558 (=1/(1*43)) 98.4884
best keyword for cluster 6748255 is PF05682 with Jaccard = 1.0000 [ 37 0 1100174 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06243 ( PF06243 Phenylacetic acid degradation B )
B> PF05682 ( PF05682 Phosphorylase kinase alpha/beta )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06243 nor PF05682 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 187 ) 6611985_PF06285_PF07190 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06285 is 5897495 with Jaccard = 1.0000 |PF06285|=2 [ 2 0 1100209 0 ]
parent [ 5897495 ] : 6611985 0.333333 (=2/(2*3)) 67.5
given [ 5897495 ] : 5897495 1 (=1/(1*1)) 7e-44
best keyword for cluster 5897495 is PF06285 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
sibling [ 5897495 ] : 6561939 1 (=2/(1*2)) 48.5
best keyword for cluster 6561939 is PF07190 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06285 ( PF06285 Protein of unknown function (DUF1038) )
B> PF07190 ( PF07190 Protein of unknown function (DUF1406) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06285 nor PF07190 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 188 ) 6687592_PF06304_PF06570 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06304 is 6373256 with Jaccard = 1.0000 |PF06304|=8 [ 8 0 1100203 0 ]
parent [ 6373256 ] : 6687592 0.12549 (=64/(10*51)) 89.7545
given [ 6373256 ] : 6373256 1 (=21/(3*7)) 0.000186907
best keyword for cluster 6373256 is PF06304 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000
sibling [ 6373256 ] : 6665704 0.194444 (=105/(15*36)) 84.4283
best keyword for cluster 6665704 is PF06570 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06304 ( PF06304 Protein of unknown function (DUF1048) )
B> PF06570 ( PF06570 Protein of unknown function (DUF1129) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06304 nor PF06570 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 189 ) 6682268_PF03313_PF06354 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06354 is 6637613 with Jaccard = 1.0000 |PF06354|=37 [ 37 0 1100174 0 ]
parent [ 6637613 ] : 6682268 0.137195 (=1260/(224*41)) 88.764
given [ 6637613 ] : 6637613 0.236842 (=27/(38*3)) 76.3604
best keyword for cluster 6637613 is PF06354 with Jaccard = 1.0000 [ 37 0 1100174 0 ] 1.0000 1.0000
sibling [ 6637613 ] : 6537747 0.697079 (=5799/(47*177)) 30.7838
best keyword for cluster 6537747 is PF03313 with Jaccard = 0.7871 [ 159 43 1100009 0 ] 0.7871 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06354 ( PF06354 Protein of unknown function (DUF1063) )
B> PF03313 ( PF03313 Serine dehydratase alpha chain )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06354 nor PF03313 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 190 ) 6605478_PF06357_PF08087 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06357 is 6407369 with Jaccard = 1.0000 |PF06357|=4 [ 4 0 1100207 0 ]
parent [ 6407369 ] : 6605478 0.464286 (=26/(4*14)) 64.3339
given [ 6407369 ] : 6407369 1 (=3/(1*3)) 0.0223333
best keyword for cluster 6407369 is PF06357 with Jaccard = 1.0000 [ 4 0 1100207 0 ] 1.0000 1.0000
sibling [ 6407369 ] : 6473895 1 (=13/(1*13)) 4.40817
best keyword for cluster 6473895 is PF08087 with Jaccard = 0.7368 [ 14 0 1100192 5 ] 1.0000 0.7368
SUGGESTING RELATEDNESS OF:
A> PF06357 ( PF06357 Omega-atracotoxin )
B> PF08087 ( PF08087 Conotoxin O-superfamily )
Only A has a clan ( CL0083.9 ).
the two keywords do not coincide on UniRef90 proteins
only PF06357 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 191 ) 6567383_PF06358_PF06716 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06358 is 5919364 with Jaccard = 1.0000 |PF06358|=2 [ 2 0 1100209 0 ]
parent [ 5919364 ] : 6567383 0.5 (=2/(2*2)) 50.45
given [ 5919364 ] : 5919364 1 (=1/(1*1)) 1e-41
best keyword for cluster 5919364 is PF06358 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
sibling [ 5919364 ] : 6160596 1 (=1/(1*1)) 2e-20
best keyword for cluster 6160596 is PF06716 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06358 ( PF06358 Protein of unknown function (DUF1065) )
B> PF06716 ( PF06716 Protein of unknown function (DUF1201) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06358 nor PF06716 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 192 ) 6714912_PF01027_PF06539 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06539 is 6626501 with Jaccard = 1.0000 |PF06539|=32 [ 32 0 1100179 0 ]
parent [ 6626501 ] : 6714912 0.0648471 (=929/(38*377)) 94.7092
given [ 6626501 ] : 6626501 0.277778 (=20/(2*36)) 73.0771
best keyword for cluster 6626501 is PF06539 with Jaccard = 1.0000 [ 32 0 1100179 0 ] 1.0000 1.0000
sibling [ 6626501 ] : 6694175 0.0953654 (=107/(3*374)) 91.1356
best keyword for cluster 6694175 is PF01027 with Jaccard = 0.8990 [ 276 0 1099904 31 ] 1.0000 0.8990
SUGGESTING RELATEDNESS OF:
A> PF06539 ( PF06539 Protein of unknown function (DUF1112) )
B> PF01027 ( PF01027 Uncharacterised protein family UPF0005 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06539 nor PF01027 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 193 ) 6593602_PF04258_PF06550 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06550 is 6242234 with Jaccard = 1.0000 |PF06550|=14 [ 14 0 1100197 0 ]
parent [ 6242234 ] : 6593602 0.486555 (=579/(14*85)) 58.7706
given [ 6242234 ] : 6242234 1 (=33/(3*11)) 6.20457e-14
best keyword for cluster 6242234 is PF06550 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
sibling [ 6242234 ] : 6562687 0.542169 (=90/(2*83)) 49.0988
best keyword for cluster 6562687 is PF04258 with Jaccard = 0.9651 [ 83 0 1100125 3 ] 1.0000 0.9651
SUGGESTING RELATEDNESS OF:
A> PF06550 ( PF06550 Protein of unknown function (DUF1119) )
B> PF04258 ( PF04258 Signal peptide peptidase )
they come from the same clan: CL0130.6 : PF06550 PF04258 PF01478 PF01080
the two keywords do not coincide on UniRef90 proteins
only PF06550 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 194 ) 6562682_PF02502_PF06562 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06562 is 6137323 with Jaccard = 1.0000 |PF06562|=17 [ 17 0 1100194 0 ]
parent [ 6137323 ] : 6562682 0.579258 (=2653/(20*229)) 49.0913
given [ 6137323 ] : 6137323 1 (=51/(17*3)) 2.18261e-22
best keyword for cluster 6137323 is PF06562 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
sibling [ 6137323 ] : 6494884 0.923333 (=831/(4*225)) 9.71686
best keyword for cluster 6494884 is PF02502 with Jaccard = 0.9810 [ 206 0 1100001 4 ] 1.0000 0.9810
SUGGESTING RELATEDNESS OF:
A> PF06562 ( )
B> PF02502 ( PF02502 Ribose/Galactose Isomerase )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF06562 has a PDB structure (may not be up to date)
PF02502 c.121.1.1
SUPERFAM mapping significantly overlapping:
1 PF02502 SSF89623 0.954 (average over 779 mutual instances, PF02502 788 appearances, SSF89623 793 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 195 ) 6643833_PF03027_PF06585 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06585 is 6584814 with Jaccard = 1.0000 |PF06585|=4 [ 4 0 1100207 0 ]
parent [ 6584814 ] : 6643833 0.254795 (=93/(5*73)) 78.111
given [ 6584814 ] : 6584814 0.5 (=3/(3*2)) 55.451
best keyword for cluster 6584814 is PF06585 with Jaccard = 1.0000 [ 4 0 1100207 0 ] 1.0000 1.0000
sibling [ 6584814 ] : 6640910 0.233333 (=49/(70*3)) 77.3018
best keyword for cluster 6640910 is PF03027 with Jaccard = 0.8875 [ 71 0 1100131 9 ] 1.0000 0.8875
SUGGESTING RELATEDNESS OF:
A> PF06585 ( PF06585 Haemolymph juvenile hormone binding protein (JHBP) )
B> PF03027 ( PF03027 Odorant binding protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06585 nor PF03027 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 196 ) 6536308_PF04706_PF06607 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06607 is 6518161 with Jaccard = 1.0000 |PF06607|=15 [ 15 0 1100196 0 ]
parent [ 6518161 ] : 6536308 0.713555 (=279/(23*17)) 29.827
given [ 6518161 ] : 6518161 0.9375 (=15/(1*16)) 19.2516
best keyword for cluster 6518161 is PF06607 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000
sibling [ 6518161 ] : 6505222 0.904762 (=38/(2*21)) 13.3688
best keyword for cluster 6505222 is PF04706 with Jaccard = 0.9500 [ 19 0 1100191 1 ] 1.0000 0.9500
SUGGESTING RELATEDNESS OF:
A> PF06607 ( PF06607 Prokineticin )
B> PF04706 ( PF04706 Dickkopf N-terminal cysteine-rich region )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF06607 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF04706 SSF57027 0.529 (average over 1 mutual instances, PF04706 1 appearances, SSF57027 43 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 197 ) 6660579_PF05258_PF06647 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06647 is 6537173 with Jaccard = 1.0000 |PF06647|=28 [ 28 0 1100183 0 ]
parent [ 6537173 ] : 6660579 0.199001 (=677/(42*81)) 83.4517
given [ 6537173 ] : 6537173 0.730556 (=263/(30*12)) 30.2164
best keyword for cluster 6537173 is PF06647 with Jaccard = 1.0000 [ 28 0 1100183 0 ] 1.0000 1.0000
sibling [ 6537173 ] : 6650307 0.225 (=18/(1*80)) 80.1135
best keyword for cluster 6650307 is PF05258 with Jaccard = 0.7353 [ 25 0 1100177 9 ] 1.0000 0.7353
SUGGESTING RELATEDNESS OF:
A> PF06647 ( PF06647 Protein of unknown function (DUF1159) )
B> PF05258 ( PF05258 Protein of unknown function (DUF721) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06647 nor PF05258 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 198 ) 6768259_PF01310_PF06649 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06649 is 6467462 with Jaccard = 1.0000 |PF06649|=18 [ 18 0 1100193 0 ]
parent [ 6467462 ] : 6768259 0.00414079 (=4/(21*46)) 99.6492
given [ 6467462 ] : 6467462 0.977778 (=88/(6*15)) 3.33503
best keyword for cluster 6467462 is PF06649 with Jaccard = 1.0000 [ 18 0 1100193 0 ] 1.0000 1.0000
sibling [ 6467462 ] : 6753521 0.0133929 (=6/(14*32)) 98.8723
best keyword for cluster 6753521 is PF01310 with Jaccard = 1.0000 [ 30 0 1100181 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06649 ( PF06649 Protein of unknown function (DUF1161) )
B> PF01310 ( PF01310 Adenovirus hexon associated protein, protein VIII )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06649 nor PF01310 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 199 ) 6702845_PF02810_PF06685 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06685 is 6348829 with Jaccard = 1.0000 |PF06685|=7 [ 7 0 1100204 0 ]
parent [ 6348829 ] : 6702845 0.0765521 (=381/(7*711)) 92.6708
given [ 6348829 ] : 6348829 1 (=6/(1*6)) 4.00032e-06
best keyword for cluster 6348829 is PF06685 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
sibling [ 6348829 ] : 6700055 0.0834515 (=353/(6*705)) 92.1962
best keyword for cluster 6700055 is PF02810 with Jaccard = 0.6573 [ 374 160 1099642 35 ] 0.7004 0.9144
SUGGESTING RELATEDNESS OF:
A> PF06685 ( PF06685 Protein of unknown function (DUF1186) )
B> PF02810 ( PF02810 SEC-C motif )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF02810| = 409 , |PF06685| = 7 , |PF02810^PF06685| = 1 ( 0.2% and 14.3% )
only PF06685 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 200 ) 6725303_PF00999_PF06826 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06826 is 6428778 with Jaccard = 1.0000 |PF06826|=63 [ 63 0 1100148 0 ]
parent [ 6428778 ] : 6725303 0.0520848 (=6683/(70*1833)) 96.1477
given [ 6428778 ] : 6428778 0.997533 (=1213/(32*38)) 0.251136
best keyword for cluster 6428778 is PF06826 with Jaccard = 1.0000 [ 63 0 1100148 0 ] 1.0000 1.0000
sibling [ 6428778 ] : 6709395 0.0707204 (=53186/(1213*620)) 93.8441
best keyword for cluster 6709395 is PF00999 with Jaccard = 0.7175 [ 1194 456 1098547 14 ] 0.7236 0.9884
SUGGESTING RELATEDNESS OF:
A> PF06826 ( PF06826 Predicted Permease Membrane Region )
B> PF00999 ( PF00999 Sodium/hydrogen exchanger family )
they come from the same clan: CL0064.7 : PF06826 PF03547 PF03601 PF05684 PF05982 PF03616 PF06965 PF00999 PF03977 PF01758
the two keywords do not coincide on UniRef90 proteins
only PF06826 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 201 ) 6652057_PF05602_PF06836 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06836 is 6453855 with Jaccard = 1.0000 |PF06836|=17 [ 17 0 1100194 0 ]
parent [ 6453855 ] : 6652057 0.27342 (=251/(17*54)) 80.6986
given [ 6453855 ] : 6453855 0.983333 (=59/(12*5)) 1.67437
best keyword for cluster 6453855 is PF06836 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
sibling [ 6453855 ] : 6615832 0.313725 (=48/(3*51)) 68.8235
best keyword for cluster 6615832 is PF05602 with Jaccard = 1.0000 [ 51 0 1100160 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06836 ( PF06836 Protein of unknown function (DUF1240) )
B> PF05602 ( PF05602 Cleft lip and palate transmembrane protein 1 (CLPTM1) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06836 nor PF05602 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 202 ) 6604293_PF06819_PF06847 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06847 is 6554514 with Jaccard = 1.0000 |PF06847|=17 [ 17 0 1100194 0 ]
parent [ 6554514 ] : 6604293 0.418301 (=64/(9*17)) 63.7092
given [ 6554514 ] : 6554514 0.619048 (=26/(3*14)) 42.3601
best keyword for cluster 6554514 is PF06847 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000
sibling [ 6554514 ] : 6432270 1 (=20/(4*5)) 0.350884
best keyword for cluster 6432270 is PF06819 with Jaccard = 0.8889 [ 8 1 1100202 0 ] 0.8889 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06847 ( PF06847 Archaeal Peptidase A24 C-terminus Type II )
B> PF06819 ( PF06819 Archaeal Peptidase A24 C-terminal Domain )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins Neither PF06847 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 203 ) 6766845_PF00338_PF06856 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF06856 is 6524521 with Jaccard = 1.0000 |PF06856|=18 [ 18 0 1100193 0 ] parent [ 6524521 ] : 6766845 0.00543901 (=28/(18*286)) 99.5923 given [ 6524521 ] : 6524521 0.775 (=62/(8*10)) 22.7035 best keyword for cluster 6524521 is PF06856 with Jaccard = 1.0000 [ 18 0 1100193 0 ] 1.0000 1.0000 sibling [ 6524521 ] : 6726916 0.0381304 (=155/(15*271)) 96.3481 best keyword for cluster 6726916 is PF00338 with Jaccard = 1.0000 [ 205 0 1100006 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF06856 ( PF06856 Protein of unknown function (DUF1251) ) B> PF00338 ( PF00338 Ribosomal protein S10p/S20e ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF06856 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF00338 SSF54999 0.971 (average over 1055 mutual instances, PF00338 1056 appearances, SSF54999 1061 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 204 ) 6767361_PF01357_PF06865 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF06865 is 6440739 with Jaccard = 1.0000 |PF06865|=46 [ 46 0 1100165 0 ] parent [ 6440739 ] : 6767361 0.00649838 (=173/(54*493)) 99.6118 given [ 6440739 ] : 6440739 0.993103 (=720/(25*29)) 0.690185 best keyword for cluster 6440739 is PF06865 with Jaccard = 1.0000 [ 46 0 1100165 0 ] 1.0000 1.0000 sibling [ 6440739 ] : 6765849 0.0066309 (=131/(44*449)) 99.5501 best keyword for cluster 6765849 is PF01357 with Jaccard = 0.7397 [ 270 95 1099846 0 ] 0.7397 1.0000 SUGGESTING RELATEDNESS OF: A> PF06865 ( PF06865 Protein of unknown function (DUF1255) ) B> PF01357 ( PF01357 Pollen allergen ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF06865 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 205 ) 6738371_PF04306_PF06897 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF06897 is 6713513 with Jaccard = 1.0000 |PF06897|=32 [ 32 0 1100179 0 ] parent [ 6713513 ] : 6738371 0.0336458 (=140/(57*73)) 97.6253 given [ 6713513 ] : 6713513 0.0555556 (=24/(9*48)) 94.4991 best keyword for cluster 6713513 is PF06897 with Jaccard = 1.0000 [ 32 0 1100179 0 ] 1.0000 1.0000 sibling [ 6713513 ] : 6720837 0.0497512 (=20/(6*67)) 95.5347 best keyword for cluster 6720837 is PF04306 with Jaccard = 1.0000 [ 37 0 1100174 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF06897 ( PF06897 Protein of unknown function (DUF1269) ) B> PF04306 ( PF04306 Protein of unknown function (DUF456) ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF06897 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 206 ) 6740286_PF03833_PF06906 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF06906 is 6336414 with Jaccard = 1.0000 |PF06906|=21 [ 21 0 1100190 0 ] parent [ 6336414 ] : 6740286 0.0378788 (=50/(24*55)) 97.8034 given [ 6336414 ] : 6336414 1 (=80/(20*4)) 5.74267e-07 best keyword for cluster 6336414 is PF06906 with Jaccard = 1.0000 [ 21 0 1100190 0 ] 1.0000 1.0000 sibling [ 6336414 ] : 6718365 0.0701058 (=53/(27*28)) 95.1994 best keyword for cluster 6718365 is PF03833 with Jaccard = 0.7097 [ 22 9 1100180 0 ] 0.7097 1.0000 SUGGESTING RELATEDNESS OF: A> PF06906 ( PF06906 Protein of unknown function (DUF1272) ) B> PF03833 ( PF03833 DNA polymerase II large subunit DP2 ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF06906 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 207 ) 6714146_PF04391_PF06967 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF06967 is 6270216 with Jaccard = 1.0000 |PF06967|=23 [ 23 0 1100188 0 ] parent [ 6270216 ] : 6714146 0.0714286 (=115/(23*70)) 94.5837 given [ 6270216 ] : 6270216 1 (=42/(21*2)) 8.89159e-12 best keyword for cluster 6270216 is PF06967 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000 sibling [ 6270216 ] : 6695280 0.102941 (=14/(2*68)) 91.388 best keyword for cluster 6695280 is PF04391 with Jaccard = 0.9750 [ 39 1 1100171 0 ] 0.9750 1.0000 SUGGESTING RELATEDNESS OF: A> PF06967 ( PF06967 Mo-dependent nitrogenase C-terminus ) B> PF04391 ( PF04391 Protein of unknown function (DUF533) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF06967 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 208 ) 6735731_PF00831_PF06984 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF06984 is 6537196 with Jaccard = 1.0000 |PF06984|=39 [ 39 0 1100172 0 ] parent [ 6537196 ] : 6735731 0.0346955 (=433/(40*312)) 97.3435 given [ 6537196 ] : 6537196 0.701299 (=162/(7*33)) 30.2449 best keyword for cluster 6537196 is PF06984 with Jaccard = 1.0000 [ 39 0 1100172 0 ] 1.0000 1.0000 sibling [ 6537196 ] : 6607871 0.38245 (=1155/(10*302)) 65.7607 best keyword for cluster 6607871 is PF00831 with Jaccard = 0.9965 [ 281 0 1099929 1 ] 1.0000 0.9965 SUGGESTING RELATEDNESS OF: A> PF06984 ( PF06984 Mitochondrial 39-S ribosomal protein L47 (MRP-L47) ) B> PF00831 ( PF00831 Ribosomal L29 protein ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF06984 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF06984 SSF46561 0.752 (average over 14 mutual instances, PF06984 14 appearances, SSF46561 1019 appearances) 2 PF00831 SSF46561 0.926 (average over 999 mutual instances, PF00831 1000 appearances, SSF46561 1019 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 209 ) 6696135_PF07006_PF07252 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07006 is 6317449 with Jaccard = 1.0000 |PF07006|=19 [ 19 0 1100192 0 ] parent [ 6317449 ] : 6696135 0.109114 (=85/(19*41)) 91.5899 given [ 6317449 ] : 6317449 1 (=78/(13*6)) 2.60803e-08 best keyword for cluster 6317449 is PF07006 with Jaccard = 1.0000 [ 19 0 1100192 0 ] 1.0000 1.0000 sibling [ 6317449 ] : 6664039 0.195767 (=74/(14*27)) 84.074 best keyword for cluster 6664039 is PF07252 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07006 ( PF07006 Protein of unknown function (DUF1310) ) B> PF07252 ( PF07252 Protein of unknown function (DUF1433) ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF07006 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 210 ) 6723254_PF03301_PF07014 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07014 is 6303319 with Jaccard = 1.0000 |PF07014|=9 [ 9 0 1100202 0 ] parent [ 6303319 ] : 6723254 0.0610329 (=39/(9*71)) 95.8948 given [ 6303319 ] : 6303319 1 (=8/(1*8)) 2.5e-09 best keyword for cluster 6303319 is PF07014 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000 sibling [ 6303319 ] : 6676221 0.147059 (=30/(3*68)) 87.2258 best keyword for cluster 6676221 is PF03301 with Jaccard = 1.0000 [ 59 0 1100152 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07014 ( PF07014 Hs1pro-1 protein C-terminus ) B> PF03301 ( PF03301 Tryptophan 2,3-dioxygenase ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF07014 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 211 ) 6773216_PF07023_PF07999 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07023 is 6445759 with Jaccard = 1.0000 |PF07023|=32 [ 32 0 1100179 0 ] parent [ 6445759 ] : 6773216 0.00322545 (=29/(37*243)) 99.811 given [ 6445759 ] : 6445759 0.990909 (=327/(15*22)) 0.997043 best keyword for cluster 6445759 is PF07023 with Jaccard = 1.0000 [ 32 0 1100179 0 ] 1.0000 1.0000 sibling [ 6445759 ] : 6769867 0.00413223 (=1/(1*242)) 99.7066 best keyword for cluster 6769867 is PF07999 with Jaccard = 0.9944 [ 176 1 1100034 0 ] 0.9944 1.0000 SUGGESTING RELATEDNESS OF: A> PF07023 ( PF07023 Protein of unknown function (DUF1315) ) B> PF07999 ( PF07999 Retrotransposon hot spot protein ) Only A has a clan ( CL0072.14 ). the two keywords do not coincide on UniRef90 proteins Neither PF07023 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 212 ) 6726885_PF04965_PF07025 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07025 is 6699234 with Jaccard = 1.0000 |PF07025|=81 [ 81 0 1100130 0 ] parent [ 6699234 ] : 6726885 0.0480519 (=481/(91*110)) 96.3428 given [ 6699234 ] : 6699234 0.0948276 (=33/(87*4)) 92.0491 best keyword for cluster 6699234 is PF07025 with Jaccard = 1.0000 [ 81 0 1100130 0 ] 1.0000 1.0000 sibling [ 6699234 ] : 6720064 0.0533333 (=28/(5*105)) 95.4232 best keyword for cluster 6720064 is PF04965 with Jaccard = 0.9277 [ 77 6 1100128 0 ] 0.9277 1.0000 SUGGESTING RELATEDNESS OF: A> PF07025 ( ) B> PF04965 ( PF04965 Gene 25-like lysozyme ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF07025 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 213 ) 6678285_PF06133_PF07050 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07050 is 6617796 with Jaccard = 1.0000 |PF07050|=20 [ 20 0 1100191 0 ] parent [ 6617796 ] : 6678285 0.139706 (=285/(40*51)) 87.7535 given [ 6617796 ] : 6617796 0.3225 (=129/(20*20)) 69.537 best keyword for cluster 6617796 is PF07050 with Jaccard = 1.0000 [ 20 0 1100191 0 ] 1.0000 1.0000 sibling [ 6617796 ] : 6654679 0.18617 (=35/(47*4)) 81.6075 best keyword for cluster 6654679 is PF06133 with Jaccard = 0.9756 [ 40 0 1100170 1 ] 1.0000 0.9756 SUGGESTING RELATEDNESS OF: A> PF07050 ( PF07050 Protein of unknown function (DUF1333) ) B> PF06133 ( PF06133 Protein of unknown function (DUF964) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07050 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 214 ) 6729287_PF01081_PF07071 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07071 is 6288448 with Jaccard = 1.0000 |PF07071|=11 [ 11 0 1100200 0 ] parent [ 6288448 ] : 6729287 0.0474658 (=118/(11*226)) 96.6454 given [ 6288448 ] : 6288448 1 (=10/(1*10)) 2e-10 best keyword for cluster 6288448 is PF07071 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000 sibling [ 6288448 ] : 6678844 0.15786 (=242/(7*219)) 87.8921 best keyword for cluster 6678844 is PF01081 with Jaccard = 0.9846 [ 192 0 1100016 3 ] 1.0000 0.9846 SUGGESTING RELATEDNESS OF: A> PF07071 ( PF07071 Protein of unknown function (DUF1341) ) B> PF01081 ( PF01081 KDPG and KHG aldolase ) Only B has a clan ( CL0036.17 ). the two keywords do not coincide on UniRef90 proteins only PF07071 has a PDB structure (may not be up to date) PF01081 c.1.10.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 215 ) 6680901_PF07088_PF07181 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07088 is 6297211 with Jaccard = 1.0000 |PF07088|=7 [ 7 0 1100204 0 ] parent [ 6297211 ] : 6680901 0.119048 (=5/(6*7)) 88.4074 given [ 6297211 ] : 6297211 1 (=10/(2*5)) 9.01e-10 best keyword for cluster 6297211 is PF07088 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000 sibling [ 6297211 ] : 6109666 1 (=5/(1*5)) 1.00042e-24 best keyword for cluster 6109666 is PF07181 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07088 ( PF07088 GvpD gas vesicle protein ) B> PF07181 ( PF07181 VirC2 protein ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF07088 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 216 ) 6765756_PF00816_PF07146 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07146 is 6742240 with Jaccard = 1.0000 |PF07146|=17 [ 17 0 1100194 0 ] parent [ 6742240 ] : 6765756 0.00577201 (=40/(210*33)) 99.5458 given [ 6742240 ] : 6742240 0.0225564 (=6/(19*14)) 97.9948 best keyword for cluster 6742240 is PF07146 with Jaccard = 1.0000 [ 17 0 1100194 0 ] 1.0000 1.0000 sibling [ 6742240 ] : 6732367 0.0386939 (=237/(175*35)) 96.9927 best keyword for cluster 6732367 is PF00816 with Jaccard = 1.0000 [ 150 0 1100061 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07146 ( PF07146 Protein of unknown function (DUF1389) ) B> PF00816 ( PF00816 H-NS histone family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF07146 has a PDB structure (may not be up to date) PF00816 a.155.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 217 ) 6748319_PF00335_PF07150 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07150 is 6691378 with Jaccard = 1.0000 |PF07150|=8 [ 8 0 1100203 0 ] parent [ 6691378 ] : 6748319 0.0176329 (=146/(15*552)) 98.4933 given [ 6691378 ] : 6691378 0.113636 (=5/(4*11)) 90.5531 best keyword for cluster 6691378 is PF07150 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000 sibling [ 6691378 ] : 6738017 0.0322623 (=243/(14*538)) 97.5898 best keyword for cluster 6738017 is PF00335 with Jaccard = 0.9867 [ 445 1 1099760 5 ] 0.9978 0.9889 SUGGESTING RELATEDNESS OF: A> PF07150 ( PF07150 Protein of unknown function (DUF1390) ) B> PF00335 ( PF00335 Tetraspanin family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF07150 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 218 ) 6671500_PF00669_PF07164 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07164 is 6507099 with Jaccard = 1.0000 |PF07164|=14 [ 14 0 1100197 0 ] parent [ 6507099 ] : 6671500 0.162438 (=5263/(40*810)) 85.933 given [ 6507099 ] : 6507099 0.874459 (=202/(7*33)) 14.1077 best keyword for cluster 6507099 is PF07164 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000 sibling [ 6507099 ] : 6643793 0.235149 (=380/(2*808)) 78.0815 best keyword for cluster 6643793 is PF00669 with Jaccard = 0.9661 [ 684 22 1099503 2 ] 0.9688 0.9971 SUGGESTING RELATEDNESS OF: A> PF07164 ( PF07164 Putative flagellar hook-associated protein 3 (HAP3) ) B> PF00669 ( PF00669 Bacterial flagellin N-terminus ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF07164 has a PDB structure (may not be up to date) PF00669 e.32.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 219 ) 6702869_PF07183_PF07756 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07183 is 6520379 with Jaccard = 1.0000 |PF07183|=12 [ 12 0 1100199 0 ] parent [ 6520379 ] : 6702869 0.0926724 (=43/(16*29)) 92.6756 given [ 6520379 ] : 6520379 0.8 (=12/(1*15)) 20.2269 best keyword for cluster 6520379 is PF07183 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000 sibling [ 6520379 ] : 6631579 0.275 (=33/(5*24)) 75.0431 best keyword for cluster 6631579 is PF07756 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07183 ( PF07183 Protein of unknown function (DUF1403) ) B> PF07756 ( PF07756 Protein of unknown function (DUF1612) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07183 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 220 ) 6675009_PF07241_PF07873 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07241 is 6395454 with Jaccard = 1.0000 |PF07241|=24 [ 24 0 1100187 0 ] parent [ 6395454 ] : 6675009 0.172727 (=95/(22*25)) 86.9088 given [ 6395454 ] : 6395454 1 (=66/(3*22)) 0.0044824 best keyword for cluster 6395454 is PF07241 with Jaccard = 1.0000 [ 24 0 1100187 0 ] 1.0000 1.0000 sibling [ 6395454 ] : 6359649 1 (=40/(2*20)) 2.12643e-05 best keyword for cluster 6359649 is PF07873 with Jaccard = 1.0000 [ 21 0 1100190 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07241 ( PF07241 Protein of unknown function (DUF1429) ) B> PF07873 ( PF07873 YabP family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07241 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 221 ) 6672174_PF06656_PF07245 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07245 is 6376394 with Jaccard = 1.0000 |PF07245|=16 [ 16 0 1100195 0 ] parent [ 6376394 ] : 6672174 0.151786 (=17/(16*7)) 86.0753 given [ 6376394 ] : 6376394 1 (=55/(5*11)) 0.000295131 best keyword for cluster 6376394 is PF07245 with Jaccard = 1.0000 [ 16 0 1100195 0 ] 1.0000 1.0000 sibling [ 6376394 ] : 6608857 0.4 (=4/(5*2)) 66.49 best keyword for cluster 6608857 is PF06656 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07245 ( PF07245 Phlebovirus glycoprotein G2 ) B> PF06656 ( PF06656 Tenuivirus PVC2 protein ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF07245 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 222 ) 6728300_PF04271_PF07261 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07261 is 6545393 with Jaccard = 1.0000 |PF07261|=33 [ 33 0 1100178 0 ] parent [ 6545393 ] : 6728300 0.0441122 (=478/(43*252)) 96.5185 given [ 6545393 ] : 6545393 0.678947 (=129/(5*38)) 35.5023 best keyword for cluster 6545393 is PF07261 with Jaccard = 1.0000 [ 33 0 1100178 0 ] 1.0000 1.0000 sibling [ 6545393 ] : 6721711 0.0456731 (=475/(52*200)) 95.6656 best keyword for cluster 6721711 is PF04271 with Jaccard = 0.7607 [ 89 25 1100094 3 ] 0.7807 0.9674 SUGGESTING RELATEDNESS OF: A> PF07261 ( PF07261 Replication initiation and membrane attachment protein (DnaB) ) B> PF04271 ( PF04271 DnaD-like domain ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07261 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 223 ) 6707929_PF07211_PF07352 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07352 is 6529230 with Jaccard = 1.0000 |PF07352|=15 [ 15 0 1100196 0 ] parent [ 6529230 ] : 6707929 0.0864198 (=14/(18*9)) 93.6176 given [ 6529230 ] : 6529230 0.763889 (=55/(12*6)) 25.3618 best keyword for cluster 6529230 is PF07352 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000 sibling [ 6529230 ] : 6662867 0.166667 (=3/(6*3)) 83.8579 best keyword for cluster 6662867 is PF07211 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07352 ( PF07352 Bacteriophage Mu Gam like protein ) B> PF07211 ( PF07211 Protein of unknown function (DUF1417) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07352 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 224 ) 6696871_PF02987_PF07384 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07384 is 6683589 with Jaccard = 1.0000 |PF07384|=1 [ 1 0 1100210 0 ] parent [ 6683589 ] : 6696871 0.103596 (=386/(9*414)) 91.7138 given [ 6683589 ] : 6683589 0.125 (=1/(1*8)) 89 best keyword for cluster 6683589 is PF07384 with Jaccard = 1.0000 [ 1 0 1100210 0 ] 1.0000 1.0000 sibling [ 6683589 ] : 6693759 0.106762 (=761/(18*396)) 91.0191 best keyword for cluster 6693759 is PF02987 with Jaccard = 0.6947 [ 132 26 1100021 32 ] 0.8354 0.8049 SUGGESTING RELATEDNESS OF: A> PF07384 ( PF07384 Protein of unknown function (DUF1497) ) B> PF02987 ( PF02987 Late embryogenesis abundant protein ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF07384 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 225 ) 6736878_PF02691_PF07406 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07406 is 6181172 with Jaccard = 1.0000 |PF07406|=9 [ 9 0 1100202 0 ] parent [ 6181172 ] : 6736878 0.0299145 (=14/(9*52)) 97.4704 given [ 6181172 ] : 6181172 1 (=18/(3*6)) 8.89382e-19 best keyword for cluster 6181172 is PF07406 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000 sibling [ 6181172 ] : 6725357 0.0416667 (=8/(4*48)) 96.151 best keyword for cluster 6725357 is PF02691 with Jaccard = 0.9375 [ 45 1 1100163 2 ] 0.9783 0.9574 SUGGESTING RELATEDNESS OF: A> PF07406 ( PF07406 NICE-3 protein ) B> PF02691 ( PF02691 Vacuolating cyotoxin ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF07406 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 226 ) 6756120_PF01963_PF07446 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07446 is 6689118 with Jaccard = 1.0000 |PF07446|=61 [ 61 0 1100150 0 ] parent [ 6689118 ] : 6756120 0.0141666 (=107/(83*91)) 99.0387 given [ 6689118 ] : 6689118 0.115616 (=77/(74*9)) 90.102 best keyword for cluster 6689118 is PF07446 with Jaccard = 1.0000 [ 61 0 1100150 0 ] 1.0000 1.0000 sibling [ 6689118 ] : 6740494 0.0278293 (=30/(77*14)) 97.8268 best keyword for cluster 6740494 is PF01963 with Jaccard = 0.9831 [ 58 0 1100152 1 ] 1.0000 0.9831 SUGGESTING RELATEDNESS OF: A> PF07446 ( PF07446 GumN protein ) B> PF01963 ( PF01963 TraB family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07446 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 227 ) 6600048_PF00666_PF07448 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07448 is 6262049 with Jaccard = 1.0000 |PF07448|=6 [ 6 0 1100205 0 ] parent [ 6262049 ] : 6600048 0.460317 (=116/(6*42)) 61.5633 given [ 6262049 ] : 6262049 1 (=5/(1*5)) 2.01606e-12 best keyword for cluster 6262049 is PF07448 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000 sibling [ 6262049 ] : 6536028 0.815789 (=124/(4*38)) 29.552 best keyword for cluster 6536028 is PF00666 with Jaccard = 0.9459 [ 35 0 1100174 2 ] 1.0000 0.9459 SUGGESTING RELATEDNESS OF: A> PF07448 ( PF07448 Secreted phosphoprotein 24 (Spp-24) ) B> PF00666 ( PF00666 Cathelicidin ) they come from the same clan: CL0121.6 : PF00666 PF00031 PF07448 the two keywords do not coincide on UniRef90 proteins only PF07448 has a PDB structure (may not be up to date) PF00666 d.17.1.3 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 228 ) 6713040_PF00728_PF07555 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07555 is 6653972 with Jaccard = 1.0000 |PF07555|=31 [ 31 0 1100180 0 ] parent [ 6653972 ] : 6713040 0.0634921 (=576/(36*252)) 94.4161 given [ 6653972 ] : 6653972 0.191176 (=13/(2*34)) 81.3482 best keyword for cluster 6653972 is PF07555 with Jaccard = 1.0000 [ 31 0 1100180 0 ] 1.0000 1.0000 sibling [ 6653972 ] : 6702335 0.0916335 (=23/(1*251)) 92.5988 best keyword for cluster 6702335 is PF00728 with Jaccard = 0.9957 [ 233 0 1099977 1 ] 1.0000 0.9957 SUGGESTING RELATEDNESS OF: A> PF07555 ( PF07555 Hyaluronidase ) B> PF00728 ( PF00728 Glycosyl hydrolase family 20, catalytic domain ) Only B has a clan ( CL0058.10 ). 
the two keywords do not coincide on UniRef90 proteins only PF07555 has a PDB structure (may not be up to date) PF00728 c.1.8.6 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 229 ) 6645404_PF04642_PF07794 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07794 is 6512476 with Jaccard = 1.0000 |PF07794|=13 [ 13 0 1100198 0 ] parent [ 6512476 ] : 6645404 0.238782 (=149/(13*48)) 78.5899 given [ 6512476 ] : 6512476 0.833333 (=30/(4*9)) 16.6667 best keyword for cluster 6512476 is PF07794 with Jaccard = 1.0000 [ 13 0 1100198 0 ] 1.0000 1.0000 sibling [ 6512476 ] : 6625243 0.284375 (=91/(8*40)) 72.5771 best keyword for cluster 6625243 is PF04642 with Jaccard = 0.7500 [ 6 2 1100203 0 ] 0.7500 1.0000 SUGGESTING RELATEDNESS OF: A> PF07794 ( PF07794 Protein of unknown function (DUF1633) ) B> PF04642 ( PF04642 Protein of unknown function, DUF601 ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07794 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 230 ) 6734835_PF02624_PF07812 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF07812 is 6575948 with Jaccard = 1.0000 |PF07812|=23 [ 23 0 1100188 0 ] parent [ 6575948 ] : 6734835 0.0286499 (=73/(26*98)) 97.2563 given [ 6575948 ] : 6575948 0.479167 (=23/(24*2)) 52.5428 best keyword for cluster 6575948 is PF07812 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000 sibling [ 6575948 ] : 6723850 0.0412371 (=4/(1*97)) 95.9815 best keyword for cluster 6723850 is PF02624 with Jaccard = 0.9348 [ 86 0 1100119 6 ] 1.0000 0.9348 SUGGESTING RELATEDNESS OF: A> PF07812 ( PF07812 TfuA-like protein ) B> PF02624 ( PF02624 YcaO-like family ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF02624| = 92 , |PF07812| = 23 , |PF02624^PF07812| = 1 ( 1.1% and 4.3% ) Neither PF07812 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 231 ) 6640090_PF04961_PF07837 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07837 is 6552246 with Jaccard = 1.0000 |PF07837|=36 [ 36 0 1100175 0 ] parent [ 6552246 ] : 6640090 0.23141 (=389/(41*41)) 77.0151 given [ 6552246 ] : 6552246 0.7 (=28/(1*40)) 40.5791 best keyword for cluster 6552246 is PF07837 with Jaccard = 1.0000 [ 36 0 1100175 0 ] 1.0000 1.0000 sibling [ 6552246 ] : 6628199 0.263158 (=30/(3*38)) 73.758 best keyword for cluster 6628199 is PF04961 with Jaccard = 0.7609 [ 35 0 1100165 11 ] 1.0000 0.7609 SUGGESTING RELATEDNESS OF: A> PF07837 ( PF07837 Formiminotransferase domain, N-terminal subdomain ) B> PF04961 ( PF04961 Formiminotransferase-cyclodeaminase ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF04961| = 46 , |PF07837| = 36 , |PF04961^PF07837| = 10 ( 21.7% and 27.8% ) both PF07837 and PF04961 have PDB structures PF07837 d.58.34.1 PF04961 a.191.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 232 ) 6658087_PF07399_PF07854 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07854 is 6022870 with Jaccard = 1.0000 |PF07854|=15 [ 15 0 1100196 0 ] parent [ 6022870 ] : 6658087 0.226389 (=163/(15*48)) 82.7444 given [ 6022870 ] : 6022870 1 (=14/(1*14)) 3.57921e-32 best keyword for cluster 6022870 is PF07854 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000 sibling [ 6022870 ] : 6633549 0.308594 (=158/(32*16)) 75.4505 best keyword for cluster 6633549 is PF07399 with Jaccard = 0.7619 [ 16 5 1100190 0 ] 0.7619 1.0000 SUGGESTING RELATEDNESS OF: A> PF07854 ( PF07854 Protein of unknown function (DUF1646) ) B> PF07399 ( PF07399 Protein of unknown function (DUF1504) ) they come from the same clan: CL0182.8 : PF06450 PF00939 PF03553 PF07158 PF02652 PF02447 PF04165 PF07854 PF07399 PF03606 PF03605 PF06808 PF03600 PF02040 PF00873 PF03806 PF02667 the two keywords do not coincide on UniRef90 proteins Neither PF07854 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 233 ) 6734858_PF00614_PF07894 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate 
keywords - are the keywords related? ====================== best cluster for keyword PF07894 is 6651714 with Jaccard = 1.0000 |PF07894|=35 [ 35 0 1100176 0 ] parent [ 6651714 ] : 6734858 0.0338302 (=1112/(38*865)) 97.2589 given [ 6651714 ] : 6651714 0.198529 (=27/(34*4)) 80.524 best keyword for cluster 6651714 is PF07894 with Jaccard = 1.0000 [ 35 0 1100176 0 ] 1.0000 1.0000 sibling [ 6651714 ] : 6715742 0.0644783 (=660/(12*853)) 94.8356 best keyword for cluster 6715742 is PF00614 with Jaccard = 0.9690 [ 657 13 1099533 8 ] 0.9806 0.9880 SUGGESTING RELATEDNESS OF: A> PF07894 ( PF07894 Protein of unknown function (DUF1669) ) B> PF00614 ( PF00614 Phospholipase D Active site motif ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF07894 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 234 ) 6753445_PF01281_PF07942 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07942 is 6491569 with Jaccard = 1.0000 |PF07942|=58 [ 58 0 1100153 0 ] parent [ 6491569 ] : 6753445 0.0125095 (=230/(58*317)) 98.8673 given [ 6491569 ] : 6491569 0.924528 (=245/(5*53)) 8.73119 best keyword for cluster 6491569 is PF07942 with Jaccard = 1.0000 [ 58 0 1100153 0 ] 1.0000 1.0000 sibling [ 6491569 ] : 6740541 0.030303 (=102/(306*11)) 97.8315 best keyword for cluster 6740541 is PF01281 with Jaccard = 0.9821 [ 274 1 1099932 4 ] 0.9964 0.9856 SUGGESTING RELATEDNESS OF: A> PF07942 ( PF07942 N2227-like protein ) B> PF01281 ( PF01281 Ribosomal protein L9, N-terminal domain ) Only A has a clan ( CL0102.14 ). 
the two keywords coincide on Uniref90 proteins: |PF01281| = 278 , |PF07942| = 58 , |PF01281^PF07942| = 1 ( 0.4% and 1.7% ) only PF07942 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 235 ) 6676079_PF05300_PF07956 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07956 is 6479923 with Jaccard = 1.0000 |PF07956|=13 [ 13 0 1100198 0 ] parent [ 6479923 ] : 6676079 0.16036 (=89/(15*37)) 87.1812 given [ 6479923 ] : 6479923 0.946429 (=53/(8*7)) 5.6792 best keyword for cluster 6479923 is PF07956 with Jaccard = 1.0000 [ 13 0 1100198 0 ] 1.0000 1.0000 sibling [ 6479923 ] : 6672233 0.138889 (=5/(1*36)) 86.1129 best keyword for cluster 6672233 is PF05300 with Jaccard = 0.7037 [ 19 8 1100184 0 ] 0.7037 1.0000 SUGGESTING RELATEDNESS OF: A> PF07956 ( PF07956 Protein of Unknown function (DUF1690) ) B> PF05300 ( PF05300 Protein of unknown function (DUF737) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07956 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 236 ) 6659593_PF04314_PF07987 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF07987 is 6596885 with Jaccard = 1.0000 |PF07987|=36 [ 36 0 1100175 0 ] parent [ 6596885 ] : 6659593 0.168105 (=896/(41*130)) 83.2756 given [ 6596885 ] : 6596885 0.447368 (=51/(3*38)) 60.1169 best keyword for cluster 6596885 is PF07987 with Jaccard = 1.0000 [ 36 0 1100175 0 ] 1.0000 1.0000 sibling [ 6596885 ] : 6638422 0.257812 (=66/(2*128)) 76.5979 best keyword for cluster 6638422 is PF04314 with Jaccard = 0.9500 [ 114 0 1100091 6 ] 1.0000 0.9500 SUGGESTING RELATEDNESS OF: A> PF07987 ( PF07987 Domain of unkown function (DUF1775) ) B> PF04314 ( PF04314 Protein of unknown function (DUF461) ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF04314| = 120 , |PF07987| = 36 , |PF04314^PF07987| = 6 ( 5.0% and 16.7% ) only PF07987 has a PDB structure (may not be up to date) PF04314 b.2.10.1 SUPERFAM mapping significantly overlapping: 1 PF04314 SSF110087 0.786 (average over 395 mutual instances, PF04314 395 appearances, SSF110087 417 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 237 ) 6728765_PF05013_PF08014 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF08014 is 6625336 with Jaccard = 1.0000 |PF08014|=30 [ 30 0 1100181 0 ] parent [ 6625336 ] : 6728765 0.0395745 (=279/(50*141)) 96.5812 given [ 6625336 ] : 6625336 0.295635 (=149/(14*36)) 72.6554 best keyword for cluster 6625336 is PF08014 with Jaccard = 1.0000 [ 30 0 1100181 0 ] 1.0000 1.0000 sibling [ 6625336 ] : 6691550 0.105839 (=58/(137*4)) 90.5826 best keyword for cluster 6691550 is PF05013 with Jaccard = 1.0000 [ 117 0 1100094 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF08014 ( PF08014 Domain of unknown function (DUF1704) ) B> PF05013 ( PF05013 N-formylglutamate amidohydrolase ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF08014 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 238 ) 6759786_PF00272_PF08107 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF08107 is 6691232 with Jaccard = 1.0000 |PF08107|=22 [ 22 0 1100189 0 ] parent [ 6691232 ] : 6759786 0.0118881 (=17/(22*65)) 99.2522 given [ 6691232 ] : 6691232 0.0952381 (=2/(1*21)) 90.5016 best keyword for cluster 6691232 is PF08107 with Jaccard = 1.0000 [ 22 0 1100189 0 ] 1.0000 1.0000 sibling [ 6691232 ] : 6741762 0.0328947 (=15/(8*57)) 97.9496 best keyword for cluster 6741762 is PF00272 with Jaccard = 1.0000 [ 51 0 1100160 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF08107 ( PF08107 Pleurocidin family ) B> PF00272 ( PF00272 Cecropin family ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF08107 and PF00272 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 239 ) 6625572_PF00400_PF08149 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF08149 is 6299611 with Jaccard = 1.0000 |PF08149|=47 [ 47 0 1100164 0 ] parent [ 6299611 ] : 6625572 0.326452 (=64457/(47*4201)) 72.7022 given [ 6299611 ] : 6299611 1 (=46/(1*46)) 1.30436e-09 best keyword for cluster 6299611 is PF08149 with Jaccard = 1.0000 [ 47 0 1100164 0 ] 1.0000 1.0000 sibling [ 6299611 ] : 6624157 0.31286 (=33961/(26*4175)) 72.0956 best keyword for cluster 6624157 is PF00400 with Jaccard = 0.6951 [ 3976 38 1094491 1706 ] 0.9905 0.6998 SUGGESTING RELATEDNESS OF: A> PF08149 ( PF08149 BING4CT (NUC141) domain ) B> PF00400 ( PF00400 WD domain, G-beta repeat ) Only B has a clan ( CL0186.8 ). the two keywords coincide on Uniref90 proteins: |PF00400| = 5682 , |PF08149| = 47 , |PF00400^PF08149| = 44 ( 0.8% and 93.6% ) only PF08149 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 240 ) 6737546_PF00400_PF08159 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF08159 is 6695144 with Jaccard = 1.0000 |PF08159|=50 [ 50 0 1100161 0 ] parent [ 6695144 ] : 6737546 0.0325438 (=20362/(99*6320)) 97.5386 given [ 6695144 ] : 6695144 0.12018 (=293/(46*53)) 91.3417 best keyword for cluster 6695144 is PF08159 with Jaccard = 1.0000 [ 50 0 1100161 0 ] 1.0000 1.0000 sibling [ 6695144 ] : 6736650 0.0304183 (=1536/(8*6312)) 97.4416 best keyword for cluster 6736650 is PF00400 with Jaccard = 0.8806 [ 5141 156 1094373 541 ] 0.9705 0.9048 SUGGESTING RELATEDNESS OF: A> PF08159 ( PF08159 NUC153 domain ) B> PF00400 ( PF00400 WD domain, G-beta repeat ) Only B has a clan ( CL0186.8 ). the two keywords coincide on Uniref90 proteins: |PF00400| = 5682 , |PF08159| = 50 , |PF00400^PF08159| = 5 ( 0.1% and 10.0% ) only PF08159 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 241 ) 6774912_PF06006_PF08195 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF08195 is 6770811 with Jaccard = 1.0000 |PF08195|=1 [ 1 0 1100210 0 ] parent [ 6770811 ] : 6774912 0.0015361 (=4/(42*62)) 99.8533 given [ 6770811 ] : 6770811 0.0163934 (=1/(1*61)) 99.7377 best keyword for cluster 6770811 is PF08195 with Jaccard = 1.0000 [ 1 0 1100210 0 ] 1.0000 1.0000 sibling [ 6770811 ] : 6768338 0.00470588 (=2/(25*17)) 99.6518 best keyword for cluster 6768338 is PF06006 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF08195 ( PF08195 TRI9 protein ) B> PF06006 ( PF06006 Bacterial protein of unknown function (DUF905) ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF08195 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 242 ) 6729629_PF01581_PF08257 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF08257 is 6629938 with Jaccard = 1.0000 |PF08257|=7 [ 7 0 1100204 0 ] parent [ 6629938 ] : 6729629 0.0435374 (=32/(7*105)) 96.6833 given [ 6629938 ] : 6629938 0.666667 (=4/(1*6)) 74.5 best keyword for cluster 6629938 is PF08257 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000 sibling [ 6629938 ] : 6725506 0.0515789 (=49/(10*95)) 96.169 best keyword for cluster 6725506 is PF01581 with Jaccard = 0.8971 [ 61 0 1100143 7 ] 1.0000 0.8971 SUGGESTING RELATEDNESS OF: A> PF08257 ( PF08257 Sulfakinin family ) B> PF01581 ( PF01581 FMRFamide related peptide family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF08257 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 243 ) 6722846_PF05827_PF08319 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF08319 is 6619152 with Jaccard = 1.0000 |PF08319|=18 [ 18 0 1100193 0 ] parent [ 6619152 ] : 6722846 0.0601173 (=41/(22*31)) 95.838 given [ 6619152 ] : 6619152 0.321429 (=36/(8*14)) 70.021 best keyword for cluster 6619152 is PF08319 with Jaccard = 1.0000 [ 18 0 1100193 0 ] 1.0000 1.0000 sibling [ 6619152 ] : 6705544 0.0769231 (=10/(26*5)) 93.2016 best keyword for cluster 6705544 is PF05827 with Jaccard = 0.9600 [ 24 0 1100186 1 ] 1.0000 0.9600 SUGGESTING RELATEDNESS OF: A> PF08319 ( PF08319 ER protein BIG1 ) B> PF05827 ( PF05827 Vacuolar ATP synthase subunit S1 (ATP6S1) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF08319 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 244 ) 6236339_PF07952_PF08470 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF08470 is 5184278 with Jaccard = 1.0000 |PF08470|=10 [ 10 0 1100201 0 ] parent [ 5184278 ] : 6236339 1 (=140/(10*14)) 2.06062e-14 given [ 5184278 ] : 5184278 1 (=9/(1*9)) 0 best keyword for cluster 5184278 is PF08470 with Jaccard = 1.0000 [ 10 0 1100201 0 ] 1.0000 1.0000 sibling [ 5184278 ] : 6071757 1 (=13/(1*13)) 6.85435e-28 best keyword for cluster 6071757 is PF07952 with Jaccard = 0.9286 [ 13 1 1100197 0 ] 0.9286 1.0000 SUGGESTING RELATEDNESS OF: A> PF08470 ( PF08470 Nontoxic nonhaemagglutinin C-terminal ) B> PF07952 ( PF07952 Clostridium neurotoxin, Translocation domain ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF08470 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 245 ) 6760921_PF00420_PF06235 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00420 is 6743340 with Jaccard = 0.9989 |PF00420|=929 [ 928 0 1099282 1 ] parent [ 6743340 ] : 6760921 0.0100538 (=200/(19*1047)) 99.3142 given [ 6743340 ] : 6743340 0.0240034 (=4204/(838*209)) 98.0836 best keyword for cluster 6743340 is PF00420 with Jaccard = 0.9989 [ 928 0 1099282 1 ] 1.0000 0.9989 sibling [ 6743340 ] : 6718092 0.0512821 (=4/(13*6)) 95.1635 best keyword for cluster 6718092 is PF06235 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF00420 ( PF00420 NADH-ubiquinone/plastoquinone oxidoreductase chain 4L ) B> PF06235 ( PF06235 NADH dehydrogenase subunit 4L (NAD4L) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF00420 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 246 ) 6767216_PF01545_PF05181 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01545 is 6690713 with Jaccard = 0.9977 |PF01545|=880 [ 878 0 1099331 2 ] parent [ 6690713 ] : 6767216 0.00416271 (=263/(972*65)) 99.6059 given [ 6690713 ] : 6690713 0.113659 (=3212/(30*942)) 90.4195 best keyword for cluster 6690713 is PF01545 with Jaccard = 0.9977 [ 878 0 1099331 2 ] 1.0000 0.9977 sibling [ 6690713 ] : 6746882 0.0188889 (=17/(20*45)) 98.3799 best keyword for cluster 6746882 is PF05181 with Jaccard = 0.9677 [ 30 1 1100180 0 ] 0.9677 1.0000 SUGGESTING RELATEDNESS OF: A> PF01545 ( PF01545 Cation efflux family ) B> PF05181 ( PF05181 XPA protein C-terminus ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF01545 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF05181 SSF46955 0.701 (average over 55 mutual instances, PF05181 57 appearances, SSF46955 11923 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 247 ) 6553025_PF01715_PF01745 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01715 is 6538777 with Jaccard = 0.9971 |PF01715|=339 [ 338 0 1099872 1 ] parent [ 6538777 ] : 6553025 0.653226 (=4617/(19*372)) 41.1118 given [ 6538777 ] : 6538777 0.735849 (=273/(1*371)) 31.431 best keyword for cluster 6538777 is PF01715 with Jaccard = 0.9971 [ 338 0 1099872 1 ] 1.0000 0.9971 sibling [ 6538777 ] : 6398972 1 (=18/(1*18)) 0.00723507 best keyword for cluster 6398972 is PF01745 with Jaccard = 0.8571 [ 18 0 1100190 3 ] 1.0000 0.8571 SUGGESTING RELATEDNESS OF: A> PF01715 ( PF01715 IPP transferase ) B> PF01745 ( PF01745 Isopentenyl transferase ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF01715| = 339 , |PF01745| = 21 , |PF01715^PF01745| = 3 ( 0.9% and 14.3% ) Neither PF01715 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 248 ) 6743131_PF00871_PF07318 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00871 is 6558767 with Jaccard = 0.9969 |PF00871|=321 [ 320 0 1099890 1 ] parent [ 6558767 ] : 6743131 0.0249478 (=191/(348*22)) 98.066 given [ 6558767 ] : 6558767 0.611272 (=423/(2*346)) 45.8346 best keyword for cluster 6558767 is PF00871 with Jaccard = 0.9969 [ 320 0 1099890 1 ] 1.0000 0.9969 sibling [ 6558767 ] : 6716438 0.0583333 (=7/(10*12)) 94.9506 best keyword for cluster 6716438 is PF07318 with Jaccard = 0.9231 [ 12 1 1100198 0 ] 0.9231 1.0000 SUGGESTING RELATEDNESS OF: A> PF00871 ( PF00871 Acetokinase family ) B> PF07318 ( PF07318 Protein of unknown function (DUF1464) ) Only A has a clan ( CL0108.10 ). the two keywords do not coincide on UniRef90 proteins only PF00871 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 249 ) 6685101_PF01268_PF02882 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01268 is 6574435 with Jaccard = 0.9968 |PF01268|=308 [ 307 0 1099903 1 ] parent [ 6574435 ] : 6685101 0.10742 (=12844/(318*376)) 89.2919 given [ 6574435 ] : 6574435 0.47943 (=303/(2*316)) 52.0643 best keyword for cluster 6574435 is PF01268 with Jaccard = 0.9968 [ 307 0 1099903 1 ] 1.0000 0.9968 sibling [ 6574435 ] : 6677784 0.165333 (=62/(1*375)) 87.6654 best keyword for cluster 6677784 is PF02882 with Jaccard = 0.8992 [ 339 4 1099834 34 ] 0.9883 0.9088 SUGGESTING RELATEDNESS OF: A> PF01268 ( PF01268 Formate--tetrahydrofolate ligase ) B> PF02882 ( PF02882 Tetrahydrofolate dehydrogenase/cyclohydrolase, NAD(P)-binding domain ) Only B has a clan ( CL0063.17 ). the two keywords coincide on Uniref90 proteins: |PF01268| = 308 , |PF02882| = 373 , |PF01268^PF02882| = 35 ( 11.4% and 9.4% ) both PF01268 and PF02882 have PDB structures PF01268 c.37.1.10 SUPERFAM mapping significantly overlapping: 1 PF02882 SSF51735 0.956 (average over 1086 mutual instances, PF02882 1086 appearances, SSF51735 164772 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 250 ) 6662778_PF01795_PF06962 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01795 is 6617005 with Jaccard = 0.9965 |PF01795|=284 [ 284 1 1099926 0 ] parent [ 6617005 ] : 6662778 0.197811 (=3290/(54*308)) 83.8393 given [ 6617005 ] : 6617005 0.330055 (=302/(305*3)) 69.2658 best keyword for cluster 6617005 is PF01795 with Jaccard = 0.9965 [ 284 1 1099926 0 ] 0.9965 1.0000 sibling [ 6617005 ] : 6456009 0.981132 (=52/(1*53)) 1.8999 best keyword for cluster 6456009 is PF06962 with Jaccard = 0.9800 [ 49 0 1100161 1 ] 1.0000 0.9800 SUGGESTING RELATEDNESS OF: A> PF01795 ( PF01795 MraW methylase family ) B> PF06962 ( PF06962 Putative rRNA methylase ) they come from the same clan: CL0102.14 : PF06962 PF00398 PF06325 PF03291 PF01135 PF01358 PF06460 PF01189 PF05401 PF01234 PF01555 PF02384 PF07942 PF05175 PF05063 PF07109 PF02475 PF07021 PF08003 PF05148 PF01795 PF02390 PF01596 PF00891 PF09445 PF08242 PF08241 PF05971 PF02086 PF02527 PF08704 PF01728 PF01269 PF07669 PF06080 PF05891 PF05430 PF04816 PF04672 PF04445 PF04378 PF01861 PF03269 PF03141 PF07757 PF07279 PF05219 PF08123 PF00145 PF03602 PF02353 PF01739 PF06859 PF09243 PF01564 PF03848 PF05724 PF02005 PF05958 PF01209 PF01170 the two keywords do not coincide on UniRef90 proteins only PF01795 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 251 ) 6746842_PF00015_PF02470 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02470 is 6736817 with Jaccard = 0.9964 |PF02470|=557 [ 555 0 1099654 2 ] parent [ 6736817 ] : 6746842 0.0224868 (=66499/(628*4709)) 98.375 given [ 6736817 ] : 6736817 0.0317717 (=177/(619*9)) 97.4622 best keyword for cluster 6736817 is PF02470 with Jaccard = 0.9964 [ 555 0 1099654 2 ] 1.0000 0.9964 sibling [ 6736817 ] : 6745796 0.0236725 (=35831/(347*4362)) 98.2964 best keyword for cluster 6745796 is PF00015 with Jaccard = 0.8016 [ 2735 648 1096799 29 ] 0.8085 0.9895 SUGGESTING RELATEDNESS OF: A> PF02470 ( PF02470 mce related protein ) B> PF00015 ( PF00015 Methyl-accepting chemotaxis protein (MCP) signaling domain ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF02470 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 252 ) 6658992_PF00068_PF05826 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00068 is 6645295 with Jaccard = 0.9961 |PF00068|=254 [ 253 0 1099957 1 ] parent [ 6645295 ] : 6658992 0.24362 (=2482/(36*283)) 83.0149 given [ 6645295 ] : 6645295 0.229537 (=129/(2*281)) 78.5045 best keyword for cluster 6645295 is PF00068 with Jaccard = 0.9961 [ 253 0 1099957 1 ] 1.0000 0.9961 sibling [ 6645295 ] : 6616056 0.343434 (=34/(3*33)) 68.9341 best keyword for cluster 6616056 is PF05826 with Jaccard = 1.0000 [ 32 0 1100179 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF00068 ( PF00068 Phospholipase A2 ) B> PF05826 ( PF05826 Phospholipase A2 ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF00068 and PF05826 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF05826 SSF48619 0.746 (average over 62 mutual instances, PF05826 62 appearances, SSF48619 849 appearances) 2 PF00068 SSF48619 0.990 (average over 726 mutual instances, PF00068 726 appearances, SSF48619 849 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 253 ) 6651265_PF00977_PF01884 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00977 is 6632472 with Jaccard = 0.9959 |PF00977|=486 [ 484 0 1099725 2 ] parent [ 6632472 ] : 6651265 0.231575 (=6322/(52*525)) 80.4663 given [ 6632472 ] : 6632472 0.294455 (=308/(2*523)) 75.2482 best keyword for cluster 6632472 is PF00977 with Jaccard = 0.9959 [ 484 0 1099725 2 ] 1.0000 0.9959 sibling [ 6632472 ] : 6538508 0.696429 (=468/(28*24)) 31.1492 best keyword for cluster 6538508 is PF01884 with Jaccard = 0.9804 [ 50 0 1100160 1 ] 1.0000 0.9804 SUGGESTING RELATEDNESS OF: A> PF00977 ( PF00977 Histidine biosynthesis protein ) B> PF01884 ( PF01884 PcrB family ) they come from the same clan: CL0036.17 : PF05690 PF01680 PF00834 PF01729 PF00697 PF03740 PF01884 PF00724 PF00215 PF03060 PF04095 PF04131 PF00478 PF00218 PF00977 PF01645 PF04309 PF01070 PF01207 PF04481 PF04476 PF01180 PF00701 PF01791 PF03932 PF03437 PF01081 PF00121 PF09370 PF02581 PF00290 the two keywords do not coincide on UniRef90 proteins both PF00977 and PF01884 have PDB structures PF00977 c.1.2.1 SUPERFAM mapping significantly overlapping: 1 PF00977 SSF51366 0.938 (average over 1629 mutual instances, PF00977 1632 appearances, SSF51366 8168 appearances) HMM-Logos (old pfam site):A , B 
MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 254 ) 6643704_PF00873_PF02355 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00873 is 6640682 with Jaccard = 0.9952 |PF00873|=1263 [ 1257 0 1098948 6 ] parent [ 6640682 ] : 6643704 0.250534 (=188158/(523*1436)) 78.0119 given [ 6640682 ] : 6640682 0.230447 (=1320/(4*1432)) 77.2349 best keyword for cluster 6640682 is PF00873 with Jaccard = 0.9952 [ 1257 0 1098948 6 ] 1.0000 0.9952 sibling [ 6640682 ] : 6518549 0.814485 (=55284/(239*284)) 19.5749 best keyword for cluster 6518549 is PF02355 with Jaccard = 0.9553 [ 449 20 1099741 1 ] 0.9574 0.9978 SUGGESTING RELATEDNESS OF: A> PF00873 ( PF00873 AcrB/AcrD/AcrF family ) B> PF02355 ( PF02355 Protein export membrane protein ) Only A has a clan ( CL0182.8 ). the two keywords do not coincide on UniRef90 proteins only PF00873 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 255 ) 6748546_PF00316_PF00459 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00316 is 6540891 with Jaccard = 0.9951 |PF00316|=206 [ 205 0 1100005 1 ] parent [ 6540891 ] : 6748546 0.0209252 (=4349/(223*932)) 98.5054 given [ 6540891 ] : 6540891 0.690045 (=305/(2*221)) 33.0142 best keyword for cluster 6540891 is PF00316 with Jaccard = 0.9951 [ 205 0 1100005 1 ] 1.0000 0.9951 sibling [ 6540891 ] : 6741039 0.0325564 (=3013/(113*819)) 97.88 best keyword for cluster 6741039 is PF00459 with Jaccard = 0.8765 [ 752 102 1099353 4 ] 0.8806 0.9947 SUGGESTING RELATEDNESS OF: A> PF00316 ( PF00316 Fructose-1-6-bisphosphatase ) B> PF00459 ( PF00459 Inositol monophosphatase family ) they come from the same clan: CL0171.6 : PF00316 PF03320 PF00459 the two keywords do not coincide on UniRef90 proteins both PF00316 and PF00459 have PDB structures PF00316 e.7.1.1 PF00459 e.7.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 256 ) 6752572_PF02146_PF04502 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02146 is 6733556 with Jaccard = 0.9951 |PF02146|=409 [ 407 0 1099802 2 ] parent [ 6733556 ] : 6752572 0.0122138 (=463/(81*468)) 98.8082 given [ 6733556 ] : 6733556 0.0299786 (=14/(1*467)) 97.1176 best keyword for cluster 6733556 is PF02146 with Jaccard = 0.9951 [ 407 0 1099802 2 ] 1.0000 0.9951 sibling [ 6733556 ] : 6642691 0.240506 (=38/(2*79)) 77.7892 best keyword for cluster 6642691 is PF04502 with Jaccard = 0.9863 [ 72 0 1100138 1 ] 1.0000 0.9863 SUGGESTING RELATEDNESS OF: A> PF02146 ( PF02146 Sir2 family ) B> PF04502 ( PF04502 Family of unknown function (DUF572) ) Only A has a clan ( CL0085.9 ). 
the two keywords coincide on Uniref90 proteins: |PF02146| = 409 , |PF04502| = 73 , |PF02146^PF04502| = 1 ( 0.2% and 1.4% ) only PF02146 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 257 ) 6769822_PF02416_PF07544 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02416 is 6755055 with Jaccard = 0.9951 |PF02416|=406 [ 404 0 1099805 2 ] parent [ 6755055 ] : 6769822 0.00409531 (=154/(68*553)) 99.7048 given [ 6755055 ] : 6755055 0.0130956 (=199/(29*524)) 98.9733 best keyword for cluster 6755055 is PF02416 with Jaccard = 0.9951 [ 404 0 1099805 2 ] 1.0000 0.9951 sibling [ 6755055 ] : 6751265 0.0138408 (=12/(51*17)) 98.7155 best keyword for cluster 6751265 is PF07544 with Jaccard = 0.9130 [ 21 0 1100188 2 ] 1.0000 0.9130 SUGGESTING RELATEDNESS OF: A> PF02416 ( PF02416 mttA/Hcf106 family ) B> PF07544 ( PF07544 RNA polymerase II transcription mediator ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF02416 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 258 ) 6758546_PF02699_PF04085 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF04085 is 6585954 with Jaccard = 0.9950 |PF04085|=199 [ 198 0 1100012 1 ] parent [ 6585954 ] : 6758546 0.0139672 (=832/(219*272)) 99.1852 given [ 6585954 ] : 6585954 0.486239 (=106/(1*218)) 55.8544 best keyword for cluster 6585954 is PF04085 with Jaccard = 0.9950 [ 198 0 1100012 1 ] 1.0000 0.9950 sibling [ 6585954 ] : 6737068 0.0417234 (=276/(245*27)) 97.4961 best keyword for cluster 6737068 is PF02699 with Jaccard = 0.9087 [ 209 21 1099981 0 ] 0.9087 1.0000 SUGGESTING RELATEDNESS OF: A> PF04085 ( PF04085 rod shape-determining protein MreC ) B> PF02699 ( PF02699 Preprotein translocase subunit ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF04085 nor PF02699 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 259 ) 6729668_PF01758_PF03977 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01758 is 6699795 with Jaccard = 0.9947 |PF01758|=377 [ 376 1 1099833 1 ] parent [ 6699795 ] : 6729668 0.0405675 (=1324/(69*473)) 96.6876 given [ 6699795 ] : 6699795 0.0941692 (=730/(17*456)) 92.1405 best keyword for cluster 6699795 is PF01758 with Jaccard = 0.9947 [ 376 1 1099833 1 ] 0.9973 0.9973 sibling [ 6699795 ] : 6216906 1 (=320/(5*64)) 6.56969e-16 best keyword for cluster 6216906 is PF03977 with Jaccard = 0.9839 [ 61 0 1100149 1 ] 1.0000 0.9839 SUGGESTING RELATEDNESS OF: A> PF01758 ( PF01758 Sodium Bile acid symporter family ) B> PF03977 ( PF03977 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit ) they come from the same clan: CL0064.7 : PF06826 PF03547 PF03601 PF05684 PF05982 PF03616 PF06965 PF00999 PF03977 PF01758 the two keywords do not coincide on UniRef90 proteins Neither PF01758 nor PF03977 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 260 ) 6741096_PF00177_PF05549 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF00177 is 6606993 with Jaccard = 0.9946 |PF00177|=367 [ 365 0 1099844 2 ] parent [ 6606993 ] : 6741096 0.0297619 (=140/(392*12)) 97.8852 given [ 6606993 ] : 6606993 0.353846 (=276/(2*390)) 65.1253 best keyword for cluster 6606993 is PF00177 with Jaccard = 0.9946 [ 365 0 1099844 2 ] 1.0000 0.9946 sibling [ 6606993 ] : 6668816 0.2 (=4/(10*2)) 85.195 best keyword for cluster 6668816 is PF05549 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF00177 ( PF00177 Ribosomal protein S7p/S5e ) B> PF05549 ( PF05549 Allexivirus 40kDa protein ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF00177 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF00177 SSF47973 0.955 (average over 1488 mutual instances, PF00177 1489 appearances, SSF47973 1493 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 261 ) 6754138_PF00881_PF02277 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02277 is 6696819 with Jaccard = 0.9945 |PF02277|=182 [ 181 0 1100029 1 ] parent [ 6696819 ] : 6754138 0.0116079 (=2488/(197*1088)) 98.9145 given [ 6696819 ] : 6696819 0.0972222 (=665/(45*152)) 91.7 best keyword for cluster 6696819 is PF02277 with Jaccard = 0.9945 [ 181 0 1100029 1 ] 1.0000 0.9945 sibling [ 6696819 ] : 6738255 0.0319249 (=2000/(61*1027)) 97.6168 best keyword for cluster 6738255 is PF00881 with Jaccard = 0.9383 [ 821 44 1099336 10 ] 0.9491 0.9880 SUGGESTING RELATEDNESS OF: A> PF02277 ( PF02277 Phosphoribosyltransferase ) B> PF00881 ( PF00881 Nitroreductase family ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF00881| = 831 , |PF02277| = 182 , |PF00881^PF02277| = 4 ( 0.5% and 2.2% ) both PF02277 and PF00881 have PDB structures PF02277 c.39.1.1 SUPERFAM mapping significantly overlapping: 1 PF02277 SSF52733 0.934 (average over 503 mutual instances, PF02277 510 appearances, SSF52733 515 appearances) 2 PF00881 SSF55469 0.829 (average over 2724 mutual instances, PF00881 2740 appearances, SSF55469 3051 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 262 ) 6778875_PF03152_PF04203 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF04203 is 6774760 with Jaccard = 0.9944 |PF04203|=178 [ 177 0 1100033 1 ] parent [ 6774760 ] : 6778875 0.000955779 (=30/(133*236)) 99.9354 given [ 6774760 ] : 6774760 0.00224905 (=19/(44*192)) 99.8497 best keyword for cluster 6774760 is PF04203 with Jaccard = 0.9944 [ 177 0 1100033 1 ] 1.0000 0.9944 sibling [ 6774760 ] : 6773392 0.00272603 (=12/(62*71)) 99.8155 best keyword for cluster 6773392 is PF03152 with Jaccard = 0.6304 [ 58 24 1100119 10 ] 0.7073 0.8529 SUGGESTING RELATEDNESS OF: A> PF04203 ( PF04203 Sortase family ) B> PF03152 ( PF03152 Ubiquitin fusion degradation protein UFD1 ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins both PF04203 and PF03152 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 263 ) 6751374_PF01150_PF02541 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01150 is 6711614 with Jaccard = 0.9942 |PF01150|=173 [ 172 0 1100038 1 ] parent [ 6711614 ] : 6751374 0.0188701 (=1168/(331*187)) 98.7241 given [ 6711614 ] : 6711614 0.0626984 (=79/(180*7)) 94.1834 best keyword for cluster 6711614 is PF01150 with Jaccard = 0.9942 [ 172 0 1100038 1 ] 1.0000 0.9942 sibling [ 6711614 ] : 6707944 0.074159 (=97/(327*4)) 93.6222 best keyword for cluster 6707944 is PF02541 with Jaccard = 0.9933 [ 297 0 1099912 2 ] 1.0000 0.9933 SUGGESTING RELATEDNESS OF: A> PF01150 ( PF01150 GDA1/CD39 (nucleoside phosphatase) family ) B> PF02541 ( PF02541 Ppx/GppA phosphatase family ) they come from the same clan: CL0108.10 : PF06406 PF00480 PF02541 PF00814 PF06723 PF05378 PF01968 PF00012 PF03727 PF00349 PF02685 PF01150 PF02491 PF00370 PF02782 PF02543 PF01869 PF00022 PF00871 PF03702 the two keywords do not coincide on UniRef90 proteins only PF01150 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 264 ) 6733370_PF01936_PF04396 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01936 is 6714532 with Jaccard = 0.9939 |PF01936|=164 [ 164 1 1100046 0 ] parent [ 6714532 ] : 6733370 0.0374665 (=546/(247*59)) 97.0951 given [ 6714532 ] : 6714532 0.0716487 (=186/(11*236)) 94.6519 best keyword for cluster 6714532 is PF01936 with Jaccard = 0.9939 [ 164 1 1100046 0 ] 0.9939 1.0000 sibling [ 6714532 ] : 6719105 0.0471698 (=15/(6*53)) 95.2927 best keyword for cluster 6719105 is PF04396 with Jaccard = 0.9762 [ 41 0 1100169 1 ] 1.0000 0.9762 SUGGESTING RELATEDNESS OF: A> PF01936 ( PF01936 Protein of unknown function DUF88 ) B> PF04396 ( PF04396 Protein of unknown function, DUF537 ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF01936 nor PF04396 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 265 ) 6705925_PF00950_PF01032 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01032 is 6676879 with Jaccard = 0.9938 |PF01032|=809 [ 804 0 1099402 5 ] parent [ 6676879 ] : 6705925 0.0807105 (=32731/(874*464)) 93.2869 given [ 6676879 ] : 6676879 0.15729 (=411/(3*871)) 87.4365 best keyword for cluster 6676879 is PF01032 with Jaccard = 0.9938 [ 804 0 1099402 5 ] 1.0000 0.9938 sibling [ 6676879 ] : 6697540 0.0822242 (=834/(441*23)) 91.8064 best keyword for cluster 6697540 is PF00950 with Jaccard = 0.9878 [ 404 4 1099802 1 ] 0.9902 0.9975 SUGGESTING RELATEDNESS OF: A> PF01032 ( PF01032 FecCD transport family ) B> PF00950 ( PF00950 ABC 3 transport family ) they come from the same clan: CL0142.6 : PF00950 PF05145 PF02653 PF01032 PF01098 the two keywords do not coincide on UniRef90 proteins only PF01032 has a PDB structure (may not be up to date) PF01032 f.22.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 266 ) 6763029_PF01925_PF07290 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01925 is 6700986 with Jaccard = 0.9932 |PF01925|=877 [ 871 0 1099334 6 ] parent [ 6700986 ] : 6763029 0.00696362 (=343/(1048*47)) 99.4184 given [ 6700986 ] : 6700986 0.0958065 (=17626/(223*825)) 92.3528 best keyword for cluster 6700986 is PF01925 with Jaccard = 0.9932 [ 871 0 1099334 6 ] 1.0000 0.9932 sibling [ 6700986 ] : 6751882 0.0163043 (=9/(23*24)) 98.7609 best keyword for cluster 6751882 is PF07290 with Jaccard = 0.9286 [ 13 1 1100197 0 ] 0.9286 1.0000 SUGGESTING RELATEDNESS OF: A> PF01925 ( PF01925 Domain of unknown function DUF81 ) B> PF07290 ( PF07290 Protein of unknown function (DUF1449) ) Only B has a clan ( CL0252.2 ). 
the two keywords do not coincide on UniRef90 proteins Neither PF01925 nor PF07290 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 267 ) 6524525_PF00006_PF07497 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07497 is 6391976 with Jaccard = 0.9932 |PF07497|=145 [ 145 1 1100065 0 ] parent [ 6391976 ] : 6524525 0.796193 (=152904/(164*1171)) 22.7046 given [ 6391976 ] : 6391976 1 (=163/(1*163)) 0.00281036 best keyword for cluster 6391976 is PF07497 with Jaccard = 0.9932 [ 145 1 1100065 0 ] 0.9932 1.0000 sibling [ 6391976 ] : 6500669 0.896913 (=5229/(5*1166)) 11.9606 best keyword for cluster 6500669 is PF00006 with Jaccard = 0.8694 [ 1092 4 1098955 160 ] 0.9964 0.8722 SUGGESTING RELATEDNESS OF: A> PF07497 ( PF07497 Rho termination factor, RNA-binding domain ) B> PF00006 ( PF00006 ATP synthase alpha/beta family, nucleotide-binding domain ) Only A has a clan ( CL0021.12 ). the two keywords coincide on UniRef90 proteins: |PF00006| = 1252 , |PF07497| = 145 , |PF00006^PF07497| = 144 ( 11.5% and 99.3% ) both PF07497 and PF00006 have PDB structures PF00006 b.86.1.2 c.37.1.11 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 268 ) 6737226_PF04610_PF07863 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF04610 is 6726881 with Jaccard = 0.9931 |PF04610|=144 [ 143 0 1100067 1 ] parent [ 6726881 ] : 6737226 0.0323643 (=167/(24*215)) 97.5038 given [ 6726881 ] : 6726881 0.0467715 (=452/(64*151)) 96.3417 best keyword for cluster 6726881 is PF04610 with Jaccard = 0.9931 [ 143 0 1100067 1 ] 1.0000 0.9931 sibling [ 6726881 ] : 6665083 0.181818 (=8/(2*22)) 84.2675 best keyword for cluster 6665083 is PF07863 with Jaccard = 1.0000 [ 15 0 1100196 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF04610 ( PF04610 TrbL/VirB6 plasmid conjugal transfer protein ) B> PF07863 ( PF07863 Homologues of TraJ from Bacteroides conjugative transposon ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF04610 nor PF07863 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 269 ) 6705892_PF02091_PF02092 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02091 is 6439843 with Jaccard = 0.9928 |PF02091|=138 [ 137 0 1100073 1 ] parent [ 6439843 ] : 6705892 0.0673119 (=2177/(157*206)) 93.279 given [ 6439843 ] : 6439843 0.993506 (=459/(154*3)) 0.649351 best keyword for cluster 6439843 is PF02091 with Jaccard = 0.9928 [ 137 0 1100073 1 ] 1.0000 0.9928 sibling [ 6439843 ] : 6672209 0.165854 (=34/(1*205)) 86.0968 best keyword for cluster 6672209 is PF02092 with Jaccard = 0.9381 [ 182 2 1100017 10 ] 0.9891 0.9479 SUGGESTING RELATEDNESS OF: A> PF02091 ( PF02091 Glycyl-tRNA synthetase alpha subunit ) B> PF02092 ( PF02092 Glycyl-tRNA synthetase beta subunit ) Only A has a clan ( CL0040.10 ). 
the two keywords coincide on Uniref90 proteins: |PF02091| = 138 , |PF02092| = 192 , |PF02091^PF02092| = 11 ( 8.0% and 5.7% ) only PF02091 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 270 ) 6716140_PF00768_PF02113 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02113 is 6648149 with Jaccard = 0.9926 |PF02113|=136 [ 135 0 1100075 1 ] parent [ 6648149 ] : 6716140 0.0725591 (=5029/(151*459)) 94.907 given [ 6648149 ] : 6648149 0.22973 (=102/(148*3)) 79.4139 best keyword for cluster 6648149 is PF02113 with Jaccard = 0.9926 [ 135 0 1100075 1 ] 1.0000 0.9926 sibling [ 6648149 ] : 6675816 0.14442 (=132/(2*457)) 87.111 best keyword for cluster 6675816 is PF00768 with Jaccard = 0.9881 [ 416 2 1099790 3 ] 0.9952 0.9928 SUGGESTING RELATEDNESS OF: A> PF02113 ( PF02113 D-Ala-D-Ala carboxypeptidase 3 (S13) family ) B> PF00768 ( PF00768 D-alanyl-D-alanine carboxypeptidase ) they come from the same clan: CL0013.12 : PF02113 PF00768 PF04960 PF00144 PF00905 the two keywords coincide on Uniref90 proteins: |PF00768| = 419 , |PF02113| = 136 , |PF00768^PF02113| = 1 ( 0.2% and 0.7% ) both PF02113 and PF00768 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF02113 SSF56601 0.850 (average over 447 mutual instances, PF02113 449 appearances, SSF56601 18812 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 271 ) 6723068_PF02485_PF03267 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J 
> 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02485 is 6601232 with Jaccard = 0.9924 |PF02485|=132 [ 131 0 1100079 1 ] parent [ 6601232 ] : 6723068 0.0507246 (=364/(52*138)) 95.8668 given [ 6601232 ] : 6601232 0.402523 (=989/(21*117)) 62.3826 best keyword for cluster 6601232 is PF02485 with Jaccard = 0.9924 [ 131 0 1100079 1 ] 1.0000 0.9924 sibling [ 6601232 ] : 6587498 0.490196 (=25/(1*51)) 56.3828 best keyword for cluster 6587498 is PF03267 with Jaccard = 1.0000 [ 48 0 1100163 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF02485 ( PF02485 Core-2/I-Branching enzyme ) B> PF03267 ( PF03267 Domain of unknown function, DUF266 ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF02485 nor PF03267 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 272 ) 6755148_PF03788_PF04172 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04172 is 6595757 with Jaccard = 0.9923 |PF04172|=129 [ 129 1 1100081 0 ] parent [ 6595757 ] : 6755148 0.0120537 (=262/(152*143)) 98.979 given [ 6595757 ] : 6595757 0.420582 (=188/(149*3)) 59.9851 best keyword for cluster 6595757 is PF04172 with Jaccard = 0.9923 [ 129 1 1100081 0 ] 0.9923 1.0000 sibling [ 6595757 ] : 6630440 0.282609 (=195/(138*5)) 74.829 best keyword for cluster 6630440 is PF03788 with Jaccard = 0.9764 [ 124 0 1100084 3 ] 1.0000 0.9764 SUGGESTING RELATEDNESS OF: A> PF04172 ( PF04172 LrgB-like family ) B> PF03788 ( PF03788 LrgA family ) Neither A nor B are assigned a clan. 
the two keywords coincide on UniRef90 proteins: |PF03788| = 127 , |PF04172| = 129 , |PF03788^PF04172| = 1 ( 0.8% and 0.8% ) Neither PF04172 nor PF03788 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 273 ) 6767657_PF00278_PF01168 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01168 is 6746182 with Jaccard = 0.9919 |PF01168|=613 [ 611 3 1099595 2 ] parent [ 6746182 ] : 6767657 0.00528002 (=3167/(781*768)) 99.625 given [ 6746182 ] : 6746182 0.0207981 (=3023/(306*475)) 98.3253 best keyword for cluster 6746182 is PF01168 with Jaccard = 0.9919 [ 611 3 1099595 2 ] 0.9951 0.9967 sibling [ 6746182 ] : 6766706 0.00502084 (=53/(14*754)) 99.5864 best keyword for cluster 6766706 is PF00278 with Jaccard = 0.9248 [ 615 46 1099546 4 ] 0.9304 0.9935 SUGGESTING RELATEDNESS OF: A> PF01168 ( PF01168 Alanine racemase, N-terminal domain ) B> PF00278 ( PF00278 Pyridoxal-dependent decarboxylase, C-terminal sheet domain ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF01168 and PF00278 have PDB structures PF01168 c.1.6.1 c.1.6.2 SUPERFAM mapping significantly overlapping: 1 PF00278 SSF50621 0.705 (average over 1601 mutual instances, PF00278 1639 appearances, SSF50621 5076 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 274 ) 6763955_PF01292_PF04264 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01292 is 6758683 with Jaccard = 0.9918 |PF01292|=365 [ 362 0 1099846 3 ] parent [ 6758683 ] : 6763955 0.00549582 (=920/(270*620)) 99.4633 given [ 6758683 ] : 6758683 0.00876913 (=165/(32*588)) 99.1933 best keyword for cluster 6758683 is PF01292 with Jaccard = 0.9918 [ 362 0 1099846 3 ] 1.0000 0.9918 sibling [ 6758683 ] : 6712397 0.0656566 (=104/(6*264)) 94.3155 best keyword for cluster 6712397 is PF04264 with Jaccard = 0.9674 [ 208 2 1099996 5 ] 0.9905 0.9765 SUGGESTING RELATEDNESS OF: A> PF01292 ( PF01292 Cytochrome b561 family ) B> PF04264 ( PF04264 YceI-like domain ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF01292| = 365 , |PF04264| = 213 , |PF01292^PF04264| = 4 ( 1.1% and 1.9% ) only PF01292 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF04264 SSF101874 0.935 (average over 775 mutual instances, PF04264 819 appearances, SSF101874 813 appearances) 2 PF01292 SSF81342 0.948 (average over 1156 mutual instances, PF01292 1185 appearances, SSF81342 82802 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 275 ) 6728285_PF02127_PF05343 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02127 is 6527533 with Jaccard = 0.9917 |PF02127|=120 [ 119 0 1100091 1 ] parent [ 6527533 ] : 6728285 0.0466621 (=1082/(124*187)) 96.5157 given [ 6527533 ] : 6527533 0.764228 (=94/(1*123)) 24.7605 best keyword for cluster 6527533 is PF02127 with Jaccard = 0.9917 [ 119 0 1100091 1 ] 1.0000 0.9917 sibling [ 6527533 ] : 6712483 0.0628415 (=46/(4*183)) 94.332 best keyword for cluster 6712483 is PF05343 with Jaccard = 0.9607 [ 171 0 1100033 7 ] 1.0000 0.9607 SUGGESTING RELATEDNESS OF: A> PF02127 ( PF02127 Aminopeptidase I zinc metalloprotease (M18) ) B> PF05343 ( PF05343 M42 glutamyl aminopeptidase ) they come from the same clan: CL0035.11 : PF05343 PF04389 PF01546 PF02127 PF00883 PF00246 PF05450 PF04952 the two keywords do not coincide on UniRef90 proteins both PF02127 and PF05343 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 276 ) 6678820_PF03710_PF08335 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters 
====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF03710 is 6261092 with Jaccard = 0.9917 |PF03710|=121 [ 120 0 1100090 1 ] parent [ 6261092 ] : 6678820 0.15667 (=3448/(131*168)) 87.8852 given [ 6261092 ] : 6261092 1 (=258/(129*2)) 1.9382e-12 best keyword for cluster 6261092 is PF03710 with Jaccard = 0.9917 [ 120 0 1100090 1 ] 1.0000 0.9917 sibling [ 6261092 ] : 6665383 0.183735 (=61/(2*166)) 84.3405 best keyword for cluster 6665383 is PF08335 with Jaccard = 0.7584 [ 113 35 1100062 1 ] 0.7635 0.9912 SUGGESTING RELATEDNESS OF: A> PF03710 ( PF03710 Glutamate-ammonia ligase adenylyltransferase ) B> PF08335 ( PF08335 GlnD PII-uridylyltransferase ) Only A has a clan ( CL0260.2 ). the two keywords do not coincide on UniRef90 proteins only PF03710 has a PDB structure (may not be up to date) PF03710 d.218.1.9 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 277 ) 6745083_PF03901_PF04921 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF03901 is 6607080 with Jaccard = 0.9917 |PF03901|=120 [ 119 0 1100091 1 ] parent [ 6607080 ] : 6745083 0.0183206 (=84/(131*35)) 98.2412 given [ 6607080 ] : 6607080 0.387906 (=789/(18*113)) 65.2303 best keyword for cluster 6607080 is PF03901 with Jaccard = 0.9917 [ 119 0 1100091 1 ] 1.0000 0.9917 sibling [ 6607080 ] : 6731220 0.0402299 (=7/(29*6)) 96.8569 best keyword for cluster 6731220 is PF04921 with Jaccard = 0.9333 [ 28 2 1100181 0 ] 0.9333 1.0000 SUGGESTING RELATEDNESS OF: A> PF03901 ( PF03901 Alg9-like mannosyltransferase family ) B> PF04921 ( PF04921 XAP5 protein ) Only A has a clan ( CL0111.6 ). the two keywords coincide on UniRef90 proteins: |PF03901| = 120 , |PF04921| = 28 , |PF03901^PF04921| = 1 ( 0.8% and 3.6% ) Neither PF03901 nor PF04921 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 278 ) 6755711_PF01323_PF06965 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF06965 is 6502272 with Jaccard = 0.9916 |PF06965|=119 [ 118 0 1100092 1 ] parent [ 6502272 ] : 6755711 0.0105609 (=1406/(133*1001)) 99.0137 given [ 6502272 ] : 6502272 0.900763 (=236/(2*131)) 12.1763 best keyword for cluster 6502272 is PF06965 with Jaccard = 0.9916 [ 118 0 1100092 1 ] 1.0000 0.9916 sibling [ 6502272 ] : 6743735 0.0261836 (=438/(984*17)) 98.1195 best keyword for cluster 6743735 is PF01323 with Jaccard = 0.9719 [ 518 3 1099678 12 ] 0.9942 0.9774 SUGGESTING RELATEDNESS OF: A> PF06965 ( PF06965 Na+/H+ antiporter 1 ) B> PF01323 ( PF01323 DSBA-like thioredoxin domain ) A and B come from different clans ( CL0064.7 , CL0172.11 ). 
the two keywords coincide on UniRef90 proteins: |PF01323| = 530 , |PF06965| = 119 , |PF01323^PF06965| = 4 ( 0.8% and 3.4% ) both PF06965 and PF01323 have PDB structures PF01323 c.47.1.13 SUPERFAM mapping significantly overlapping: 1 PF01323 SSF52833 0.887 (average over 1786 mutual instances, PF01323 1794 appearances, SSF52833 34965 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 279 ) 6745017_PF02321_PF07405 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02321 is 6742278 with Jaccard = 0.9915 |PF02321|=1170 [ 1164 4 1099037 6 ] parent [ 6742278 ] : 6745017 0.019257 (=339/(12*1467)) 98.2359 given [ 6742278 ] : 6742278 0.0242461 (=283/(8*1459)) 97.9984 best keyword for cluster 6742278 is PF02321 with Jaccard = 0.9915 [ 1164 4 1099037 6 ] 0.9966 0.9949 sibling [ 6742278 ] : 6723188 0.0571429 (=2/(7*5)) 95.8857 best keyword for cluster 6723188 is PF07405 with Jaccard = 0.7500 [ 3 1 1100207 0 ] 0.7500 1.0000 SUGGESTING RELATEDNESS OF: A> PF02321 ( PF02321 Outer membrane efflux protein ) B> PF07405 ( PF07405 Protein of unknown function (DUF1506) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF02321 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 280 ) 6629352_PF01053_PF06838 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01053 is 6621731 with Jaccard = 0.9914 |PF01053|=815 [ 809 1 1099395 6 ] parent [ 6621731 ] : 6629352 0.29154 (=15721/(61*884)) 74.3192 given [ 6621731 ] : 6621731 0.436014 (=385/(1*883)) 71.2134 best keyword for cluster 6621731 is PF01053 with Jaccard = 0.9914 [ 809 1 1099395 6 ] 0.9988 0.9926 sibling [ 6621731 ] : 6560120 0.551515 (=182/(6*55)) 46.9849 best keyword for cluster 6560120 is PF06838 with Jaccard = 0.9455 [ 52 3 1100156 0 ] 0.9455 1.0000 SUGGESTING RELATEDNESS OF: A> PF01053 ( PF01053 Cys/Met metabolism PLP-dependent enzyme ) B> PF06838 ( PF06838 Aluminium resistance protein ) they come from the same clan: CL0061.8 : PF05889 PF00464 PF03841 PF00282 PF01276 PF02347 PF01041 PF01053 PF01212 PF00266 PF00202 PF00155 PF06838 PF04864 the two keywords do not coincide on UniRef90 proteins only PF01053 has a PDB structure (may not be up to date) PF01053 c.67.1.3 SUPERFAM mapping significantly overlapping: 1 PF06838 SSF53383 0.898 (average over 166 mutual instances, PF06838 167 appearances, SSF53383 34644 appearances) 2 PF01053 SSF53383 0.965 (average over 2570 mutual instances, PF01053 2583 appearances, SSF53383 34644 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 281 ) 6768387_PF02221_PF06011 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02221 is 6758279 with Jaccard = 0.9912 |PF02221|=113 [ 112 0 1100098 1 ] parent [ 6758279 ] : 6768387 0.00479167 (=69/(160*90)) 99.6532 given [ 6758279 ] : 6758279 0.0111989 (=34/(138*22)) 99.1694 best keyword for cluster 6758279 is PF02221 with Jaccard = 0.9912 [ 112 0 1100098 1 ] 1.0000 0.9912 sibling [ 6758279 ] : 6767238 0.011236 (=1/(1*89)) 99.6067 best keyword for cluster 6767238 is PF06011 with Jaccard = 0.9508 [ 58 3 1100150 0 ] 0.9508 1.0000 SUGGESTING RELATEDNESS OF: A> PF02221 ( PF02221 ML domain ) B> PF06011 ( PF06011 Transient receptor potential (TRP) ion channel ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF02221 has a PDB structure (may not be up to date) PF02221 b.1.18.7 SUPERFAM mapping significantly overlapping: 1 PF02221 SSF81296 0.978 (average over 173 mutual instances, PF02221 173 appearances, SSF81296 30857 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 282 ) 6758016_PF01026_PF02126 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01026 is 6741382 with Jaccard = 0.9907 |PF01026|=536 [ 532 1 1099674 4 ] parent [ 6741382 ] : 6758016 0.0114149 (=314/(46*598)) 99.1546 given [ 6741382 ] : 6741382 0.022766 (=147/(587*11)) 97.9137 best keyword for cluster 6741382 is PF01026 with Jaccard = 0.9907 [ 532 1 1099674 4 ] 0.9981 0.9925 sibling [ 6741382 ] : 6482848 0.94375 (=453/(30*16)) 6.35177 best keyword for cluster 6482848 is PF02126 with Jaccard = 0.9333 [ 42 0 1100166 3 ] 1.0000 0.9333 SUGGESTING RELATEDNESS OF: A> PF01026 ( PF01026 TatD related DNase ) B> PF02126 ( PF02126 Phosphotriesterase family ) they come from the same clan: CL0034.9 : PF01979 PF04909 PF07969 PF00962 PF01244 PF02811 PF02126 PF01026 the two keywords do not coincide on UniRef90 proteins both PF01026 and PF02126 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 283 ) 6769239_PF02329_PF03637 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03637 is 6549674 with Jaccard = 0.9906 |PF03637|=106 [ 105 0 1100105 1 ] parent [ 6549674 ] : 6769239 0.00431732 (=16/(109*34)) 99.6847 given [ 6549674 ] : 6549674 0.643468 (=1054/(18*91)) 38.6649 best keyword for cluster 6549674 is PF03637 with Jaccard = 0.9906 [ 105 0 1100105 1 ] 1.0000 0.9906 sibling [ 6549674 ] : 6747293 0.0166667 (=4/(24*10)) 98.4108 best keyword for cluster 6747293 is PF02329 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03637 ( PF03637 Mob1/phocein family ) B> PF02329 ( PF02329 Histidine carboxylase PI chain ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF03637 and PF02329 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF03637 SSF101152 0.918 (average over 261 mutual instances, PF03637 261 appearances, SSF101152 271 appearances) 2 PF02329 SSF56271 0.959 (average over 22 mutual instances, PF02329 22 appearances, SSF56271 104 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 284 ) 6614036_PF04168_PF04169 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04169 is 6514081 with Jaccard = 0.9905 |PF04169|=104 [ 104 1 1100106 0 ] parent [ 6514081 ] : 6614036 0.322947 (=2674/(115*72)) 68.0623 given [ 6514081 ] : 6514081 0.836283 (=189/(2*113)) 17.0344 best keyword for cluster 6514081 is PF04169 with Jaccard = 0.9905 [ 104 1 1100106 0 ] 0.9905 1.0000 sibling [ 6514081 ] : 6554825 0.742857 (=104/(2*70)) 42.674 best keyword for cluster 6554825 is PF04168 with Jaccard = 0.6633 [ 65 0 1100113 33 ] 1.0000 0.6633 SUGGESTING RELATEDNESS OF: A> PF04169 ( PF04169 Domain of unknown function (DUF404) ) B> PF04168 ( PF04168 Bacterial domain of unknown function (DUF403) ) Neither A nor B are assigned a clan. 
the two keywords coincide on UniRef90 proteins: |PF04168| = 98 , |PF04169| = 104 , |PF04168^PF04169| = 33 ( 33.7% and 31.7% ) Neither PF04169 nor PF04168 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 285 ) 6745410_PF00416_PF06831 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF06831 is 6553051 with Jaccard = 0.9904 |PF06831|=309 [ 309 3 1099899 0 ] parent [ 6553051 ] : 6745410 0.0267345 (=3320/(344*361)) 98.2648 given [ 6553051 ] : 6553051 0.647455 (=2786/(13*331)) 41.1455 best keyword for cluster 6553051 is PF06831 with Jaccard = 0.9904 [ 309 3 1099899 0 ] 0.9904 1.0000 sibling [ 6553051 ] : 6737622 0.0527778 (=19/(1*360)) 97.5483 best keyword for cluster 6737622 is PF00416 with Jaccard = 0.9969 [ 320 0 1099890 1 ] 1.0000 0.9969 SUGGESTING RELATEDNESS OF: A> PF06831 ( PF06831 Formamidopyrimidine-DNA glycosylase H2TH domain ) B> PF00416 ( PF00416 Ribosomal protein S13/S18 ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF06831 and PF00416 have PDB structures PF06831 a.156.1.2 SUPERFAM mapping significantly overlapping: 1 PF00416 SSF46946 0.898 (average over 1258 mutual instances, PF00416 1260 appearances, SSF46946 3615 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 286 ) 6759015_PF01730_PF02814 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01730 is 6562784 with Jaccard = 0.9903 |PF01730|=103 [ 102 0 1100108 1 ] parent [ 6562784 ] : 6759015 0.00816143 (=127/(133*117)) 99.2108 given [ 6562784 ] : 6562784 0.522901 (=137/(2*131)) 49.214 best keyword for cluster 6562784 is PF01730 with Jaccard = 0.9903 [ 102 0 1100108 1 ] 1.0000 0.9903 sibling [ 6562784 ] : 6752344 0.0194175 (=28/(103*14)) 98.7923 best keyword for cluster 6752344 is PF02814 with Jaccard = 0.9451 [ 86 5 1100120 0 ] 0.9451 1.0000 SUGGESTING RELATEDNESS OF: A> PF01730 ( PF01730 UreF ) B> PF02814 ( PF02814 UreE urease accessory protein, N-terminal domain ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF01730| = 103 , |PF02814| = 86 , |PF01730^PF02814| = 1 ( 1.0% and 1.2% ) only PF01730 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 287 ) 6650867_PF01222_PF06966 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01222 is 6554368 with Jaccard = 0.9902 |PF01222|=102 [ 101 0 1100109 1 ] parent [ 6554368 ] : 6650867 0.250959 (=2355/(102*92)) 80.2748 given [ 6554368 ] : 6554368 0.643564 (=65/(1*101)) 42.2253 best keyword for cluster 6554368 is PF01222 with Jaccard = 0.9902 [ 101 0 1100109 1 ] 1.0000 0.9902 sibling [ 6554368 ] : 6598067 0.450893 (=303/(8*84)) 60.6577 best keyword for cluster 6598067 is PF06966 with Jaccard = 0.9419 [ 81 2 1100125 3 ] 0.9759 0.9643 SUGGESTING RELATEDNESS OF: A> PF01222 ( PF01222 Ergosterol biosynthesis ERG4/ERG24 family ) B> PF06966 ( PF06966 Protein of unknown function (DUF1295) ) they come from the same clan: CL0115.7 : PF04191 PF04140 PF01222 PF06966 PF02544 the two keywords do not coincide on UniRef90 proteins Neither PF01222 nor PF06966 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 288 ) 6770153_PF03006_PF04080 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03006 is 6768942 with Jaccard = 0.9902 |PF03006|=305 [ 302 0 1099906 3 ] parent [ 6768942 ] : 6770153 0.00475135 (=135/(77*369)) 99.7158 given [ 6768942 ] : 6768942 0.00340582 (=26/(22*347)) 99.6746 best keyword for cluster 6768942 is PF03006 with Jaccard = 0.9902 [ 302 0 1099906 3 ] 1.0000 0.9902 sibling [ 6768942 ] : 6760055 0.0107962 (=16/(38*39)) 99.2667 best keyword for cluster 6760055 is PF04080 with Jaccard = 1.0000 [ 34 0 1100177 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03006 ( PF03006 Haemolysin-III related ) B> PF04080 ( PF04080 Per1-like ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF03006 nor PF04080 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 289 ) 6750217_PF01187_PF01361 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01361 is 6738423 with Jaccard = 0.9897 |PF01361|=194 [ 192 0 1100017 2 ] parent [ 6738423 ] : 6750217 0.0180963 (=377/(83*251)) 98.6341 given [ 6738423 ] : 6738423 0.0308642 (=60/(8*243)) 97.6298 best keyword for cluster 6738423 is PF01361 with Jaccard = 0.9897 [ 192 0 1100017 2 ] 1.0000 0.9897 sibling [ 6738423 ] : 6652507 0.195833 (=47/(3*80)) 80.8467 best keyword for cluster 6652507 is PF01187 with Jaccard = 0.9714 [ 68 1 1100141 1 ] 0.9855 0.9855 SUGGESTING RELATEDNESS OF: A> PF01361 ( PF01361 Tautomerase enzyme ) B> PF01187 ( PF01187 Macrophage migration inhibitory factor (MIF) ) they come from the same clan: CL0082.7 : PF02962 PF01187 PF01361 the two keywords do not coincide on UniRef90 proteins both PF01361 and PF01187 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 290 ) 6745290_PF00049_PF03488 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF00049 is 6730824 with Jaccard = 0.9890 |PF00049|=182 [ 180 0 1100029 2 ] parent [ 6730824 ] : 6745290 0.0286305 (=291/(44*231)) 98.2546 given [ 6730824 ] : 6730824 0.0505165 (=489/(55*176)) 96.8109 best keyword for cluster 6730824 is PF00049 with Jaccard = 0.9890 [ 180 0 1100029 2 ] 1.0000 0.9890 sibling [ 6730824 ] : 6713740 0.0810811 (=21/(7*37)) 94.5119 best keyword for cluster 6713740 is PF03488 with Jaccard = 1.0000 [ 25 0 1100186 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF00049 ( PF00049 Insulin/IGF/Relaxin family ) B> PF03488 ( PF03488 Nematode insulin-related peptide beta type ) they come from the same clan: CL0239.3 : PF00049 PF03488 the two keywords do not coincide on UniRef90 proteins only PF00049 has a PDB structure (may not be up to date) PF00049 g.1.1.1 j.75.1.1 SUPERFAM mapping significantly overlapping: 1 PF00049 SSF56994 0.776 (average over 578 mutual instances, PF00049 578 appearances, SSF56994 698 appearances) 2 PF03488 SSF56994 0.852 (average over 26 mutual instances, PF03488 26 appearances, SSF56994 698 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 291 ) 6607664_PF02538_PF05378 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02538 is 6537248 with Jaccard = 0.9887 |PF02538|=177 [ 175 0 1100034 2 ] parent [ 6537248 ] : 6607664 0.346224 (=13402/(207*187)) 65.5373 given [ 6537248 ] : 6537248 0.727027 (=269/(2*185)) 30.299 best keyword for cluster 6537248 is PF02538 with Jaccard = 0.9887 [ 175 0 1100034 2 ] 1.0000 0.9887 sibling [ 6537248 ] : 6474067 0.958146 (=9821/(125*82)) 4.4726 best keyword for cluster 6474067 is PF05378 with Jaccard = 0.7462 [ 194 3 1099951 63 ] 0.9848 0.7549 SUGGESTING RELATEDNESS OF: A> PF02538 ( PF02538 Hydantoinase B/oxoprolinase ) B> PF05378 ( PF05378 Hydantoinase/oxoprolinase N-terminal region ) Only B has a clan ( CL0108.10 ). the two keywords coincide on UniRef90 proteins: |PF02538| = 177 , |PF05378| = 257 , |PF02538^PF05378| = 60 ( 33.9% and 23.3% ) Neither PF02538 nor PF05378 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF05378 SSF53383 0.793 (average over 1 mutual instances, PF05378 1 appearances, SSF53383 34644 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 292 ) 6755212_PF00891_PF02545 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02545 is 6674688 with Jaccard = 0.9887 |PF02545|=353 [ 349 0 1099858 4 ] parent [ 6674688 ] : 6755212 0.0103212 (=2268/(391*562)) 98.9836 given [ 6674688 ] : 6674688 0.166667 (=194/(388*3)) 86.8243 best keyword for cluster 6674688 is PF02545 with Jaccard = 0.9887 [ 349 0 1099858 4 ] 1.0000 0.9887 sibling [ 6674688 ] : 6751447 0.0172977 (=667/(80*482)) 98.7291 best keyword for cluster 6751447 is PF00891 with Jaccard = 0.6996 [ 368 146 1099685 12 ] 0.7160 0.9684 SUGGESTING RELATEDNESS OF: A> PF02545 ( PF02545 Maf-like protein ) B> PF00891 ( PF00891 O-methyltransferase ) A and B come from different clans ( CL0269.2 , CL0102.14 ). the two keywords coincide on UniRef90 proteins: |PF00891| = 380 , |PF02545| = 353 , |PF00891^PF02545| = 7 ( 1.8% and 2.0% ) both PF02545 and PF00891 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 293 ) 6691574_PF02317_PF07479 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07479 is 6652458 with Jaccard = 0.9884 |PF07479|=344 [ 342 2 1099865 2 ] parent [ 6652458 ] : 6691574 0.114798 (=2591/(370*61)) 90.5913 given [ 6652458 ] : 6652458 0.203252 (=75/(1*369)) 80.8118 best keyword for cluster 6652458 is PF07479 with Jaccard = 0.9884 [ 342 2 1099865 2 ] 0.9942 0.9942 sibling [ 6652458 ] : 6675162 0.132143 (=37/(56*5)) 86.9912 best keyword for cluster 6675162 is PF02317 with Jaccard = 0.9818 [ 54 1 1100156 0 ] 0.9818 1.0000 SUGGESTING RELATEDNESS OF: A> PF07479 ( PF07479 NAD-dependent glycerol-3-phosphate dehydrogenase C-terminus ) B> PF02317 ( PF02317 NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain ) Only A has a clan ( CL0106.7 ). the two keywords do not coincide on UniRef90 proteins both PF07479 and PF02317 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF02317 SSF48179 0.882 (average over 195 mutual instances, PF02317 361 appearances, SSF48179 20570 appearances) 2 PF07479 SSF48179 0.946 (average over 1191 mutual instances, PF07479 2352 appearances, SSF48179 20570 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 294 ) 6758849_PF01757_PF04235 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01757 is 6750455 with Jaccard = 0.9877 |PF01757|=733 [ 724 0 1099478 9 ] parent [ 6750455 ] : 6758849 0.0118531 (=3730/(315*999)) 99.2011 given [ 6750455 ] : 6750455 0.0144163 (=473/(34*965)) 98.6519 best keyword for cluster 6750455 is PF01757 with Jaccard = 0.9877 [ 724 0 1099478 9 ] 1.0000 0.9877 sibling [ 6750455 ] : 6748206 0.0221728 (=449/(225*90)) 98.4834 best keyword for cluster 6748206 is PF04235 with Jaccard = 0.6290 [ 78 46 1100087 0 ] 0.6290 1.0000 SUGGESTING RELATEDNESS OF: A> PF01757 ( PF01757 Acyltransferase family ) B> PF04235 ( PF04235 Protein of unknown function (DUF418) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF01757 nor PF04235 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 295 ) 6767541_PF02082_PF03631 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03631 is 6730402 with Jaccard = 0.9873 |PF03631|=315 [ 311 0 1099896 4 ] parent [ 6730402 ] : 6767541 0.00478897 (=964/(368*547)) 99.6202 given [ 6730402 ] : 6730402 0.0366692 (=144/(11*357)) 96.7636 best keyword for cluster 6730402 is PF03631 with Jaccard = 0.9873 [ 311 0 1099896 4 ] 1.0000 0.9873 sibling [ 6730402 ] : 6766996 0.00633903 (=89/(520*27)) 99.598 best keyword for cluster 6766996 is PF02082 with Jaccard = 0.9307 [ 470 3 1099706 32 ] 0.9937 0.9363 SUGGESTING RELATEDNESS OF: A> PF03631 ( PF03631 Ribonuclease BN-like family ) B> PF02082 ( PF02082 Transcriptional regulator ) Only B has a clan ( CL0123.12 ). 
the two keywords coincide on UniRef90 proteins: |PF02082| = 502 , |PF03631| = 315 , |PF02082^PF03631| = 5 ( 1.0% and 1.6% ) only PF03631 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 296 ) 6737647_PF01330_PF02132 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01330 is 6697737 with Jaccard = 0.9870 |PF01330|=228 [ 227 2 1099981 1 ] parent [ 6697737 ] : 6737647 0.0476279 (=2811/(227*260)) 97.5529 given [ 6697737 ] : 6697737 0.111543 (=86/(257*3)) 91.8403 best keyword for cluster 6697737 is PF01330 with Jaccard = 0.9870 [ 227 2 1099981 1 ] 0.9913 0.9956 sibling [ 6697737 ] : 6475873 0.969027 (=219/(1*226)) 4.80038 best keyword for cluster 6475873 is PF02132 with Jaccard = 0.9507 [ 193 10 1100008 0 ] 0.9507 1.0000 SUGGESTING RELATEDNESS OF: A> PF01330 ( PF01330 RuvA N terminal domain ) B> PF02132 ( PF02132 RecR protein ) Only A has a clan ( CL0021.12 ). the two keywords do not coincide on UniRef90 proteins both PF01330 and PF02132 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF01330 SSF50249 0.982 (average over 769 mutual instances, PF01330 2238 appearances, SSF50249 52669 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 297 ) 6682370_PF00636_PF02137 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02137 is 6551095 with Jaccard = 0.9870 |PF02137|=77 [ 76 0 1100134 1 ] parent [ 6551095 ] : 6682370 0.128951 (=4900/(79*481)) 88.8153 given [ 6551095 ] : 6551095 0.609649 (=139/(3*76)) 39.8307 best keyword for cluster 6551095 is PF02137 with Jaccard = 0.9870 [ 76 0 1100134 1 ] 1.0000 0.9870 sibling [ 6551095 ] : 6656333 0.215546 (=513/(5*476)) 82.0966 best keyword for cluster 6656333 is PF00636 with Jaccard = 0.7120 [ 356 88 1099711 56 ] 0.8018 0.8641 SUGGESTING RELATEDNESS OF: A> PF02137 ( PF02137 Adenosine-deaminase (editase) domain ) B> PF00636 ( PF00636 RNase3 domain ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins both PF02137 and PF00636 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF00636 SSF69065 0.604 (average over 1400 mutual instances, PF00636 1401 appearances, SSF69065 2883 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 298 ) 6673857_PF00313_PF06961 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF06961 is 6503442 with Jaccard = 0.9867 |PF06961|=75 [ 74 0 1100136 1 ] parent [ 6503442 ] : 6673857 0.146562 (=9794/(81*825)) 86.6022 given [ 6503442 ] : 6503442 0.905063 (=143/(2*79)) 12.8527 best keyword for cluster 6503442 is PF06961 with Jaccard = 0.9867 [ 74 0 1100136 1 ] 1.0000 0.9867 sibling [ 6503442 ] : 6673302 0.160171 (=526/(4*821)) 86.4364 best keyword for cluster 6673302 is PF00313 with Jaccard = 0.9633 [ 709 5 1099475 22 ] 0.9930 0.9699 SUGGESTING RELATEDNESS OF: A> PF06961 ( PF06961 Protein of unknown function (DUF1294) ) B> PF00313 ( PF00313 'Cold-shock' DNA-binding domain ) Only B has a clan ( CL0021.12 ). the two keywords coincide on UniRef90 proteins: |PF00313| = 731 , |PF06961| = 75 , |PF00313^PF06961| = 14 ( 1.9% and 18.7% ) only PF06961 has a PDB structure (may not be up to date) PF00313 b.40.4.5 SUPERFAM mapping significantly overlapping: 1 PF00313 SSF50249 0.971 (average over 2759 mutual instances, PF00313 2782 appearances, SSF50249 52669 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 299 ) 6594634_PF00478_PF01070 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF00478 is 6560532 with Jaccard = 0.9866 |PF00478|=374 [ 369 0 1099837 5 ] parent [ 6560532 ] : 6594634 0.486234 (=80513/(415*399)) 59.2515 given [ 6560532 ] : 6560532 0.565823 (=894/(4*395)) 47.1569 best keyword for cluster 6560532 is PF00478 with Jaccard = 0.9866 [ 369 0 1099837 5 ] 1.0000 0.9866 sibling [ 6560532 ] : 6553162 0.646349 (=22767/(119*296)) 41.2847 best keyword for cluster 6553162 is PF01070 with Jaccard = 0.8529 [ 313 45 1099844 9 ] 0.8743 0.9720 SUGGESTING RELATEDNESS OF: A> PF00478 ( PF00478 IMP dehydrogenase / GMP reductase domain ) B> PF01070 ( PF01070 FMN-dependent dehydrogenase ) they come from the same clan: CL0036.17 : PF05690 PF01680 PF00834 PF01729 PF00697 PF03740 PF01884 PF00724 PF00215 PF03060 PF04095 PF04131 PF00478 PF00218 PF00977 PF01645 PF04309 PF01070 PF01207 PF04481 PF04476 PF01180 PF00701 PF01791 PF03932 PF03437 PF01081 PF00121 PF09370 PF02581 PF00290 the two keywords do not coincide on UniRef90 proteins both PF00478 and PF01070 have PDB structures PF00478 c.1.5.1 PF01070 c.1.4.1 SUPERFAM mapping significantly overlapping: 1 PF00478 SSF51621 0.696 (average over 2 mutual instances, PF00478 2 appearances, SSF51621 12495 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 300 ) 6752904_PF01311_PF01312 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01311 is 6624208 with Jaccard = 0.9866 |PF01311|=224 [ 221 0 1099987 3 ] parent [ 6624208 ] : 6752904 0.0119479 (=909/(317*240)) 98.8319 given [ 6624208 ] : 6624208 0.287815 (=137/(2*238)) 72.1473 best keyword for cluster 6624208 is PF01311 with Jaccard = 0.9866 [ 221 0 1099987 3 ] 1.0000 0.9866 sibling [ 6624208 ] : 6522545 0.813291 (=257/(1*316)) 21.4382 best keyword for cluster 6522545 is PF01312 with Jaccard = 0.9861 [ 284 0 1099923 4 ] 1.0000 0.9861 SUGGESTING RELATEDNESS OF: A> PF01311 ( PF01311 Bacterial export proteins, family 1 ) B> PF01312 ( PF01312 FlhB HrpN YscU SpaS Family ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF01311| = 224 , |PF01312| = 288 , |PF01311^PF01312| = 3 ( 1.3% and 1.0% ) Neither PF01311 nor PF01312 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 301 ) 6724834_PF00076_PF05383 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF05383 is 6549022 with Jaccard = 0.9864 |PF05383|=147 [ 145 0 1100064 2 ] parent [ 6549022 ] : 6724834 0.0475652 (=30729/(155*4168)) 96.088 given [ 6549022 ] : 6549022 0.626172 (=1469/(17*138)) 38.033 best keyword for cluster 6549022 is PF05383 with Jaccard = 0.9864 [ 145 0 1100064 2 ] 1.0000 0.9864 sibling [ 6549022 ] : 6716126 0.0578344 (=29681/(127*4041)) 94.9031 best keyword for cluster 6716126 is PF00076 with Jaccard = 0.8118 [ 3459 217 1095950 585 ] 0.9410 0.8553 SUGGESTING RELATEDNESS OF: A> PF05383 ( PF05383 La domain ) B> PF00076 ( PF00076 RNA recognition motif. (a.k.a. 
RRM, RBD, or RNP domain) ) Only B has a clan ( CL0221.5 ). the two keywords coincide on UniRef90 proteins: |PF00076| = 4044 , |PF05383| = 147 , |PF00076^PF05383| = 35 ( 0.9% and 23.8% ) both PF05383 and PF00076 have PDB structures PF05383 a.4.5.46 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 302 ) 6644915_PF00112_PF03051 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03051 is 6633203 with Jaccard = 0.9861 |PF03051|=72 [ 71 0 1100139 1 ] parent [ 6633203 ] : 6644915 0.288649 (=27663/(76*1261)) 78.4904 given [ 6633203 ] : 6633203 0.306667 (=23/(1*75)) 75.3572 best keyword for cluster 6633203 is PF03051 with Jaccard = 0.9861 [ 71 0 1100139 1 ] 1.0000 0.9861 sibling [ 6633203 ] : 6635277 0.28254 (=356/(1*1260)) 75.8182 best keyword for cluster 6635277 is PF00112 with Jaccard = 0.9572 [ 1163 28 1098996 24 ] 0.9765 0.9798 SUGGESTING RELATEDNESS OF: A> PF03051 ( PF03051 Peptidase C1-like family ) B> PF00112 ( PF00112 Papain family cysteine protease ) they come from the same clan: CL0125.9 : PF08715 PF01707 PF03569 PF01830 PF00851 PF03543 PF03416 PF05543 PF05533 PF03412 PF05415 PF05412 PF05411 PF05410 PF05408 PF05407 PF05379 PF05381 PF00648 PF03051 PF01831 PF01088 PF01640 PF00112 PF00877 PF05257 PF05382 the two keywords do not coincide on UniRef90 proteins both PF03051 and PF00112 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 303 ) 6510097_PF00330_PF06434 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== 
(given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF06434 is 6449454 with Jaccard = 0.9861 |PF06434|=71 [ 71 1 1100139 0 ] parent [ 6449454 ] : 6510097 0.852824 (=44601/(79*662)) 15.4905 given [ 6449454 ] : 6449454 0.987179 (=77/(1*78)) 1.28205 best keyword for cluster 6449454 is PF06434 with Jaccard = 0.9861 [ 71 1 1100139 0 ] 0.9861 1.0000 sibling [ 6449454 ] : 6497283 0.896804 (=2946/(5*657)) 10.5348 best keyword for cluster 6497283 is PF00330 with Jaccard = 0.8779 [ 604 4 1099523 80 ] 0.9934 0.8830 SUGGESTING RELATEDNESS OF: A> PF06434 ( PF06434 Aconitate hydratase 2 N-terminus ) B> PF00330 ( PF00330 Aconitase family (aconitate hydratase) ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF00330| = 684 , |PF06434| = 71 , |PF00330^PF06434| = 70 ( 10.2% and 98.6% ) both PF06434 and PF00330 have PDB structures PF00330 c.83.1.1 SUPERFAM mapping significantly overlapping: 1 PF00330 SSF53732 0.902 (average over 2340 mutual instances, PF00330 4017 appearances, SSF53732 3709 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 304 ) 6766417_PF02535_PF03773 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02535 is 6756659 with Jaccard = 0.9858 |PF02535|=493 [ 487 1 1099717 6 ] parent [ 6756659 ] : 6766417 0.00559701 (=888/(592*268)) 99.5743 given [ 6756659 ] : 6756659 0.0112069 (=78/(580*12)) 99.0707 best keyword for cluster 6756659 is PF02535 with Jaccard = 0.9858 [ 487 1 1099717 6 ] 0.9980 0.9878 sibling [ 6756659 ] : 6764145 0.00674916 (=24/(254*14)) 99.4723 best keyword for cluster 6764145 is PF03773 with Jaccard = 0.9854 [ 202 3 1100006 0 ] 0.9854 1.0000 SUGGESTING RELATEDNESS OF: A> PF02535 ( PF02535 ZIP Zinc transporter ) B> PF03773 ( PF03773 Predicted permease ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF02535 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 305 ) 6733106_PF04237_PF04944 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF04237 is 6600049 with Jaccard = 0.9855 |PF04237|=69 [ 68 0 1100142 1 ] parent [ 6600049 ] : 6733106 0.0350708 (=109/(84*37)) 97.0666 given [ 6600049 ] : 6600049 0.416964 (=585/(23*61)) 61.5661 best keyword for cluster 6600049 is PF04237 with Jaccard = 0.9855 [ 68 0 1100142 1 ] 1.0000 0.9855 sibling [ 6600049 ] : 6670205 0.175 (=28/(5*32)) 85.5747 best keyword for cluster 6670205 is PF04944 with Jaccard = 0.9524 [ 20 1 1100190 0 ] 0.9524 1.0000 SUGGESTING RELATEDNESS OF: A> PF04237 ( PF04237 Protein of unknown function (DUF419) ) B> PF04944 ( PF04944 Uncharacterised BCR (COG3801) ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF04237 has a PDB structure (may not be up to date) PF04237 d.198.3.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 306 ) 6768972_PF07396_PF07642 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07396 is 6749690 with Jaccard = 0.9855 |PF07396|=69 [ 68 0 1100142 1 ] parent [ 6749690 ] : 6768972 0.0044603 (=40/(38*236)) 99.6757 given [ 6749690 ] : 6749690 0.0190538 (=265/(114*122)) 98.599 best keyword for cluster 6749690 is PF07396 with Jaccard = 0.9855 [ 68 0 1100142 1 ] 1.0000 0.9855 sibling [ 6749690 ] : 6743610 0.027027 (=1/(1*37)) 98.1081 best keyword for cluster 6743610 is PF07642 with Jaccard = 1.0000 [ 10 0 1100201 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07396 ( PF07396 Phosphate-selective porin O and P ) B> PF07642 ( PF07642 Protein of unknown function (DUF1597) ) Only A has a clan ( CL0193.8 ). the two keywords do not coincide on UniRef90 proteins Neither PF07396 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 307 ) 6761897_PF02949_PF08395 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02949 is 6645471 with Jaccard = 0.9852 |PF02949|=202 [ 200 1 1100008 2 ] parent [ 6645471 ] : 6761897 0.00846702 (=399/(204*231)) 99.365 given [ 6645471 ] : 6645471 0.218905 (=132/(3*201)) 78.6399 best keyword for cluster 6645471 is PF02949 with Jaccard = 0.9852 [ 200 1 1100008 2 ] 0.9950 0.9901 sibling [ 6645471 ] : 6760503 0.00888743 (=27/(14*217)) 99.2911 best keyword for cluster 6760503 is PF08395 with Jaccard = 0.8587 [ 158 20 1100027 6 ] 0.8876 0.9634 SUGGESTING RELATEDNESS OF: A> PF02949 ( PF02949 7tm Odorant receptor ) B> PF08395 ( PF08395 7tm Chemosensory receptor ) they come from the same clan: CL0176.5 : PF02949 PF08395 PF03268 PF06151 the two keywords coincide on Uniref90 proteins: |PF02949| = 202 , |PF08395| = 164 , |PF02949^PF08395| = 1 ( 0.5% and 0.6% ) Neither PF02949 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 308 ) 6719873_PF01199_PF07891 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01199 is 6533282 with Jaccard = 0.9851 |PF01199|=67 [ 66 0 1100144 1 ] parent [ 6533282 ] : 6719873 0.0464345 (=56/(67*18)) 95.397 given [ 6533282 ] : 6533282 0.742424 (=49/(1*66)) 27.9689 best keyword for cluster 6533282 is PF01199 with Jaccard = 0.9851 [ 66 0 1100144 1 ] 1.0000 0.9851 sibling [ 6533282 ] : 6677252 0.125 (=10/(8*10)) 87.5001 best keyword for cluster 6677252 is PF07891 with Jaccard = 0.9167 [ 11 0 1100199 1 ] 1.0000 0.9167 SUGGESTING RELATEDNESS OF: A> PF01199 ( PF01199 Ribosomal protein L34e ) B> PF07891 ( PF07891 Protein of unknown function (DUF1666) ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF01199| = 67 , |PF07891| = 12 , |PF01199^PF07891| = 1 ( 1.5% and 8.3% ) Neither PF01199 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 309 ) 6763000_PF00210_PF00301 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00210 is 6725009 with Jaccard = 0.9850 |PF00210|=657 [ 655 8 1099546 2 ] parent [ 6725009 ] : 6763000 0.0082766 (=3783/(742*616)) 99.4169 given [ 6725009 ] : 6725009 0.0485667 (=6433/(443*299)) 96.1086 best keyword for cluster 6725009 is PF00210 with Jaccard = 0.9850 [ 655 8 1099546 2 ] 0.9879 0.9970 sibling [ 6725009 ] : 6762149 0.00847315 (=101/(20*596)) 99.3782 best keyword for cluster 6762149 is PF00301 with Jaccard = 0.6096 [ 278 144 1099755 34 ] 0.6588 0.8910 SUGGESTING RELATEDNESS OF: A> PF00210 ( PF00210 Ferritin-like domain ) B> PF00301 ( PF00301 Rubredoxin ) A and B come from a different clan ( CL0044.8 , CL0045.7 ). the two keywords do not coincide on UniRef90 proteins both PF00210 and PF00301 have PDB structures PF00210 a.25.1.1 SUPERFAM mapping significantly overlapping: 1 PF00210 SSF47240 0.870 (average over 2111 mutual instances, PF00210 2114 appearances, SSF47240 6970 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 310 ) 6735487_PF02613_PF06192 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02613 is 6514361 with Jaccard = 0.9844 |PF02613|=64 [ 63 0 1100147 1 ] parent [ 6514361 ] : 6735487 0.0338066 (=400/(68*174)) 97.3208 given [ 6514361 ] : 6514361 0.831746 (=262/(5*63)) 17.2396 best keyword for cluster 6514361 is PF02613 with Jaccard = 0.9844 [ 63 0 1100147 1 ] 1.0000 0.9844 sibling [ 6514361 ] : 6734154 0.037639 (=44/(167*7)) 97.1834 best keyword for cluster 6734154 is PF06192 with Jaccard = 1.0000 [ 112 0 1100099 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF02613 ( PF02613 Nitrate reductase delta subunit ) B> PF06192 ( PF06192 Cytoplasmic chaperone TorD ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF02613 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 311 ) 6746560_PF01248_PF04296 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF04296 is 6662780 with Jaccard = 0.9843 |PF04296|=127 [ 125 0 1100084 2 ] parent [ 6662780 ] : 6746560 0.0190943 (=1197/(139*451)) 98.3545 given [ 6662780 ] : 6662780 0.203463 (=188/(7*132)) 83.8411 best keyword for cluster 6662780 is PF04296 with Jaccard = 0.9843 [ 125 0 1100084 2 ] 1.0000 0.9843 sibling [ 6662780 ] : 6745249 0.0222222 (=10/(1*450)) 98.2502 best keyword for cluster 6745249 is PF01248 with Jaccard = 0.9758 [ 403 1 1099798 9 ] 0.9975 0.9782 SUGGESTING RELATEDNESS OF: A> PF04296 ( PF04296 Protein of unknown function (DUF448) ) B> PF01248 ( PF01248 Ribosomal protein L7Ae/L30e/S12e/Gadd45 family ) Only B has a clan ( CL0101.7 ). 
the two keywords coincide on Uniref90 proteins: |PF01248| = 412 , |PF04296| = 127 , |PF01248^PF04296| = 6 ( 1.5% and 4.7% ) both PF04296 and PF01248 have PDB structures PF04296 d.192.1.1 SUPERFAM mapping significantly overlapping: 1 PF04296 SSF64376 0.891 (average over 378 mutual instances, PF04296 378 appearances, SSF64376 390 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 312 ) 6752660_PF01580_PF02534 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02534 is 6745172 with Jaccard = 0.9841 |PF02534|=189 [ 186 0 1100022 3 ] parent [ 6745172 ] : 6752660 0.0166277 (=6394/(290*1326)) 98.8145 given [ 6745172 ] : 6745172 0.0186667 (=77/(15*275)) 98.2469 best keyword for cluster 6745172 is PF02534 with Jaccard = 0.9841 [ 186 0 1100022 3 ] 1.0000 0.9841 sibling [ 6745172 ] : 6752078 0.0193745 (=381/(15*1311)) 98.7745 best keyword for cluster 6752078 is PF01580 with Jaccard = 0.6172 [ 424 260 1099524 3 ] 0.6199 0.9930 SUGGESTING RELATEDNESS OF: A> PF02534 ( PF02534 TraG/TraD family ) B> PF01580 ( PF01580 FtsK/SpoIIIE family ) Only A has a clan ( CL0023.26 ). the two keywords do not coincide on UniRef90 proteins only PF02534 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 313 ) 6734435_PF00379_PF07912 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00379 is 6708405 with Jaccard = 0.9840 |PF00379|=374 [ 368 0 1099837 6 ] parent [ 6708405 ] : 6734435 0.0299226 (=205/(17*403)) 97.2188 given [ 6708405 ] : 6708405 0.0690919 (=245/(394*9)) 93.7012 best keyword for cluster 6708405 is PF00379 with Jaccard = 0.9840 [ 368 0 1099837 6 ] 1.0000 0.9840 sibling [ 6708405 ] : 6699062 0.0857143 (=6/(10*7)) 92.0091 best keyword for cluster 6699062 is PF07912 with Jaccard = 0.7273 [ 8 3 1100200 0 ] 0.7273 1.0000 SUGGESTING RELATEDNESS OF: A> PF00379 ( PF00379 Insect cuticle protein ) B> PF07912 ( PF07912 ERp29, N-terminal domain ) Only B has a clan ( CL0172.11 ). the two keywords do not coincide on UniRef90 proteins only PF00379 has a PDB structure (may not be up to date) PF07912 c.47.1.7 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 314 ) 6750301_PF00246_PF04952 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF04952 is 6716227 with Jaccard = 0.9840 |PF04952|=187 [ 185 1 1100023 2 ] parent [ 6716227 ] : 6750301 0.0184347 (=2452/(235*566)) 98.6405 given [ 6716227 ] : 6716227 0.0616883 (=57/(4*231)) 94.9217 best keyword for cluster 6716227 is PF04952 with Jaccard = 0.9840 [ 185 1 1100023 2 ] 0.9946 0.9893 sibling [ 6716227 ] : 6736069 0.0278559 (=109/(7*559)) 97.3805 best keyword for cluster 6736069 is PF00246 with Jaccard = 0.9802 [ 494 2 1099707 8 ] 0.9960 0.9841 SUGGESTING RELATEDNESS OF: A> PF04952 ( PF04952 Succinylglutamate desuccinylase / Aspartoacylase family ) B> PF00246 ( PF00246 Zinc carboxypeptidase ) they come from the same clan: CL0035.11 : PF05343 PF04389 PF01546 PF02127 PF00883 PF00246 PF05450 PF04952 the two keywords do not coincide on UniRef90 proteins both PF04952 and PF00246 have PDB structures PF04952 c.56.5.7 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 315 ) 6697008_PF00351_PF00800 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00351 is 6592298 with Jaccard = 0.9836 |PF00351|=122 [ 120 0 1100089 2 ] parent [ 6592298 ] : 6697008 0.0939436 (=5308/(129*438)) 91.7375 given [ 6592298 ] : 6592298 0.448819 (=114/(2*127)) 58.1228 best keyword for cluster 6592298 is PF00351 with Jaccard = 0.9836 [ 120 0 1100089 2 ] 1.0000 0.9836 sibling [ 6592298 ] : 6670139 0.154473 (=6370/(301*137)) 85.5298 best keyword for cluster 6670139 is PF00800 with Jaccard = 0.6867 [ 274 123 1099812 2 ] 0.6902 0.9928 SUGGESTING RELATEDNESS OF: A> PF00351 ( PF00351 Biopterin-dependent aromatic amino acid hydroxylase ) B> PF00800 ( PF00800 Prephenate dehydratase ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins both PF00351 and PF00800 have PDB structures PF00351 d.178.1.1 SUPERFAM mapping significantly overlapping: 1 PF00351 SSF56534 0.655 (average over 406 mutual instances, PF00351 407 appearances, SSF56534 477 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 316 ) 6746019_PF01111_PF03657 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01111 is 6557550 with Jaccard = 0.9833 |PF01111|=60 [ 59 0 1100151 1 ] parent [ 6557550 ] : 6746019 0.0169697 (=56/(66*50)) 98.3115 given [ 6557550 ] : 6557550 0.570312 (=73/(2*64)) 44.8558 best keyword for cluster 6557550 is PF01111 with Jaccard = 0.9833 [ 59 0 1100151 1 ] 1.0000 0.9833 sibling [ 6557550 ] : 6680536 0.126819 (=61/(37*13)) 88.3219 best keyword for cluster 6680536 is PF03657 with Jaccard = 0.8000 [ 20 5 1100186 0 ] 0.8000 1.0000 SUGGESTING RELATEDNESS OF: A> PF01111 ( PF01111 Cyclin-dependent kinase regulatory subunit ) B> PF03657 ( PF03657 Uncharacterised protein family (UPF0113) ) Only B has a clan ( CL0178.11 ). the two keywords do not coincide on UniRef90 proteins only PF01111 has a PDB structure (may not be up to date) PF01111 d.97.1.1 SUPERFAM mapping significantly overlapping: 1 PF01111 SSF55637 0.893 (average over 132 mutual instances, PF01111 134 appearances, SSF55637 134 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 317 ) 6737467_PF00884_PF02995 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02995 is 6562652 with Jaccard = 0.9833 |PF02995|=60 [ 59 0 1100151 1 ] parent [ 6562652 ] : 6737467 0.0356349 (=2245/(60*1050)) 97.531 given [ 6562652 ] : 6562652 0.538462 (=224/(8*52)) 49.051 best keyword for cluster 6562652 is PF02995 with Jaccard = 0.9833 [ 59 0 1100151 1 ] 1.0000 0.9833 sibling [ 6562652 ] : 6725365 0.0522488 (=273/(5*1045)) 96.1516 best keyword for cluster 6725365 is PF00884 with Jaccard = 0.9673 [ 888 1 1099293 29 ] 0.9989 0.9684 SUGGESTING RELATEDNESS OF: A> PF02995 ( PF02995 Protein of unknown function (DUF229) ) B> PF00884 ( PF00884 Sulfatase ) they come from the same clan: CL0088.10 : PF00884 PF01663 PF08665 PF01676 PF02995 PF07394 PF00245 the two keywords do not coincide on UniRef90 proteins only PF02995 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 318 ) 6670998_PF00764_PF06508 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00764 is 6536206 with Jaccard = 0.9831 |PF00764|=236 [ 232 0 1099975 4 ] parent [ 6536206 ] : 6670998 0.190128 (=10562/(256*217)) 85.7853 given [ 6536206 ] : 6536206 0.713725 (=182/(1*255)) 29.72 best keyword for cluster 6536206 is PF00764 with Jaccard = 0.9831 [ 232 0 1099975 4 ] 1.0000 0.9831 sibling [ 6536206 ] : 6606478 0.407407 (=88/(1*216)) 64.9645 best keyword for cluster 6606478 is PF06508 with Jaccard = 0.7606 [ 197 1 1099952 61 ] 0.9949 0.7636 SUGGESTING RELATEDNESS OF: A> PF00764 ( PF00764 Arginosuccinate synthase ) B> PF06508 ( PF06508 ExsB ) they come from the same clan: CL0039.7 : PF00764 PF00733 PF01171 PF01902 PF06508 PF02540 PF01507 PF02568 PF03054 the two keywords do not coincide on UniRef90 proteins only PF00764 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 319 ) 6610063_PF01159_PF01777 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01159 is 6549264 with Jaccard = 0.9831 |PF01159|=58 [ 58 1 1100152 0 ] parent [ 6549264 ] : 6610063 0.431795 (=1703/(68*58)) 66.7831 given [ 6549264 ] : 6549264 0.630769 (=123/(3*65)) 38.2656 best keyword for cluster 6549264 is PF01159 with Jaccard = 0.9831 [ 58 1 1100152 0 ] 0.9831 1.0000 sibling [ 6549264 ] : 6601011 0.401786 (=45/(2*56)) 62.1429 best keyword for cluster 6601011 is PF01777 with Jaccard = 0.9818 [ 54 1 1100156 0 ] 0.9818 1.0000 SUGGESTING RELATEDNESS OF: A> PF01159 ( PF01159 Ribosomal protein L6e ) B> PF01777 ( PF01777 Ribosomal L27e protein family ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF01159 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 320 ) 6745432_PF01984_PF05024 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01984 is 6677614 with Jaccard = 0.9831 |PF01984|=59 [ 58 0 1100152 1 ] parent [ 6677614 ] : 6745432 0.0177404 (=57/(51*63)) 98.2659 given [ 6677614 ] : 6677614 0.129032 (=8/(1*62)) 87.6037 best keyword for cluster 6677614 is PF01984 with Jaccard = 0.9831 [ 58 0 1100152 1 ] 1.0000 0.9831 sibling [ 6677614 ] : 6661139 0.180851 (=34/(4*47)) 83.5267 best keyword for cluster 6661139 is PF05024 with Jaccard = 0.9556 [ 43 1 1100166 1 ] 0.9773 0.9773 SUGGESTING RELATEDNESS OF: A> PF01984 ( PF01984 Double-stranded DNA-binding domain ) B> PF05024 ( PF05024 N-acetylglucosaminyl transferase component (Gpi1) ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF01984| = 59 , |PF05024| = 44 , |PF01984^PF05024| = 1 ( 1.7% and 2.3% ) only PF01984 has a PDB structure (may not be up to date) PF01984 a.5.6.1 SUPERFAM mapping significantly overlapping: 1 PF01984 SSF46950 0.690 (average over 132 mutual instances, PF01984 133 appearances, SSF46950 134 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 321 ) 6736703_PF00731_PF01259 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01259 is 6727661 with Jaccard = 0.9830 |PF01259|=293 [ 289 1 1099917 4 ] parent [ 6727661 ] : 6736703 0.026216 (=3048/(337*345)) 97.4498 given [ 6727661 ] : 6727661 0.0356394 (=306/(27*318)) 96.4414 best keyword for cluster 6727661 is PF01259 with Jaccard = 0.9830 [ 289 1 1099917 4 ] 0.9966 0.9863 sibling [ 6727661 ] : 6542403 0.724826 (=13547/(70*267)) 33.7314 best keyword for cluster 6542403 is PF00731 with Jaccard = 0.8664 [ 253 1 1099919 38 ] 0.9961 0.8694 SUGGESTING RELATEDNESS OF: A> PF01259 ( PF01259 SAICAR synthetase ) B> PF00731 ( PF00731 AIR carboxylase ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF00731| = 291 , |PF01259| = 293 , |PF00731^PF01259| = 11 ( 3.8% and 3.8% ) both PF01259 and PF00731 have PDB structures PF00731 c.23.8.1 SUPERFAM mapping significantly overlapping: 1 PF00731 SSF52255 0.971 (average over 914 mutual instances, PF00731 1034 appearances, SSF52255 1010 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 322 ) 6689461_PF00860_PF00916 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00860 is 6658329 with Jaccard = 0.9828 |PF00860|=578 [ 570 2 1099631 8 ] parent [ 6658329 ] : 6689461 0.114728 (=51742/(637*708)) 90.1518 given [ 6658329 ] : 6658329 0.194574 (=19299/(366*271)) 82.808 best keyword for cluster 6658329 is PF00860 with Jaccard = 0.9828 [ 570 2 1099631 8 ] 0.9965 0.9862 sibling [ 6658329 ] : 6677780 0.12973 (=456/(5*703)) 87.6633 best keyword for cluster 6677780 is PF00916 with Jaccard = 0.9812 [ 626 6 1099573 6 ] 0.9905 0.9905 SUGGESTING RELATEDNESS OF: A> PF00860 ( PF00860 Permease family ) B> PF00916 ( PF00916 Sulfate transporter family ) they come from the same clan: CL0062.8 : PF00860 PF03222 PF02133 PF00916 PF00474 PF03845 PF01235 PF00955 PF07331 PF02361 PF05525 PF03594 PF01490 PF00324 the two keywords do not coincide on UniRef90 proteins Neither PF00860 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 323 ) 6733526_PF00324_PF03845 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 
threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF03845 is 6703700 with Jaccard = 0.9826 |PF03845|=115 [ 113 0 1100096 2 ] parent [ 6703700 ] : 6733526 0.0355525 (=10998/(123*2515)) 97.1121 given [ 6703700 ] : 6703700 0.0779661 (=46/(5*118)) 92.8466 best keyword for cluster 6703700 is PF03845 with Jaccard = 0.9826 [ 113 0 1100096 2 ] 1.0000 0.9826 sibling [ 6703700 ] : 6727943 0.0502552 (=31744/(283*2232)) 96.478 best keyword for cluster 6727943 is PF00324 with Jaccard = 0.8716 [ 2016 261 1097898 36 ] 0.8854 0.9825 SUGGESTING RELATEDNESS OF: A> PF03845 ( PF03845 Spore germination protein ) B> PF00324 ( PF00324 Amino acid permease ) they come from the same clan: CL0062.8 : PF00860 PF03222 PF02133 PF00916 PF00474 PF03845 PF01235 PF00955 PF07331 PF02361 PF05525 PF03594 PF01490 PF00324 the two keywords do not coincide on UniRef90 proteins Neither PF03845 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 324 ) 6745291_PF01333_PF05896 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01333 is 6448190 with Jaccard = 0.9825 |PF01333|=57 [ 56 0 1100154 1 ] parent [ 6448190 ] : 6745291 0.0301637 (=105/(59*59)) 98.2552 given [ 6448190 ] : 6448190 1 (=58/(1*58)) 1.18178 best keyword for cluster 6448190 is PF01333 with Jaccard = 0.9825 [ 56 0 1100154 1 ] 1.0000 0.9825 sibling [ 6448190 ] : 6691814 0.131579 (=15/(57*2)) 90.6596 best keyword for cluster 6691814 is PF05896 with Jaccard = 1.0000 [ 48 0 1100163 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF01333 ( PF01333 Apocytochrome F, C-terminal ) B> PF05896 ( PF05896 Na(+)-translocating NADH-quinone reductase subunit A (NQRA) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF01333 has a PDB structure (may not be up to date) PF01333 b.84.2.2 i.4.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 325 ) 6758782_PF01774_PF08514 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01774 is 6594794 with Jaccard = 0.9825 |PF01774|=114 [ 112 0 1100097 2 ] parent [ 6594794 ] : 6758782 0.00815719 (=115/(133*106)) 99.1982 given [ 6594794 ] : 6594794 0.441463 (=543/(10*123)) 59.4299 best keyword for cluster 6594794 is PF01774 with Jaccard = 0.9825 [ 112 0 1100097 2 ] 1.0000 0.9825 sibling [ 6594794 ] : 6745117 0.0177536 (=49/(60*46)) 98.2436 best keyword for cluster 6745117 is PF08514 with Jaccard = 0.8913 [ 41 0 1100165 5 ] 1.0000 0.8913 SUGGESTING RELATEDNESS OF: A> PF01774 ( PF01774 UreD urease accessory protein ) B> PF08514 ( PF08514 STAG domain ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF01774| = 114 , |PF08514| = 46 , |PF01774^PF08514| = 1 ( 0.9% and 2.2% ) Neither PF01774 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 326 ) 6612606_PF04632_PF05976 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF05976 is 6577279 with Jaccard = 0.9825 |PF05976|=57 [ 56 0 1100154 1 ] parent [ 6577279 ] : 6612606 0.367611 (=6617/(120*150)) 67.5995 given [ 6577279 ] : 6577279 0.502356 (=1386/(31*89)) 52.8915 best keyword for cluster 6577279 is PF05976 with Jaccard = 0.9825 [ 56 0 1100154 1 ] 1.0000 0.9825 sibling [ 6577279 ] : 6603184 0.401932 (=1373/(28*122)) 63.1805 best keyword for cluster 6603184 is PF04632 with Jaccard = 0.9861 [ 71 0 1100139 1 ] 1.0000 0.9861 SUGGESTING RELATEDNESS OF: A> PF05976 ( PF05976 Bacterial membrane protein of unknown function (DUF893) ) B> PF04632 ( PF04632 Fusaric acid resistance protein conserved region ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF05976 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF05976 SSF103473 0.754 (average over 1 mutual instances, PF05976 1 appearances, SSF103473 39293 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 327 ) 6779061_PF04833_PF06568 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF06568 is 6756758 with Jaccard = 0.9825 |PF06568|=56 [ 56 1 1100154 0 ] parent [ 6756758 ] : 6779061 0.00093985 (=10/(112*95)) 99.9384 given [ 6756758 ] : 6756758 0.0151203 (=22/(97*15)) 99.0763 best keyword for cluster 6756758 is PF06568 with Jaccard = 0.9825 [ 56 1 1100154 0 ] 0.9825 1.0000 sibling [ 6756758 ] : 6773606 0.00261233 (=5/(66*29)) 99.821 best keyword for cluster 6773606 is PF04833 with Jaccard = 0.9677 [ 30 1 1100180 0 ] 0.9677 1.0000 SUGGESTING RELATEDNESS OF: A> PF06568 ( PF06568 Domain of unknown function (DUF1127) ) B> PF04833 ( PF04833 Phytochelatin synthetase-like conserved region ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF06568 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 328 ) 6769321_PF01746_PF02590 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
======================
best cluster for keyword PF02590 is 6717257 with Jaccard = 0.9824 |PF02590|=170 [ 167 0 1100041 3 ]
parent [ 6717257 ] : 6769321 0.00484979 (=257/(192*276)) 99.6875
given [ 6717257 ] : 6717257 0.0511464 (=29/(3*189)) 95.0562
best keyword for cluster 6717257 is PF02590 with Jaccard = 0.9824 [ 167 0 1100041 3 ] 1.0000 0.9824
sibling [ 6717257 ] : 6768473 0.00363636 (=1/(1*275)) 99.6567
best keyword for cluster 6768473 is PF01746 with Jaccard = 0.7628 [ 238 0 1099899 74 ] 1.0000 0.7628
SUGGESTING RELATEDNESS OF:
A> PF02590 ( PF02590 Uncharacterized ACR, COG1576 )
B> PF01746 ( PF01746 tRNA (Guanine-1)-methyltransferase )
they come from the same clan: CL0098.7 : PF02590 PF02598 PF04013 PF04452 PF00588 PF01746
the two keywords do not coincide on UniRef90 proteins
both PF02590 and PF01746 have PDB structures
PF02590 c.116.1.3
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 329 ) 6676104_PF04131_PF05690 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04131 is 6218438 with Jaccard = 0.9818 |PF04131|=55 [ 54 0 1100156 1 ]
parent [ 6218438 ] : 6676104 0.166173 (=3031/(60*304)) 87.1964
given [ 6218438 ] : 6218438 1 (=611/(13*47)) 8.88923e-16
best keyword for cluster 6218438 is PF04131 with Jaccard = 0.9818 [ 54 0 1100156 1 ] 1.0000 0.9818
sibling [ 6218438 ] : 6663867 0.165017 (=50/(1*303)) 84.0322
best keyword for cluster 6663867 is PF05690 with Jaccard = 0.6281 [ 179 104 1099926 2 ] 0.6325 0.9890
SUGGESTING RELATEDNESS OF:
A> PF04131 ( PF04131 Putative N-acetylmannosamine-6-phosphate epimerase )
B> PF05690 ( PF05690 Thiazole biosynthesis protein ThiG )
they come from the same clan: CL0036.17 : PF05690 PF01680 PF00834 PF01729 PF00697 PF03740 PF01884 PF00724 PF00215 PF03060 PF04095 PF04131 PF00478 PF00218 PF00977 PF01645 PF04309 PF01070 PF01207 PF04481 PF04476 PF01180 PF00701 PF01791 PF03932 PF03437 PF01081 PF00121 PF09370 PF02581 PF00290
the two keywords do not coincide on UniRef90 proteins
both PF04131 and PF05690 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF05690 SSF110399 0.947 (average over 551 mutual instances, PF05690 572 appearances, SSF110399 583 appearances)
2 PF04131 SSF51366 0.852 (average over 132 mutual instances, PF04131 147 appearances, SSF51366 8168 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 330 ) 6547153_PF01680_PF05690 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01680 is 6507898 with Jaccard = 0.9817 |PF01680|=109 [ 107 0 1100102 2 ]
parent [ 6507898 ] : 6547153 0.673611 (=14356/(192*111)) 36.8294
given [ 6507898 ] : 6507898 0.936364 (=103/(1*110)) 14.5589
best keyword for cluster 6507898 is PF01680 with Jaccard = 0.9817 [ 107 0 1100102 2 ] 1.0000 0.9817
sibling [ 6507898 ] : 6404184 0.999859 (=7099/(50*142)) 0.0148517
best keyword for cluster 6404184 is PF05690 with Jaccard = 0.9724 [ 176 0 1100030 5 ] 1.0000 0.9724
SUGGESTING RELATEDNESS OF:
A> PF01680 ( PF01680 SOR/SNZ family )
B> PF05690 ( PF05690 Thiazole biosynthesis protein ThiG )
they come from the same clan: CL0036.17 : PF05690 PF01680 PF00834 PF01729 PF00697 PF03740 PF01884 PF00724 PF00215 PF03060 PF04095 PF04131 PF00478 PF00218 PF00977 PF01645 PF04309 PF01070 PF01207 PF04481 PF04476 PF01180 PF00701 PF01791 PF03932 PF03437 PF01081 PF00121 PF09370 PF02581 PF00290
the two keywords coincide on UniRef90 proteins: |PF01680| = 109 , |PF05690| = 181 , |PF01680^PF05690| = 4 ( 3.7% and 2.2% )
only PF01680 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF05690 SSF110399 0.947 (average over 551 mutual instances, PF05690 572 appearances, SSF110399 583 appearances)
2 PF01680 SSF51366 0.778 (average over 331 mutual instances, PF01680 336 appearances, SSF51366 8168 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 331 ) 6777157_PF01801_PF04089 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04089 is 6762730 with Jaccard = 0.9811 |PF04089|=53 [ 52 0 1100158 1 ]
parent [ 6762730 ] : 6777157 0.00161179 (=7/(101*43)) 99.9044
given [ 6762730 ] : 6762730 0.00690449 (=12/(22*79)) 99.4054
best keyword for cluster 6762730 is PF04089 with Jaccard = 0.9811 [ 52 0 1100158 1 ] 1.0000 0.9811
sibling [ 6762730 ] : 6769411 0.004329 (=2/(21*22)) 99.6905
best keyword for cluster 6769411 is PF01801 with Jaccard = 0.9000 [ 9 1 1100201 0 ] 0.9000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04089 ( PF04089 BRICHOS domain )
B> PF01801 ( PF01801 Cytomegalovirus glycoprotein L )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04089 nor PF01801 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 332 ) 6754109_PF03932_PF06089 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06089 is 6386946 with Jaccard = 0.9811 |PF06089|=53 [ 52 0 1100158 1 ]
parent [ 6386946 ] : 6754109 0.0108696 (=54/(54*92)) 98.913
given [ 6386946 ] : 6386946 1 (=53/(1*53)) 0.00126453
best keyword for cluster 6386946 is PF06089 with Jaccard = 0.9811 [ 52 0 1100158 1 ] 1.0000 0.9811
sibling [ 6386946 ] : 6675070 0.138577 (=37/(3*89)) 86.9357
best keyword for cluster 6675070 is PF03932 with Jaccard = 0.9880 [ 82 0 1100128 1 ] 1.0000 0.9880
SUGGESTING RELATEDNESS OF:
A> PF06089 ( PF06089 L-asparaginase II )
B> PF03932 ( PF03932 CutC family )
Only B has a clan ( CL0036.17 ).
the two keywords coincide on UniRef90 proteins: |PF03932| = 83 , |PF06089| = 53 , |PF03932^PF06089| = 1 ( 1.2% and 1.9% )
only PF06089 has a PDB structure (may not be up to date)
PF03932 c.1.30.1
SUPERFAM mapping significantly overlapping:
1 PF03932 SSF110395 0.885 (average over 322 mutual instances, PF03932 322 appearances, SSF110395 322 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 333 ) 6743873_PF01987_PF02342 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01987 is 6734663 with Jaccard = 0.9809 |PF01987|=157 [ 154 0 1100054 3 ]
parent [ 6734663 ] : 6743873 0.0188837 (=653/(182*190)) 98.1331
given [ 6734663 ] : 6734663 0.0377785 (=117/(19*163)) 97.2396
best keyword for cluster 6734663 is PF01987 with Jaccard = 0.9809 [ 154 0 1100054 3 ] 1.0000 0.9809
sibling [ 6734663 ] : 6737941 0.031746 (=6/(1*189)) 97.5831
best keyword for cluster 6737941 is PF02342 with Jaccard = 0.8917 [ 140 4 1100054 13 ] 0.9722 0.9150
SUGGESTING RELATEDNESS OF:
A> PF01987 ( PF01987 Protein of unknown function DUF124 )
B> PF02342 ( PF02342 Bacterial stress protein )
Only B has a clan ( CL0128.6 ).
the two keywords coincide on UniRef90 proteins: |PF01987| = 157 , |PF02342| = 153 , |PF01987^PF02342| = 5 ( 3.2% and 3.3% )
only PF01987 has a PDB structure (may not be up to date)
PF01987 b.82.5.2
SUPERFAM mapping significantly overlapping:
1 PF01987 SSF51219 0.979 (average over 382 mutual instances, PF01987 383 appearances, SSF51219 425 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 334 ) 6720443_PF00214_PF02039 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00214 is 6607292 with Jaccard = 0.9808 |PF00214|=52 [ 51 0 1100159 1 ]
parent [ 6607292 ] : 6720443 0.055668 (=55/(19*52)) 95.4812
given [ 6607292 ] : 6607292 0.34955 (=194/(15*37)) 65.4522
best keyword for cluster 6607292 is PF00214 with Jaccard = 0.9808 [ 51 0 1100159 1 ] 1.0000 0.9808
sibling [ 6607292 ] : 6526218 0.77381 (=65/(7*12)) 23.8695
best keyword for cluster 6526218 is PF02039 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00214 ( PF00214 Calcitonin / CGRP / IAPP family )
B> PF02039 ( PF02039 Adrenomedullin )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF00214 has a PDB structure (may not be up to date)
PF00214 j.42.1.1 j.6.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 335 ) 6717053_PF01087_PF01230 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01230 is 6708055 with Jaccard = 0.9807 |PF01230|=622 [ 611 1 1099588 11 ]
parent [ 6708055 ] : 6717053 0.0614693 (=6374/(139*746)) 95.0262
given [ 6708055 ] : 6708055 0.0708838 (=158/(3*743)) 93.6409
best keyword for cluster 6708055 is PF01230 with Jaccard = 0.9807 [ 611 1 1099588 11 ] 0.9984 0.9823
sibling [ 6708055 ] : 6651712 0.229323 (=183/(6*133)) 80.5232
best keyword for cluster 6651712 is PF01087 with Jaccard = 0.6316 [ 84 7 1100078 42 ] 0.9231 0.6667
SUGGESTING RELATEDNESS OF:
A> PF01230 ( PF01230 HIT domain )
B> PF01087 ( PF01087 Galactose-1-phosphate uridyl transferase, N-terminal domain )
Only A has a clan ( CL0265.2 ).
the two keywords coincide on UniRef90 proteins: |PF01087| = 126 , |PF01230| = 622 , |PF01087^PF01230| = 1 ( 0.8% and 0.2% )
both PF01230 and PF01087 have PDB structures
PF01230 d.13.1.1
PF01087 d.13.1.2
SUPERFAM mapping significantly overlapping:
1 PF01230 SSF54197 0.754 (average over 1865 mutual instances, PF01230 1906 appearances, SSF54197 2604 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 336 ) 6734162_PF02661_PF05012 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05012 is 6689837 with Jaccard = 0.9804 |PF05012|=101 [ 100 1 1100109 1 ]
parent [ 6689837 ] : 6734162 0.0426443 (=3915/(214*429)) 97.185
given [ 6689837 ] : 6689837 0.0985169 (=1116/(96*118)) 90.2336
best keyword for cluster 6689837 is PF05012 with Jaccard = 0.9804 [ 100 1 1100109 1 ] 0.9901 0.9901
sibling [ 6689837 ] : 6733450 0.0341909 (=101/(7*422)) 97.1024
best keyword for cluster 6733450 is PF02661 with Jaccard = 0.9878 [ 323 0 1099884 4 ] 1.0000 0.9878
SUGGESTING RELATEDNESS OF:
A> PF05012 ( PF05012 Prophage maintenance system killer protein )
B> PF02661 ( PF02661 Fic protein family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05012 nor PF02661 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 337 ) 6773633_PF00424_PF00539 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00424 is 6771379 with Jaccard = 0.9801 |PF00424|=704 [ 690 0 1099507 14 ]
parent [ 6771379 ] : 6773633 0.00216402 (=2092/(763*1267)) 99.8217
given [ 6771379 ] : 6771379 0.00262467 (=2/(1*762)) 99.7559
best keyword for cluster 6771379 is PF00424 with Jaccard = 0.9801 [ 690 0 1099507 14 ] 1.0000 0.9801
sibling [ 6771379 ] : 6772814 0.00316957 (=20/(5*1262)) 99.7995
best keyword for cluster 6772814 is PF00539 with Jaccard = 0.8292 [ 743 152 1099315 1 ] 0.8302 0.9987
SUGGESTING RELATEDNESS OF:
A> PF00424 ( PF00424 REV protein (anti-repression trans-activator protein) )
B> PF00539 ( PF00539 Transactivating regulatory protein (Tat) )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00424| = 704 , |PF00539| = 744 , |PF00424^PF00539| = 3 ( 0.4% and 0.4% )
both PF00424 and PF00539 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 338 ) 6756753_PF01578_PF05140 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05140 is 6739787 with Jaccard = 0.9798 |PF05140|=99 [ 97 0 1100112 2 ]
parent [ 6739787 ] : 6756753 0.0110831 (=1187/(153*700)) 99.0758
given [ 6739787 ] : 6739787 0.0241379 (=28/(145*8)) 97.7558
best keyword for cluster 6739787 is PF05140 with Jaccard = 0.9798 [ 97 0 1100112 2 ] 1.0000 0.9798
sibling [ 6739787 ] : 6748536 0.0195652 (=135/(10*690)) 98.5049
best keyword for cluster 6748536 is PF01578 with Jaccard = 0.9609 [ 540 21 1099649 1 ] 0.9626 0.9982
SUGGESTING RELATEDNESS OF:
A> PF05140 ( PF05140 ResB-like family )
B> PF01578 ( PF01578 Cytochrome C assembly protein )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01578| = 541 , |PF05140| = 99 , |PF01578^PF05140| = 2 ( 0.4% and 2.0% )
Neither PF05140 nor PF01578 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 339 ) 6688990_PF02065_PF05691 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05691 is 6658986 with Jaccard = 0.9796 |PF05691|=49 [ 48 0 1100162 1 ]
parent [ 6658986 ] : 6688990 0.118551 (=1466/(229*54)) 90.0517
given [ 6658986 ] : 6658986 0.169935 (=26/(51*3)) 83.0067
best keyword for cluster 6658986 is PF05691 with Jaccard = 0.9796 [ 48 0 1100162 1 ] 1.0000 0.9796
sibling [ 6658986 ] : 6658269 0.201794 (=270/(6*223)) 82.7771
best keyword for cluster 6658269 is PF02065 with Jaccard = 0.9788 [ 185 4 1100022 0 ] 0.9788 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05691 ( PF05691 Raffinose synthase or seed imbibition protein Sip1 )
B> PF02065 ( PF02065 Melibiase )
Only B has a clan ( CL0058.10 ).
the two keywords do not coincide on UniRef90 proteins
only PF05691 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 340 ) 6741149_PF00226_PF06386 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06386 is 6598397 with Jaccard = 0.9792 |PF06386|=48 [ 47 0 1100163 1 ]
parent [ 6598397 ] : 6741149 0.0238828 (=4425/(60*3088)) 97.89
given [ 6598397 ] : 6598397 0.426901 (=73/(3*57)) 60.9933
best keyword for cluster 6598397 is PF06386 with Jaccard = 0.9792 [ 47 0 1100163 1 ] 1.0000 0.9792
sibling [ 6598397 ] : 6737905 0.0323698 (=897/(9*3079)) 97.5773
best keyword for cluster 6737905 is PF00226 with Jaccard = 0.9188 [ 2468 109 1097525 109 ] 0.9577 0.9577
SUGGESTING RELATEDNESS OF:
A> PF06386 ( PF06386 Gas vesicle synthesis protein GvpL/GvpF )
B> PF00226 ( PF00226 DnaJ domain )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00226| = 2577 , |PF06386| = 48 , |PF00226^PF06386| = 3 ( 0.1% and 6.2% )
only PF06386 has a PDB structure (may not be up to date)
PF00226 a.2.3.1
SUPERFAM mapping significantly overlapping:
1 PF00226 SSF46565 0.626 (average over 6995 mutual instances, PF00226 11372 appearances, SSF46565 12650 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 341 ) 6727943_PF00324_PF01235 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00324 is 6725973 with Jaccard = 0.9791 |PF00324|=2052 [ 2016 7 1098152 36 ]
parent [ 6725973 ] : 6727943 0.0502552 (=31744/(283*2232)) 96.478
given [ 6725973 ] : 6725973 0.0417415 (=372/(4*2228)) 96.2301
best keyword for cluster 6725973 is PF00324 with Jaccard = 0.9791 [ 2016 7 1098152 36 ] 0.9965 0.9825
sibling [ 6725973 ] : 6675002 0.137993 (=154/(4*279)) 86.9053
best keyword for cluster 6675002 is PF01235 with Jaccard = 0.9961 [ 254 0 1099956 1 ] 1.0000 0.9961
SUGGESTING RELATEDNESS OF:
A> PF00324 ( PF00324 Amino acid permease )
B> PF01235 ( PF01235 Sodium:alanine symporter family )
they come from the same clan: CL0062.8 : PF00860 PF03222 PF02133 PF00916 PF00474 PF03845 PF01235 PF00955 PF07331 PF02361 PF05525 PF03594 PF01490 PF00324
the two keywords do not coincide on UniRef90 proteins
Neither PF00324 nor PF01235 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 342 ) 6749989_PF01694_PF04511 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster,
sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01694 is 6743480 with Jaccard = 0.9791 |PF01694|=574 [ 563 1 1099636 11 ]
parent [ 6743480 ] : 6749989 0.0176152 (=1321/(109*688)) 98.6183
given [ 6743480 ] : 6743480 0.0210166 (=86/(6*682)) 98.0967
best keyword for cluster 6743480 is PF01694 with Jaccard = 0.9791 [ 563 1 1099636 11 ] 0.9982 0.9808
sibling [ 6743480 ] : 6718815 0.0498282 (=58/(12*97)) 95.2505
best keyword for cluster 6718815 is PF04511 with Jaccard = 0.9239 [ 85 3 1100119 4 ] 0.9659 0.9551
SUGGESTING RELATEDNESS OF:
A> PF01694 ( PF01694 Rhomboid family )
B> PF04511 ( PF04511 Der1-like family )
they come from the same clan: CL0207.4 : PF04511 PF08551 PF01694
the two keywords do not coincide on UniRef90 proteins
Neither PF01694 nor PF04511 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 343 ) 6731304_PF00893_PF02694 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00893 is 6678461 with Jaccard = 0.9787 |PF00893|=327 [ 321 1 1099883 6 ]
parent [ 6678461 ] : 6731304 0.0412858 (=1337/(368*88)) 96.8609
given [ 6678461 ] : 6678461 0.122569 (=353/(8*360)) 87.8186
best keyword for cluster 6678461 is PF00893 with Jaccard = 0.9787 [ 321 1 1099883 6 ] 0.9969 0.9817
sibling [ 6678461 ] : 6723718 0.047619 (=16/(84*4)) 95.9557
best keyword for cluster 6723718 is PF02694 with Jaccard = 1.0000 [ 71 0 1100140 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00893 ( PF00893 Small Multidrug Resistance protein )
B> PF02694 ( PF02694 Uncharacterised BCR, YnfA/UPF0060 family )
they come from the same clan: CL0184.5 : PF07857 PF04342 PF00892 PF05653 PF06027 PF00893 PF04142 PF06379 PF06800 PF03151 PF08449 PF02694
the two keywords do not coincide on UniRef90 proteins
only PF00893 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 344 ) 6673941_PF03150_PF06537 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06537 is 6484455 with Jaccard = 0.9783 |PF06537|=46 [ 45 0 1100165 1 ]
parent [ 6484455 ] : 6673941 0.169163 (=2101/(54*230)) 86.6617
given [ 6484455 ] : 6484455 0.95 (=190/(4*50)) 6.77325
best keyword for cluster 6484455 is PF06537 with Jaccard = 0.9783 [ 45 0 1100165 1 ] 1.0000 0.9783
sibling [ 6484455 ] : 6631950 0.299808 (=468/(7*223)) 75.1271
best keyword for cluster 6631950 is PF03150 with Jaccard = 1.0000 [ 156 0 1100055 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06537 ( PF06537 Protein of unknown function (DUF1111) )
B> PF03150 ( PF03150 Di-haem cytochrome c peroxidase )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF06537 has a PDB structure (may not be up to date)
PF03150 a.3.1.5
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 345 ) 6752337_PF01923_PF03928 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03928 is 6729997 with Jaccard = 0.9779 |PF03928|=181 [ 177 0 1100030 4 ]
parent [ 6729997 ] : 6752337 0.0124398 (=413/(166*200)) 98.7917
given [ 6729997 ] : 6729997 0.0455623 (=269/(36*164)) 96.7207
best keyword for cluster 6729997 is PF03928 with Jaccard = 0.9779 [ 177 0 1100030 4 ] 1.0000 0.9779
sibling [ 6729997 ] : 6676336 0.145455 (=24/(1*165)) 87.2621
best keyword for cluster 6676336 is PF01923 with Jaccard = 0.9329 [ 153 0 1100047 11 ] 1.0000 0.9329
SUGGESTING RELATEDNESS OF:
A> PF03928 ( PF03928 Domain of unknown function (DUF336) )
B> PF01923 ( PF01923 Cobalamin adenosyltransferase )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01923| = 164 , |PF03928| = 181 , |PF01923^PF03928| = 2 ( 1.2% and 1.1% )
both PF03928 and PF01923 have PDB structures
PF03928 d.110.9.1
PF01923 a.25.2.2
SUPERFAM mapping significantly overlapping:
1 PF01923 SSF89028 0.931 (average over 509 mutual instances, PF01923 510 appearances, SSF89028 542 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 346 ) 6760615_PF05489_PF06893 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05489 is 6668813 with Jaccard = 0.9778 |PF05489|=45 [ 44 0 1100166 1 ]
parent [ 6668813 ] : 6760615 0.00887949 (=21/(55*43)) 99.2972
given [ 6668813 ] : 6668813 0.177177 (=118/(37*18)) 85.1917
best keyword for cluster 6668813 is PF05489 with Jaccard = 0.9778 [ 44 0 1100166 1 ] 1.0000 0.9778
sibling [ 6668813 ] : 6755880 0.0238095 (=1/(1*42)) 99.0238
best keyword for cluster 6755880 is PF06893 with Jaccard = 0.9286 [ 26 2 1100183 0 ] 0.9286 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05489 ( PF05489 Phage Tail Protein X )
B> PF06893 ( PF06893 Bacteriophage Mu P protein )
Only A has a clan ( CL0187.6 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF05489 nor PF06893 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 347 ) 6764035_PF01157_PF03524 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03524 is 6733553 with Jaccard = 0.9776 |PF03524|=134 [ 131 0 1100077 3 ]
parent [ 6733553 ] : 6764035 0.00624117 (=106/(193*88)) 99.4673
given [ 6733553 ] : 6733553 0.0349138 (=243/(48*145)) 97.1173
best keyword for cluster 6733553 is PF03524 with Jaccard = 0.9776 [ 131 0 1100077 3 ] 1.0000 0.9776
sibling [ 6733553 ] : 6762942 0.0114943 (=1/(1*87)) 99.4138
best keyword for cluster 6762942 is PF01157 with Jaccard = 1.0000 [ 81 0 1100130 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03524 ( PF03524 Conjugal transfer protein )
B> PF01157 ( PF01157 Ribosomal protein L21e )
Only B has a clan ( CL0107.7 ).
the two keywords do not coincide on UniRef90 proteins
only PF03524 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF01157 SSF50104 0.997 (average over 235 mutual instances, PF01157 236 appearances, SSF50104 9220 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 348 ) 6670696_PF01041_PF01276 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01041 is 6650488 with Jaccard = 0.9774 |PF01041|=654 [ 649 10 1099547 5 ]
parent [ 6650488 ] : 6670696 0.163765 (=18852/(159*724)) 85.6823
given [ 6650488 ] : 6650488 0.235131 (=170/(1*723)) 80.2206
best keyword for cluster 6650488 is PF01041 with Jaccard = 0.9774 [ 649 10 1099547 5 ] 0.9848 0.9924
sibling [ 6650488 ] : 6532901 0.760684 (=356/(3*156)) 27.5861
best keyword for cluster 6532901 is PF01276 with Jaccard = 0.9605 [ 146 5 1100059 1 ] 0.9669 0.9932
SUGGESTING RELATEDNESS OF:
A> PF01041 ( PF01041 DegT/DnrJ/EryC1/StrS aminotransferase family )
B> PF01276 ( PF01276 Orn/Lys/Arg decarboxylase, major domain )
they come from the same clan: CL0061.8 : PF05889 PF00464 PF03841 PF00282 PF01276 PF02347 PF01041 PF01053 PF01212 PF00266 PF00202 PF00155 PF06838 PF04864
the two keywords do not coincide on UniRef90 proteins
both PF01041 and PF01276 have PDB structures
PF01041 c.67.1.4
SUPERFAM mapping significantly overlapping:
1 PF01041 SSF53383 0.932 (average over 1805 mutual instances, PF01041 1817 appearances, SSF53383 34644 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 349 ) 6599234_PF04101_PF06925 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06925 is 6478966 with Jaccard = 0.9773 |PF06925|=43 [ 43 1 1100167 0 ]
parent [ 6478966 ] : 6599234 0.41884 (=5927/(267*53)) 61.167
given [ 6478966 ] : 6478966 0.945833 (=227/(5*48)) 5.4698
best keyword for cluster 6478966 is PF06925 with Jaccard = 0.9773 [ 43 1 1100167 0 ] 0.9773 1.0000
sibling [ 6478966 ] : 6375579 1 (=16892/(103*164)) 0.000250955
best keyword for cluster 6375579 is PF04101 with Jaccard = 0.6096 [ 242 0 1099814 155 ] 1.0000 0.6096
SUGGESTING RELATEDNESS OF:
A> PF06925 ( PF06925 Monogalactosyldiacylglycerol (MGDG) synthase )
B> PF04101 ( PF04101 Glycosyltransferase family 28 C-terminal domain )
they come from the same clan: CL0113.8 : PF06925 PF02684 PF04464 PF04101 PF01075 PF03033 PF00982 PF00534 PF05693 PF02350 PF04007 PF06722 PF05159 PF08660 PF00343 PF00201
the two keywords coincide on UniRef90 proteins: |PF04101| = 397 , |PF06925| = 43 , |PF04101^PF06925| = 19 ( 4.8% and 44.2% )
only PF06925 has a PDB structure (may not be up to date)
PF04101 c.87.1.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 350 ) 6736629_PF00823_PF08237 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF08237 is 6681375 with Jaccard = 0.9773 |PF08237|=44 [ 43 0 1100167 1 ]
parent [ 6681375 ] : 6736629 0.0264975 (=403/(67*227)) 97.439
given [ 6681375 ] : 6681375 0.121118 (=117/(21*46)) 88.525
best keyword for cluster 6681375 is PF08237 with Jaccard = 0.9773 [ 43 0 1100167 1 ] 1.0000 0.9773
sibling [ 6681375 ] : 6716602 0.0560538 (=50/(4*223)) 94.993
best keyword for cluster 6716602 is PF00823 with Jaccard = 0.7875 [ 126 27 1100051 7 ] 0.8235 0.9474
SUGGESTING RELATEDNESS OF:
A> PF08237 ( PF08237 PE-PPE domain )
B> PF00823 ( PF00823 PPE family )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00823| = 133 , |PF08237| = 44 , |PF00823^PF08237| = 3 ( 2.3% and 6.8% )
Neither PF08237 nor PF00823 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 351 ) 6750901_PF01722_PF02657 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02657 is 6551215 with Jaccard = 0.9769 |PF02657|=130 [ 127 0 1100081 3 ]
parent [ 6551215 ] : 6750901 0.0133596 (=645/(142*340)) 98.6871
given [ 6551215 ] : 6551215 0.601918 (=251/(3*139)) 39.9631
best keyword for cluster 6551215 is PF02657 with Jaccard = 0.9769 [ 127 0 1100081 3 ] 1.0000 0.9769
sibling [ 6551215 ] : 6717119 0.0530973 (=18/(1*339)) 95.0367
best keyword for cluster 6717119 is PF01722 with Jaccard = 0.9810 [ 310 0 1099895 6 ] 1.0000 0.9810
SUGGESTING RELATEDNESS OF:
A> PF02657 ( PF02657 Fe-S metabolism associated domain )
B> PF01722 ( PF01722 BolA-like protein )
Only A has a clan ( CL0233.3 ).
the two keywords coincide on Uniref90 proteins: |PF01722| = 316 , |PF02657| = 130 , |PF01722^PF02657| = 2 ( 0.6% and 1.5% ) both PF02657 and PF01722 have PDB structures PF01722 d.52.6.1 SUPERFAM mapping significantly overlapping: 1 PF01722 SSF82657 0.885 (average over 968 mutual instances, PF01722 976 appearances, SSF82657 980 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 352 ) 6752913_PF00817_PF02961 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00817 is 6743482 with Jaccard = 0.9768 |PF00817|=516 [ 505 1 1099694 11 ] parent [ 6743482 ] : 6752913 0.0182865 (=1162/(94*676)) 98.8327 given [ 6743482 ] : 6743482 0.0211403 (=99/(7*669)) 98.0969 best keyword for cluster 6743482 is PF00817 with Jaccard = 0.9768 [ 505 1 1099694 11 ] 0.9980 0.9787 sibling [ 6743482 ] : 6742618 0.0300065 (=46/(73*21)) 98.0191 best keyword for cluster 6742618 is PF02961 with Jaccard = 0.7857 [ 11 3 1100197 0 ] 0.7857 1.0000 SUGGESTING RELATEDNESS OF: A> PF00817 ( PF00817 impB/mucB/samB family ) B> PF02961 ( PF02961 Barrier to autointegration factor ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF00817 and PF02961 have PDB structures PF00817 d.240.1.1 e.8.1.7 PF02961 a.60.5.1 SUPERFAM mapping significantly overlapping: 1 PF02961 SSF47798 0.975 (average over 29 mutual instances, PF02961 29 appearances, SSF47798 29 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 353 ) 6750980_PF00639_PF04319 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF04319 is 6524652 with Jaccard = 0.9767 |PF04319|=43 [ 42 0 1100168 1 ] parent [ 6524652 ] : 6750980 0.0135312 (=506/(45*831)) 98.6941 given [ 6524652 ] : 6524652 0.792683 (=130/(4*41)) 22.8423 best keyword for cluster 6524652 is PF04319 with Jaccard = 0.9767 [ 42 0 1100168 1 ] 1.0000 0.9767 sibling [ 6524652 ] : 6747622 0.019593 (=129/(8*823)) 98.4372 best keyword for cluster 6747622 is PF00639 with Jaccard = 0.9720 [ 520 5 1099676 10 ] 0.9905 0.9811 SUGGESTING RELATEDNESS OF: A> PF04319 ( PF04319 NifZ domain ) B> PF00639 ( PF00639 PPIC-type PPIASE domain ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF00639| = 530 , |PF04319| = 43 , |PF00639^PF04319| = 1 ( 0.2% and 2.3% ) only PF04319 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 354 ) 6751748_PF03692_PF05779 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF03692 is 6748798 with Jaccard = 0.9758 |PF03692|=289 [ 282 0 1099922 7 ] parent [ 6748798 ] : 6751748 0.0190383 (=700/(96*383)) 98.7501 given [ 6748798 ] : 6748798 0.0207989 (=151/(20*363)) 98.5273 best keyword for cluster 6748798 is PF03692 with Jaccard = 0.9758 [ 282 0 1099922 7 ] 1.0000 0.9758 sibling [ 6748798 ] : 6743017 0.035313 (=22/(89*7)) 98.0547 best keyword for cluster 6743017 is PF05779 with Jaccard = 0.9868 [ 75 1 1100135 0 ] 0.9868 1.0000 SUGGESTING RELATEDNESS OF: A> PF03692 ( PF03692 Uncharacterised protein family (UPF0153) ) B> PF05779 ( ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF03692| = 289 , |PF05779| = 75 , |PF03692^PF05779| = 1 ( 0.3% and 1.3% ) Neither PF03692 nor PF05779 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 355 ) 6746429_PF01763_PF03271 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01763 is 6488277 with Jaccard = 0.9756 |PF01763|=41 [ 40 0 1100170 1 ] parent [ 6488277 ] : 6746429 0.0265766 (=118/(40*111)) 98.3434 given [ 6488277 ] : 6488277 0.923077 (=36/(1*39)) 7.76623 best keyword for cluster 6488277 is PF01763 with Jaccard = 0.9756 [ 40 0 1100170 1 ] 1.0000 0.9756 sibling [ 6488277 ] : 6740396 0.0263736 (=48/(91*20)) 97.8172 best keyword for cluster 6740396 is PF03271 with Jaccard = 0.7922 [ 61 15 1100134 1 ] 0.8026 0.9839 SUGGESTING RELATEDNESS OF: A> PF01763 ( PF01763 Herpesvirus UL6 like ) B> PF03271 ( PF03271 EB1-like C-terminal motif ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF01763 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 356 ) 6762508_PF03968_PF06835 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF06835 is 6748807 with Jaccard = 0.9756 |PF06835|=81 [ 80 1 1100129 1 ] parent [ 6748807 ] : 6762508 0.00917447 (=549/(160*374)) 99.3945 given [ 6748807 ] : 6748807 0.0194892 (=87/(124*36)) 98.5286 best keyword for cluster 6748807 is PF06835 with Jaccard = 0.9756 [ 80 1 1100129 1 ] 0.9877 0.9877 sibling [ 6748807 ] : 6752400 0.0142857 (=52/(364*10)) 98.7961 best keyword for cluster 6752400 is PF03968 with Jaccard = 0.7921 [ 221 46 1099932 12 ] 0.8277 0.9485 SUGGESTING RELATEDNESS OF: A> PF06835 ( PF06835 Protein of unknown function (DUF1239) ) B> PF03968 ( PF03968 OstA-like protein ) they come from the same clan: CL0259.2 : PF06835 PF03968 the two keywords do not coincide on UniRef90 proteins Neither PF06835 nor PF03968 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 357 ) 6754277_PF01206_PF02635 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02635 is 6742716 with Jaccard = 0.9755 |PF02635|=161 [ 159 2 1100048 2 ] parent [ 6742716 ] : 6754277 0.0119638 (=1584/(313*423)) 98.9237 given [ 6742716 ] : 6742716 0.0233392 (=397/(243*70)) 98.026 best keyword for cluster 6742716 is PF02635 with Jaccard = 0.9755 [ 159 2 1100048 2 ] 0.9876 0.9876 sibling [ 6742716 ] : 6748727 0.0220923 (=612/(81*342)) 98.5214 best keyword for cluster 6748727 is PF01206 with Jaccard = 0.9051 [ 248 7 1099937 19 ] 0.9725 0.9288 SUGGESTING RELATEDNESS OF: A> PF02635 ( PF02635 DsrE/DsrF-like family ) B> PF01206 ( PF01206 SirA-like protein ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF01206| = 267 , |PF02635| = 161 , |PF01206^PF02635| = 4 ( 1.5% and 2.5% ) both PF02635 and PF01206 have PDB structures PF02635 c.114.1.1 SUPERFAM mapping significantly overlapping: 1 PF01206 SSF64307 0.950 (average over 849 mutual instances, PF01206 954 appearances, SSF64307 1042 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 358 ) 6542083_PF00393_PF03446 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00393 is 6532366 with Jaccard = 0.9753 |PF00393|=277 [ 276 6 1099928 1 ] parent [ 6532366 ] : 6542083 0.685969 (=116837/(308*553)) 33.5237 given [ 6532366 ] : 6532366 0.73768 (=2410/(11*297)) 27.0546 best keyword for cluster 6532366 is PF00393 with Jaccard = 0.9753 [ 276 6 1099928 1 ] 0.9787 0.9964 sibling [ 6532366 ] : 6532887 0.746717 (=7960/(20*533)) 27.5737 best keyword for cluster 6532887 is PF03446 with Jaccard = 0.6203 [ 490 3 1099421 297 ] 0.9939 0.6226 SUGGESTING RELATEDNESS OF: A> PF00393 ( PF00393 6-phosphogluconate dehydrogenase, C-terminal domain ) B> PF03446 ( PF03446 NAD binding domain of 6-phosphogluconate dehydrogenase ) A and B come from different clans ( CL0106.7 , CL0063.17 ). the two keywords coincide on Uniref90 proteins: |PF00393| = 277 , |PF03446| = 787 , |PF00393^PF03446| = 255 ( 92.1% and 32.4% ) both PF00393 and PF03446 have PDB structures PF00393 a.100.1.1 SUPERFAM mapping significantly overlapping: 1 PF00393 SSF48179 0.846 (average over 1097 mutual instances, PF00393 2092 appearances, SSF48179 20570 appearances) 2 PF03446 SSF51735 0.944 (average over 2816 mutual instances, PF03446 5512 appearances, SSF51735 164772 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 359 ) 6634810_PF03088_PF08450 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF03088 is 6629232 with Jaccard = 0.9750 |PF03088|=80 [ 78 0 1100131 2 ] parent [ 6629232 ] : 6634810 0.273158 (=7201/(269*98)) 75.6875 given [ 6629232 ] : 6629232 0.313978 (=146/(5*93)) 74.2173 best keyword for cluster 6629232 is PF03088 with Jaccard = 0.9750 [ 78 0 1100131 2 ] 1.0000 0.9750 sibling [ 6629232 ] : 6626669 0.277154 (=148/(2*267)) 73.2134 best keyword for cluster 6626669 is PF08450 with Jaccard = 0.8979 [ 211 16 1099976 8 ] 0.9295 0.9635 SUGGESTING RELATEDNESS OF: A> PF03088 ( PF03088 Strictosidine synthase ) B> PF08450 ( PF08450 SMP-30/Gluconolaconase/LRE-like region ) they come from the same clan: CL0186.8 : PF03088 PF08450 PF06739 PF07494 PF01011 PF02897 PF07676 PF08801 PF01436 PF06433 PF00058 PF01839 PF00930 PF02239 PF01731 PF00400 the two keywords do not coincide on UniRef90 proteins only PF03088 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 360 ) 6700753_PF05105_PF05895 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF05105 is 6680068 with Jaccard = 0.9750 |PF05105|=40 [ 39 0 1100171 1 ] parent [ 6680068 ] : 6700753 0.077381 (=39/(12*42)) 92.3104 given [ 6680068 ] : 6680068 0.138158 (=21/(4*38)) 88.1723 best keyword for cluster 6680068 is PF05105 with Jaccard = 0.9750 [ 39 0 1100171 1 ] 1.0000 0.9750 sibling [ 6680068 ] : 5822328 1 (=32/(8*4)) 9.37506e-52 best keyword for cluster 5822328 is PF05895 with Jaccard = 0.9231 [ 12 0 1100198 1 ] 1.0000 0.9231 SUGGESTING RELATEDNESS OF: A> PF05105 ( PF05105 Holin family ) B> PF05895 ( PF05895 Siphovirus protein of unknown function (DUF859) ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF05105| = 40 , |PF05895| = 13 , |PF05105^PF05895| = 1 ( 2.5% and 7.7% ) Neither PF05105 nor PF05895 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 361 ) 6732749_PF00581_PF00899 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00899 is 6728865 with Jaccard = 0.9749 |PF00899|=901 [ 893 15 1099295 8 ] parent [ 6728865 ] : 6732749 0.0326903 (=54432/(1529*1089)) 97.0243 given [ 6728865 ] : 6728865 0.043517 (=976/(21*1068)) 96.594 best keyword for cluster 6728865 is PF00899 with Jaccard = 0.9749 [ 893 15 1099295 8 ] 0.9835 0.9911 sibling [ 6728865 ] : 6721893 0.0539344 (=329/(4*1525)) 95.6893 best keyword for cluster 6721893 is PF00581 with Jaccard = 0.8021 [ 1050 8 1098902 251 ] 0.9924 0.8071 SUGGESTING RELATEDNESS OF: A> PF00899 ( PF00899 ThiF family ) B> PF00581 ( PF00581 Rhodanese-like domain ) Only A has a clan ( CL0063.17 ). 
the two keywords coincide on Uniref90 proteins: |PF00581| = 1301 , |PF00899| = 901 , |PF00581^PF00899| = 72 ( 5.5% and 8.0% ) both PF00899 and PF00581 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF00581 SSF52821 0.763 (average over 3964 mutual instances, PF00581 4463 appearances, SSF52821 6143 appearances) 2 PF00899 SSF69572 0.518 (average over 2370 mutual instances, PF00899 2642 appearances, SSF69572 3931 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 362 ) 6737404_PF00485_PF01121 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01121 is 6731962 with Jaccard = 0.9746 |PF01121|=313 [ 307 2 1099896 6 ] parent [ 6731962 ] : 6737404 0.0391929 (=8508/(536*405)) 97.5261 given [ 6731962 ] : 6731962 0.0310602 (=317/(27*378)) 96.9348 best keyword for cluster 6731962 is PF01121 with Jaccard = 0.9746 [ 307 2 1099896 6 ] 0.9935 0.9808 sibling [ 6731962 ] : 6720335 0.0650943 (=207/(6*530)) 95.464 best keyword for cluster 6720335 is PF00485 with Jaccard = 0.8920 [ 347 4 1099822 38 ] 0.9886 0.9013 SUGGESTING RELATEDNESS OF: A> PF01121 ( PF01121 Dephospho-CoA kinase ) B> PF00485 ( PF00485 Phosphoribulokinase / Uridine kinase family ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF01121 and PF00485 have PDB structures PF01121 c.37.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 363 ) 6740876_PF03410_PF05193 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF05193 is 6728858 with Jaccard = 0.9741 |PF05193|=945 [ 939 19 1099247 6 ] parent [ 6728858 ] : 6740876 0.0290778 (=548/(1047*18)) 97.8647 given [ 6728858 ] : 6728858 0.0353167 (=184/(1042*5)) 96.5927 best keyword for cluster 6728858 is PF05193 with Jaccard = 0.9741 [ 939 19 1099247 6 ] 0.9802 0.9937 sibling [ 6728858 ] : 6733557 0.0588235 (=1/(1*17)) 97.1176 best keyword for cluster 6733557 is PF03410 with Jaccard = 0.9231 [ 12 1 1100198 0 ] 0.9231 1.0000 SUGGESTING RELATEDNESS OF: A> PF05193 ( PF05193 Peptidase M16 inactive domain ) B> PF03410 ( PF03410 Protein G1 ) they come from the same clan: CL0094.7 : PF02664 PF00675 PF05193 PF03410 the two keywords do not coincide on UniRef90 proteins only PF05193 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 364 ) 6710584_PF01417_PF07651 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01417 is 6577412 with Jaccard = 0.9735 |PF01417|=113 [ 110 0 1100098 3 ] parent [ 6577412 ] : 6710584 0.0803684 (=890/(113*98)) 94.0104 given [ 6577412 ] : 6577412 0.477477 (=106/(2*111)) 52.9269 best keyword for cluster 6577412 is PF01417 with Jaccard = 0.9735 [ 110 0 1100098 3 ] 1.0000 0.9735 sibling [ 6577412 ] : 6694982 0.0930851 (=35/(4*94)) 91.3228 best keyword for cluster 6694982 is PF07651 with Jaccard = 0.6960 [ 87 1 1100086 37 ] 0.9886 0.7016 SUGGESTING RELATEDNESS OF: A> PF01417 ( PF01417 ENTH domain ) B> PF07651 ( PF07651 ANTH domain ) they come from the same clan: CL0009.14 : PF01417 PF07651 PF00790 the two keywords do not coincide on UniRef90 proteins both PF01417 and PF07651 have PDB structures PF01417 a.118.9.1 SUPERFAM mapping significantly overlapping: 1 PF01417 SSF48464 0.847 (average over 240 mutual instances, PF01417 240 appearances, SSF48464 1729 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 365 ) 6758123_PF03672_PF04729 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF03672 is 6439359 with Jaccard = 0.9730 |PF03672|=37 [ 36 0 1100174 1 ] parent [ 6439359 ] : 6758123 0.0160595 (=41/(37*69)) 99.1604 given [ 6439359 ] : 6439359 0.995238 (=209/(7*30)) 0.62387 best keyword for cluster 6439359 is PF03672 with Jaccard = 0.9730 [ 36 0 1100174 1 ] 1.0000 0.9730 sibling [ 6439359 ] : 6742166 0.0274725 (=20/(56*13)) 97.9857 best keyword for cluster 6742166 is PF04729 with Jaccard = 1.0000 [ 47 0 1100164 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03672 ( PF03672 Uncharacterised protein family (UPF0154) ) B> PF04729 ( PF04729 Anti-silencing protein, ASF1-like ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF03672 has a PDB structure (may not be up to date) PF04729 b.1.22.1 SUPERFAM mapping significantly overlapping: 1 PF04729 SSF101546 0.968 (average over 114 mutual instances, PF04729 114 appearances, SSF101546 115 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 366 ) 6711543_PF00004_PF06309 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF06309 is 6689666 with Jaccard = 0.9730 |PF06309|=36 [ 36 1 1100174 0 ] parent [ 6689666 ] : 6711543 0.0669631 (=24623/(51*7210)) 94.1712 given [ 6689666 ] : 6689666 0.111111 (=30/(45*6)) 90.2085 best keyword for cluster 6689666 is PF06309 with Jaccard = 0.9730 [ 36 1 1100174 0 ] 0.9730 1.0000 sibling [ 6689666 ] : 6708571 0.0663275 (=10964/(23*7187)) 93.739 best keyword for cluster 6708571 is PF00004 with Jaccard = 0.6403 [ 3979 2070 1093997 165 ] 0.6578 0.9602 SUGGESTING RELATEDNESS OF: A> PF06309 ( PF06309 Torsin ) B> PF00004 ( PF00004 ATPase family associated with various cellular activities (AAA) ) Only B has a clan ( CL0023.26 ). the two keywords do not coincide on UniRef90 proteins only PF06309 has a PDB structure (may not be up to date) PF00004 c.37.1.1 c.37.1.20 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 367 ) 6740652_PF01758_PF03547 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF03547 is 6693831 with Jaccard = 0.9729 |PF03547|=405 [ 395 1 1099805 10 ] parent [ 6693831 ] : 6740652 0.0282873 (=6970/(448*550)) 97.8419 given [ 6693831 ] : 6693831 0.112273 (=2866/(67*381)) 91.0553 best keyword for cluster 6693831 is PF03547 with Jaccard = 0.9729 [ 395 1 1099805 10 ] 0.9975 0.9753 sibling [ 6693831 ] : 6740319 0.0255009 (=14/(1*549)) 97.8069 best keyword for cluster 6740319 is PF01758 with Jaccard = 0.8565 [ 376 62 1099772 1 ] 0.8584 0.9973 SUGGESTING RELATEDNESS OF: A> PF03547 ( PF03547 Membrane transport protein ) B> PF01758 ( PF01758 Sodium Bile acid symporter family ) they come from the same clan: CL0064.7 : PF06826 PF03547 PF03601 PF05684 PF05982 PF03616 PF06965 PF00999 PF03977 PF01758 the two keywords do not coincide on UniRef90 proteins Neither PF03547 nor PF01758 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 368 ) 6751086_PF01274_PF03328 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01274 is 6537339 with Jaccard = 0.9713 |PF01274|=174 [ 169 0 1100037 5 ] parent [ 6537339 ] : 6751086 0.0176249 (=1499/(189*450)) 98.7029 given [ 6537339 ] : 6537339 0.713358 (=6259/(82*107)) 30.4034 best keyword for cluster 6537339 is PF01274 with Jaccard = 0.9713 [ 169 0 1100037 5 ] 1.0000 0.9713 sibling [ 6537339 ] : 6712190 0.0746986 (=3668/(264*186)) 94.2795 best keyword for cluster 6712190 is PF03328 with Jaccard = 0.9634 [ 368 12 1099829 2 ] 0.9684 0.9946 SUGGESTING RELATEDNESS OF: A> PF01274 ( PF01274 Malate synthase ) B> PF03328 ( PF03328 HpcH/HpaI aldolase/citrate lyase family ) they come from the same clan: CL0151.7 : PF03328 PF01274 PF02896 PF00224 the two keywords do not coincide on UniRef90 proteins both PF01274 and PF03328 have PDB structures PF03328 c.1.12.5 SUPERFAM mapping significantly overlapping: 1 PF03328 SSF51621 0.868 (average over 1134 mutual instances, PF03328 1215 appearances, SSF51621 12495 appearances) 2 PF01274 SSF51645 0.908 (average over 679 mutual instances, PF01274 686 appearances, SSF51645 767 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 369 ) 6779200_PF05910_PF07893 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF07893 is 6771772 with Jaccard = 0.9710 |PF07893|=69 [ 67 0 1100142 2 ] parent [ 6771772 ] : 6779200 0.000873536 (=22/(115*219)) 99.9408 given [ 6771772 ] : 6771772 0.00328407 (=8/(87*28)) 99.7681 best keyword for cluster 6771772 is PF07893 with Jaccard = 0.9710 [ 67 0 1100142 2 ] 1.0000 0.9710 sibling [ 6771772 ] : 6774540 0.00195713 (=21/(145*74)) 99.8444 best keyword for cluster 6774540 is PF05910 with Jaccard = 0.6571 [ 23 12 1100176 0 ] 0.6571 1.0000 SUGGESTING RELATEDNESS OF: A> PF07893 ( PF07893 Protein of unknown function (DUF1668) ) B> PF05910 ( PF05910 Plant protein of unknown function (DUF868) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07893 nor PF05910 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 370 ) 6767275_PF01208_PF04217 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01208 is 6735556 with Jaccard = 0.9706 |PF01208|=272 [ 264 0 1099939 8 ] parent [ 6735556 ] : 6767275 0.00695667 (=162/(319*73)) 99.6081 given [ 6735556 ] : 6735556 0.0305466 (=76/(311*8)) 97.3307 best keyword for cluster 6735556 is PF01208 with Jaccard = 0.9706 [ 264 0 1099939 8 ] 1.0000 0.9706 sibling [ 6735556 ] : 6764765 0.0138889 (=1/(1*72)) 99.5 best keyword for cluster 6764765 is PF04217 with Jaccard = 1.0000 [ 22 0 1100189 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF01208 ( PF01208 Uroporphyrinogen decarboxylase (URO-D) ) B> PF04217 ( PF04217 Protein of unknown function, DUF412 ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF01208 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 371 ) 6650231_PF01148_PF01864 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01864 is 6497798 with Jaccard = 0.9706 |PF01864|=34 [ 33 0 1100177 1 ] parent [ 6497798 ] : 6650231 0.246841 (=3067/(355*35)) 80.061 given [ 6497798 ] : 6497798 0.907407 (=196/(27*8)) 10.8695 best keyword for cluster 6497798 is PF01864 with Jaccard = 0.9706 [ 33 0 1100177 1 ] 1.0000 0.9706 sibling [ 6497798 ] : 6448283 0.991523 (=13568/(44*311)) 1.19472 best keyword for cluster 6448283 is PF01148 with Jaccard = 0.7222 [ 325 0 1099761 125 ] 1.0000 0.7222 SUGGESTING RELATEDNESS OF: A> PF01864 ( PF01864 Putative integral membrane protein DUF46 ) B> PF01148 ( PF01148 Cytidylyltransferase family ) they come from the same clan: CL0234.3 : PF01148 PF01864 the two keywords do not coincide on UniRef90 proteins Neither PF01864 nor PF01148 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 372 ) 6740778_PF02001_PF04198 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02001 is 6731336 with Jaccard = 0.9706 |PF02001|=66 [ 66 2 1100143 0 ] parent [ 6731336 ] : 6740778 0.03243 (=432/(173*77)) 97.8562 given [ 6731336 ] : 6731336 0.0361111 (=13/(72*5)) 96.8664 best keyword for cluster 6731336 is PF02001 with Jaccard = 0.9706 [ 66 2 1100143 0 ] 0.9706 1.0000 sibling [ 6731336 ] : 6667839 0.2 (=264/(8*165)) 84.9514 best keyword for cluster 6667839 is PF04198 with Jaccard = 0.9272 [ 140 9 1100060 2 ] 0.9396 0.9859 SUGGESTING RELATEDNESS OF: A> PF02001 ( PF02001 Protein of unknown function DUF134 ) B> PF04198 ( PF04198 Putative sugar-binding domain ) A and B come from different clans ( CL0123.12 , CL0246.3 ). the two keywords do not coincide on UniRef90 proteins Neither PF02001 nor PF04198 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF02001 SSF88659 0.548 (average over 16 mutual instances, PF02001 23 appearances, SSF88659 22430 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 373 ) 6616760_PF00014_PF02177 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02177 is 6605242 with Jaccard = 0.9706 |PF02177|=33 [ 33 1 1100177 0 ] parent [ 6605242 ] : 6616760 0.31396 (=4235/(329*41)) 69.1394 given [ 6605242 ] : 6605242 0.358974 (=28/(2*39)) 64.1046 best keyword for cluster 6605242 is PF02177 with Jaccard = 0.9706 [ 33 1 1100177 0 ] 0.9706 1.0000 sibling [ 6605242 ] : 6600405 0.464832 (=304/(2*327)) 61.9802 best keyword for cluster 6600405 is PF00014 with Jaccard = 0.7529 [ 320 0 1099786 105 ] 1.0000 0.7529 SUGGESTING RELATEDNESS OF: A> PF02177 ( PF02177 Amyloid A4 extracellular domain ) B> PF00014 ( PF00014 Kunitz/Bovine pancreatic trypsin inhibitor domain ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF00014| = 426 , |PF02177| = 33 , |PF00014^PF02177| = 13 ( 3.1% and 39.4% ) both PF02177 and PF00014 have PDB structures PF02177 d.170.2.1 d.230.3.1 PF00014 g.8.1.1 g.8.1.2 k.35.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 374 ) 6745343_PF06295_PF06511 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF06295 is 6539149 with Jaccard = 0.9706 |PF06295|=34 [ 33 0 1100177 1 ] parent [ 6539149 ] : 6745343 0.0223684 (=17/(38*20)) 98.2595 given [ 6539149 ] : 6539149 0.695238 (=73/(3*35)) 31.858 best keyword for cluster 6539149 is PF06295 with Jaccard = 0.9706 [ 33 0 1100177 1 ] 1.0000 0.9706 sibling [ 6539149 ] : 6727382 0.04 (=3/(5*15)) 96.4 best keyword for cluster 6727382 is PF06511 with Jaccard = 0.8889 [ 8 1 1100202 0 ] 0.8889 1.0000 SUGGESTING RELATEDNESS OF: A> PF06295 ( PF06295 Protein of unknown function (DUF1043) ) B> PF06511 ( PF06511 Invasion plasmid antigen IpaD ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF06295 nor PF06511 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 375 ) 6758309_PF03663_PF07470 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07470 is 6725770 with Jaccard = 0.9706 |PF07470|=101 [ 99 1 1100109 2 ] parent [ 6725770 ] : 6758309 0.0124713 (=348/(128*218)) 99.172 given [ 6725770 ] : 6725770 0.0515516 (=206/(54*74)) 96.2043 best keyword for cluster 6725770 is PF07470 with Jaccard = 0.9706 [ 99 1 1100109 2 ] 0.9900 0.9802 sibling [ 6725770 ] : 6748843 0.0219372 (=248/(85*133)) 98.5323 best keyword for cluster 6748843 is PF03663 with Jaccard = 0.6067 [ 108 70 1100033 0 ] 0.6067 1.0000 SUGGESTING RELATEDNESS OF: A> PF07470 ( PF07470 Glycosyl Hydrolase Family 88 ) B> PF03663 ( PF03663 Glycosyl hydrolase family 76 ) Only A has a clan ( CL0059.10 ). 
the two keywords do not coincide on UniRef90 proteins
only PF07470 has a PDB structure (may not be up to date)
PF07470 a.102.1.6 a.102.1.7
SUPERFAM mapping significantly overlapping:
1 PF03663 SSF48208 0.809 (average over 242 mutual instances, PF03663 251 appearances, SSF48208 6032 appearances)
2 PF07470 SSF48208 0.854 (average over 295 mutual instances, PF07470 297 appearances, SSF48208 6032 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 376 ) 6646486_PF00928_PF01217 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01217 is 6600961 with Jaccard = 0.9702 |PF01217|=168 [ 163 0 1100043 5 ]
parent [ 6600961 ] : 6646486 0.238624 (=8474/(184*193)) 78.9734
given [ 6600961 ] : 6600961 0.412251 (=2894/(130*54)) 62.0741
best keyword for cluster 6600961 is PF01217 with Jaccard = 0.9702 [ 163 0 1100043 5 ] 1.0000 0.9702
sibling [ 6600961 ] : 6624085 0.311862 (=2140/(47*146)) 72.0361
best keyword for cluster 6624085 is PF00928 with Jaccard = 0.9176 [ 167 0 1100029 15 ] 1.0000 0.9176
SUGGESTING RELATEDNESS OF:
A> PF01217 ( PF01217 Clathrin adaptor complex small chain )
B> PF00928 ( PF00928 Adaptor complexes medium subunit family )
Only A has a clan ( CL0212.4 ).
the two keywords do not coincide on UniRef90 proteins
both PF01217 and PF00928 have PDB structures
PF01217 d.110.4.2 i.23.1.1
PF00928 b.2.7.1 i.23.1.1
SUPERFAM mapping significantly overlapping:
1 PF00928 SSF49447 0.984 (average over 510 mutual instances, PF00928 940 appearances, SSF49447 619 appearances)
2 PF01217 SSF64356 0.981 (average over 601 mutual instances, PF01217 700 appearances, SSF64356 1711 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 377 ) 6577955_PF00025_PF00503 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00025 is 6559928 with Jaccard = 0.9700 |PF00025|=429 [ 420 4 1099778 9 ]
parent [ 6559928 ] : 6577955 0.522522 (=80110/(466*329)) 53.0535
given [ 6559928 ] : 6559928 0.551422 (=2268/(9*457)) 46.776
best keyword for cluster 6559928 is PF00025 with Jaccard = 0.9700 [ 420 4 1099778 9 ] 0.9906 0.9790
sibling [ 6559928 ] : 6561247 0.585366 (=192/(1*328)) 47.9601
best keyword for cluster 6561247 is PF00503 with Jaccard = 0.9517 [ 315 0 1099880 16 ] 1.0000 0.9517
SUGGESTING RELATEDNESS OF:
A> PF00025 ( PF00025 ADP-ribosylation factor family )
B> PF00503 ( PF00503 G-protein alpha subunit )
they come from the same clan: CL0017.14 : PF00735 PF00071 PF06858 PF01926 PF08477 PF05049 PF00009 PF00503 PF00350 PF09439 PF03193 PF03029 PF00025 PF04548
the two keywords do not coincide on UniRef90 proteins
both PF00025 and PF00503 have PDB structures
PF00025 c.37.1.8
PF00503 j.56.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 378 ) 6722352_PF03083_PF04193 ==========------------------
====== the next two brothers (given,
sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04193 is 6695625 with Jaccard = 0.9698 |PF04193|=199 [ 193 0 1100012 6 ]
parent [ 6695625 ] : 6722352 0.0591151 (=2314/(233*168)) 95.7616
given [ 6695625 ] : 6695625 0.119113 (=1209/(58*175)) 91.4789
best keyword for cluster 6695625 is PF04193 with Jaccard = 0.9698 [ 193 0 1100012 6 ] 1.0000 0.9698
sibling [ 6695625 ] : 6713456 0.0640625 (=82/(160*8)) 94.4822
best keyword for cluster 6713456 is PF03083 with Jaccard = 0.9717 [ 103 3 1100105 0 ] 0.9717 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04193 ( PF04193 PQ loop repeat )
B> PF03083 ( PF03083 MtN3/saliva family )
they come from the same clan: CL0141.7 : PF04193 PF03083 PF07578 PF03650
the two keywords coincide on UniRef90 proteins: |PF03083| = 103 , |PF04193| = 199 , |PF03083^PF04193| = 1 ( 1.0% and 0.5% )
Neither PF04193 nor PF03083 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 379 ) 6701536_PF00977_PF04309 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04309 is 6313306 with Jaccard = 0.9697 |PF04309|=32 [ 32 1 1100178 0 ]
parent [ 6313306 ] : 6701536 0.107232 (=2239/(36*580)) 92.446
given [ 6313306 ] : 6313306 1 (=68/(2*34)) 1.25177e-08
best keyword for cluster 6313306 is PF04309 with Jaccard = 0.9697 [ 32 1 1100178 0 ] 0.9697 1.0000
sibling [ 6313306 ] : 6685620 0.134715 (=78/(1*579)) 89.3861
best keyword for cluster 6685620 is PF00977 with Jaccard = 0.9030 [ 484 50 1099675 2 ] 0.9064 0.9959
SUGGESTING RELATEDNESS OF:
A> PF04309 ( PF04309 Glycerol-3-phosphate responsive antiterminator )
B> PF00977 ( PF00977 Histidine biosynthesis protein )
they come from the same clan: CL0036.17 : PF05690 PF01680 PF00834 PF01729 PF00697 PF03740 PF01884 PF00724 PF00215 PF03060 PF04095 PF04131 PF00478 PF00218 PF00977 PF01645 PF04309 PF01070 PF01207 PF04481 PF04476 PF01180 PF00701 PF01791 PF03932 PF03437 PF01081 PF00121 PF09370 PF02581 PF00290
the two keywords do not coincide on UniRef90 proteins
both PF04309 and PF00977 have PDB structures
PF04309 c.1.29.1
PF00977 c.1.2.1
SUPERFAM mapping significantly overlapping:
1 PF04309 SSF110391 0.979 (average over 128 mutual instances, PF04309 128 appearances, SSF110391 128 appearances)
2 PF00977 SSF51366 0.938 (average over 1629 mutual instances, PF00977 1632 appearances, SSF51366 8168 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 380 ) 6737615_PF00462_PF03479 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03479 is 6690625 with Jaccard = 0.9695 |PF03479|=130 [ 127 1 1100080 3 ]
parent [ 6690625 ] : 6737615 0.0264726 (=4185/(168*941)) 97.5474
given [ 6690625 ] : 6690625 0.107533 (=748/(94*74)) 90.3884
best keyword for cluster 6690625 is PF03479 with Jaccard = 0.9695 [ 127 1 1100080 3 ] 0.9922 0.9769
sibling [ 6690625 ] : 6731891 0.0352304 (=2148/(70*871)) 96.9284
best keyword for cluster 6731891 is PF00462 with Jaccard = 0.7431 [ 729 79 1099230 173 ] 0.9022 0.8082
SUGGESTING RELATEDNESS OF:
A> PF03479 ( PF03479 Domain of unknown function (DUF296) )
B> PF00462 ( PF00462 Glutaredoxin )
Only B has a clan ( CL0172.11 ).
the two keywords coincide on UniRef90 proteins: |PF00462| = 902 , |PF03479| = 130 , |PF00462^PF03479| = 8 ( 0.9% and 6.2% )
only PF03479 has a PDB structure (may not be up to date)
PF00462 c.47.1.1
SUPERFAM mapping significantly overlapping:
1 PF00462 SSF52833 0.710 (average over 2554 mutual instances, PF00462 2661 appearances, SSF52833 34965 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 381 ) 6752262_PF00463_PF02548 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00463 is 6709607 with Jaccard = 0.9693 |PF00463|=255 [ 253 6 1099950 2 ]
parent [ 6709607 ] : 6752262 0.0167139 (=1768/(258*410)) 98.7864
given [ 6709607 ] : 6709607 0.0622222 (=126/(405*5)) 93.8751
best keyword for cluster 6709607 is PF00463 with Jaccard = 0.9693 [ 253 6 1099950 2 ] 0.9768 0.9922
sibling [ 6709607 ] : 6630499 0.303502 (=78/(1*257)) 74.8747
best keyword for cluster 6630499 is PF02548 with Jaccard = 1.0000 [ 231 0 1099980 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00463 ( PF00463 Isocitrate lyase family )
B> PF02548 ( PF02548 Ketopantoate hydroxymethyltransferase )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF00463 and PF02548 have PDB structures
PF00463 c.1.12.7
SUPERFAM mapping significantly overlapping:
1 PF00463 SSF51621 0.657 (average over 906 mutual instances, PF00463 913 appearances, SSF51621 12495 appearances)
2 PF02548 SSF51621 0.983 (average over 760 mutual instances, PF02548 769 appearances, SSF51621 12495 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 382 ) 6712077_PF05995_PF07847 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05995 is 6597067 with Jaccard = 0.9692 |PF05995|=64 [ 63 1 1100146 1 ]
parent [ 6597067 ] : 6712077 0.081474 (=241/(34*87)) 94.2655
given [ 6597067 ] : 6597067 0.448443 (=835/(38*49)) 60.285
best keyword for cluster 6597067 is PF05995 with Jaccard = 0.9692 [ 63 1 1100146 1 ] 0.9844 0.9844
sibling [ 6597067 ] : 6515298 0.875 (=56/(2*32)) 17.918
best keyword for cluster 6515298 is PF07847 with Jaccard = 0.9412 [ 32 0 1100177 2 ] 1.0000 0.9412
SUGGESTING RELATEDNESS OF:
A> PF05995 ( PF05995 Cysteine dioxygenase type I )
B> PF07847 ( PF07847 Protein of unknown function (DUF1637) )
Only A has a clan ( CL0029.13 ).
the two keywords do not coincide on UniRef90 proteins
only PF05995 has a PDB structure (may not be up to date)
PF05995 b.82.1.19
SUPERFAM mapping significantly overlapping:
1 PF05995 SSF51182 0.731 (average over 135 mutual instances, PF05995 135 appearances, SSF51182 14255 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 383 ) 6744929_PF00799_PF01492 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01492 is 6683146 with Jaccard = 0.9691 |PF01492|=193 [ 188 1 1100017 5 ]
parent [ 6683146 ] : 6744929 0.0190774 (=1024/(189*284)) 98.2288
given [ 6683146 ] : 6683146 0.143646 (=208/(8*181)) 88.977
best keyword for cluster 6683146 is PF01492 with Jaccard = 0.9691 [ 188 1 1100017 5 ] 0.9947 0.9741
sibling [ 6683146 ] : 6737996 0.0397112 (=77/(7*277)) 97.5867
best keyword for cluster 6737996 is PF00799 with Jaccard = 0.9538 [ 227 10 1099973 1 ] 0.9578 0.9956
SUGGESTING RELATEDNESS OF:
A> PF01492 ( PF01492 Geminivirus C4 protein )
B> PF00799 ( PF00799 Geminivirus Rep catalytic domain )
Only B has a clan ( CL0169.6 ).
the two keywords coincide on UniRef90 proteins: |PF00799| = 228 , |PF01492| = 193 , |PF00799^PF01492| = 3 ( 1.3% and 1.6% )
only PF01492 has a PDB structure (may not be up to date)
PF00799 d.89.1.4
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 384 ) 6750118_PF00505_PF04769 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04769 is 6624153 with Jaccard = 0.9683 |PF04769|=63 [ 61 0 1100148 2 ]
parent [ 6624153 ] : 6750118 0.0170587 (=1378/(70*1154)) 98.626
given [ 6624153 ] : 6624153 0.289855 (=20/(1*69)) 72.0942
best keyword for cluster 6624153 is PF04769 with Jaccard = 0.9683 [ 61 0 1100148 2 ] 1.0000 0.9683
sibling [ 6624153 ] : 6746762 0.0225881 (=284/(11*1143)) 98.3707
best keyword for cluster 6746762 is PF00505 with Jaccard = 0.8042 [ 805 137 1099210 59 ] 0.8546 0.9317
SUGGESTING RELATEDNESS OF:
A> PF04769 ( PF04769 Mating-type protein MAT alpha 1 )
B> PF00505 ( PF00505 HMG (high mobility group) box )
Only B has a clan ( CL0114.6 ).
the two keywords coincide on UniRef90 proteins: |PF00505| = 864 , |PF04769| = 63 , |PF00505^PF04769| = 3 ( 0.3% and 4.8% )
only PF04769 has a PDB structure (may not be up to date)
PF00505 a.21.1.1
SUPERFAM mapping significantly overlapping:
1 PF00505 SSF47095 0.800 (average over 2604 mutual instances, PF00505 2716 appearances, SSF47095 3113 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 385 ) 6747159_PF01566_PF05525 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05525 is 6700316 with Jaccard = 0.9683 |PF05525|=126 [ 122 0 1100085 4 ]
parent [ 6700316 ] : 6747159 0.0203124 (=1455/(189*379)) 98.4008
given [ 6700316 ] : 6700316 0.0931824 (=708/(131*58)) 92.2349
best keyword for cluster 6700316 is PF05525 with Jaccard = 0.9683 [ 122 0 1100085 4 ] 1.0000 0.9683
sibling [ 6700316 ] : 6729842 0.0380184 (=99/(372*7)) 96.708
best keyword for cluster 6729842 is PF01566 with Jaccard = 0.9893 [ 278 3 1099930 0 ] 0.9893 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05525 ( PF05525 Branched-chain amino acid transport protein )
B> PF01566 ( PF01566 Natural resistance-associated macrophage protein )
Only A has a clan ( CL0062.8 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF05525 nor PF01566 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 386 ) 6754973_PF03081_PF03106 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03106 is 6715920 with Jaccard = 0.9677 |PF03106|=278 [ 270 1 1099932 8 ]
parent [ 6715920 ] : 6754973 0.0115311 (=532/(316*146)) 98.9681
given [ 6715920 ] : 6715920 0.0513911 (=290/(19*297)) 94.8673
best keyword for cluster 6715920 is PF03106 with Jaccard = 0.9677 [ 270 1 1099932 8 ] 0.9963 0.9712
sibling [ 6715920 ] : 6746456 0.0215827 (=21/(139*7)) 98.3464
best keyword for cluster 6746456 is PF03081 with Jaccard = 0.9802 [ 99 1 1100110 1 ] 0.9900 0.9900
SUGGESTING RELATEDNESS OF:
A> PF03106 ( PF03106 WRKY DNA -binding domain )
B> PF03081 ( PF03081 Exo70 exocyst complex subunit )
Only A has a clan ( CL0274.2 ).
the two keywords do not coincide on UniRef90 proteins
both PF03106 and PF03081 have PDB structures
PF03081 a.118.17.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 387 ) 6632019_PF02991_PF04110 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04110 is 6608773 with Jaccard = 0.9677 |PF04110|=31 [ 30 0 1100180 1 ]
parent [ 6608773 ] : 6632019 0.287175 (=730/(31*82)) 75.1426
given [ 6608773 ] : 6608773 0.4 (=12/(1*30)) 66.4
best keyword for cluster 6608773 is PF04110 with Jaccard = 0.9677 [ 30 0 1100180 1 ] 1.0000 0.9677
sibling [ 6608773 ] : 6611291 0.35 (=56/(2*80)) 67.1418
best keyword for cluster 6611291 is PF02991 with Jaccard = 0.9740 [ 75 0 1100134 2 ] 1.0000 0.9740
SUGGESTING RELATEDNESS OF:
A> PF04110 ( PF04110 Ubiquitin-like autophagy protein Apg12 )
B> PF02991 ( PF02991 Microtubule associated protein 1A/1B, light chain 3 )
they come from the same clan: CL0072.14 : PF09138 PF03671 PF03658 PF00789 PF00240 PF02597 PF02824 PF02196 PF00788 PF00794 PF00564 PF02991 PF09379 PF08783 PF06071 PF07023 PF02017 PF04110 PF08817
the two keywords do not coincide on UniRef90 proteins
both PF04110 and PF02991 have PDB structures
PF04110 d.15.1.7
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 388 ) 6675846_PF04740_PF06860 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04740 is 6640075 with Jaccard = 0.9677 |PF04740|=31 [ 30 0 1100180 1 ]
parent [ 6640075 ] : 6675846 0.147147 (=196/(36*37)) 87.1267
given [ 6640075 ] : 6640075 0.231183 (=43/(6*31)) 77.002
best keyword for cluster 6640075 is PF04740 with Jaccard = 0.9677 [ 30 0 1100180 1 ] 1.0000 0.9677
sibling [ 6640075 ] : 6592383 0.428125 (=137/(20*16)) 58.2245
best keyword for cluster 6592383 is PF06860 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04740 ( PF04740 Bacillus transposase protein )
B> PF06860 ( PF06860 Protein of unknown function (DUF1252) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04740 nor PF06860 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 389 ) 6761119_PF06610_PF07895 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07895 is 6703274 with Jaccard = 0.9677 |PF07895|=31 [ 30 0 1100180 1 ]
parent [ 6703274 ] : 6761119 0.00735294 (=5/(34*20)) 99.3243
given [ 6703274 ] : 6703274 0.107143 (=18/(28*6)) 92.7548
best keyword for cluster 6703274 is PF07895 with Jaccard = 0.9677 [ 30 0 1100180 1 ] 1.0000 0.9677
sibling [ 6703274 ] : 6743335 0.0238095 (=2/(6*14)) 98.0833
best keyword for cluster 6743335 is PF06610 with Jaccard = 1.0000 [ 12 0 1100199 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF07895 ( PF07895 Protein of unknown function (DUF1673) )
B> PF06610 ( PF06610 Protein of unknown function (DUF1144) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF07895 nor PF06610 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 390 ) 6758191_PF02269_PF04719 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02269 is 6732009 with Jaccard = 0.9672 |PF02269|=60 [ 59 1 1100150 1 ]
parent [ 6732009 ] : 6758191 0.0151796 (=41/(37*73)) 99.1644
given [ 6732009 ] : 6732009 0.0348259 (=14/(6*67)) 96.9421
best keyword for cluster 6732009 is PF02269 with Jaccard = 0.9672 [ 59 1 1100150 1 ] 0.9833 0.9833
sibling [ 6732009 ] : 6725481 0.0555556 (=2/(1*36)) 96.1667
best keyword for cluster 6725481 is PF04719 with Jaccard = 0.9706 [ 33 0 1100177 1 ] 1.0000 0.9706
SUGGESTING RELATEDNESS OF:
A> PF02269 ( PF02269 Transcription initiation factor IID, 18kD subunit )
B> PF04719 ( PF04719 hTAFII28-like protein conserved region )
they come from the same clan: CL0012.11 : PF02969 PF00125 PF00808 PF07524 PF04719 PF02269 PF02291 PF03847
the two keywords do not coincide on UniRef90 proteins
both PF02269 and PF04719 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF04719 SSF47113 0.834 (average over 77 mutual instances, PF04719 77 appearances, SSF47113 7440 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 391 ) 6707884_PF03962_PF07106 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF07106 is 6597200 with Jaccard = 0.9667 |PF07106|=30 [ 29 0 1100181 1 ]
parent [ 6597200 ] : 6707884 0.0764007 (=90/(31*38)) 93.6075
given [ 6597200 ] : 6597200 0.4 (=52/(26*5)) 60.4103
best keyword for cluster 6597200 is PF07106 with Jaccard = 0.9667 [ 29 0 1100181 1 ] 1.0000 0.9667
sibling [ 6597200 ] : 6641149 0.228571 (=24/(3*35)) 77.347
best keyword for cluster 6641149 is PF03962 with Jaccard = 1.0000 [ 28 0 1100183 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF07106 ( PF07106 Tat binding protein 1(TBP-1)-interacting protein (TBPIP) )
B> PF03962 ( PF03962 Mnd1 family )
Only A has a clan ( CL0123.12 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF07106 nor PF03962 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 392 ) 6706286_PF07587_PF07627 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07627 is 6382483 with Jaccard = 0.9667 |PF07627|=30 [ 29 0 1100181 1 ]
parent [ 6382483 ] : 6706286 0.0829538 (=255/(29*106)) 93.3371
given [ 6382483 ] : 6382483 1 (=28/(1*28)) 0.000698447
best keyword for cluster 6382483 is PF07627 with Jaccard = 0.9667 [ 29 0 1100181 1 ] 1.0000 0.9667
sibling [ 6382483 ] : 6690681 0.127273 (=133/(11*95)) 90.4088
best keyword for cluster 6690681 is PF07587 with Jaccard = 0.7108 [ 59 24 1100128 0 ] 0.7108 1.0000
SUGGESTING RELATEDNESS OF:
A> PF07627 ( PF07627 Protein of unknown function (DUF1588) )
B> PF07587 ( PF07587 Protein of unknown function (DUF1553) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF07627 nor PF07587 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 393 ) 6747048_PF06725_PF06737 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06725 is 6735474 with Jaccard = 0.9663 |PF06725|=175 [ 172 3 1100033 3 ]
parent [ 6735474 ] : 6747048 0.0165497 (=566/(225*152)) 98.3939
given [ 6735474 ] : 6735474 0.0279811 (=130/(202*23)) 97.3184
best keyword for cluster 6735474 is PF06725 with Jaccard = 0.9663 [ 172 3 1100033 3 ] 0.9829 0.9829
sibling [ 6735474 ] : 6741857 0.0295921 (=111/(31*121)) 97.9582
best keyword for cluster 6741857 is PF06737 with Jaccard = 0.6190 [ 78 33 1100085 15 ] 0.7027 0.8387
SUGGESTING RELATEDNESS OF:
A> PF06725 ( PF06725 3D domain )
B> PF06737 ( PF06737 Transglycosylase-like domain )
A and B come from different clans ( CL0199.7 , CL0037.9 ).
the two keywords do not coincide on UniRef90 proteins
both PF06725 and PF06737 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 394 ) 6745596_PF05163_PF07609 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05163 is 6722677 with Jaccard = 0.9659 |PF05163|=88 [ 85 0 1100123 3 ]
parent [ 6722677 ] : 6745596 0.0254689 (=239/(136*69)) 98.2807
given [ 6722677 ] : 6722677 0.0620783 (=138/(117*19)) 95.8099
best keyword for cluster 6722677 is PF05163 with Jaccard = 0.9659 [ 85 0 1100123 3 ] 1.0000 0.9659
sibling [ 6722677 ] : 6735499 0.0336842 (=32/(19*50)) 97.3218
best keyword for cluster 6735499 is PF07609 with Jaccard = 0.8571 [ 12 2 1100197 0 ] 0.8571 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05163 ( PF05163 DinB family )
B> PF07609 ( PF07609 Protein of unknown function (DUF1572) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05163 nor PF07609 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 395 ) 6769854_PF01193_PF03971 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01193 is 6757657 with Jaccard = 0.9655 |PF01193|=477 [ 476 16 1099718 1 ]
parent [ 6757657 ] : 6769854 0.00438292 (=455/(633*164)) 99.7061
given [ 6757657 ] : 6757657 0.0136474 (=975/(486*147)) 99.1322
best keyword for cluster 6757657 is PF01193 with Jaccard = 0.9655 [ 476 16 1099718 1 ] 0.9675 0.9979
sibling [ 6757657 ] : 6767454 0.00513652 (=19/(27*137)) 99.6161
best keyword for cluster 6767454 is PF03971 with Jaccard = 0.9737 [ 74 2 1100135 0 ] 0.9737 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01193 ( PF01193 RNA polymerase Rpb3/Rpb11 dimerisation domain )
B> PF03971 ( PF03971 Monomeric isocitrate dehydrogenase )
Only B has a clan ( CL0270.2 ).
the two keywords do not coincide on UniRef90 proteins
both PF01193 and PF03971 have PDB structures
PF03971 c.77.1.1 c.77.1.2
SUPERFAM mapping significantly overlapping:
1 PF01193 SSF55257 0.889 (average over 2125 mutual instances, PF01193 5241 appearances, SSF55257 5319 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 396 ) 6764061_PF00723_PF02446 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02446 is 6555483 with Jaccard = 0.9655 |PF02446|=174 [ 168 0 1100037 6 ]
parent [ 6555483 ] : 6764061 0.00593233 (=317/(183*292)) 99.4687
given [ 6555483 ] : 6555483 0.588398 (=213/(2*181)) 43.0567
best keyword for cluster 6555483 is PF02446 with Jaccard = 0.9655 [ 168 0 1100037 6 ] 1.0000 0.9655
sibling [ 6555483 ] : 6754817 0.0111959 (=88/(262*30)) 98.9578
best keyword for cluster 6754817 is PF00723 with Jaccard = 0.8739 [ 201 27 1099981 2 ] 0.8816 0.9901
SUGGESTING RELATEDNESS OF:
A> PF02446 ( PF02446 4-alpha-glucanotransferase )
B> PF00723 ( PF00723 Glycosyl hydrolases family 15 )
A and B come from different clans ( CL0058.10 , CL0059.10 ).
the two keywords do not coincide on UniRef90 proteins
both PF02446 and PF00723 have PDB structures
PF02446 c.1.8.1
PF00723 a.102.1.1 a.102.1.5
SUPERFAM mapping significantly overlapping:
1 PF00723 SSF48208 0.911 (average over 498 mutual instances, PF00723 605 appearances, SSF48208 6032 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 397 ) 6763259_PF00589_PF07512 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00589 is 6760844 with Jaccard = 0.9653 |PF00589|=2815 [ 2757 41 1097355 58 ]
parent [ 6760844 ] : 6763259 0.00656587 (=1381/(3895*54)) 99.4296
given [ 6760844 ] : 6760844 0.0096304 (=1961/(3842*53)) 99.3098
best keyword for cluster 6760844 is PF00589 with Jaccard = 0.9653 [ 2757 41 1097355 58 ] 0.9853 0.9794
sibling [ 6760844 ] : 6761400 0.0188679 (=1/(1*53)) 99.3396
best keyword for cluster 6761400 is PF07512 with Jaccard = 0.9231 [ 36 1 1100172 2 ] 0.9730 0.9474
SUGGESTING RELATEDNESS OF:
A> PF00589 ( PF00589 Phage integrase family )
B> PF07512 ( PF07512 Protein of unknown function (DUF1526) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF00589 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF00589 SSF56349 0.828 (average over 8330 mutual instances, PF00589 11867 appearances, SSF56349 10914 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 398 ) 6713233_PF06850_PF07167 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06850 is 6481929 with Jaccard = 0.9649 |PF06850|=57 [ 55 0 1100154 2 ]
parent [ 6481929 ] : 6713233 0.0623937 (=1174/(64*294)) 94.439
given [ 6481929 ] : 6481929 0.952381 (=60/(1*63)) 6.12756
best keyword for cluster 6481929 is PF06850 with Jaccard = 0.9649 [ 55 0 1100154 2 ] 1.0000 0.9649
sibling [ 6481929 ] : 6699063 0.0950722 (=191/(7*287)) 92.0099
best keyword for cluster 6699063 is PF07167 with Jaccard = 0.8445 [ 201 37 1099973 0 ] 0.8445 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06850 ( PF06850 PHB de-polymerase C-terminus )
B> PF07167 ( PF07167 Poly-beta-hydroxybutyrate polymerase (PhaC) N-terminus )
Only A has a clan ( CL0028.14 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF06850 nor PF07167 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 399 ) 6646401_PF04808_PF05515 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05515 is 6600116 with Jaccard = 0.9643 |PF05515|=28 [ 27 0 1100183 1 ]
parent [ 6600116 ] : 6646401 0.276498 (=60/(31*7)) 78.9058
given [ 6600116 ] : 6600116 0.453704 (=49/(4*27)) 61.6409
best keyword for cluster 6600116 is PF05515 with Jaccard = 0.9643 [ 27 0 1100183 1 ] 1.0000 0.9643
sibling [ 6600116 ] : 6603207 0.5 (=5/(5*2)) 63.2
best keyword for cluster 6603207 is PF04808 with Jaccard = 1.0000 [ 4 0 1100207 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05515 ( PF05515 Viral nucleic acid binding )
B> PF04808 ( PF04808 Citrus tristeza virus (CTV) P23 protein )
they come from the same clan: CL0140.6 : PF01623 PF04808 PF05515
the two keywords do not coincide on UniRef90 proteins
Neither PF05515 nor PF04808 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 400 ) 6675740_PF05171_PF06228 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06228 is 6246500 with Jaccard = 0.9643 |PF06228|=28 [ 27 0 1100183 1 ]
parent [ 6246500 ] : 6675740 0.162162 (=162/(27*37)) 87.0664
given [ 6246500 ] : 6246500 1 (=180/(15*12)) 1.3411e-13
best keyword for cluster 6246500 is PF06228 with Jaccard = 0.9643 [ 27 0 1100183 1 ] 1.0000 0.9643
sibling [ 6246500 ] : 6650456 0.222222 (=8/(1*36)) 80.2005
best keyword for cluster 6650456 is PF05171 with Jaccard = 1.0000 [ 29 0 1100182 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06228 ( PF06228 Protein of unknown function (DUF1008) )
B> PF05171 ( PF05171 Haemin-degrading family )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF06228 has a PDB structure (may not be up to date)
PF05171 e.62.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 401 ) 6725827_PF04357_PF05170 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04357 is 6700171 with Jaccard = 0.9640 |PF04357|=137 [ 134 2 1100072 3 ]
parent [ 6700171 ] : 6725827 0.0495423 (=4005/(215*376)) 96.2123
given [ 6700171 ] : 6700171 0.0930864 (=789/(52*163)) 92.202
best keyword for cluster 6700171 is PF04357 with Jaccard = 0.9640 [ 134 2 1100072 3 ] 0.9853 0.9781
sibling [ 6700171 ] : 6714825 0.0662837 (=2054/(122*254)) 94.6996
best keyword for cluster 6714825 is PF05170 with Jaccard = 0.6635 [ 140 49 1100000 22 ] 0.7407 0.8642
SUGGESTING RELATEDNESS OF:
A> PF04357 ( PF04357 Family of unknown function (DUF490) )
B> PF05170 ( PF05170 AsmA family )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF04357| = 137 , |PF05170| = 162 , |PF04357^PF05170| = 5 ( 3.6% and 3.1% )
Neither PF04357 nor PF05170 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 402 ) 6764413_PF00994_PF01507 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01507 is 6703055 with Jaccard = 0.9637 |PF01507|=429 [ 425 12 1099770 4 ]
parent [ 6703055 ] : 6764413 0.00563274 (=3609/(494*1297)) 99.4849
given [ 6703055 ] : 6703055 0.0901515 (=1785/(44*450)) 92.7302
best keyword for cluster 6703055 is PF01507 with Jaccard = 0.9637 [ 425 12 1099770 4 ] 0.9725 0.9907
sibling [ 6703055 ] : 6737864 0.0259578 (=6239/(224*1073)) 97.5717
best keyword for cluster 6737864 is PF00994 with Jaccard = 0.6889 [ 815 356 1099028 12 ] 0.6960 0.9855
SUGGESTING RELATEDNESS OF:
A> PF01507 ( PF01507 Phosphoadenosine phosphosulfate reductase family )
B> PF00994 ( PF00994 Probable molybdopterin binding domain )
Only A has a clan ( CL0039.7 ).
the two keywords coincide on UniRef90 proteins: |PF00994| = 827 , |PF01507| = 429 , |PF00994^PF01507| = 9 ( 1.1% and 2.1% )
both PF01507 and PF00994 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF00994 SSF53218 0.869 (average over 2574 mutual instances, PF00994 4737 appearances, SSF53218 5120 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 403 ) 6712763_PF02581_PF08543 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02581 is 6681958 with Jaccard = 0.9637 |PF02581|=386 [ 372 0 1099825 14 ]
parent [ 6681958 ] : 6712763 0.0572249 (=10345/(442*409)) 94.3662
given [ 6681958 ] : 6681958 0.130247 (=211/(4*405)) 88.6761
best keyword for cluster 6681958 is PF02581 with Jaccard = 0.9637 [ 372 0 1099825 14 ] 1.0000 0.9637
sibling [ 6681958 ] : 6619897 0.3 (=264/(2*440)) 70.4714
best keyword for cluster 6619897 is PF08543 with Jaccard = 0.7875 [ 315 67 1099811 18 ] 0.8246 0.9459
SUGGESTING RELATEDNESS OF:
A> PF02581 ( PF02581 Thiamine monophosphate synthase/TENI )
B> PF08543 ( PF08543 Phosphomethylpyrimidine kinase )
A and B come from different clans ( CL0036.17 , CL0118.7 ).
the two keywords coincide on UniRef90 proteins: |PF02581| = 386 , |PF08543| = 333 , |PF02581^PF08543| = 22 ( 5.7% and 6.6% )
both PF02581 and PF08543 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF02581 SSF51391 0.858 (average over 1103 mutual instances, PF02581 1176 appearances, SSF51391 1335 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 404 ) 6692708_PF01180_PF01207 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01180 is 6682050 with Jaccard = 0.9631 |PF01180|=404 [ 392 3 1099804 12 ]
parent [ 6682050 ] : 6692708 0.10585 (=27875/(604*436)) 90.8217
given [ 6682050 ] : 6682050 0.116319 (=201/(4*432)) 88.7083
best keyword for cluster 6682050 is PF01180 with Jaccard = 0.9631 [ 392 3 1099804 12 ] 0.9924 0.9703
sibling [ 6682050 ] : 6651219 0.210372 (=2219/(18*586)) 80.4374
best keyword for cluster 6651219 is PF01207 with Jaccard = 0.9873 [ 545 3 1099659 4 ] 0.9945 0.9927
SUGGESTING RELATEDNESS OF:
A> PF01180 ( PF01180 Dihydroorotate dehydrogenase )
B> PF01207 ( PF01207 Dihydrouridine synthase (Dus) )
they come from the same clan: CL0036.17 : PF05690 PF01680 PF00834 PF01729 PF00697 PF03740 PF01884 PF00724 PF00215 PF03060 PF04095 PF04131 PF00478 PF00218 PF00977 PF01645 PF04309 PF01070 PF01207 PF04481 PF04476 PF01180 PF00701 PF01791 PF03932 PF03437 PF01081 PF00121 PF09370 PF02581 PF00290
the two keywords do not coincide on UniRef90 proteins
both PF01180 and PF01207 have PDB structures
PF01180 c.1.4.1
PF01207 c.1.4.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 405 ) 6769456_PF01263_PF06799 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01263 is 6762597 with Jaccard = 0.9631 |PF01263|=406 [ 392 1 1099804 14 ]
parent [ 6762597 ] : 6769456 0.00375536 (=77/(44*466)) 99.692
given [ 6762597 ] : 6762597 0.0062004 (=50/(18*448)) 99.3995
best keyword for cluster 6762597 is PF01263 with Jaccard = 0.9631 [ 392 1 1099804 14 ] 0.9975 0.9655
sibling [ 6762597 ] : 6756687 0.0130719 (=6/(17*27)) 99.0719
best keyword for cluster 6756687 is PF06799 with Jaccard = 0.9583 [ 23 1 1100187 0 ] 0.9583 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01263 ( PF01263 Aldose 1-epimerase )
B> PF06799 ( PF06799 Protein of unknown function (DUF1230) )
Only A has a clan ( CL0103.7 ).
the two keywords do not coincide on UniRef90 proteins
only PF01263 has a PDB structure (may not be up to date)
PF01263 b.30.5.4 b.30.5.7
SUPERFAM mapping significantly overlapping:
1 PF01263 SSF74650 0.881 (average over 1335 mutual instances, PF01263 1358 appearances, SSF74650 5571 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 406 ) 6661132_PF01863_PF08325 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01863 is 6637941 with Jaccard = 0.9630 |PF01863|=240 [ 234 3 1099968 6 ]
parent [ 6637941 ] : 6661132 0.226735 (=2417/(41*260)) 83.5231
given [ 6637941 ] : 6637941 0.28418 (=291/(4*256)) 76.474
best keyword for cluster 6637941 is PF01863 with Jaccard = 0.9630 [ 234 3 1099968 6 ] 0.9873 0.9750
sibling [ 6637941 ] : 6559434 0.551282 (=43/(2*39)) 46.2715
best keyword for cluster 6559434 is PF08325 with Jaccard = 0.8605 [ 37 4 1100168 2 ] 0.9024 0.9487
SUGGESTING RELATEDNESS OF:
A> PF01863 ( PF01863 Protein of unknown function DUF45 )
B> PF08325 ( PF08325 WLM domain )
they come from the same clan: CL0126.12 : PF08325 PF01421 PF01752 PF01457 PF02031 PF09471 PF05299 PF05547 PF05572 PF01434 PF01447 PF02128 PF02102 PF02074 PF01432 PF01742 PF01401 PF01431 PF05548 PF00413 PF01433 PF01863 PF07998 PF01400
the two keywords do not coincide on UniRef90 proteins
Neither PF01863 nor PF08325 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 407 ) 6759296_PF00164_PF01176 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01176 is 6705413 with Jaccard = 0.9628 |PF01176|=260 [ 259 9 1099942 1 ]
parent [ 6705413 ] : 6759296 0.0125169 (=1186/(288*329)) 99.227
given [ 6705413 ] : 6705413 0.0899031 (=1698/(187*101)) 93.1773
best keyword for cluster 6705413 is PF01176 with Jaccard = 0.9628 [ 259 9 1099942 1 ] 0.9664 0.9962
sibling [ 6705413 ] : 6754276 0.0165533 (=73/(14*315)) 98.9237
best keyword for cluster 6754276 is PF00164 with Jaccard = 0.9929 [ 278 1 1099931 1 ] 0.9964 0.9964
SUGGESTING RELATEDNESS OF:
A> PF01176 ( PF01176 Translation initiation factor 1A / IF-1 )
B> PF00164 ( PF00164 Ribosomal protein S12 )
they come from the same clan: CL0021.12 : PF08402 PF03459 PF02765 PF00436 PF00575 PF01330 PF03870 PF00366 PF00164 PF00181 PF07497 PF04057 PF02303 PF08206 PF03919 PF01287 PF01176 PF01132 PF04076 PF03120 PF00313 PF01336 PF01588
the two keywords do not coincide on UniRef90 proteins
both PF01176 and PF00164 have PDB structures
PF01176 b.40.4.5
SUPERFAM mapping significantly overlapping:
1 PF01176 SSF50249 0.835 (average over 1235 mutual instances, PF01176 1253 appearances, SSF50249 52669 appearances)
2 PF00164 SSF50249 0.965 (average over 1574 mutual instances, PF00164 1579 appearances, SSF50249 52669 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 408 ) 6715860_PF00696_PF00742 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00696 is 6715372 with Jaccard = 0.9625 |PF00696|=1324 [ 1282 8 1098879 42 ]
parent [ 6715372 ] : 6715860 0.0540314 (=24147/(311*1437)) 94.8587
given [ 6715372 ] : 6715372 0.0668703 (=1708/(18*1419)) 94.7786
best keyword for cluster 6715372 is PF00696 with Jaccard = 0.9625 [ 1282 8 1098879 42 ] 0.9938 0.9683
sibling [ 6715372 ] : 6653251 0.251613 (=78/(1*310)) 81.0781
best keyword for cluster 6653251 is PF00742 with Jaccard = 0.7600 [ 247 26 1099886 52 ] 0.9048 0.8261
SUGGESTING RELATEDNESS OF:
A> PF00696 ( PF00696 Amino acid kinase family )
B> PF00742 ( PF00742 Homoserine dehydrogenase )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00696| = 1324 , |PF00742| = 299 , |PF00696^PF00742| = 66 ( 5.0% and 22.1% )
both PF00696 and PF00742 have PDB structures
PF00696 c.73.1.1 c.73.1.2 c.73.1.3
PF00742 d.81.1.2
SUPERFAM mapping significantly overlapping:
1 PF00696 SSF53633 0.922 (average over 4687 mutual instances, PF00696 5933 appearances, SSF53633 7277 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 409 ) 6694770_PF01490_PF03222 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01490 is 6687942 with Jaccard = 0.9623 |PF01490|=664 [ 639 0 1099547 25 ]
parent [ 6687942 ] : 6694770 0.108887 (=13129/(175*689)) 91.2519
given [ 6687942 ] : 6687942 0.112044 (=307/(4*685)) 89.8465
best keyword for cluster 6687942 is PF01490 with Jaccard = 0.9623 [ 639 0 1099547 25 ] 1.0000 0.9623
sibling [ 6687942 ] : 6677733 0.135659 (=70/(3*172)) 87.6466
best keyword for cluster 6677733 is PF03222 with Jaccard = 0.8455 [ 104 19 1100088 0 ] 0.8455 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01490 ( PF01490 Transmembrane amino acid transporter protein )
B> PF03222 ( PF03222 Tryptophan/tyrosine permease family )
they come from the same clan: CL0062.8 : PF00860 PF03222 PF02133 PF00916 PF00474 PF03845 PF01235 PF00955 PF07331 PF02361 PF05525 PF03594 PF01490 PF00324
the two keywords do not coincide on UniRef90 proteins
Neither PF01490 nor PF03222 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 410 ) 6757909_PF04107_PF04169 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04107 is 6733210 with Jaccard = 0.9615 |PF04107|=155 [ 150 1 1100055 5 ]
parent [ 6733210 ] : 6757909 0.00892443 (=404/(203*223)) 99.1493
given [ 6733210 ] : 6733210 0.0402697 (=209/(30*173)) 97.0758
best keyword for cluster 6733210 is PF04107 with Jaccard = 0.9615 [ 150 1 1100055 5 ] 0.9934 0.9677
sibling [ 6733210 ] : 6754704 0.0135135 (=3/(1*222)) 98.9504
best keyword for cluster 6754704 is PF04169 with Jaccard = 0.6082 [ 104 67 1100040 0 ] 0.6082 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04107 ( PF04107 Glutamate-cysteine ligase family 2(GCS2) )
B> PF04169 ( PF04169 Domain of unknown function (DUF404) )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF04107| = 155 , |PF04169| = 104 , |PF04107^PF04169| = 3 ( 1.9% and 2.9% )
only PF04107 has a PDB structure (may not be up to date)
PF04107 d.128.1.3
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 411 ) 6758455_PF01032_PF02653 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02653 is 6748941 with Jaccard = 0.9612 |PF02653|=2086 [ 2006 1 1098124 80 ]
parent [ 6748941 ] : 6758455 0.0121088 (=36405/(1338*2247)) 99.1798
given [ 6748941 ] : 6748941 0.0165879 (=885/(24*2223)) 98.5394
best keyword for cluster 6748941 is PF02653 with Jaccard = 0.9612 [ 2006 1 1098124 80 ] 0.9995 0.9616
sibling [ 6748941 ] : 6705925 0.0807105 (=32731/(874*464)) 93.2869
best keyword for cluster 6705925 is PF01032 with Jaccard = 0.6647 [ 807 405 1098997 2 ] 0.6658 0.9975
SUGGESTING RELATEDNESS OF:
A> PF02653 ( PF02653 Branched-chain amino acid transport system / permease component )
B> PF01032 ( PF01032 FecCD transport family )
they come from the same clan: CL0142.6 : PF00950 PF05145 PF02653 PF01032 PF01098
the two keywords do not coincide on UniRef90 proteins
only PF02653 has a PDB structure (may not be up to date)
PF01032 f.22.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 412 ) 6761875_PF00127_PF02298 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02298 is 6724386 with Jaccard = 0.9610 |PF02298|=153 [ 148 1 1100057 5 ]
parent [ 6724386 ] : 6761875 0.00982134 (=785/(194*412)) 99.3639
given [ 6724386 ] : 6724386 0.0561275 (=229/(24*170)) 96.0357
best keyword for cluster 6724386 is PF02298 with Jaccard = 0.9610 [ 148 1 1100057 5 ] 0.9933 0.9673
sibling [ 6724386 ] : 6759281 0.00963948 (=50/(13*399)) 99.226
best keyword for cluster 6759281 is PF00127 with Jaccard = 0.8987 [ 213 11 1099974 13 ] 0.9509 0.9425
SUGGESTING RELATEDNESS OF:
A> PF02298 ( PF02298 Plastocyanin-like domain )
B> PF00127 ( PF00127 Copper binding proteins, plastocyanin/azurin family )
they come from the same clan: CL0026.14 : PF00394 PF00116 PF00127 PF07731 PF07732 PF02298 PF00812
the two keywords do not coincide on UniRef90 proteins
both PF02298 and PF00127 have PDB structures
PF02298 b.6.1.1
PF00127 b.6.1.1 i.4.1.1
SUPERFAM mapping significantly overlapping:
1 PF00127 SSF49503 0.766 (average over 579 mutual instances, PF00127 604 appearances, SSF49503 36729 appearances)
2 PF02298 SSF49503 0.778 (average over 379 mutual instances, PF02298 385 appearances, SSF49503 36729 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 413 ) 6738168_PF02645_PF02734 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02645 is 6615390 with Jaccard = 0.9608 |PF02645|=204 [ 196 0 1100007 8 ]
parent [ 6615390 ] : 6738168 0.0244044 (=1721/(215*328)) 97.6051
given [ 6615390 ] : 6615390 0.341195 (=217/(212*3)) 68.597
best keyword for cluster 6615390 is PF02645 with Jaccard = 0.9608 [ 196 0 1100007 8 ] 1.0000 0.9608
sibling [ 6615390 ] : 6690060 0.111692 (=2307/(243*85)) 90.2543
best keyword for cluster 6690060 is PF02734 with Jaccard = 0.7448 [ 216 73 1099921 1 ] 0.7474 0.9954
SUGGESTING RELATEDNESS OF:
A> PF02645 ( PF02645 Uncharacterised protein, DegV family COG1307 )
B> PF02734 ( PF02734 DAK2 domain )
Only A has a clan ( CL0245.3 ).
the two keywords coincide on UniRef90 proteins: |PF02645| = 204 , |PF02734| = 217 , |PF02645^PF02734| = 8 ( 3.9% and 3.7% )
both PF02645 and PF02734 have PDB structures
PF02645 c.119.1.1
PF02734 a.208.1.1
SUPERFAM mapping significantly overlapping:
1 PF02734 SSF101473 0.827 (average over 691 mutual instances, PF02734 692 appearances, SSF101473 907 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 414 ) 6649345_PF00085_PF06201 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06201 is 6573663 with Jaccard = 0.9605 |PF06201|=74 [ 73 2 1100135 1 ]
parent [ 6573663 ] : 6649345 0.212868 (=39086/(76*2416)) 79.8577
given [ 6573663 ] : 6573663 0.493243 (=73/(2*74)) 51.9025
best keyword for cluster 6573663 is PF06201 with Jaccard = 0.9605 [ 73 2 1100135 1 ] 0.9733 0.9865
sibling [ 6573663 ] : 6647387 0.233106 (=30270/(55*2361)) 79.1709
best keyword for cluster 6647387 is PF00085 with Jaccard = 0.6089 [ 1470 603 1097797 341 ] 0.7091 0.8117
SUGGESTING RELATEDNESS OF:
A> PF06201 ( PF06201 Domain of Unknown Function (DUF1000) )
B> PF00085 ( PF00085 Thioredoxin )
Only B has a clan ( CL0172.11 ).
the two keywords coincide on UniRef90 proteins: |PF00085| = 1811 , |PF06201| = 74 , |PF00085^PF06201| = 22 ( 1.2% and 29.7% )
both PF06201 and PF00085 have PDB structures
PF06201 b.18.1.26
SUPERFAM mapping significantly overlapping:
1 PF00085 SSF52833 0.811 (average over 4892 mutual instances, PF00085 5078 appearances, SSF52833 34965 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 415 ) 6732968_PF03561_PF04115 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04115 is 6711570 with Jaccard = 0.9600 |PF04115|=50 [ 48 0 1100161 2 ]
parent [ 6711570 ] : 6732968 0.0297719 (=124/(49*85)) 97.0479
given [ 6711570 ] : 6711570 0.0583501 (=58/(14*71)) 94.1754
best keyword for cluster 6711570 is PF04115 with Jaccard = 0.9600 [ 48 0 1100161 2 ] 1.0000 0.9600
sibling [ 6711570 ] : 6448104 1 (=94/(2*47)) 1.16911
best keyword for cluster 6448104 is PF03561 with Jaccard = 0.9200 [ 46 0 1100161 4 ] 1.0000 0.9200
SUGGESTING RELATEDNESS OF:
A> PF04115 ( PF04115 Ureidoglycolate hydrolase )
B> PF03561 ( PF03561 Allantoicase repeat )
Only B has a clan ( CL0202.5 ).
the two keywords coincide on UniRef90 proteins: |PF03561| = 50 , |PF04115| = 50 , |PF03561^PF04115| = 2 ( 4.0% and 4.0% )
both PF04115 and PF03561 have PDB structures
PF03561 b.18.1.22
SUPERFAM mapping significantly overlapping:
1 PF03561 SSF49785 0.887 (average over 165 mutual instances, PF03561 166 appearances, SSF49785 13919 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 416 ) 6630082_PF04991_PF06828 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04991 is 6608805 with Jaccard = 0.9592 |PF04991|=48 [ 47 1 1100162 1 ]
parent [ 6608805 ] : 6630082 0.29826 (=360/(17*71)) 74.6295
given [ 6608805 ] : 6608805 0.346154 (=135/(6*65)) 66.4323
best keyword for cluster 6608805 is PF04991 with Jaccard = 0.9592 [ 47 1 1100162 1 ] 0.9792 0.9792
sibling [ 6608805 ] : 6589624 0.428571 (=30/(7*10)) 57.1429
best keyword for cluster 6589624 is PF06828 with Jaccard = 0.9231 [ 12 0 1100198 1 ] 1.0000 0.9231
SUGGESTING RELATEDNESS OF:
A> PF04991 ( PF04991 LICD Protein Family )
B> PF06828 ( PF06828 Fukutin-related )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04991 nor PF06828 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 417 ) 6769494_PF03611_PF05437 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05437 is 6761469 with Jaccard = 0.9592 |PF05437|=147 [ 141 0 1100064 6 ]
parent [ 6761469 ] : 6769494 0.00499182 (=61/(188*65)) 99.6936
given [ 6761469 ] : 6761469 0.00803571 (=27/(20*168)) 99.3431
best keyword for cluster 6761469 is PF05437 with Jaccard = 0.9592 [ 141 0 1100064 6 ] 1.0000 0.9592
sibling [ 6761469 ] : 6768874 0.015625 (=1/(1*64)) 99.6719
best keyword for cluster 6768874 is PF03611 with Jaccard = 1.0000 [ 34 0 1100177 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05437 ( PF05437 Branched-chain amino acid transport protein (AzlD) )
B> PF03611 ( PF03611 PTS system Galactitol-specific IIC component )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05437 nor PF03611 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 418 ) 6384589_PF06071_PF08438 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08438 is 6203727 with Jaccard = 0.9592 |PF08438|=47 [ 47 2 1100162 0 ]
parent [ 6203727 ] : 6384589 1 (=13328/(272*49)) 0.000979999
given [ 6203727 ] : 6203727 1 (=558/(31*18)) 5.78851e-17
best keyword for cluster 6203727 is PF08438 with Jaccard = 0.9592 [ 47 2 1100162 0 ] 0.9592 1.0000
sibling [ 6203727 ] : 6072285 1 (=2620/(10*262)) 7.63809e-28
best keyword for cluster 6072285 is PF06071 with Jaccard = 0.9288 [ 248 3 1099944 16 ] 0.9880 0.9394
SUGGESTING RELATEDNESS OF:
A> PF08438 ( PF08438 GTPase of unknown function C-terminal )
B> PF06071 ( PF06071 Protein of unknown function (DUF933) )
Only B has a clan ( CL0072.14 ).
the two keywords do not coincide on UniRef90 proteins
only PF08438 has a PDB structure (may not be up to date)
PF06071 d.15.10.2
SUPERFAM mapping significantly overlapping:
1 PF06071 SSF81271 0.987 (average over 918 mutual instances, PF06071 920 appearances, SSF81271 8501 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 419 ) 6608034_PF00456_PF02780 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00456 is 6598008 with Jaccard = 0.9585 |PF00456|=515 [ 508 15 1099681 7 ]
parent [ 6598008 ] : 6608034 0.355364 (=179106/(809*623)) 65.9252
given [ 6598008 ] : 6598008 0.435691 (=271/(1*622)) 60.5762
best keyword for cluster 6598008 is PF00456 with Jaccard = 0.9585 [ 508 15 1099681 7 ] 0.9713 0.9864
sibling [ 6598008 ] : 6527603 0.762687 (=3066/(5*804)) 24.8444
best keyword for cluster 6527603 is PF02780 with Jaccard = 0.6511 [ 724 20 1099099 368 ] 0.9731 0.6630
SUGGESTING RELATEDNESS OF:
A> PF00456 ( PF00456 Transketolase, thiamine diphosphate binding domain )
B> PF02780 ( PF02780 Transketolase, C-terminal domain )
Only A has a clan ( CL0254.3 ).
the two keywords coincide on UniRef90 proteins: |PF00456| = 515 , |PF02780| = 1092 , |PF00456^PF02780| = 312 ( 60.6% and 28.6% )
both PF00456 and PF02780 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF02780 SSF52922 0.877 (average over 3553 mutual instances, PF02780 3663 appearances, SSF52922 11092 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 420 ) 6524758_PF01561_PF07948 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01561 is 6466261 with Jaccard = 0.9583 |PF01561|=24 [ 23 0 1100187 1 ]
parent [ 6466261 ] : 6524758 0.796875 (=153/(8*24)) 22.9738
given [ 6466261 ] : 6466261 0.968254 (=61/(21*3)) 3.17461
best keyword for cluster 6466261 is PF01561 with Jaccard = 0.9583 [ 23 0 1100187 1 ] 1.0000 0.9583
sibling [ 6466261 ] : 6417709 1 (=7/(1*7)) 0.08
best keyword for cluster 6417709 is PF07948 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01561 ( PF01561 Hantavirus glycoprotein G2 )
B> PF07948 ( PF07948 Nairovirus M polyprotein-like )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01561 nor PF07948 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 421 ) 6766089_PF01042_PF04013 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04013 is 6705175 with Jaccard = 0.9583 |PF04013|=23 [ 23 1 1100187 0 ]
parent [ 6705175 ] : 6766089 0.00520186 (=159/(31*986)) 99.559
given [ 6705175 ] : 6705175 0.0692308 (=9/(5*26)) 93.1417
best keyword for cluster 6705175 is PF04013 with Jaccard = 0.9583 [ 23 1 1100187 0 ] 0.9583 1.0000
sibling [ 6705175 ] : 6744650 0.0224042 (=197/(9*977)) 98.2046
best keyword for cluster 6744650 is PF01042 with Jaccard = 0.8986 [ 780 77 1099343 11 ] 0.9102 0.9861
SUGGESTING RELATEDNESS OF:
A> PF04013 ( PF04013 Protein of unknown function (DUF358) )
B> PF01042 ( PF01042 Endoribonuclease L-PSP )
Only A has a clan ( CL0098.7 ).
the two keywords do not coincide on UniRef90 proteins
only PF04013 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF01042 SSF55298 0.919 (average over 2576 mutual instances, PF01042 2591 appearances, SSF55298 2787 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 422 ) 6573169_PF06800_PF07857 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07857 is 6364109 with Jaccard = 0.9583 |PF07857|=24 [ 23 0 1100187 1 ]
parent [ 6364109 ] : 6573169 0.551087 (=507/(23*40)) 51.7544
given [ 6364109 ] : 6364109 1 (=120/(8*15)) 4.417e-05
best keyword for cluster 6364109 is PF07857 with Jaccard = 0.9583 [ 23 0 1100187 1 ] 1.0000 0.9583
sibling [ 6364109 ] : 6370256 1 (=39/(1*39)) 0.000102566
best keyword for cluster 6370256 is PF06800 with Jaccard = 0.9730 [ 36 0 1100174 1 ] 1.0000 0.9730
SUGGESTING RELATEDNESS OF:
A> PF07857 ( PF07857 CEO family (DUF1632) )
B> PF06800 ( PF06800 Sugar transport protein )
they come from the same clan: CL0184.5 : PF07857 PF04342 PF00892 PF05653 PF06027 PF00893 PF04142 PF06379 PF06800 PF03151 PF08449 PF02694
the two keywords do not coincide on UniRef90 proteins
Neither PF07857 nor PF06800 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF06800 SSF103473 0.646 (average over 1 mutual instances, PF06800 1 appearances, SSF103473 39293 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 423 ) 6767726_PF04521_PF04909 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6
threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04909 is 6736195 with Jaccard = 0.9578 |PF04909|=497 [ 477 1 1099713 20 ]
parent [ 6736195 ] : 6767726 0.00426991 (=66/(533*29)) 99.6273
given [ 6736195 ] : 6736195 0.0359814 (=2280/(354*179)) 97.3961
best keyword for cluster 6736195 is PF04909 with Jaccard = 0.9578 [ 477 1 1099713 20 ] 0.9979 0.9598
sibling [ 6736195 ] : 6744628 0.0315789 (=6/(10*19)) 98.2037
best keyword for cluster 6744628 is PF04521 with Jaccard = 0.9167 [ 11 1 1100199 0 ] 0.9167 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04909 ( PF04909 Amidohydrolase )
B> PF04521 ( PF04521 ssRNA positive strand viral 18kD cysteine rich protein )
Only A has a clan ( CL0034.9 ).
the two keywords do not coincide on UniRef90 proteins
only PF04909 has a PDB structure (may not be up to date)
PF04909 c.1.9.15
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 424 ) 6716601_PF03681_PF05534 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF03681 is 6714979 with Jaccard = 0.9576 |PF03681|=235 [ 226 1 1099975 9 ] parent [ 6714979 ] : 6716601 0.0586912 (=1070/(59*309)) 94.9928 given [ 6714979 ] : 6714979 0.0649065 (=118/(6*303)) 94.7177 best keyword for cluster 6714979 is PF03681 with Jaccard = 0.9576 [ 226 1 1099975 9 ] 0.9956 0.9617 sibling [ 6714979 ] : 6556240 0.588663 (=405/(43*16)) 43.8636 best keyword for cluster 6556240 is PF05534 with Jaccard = 0.8163 [ 40 0 1100162 9 ] 1.0000 0.8163 SUGGESTING RELATEDNESS OF: A> PF03681 ( PF03681 Uncharacterised protein family (UPF0150) ) B> PF05534 ( PF05534 HicB family ) Only B has a clan ( CL0057.9 ). the two keywords coincide on UniRef90 proteins: |PF03681| = 235 , |PF05534| = 49 , |PF03681^PF05534| = 9 ( 3.8% and 18.4% ) Neither PF03681 nor PF05534 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF05534 SSF47598 0.702 (average over 13 mutual instances, PF05534 13 appearances, SSF47598 883 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 425 ) 6606929_PF00400_PF08145 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF08145 is 6590869 with Jaccard = 0.9574 |PF08145|=46 [ 45 1 1100164 1 ] parent [ 6590869 ] : 6606929 0.402307 (=79858/(50*3970)) 65.0467 given [ 6590869 ] : 6590869 0.427083 (=41/(2*48)) 57.546 best keyword for cluster 6590869 is PF08145 with Jaccard = 0.9574 [ 45 1 1100164 1 ] 0.9783 0.9783 sibling [ 6590869 ] : 6605419 0.38941 (=41457/(27*3943)) 64.2818 best keyword for cluster 6605419 is PF00400 with Jaccard = 0.6629 [ 3780 20 1094509 1902 ] 0.9947 0.6653 SUGGESTING RELATEDNESS OF: A> PF08145 ( PF08145 BOP1NT (NUC169) domain ) B> PF00400 ( PF00400 WD domain, G-beta repeat ) Only B has a clan ( CL0186.8 ). the two keywords coincide on UniRef90 proteins: |PF00400| = 5682 , |PF08145| = 46 , |PF00400^PF08145| = 37 ( 0.7% and 80.4% ) only PF08145 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 426 ) 6722011_PF01037_PF06018 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01037 is 6699533 with Jaccard = 0.9569 |PF01037|=830 [ 821 28 1099353 9 ] parent [ 6699533 ] : 6722011 0.0687566 (=4119/(1051*57)) 95.7055 given [ 6699533 ] : 6699533 0.0942857 (=99/(1*1050)) 92.1017 best keyword for cluster 6699533 is PF01037 with Jaccard = 0.9569 [ 821 28 1099353 9 ] 0.9670 0.9892 sibling [ 6699533 ] : 6704992 0.0714286 (=28/(8*49)) 93.0987 best keyword for cluster 6704992 is PF06018 with Jaccard = 0.9070 [ 39 4 1100168 0 ] 0.9070 1.0000 SUGGESTING RELATEDNESS OF: A> PF01037 ( PF01037 AsnC family ) B> PF06018 ( PF06018 CodY GAF-like domain ) A and B come from a different clan ( CL0032.9 , CL0161.7 ).
the two keywords do not coincide on UniRef90 proteins only PF01037 has a PDB structure (may not be up to date) PF01037 d.58.4.2 SUPERFAM mapping significantly overlapping: 1 PF01037 SSF54909 0.871 (average over 3219 mutual instances, PF01037 3221 appearances, SSF54909 7040 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 427 ) 6751998_PF02610_PF02952 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02952 is 6741924 with Jaccard = 0.9565 |PF02952|=22 [ 22 1 1100188 0 ] parent [ 6741924 ] : 6751998 0.0145349 (=25/(40*43)) 98.7688 given [ 6741924 ] : 6741924 0.020362 (=9/(17*26)) 97.9638 best keyword for cluster 6741924 is PF02952 with Jaccard = 0.9565 [ 22 1 1100188 0 ] 0.9565 1.0000 sibling [ 6741924 ] : 6721528 0.0769231 (=3/(1*39)) 95.641 best keyword for cluster 6721528 is PF02610 with Jaccard = 1.0000 [ 34 0 1100177 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF02952 ( PF02952 L-fucose isomerase, C-terminal domain ) B> PF02610 ( PF02610 L-arabinose isomerase ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins both PF02952 and PF02610 have PDB structures PF02952 b.43.2.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 428 ) 6700492_PF03344_PF05361 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF05361 is 6593454 with Jaccard = 0.9565 |PF05361|=23 [ 22 0 1100188 1 ] parent [ 6593454 ] : 6700492 0.0933333 (=42/(25*18)) 92.2648 given [ 6593454 ] : 6593454 0.434783 (=20/(2*23)) 58.6454 best keyword for cluster 6593454 is PF05361 with Jaccard = 0.9565 [ 22 0 1100188 1 ] 1.0000 0.9565 sibling [ 6593454 ] : 6678883 0.133333 (=6/(3*15)) 87.9044 best keyword for cluster 6678883 is PF03344 with Jaccard = 0.8571 [ 12 1 1100197 1 ] 0.9231 0.9231 SUGGESTING RELATEDNESS OF: A> PF05361 ( PF05361 PKC-activated protein phosphatase-1 inhibitor ) B> PF03344 ( PF03344 Daxx Family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF05361 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF05361 SSF81790 0.754 (average over 44 mutual instances, PF05361 46 appearances, SSF81790 47 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 429 ) 6739204_PF01488_PF02423 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF02423 is 6690153 with Jaccard = 0.9563 |PF02423|=201 [ 197 5 1100005 4 ] parent [ 6690153 ] : 6739204 0.0290937 (=4466/(656*234)) 97.703 given [ 6690153 ] : 6690153 0.103261 (=95/(4*230)) 90.2904 best keyword for cluster 6690153 is PF02423 with Jaccard = 0.9563 [ 197 5 1100005 4 ] 0.9752 0.9801 sibling [ 6690153 ] : 6648777 0.240202 (=23535/(426*230)) 79.6281 best keyword for cluster 6648777 is PF01488 with Jaccard = 0.9092 [ 571 16 1099583 41 ] 0.9727 0.9330 SUGGESTING RELATEDNESS OF: A> PF02423 ( PF02423 Ornithine cyclodeaminase/mu-crystallin family ) B> PF01488 ( PF01488 Shikimate / quinate 5-dehydrogenase ) they come from the same clan: CL0063.17 : PF03721 PF04820 PF02254 PF00899 PF01946 PF02882 PF01488 PF01118 PF08491 PF03435 PF04321 PF07992 PF00070 PF02719 PF02153 PF02423 PF05368 PF01210 PF07994 PF07993 PF03447 PF03446 PF01225 PF06039 PF01232 PF03949 PF05834 PF00056 PF08659 PF07991 PF03486 PF00044 PF00732 PF01134 PF01408 PF00996 PF00479 PF00743 PF01494 PF00890 PF03807 PF01370 PF00208 PF02670 PF01113 PF01266 PF02629 PF02558 PF01593 PF01262 PF00670 PF00107 PF00106 PF02737 PF01073 PF02826 the two keywords do not coincide on UniRef90 proteins both PF02423 and PF01488 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF02423 SSF51735 0.965 (average over 621 mutual instances, PF02423 624 appearances, SSF51735 164772 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 430 ) 6748772_PF04740_PF06013 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF06013 is 6742882 with Jaccard = 0.9560 |PF06013|=91 [ 87 0 1100120 4 ] parent [ 6742882 ] : 6748772 0.0160004 (=341/(144*148)) 98.5252 given [ 6742882 ] : 6742882 0.0239808 (=30/(139*9)) 98.0418 best keyword for cluster 6742882 is PF06013 with Jaccard = 0.9560 [ 87 0 1100120 4 ] 1.0000 0.9560 sibling [ 6742882 ] : 6737739 0.0250665 (=113/(46*98)) 97.5598 best keyword for cluster 6737739 is PF04740 with Jaccard = 0.6383 [ 30 16 1100164 1 ] 0.6522 0.9677 SUGGESTING RELATEDNESS OF: A> PF06013 ( PF06013 Proteins of 100 residues with WXG ) B> PF04740 ( PF04740 Bacillus transposase protein ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF06013 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 431 ) 6760411_PF05347_PF05882 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF05347 is 6756063 with Jaccard = 0.9549 |PF05347|=132 [ 127 1 1100078 5 ] parent [ 6756063 ] : 6760411 0.0103571 (=87/(35*240)) 99.2856 given [ 6756063 ] : 6756063 0.0149173 (=83/(26*214)) 99.0359 best keyword for cluster 6756063 is PF05347 with Jaccard = 0.9549 [ 127 1 1100078 5 ] 0.9922 0.9621 sibling [ 6756063 ] : 6746912 0.0294118 (=1/(1*34)) 98.3824 best keyword for cluster 6746912 is PF05882 with Jaccard = 0.9310 [ 27 1 1100182 1 ] 0.9643 0.9643 SUGGESTING RELATEDNESS OF: A> PF05347 ( PF05347 Complex 1 protein (LYR family) ) B> PF05882 ( PF05882 ACN9 family ) Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins Neither PF05347 nor PF05882 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF05882 SSF46458 0.685 (average over 1 mutual instances, PF05882 1 appearances, SSF46458 5480 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 432 ) 6697014_PF01042_PF01902 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01042 is 6678666 with Jaccard = 0.9545 |PF01042|=791 [ 755 0 1099420 36 ] parent [ 6678666 ] : 6697014 0.0879872 (=8106/(107*861)) 91.7391 given [ 6678666 ] : 6678666 0.13696 (=2528/(22*839)) 87.8361 best keyword for cluster 6678666 is PF01042 with Jaccard = 0.9545 [ 755 0 1099420 36 ] 1.0000 0.9545 sibling [ 6678666 ] : 6637247 0.253205 (=79/(104*3)) 76.3247 best keyword for cluster 6637247 is PF01902 with Jaccard = 0.9083 [ 99 0 1100102 10 ] 1.0000 0.9083 SUGGESTING RELATEDNESS OF: A> PF01042 ( PF01042 Endoribonuclease L-PSP ) B> PF01902 ( PF01902 ATP-binding region ) Only B has a clan ( CL0039.7 ).
the two keywords coincide on UniRef90 proteins: |PF01042| = 791 , |PF01902| = 109 , |PF01042^PF01902| = 23 ( 2.9% and 21.1% ) both PF01042 and PF01902 have PDB structures PF01902 c.26.2.1 SUPERFAM mapping significantly overlapping: 1 PF01042 SSF55298 0.919 (average over 2576 mutual instances, PF01042 2591 appearances, SSF55298 2787 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 433 ) 6776393_PF03414_PF04487 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03414 is 6762969 with Jaccard = 0.9545 |PF03414|=44 [ 42 0 1100167 2 ] parent [ 6762969 ] : 6776393 0.00128425 (=6/(73*64)) 99.8882 given [ 6762969 ] : 6762969 0.00584795 (=6/(54*19)) 99.4152 best keyword for cluster 6762969 is PF03414 with Jaccard = 0.9545 [ 42 0 1100167 2 ] 1.0000 0.9545 sibling [ 6762969 ] : 6764583 0.00651042 (=5/(16*48)) 99.4927 best keyword for cluster 6764583 is PF04487 with Jaccard = 0.9474 [ 18 1 1100192 0 ] 0.9474 1.0000 SUGGESTING RELATEDNESS OF: A> PF03414 ( PF03414 Glycosyltransferase family 6 ) B> PF04487 ( PF04487 CITED ) Only A has a clan ( CL0110.6 ). the two keywords do not coincide on UniRef90 proteins both PF03414 and PF04487 have PDB structures PF03414 c.68.1.9 PF04487 j.96.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 434 ) 6731040_PF06542_PF06879 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF06879 is 6616647 with Jaccard = 0.9524 |PF06879|=21 [ 20 0 1100190 1 ] parent [ 6616647 ] : 6731040 0.0396975 (=21/(23*23)) 96.8383 given [ 6616647 ] : 6616647 0.333333 (=30/(18*5)) 69.067 best keyword for cluster 6616647 is PF06879 with Jaccard = 0.9524 [ 20 0 1100190 1 ] 1.0000 0.9524 sibling [ 6616647 ] : 6692792 0.0916667 (=11/(8*15)) 90.8337 best keyword for cluster 6692792 is PF06542 with Jaccard = 0.8571 [ 12 0 1100197 2 ] 1.0000 0.8571 SUGGESTING RELATEDNESS OF: A> PF06879 ( PF06879 Protein of unknown function (DUF1261) ) B> PF06542 ( PF06542 Protein of unknown function (DUF1114) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF06879 nor PF06542 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 435 ) 6718526_PF07297_PF08510 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07297 is 6615336 with Jaccard = 0.9524 |PF07297|=21 [ 20 0 1100190 1 ] parent [ 6615336 ] : 6718526 0.06375 (=51/(32*25)) 95.2182 given [ 6615336 ] : 6615336 0.326087 (=15/(23*2)) 68.5608 best keyword for cluster 6615336 is PF07297 with Jaccard = 0.9524 [ 20 0 1100190 1 ] 1.0000 0.9524 sibling [ 6615336 ] : 6605222 0.366667 (=22/(2*30)) 64.0784 best keyword for cluster 6605222 is PF08510 with Jaccard = 0.8000 [ 28 0 1100176 7 ] 1.0000 0.8000 SUGGESTING RELATEDNESS OF: A> PF07297 ( PF07297 Dolichol phosphate-mannose biosynthesis regulatory protein (DPM2) ) B> PF08510 ( PF08510 PIG-P ) Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins Neither PF07297 nor PF08510 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 436 ) 6762703_PF01862_PF07357 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF07357 is 6043261 with Jaccard = 0.9524 |PF07357|=21 [ 20 0 1100190 1 ] parent [ 6043261 ] : 6762703 0.00765306 (=9/(21*56)) 99.404 given [ 6043261 ] : 6043261 1 (=104/(8*13)) 2.30973e-30 best keyword for cluster 6043261 is PF07357 with Jaccard = 0.9524 [ 20 0 1100190 1 ] 1.0000 0.9524 sibling [ 6043261 ] : 6756740 0.0132576 (=7/(44*12)) 99.0752 best keyword for cluster 6756740 is PF01862 with Jaccard = 0.9737 [ 37 1 1100173 0 ] 0.9737 1.0000 SUGGESTING RELATEDNESS OF: A> PF07357 ( PF07357 Dinitrogenase reductase ADP-ribosyltransferase (DRAT) ) B> PF01862 ( PF01862 Pyruvoyl-dependent arginine decarboxylase (PvlArgDC) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF07357 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF01862 SSF56271 0.943 (average over 81 mutual instances, PF01862 81 appearances, SSF56271 104 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 437 ) 6722479_PF02453_PF07234 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF02453 is 6622261 with Jaccard = 0.9521 |PF02453|=166 [ 159 1 1100044 7 ] parent [ 6622261 ] : 6722479 0.0706494 (=136/(175*11)) 95.7839 given [ 6622261 ] : 6622261 0.310909 (=513/(10*165)) 71.4791 best keyword for cluster 6622261 is PF02453 with Jaccard = 0.9521 [ 159 1 1100044 7 ] 0.9938 0.9578 sibling [ 6622261 ] : 6669678 0.25 (=6/(8*3)) 85.4167 best keyword for cluster 6669678 is PF07234 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF02453 ( PF02453 Reticulon ) B> PF07234 ( PF07234 Protein of unknown function (DUF1426) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF02453 nor PF07234 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 438 ) 6772698_PF01974_PF06315 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01974 is 6748618 with Jaccard = 0.9515 |PF01974|=103 [ 98 0 1100108 5 ] parent [ 6748618 ] : 6772698 0.00242234 (=17/(121*58)) 99.7962 given [ 6748618 ] : 6748618 0.0200501 (=16/(114*7)) 98.5119 best keyword for cluster 6748618 is PF01974 with Jaccard = 0.9515 [ 98 0 1100108 5 ] 1.0000 0.9515 sibling [ 6748618 ] : 6768061 0.00555556 (=4/(40*18)) 99.6414 best keyword for cluster 6768061 is PF06315 with Jaccard = 0.9667 [ 29 1 1100181 0 ] 0.9667 1.0000 SUGGESTING RELATEDNESS OF: A> PF01974 ( PF01974 tRNA intron endonuclease, catalytic C-terminal domain ) B> PF06315 ( PF06315 Isocitrate dehydrogenase kinase/phosphatase (AceK) ) Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins only PF01974 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF01974 SSF53032 0.905 (average over 199 mutual instances, PF01974 259 appearances, SSF53032 237 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 439 ) 6746649_PF01920_PF02996 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02996 is 6695281 with Jaccard = 0.9515 |PF02996|=164 [ 157 1 1100046 7 ] parent [ 6695281 ] : 6746649 0.0216861 (=729/(176*191)) 98.3616 given [ 6695281 ] : 6695281 0.0997721 (=613/(48*128)) 91.3885 best keyword for cluster 6695281 is PF02996 with Jaccard = 0.9515 [ 157 1 1100046 7 ] 0.9937 0.9573 sibling [ 6695281 ] : 6714974 0.065107 (=487/(136*55)) 94.716 best keyword for cluster 6714974 is PF01920 with Jaccard = 0.9382 [ 167 0 1100033 11 ] 1.0000 0.9382 SUGGESTING RELATEDNESS OF: A> PF02996 ( PF02996 Prefoldin subunit ) B> PF01920 ( PF01920 Prefoldin subunit ) they come from the same clan: CL0200.5 : PF01920 PF02996 the two keywords do not coincide on UniRef90 proteins both PF02996 and PF01920 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF02996 SSF46579 0.892 (average over 327 mutual instances, PF02996 330 appearances, SSF46579 930 appearances) 2 PF01920 SSF46579 0.944 (average over 357 mutual instances, PF01920 366 appearances, SSF46579 930 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 440 ) 6749744_PF03332_PF07851 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best
cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03332 is 6529894 with Jaccard = 0.9508 |PF03332|=58 [ 58 3 1100150 0 ] parent [ 6529894 ] : 6749744 0.021585 (=70/(69*47)) 98.6006 given [ 6529894 ] : 6529894 0.753731 (=101/(2*67)) 25.8978 best keyword for cluster 6529894 is PF03332 with Jaccard = 0.9508 [ 58 3 1100150 0 ] 0.9508 1.0000 sibling [ 6529894 ] : 6744182 0.0189189 (=7/(10*37)) 98.1625 best keyword for cluster 6744182 is PF07851 with Jaccard = 1.0000 [ 20 0 1100191 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03332 ( PF03332 Eukaryotic phosphomannomutase ) B> PF07851 ( PF07851 TMPIT-like protein ) Only A has a clan ( CL0137.9 ). the two keywords do not coincide on UniRef90 proteins only PF03332 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 441 ) 6546863_PF02958_PF07914 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF02958 is 6532919 with Jaccard = 0.9504 |PF02958|=121 [ 115 0 1100090 6 ] parent [ 6532919 ] : 6546863 0.671008 (=3194/(40*119)) 36.5246 given [ 6532919 ] : 6532919 0.771186 (=91/(1*118)) 27.6027 best keyword for cluster 6532919 is PF02958 with Jaccard = 0.9504 [ 115 0 1100090 6 ] 1.0000 0.9504 sibling [ 6532919 ] : 6509948 0.897436 (=35/(1*39)) 15.3955 best keyword for cluster 6509948 is PF07914 with Jaccard = 0.9310 [ 27 1 1100182 1 ] 0.9643 0.9643 SUGGESTING RELATEDNESS OF: A> PF02958 ( PF02958 Domain of unknown function (DUF227) ) B> PF07914 ( PF07914 Protein of unknown function (DUF1679) ) they come from the same clan: CL0016.14 : PF07714 PF00069 PF06293 PF03881 PF02958 PF07914 PF01633 PF04655 PF01636 PF03109 PF05445 PF01163 PF06176 the two keywords do not coincide on UniRef90 proteins Neither PF02958 nor PF07914 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF07914 SSF56112 0.516 (average over 46 mutual instances, PF07914 47 appearances, SSF56112 66637 appearances) 2 PF02958 SSF56112 0.668 (average over 283 mutual instances, PF02958 288 appearances, SSF56112 66637 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 442 ) 6448201_PF01018_PF06071 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF01018 is 6439767 with Jaccard = 0.9502 |PF01018|=299 [ 286 2 1099910 13 ] parent [ 6439767 ] : 6448201 0.989609 (=100382/(321*316)) 1.18297 given [ 6439767 ] : 6439767 0.993631 (=624/(2*314)) 0.64166 best keyword for cluster 6439767 is PF01018 with Jaccard = 0.9502 [ 286 2 1099910 13 ] 0.9931 0.9565 sibling [ 6439767 ] : 6384589 1 (=13328/(272*49)) 0.000979999 best keyword for cluster 6384589 is PF06071 with Jaccard = 0.7848 [ 248 52 1099895 16 ] 0.8267 0.9394 SUGGESTING RELATEDNESS OF: A> PF01018 ( PF01018 GTP1/OBG ) B> PF06071 ( PF06071 Protein of unknown function (DUF933) ) Only B has a clan ( CL0072.14 ). the two keywords do not coincide on UniRef90 proteins both PF01018 and PF06071 have PDB structures PF01018 b.117.1.1 PF06071 d.15.10.2 SUPERFAM mapping significantly overlapping: 1 PF06071 SSF81271 0.987 (average over 918 mutual instances, PF06071 920 appearances, SSF81271 8501 appearances) 2 PF01018 SSF82051 0.983 (average over 948 mutual instances, PF01018 1220 appearances, SSF82051 2144 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 443 ) 6603163_PF00894_PF01690 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF01690 is 6170440 with Jaccard = 0.9500 |PF01690|=20 [ 19 0 1100191 1 ] parent [ 6170440 ] : 6603163 0.368421 (=91/(19*13)) 63.1579 given [ 6170440 ] : 6170440 1 (=48/(3*16)) 1.03942e-19 best keyword for cluster 6170440 is PF01690 with Jaccard = 0.9500 [ 19 0 1100191 1 ] 1.0000 0.9500 sibling [ 6170440 ] : 6209349 1 (=30/(3*10)) 1.6738e-16 best keyword for cluster 6209349 is PF00894 with Jaccard = 0.6500 [ 13 0 1100191 7 ] 1.0000 0.6500 SUGGESTING RELATEDNESS OF: A> PF01690 ( PF01690 Potato leaf roll virus readthrough protein ) B> PF00894 ( PF00894 Luteovirus coat protein ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF00894| = 20 , |PF01690| = 20 , |PF00894^PF01690| = 7 ( 35.0% and 35.0% ) Neither PF01690 nor PF00894 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 444 ) 5389938_PF05379_PF05413 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF05379 is 5165011 with Jaccard = 0.9500 |PF05379|=20 [ 19 0 1100191 1 ] parent [ 5165011 ] : 5389938 1 (=140/(20*7)) 8.01509e-118 given [ 5165011 ] : 5165011 1 (=19/(1*19)) 0 best keyword for cluster 5165011 is PF05379 with Jaccard = 0.9500 [ 19 0 1100191 1 ] 1.0000 0.9500 sibling [ 5165011 ] : 5311096 1 (=6/(1*6)) 1.66672e-137 best keyword for cluster 5311096 is PF05413 with Jaccard = 0.8571 [ 6 1 1100204 0 ] 0.8571 1.0000 SUGGESTING RELATEDNESS OF: A> PF05379 ( PF05379 Carlavirus endopeptidase ) B> PF05413 ( PF05413 Putative closterovirus papain-like endopeptidase ) Only A has a clan ( CL0125.9 ).
the two keywords do not coincide on UniRef90 proteins Neither PF05379 nor PF05413 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 445 ) 6715876_PF00430_PF01991 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01991 is 6654698 with Jaccard = 0.9492 |PF01991|=118 [ 112 0 1100093 6 ] parent [ 6654698 ] : 6715876 0.0651359 (=6397/(122*805)) 94.8622 given [ 6654698 ] : 6654698 0.208547 (=122/(5*117)) 81.624 best keyword for cluster 6654698 is PF01991 with Jaccard = 0.9492 [ 112 0 1100093 6 ] 1.0000 0.9492 sibling [ 6654698 ] : 6713393 0.0744906 (=3583/(65*740)) 94.4672 best keyword for cluster 6713393 is PF00430 with Jaccard = 0.6472 [ 387 204 1099613 7 ] 0.6548 0.9822 SUGGESTING RELATEDNESS OF: A> PF01991 ( PF01991 ATP synthase (E/31 kDa) subunit ) B> PF00430 ( PF00430 ATP synthase B/B' CF(0) ) Only B has a clan ( CL0255.4 ). the two keywords do not coincide on UniRef90 proteins only PF01991 has a PDB structure (may not be up to date) PF00430 f.23.21.1 j.35.1.1 SUPERFAM mapping significantly overlapping: 1 PF00430 SSF82607 0.684 (average over 1 mutual instances, PF00430 10 appearances, SSF82607 761 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 446 ) 6689485_PF04032_PF08296 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF04032 is 6550824 with Jaccard = 0.9492 |PF04032|=59 [ 56 0 1100152 3 ] parent [ 6550824 ] : 6689485 0.132254 (=98/(57*13)) 90.1608 given [ 6550824 ] : 6550824 0.648883 (=523/(26*31)) 39.5374 best keyword for cluster 6550824 is PF04032 with Jaccard = 0.9492 [ 56 0 1100152 3 ] 1.0000 0.9492 sibling [ 6550824 ] : 6658361 0.190476 (=8/(7*6)) 82.8293 best keyword for cluster 6658361 is PF08296 with Jaccard = 0.6364 [ 7 0 1100200 4 ] 1.0000 0.6364 SUGGESTING RELATEDNESS OF: A> PF04032 ( PF04032 RNAse P Rpr2/Rpp21/SNM1 subunit domain ) B> PF08296 ( ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF04032 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 447 ) 6733306_PF01888_PF02571 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02571 is 6497012 with Jaccard = 0.9490 |PF02571|=98 [ 93 0 1100113 5 ] parent [ 6497012 ] : 6733306 0.029596 (=293/(100*99)) 97.0857 given [ 6497012 ] : 6497012 0.923913 (=680/(8*92)) 10.3702 best keyword for cluster 6497012 is PF02571 with Jaccard = 0.9490 [ 93 0 1100113 5 ] 1.0000 0.9490 sibling [ 6497012 ] : 6731849 0.0408163 (=4/(1*98)) 96.9235 best keyword for cluster 6731849 is PF01888 with Jaccard = 0.9894 [ 93 0 1100117 1 ] 1.0000 0.9894 SUGGESTING RELATEDNESS OF: A> PF02571 ( PF02571 Precorrin-6x reductase CbiJ/CobK ) B> PF01888 ( PF01888 CbiD ) Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01888| = 94 , |PF02571| = 98 , |PF01888^PF02571| = 3 ( 3.2% and 3.1% ) only PF02571 has a PDB structure (may not be up to date) PF01888 e.54.1.1 SUPERFAM mapping significantly overlapping: 1 PF01888 SSF111342 0.885 (average over 277 mutual instances, PF01888 277 appearances, SSF111342 288 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 448 ) 6736552_PF01052_PF04509 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01052 is 6700713 with Jaccard = 0.9489 |PF01052|=354 [ 353 18 1099839 1 ] parent [ 6700713 ] : 6736552 0.0275438 (=2237/(432*188)) 97.4309 given [ 6700713 ] : 6700713 0.0790297 (=202/(6*426)) 92.3035 best keyword for cluster 6700713 is PF01052 with Jaccard = 0.9489 [ 353 18 1099839 1 ] 0.9515 0.9972 sibling [ 6700713 ] : 6711572 0.0717665 (=622/(81*107)) 94.1761 best keyword for cluster 6711572 is PF04509 with Jaccard = 0.7667 [ 115 0 1100061 35 ] 1.0000 0.7667 SUGGESTING RELATEDNESS OF: A> PF01052 ( PF01052 Surface presentation of antigens (SPOA) protein ) B> PF04509 ( PF04509 CheC-like family ) Neither A nor B are assigned a clan.
the two keywords coincide on Uniref90 proteins: |PF01052| = 354 , |PF04509| = 150 , |PF01052^PF04509| = 27 ( 7.6% and 18.0% ) both PF01052 and PF04509 have PDB structures PF04509 d.252.1.1 SUPERFAM mapping significantly overlapping: 1 PF01052 SSF101801 0.925 (average over 1196 mutual instances, PF01052 1196 appearances, SSF101801 1677 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 449 ) 6553137_PF00154_PF08423 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00154 is 6525584 with Jaccard = 0.9488 |PF00154|=332 [ 315 0 1099879 17 ] parent [ 6525584 ] : 6553137 0.618222 (=49616/(352*228)) 41.2553 given [ 6525584 ] : 6525584 0.776627 (=3675/(14*338)) 23.2444 best keyword for cluster 6525584 is PF00154 with Jaccard = 0.9488 [ 315 0 1099879 17 ] 1.0000 0.9488 sibling [ 6525584 ] : 6539891 0.692222 (=2483/(17*211)) 32.3724 best keyword for cluster 6539891 is PF08423 with Jaccard = 0.9529 [ 182 8 1100020 1 ] 0.9579 0.9945 SUGGESTING RELATEDNESS OF: A> PF00154 ( PF00154 recA bacterial DNA recombination protein ) B> PF08423 ( PF08423 Rad51 ) they come from the same clan: CL0216.4 : PF08423 PF00154 the two keywords coincide on Uniref90 proteins: |PF00154| = 332 , |PF08423| = 183 , |PF00154^PF08423| = 9 ( 2.7% and 4.9% ) both PF00154 and PF08423 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 450 ) 6781262_PF02452_PF02495 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) 
====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02495 is 6778505 with Jaccard = 0.9487 |PF02495|=77 [ 74 1 1100133 3 ] parent [ 6778505 ] : 6781262 0.000358262 (=28/(203*385)) 99.9707 given [ 6778505 ] : 6778505 0.00136293 (=14/(96*107)) 99.9292 best keyword for cluster 6778505 is PF02495 with Jaccard = 0.9487 [ 74 1 1100133 3 ] 0.9867 0.9610 sibling [ 6778505 ] : 6780292 0.00260417 (=1/(1*384)) 99.9583 best keyword for cluster 6780292 is PF02452 with Jaccard = 0.9803 [ 149 3 1100059 0 ] 0.9803 1.0000 SUGGESTING RELATEDNESS OF: A> PF02495 ( PF02495 7kD viral coat protein ) B> PF02452 ( PF02452 PemK-like protein ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF02495 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF02452 SSF50118 0.943 (average over 466 mutual instances, PF02452 466 appearances, SSF50118 511 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 451 ) 6769850_PF03498_PF06680 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF03498 is 6746534 with Jaccard = 0.9474 |PF03498|=19 [ 18 0 1100192 1 ] parent [ 6746534 ] : 6769850 0.00357143 (=3/(40*21)) 99.706 given [ 6746534 ] : 6746534 0.0204604 (=8/(17*23)) 98.3524 best keyword for cluster 6746534 is PF03498 with Jaccard = 0.9474 [ 18 0 1100192 1 ] 1.0000 0.9474 sibling [ 6746534 ] : 6758061 0.00909091 (=1/(10*11)) 99.1573 best keyword for cluster 6758061 is PF06680 with Jaccard = 1.0000 [ 2 0 1100209 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03498 ( PF03498 Cytolethal distending toxin A/C family ) B> PF06680 ( PF06680 Protein of unknown function (DUF1181) ) Only A has a clan ( CL0066.9 ). the two keywords do not coincide on UniRef90 proteins only PF03498 has a PDB structure (may not be up to date) PF03498 b.42.2.1 SUPERFAM mapping significantly overlapping: 1 PF03498 SSF50370 0.866 (average over 96 mutual instances, PF03498 97 appearances, SSF50370 1691 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 452 ) 6685473_PF01947_PF04482 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF04482 is 6392012 with Jaccard = 0.9474 |PF04482|=18 [ 18 1 1100192 0 ] parent [ 6392012 ] : 6685473 0.12963 (=49/(21*18)) 89.3686 given [ 6392012 ] : 6392012 1 (=38/(2*19)) 0.00283734 best keyword for cluster 6392012 is PF04482 with Jaccard = 0.9474 [ 18 1 1100192 0 ] 0.9474 1.0000 sibling [ 6392012 ] : 6643837 0.222222 (=16/(12*6)) 78.1124 best keyword for cluster 6643837 is PF01947 with Jaccard = 0.7500 [ 12 0 1100195 4 ] 1.0000 0.7500 SUGGESTING RELATEDNESS OF: A> PF04482 ( PF04482 Protein of unknown function (DUF564) ) B> PF01947 ( PF01947 Protein of unknown function DUF98 ) Only B has a clan ( CL0122.6 ). the two keywords coincide on Uniref90 proteins: |PF01947| = 16 , |PF04482| = 18 , |PF01947^PF04482| = 1 ( 6.2% and 5.6% ) Neither PF04482 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 453 ) 6732698_PF06937_PF07763 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07763 is 6710207 with Jaccard = 0.9474 |PF07763|=19 [ 18 0 1100192 1 ] parent [ 6710207 ] : 6732698 0.0350877 (=16/(12*38)) 97.0156 given [ 6710207 ] : 6710207 0.0727273 (=12/(33*5)) 93.9706 best keyword for cluster 6710207 is PF07763 with Jaccard = 0.9474 [ 18 0 1100192 1 ] 1.0000 0.9474 sibling [ 6710207 ] : 6706399 0.0857143 (=3/(7*5)) 93.3657 best keyword for cluster 6706399 is PF06937 with Jaccard = 1.0000 [ 7 0 1100204 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07763 ( PF07763 FEZ-like protein ) B> PF06937 ( PF06937 EURL protein ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF07763 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 454 ) 6709457_PF02525_PF03358 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02525 is 6697772 with Jaccard = 0.9470 |PF02525|=395 [ 375 1 1099815 20 ] parent [ 6697772 ] : 6709457 0.0851348 (=14733/(417*415)) 93.8555 given [ 6697772 ] : 6697772 0.0871671 (=72/(2*413)) 91.8466 best keyword for cluster 6697772 is PF02525 with Jaccard = 0.9470 [ 375 1 1099815 20 ] 0.9973 0.9494 sibling [ 6697772 ] : 6621507 0.33311 (=8455/(74*343)) 71.0028 best keyword for cluster 6621507 is PF03358 with Jaccard = 0.6543 [ 371 0 1099644 196 ] 1.0000 0.6543 SUGGESTING RELATEDNESS OF: A> PF02525 ( PF02525 Flavodoxin-like fold ) B> PF03358 ( PF03358 NADPH-dependent FMN reductase ) they come from the same clan: CL0042.7 : PF00258 PF02525 PF07972 PF03358 the two keywords do not coincide on UniRef90 proteins both PF02525 and PF03358 have PDB structures PF02525 c.23.5.3 PF03358 c.23.5.4 c.23.5.6 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 455 ) 6757287_PF00293_PF06381 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00293 is 6755581 with Jaccard = 0.9465 |PF00293|=2744 [ 2602 5 1097462 142 ] parent [ 6755581 ] : 6757287 0.00979154 (=1628/(54*3079)) 99.1089 given [ 6755581 ] : 6755581 0.0150833 (=1743/(38*3041)) 99.0025 best keyword for cluster 6755581 is PF00293 with Jaccard = 0.9465 [ 2602 5 1097462 142 ] 0.9981 0.9483 sibling [ 6755581 ] : 6751497 0.0204082 (=5/(5*49)) 98.733 best keyword for cluster 6751497 is PF06381 with Jaccard = 0.7500 [ 6 2 1100203 0 ] 0.7500 1.0000 SUGGESTING RELATEDNESS OF: A> PF00293 ( PF00293 NUDIX domain ) B> PF06381 ( PF06381 Protein of unknown function (DUF1073) ) Only A has a clan ( CL0261.2 ). the two keywords do not coincide on UniRef90 proteins only PF00293 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF00293 SSF55811 0.818 (average over 8148 mutual instances, PF00293 8350 appearances, SSF55811 10363 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 456 ) 6715444_PF00005_PF06792 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF06792 is 6532908 with Jaccard = 0.9459 |PF06792|=37 [ 35 0 1100174 2 ] parent [ 6532908 ] : 6715444 0.0524778 (=39020/(37*20096)) 94.7958 given [ 6532908 ] : 6532908 0.777778 (=28/(1*36)) 27.5943 best keyword for cluster 6532908 is PF06792 with Jaccard = 0.9459 [ 35 0 1100174 2 ] 1.0000 0.9459 sibling [ 6532908 ] : 6713034 0.0696297 (=5596/(4*20092)) 94.4116 best keyword for cluster 6713034 is PF00005 with Jaccard = 0.9910 [ 18190 107 1081855 59 ] 0.9942 0.9968 SUGGESTING RELATEDNESS OF: A> PF06792 ( PF06792 Uncharacterised protein family (UPF0261) ) B> PF00005 ( PF00005 ABC transporter ) Only B has a clan ( CL0023.26 ). the two keywords coincide on Uniref90 proteins: |PF00005| = 18249 , |PF06792| = 37 , |PF00005^PF06792| = 2 ( 0.0% and 5.4% ) only PF06792 has a PDB structure (may not be up to date) PF00005 c.37.1.12 j.35.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 457 ) 6683032_PF00326_PF07676 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00326 is 6658985 with Jaccard = 0.9455 |PF00326|=696 [ 676 19 1099496 20 ] parent [ 6658985 ] : 6683032 0.127578 (=40103/(780*403)) 88.9462 given [ 6658985 ] : 6658985 0.179587 (=834/(774*6)) 83.0057 best keyword for cluster 6658985 is PF00326 with Jaccard = 0.9455 [ 676 19 1099496 20 ] 0.9727 0.9713 sibling [ 6658985 ] : 6679084 0.148423 (=640/(11*392)) 87.9687 best keyword for cluster 6679084 is PF07676 with Jaccard = 0.6368 [ 249 20 1099820 122 ] 0.9257 0.6712 SUGGESTING RELATEDNESS OF: A> PF00326 ( PF00326 Prolyl oligopeptidase family ) B> PF07676 ( PF07676 WD40-like Beta Propeller Repeat ) A and B come from a different clan ( CL0028.14 , CL0186.8 ). the two keywords coincide on Uniref90 proteins: |PF00326| = 696 , |PF07676| = 371 , |PF00326^PF07676| = 60 ( 8.6% and 16.2% ) both PF00326 and PF07676 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 458 ) 6751762_PF01564_PF02675 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01564 is 6745131 with Jaccard = 0.9449 |PF01564|=263 [ 257 9 1099939 6 ] parent [ 6745131 ] : 6751762 0.0126661 (=734/(122*475)) 98.7514 given [ 6745131 ] : 6745131 0.0194805 (=117/(13*462)) 98.2445 best keyword for cluster 6745131 is PF01564 with Jaccard = 0.9449 [ 257 9 1099939 6 ] 0.9662 0.9772 sibling [ 6745131 ] : 6644380 0.279167 (=67/(2*120)) 78.2652 best keyword for cluster 6644380 is PF02675 with Jaccard = 0.9652 [ 111 0 1100096 4 ] 1.0000 0.9652 SUGGESTING RELATEDNESS OF: A> PF01564 ( PF01564 Spermine/spermidine synthase ) B> PF02675 ( PF02675 S-adenosylmethionine decarboxylase ) Only A has a clan ( CL0102.14 ). the two keywords coincide on Uniref90 proteins: |PF01564| = 263 , |PF02675| = 115 , |PF01564^PF02675| = 4 ( 1.5% and 3.5% ) both PF01564 and PF02675 have PDB structures PF02675 d.156.1.2 SUPERFAM mapping significantly overlapping: 1 PF02675 SSF56276 0.924 (average over 324 mutual instances, PF02675 324 appearances, SSF56276 602 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 459 ) 6538599_PF02471_PF06780 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02471 is 5969585 with Jaccard = 0.9444 |PF02471|=18 [ 17 0 1100193 1 ] parent [ 5969585 ] : 6538599 0.71267 (=315/(17*26)) 31.2588 given [ 5969585 ] : 5969585 1 (=70/(10*7)) 5.15297e-37 best keyword for cluster 5969585 is PF02471 with Jaccard = 0.9444 [ 17 0 1100193 1 ] 1.0000 0.9444 sibling [ 5969585 ] : 6523733 0.791667 (=133/(14*12)) 22.0105 best keyword for cluster 6523733 is PF06780 with Jaccard = 1.0000 [ 14 0 1100197 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF02471 ( PF02471 Borrelia outer surface protein E ) B> PF06780 ( PF06780 Erp protein C-terminus ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF02471 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 460 ) 6767234_PF05339_PF07865 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07865 is 6627778 with Jaccard = 0.9444 |PF07865|=17 [ 17 1 1100193 0 ] parent [ 6627778 ] : 6767234 0.00396825 (=2/(21*24)) 99.6064 given [ 6627778 ] : 6627778 0.288889 (=26/(15*6)) 73.651 best keyword for cluster 6627778 is PF07865 with Jaccard = 0.9444 [ 17 1 1100193 0 ] 0.9444 1.0000 sibling [ 6627778 ] : 6749976 0.0234375 (=3/(16*8)) 98.6172 best keyword for cluster 6749976 is PF05339 with Jaccard = 0.9000 [ 9 1 1100201 0 ] 0.9000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07865 ( PF07865 Protein of unknown function (DUF1652) ) B> PF05339 ( PF05339 Protein of unknown function (DUF739) ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF07865 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 461 ) 6725933_PF02876_PF03642 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02876 is 6686928 with Jaccard = 0.9429 |PF02876|=69 [ 66 1 1100141 3 ] parent [ 6686928 ] : 6725933 0.0483721 (=104/(86*25)) 96.225 given [ 6686928 ] : 6686928 0.118902 (=39/(82*4)) 89.6319 best keyword for cluster 6686928 is PF02876 with Jaccard = 0.9429 [ 66 1 1100141 3 ] 0.9851 0.9565 sibling [ 6686928 ] : 6713745 0.0701754 (=8/(6*19)) 94.514 best keyword for cluster 6713745 is PF03642 with Jaccard = 0.8571 [ 6 1 1100204 0 ] 0.8571 1.0000 SUGGESTING RELATEDNESS OF: A> PF02876 ( PF02876 Staphylococcal/Streptococcal toxin, beta-grasp domain ) B> PF03642 ( PF03642 MAP domain ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins both PF02876 and PF03642 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 462 ) 6718967_PF00957_PF08366 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00957 is 6714576 with Jaccard = 0.9419 |PF00957|=257 [ 243 1 1099953 14 ] parent [ 6714576 ] : 6718967 0.0633997 (=1308/(69*299)) 95.2742 given [ 6714576 ] : 6714576 0.0604027 (=18/(1*298)) 94.6645 best keyword for cluster 6714576 is PF00957 with Jaccard = 0.9419 [ 243 1 1099953 14 ] 0.9959 0.9455 sibling [ 6714576 ] : 6666944 0.166667 (=90/(60*9)) 84.7003 best keyword for cluster 6666944 is PF08366 with Jaccard = 0.6047 [ 26 17 1100168 0 ] 0.6047 1.0000 SUGGESTING RELATEDNESS OF: A> PF00957 ( PF00957 Synaptobrevin ) B> PF08366 ( PF08366 LLGL2 ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF00957| = 257 , |PF08366| = 26 , |PF00957^PF08366| = 4 ( 1.6% and 15.4% ) only PF00957 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 463 ) 6771350_PF01231_PF01648 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01648 is 6742879 with Jaccard = 0.9413 |PF01648|=493 [ 465 1 1099717 28 ] parent [ 6742879 ] : 6771350 0.00261378 (=136/(542*96)) 99.7551 given [ 6742879 ] : 6742879 0.0265441 (=823/(65*477)) 98.0414 best keyword for cluster 6742879 is PF01648 with Jaccard = 0.9413 [ 465 1 1099717 28 ] 0.9979 0.9432 sibling [ 6742879 ] : 6768552 0.00460526 (=7/(20*76)) 99.6595 best keyword for cluster 6768552 is PF01231 with Jaccard = 1.0000 [ 40 0 1100171 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF01648 ( PF01648 4'-phosphopantetheinyl transferase superfamily ) B> PF01231 ( PF01231 Indoleamine 2,3-dioxygenase ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF01231| = 40 , |PF01648| = 493 , |PF01231^PF01648| = 1 ( 2.5% and 0.2% ) both PF01648 and PF01231 have PDB structures PF01648 d.150.1.2 PF01231 a.266.1.2 SUPERFAM mapping significantly overlapping: 1 PF01648 SSF56214 0.575 (average over 1410 mutual instances, PF01648 1579 appearances, SSF56214 1581 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 464 ) 6687377_PF04410_PF05492 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF05492 is 6515157 with Jaccard = 0.9412 |PF05492|=34 [ 32 0 1100177 2 ] parent [ 6515157 ] : 6687377 0.144828 (=294/(35*58)) 89.7339 given [ 6515157 ] : 6515157 0.833333 (=55/(2*33)) 17.8184 best keyword for cluster 6515157 is PF05492 with Jaccard = 0.9412 [ 32 0 1100177 2 ] 1.0000 0.9412 sibling [ 6515157 ] : 6678470 0.169697 (=28/(3*55)) 87.825 best keyword for cluster 6678470 is PF04410 with Jaccard = 0.9487 [ 37 0 1100172 2 ] 1.0000 0.9487 SUGGESTING RELATEDNESS OF: A> PF05492 ( PF05492 NAF1 domain ) B> PF04410 ( PF04410 Gar1 protein RNA binding region ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF05492 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 465 ) 6713707_PF02274_PF04455 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02274 is 6686623 with Jaccard = 0.9409 |PF02274|=186 [ 175 0 1100025 11 ] parent [ 6686623 ] : 6713707 0.0582915 (=406/(35*199)) 94.5032 given [ 6686623 ] : 6686623 0.129416 (=674/(31*168)) 89.577 best keyword for cluster 6686623 is PF02274 with Jaccard = 0.9409 [ 175 0 1100025 11 ] 1.0000 0.9409 sibling [ 6686623 ] : 6647182 0.260684 (=61/(9*26)) 79.0054 best keyword for cluster 6647182 is PF04455 with Jaccard = 0.8000 [ 20 0 1100186 5 ] 1.0000 0.8000 SUGGESTING RELATEDNESS OF: A> PF02274 ( PF02274 Amidinotransferase ) B> PF04455 ( PF04455 LOR/SDH bifunctional enzyme conserved region ) Only A has a clan ( CL0197.5 ). the two keywords coincide on Uniref90 proteins: |PF02274| = 186 , |PF04455| = 25 , |PF02274^PF04455| = 6 ( 3.2% and 24.0% ) only PF02274 has a PDB structure (may not be up to date) PF02274 d.126.1.2 d.126.1.3 d.126.1.4 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 466 ) 6756471_PF02606_PF03966 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF03966 is 6670234 with Jaccard = 0.9409 |PF03966|=186 [ 175 0 1100025 11 ] parent [ 6670234 ] : 6756471 0.00945526 (=310/(194*169)) 99.0591 given [ 6670234 ] : 6670234 0.174542 (=1363/(137*57)) 85.5941 best keyword for cluster 6670234 is PF03966 with Jaccard = 0.9409 [ 175 0 1100025 11 ] 1.0000 0.9409 sibling [ 6670234 ] : 6714182 0.0633947 (=62/(6*163)) 94.5945 best keyword for cluster 6714182 is PF02606 with Jaccard = 0.9790 [ 140 0 1100068 3 ] 1.0000 0.9790 SUGGESTING RELATEDNESS OF: A> PF03966 ( PF03966 Trm112p-like protein ) B> PF02606 ( PF02606 Tetraacyldisaccharide-1-P 4'-kinase ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF02606| = 143 , |PF03966| = 186 , |PF02606^PF03966| = 2 ( 1.4% and 1.1% ) Neither PF03966 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 467 ) 6722022_PF01975_PF03133 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01975 is 6651096 with Jaccard = 0.9406 |PF01975|=219 [ 206 0 1099992 13 ] parent [ 6651096 ] : 6722022 0.0432618 (=2772/(233*275)) 95.7095 given [ 6651096 ] : 6651096 0.227536 (=157/(230*3)) 80.3596 best keyword for cluster 6651096 is PF01975 with Jaccard = 0.9406 [ 206 0 1099992 13 ] 1.0000 0.9406 sibling [ 6651096 ] : 6716160 0.0748489 (=644/(36*239)) 94.9095 best keyword for cluster 6716160 is PF03133 with Jaccard = 0.9865 [ 219 2 1099989 1 ] 0.9910 0.9955 SUGGESTING RELATEDNESS OF: A> PF01975 ( PF01975 Survival protein SurE ) B> PF03133 ( PF03133 Tubulin-tyrosine ligase family ) Only B has a clan ( CL0179.8 ). 
the two keywords coincide on Uniref90 proteins: |PF01975| = 219 , |PF03133| = 220 , |PF01975^PF03133| = 13 ( 5.9% and 5.9% ) only PF01975 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF01975 SSF64167 0.747 (average over 681 mutual instances, PF01975 682 appearances, SSF64167 709 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 468 ) 6749295_PF02334_PF03551 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF03551 is 6736329 with Jaccard = 0.9404 |PF03551|=519 [ 489 1 1099691 30 ] parent [ 6736329 ] : 6749295 0.0179924 (=114/(11*576)) 98.568 given [ 6736329 ] : 6736329 0.0315789 (=108/(6*570)) 97.41 best keyword for cluster 6736329 is PF03551 with Jaccard = 0.9404 [ 489 1 1099691 30 ] 0.9980 0.9422 sibling [ 6736329 ] : 6730232 0.0333333 (=1/(5*6)) 96.75 best keyword for cluster 6730232 is PF02334 with Jaccard = 0.7500 [ 3 1 1100207 0 ] 0.7500 1.0000 SUGGESTING RELATEDNESS OF: A> PF03551 ( PF03551 Transcriptional regulator PadR-like family ) B> PF02334 ( PF02334 Replication terminator protein ) Only A has a clan ( CL0123.12 ). the two keywords do not coincide on UniRef90 proteins both PF03551 and PF02334 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 469 ) 6747098_PF02469_PF02676 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02469 is 6740345 with Jaccard = 0.9403 |PF02469|=266 [ 252 2 1099943 14 ] parent [ 6740345 ] : 6747098 0.0166176 (=226/(40*340)) 98.3971 given [ 6740345 ] : 6740345 0.0308834 (=179/(18*322)) 97.81 best keyword for cluster 6740345 is PF02469 with Jaccard = 0.9403 [ 252 2 1099943 14 ] 0.9921 0.9474 sibling [ 6740345 ] : 6516567 0.854396 (=311/(14*26)) 18.4009 best keyword for cluster 6516567 is PF02676 with Jaccard = 0.8605 [ 37 0 1100168 6 ] 1.0000 0.8605 SUGGESTING RELATEDNESS OF: A> PF02469 ( PF02469 Fasciclin domain ) B> PF02676 ( PF02676 TYW3 like ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF02469| = 266 , |PF02676| = 43 , |PF02469^PF02676| = 1 ( 0.4% and 2.3% ) both PF02469 and PF02676 have PDB structures PF02469 b.118.1.1 SUPERFAM mapping significantly overlapping: 1 PF02676 SSF111278 0.843 (average over 90 mutual instances, PF02676 99 appearances, SSF111278 112 appearances) 2 PF02469 SSF82153 0.813 (average over 790 mutual instances, PF02469 818 appearances, SSF82153 907 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 470 ) 6744540_PF00866_PF02982 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00866 is 6661533 with Jaccard = 0.9387 |PF00866|=154 [ 153 9 1100048 1 ] parent [ 6661533 ] : 6744540 0.0254237 (=819/(182*177)) 98.1969 given [ 6661533 ] : 6661533 0.201317 (=1345/(131*51)) 83.6086 best keyword for cluster 6661533 is PF00866 with Jaccard = 0.9387 [ 153 9 1100048 1 ] 0.9444 0.9935 sibling [ 6661533 ] : 6731451 0.0372093 (=32/(5*172)) 96.8798 best keyword for cluster 6731451 is PF02982 with Jaccard = 0.6250 [ 10 5 1100195 1 ] 0.6667 0.9091 SUGGESTING RELATEDNESS OF: A> PF00866 ( PF00866 Ring hydroxylating beta subunit ) B> PF02982 ( PF02982 Scytalone dehydratase ) they come from the same clan: CL0051.9 : PF02982 PF00866 PF02136 PF05223 PF07858 PF07080 PF08332 PF07366 the two keywords do not coincide on UniRef90 proteins both PF00866 and PF02982 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 471 ) 6747076_PF01782_PF04139 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01782 is 6656469 with Jaccard = 0.9381 |PF01782|=213 [ 212 13 1099985 1 ] parent [ 6656469 ] : 6747076 0.0168115 (=209/(259*48)) 98.3957 given [ 6656469 ] : 6656469 0.197709 (=397/(251*8)) 82.1879 best keyword for cluster 6656469 is PF01782 with Jaccard = 0.9381 [ 212 13 1099985 1 ] 0.9422 0.9953 sibling [ 6656469 ] : 6675885 0.149826 (=43/(7*41)) 87.1489 best keyword for cluster 6675885 is PF04139 with Jaccard = 0.9667 [ 29 1 1100181 0 ] 0.9667 1.0000 SUGGESTING RELATEDNESS OF: A> PF01782 ( PF01782 RimM N-terminal domain ) B> PF04139 ( PF04139 Rad9 ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF01782 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 472 ) 6628465_PF01647_PF06641 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01647 is 6436642 with Jaccard = 0.9375 |PF01647|=16 [ 15 0 1100195 1 ] parent [ 6436642 ] : 6628465 0.274306 (=79/(16*18)) 73.9709 given [ 6436642 ] : 6436642 1 (=55/(11*5)) 0.504113 best keyword for cluster 6436642 is PF01647 with Jaccard = 0.9375 [ 15 0 1100195 1 ] 1.0000 0.9375 sibling [ 6436642 ] : 6595671 0.402597 (=31/(11*7)) 59.8519 best keyword for cluster 6595671 is PF06641 with Jaccard = 0.7143 [ 10 4 1100197 0 ] 0.7143 1.0000 SUGGESTING RELATEDNESS OF: A> PF01647 ( PF01647 Morbillivirus RNA polymerase alpha subunit ) B> PF06641 ( PF06641 Paramyxovirus structural protein V ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF01647| = 16 , |PF06641| = 10 , |PF01647^PF06641| = 1 ( 6.2% and 10.0% ) Neither PF01647 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 473 ) 6639142_PF05040_PF06765 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
======================
best cluster for keyword PF05040 is 6473928 with Jaccard = 0.9375 |PF05040|=16 [ 15 0 1100195 1 ]
parent [ 6473928 ] : 6639142 0.293333 (=198/(15*45)) 76.7733
given [ 6473928 ] : 6473928 0.96 (=48/(10*5)) 4.42584
best keyword for cluster 6473928 is PF05040 with Jaccard = 0.9375 [ 15 0 1100195 1 ] 1.0000 0.9375
sibling [ 6473928 ] : 6570610 0.536 (=268/(20*25)) 51.1411
best keyword for cluster 6570610 is PF06765 with Jaccard = 0.9091 [ 20 2 1100189 0 ] 0.9091 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05040 ( PF05040 Heparan sulfate 2-O-sulfotransferase (HS2ST) )
B> PF06765 ( PF06765 Heparan sulfate 6-sulfotransferase (HS6ST) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05040 nor PF06765 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 474 ) 6735547_PF04041_PF04616 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04616 is 6705404 with Jaccard = 0.9360 |PF04616|=292 [ 278 5 1099914 14 ]
parent [ 6705404 ] : 6735547 0.0348639 (=1230/(336*105)) 97.33
given [ 6705404 ] : 6705404 0.0782828 (=155/(6*330)) 93.1751
best keyword for cluster 6705404 is PF04616 with Jaccard = 0.9360 [ 278 5 1099914 14 ] 0.9823 0.9521
sibling [ 6705404 ] : 6725529 0.0387205 (=23/(6*99)) 96.1743
best keyword for cluster 6725529 is PF04041 with Jaccard = 0.9104 [ 61 6 1100144 0 ] 0.9104 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04616 ( PF04616 Glycosyl hydrolases family 43 )
B> PF04041 ( PF04041 Domain of unknown function (DUF377) )
they come from the same clan: CL0143.8 : PF03664 PF04616 PF00251 PF04041 PF02435
the two keywords do not coincide on UniRef90 proteins
both PF04616 and PF04041 have PDB structures
PF04616 b.67.2.1
PF04041 b.67.2.4
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 475 ) 6740120_PF00892_PF05653 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00892 is 6738900 with Jaccard = 0.9359 |PF00892|=2414 [ 2308 52 1097745 106 ]
parent [ 6738900 ] : 6740120 0.030438 (=15310/(179*2810)) 97.7876
given [ 6738900 ] : 6738900 0.0287992 (=726/(9*2801)) 97.6725
best keyword for cluster 6738900 is PF00892 with Jaccard = 0.9359 [ 2308 52 1097745 106 ] 0.9780 0.9561
sibling [ 6738900 ] : 6711424 0.0645472 (=67/(6*173)) 94.1592
best keyword for cluster 6711424 is PF05653 with Jaccard = 0.8814 [ 104 14 1100093 0 ] 0.8814 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00892 ( PF00892 Integral membrane protein DUF6 )
B> PF05653 ( PF05653 Protein of unknown function (DUF803) )
they come from the same clan: CL0184.5 : PF07857 PF04342 PF00892 PF05653 PF06027 PF00893 PF04142 PF06379 PF06800 PF03151 PF08449 PF02694
the two keywords do not coincide on UniRef90 proteins
Neither PF00892 nor PF05653 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF05653 SSF103473 0.757 (average over 2 mutual instances, PF05653 3 appearances, SSF103473 39293 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 476 ) 6750215_PF00534_PF05693 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00534 is 6749863 with Jaccard = 0.9342 |PF00534|=3857 [ 3692 95 1096259 165 ]
parent [ 6749863 ] : 6750215 0.0195931 (=5703/(64*4548)) 98.6336
given [ 6749863 ] : 6749863 0.0171615 (=4317/(56*4492)) 98.6096
best keyword for cluster 6749863 is PF00534 with Jaccard = 0.9342 [ 3692 95 1096259 165 ] 0.9749 0.9572
sibling [ 6749863 ] : 6739900 0.0258621 (=9/(6*58)) 97.7667
best keyword for cluster 6739900 is PF05693 with Jaccard = 0.9123 [ 52 3 1100154 2 ] 0.9455 0.9630
SUGGESTING RELATEDNESS OF:
A> PF00534 ( PF00534 Glycosyl transferases group 1 )
B> PF05693 ( PF05693 Glycogen synthase )
they come from the same clan: CL0113.8 : PF06925 PF02684 PF04464 PF04101 PF01075 PF03033 PF00982 PF00534 PF05693 PF02350 PF04007 PF06722 PF05159 PF08660 PF00343 PF00201
the two keywords coincide on UniRef90 proteins: |PF00534| = 3857 , |PF05693| = 54 , |PF00534^PF05693| = 1 ( 0.0% and 1.9% )
only PF00534 has a PDB structure (may not be up to date)
PF00534 c.87.1.8
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 477 ) 6705032_PF03025_PF05776 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03025 is 6624819 with Jaccard = 0.9333 |PF03025|=15 [ 14 0 1100196 1 ]
parent [ 6624819 ] : 6705032 0.0928571 (=13/(14*10)) 93.1071
given [ 6624819 ] : 6624819 0.307692 (=4/(1*13)) 72.4616
best keyword for cluster 6624819 is PF03025 with Jaccard = 0.9333 [ 14 0 1100196 1 ] 1.0000 0.9333
sibling [ 6624819 ] : 6695546 0.111111 (=1/(1*9)) 91.4445
best keyword for cluster 6695546 is PF05776 with Jaccard = 1.0000 [ 6 0 1100205 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03025 ( PF03025 Papillomavirus E5 )
B> PF05776 ( PF05776 Papillomavirus E5A protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF03025 nor PF05776 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 478 ) 6620928_PF06807_PF08160 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08160 is 6549379 with Jaccard = 0.9333 |PF08160|=14 [ 14 1 1100196 0 ]
parent [ 6549379 ] : 6620928 0.317692 (=826/(52*50)) 70.9227
given [ 6549379 ] : 6549379 0.655856 (=364/(15*37)) 38.3883
best keyword for cluster 6549379 is PF08160 with Jaccard = 0.9333 [ 14 1 1100196 0 ] 0.9333 1.0000
sibling [ 6549379 ] : 6603400 0.397163 (=56/(3*47)) 63.3679
best keyword for cluster 6603400 is PF06807 with Jaccard = 0.8857 [ 31 0 1100176 4 ] 1.0000 0.8857
SUGGESTING RELATEDNESS OF:
A> PF08160 ( PF08160 NUC156 domain )
B> PF06807 ( PF06807 Pre-mRNA cleavage complex II protein Clp1 )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF06807| = 35 , |PF08160| = 14 , |PF06807^PF08160| = 1 ( 2.9% and 7.1% )
Neither PF08160 nor PF06807 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 479 ) 6470551_PF00693_PF08465 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08465 is 6072180 with Jaccard = 0.9333 |PF08465|=14 [ 14 1 1100196 0 ]
parent [ 6072180 ] : 6470551 0.966667 (=406/(15*28)) 3.86678
given [ 6072180 ] : 6072180 1 (=14/(1*14)) 7.43001e-28
best keyword for cluster 6072180 is PF08465 with Jaccard = 0.9333 [ 14 1 1100196 0 ] 0.9333 1.0000
sibling [ 6072180 ] : 6308051 1 (=115/(23*5)) 5.21742e-09
best keyword for cluster 6308051 is PF00693 with Jaccard = 0.6512 [ 28 0 1100168 15 ] 1.0000 0.6512
SUGGESTING RELATEDNESS OF:
A> PF08465 ( PF08465 Thymidine kinase from Herpesvirus C-terminal )
B> PF00693 ( PF00693 Thymidine kinase from herpesvirus )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00693| = 43 , |PF08465| = 14 , |PF00693^PF08465| = 7 ( 16.3% and 50.0% )
only PF08465 has a PDB structure (may not be up to date)
PF00693 c.37.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 480 ) 6509578_PF02491_PF06723 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02491 is 6470596 with Jaccard = 0.9330 |PF02491|=194 [ 181 0 1100017 13 ]
parent [ 6470596 ] : 6509578 0.869734 (=41081/(209*226)) 15.108
given [ 6470596 ] : 6470596 0.963933 (=3314/(18*191)) 3.88218
best keyword for cluster 6470596 is PF02491 with Jaccard = 0.9330 [ 181 0 1100017 13 ] 1.0000 0.9330
sibling [ 6470596 ] : 6502948 0.899064 (=4035/(22*204)) 12.5566
best keyword for cluster 6502948 is PF06723 with Jaccard = 0.9786 [ 183 4 1100024 0 ] 0.9786 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02491 ( PF02491 Cell division protein FtsA )
B> PF06723 ( PF06723 MreB/Mbl protein )
they come from the same clan: CL0108.10 : PF06406 PF00480 PF02541 PF00814 PF06723 PF05378 PF01968 PF00012 PF03727 PF00349 PF02685 PF01150 PF02491 PF00370 PF02782 PF02543 PF01869 PF00022 PF00871 PF03702
the two keywords do not coincide on UniRef90 proteins
both PF02491 and PF06723 have PDB structures
PF02491 c.55.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 481 ) 6710627_PF01892_PF04608 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01892 is 6524055 with Jaccard = 0.9310 |PF01892|=29 [ 27 0 1100182 2 ]
parent [ 6524055 ] : 6710627 0.0820747 (=413/(37*136)) 94.0248
given [ 6524055 ] : 6524055 0.793706 (=227/(26*11)) 22.2785
best keyword for cluster 6524055 is PF01892 with Jaccard = 0.9310 [ 27 0 1100182 2 ] 1.0000 0.9310
sibling [ 6524055 ] : 6594591 0.482422 (=494/(8*128)) 59.2157
best keyword for cluster 6594591 is PF04608 with Jaccard = 1.0000 [ 113 0 1100098 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01892 ( )
B> PF04608 ( PF04608 Phosphatidylglycerophosphatase A )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF01892 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF04608 SSF101307 0.945 (average over 471 mutual instances, PF04608 471 appearances, SSF101307 476 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 482 ) 6690055_PF01039_PF06833 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06833 is 6349412 with Jaccard = 0.9310 |PF06833|=29 [ 27 0 1100182 2 ]
parent [ 6349412 ] : 6690055 0.116854 (=2808/(27*890)) 90.2525
given [ 6349412 ] : 6349412 1 (=182/(13*14)) 4.51987e-06
best keyword for cluster 6349412 is PF06833 with Jaccard = 0.9310 [ 27 0 1100182 2 ] 1.0000 0.9310
sibling [ 6349412 ] : 6665646 0.164977 (=293/(2*888)) 84.4146
best keyword for cluster 6665646 is PF01039 with Jaccard = 0.7153 [ 603 173 1099368 67 ] 0.7771 0.9000
SUGGESTING RELATEDNESS OF:
A> PF06833 ( PF06833 Malonate decarboxylase gamma subunit (MdcE) )
B> PF01039 ( PF01039 Carboxyl transferase domain )
they come from the same clan: CL0127.6 : PF03255 PF01039 PF00574 PF01972 PF00378 PF06833 PF03572 PF01343
the two keywords do not coincide on UniRef90 proteins
only PF06833 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 483 ) 6741784_PF05721_PF07350 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05721 is 6738761 with Jaccard = 0.9307 |PF05721|=272 [ 255 2 1099937 17 ]
parent [ 6738761 ] : 6741784 0.026807 (=708/(77*343)) 97.9501
given [ 6738761 ] : 6738761 0.0302721 (=267/(28*315)) 97.6616
best keyword for cluster 6738761 is PF05721 with Jaccard = 0.9307 [ 255 2 1099937 17 ] 0.9922 0.9375
sibling [ 6738761 ] : 6713416 0.0729167 (=105/(32*45)) 94.4737
best keyword for cluster 6713416 is PF07350 with Jaccard = 0.6744 [ 29 14 1100168 0 ] 0.6744 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05721 ( PF05721 Phytanoyl-CoA dioxygenase (PhyH) )
B> PF07350 ( PF07350 Protein of unknown function (DUF1479) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF05721 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 484 ) 6742084_PF03288_PF05272 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03288 is 6734488 with Jaccard = 0.9301 |PF03288|=141 [ 133 2 1100068 8 ]
parent [ 6734488 ] : 6742084 0.0221663 (=740/(156*214)) 97.9776
given [ 6734488 ] : 6734488 0.0303658 (=44/(7*207)) 97.2229
best keyword for cluster 6734488 is PF03288 with Jaccard = 0.9301 [ 133 2 1100068 8 ] 0.9852 0.9433
sibling [ 6734488 ] : 6725041 0.0424691 (=258/(81*75)) 96.1114
best keyword for cluster 6725041 is PF05272 with Jaccard = 0.7826 [ 54 15 1100142 0 ] 0.7826 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03288 ( PF03288 Poxvirus D5 protein-like )
B> PF05272 ( PF05272 Virulence-associated protein E )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF03288 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 485 ) 6704291_PF03734_PF06104 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03734 is 6652683 with Jaccard = 0.9301 |PF03734|=499 [ 466 2 1099710 33 ]
parent [ 6652683 ] : 6704291 0.0912986 (=6115/(122*549)) 92.985
given [ 6652683 ] : 6652683 0.231779 (=16530/(338*211)) 80.9584
best keyword for cluster 6652683 is PF03734 with Jaccard = 0.9301 [ 466 2 1099710 33 ] 0.9957 0.9339
sibling [ 6652683 ] : 6623048 0.294956 (=269/(114*8)) 71.6842
best keyword for cluster 6623048 is PF06104 with Jaccard = 0.6452 [ 40 22 1100149 0 ] 0.6452 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03734 ( PF03734 ErfK/YbiS/YcfS/YnhG )
B> PF06104 ( PF06104 Bacterial protein of unknown function (DUF949) )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03734| = 499 , |PF06104| = 40 , |PF03734^PF06104| = 4 ( 0.8% and 10.0% )
only PF03734 has a PDB structure (may not be up to date)
PF03734 b.160.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 486 ) 6762972_PF02402_PF06809 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02402 is 6457547 with Jaccard = 0.9286 |PF02402|=14 [ 13 0 1100197 1 ]
parent [ 6457547 ] : 6762972 0.00769231 (=3/(13*30)) 99.4154
given [ 6457547 ] : 6457547 1 (=30/(3*10)) 2.02548
best keyword for cluster 6457547 is PF02402 with Jaccard = 0.9286 [ 13 0 1100197 1 ] 1.0000 0.9286
sibling [ 6457547 ] : 6746646 0.0170455 (=3/(8*22)) 98.3614
best keyword for cluster 6746646 is PF06809 with Jaccard = 0.8333 [ 10 0 1100199 2 ] 1.0000 0.8333
SUGGESTING RELATEDNESS OF:
A> PF02402 ( PF02402 Lysis protein )
B> PF06809 ( PF06809 Neural proliferation differentiation control-1 protein (NPDC1) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF02402 nor PF06809 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 487 ) 6604310_PF04352_PF05286 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04352 is 6558791 with Jaccard = 0.9286 |PF04352|=42 [ 39 0 1100169 3 ]
parent [ 6558791 ] : 6604310 0.382979 (=126/(7*47)) 63.7324
given [ 6558791 ] : 6558791 0.581395 (=100/(4*43)) 45.8572
best keyword for cluster 6558791 is PF04352 with Jaccard = 0.9286 [ 39 0 1100169 3 ] 1.0000 0.9286
sibling [ 6558791 ] : 6490635 0.916667 (=11/(4*3)) 8.42992
best keyword for cluster 6490635 is PF05286 with Jaccard = 0.6667 [ 4 2 1100205 0 ] 0.6667 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04352 ( PF04352 ProQ activator of osmoprotectant transporter ProP )
B> PF05286 ( PF05286 Fertility inhibition protein (FINO) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04352 has a PDB structure (may not be up to date)
PF05286 a.136.1.1
SUPERFAM mapping significantly overlapping:
1 PF04352 SSF48657 0.708 (average over 183 mutual instances, PF04352 183 appearances, SSF48657 240 appearances)
2 PF05286 SSF48657 0.922 (average over 46 mutual instances, PF05286 46 appearances, SSF48657 240 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 488 ) 6753644_PF07610_PF07955 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07955 is 6717867 with Jaccard = 0.9286 |PF07955|=14 [ 13 0 1100197 1 ]
parent [ 6717867 ] : 6753644 0.0123494 (=41/(20*166)) 98.88
given [ 6717867 ] : 6717867 0.0505051 (=5/(9*11)) 95.1319
best keyword for cluster 6717867 is PF07955 with Jaccard = 0.9286 [ 13 0 1100197 1 ] 1.0000 0.9286
sibling [ 6717867 ] : 6745563 0.023913 (=132/(46*120)) 98.2777
best keyword for cluster 6745563 is PF07610 with Jaccard = 0.7931 [ 23 6 1100182 0 ] 0.7931 1.0000
SUGGESTING RELATEDNESS OF:
A> PF07955 ( PF07955 Protein of unknown function (DUF1687) )
B> PF07610 ( PF07610 Protein of unknown function (DUF1573) )
Only A has a clan ( CL0172.11 ).
the two keywords do not coincide on UniRef90 proteins
only PF07955 has a PDB structure (may not be up to date)
PF07955 c.47.1.18
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 489 ) 6588362_PF01412_PF08518 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08518 is 6495466 with Jaccard = 0.9286 |PF08518|=26 [ 26 2 1100183 0 ]
parent [ 6495466 ] : 6588362 0.440382 (=4986/(34*333)) 56.5453
given [ 6495466 ] : 6495466 0.920415 (=266/(17*17)) 9.91523
best keyword for cluster 6495466 is PF08518 with Jaccard = 0.9286 [ 26 2 1100183 0 ] 0.9286 1.0000
sibling [ 6495466 ] : 6563040 0.510574 (=338/(2*331)) 49.5698
best keyword for cluster 6563040 is PF01412 with Jaccard = 0.8867 [ 313 2 1099858 38 ] 0.9937 0.8917
SUGGESTING RELATEDNESS OF:
A> PF08518 ( PF08518 Spa2 homology domain (SHD) of GIT )
B> PF01412 ( PF01412 Putative GTPase activating protein for Arf )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01412| = 351 , |PF08518| = 26 , |PF01412^PF08518| = 13 ( 3.7% and 50.0% )
only PF08518 has a PDB structure (may not be up to date)
PF01412 g.45.1.1
SUPERFAM mapping significantly overlapping:
1 PF01412 SSF57863 0.943 (average over 744 mutual instances, PF01412 1021 appearances, SSF57863 1344 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 490 ) 6721128_PF05067_PF05974 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05974 is 6665821 with Jaccard = 0.9275 |PF05974|=69 [ 64 0 1100142 5 ]
parent [ 6665821 ] : 6721128 0.0445054 (=292/(81*81)) 95.5809
given [ 6665821 ] : 6665821 0.183761 (=43/(78*3)) 84.4548
best keyword for cluster 6665821 is PF05974 with Jaccard = 0.9275 [ 64 0 1100142 5 ] 1.0000 0.9275
sibling [ 6665821 ] : 6677704 0.125 (=10/(1*80)) 87.6362
best keyword for cluster 6677704 is PF05067 with Jaccard = 0.9870 [ 76 0 1100134 1 ] 1.0000 0.9870
SUGGESTING RELATEDNESS OF:
A> PF05974 ( PF05974 Protein of unknown function (DUF892) )
B> PF05067 ( PF05067 Manganese containing catalase )
Only B has a clan ( CL0044.8 ).
the two keywords coincide on UniRef90 proteins: |PF05067| = 77 , |PF05974| = 69 , |PF05067^PF05974| = 4 ( 5.2% and 5.8% )
only PF05974 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF05974 SSF47240 0.882 (average over 92 mutual instances, PF05974 92 appearances, SSF47240 6970 appearances)
2 PF05067 SSF47240 0.922 (average over 207 mutual instances, PF05067 207 appearances, SSF47240 6970 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 491 ) 6759074_PF00711_PF08131 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00711 is 6740379 with Jaccard = 0.9272 |PF00711|=146 [ 140 5 1100060 6 ]
parent [ 6740379 ] : 6759074 0.0118758 (=44/(15*247)) 99.2137
given [ 6740379 ] : 6740379 0.0303713 (=445/(148*99)) 97.8153
best keyword for cluster 6740379 is PF00711 with Jaccard = 0.9272 [ 140 5 1100060 6 ] 0.9655 0.9589
sibling [ 6740379 ] : 6731564 0.0535714 (=3/(8*7)) 96.8929
best keyword for cluster 6731564 is PF08131 with Jaccard = 0.8333 [ 5 0 1100205 1 ] 1.0000 0.8333
SUGGESTING RELATEDNESS OF:
A> PF00711 ( PF00711 Beta defensin )
B> PF08131 ( PF08131 Defensin-like peptide family )
they come from the same clan: CL0075.8 : PF00711 PF08131 PF00323 PF07936 PF00706
the two keywords coincide on UniRef90 proteins: |PF00711| = 146 , |PF08131| = 6 , |PF00711^PF08131| = 1 ( 0.7% and 16.7% )
both PF00711 and PF08131 have PDB structures
PF00711 g.9.1.1
PF08131 g.9.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 492 ) 6627242_PF01676_PF08342 ==========------------------
====== the next two brothers (given,
sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08342 is 6497516 with Jaccard = 0.9271 |PF08342|=89 [ 89 7 1100115 0 ]
parent [ 6497516 ] : 6627242 0.302146 (=9039/(108*277)) 73.4601
given [ 6497516 ] : 6497516 0.897196 (=96/(1*107)) 10.6754
best keyword for cluster 6497516 is PF08342 with Jaccard = 0.9271 [ 89 7 1100115 0 ] 0.9271 1.0000
sibling [ 6497516 ] : 6599411 0.443325 (=6602/(73*204)) 61.3576
best keyword for cluster 6599411 is PF01676 with Jaccard = 0.6966 [ 248 6 1099855 102 ] 0.9764 0.7086
SUGGESTING RELATEDNESS OF:
A> PF08342 ( PF08342 Phosphopentomutase N-terminal )
B> PF01676 ( PF01676 Metalloenzyme superfamily )
Only B has a clan ( CL0088.10 ).
the two keywords coincide on UniRef90 proteins: |PF01676| = 350 , |PF08342| = 89 , |PF01676^PF08342| = 86 ( 24.6% and 96.6% )
only PF08342 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 493 ) 6766583_PF03364_PF08327 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF08327 is 6745057 with Jaccard = 0.9269 |PF08327|=259 [ 241 1 1099951 18 ]
parent [ 6745057 ] : 6766583 0.00644022 (=1897/(365*807)) 99.5812
given [ 6745057 ] : 6745057 0.022882 (=531/(82*283)) 98.239
best keyword for cluster 6745057 is PF08327 with Jaccard = 0.9269 [ 241 1 1099951 18 ] 0.9959 0.9305
sibling [ 6745057 ] : 6761345 0.00995196 (=667/(94*713)) 99.3363
best keyword for cluster 6761345 is PF03364 with Jaccard = 0.7143 [ 290 90 1099805 26 ] 0.7632 0.9177
SUGGESTING RELATEDNESS OF:
A> PF08327 ( PF08327 Activator of Hsp90 ATPase homolog 1-like protein )
B> PF03364 ( PF03364 Polyketide cyclase / dehydrase and lipid transport )
they come from the same clan: CL0209.4 : PF08327 PF00407 PF06240 PF02121 PF03364 PF00848 PF01852
the two keywords do not coincide on UniRef90 proteins
both PF08327 and PF03364 have PDB structures
PF08327 d.129.3.5
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 494 ) 6727985_PF04586_PF05065 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05065 is 6704757 with Jaccard = 0.9268 |PF05065|=116 [ 114 7 1100088 2 ]
parent [ 6704757 ] : 6727985 0.0363515 (=709/(106*184)) 96.4814
given [ 6704757 ] : 6704757 0.0793951 (=294/(23*161)) 93.0433
best keyword for cluster 6704757 is PF05065 with Jaccard = 0.9268 [ 114 7 1100088 2 ] 0.9421 0.9828
sibling [ 6704757 ] : 6697872 0.0926385 (=112/(93*13)) 91.8641
best keyword for cluster 6697872 is PF04586 with Jaccard = 0.8142 [ 92 0 1100098 21 ] 1.0000 0.8142
SUGGESTING RELATEDNESS OF:
A> PF05065 ( PF05065 Phage capsid family )
B> PF04586 ( PF04586 Caudovirus prohead protease )
Only B has a clan ( CL0201.5 ).
the two keywords coincide on UniRef90 proteins: |PF04586| = 113 , |PF05065| = 116 , |PF04586^PF05065| = 5 ( 4.4% and 4.3% )
Neither PF05065 nor PF04586 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF04586 SSF50789 0.760 (average over 36 mutual instances, PF04586 36 appearances, SSF50789 125 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 495 ) 6732074_PF01190_PF03251 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01190 is 6638470 with Jaccard = 0.9265 |PF01190|=67 [ 63 1 1100143 4 ]
parent [ 6638470 ] : 6732074 0.0354651 (=61/(86*20)) 96.9513
given [ 6638470 ] : 6638470 0.258889 (=466/(36*50)) 76.6064
best keyword for cluster 6638470 is PF01190 with Jaccard = 0.9265 [ 63 1 1100143 4 ] 0.9844 0.9403
sibling [ 6638470 ] : 6704236 0.0879121 (=8/(13*7)) 92.9642
best keyword for cluster 6704236 is PF03251 with Jaccard = 1.0000 [ 13 0 1100198 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01190 ( PF01190 Pollen proteins Ole e I family )
B> PF03251 ( PF03251 Tymovirus 45/70Kd protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01190 nor PF03251 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 496 ) 6617800_PF02626_PF02682 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02626 is 6427134 with Jaccard = 0.9239 |PF02626|=184 [ 170 0 1100027 14 ]
parent [ 6427134 ] : 6617800 0.312951 (=8813/(189*149)) 69.5383
given [ 6427134 ] : 6427134 0.998913 (=919/(184*5)) 0.215921
best keyword for cluster 6427134 is PF02626 with Jaccard = 0.9239 [ 170 0 1100027 14 ] 1.0000 0.9239
sibling [ 6427134 ] : 6463866 0.974101 (=1354/(10*139)) 2.80598
best keyword for cluster 6463866 is PF02682 with Jaccard = 0.7500 [ 132 0 1100035 44 ] 1.0000 0.7500
SUGGESTING RELATEDNESS OF:
A> PF02626 ( PF02626 Allophanate hydrolase subunit 2 )
B> PF02682 ( PF02682 Allophanate hydrolase subunit 1 )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF02626| = 184 , |PF02682| = 176 , |PF02626^PF02682| = 57 ( 31.0% and 32.4% )
Neither PF02626 nor PF02682 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 497 ) 6744013_PF00650_PF04707 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00650 is 6709401 with Jaccard = 0.9231 |PF00650|=403 [ 384 13 1099795 19 ]
parent [ 6709401 ] : 6744013 0.0190589 (=678/(77*462)) 98.147
given [ 6709401 ] : 6709401 0.0777032 (=521/(15*447)) 93.8452
best keyword for cluster 6709401 is PF00650 with Jaccard = 0.9231 [ 384 13 1099795 19 ] 0.9673 0.9529
sibling [ 6709401 ] : 6687641 0.116667 (=42/(5*72)) 89.7697
best keyword for cluster 6687641 is PF04707 with Jaccard = 0.8118 [ 69 0 1100126 16 ] 1.0000 0.8118
SUGGESTING RELATEDNESS OF:
A> PF00650 ( PF00650 CRAL/TRIO domain )
B> PF04707 ( PF04707 PRELI-like family )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00650| = 403 , |PF04707| = 85 , |PF00650^PF04707| = 11 ( 2.7% and 12.9% ) only PF00650 has a PDB structure (may not be up to date) PF00650 c.13.1.1 SUPERFAM mapping significantly overlapping: 1 PF00650 SSF52087 0.769 (average over 887 mutual instances, PF00650 1638 appearances, SSF52087 1745 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 498 ) 6723643_PF01210_PF02558 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02558 is 6656580 with Jaccard = 0.9231 |PF02558|=321 [ 300 4 1099886 21 ] parent [ 6656580 ] : 6723643 0.0517331 (=7358/(330*431)) 95.9455 given [ 6656580 ] : 6656580 0.198171 (=130/(2*328)) 82.2279 best keyword for cluster 6656580 is PF02558 with Jaccard = 0.9231 [ 300 4 1099886 21 ] 0.9868 0.9346 sibling [ 6656580 ] : 6691574 0.114798 (=2591/(370*61)) 90.5913 best keyword for cluster 6691574 is PF01210 with Jaccard = 0.8753 [ 358 41 1099802 10 ] 0.8972 0.9728 SUGGESTING RELATEDNESS OF: A> PF02558 ( PF02558 Ketopantoate reductase PanE/ApbA ) B> PF01210 ( PF01210 NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus ) they come from the same clan: CL0063.17 : PF03721 PF04820 PF02254 PF00899 PF01946 PF02882 PF01488 PF01118 PF08491 PF03435 PF04321 PF07992 PF00070 PF02719 PF02153 PF02423 PF05368 PF01210 PF07994 PF07993 PF03447 PF03446 PF01225 PF06039 PF01232 PF03949 PF05834 PF00056 PF08659 PF07991 PF03486 PF00044 PF00732 PF01134 PF01408 PF00996 PF00479 PF00743 PF01494 PF00890 PF03807 PF01370 PF00208 PF02670 PF01113 PF01266 PF02629 PF02558 PF01593 PF01262 PF00670 PF00107 PF00106 PF02737 PF01073 PF02826 the two keywords do not coincide on UniRef90 proteins both PF02558 
and PF01210 have PDB structures PF02558 c.2.1.6 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 499 ) 6747247_PF02566_PF02624 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02566 is 6733389 with Jaccard = 0.9231 |PF02566|=494 [ 456 0 1099717 38 ] parent [ 6733389 ] : 6747247 0.0177635 (=1152/(523*124)) 98.4085 given [ 6733389 ] : 6733389 0.0351606 (=127/(516*7)) 97.0983 best keyword for cluster 6733389 is PF02566 with Jaccard = 0.9231 [ 456 0 1099717 38 ] 1.0000 0.9231 sibling [ 6733389 ] : 6734835 0.0286499 (=73/(26*98)) 97.2563 best keyword for cluster 6734835 is PF02624 with Jaccard = 0.7632 [ 87 22 1100097 5 ] 0.7982 0.9457 SUGGESTING RELATEDNESS OF: A> PF02566 ( PF02566 OsmC-like protein ) B> PF02624 ( PF02624 YcaO-like family ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF02566| = 494 , |PF02624| = 92 , |PF02566^PF02624| = 10 ( 2.0% and 10.9% ) only PF02566 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF02566 SSF82784 0.853 (average over 1637 mutual instances, PF02566 1638 appearances, SSF82784 1704 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 500 ) 6578610_PF02087_PF03973 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF03973 is 6516787 with Jaccard = 0.9231 |PF03973|=26 [ 24 0 1100185 2 ] parent [ 6516787 ] : 6578610 0.515152 (=136/(11*24)) 53.3059 given [ 6516787 ] : 6516787 0.828125 (=106/(16*8)) 18.5923 best keyword for cluster 6516787 is PF03973 with Jaccard = 0.9231 [ 24 0 1100185 2 ] 1.0000 0.9231 sibling [ 6516787 ] : 6497715 0.9 (=27/(5*6)) 10.8127 best keyword for cluster 6497715 is PF02087 with Jaccard = 0.9000 [ 9 1 1100201 0 ] 0.9000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03973 ( PF03973 Triabin ) B> PF02087 ( PF02087 Nitrophorin ) they come from the same clan: CL0116.7 : PF03973 PF02087 PF08212 PF00061 PF02098 PF07137 the two keywords do not coincide on UniRef90 proteins both PF03973 and PF02087 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF02087 SSF50814 0.793 (average over 15 mutual instances, PF02087 15 appearances, SSF50814 7354 appearances) 2 PF03973 SSF50814 0.807 (average over 117 mutual instances, PF03973 117 appearances, SSF50814 7354 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 501 ) 6720727_PF03879_PF08524 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF08524 is 6666603 with Jaccard = 0.9231 |PF08524|=13 [ 12 0 1100198 1 ] parent [ 6666603 ] : 6720727 0.0612083 (=154/(37*68)) 95.5088 given [ 6666603 ] : 6666603 0.190476 (=40/(7*30)) 84.6291 best keyword for cluster 6666603 is PF08524 with Jaccard = 0.9231 [ 12 0 1100198 1 ] 1.0000 0.9231 sibling [ 6666603 ] : 6715736 0.0727273 (=52/(55*13)) 94.8341 best keyword for cluster 6715736 is PF03879 with Jaccard = 0.9444 [ 17 1 1100193 0 ] 0.9444 1.0000 SUGGESTING RELATEDNESS OF: A> PF08524 ( PF08524 rRNA processing ) B> PF03879 ( PF03879 Cgr1 family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF08524 nor PF03879 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 502 ) 6675685_PF00676_PF02780 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00676 is 6669739 with Jaccard = 0.9228 |PF00676|=653 [ 610 8 1099550 43 ] parent [ 6669739 ] : 6675685 0.146716 (=143089/(674*1447)) 87.0398 given [ 6669739 ] : 6669739 0.153005 (=308/(3*671)) 85.4565 best keyword for cluster 6669739 is PF00676 with Jaccard = 0.9228 [ 610 8 1099550 43 ] 0.9871 0.9342 sibling [ 6669739 ] : 6672641 0.152835 (=221/(1*1446)) 86.2285 best keyword for cluster 6672641 is PF02780 with Jaccard = 0.7813 [ 1036 234 1098885 56 ] 0.8157 0.9487 SUGGESTING RELATEDNESS OF: A> PF00676 ( PF00676 Dehydrogenase E1 component ) B> PF02780 ( PF02780 Transketolase, C-terminal domain ) Only A has a clan ( CL0254.3 ). 
the two keywords coincide on UniRef90 proteins: |PF00676| = 653 , |PF02780| = 1092 , |PF00676^PF02780| = 45 ( 6.9% and 4.1% ) both PF00676 and PF02780 have PDB structures PF00676 c.36.1.11 SUPERFAM mapping significantly overlapping: 1 PF02780 SSF52922 0.877 (average over 3553 mutual instances, PF02780 3663 appearances, SSF52922 11092 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 503 ) 6666407_PF00109_PF02803 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00108 is 6654688 with Jaccard = 0.9224 |PF00108|=920 [ 915 72 1099219 5 ] parent [ 6654688 ] : 6666407 0.19718 (=674108/(2735*1250)) 84.5803 given [ 6654688 ] : 6654688 0.217774 (=272/(1*1249)) 81.6154 best keyword for cluster 6654688 is PF02803 with Jaccard = 0.9323 [ 923 64 1099221 3 ] 0.9352 0.9968 sibling [ 6654688 ] : 6646279 0.21558 (=2355/(4*2731)) 78.8238 best keyword for cluster 6646279 is PF00109 with Jaccard = 0.7916 [ 2009 469 1097673 60 ] 0.8107 0.9710 SUGGESTING RELATEDNESS OF: A> PF02803 ( PF02803 Thiolase, C-terminal domain ) B> PF00109 ( PF00109 Beta-ketoacyl synthase, N-terminal domain ) they come from the same clan: CL0046.10 : PF02803 PF02801 PF00109 PF01154 PF08392 PF00195 PF02797 PF08545 PF08541 PF00108 the two keywords coincide on UniRef90 proteins: |PF00109| = 2069 , |PF02803| = 926 , |PF00109^PF02803| = 8 ( 0.4% and 0.9% ) both PF02803 and PF00109 have PDB structures PF00109 c.95.1.1 SUPERFAM mapping significantly overlapping: 1 PF02803 SSF53901 0.956 (average over 3202 mutual instances, PF02803 3237 appearances, SSF53901 32336 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 504 ) 
6720290_PF00834_PF02749 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01729 is 6526941 with Jaccard = 0.9223 |PF01729|=294 [ 273 2 1099915 21 ] parent [ 6526941 ] : 6720290 0.0666635 (=8551/(299*429)) 95.4559 given [ 6526941 ] : 6526941 0.768456 (=229/(1*298)) 24.094 best keyword for cluster 6526941 is PF02749 with Jaccard = 0.9375 [ 270 5 1099923 13 ] 0.9818 0.9541 sibling [ 6526941 ] : 6647292 0.251808 (=6615/(355*74)) 79.1065 best keyword for cluster 6647292 is PF00834 with Jaccard = 0.8157 [ 323 71 1099815 2 ] 0.8198 0.9938 SUGGESTING RELATEDNESS OF: A> PF02749 ( PF02749 Quinolinate phosphoribosyl transferase, N-terminal domain ) B> PF00834 ( PF00834 Ribulose-phosphate 3 epimerase family ) Only B has a clan ( CL0036.17 ). the two keywords coincide on UniRef90 proteins: |PF00834| = 325 , |PF02749| = 283 , |PF00834^PF02749| = 1 ( 0.3% and 0.4% ) both PF02749 and PF00834 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF00834 SSF51366 0.922 (average over 1075 mutual instances, PF00834 1075 appearances, SSF51366 8168 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 505 ) 6707441_PF05985_PF06751 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF05985 is 6430145 with Jaccard = 0.9216 |PF05985|=51 [ 47 0 1100160 4 ] parent [ 6430145 ] : 6707441 0.0646401 (=185/(53*54)) 93.5371 given [ 6430145 ] : 6430145 1 (=52/(1*52)) 0.288961 best keyword for cluster 6430145 is PF05985 with Jaccard = 0.9216 [ 47 0 1100160 4 ] 1.0000 0.9216 sibling [ 6430145 ] : 6622681 0.384615 (=40/(2*52)) 71.6019 best keyword for cluster 6622681 is PF06751 with Jaccard = 1.0000 [ 48 0 1100163 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF05985 ( PF05985 Ethanolamine ammonia-lyase light chain (EutC) ) B> PF06751 ( PF06751 Ethanolamine ammonia lyase large subunit (EutB) ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF05985| = 51 , |PF06751| = 48 , |PF05985^PF06751| = 4 ( 7.8% and 8.3% ) Neither PF05985 nor PF06751 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 506 ) 6772684_PF02687_PF05341 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02687 is 6754914 with Jaccard = 0.9207 |PF02687|=1752 [ 1613 0 1098459 139 ] parent [ 6754914 ] : 6772684 0.00222387 (=309/(1957*71)) 99.7956 given [ 6754914 ] : 6754914 0.0148301 (=688/(24*1933)) 98.9635 best keyword for cluster 6754914 is PF02687 with Jaccard = 0.9207 [ 1613 0 1098459 139 ] 1.0000 0.9207 sibling [ 6754914 ] : 6770973 0.0142857 (=1/(1*70)) 99.7429 best keyword for cluster 6770973 is PF05341 with Jaccard = 0.9583 [ 23 1 1100187 0 ] 0.9583 1.0000 SUGGESTING RELATEDNESS OF: A> PF02687 ( PF02687 Predicted permease ) B> PF05341 ( PF05341 Protein of unknown function (DUF708) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF02687 nor PF05341 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 507 ) 6690984_PF00228_PF04592 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00228 is 6608811 with Jaccard = 0.9206 |PF00228|=63 [ 58 0 1100148 5 ] parent [ 6608811 ] : 6690984 0.154971 (=106/(9*76)) 90.4818 given [ 6608811 ] : 6608811 0.397516 (=192/(7*69)) 66.4337 best keyword for cluster 6608811 is PF00228 with Jaccard = 0.9206 [ 58 0 1100148 5 ] 1.0000 0.9206 sibling [ 6608811 ] : 6423749 1 (=8/(1*8)) 0.154538 best keyword for cluster 6423749 is PF04592 with Jaccard = 1.0000 [ 8 0 1100203 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF00228 ( PF00228 Bowman-Birk serine protease inhibitor family ) B> PF04592 ( PF04592 Selenoprotein P, N terminal region ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF00228 has a PDB structure (may not be up to date) PF00228 g.3.13.1 j.38.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 508 ) 6740710_PF05277_PF05990 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF05990 is 6579232 with Jaccard = 0.9206 |PF05990|=63 [ 58 0 1100148 5 ] parent [ 6579232 ] : 6740710 0.0262857 (=138/(70*75)) 97.848 given [ 6579232 ] : 6579232 0.508772 (=493/(19*51)) 53.4905 best keyword for cluster 6579232 is PF05990 with Jaccard = 0.9206 [ 58 0 1100148 5 ] 1.0000 0.9206 sibling [ 6579232 ] : 6729790 0.0540541 (=4/(1*74)) 96.7027 best keyword for cluster 6729790 is PF05277 with Jaccard = 1.0000 [ 52 0 1100159 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF05990 ( PF05990 Alpha/beta hydrolase of unknown function (DUF900) ) B> PF05277 ( PF05277 Protein of unknown function (DUF726) ) Only A has a clan ( CL0028.14 ). the two keywords do not coincide on UniRef90 proteins Neither PF05990 nor PF05277 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 509 ) 6745274_PF02457_PF07949 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07949 is 6726164 with Jaccard = 0.9206 |PF07949|=63 [ 58 0 1100148 5 ] parent [ 6726164 ] : 6745274 0.0184584 (=307/(84*198)) 98.2536 given [ 6726164 ] : 6726164 0.042735 (=20/(6*78)) 96.2522 best keyword for cluster 6726164 is PF07949 with Jaccard = 0.9206 [ 58 0 1100148 5 ] 1.0000 0.9206 sibling [ 6726164 ] : 6730413 0.0435835 (=162/(21*177)) 96.7648 best keyword for cluster 6730413 is PF02457 with Jaccard = 1.0000 [ 154 0 1100057 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07949 ( PF07949 YbbR-like protein ) B> PF02457 ( PF02457 Domain of unknown function DUF147 ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF02457| = 154 , |PF07949| = 63 , |PF02457^PF07949| = 4 ( 2.6% and 6.3% ) only PF07949 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 510 ) 6776861_PF05160_PF07235 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF05160 is 6768345 with Jaccard = 0.9200 |PF05160|=25 [ 23 0 1100186 2 ] parent [ 6768345 ] : 6776861 0.00170036 (=9/(67*79)) 99.8986 given [ 6768345 ] : 6768345 0.0035014 (=5/(51*28)) 99.6521 best keyword for cluster 6768345 is PF05160 with Jaccard = 0.9200 [ 23 0 1100186 2 ] 1.0000 0.9200 sibling [ 6768345 ] : 6751579 0.0192982 (=11/(57*10)) 98.7395 best keyword for cluster 6751579 is PF07235 with Jaccard = 1.0000 [ 27 0 1100184 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF05160 ( PF05160 DSS1/SEM1 family ) B> PF07235 ( PF07235 Protein of unknown function (DUF1427) ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF05160 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 511 ) 6756286_PF02957_PF08197 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF08197 is 6495001 with Jaccard = 0.9200 |PF08197|=25 [ 23 0 1100186 2 ] parent [ 6495001 ] : 6756286 0.00952381 (=46/(23*210)) 99.049 given [ 6495001 ] : 6495001 0.954545 (=21/(1*22)) 9.78709 best keyword for cluster 6495001 is PF08197 with Jaccard = 0.9200 [ 23 0 1100186 2 ] 1.0000 0.9200 sibling [ 6495001 ] : 6753250 0.0138889 (=48/(192*18)) 98.8547 best keyword for cluster 6753250 is PF02957 with Jaccard = 0.9353 [ 159 9 1100041 2 ] 0.9464 0.9876 SUGGESTING RELATEDNESS OF: A> PF08197 ( PF08197 pORF2a truncated protein ) B> PF02957 ( PF02957 TT viral ORF2 ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF02957| = 161 , |PF08197| = 25 , |PF02957^PF08197| = 2 ( 1.2% and 8.0% ) Neither PF08197 nor PF02957 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 512 ) 6713103_PF00535_PF04464 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF04464 is 6699175 with Jaccard = 0.9189 |PF04464|=148 [ 136 0 1100063 12 ] parent [ 6699175 ] : 6713103 0.0587994 (=41373/(170*4139)) 94.424 given [ 6699175 ] : 6699175 0.099375 (=159/(10*160)) 92.0341 best keyword for cluster 6699175 is PF04464 with Jaccard = 0.9189 [ 136 0 1100063 12 ] 1.0000 0.9189 sibling [ 6699175 ] : 6711399 0.0686322 (=5091/(18*4121)) 94.1537 best keyword for cluster 6711399 is PF00535 with Jaccard = 0.8802 [ 3496 139 1096239 337 ] 0.9618 0.9121 SUGGESTING RELATEDNESS OF: A> PF04464 ( PF04464 CDP-Glycerol:Poly(glycerophosphate) glycerophosphotransferase ) B> PF00535 ( PF00535 Glycosyl transferase family 2 ) A and B come from different clans ( CL0113.8 , CL0110.6 ). the two keywords coincide on UniRef90 proteins: |PF00535| = 3833 , |PF04464| = 148 , |PF00535^PF04464| = 31 ( 0.8% and 20.9% ) only PF04464 has a PDB structure (may not be up to date) PF00535 c.68.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 513 ) 6496462_PF00195_PF08392 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF08392 is 6450070 with Jaccard = 0.9189 |PF08392|=71 [ 68 3 1100137 3 ] parent [ 6450070 ] : 6496462 0.915768 (=15286/(78*214)) 10.0544 given [ 6450070 ] : 6450070 0.987013 (=76/(1*77)) 1.30931 best keyword for cluster 6450070 is PF08392 with Jaccard = 0.9189 [ 68 3 1100137 3 ] 0.9577 0.9577 sibling [ 6450070 ] : 6483175 0.941141 (=1551/(8*206)) 6.43543 best keyword for cluster 6483175 is PF00195 with Jaccard = 0.8858 [ 194 5 1099992 20 ] 0.9749 0.9065 SUGGESTING RELATEDNESS OF: A> PF08392 ( PF08392 FAE1/Type III polyketide synthase-like protein ) B> PF00195 ( PF00195 Chalcone and stilbene synthases, N-terminal domain ) they come from the same clan: CL0046.10 : PF02803 PF02801 PF00109 PF01154 PF08392 PF00195 PF02797 PF08545 PF08541 PF00108 the two keywords do not coincide on UniRef90 proteins only PF08392 has a PDB structure (may not be up to date) PF00195 c.95.1.2 SUPERFAM mapping significantly overlapping: 1 PF00195 SSF53901 0.548 (average over 1376 mutual instances, PF00195 1385 appearances, SSF53901 32336 appearances) 2 PF08392 SSF53901 0.710 (average over 216 mutual instances, PF08392 218 appearances, SSF53901 32336 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 514 ) 6713039_PF04235_PF07786 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF04235 is 6616824 with Jaccard = 0.9176 |PF04235|=78 [ 78 7 1100126 0 ] parent [ 6616824 ] : 6713039 0.0715755 (=905/(109*116)) 94.4147 given [ 6616824 ] : 6616824 0.345804 (=684/(23*86)) 69.1724 best keyword for cluster 6616824 is PF04235 with Jaccard = 0.9176 [ 78 7 1100126 0 ] 0.9176 1.0000 sibling [ 6616824 ] : 6701367 0.0982786 (=314/(45*71)) 92.4182 best keyword for cluster 6701367 is PF07786 with Jaccard = 0.9167 [ 33 2 1100175 1 ] 0.9429 0.9706 SUGGESTING RELATEDNESS OF: A> PF04235 ( PF04235 Protein of unknown function (DUF418) ) B> PF07786 ( PF07786 Protein of unknown function (DUF1624) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF04235 nor PF07786 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 515 ) 6774979_PF04325_PF05268 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04325 is 6678665 with Jaccard = 0.9175 |PF04325|=97 [ 89 0 1100114 8 ] parent [ 6678665 ] : 6774979 0.00163934 (=13/(130*61)) 99.8553 given [ 6678665 ] : 6678665 0.160862 (=679/(67*63)) 87.8355 best keyword for cluster 6678665 is PF04325 with Jaccard = 0.9175 [ 89 0 1100114 8 ] 1.0000 0.9175 sibling [ 6678665 ] : 6772826 0.0166667 (=1/(1*60)) 99.8 best keyword for cluster 6772826 is PF05268 with Jaccard = 0.6667 [ 12 6 1100193 0 ] 0.6667 1.0000 SUGGESTING RELATEDNESS OF: A> PF04325 ( PF04325 Protein of unknown function (DUF465) ) B> PF05268 ( PF05268 Phage tail fibre adhesin Gp38 ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins only PF04325 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 516 ) 6717220_PF08209_PF08313 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF08313 is 6683156 with Jaccard = 0.9167 |PF08313|=36 [ 33 0 1100175 3 ] parent [ 6683156 ] : 6717220 0.0571429 (=88/(44*35)) 95.0528 given [ 6683156 ] : 6683156 0.130081 (=16/(3*41)) 88.9781 best keyword for cluster 6683156 is PF08313 with Jaccard = 0.9167 [ 33 0 1100175 3 ] 1.0000 0.9167 sibling [ 6683156 ] : 6711396 0.0588235 (=2/(1*34)) 94.1529 best keyword for cluster 6711396 is PF08209 with Jaccard = 1.0000 [ 19 0 1100192 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF08313 ( PF08313 SCA7 ) B> PF08209 ( PF08209 Sgf11 (transcriptional regulation protein) ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF08209| = 19 , |PF08313| = 36 , |PF08209^PF08313| = 1 ( 5.3% and 2.8% ) Neither PF08313 nor PF08209 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 517 ) 6765258_PF04326_PF04703 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF04326 is 6760897 with Jaccard = 0.9155 |PF04326|=207 [ 195 6 1099998 12 ] parent [ 6760897 ] : 6765258 0.00657058 (=62/(28*337)) 99.5231 given [ 6760897 ] : 6760897 0.00890269 (=43/(15*322)) 99.3127 best keyword for cluster 6760897 is PF04326 with Jaccard = 0.9155 [ 195 6 1099998 12 ] 0.9701 0.9420 sibling [ 6760897 ] : 6754104 0.0125 (=2/(8*20)) 98.9125 best keyword for cluster 6754104 is PF04703 with Jaccard = 0.8889 [ 8 0 1100202 1 ] 1.0000 0.8889 SUGGESTING RELATEDNESS OF: A> PF04326 ( PF04326 Divergent AAA domain ) B> PF04703 ( PF04703 FaeA-like protein ) Only B has a clan ( CL0123.12 ). the two keywords do not coincide on UniRef90 proteins Neither PF04326 nor PF04703 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 518 ) 6714871_PF01170_PF01555 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01555 is 6669111 with Jaccard = 0.9146 |PF01555|=532 [ 514 30 1099649 18 ] parent [ 6669111 ] : 6714871 0.0729302 (=11094/(626*243)) 94.7007 given [ 6669111 ] : 6669111 0.180676 (=561/(5*621)) 85.3135 best keyword for cluster 6669111 is PF01555 with Jaccard = 0.9146 [ 514 30 1099649 18 ] 0.9449 0.9662 sibling [ 6669111 ] : 6705075 0.0742678 (=71/(4*239)) 93.1176 best keyword for cluster 6705075 is PF01170 with Jaccard = 0.6875 [ 198 5 1099923 85 ] 0.9754 0.6996 SUGGESTING RELATEDNESS OF: A> PF01555 ( PF01555 DNA methylase ) B> PF01170 ( PF01170 Putative RNA methylase family UPF0020 ) they come from the same clan: CL0102.14 : PF06962 PF00398 PF06325 PF03291 PF01135 PF01358 PF06460 PF01189 PF05401 PF01234 PF01555 PF02384 PF07942 PF05175 PF05063 PF07109 PF02475 PF07021 PF08003 PF05148 PF01795 PF02390 PF01596 PF00891 PF09445 PF08242 PF08241 PF05971 PF02086 PF02527 PF08704 PF01728 PF01269 PF07669 PF06080 PF05891 PF05430 PF04816 PF04672 PF04445 PF04378 PF01861 PF03269 PF03141 PF07757 PF07279 PF05219 PF08123 PF00145 PF03602 PF02353 PF01739 PF06859 PF09243 PF01564 PF03848 PF05724 PF02005 PF05958 PF01209 PF01170 the two keywords coincide on UniRef90 proteins: |PF01170| = 283 , |PF01555| = 532 , |PF01170^PF01555| = 1 ( 0.4% and 0.2% ) only PF01555 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 519 ) 6751290_PF02207_PF02617 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02617 is 6559223 with Jaccard = 0.9137 |PF02617|=197 [ 180 0 1100014 17 ] parent [ 6559223 ] : 6751290 0.0161439 (=559/(199*174)) 98.718 given [ 6559223 ] : 6559223 0.593056 (=4697/(144*55)) 46.0208 best keyword for cluster 6559223 is PF02617 with Jaccard = 0.9137 [ 180 0 1100014 17 ] 1.0000 0.9137 sibling [ 6559223 ] : 6723009 0.0523725 (=383/(71*103)) 95.8572 best keyword for cluster 6723009 is PF02207 with Jaccard = 0.8106 [ 107 9 1100079 16 ] 0.9224 0.8699 SUGGESTING RELATEDNESS OF: A> PF02617 ( PF02617 ATP-dependent Clp protease adaptor protein ClpS ) B> PF02207 ( PF02207 Putative zinc finger in N-recognin (UBR box) ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF02207| = 123 , |PF02617| = 197 , |PF02207^PF02617| = 15 ( 12.2% and 7.6% ) only PF02617 has a PDB structure (may not be up to date) PF02617 d.45.1.2 SUPERFAM mapping significantly overlapping: 1 PF02617 SSF54736 0.941 (average over 609 mutual instances, PF02617 612 appearances, SSF54736 1680 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 520 ) 6694609_PF00809_PF01288 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01288 is 6568011 with Jaccard = 0.9118 |PF01288|=306 [ 279 0 1099905 27 ] parent [ 6568011 ] : 6694609 0.0884057 (=10681/(313*386)) 91.2336 given [ 6568011 ] : 6568011 0.531169 (=818/(5*308)) 50.5696 best keyword for cluster 6568011 is PF01288 with Jaccard = 0.9118 [ 279 0 1099905 27 ] 1.0000 0.9118 sibling [ 6568011 ] : 6680245 0.137511 (=158/(3*383)) 88.2278 best keyword for cluster 6680245 is PF00809 with Jaccard = 0.6455 [ 335 1 1099692 183 ] 0.9970 0.6467 SUGGESTING RELATEDNESS OF: A> PF01288 ( PF01288 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK) ) B> PF00809 ( PF00809 Pterin binding enzyme ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF00809| = 518 , |PF01288| = 306 , |PF00809^PF01288| = 34 ( 6.6% and 11.1% ) both PF01288 and PF00809 have PDB structures PF01288 d.58.30.1 SUPERFAM mapping significantly overlapping: 1 PF00809 SSF51717 0.783 (average over 1837 mutual instances, PF00809 3831 appearances, SSF51717 4784 appearances) 2 PF01288 SSF55083 0.821 (average over 930 mutual instances, PF01288 1027 appearances, SSF55083 1126 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 521 ) 6732270_PF01061_PF03379 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
======================
best cluster for keyword PF03379 is 6606312 with Jaccard = 0.9115 |PF03379|=110 [ 103 3 1100098 7 ]
parent [ 6606312 ] : 6732270 0.0383941 (=7443/(122*1589)) 96.9763
given [ 6606312 ] : 6606312 0.377679 (=423/(10*112)) 64.7614
best keyword for cluster 6606312 is PF03379 with Jaccard = 0.9115 [ 103 3 1100098 7 ] 0.9717 0.9364
sibling [ 6606312 ] : 6703361 0.0853535 (=39843/(1200*389)) 92.7892
best keyword for cluster 6703361 is PF01061 with Jaccard = 0.6168 [ 1109 8 1098413 681 ] 0.9928 0.6196
SUGGESTING RELATEDNESS OF:
A> PF03379 ( PF03379 CcmB protein )
B> PF01061 ( PF01061 ABC-2 type transporter )
they come from the same clan: CL0181.5 : PF01061 PF03379 PF06182
the two keywords do not coincide on UniRef90 proteins
Neither PF03379 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 522 ) 6762033_PF00432_PF03936 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03936 is 6756742 with Jaccard = 0.9103 |PF03936|=280 [ 274 21 1099910 6 ]
parent [ 6756742 ] : 6762033 0.00836484 (=1593/(360*529)) 99.3721
given [ 6756742 ] : 6756742 0.0121238 (=224/(62*298)) 99.0753
best keyword for cluster 6756742 is PF03936 with Jaccard = 0.9103 [ 274 21 1099910 6 ] 0.9288 0.9786
sibling [ 6756742 ] : 6757010 0.0141383 (=137/(19*510)) 99.0916
best keyword for cluster 6757010 is PF00432 with Jaccard = 0.9106 [ 336 14 1099842 19 ] 0.9600 0.9465
SUGGESTING RELATEDNESS OF:
A> PF03936 ( PF03936 Terpene synthase family, metal binding domain )
B> PF00432 ( PF00432 Prenyltransferase and squalene oxidase repeat )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00432| = 355 , |PF03936| = 280 , |PF00432^PF03936| = 2 ( 0.6% and 0.7% )
both PF03936 and PF00432 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF03936 SSF48576 0.798 (average over 833 mutual instances, PF03936 1514 appearances, SSF48576 4885 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 523 ) 6754976_PF00160_PF00254 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00254 is 6723775 with Jaccard = 0.9093 |PF00254|=1037 [ 972 32 1099142 65 ]
parent [ 6723775 ] : 6754976 0.010703 (=13835/(1125*1149)) 98.9684
given [ 6723775 ] : 6723775 0.0520375 (=710/(12*1137)) 95.9682
best keyword for cluster 6723775 is PF00254 with Jaccard = 0.9093 [ 972 32 1099142 65 ] 0.9681 0.9373
sibling [ 6723775 ] : 6722487 0.0440123 (=2139/(45*1080)) 95.7861
best keyword for cluster 6722487 is PF00160 with Jaccard = 0.9289 [ 980 34 1099156 41 ] 0.9665 0.9598
SUGGESTING RELATEDNESS OF:
A> PF00254 ( PF00254 FKBP-type peptidyl-prolyl cis-trans isomerase )
B> PF00160 ( PF00160 Cyclophilin type peptidyl-prolyl cis-trans isomerase/CLD )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00160| = 1021 , |PF00254| = 1037 , |PF00160^PF00254| = 13 ( 1.3% and 1.3% )
both PF00254 and PF00160 have PDB structures
PF00254 d.26.1.1
SUPERFAM mapping significantly overlapping:
1 PF00160 SSF50891 0.950 (average over 2953 mutual instances, PF00160 3038 appearances, SSF50891 3533 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 524 ) 6677258_PF02048_PF02058 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02048 is 6431160 with Jaccard = 0.9091 |PF02048|=11 [ 10 0 1100200 1 ]
parent [ 6431160 ] : 6677258 0.235294 (=40/(10*17)) 87.5005
given [ 6431160 ] : 6431160 1 (=24/(4*6)) 0.315756
best keyword for cluster 6431160 is PF02048 with Jaccard = 0.9091 [ 10 0 1100200 1 ] 1.0000 0.9091
sibling [ 6431160 ] : 6562710 0.5625 (=9/(1*16)) 49.1244
best keyword for cluster 6562710 is PF02058 with Jaccard = 1.0000 [ 16 0 1100195 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02048 ( PF02048 Heat-stable enterotoxin )
B> PF02058 ( PF02058 Guanylin precursor )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF02048 and PF02058 have PDB structures
PF02058 d.234.1.1
SUPERFAM mapping significantly overlapping:
1 PF02058 SSF89890 0.800 (average over 26 mutual instances, PF02058 26 appearances, SSF89890 27 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 525 ) 6715034_PF03382_PF05215 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03382 is 6659162 with Jaccard = 0.9091 |PF03382|=97 [ 90 2 1100112 7 ]
parent [ 6659162 ] : 6715034 0.0605546 (=428/(114*62)) 94.7298
given [ 6659162 ] : 6659162 0.174383 (=113/(6*108)) 83.1401
best keyword for cluster 6659162 is PF03382 with Jaccard = 0.9091 [ 90 2 1100112 7 ] 0.9783 0.9278
sibling [ 6659162 ] : 6708330 0.0859539 (=41/(9*53)) 93.6908
best keyword for cluster 6708330 is PF05215 with Jaccard = 0.8333 [ 5 1 1100205 0 ] 0.8333 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03382 ( PF03382 Mycoplasma protein of unknown function, DUF285 )
B> PF05215 ( PF05215 Spiralin )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF03382 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 526 ) 6728401_PF04208_PF04210 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04210 is 6427583 with Jaccard = 0.9091 |PF04210|=11 [ 10 0 1100200 1 ]
parent [ 6427583 ] : 6728401 0.0354839 (=11/(10*31)) 96.5323
given [ 6427583 ] : 6427583 1 (=9/(1*9)) 0.225379
best keyword for cluster 6427583 is PF04210 with Jaccard = 0.9091 [ 10 0 1100200 1 ] 1.0000 0.9091
sibling [ 6427583 ] : 6689018 0.10101 (=20/(22*9)) 90.0609
best keyword for cluster 6689018 is PF04208 with Jaccard = 1.0000 [ 21 0 1100190 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04210 ( PF04210 Tetrahydromethanopterin S-methyltransferase, subunit G )
B> PF04208 ( PF04208 Tetrahydromethanopterin S-methyltransferase, subunit A )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF04208| = 21 , |PF04210| = 11 , |PF04208^PF04210| = 1 ( 4.8% and 9.1% )
Neither PF04210 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 527 ) 6516428_PF05279_PF07169 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07169 is 6442429 with Jaccard = 0.9091 |PF07169|=11 [ 10 0 1100200 1 ]
parent [ 6442429 ] : 6516428 0.836842 (=159/(10*19)) 18.2885
given [ 6442429 ] : 6442429 1 (=9/(1*9)) 0.79231
best keyword for cluster 6442429 is PF07169 with Jaccard = 0.9091 [ 10 0 1100200 1 ] 1.0000 0.9091
sibling [ 6442429 ] : 6499147 0.892857 (=75/(7*12)) 11.0598
best keyword for cluster 6499147 is PF05279 with Jaccard = 0.8261 [ 19 0 1100188 4 ] 1.0000 0.8261
SUGGESTING RELATEDNESS OF:
A> PF07169 ( PF07169 Triadin )
B> PF05279 ( PF05279 Aspartyl beta-hydroxylase N-terminal region )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF07169 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 528 ) 6656573_PF01625_PF01641 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01625 is 6609366 with Jaccard = 0.9089 |PF01625|=428 [ 389 0 1099783 39 ]
parent [ 6609366 ] : 6656573 0.178414 (=26683/(431*347)) 82.2245
given [ 6609366 ] : 6609366 0.416279 (=179/(1*430)) 66.6318
best keyword for cluster 6609366 is PF01625 with Jaccard = 0.9089 [ 389 0 1099783 39 ] 1.0000 0.9089
sibling [ 6609366 ] : 6654707 0.229717 (=470/(6*341)) 81.6259
best keyword for cluster 6654707 is PF01641 with Jaccard = 0.8918 [ 305 0 1099869 37 ] 1.0000 0.8918
SUGGESTING RELATEDNESS OF:
A> PF01625 ( PF01625 Peptide methionine sulfoxide reductase )
B> PF01641 ( PF01641 SelR domain )
Only B has a clan ( CL0080.7 ).
the two keywords coincide on UniRef90 proteins: |PF01625| = 428 , |PF01641| = 342 , |PF01625^PF01641| = 69 ( 16.1% and 20.2% )
both PF01625 and PF01641 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF01641 SSF51316 0.909 (average over 1096 mutual instances, PF01641 1348 appearances, SSF51316 1572 appearances)
2 PF01625 SSF55068 0.877 (average over 1339 mutual instances, PF01625 1591 appearances, SSF55068 1587 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 529 ) 6686617_PF00476_PF00752 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00752 is 6682097 with Jaccard = 0.9076 |PF00752|=218 [ 216 20 1099973 2 ]
parent [ 6682097 ] : 6686617 0.12149 (=19501/(615*261)) 89.5745
given [ 6682097 ] : 6682097 0.121622 (=63/(2*259)) 88.7149
best keyword for cluster 6682097 is PF00752 with Jaccard = 0.9076 [ 216 20 1099973 2 ] 0.9153 0.9908
sibling [ 6682097 ] : 6665138 0.159041 (=292/(3*612)) 84.293
best keyword for cluster 6665138 is PF00476 with Jaccard = 0.7633 [ 416 114 1099666 15 ] 0.7849 0.9652
SUGGESTING RELATEDNESS OF:
A> PF00752 ( PF00752 XPG N-terminal domain )
B> PF00476 ( PF00476 DNA polymerase family A )
Only A has a clan ( CL0280.2 ).
the two keywords do not coincide on UniRef90 proteins
both PF00752 and PF00476 have PDB structures
PF00476 e.8.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 530 ) 6674654_PF01600_PF01601 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01600 is 6597009 with Jaccard = 0.9071 |PF01600|=183 [ 166 0 1100028 17 ]
parent [ 6597009 ] : 6674654 0.141921 (=1061/(178*42)) 86.803
given [ 6597009 ] : 6597009 0.397661 (=476/(7*171)) 60.2341
best keyword for cluster 6597009 is PF01600 with Jaccard = 0.9071 [ 166 0 1100028 17 ] 1.0000 0.9071
sibling [ 6597009 ] : 6651052 0.203125 (=65/(32*10)) 80.3359
best keyword for cluster 6651052 is PF01601 with Jaccard = 0.6667 [ 30 4 1100166 11 ] 0.8824 0.7317
SUGGESTING RELATEDNESS OF:
A> PF01600 ( PF01600 Coronavirus S1 glycoprotein )
B> PF01601 ( PF01601 Coronavirus S2 glycoprotein )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01600| = 183 , |PF01601| = 41 , |PF01600^PF01601| = 20 ( 10.9% and 48.8% )
only PF01600 has a PDB structure (may not be up to date)
PF01601 h.3.3.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 531 ) 6635535_PF03741_PF04332 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF04332 is 6246423 with Jaccard = 0.9070 |PF04332|=43 [ 39 0 1100168 4 ]
parent [ 6246423 ] : 6635535 0.29117 (=4620/(41*387)) 75.885
given [ 6246423 ] : 6246423 1 (=78/(2*39)) 1.32111e-13
best keyword for cluster 6246423 is PF04332 with Jaccard = 0.9070 [ 39 0 1100168 4 ] 1.0000 0.9070
sibling [ 6246423 ] : 6619220 0.316062 (=122/(1*386)) 70.0756
best keyword for cluster 6619220 is PF03741 with Jaccard = 0.9971 [ 339 0 1099871 1 ] 1.0000 0.9971
SUGGESTING RELATEDNESS OF:
A> PF04332 ( PF04332 Protein of unknown function (DUF475) )
B> PF03741 ( PF03741 Integral membrane protein TerC family )
Only A has a clan ( CL0015.13 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF04332 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF04332 SSF103473 0.864 (average over 16 mutual instances, PF04332 17 appearances, SSF103473 39293 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 532 ) 6750299_PF01796_PF07431 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01796 is 6715575 with Jaccard = 0.9048 |PF01796|=189 [ 171 0 1100022 18 ]
parent [ 6715575 ] : 6750299 0.0163317 (=52/(199*16)) 98.6405
given [ 6715575 ] : 6715575 0.0608466 (=115/(189*10)) 94.8109
best keyword for cluster 6715575 is PF01796 with Jaccard = 0.9048 [ 171 0 1100022 18 ] 1.0000 0.9048
sibling [ 6715575 ] : 6731553 0.047619 (=3/(9*7)) 96.8921
best keyword for cluster 6731553 is PF07431 with Jaccard = 0.8750 [ 7 0 1100203 1 ] 1.0000 0.8750
SUGGESTING RELATEDNESS OF:
A> PF01796 ( PF01796 Domain of unknown function DUF35 )
B> PF07431 ( PF07431 Protein of unknown function (DUF1512) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01796 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 533 ) 6684095_PF00741_PF05121 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05121 is 6505331 with Jaccard = 0.9048 |PF05121|=21 [ 19 0 1100190 2 ]
parent [ 6505331 ] : 6684095 0.113997 (=158/(21*66)) 89.1211
given [ 6505331 ] : 6505331 0.9 (=18/(1*20)) 13.4524
best keyword for cluster 6505331 is PF05121 with Jaccard = 0.9048 [ 19 0 1100190 2 ] 1.0000 0.9048
sibling [ 6505331 ] : 6632731 0.25 (=32/(2*64)) 75.3185
best keyword for cluster 6632731 is PF00741 with Jaccard = 0.9206 [ 58 0 1100148 5 ] 1.0000 0.9206
SUGGESTING RELATEDNESS OF:
A> PF05121 ( PF05121 Gas vesicle protein K )
B> PF00741 ( PF00741 Gas vesicle protein )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00741| = 63 , |PF05121| = 21 , |PF00741^PF05121| = 5 ( 7.9% and 23.8% )
Neither PF05121 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 534 ) 6751190_PF01074_PF03065 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07748 is 6728822 with Jaccard = 0.9045 |PF07748|=163 [ 161 15 1100033 2 ]
parent [ 6728822 ] : 6751190 0.0169542 (=517/(158*193)) 98.7098
given [ 6728822 ] : 6728822 0.0351064 (=33/(188*5)) 96.5871
best keyword for cluster 6728822 is PF01074 with Jaccard = 0.9435 [ 167 9 1100034 1 ] 0.9489 0.9940
sibling [ 6728822 ] : 6720735 0.0449561 (=41/(6*152)) 95.5102
best keyword for cluster 6720735 is PF03065 with Jaccard = 0.9500 [ 133 0 1100071 7 ] 1.0000 0.9500
SUGGESTING RELATEDNESS OF:
A> PF01074 ( PF01074 Glycosyl hydrolases family 38 N-terminal domain )
B> PF03065 ( PF03065 Glycosyl hydrolase family 57 )
they come from the same clan: CL0158.6 : PF01074 PF03065 PF01522
the two keywords do not coincide on UniRef90 proteins
both PF01074 and PF03065 have PDB structures
PF01074 c.6.2.1
PF03065 c.6.2.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 535 ) 6750289_PF01427_PF02557 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02557 is 6723467 with Jaccard = 0.9036 |PF02557|=160 [ 150 6 1100045 10 ]
parent [ 6723467 ] : 6750289 0.0186265 (=377/(88*230)) 98.6391
given [ 6723467 ] : 6723467 0.0500424 (=649/(131*99)) 95.9206
best keyword for cluster 6723467 is PF02557 with Jaccard = 0.9036 [ 150 6 1100045 10 ] 0.9615 0.9375
sibling [ 6723467 ] : 6602156 0.402299 (=35/(1*87)) 62.7467
best keyword for cluster 6602156 is PF01427 with Jaccard = 0.9512 [ 78 2 1100129 2 ] 0.9750 0.9750
SUGGESTING RELATEDNESS OF:
A> PF02557 ( PF02557 D-alanyl-D-alanine carboxypeptidase )
B> PF01427 ( PF01427 D-ala-D-ala dipeptidase )
they come from the same clan: CL0170.6 : PF01085 PF01427 PF05951 PF08291 PF03411 PF02557
the two keywords do not coincide on UniRef90 proteins
both PF02557 and PF01427 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF02557 SSF55166 0.819 (average over 234 mutual instances, PF02557 248 appearances, SSF55166 1247 appearances)
2 PF01427 SSF55166 0.969 (average over 259 mutual instances, PF01427 262 appearances, SSF55166 1247 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 536 ) 6696267_PF06225_PF06227 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF06225 is 6649280 with Jaccard = 0.9032 |PF06225|=28 [ 28 3 1100180 0 ]
parent [ 6649280 ] : 6696267 0.101307 (=31/(34*9)) 91.6104
given [ 6649280 ] : 6649280 0.223443 (=61/(13*21)) 79.8006
best keyword for cluster 6649280 is PF06225 with Jaccard = 0.9032 [ 28 3 1100180 0 ] 0.9032 1.0000
sibling [ 6649280 ] : 6663638 0.25 (=2/(1*8)) 84
best keyword for cluster 6663638 is PF06227 with Jaccard = 1.0000 [ 3 0 1100208 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06225 ( PF06225 Poxvirus A4/B15 family )
B> PF06227 ( PF06227 Orthopoxvirus N1 protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06225 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 537 ) 6729962_PF00457_PF01522 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01522 is 6718873 with Jaccard = 0.9029 |PF01522|=911 [ 828 6 1099294 83 ]
parent [ 6718873 ] : 6729962 0.0338904 (=4490/(142*933)) 96.7181
given [ 6718873 ] : 6718873 0.0529158 (=343/(7*926)) 95.2614
best keyword for cluster 6718873 is PF01522 with Jaccard = 0.9029 [ 828 6 1099294 83 ] 0.9928 0.9089
sibling [ 6718873 ] : 6708529 0.0835979 (=79/(135*7)) 93.7245
best keyword for cluster 6708529 is PF00457 with Jaccard = 0.9853 [ 134 1 1100075 1 ] 0.9926 0.9926
SUGGESTING RELATEDNESS OF:
A> PF01522 ( PF01522 Polysaccharide deacetylase )
B> PF00457 ( PF00457 Glycosyl hydrolases family 11 )
A and B come from a different clan ( CL0158.6 , CL0004.14 ).
the two keywords coincide on UniRef90 proteins: |PF00457| = 135 , |PF01522| = 911 , |PF00457^PF01522| = 7 ( 5.2% and 0.8% )
both PF01522 and PF00457 have PDB structures
PF00457 b.29.1.11
SUPERFAM mapping significantly overlapping:
1 PF00457 SSF49899 0.930 (average over 314 mutual instances, PF00457 406 appearances, SSF49899 14070 appearances)
2 PF01522 SSF88713 0.586 (average over 2652 mutual instances, PF01522 2828 appearances, SSF88713 4598 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 538 ) 6605561_PF00441_PF01756 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01756 is 6602354 with Jaccard = 0.9028 |PF01756|=130 [ 130 14 1100067 0 ]
parent [ 6602354 ] : 6605561 0.382416 (=125697/(2107*156)) 64.4188
given [ 6602354 ] : 6602354 0.387778 (=349/(150*6)) 62.9528
best keyword for cluster 6602354 is PF01756 with Jaccard = 0.9028 [ 130 14 1100067 0 ] 0.9028 1.0000
sibling [ 6602354 ] : 6581114 0.516144 (=1087/(1*2106)) 54.0987
best keyword for cluster 6581114 is PF00441 with Jaccard = 0.9416 [ 1870 30 1098225 86 ] 0.9842 0.9560
SUGGESTING RELATEDNESS OF:
A> PF01756 ( PF01756 Acyl-CoA oxidase )
B> PF00441 ( PF00441 Acyl-CoA dehydrogenase, C-terminal domain )
they come from the same clan: CL0087.7 : PF01756 PF00441 PF08028
the two keywords coincide on UniRef90 proteins: |PF00441| = 1956 , |PF01756| = 130 , |PF00441^PF01756| = 53 ( 2.7% and 40.8% )
both PF01756 and PF00441 have PDB structures
PF01756 a.29.3.2
PF00441 a.29.3.1
SUPERFAM mapping significantly overlapping:
1 PF00441 SSF47203 0.910 (average over 6570 mutual instances, PF00441 13147 appearances, SSF47203 17996 appearances)
2 PF01756 SSF47203 0.935 (average over 253 mutual
instances, PF01756 485 appearances, SSF47203 17996 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 539 ) 6608783_PF00135_PF07859 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00135 is 6582935 with Jaccard = 0.9021 |PF00135|=797 [ 719 0 1099414 78 ]
parent [ 6582935 ] : 6608783 0.373768 (=260498/(747*933)) 66.4102
given [ 6582935 ] : 6582935 0.520805 (=776/(2*745)) 54.6842
best keyword for cluster 6582935 is PF00135 with Jaccard = 0.9021 [ 719 0 1099414 78 ] 1.0000 0.9021
sibling [ 6582935 ] : 6597184 0.440143 (=1228/(3*930)) 60.3925
best keyword for cluster 6597184 is PF07859 with Jaccard = 0.9227 [ 716 40 1099435 20 ] 0.9471 0.9728
SUGGESTING RELATEDNESS OF:
A> PF00135 ( PF00135 Carboxylesterase )
B> PF07859 ( PF07859 alpha/beta hydrolase fold )
they come from the same clan: CL0028.14 : PF05728 PF00975 PF07519 PF06850 PF07819 PF00326 PF05576 PF05577 PF02129 PF00450 PF02089 PF03403 PF03096 PF01764 PF01674 PF00151 PF03583 PF02450 PF03959 PF00756 PF06028 PF05990 PF05677 PF05057 PF04301 PF08538 PF07176 PF06821 PF06500 PF06342 PF06259 PF01738 PF01083 PF00135 PF07224 PF08840 PF05448 PF02273 PF08386 PF07859 PF02230 PF00561 PF06057
the two keywords do not coincide on UniRef90 proteins
both PF00135 and PF07859 have PDB structures
PF00135 c.69.1.1 c.69.1.17
PF07859 c.69.1.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 540 ) 6759091_PF01637_PF06846 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold)
======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01637 is 6740504 with Jaccard = 0.9016 |PF01637|=168 [ 165 15 1100028 3 ]
parent [ 6740504 ] : 6759091 0.0113533 (=188/(29*571)) 99.2146
given [ 6740504 ] : 6740504 0.0292702 (=2371/(263*308)) 97.8279
best keyword for cluster 6740504 is PF01637 with Jaccard = 0.9016 [ 165 15 1100028 3 ] 0.9167 0.9821
sibling [ 6740504 ] : 6672132 0.171569 (=35/(12*17)) 86.0559
best keyword for cluster 6672132 is PF06846 with Jaccard = 1.0000 [ 11 0 1100200 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01637 ( PF01637 Archaeal ATPase )
B> PF06846 ( PF06846 Protein of unknown function (DUF1245) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01637 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 541 ) 6696840_PF03400_PF03811 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03400 is 6690597 with Jaccard = 0.9014 |PF03400|=69 [ 64 2 1100140 5 ]
parent [ 6690597 ] : 6696840 0.089158 (=773/(102*85)) 91.7053
given [ 6690597 ] : 6690597 0.130952 (=11/(1*84)) 90.3792
best keyword for cluster 6690597 is PF03400 with Jaccard = 0.9014 [ 64 2 1100140 5 ] 0.9697 0.9275
sibling [ 6690597 ] : 6648687 0.237403 (=457/(77*25)) 79.5642
best keyword for cluster 6648687 is PF03811 with Jaccard = 0.8197 [ 50 5 1100150 6 ] 0.9091 0.8929
SUGGESTING RELATEDNESS OF:
A> PF03400 ( PF03400 IS1 transposase )
B> PF03811 ( PF03811 Insertion element protein )
Only B has a clan ( CL0123.12 ).
the two keywords coincide on UniRef90 proteins: |PF03400| = 69 , |PF03811| = 56 , |PF03400^PF03811| = 2 ( 2.9% and 3.6% )
Neither PF03400 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 542 ) 6723432_PF02275_PF03417 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03417 is 6667352 with Jaccard = 0.9000 |PF03417|=50 [ 45 0 1100161 5 ]
parent [ 6667352 ] : 6723432 0.0491453 (=253/(99*52)) 95.9147
given [ 6667352 ] : 6667352 0.171852 (=116/(27*25)) 84.8142
best keyword for cluster 6667352 is PF03417 with Jaccard = 0.9000 [ 45 0 1100161 5 ] 1.0000 0.9000
sibling [ 6667352 ] : 6559634 0.551546 (=107/(2*97)) 46.4772
best keyword for cluster 6559634 is PF02275 with Jaccard = 0.7699 [ 87 0 1100098 26 ] 1.0000 0.7699
SUGGESTING RELATEDNESS OF:
A> PF03417 ( PF03417 Acyl-coenzyme A:6-aminopenicillanic acid acyl-transferase )
B> PF02275 ( PF02275 Linear amide C-N hydrolases, choloylglycine hydrolase family )
they come from the same clan: CL0052.11 : PF00227 PF03577 PF01804 PF01019 PF00310 PF02275 PF01112 PF03417
the two keywords do not coincide on UniRef90 proteins
only PF03417 has a PDB structure (may not be up to date)
PF02275 d.153.1.3
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 543 ) 6693364_PF00023_PF01412 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords
related? ======================
best cluster for keyword PF01412 is 6678069 with Jaccard = 0.8989 |PF01412|=351 [ 329 15 1099845 22 ]
parent [ 6678069 ] : 6693364 0.101111 (=146245/(375*3857)) 90.9881
given [ 6678069 ] : 6678069 0.136792 (=203/(4*371)) 87.6937
best keyword for cluster 6678069 is PF01412 with Jaccard = 0.8989 [ 329 15 1099845 22 ] 0.9564 0.9373
sibling [ 6678069 ] : 6691620 0.0987318 (=8330/(22*3835)) 90.6005
best keyword for cluster 6691620 is PF00023 with Jaccard = 0.7263 [ 3124 223 1095910 954 ] 0.9334 0.7661
SUGGESTING RELATEDNESS OF:
A> PF01412 ( PF01412 Putative GTPase activating protein for Arf )
B> PF00023 ( PF00023 Ankyrin repeat )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00023| = 4078 , |PF01412| = 351 , |PF00023^PF01412| = 87 ( 2.1% and 24.8% )
both PF01412 and PF00023 have PDB structures
PF01412 g.45.1.1
PF00023 d.211.1.1 i.11.1.1
SUPERFAM mapping significantly overlapping:
1 PF01412 SSF57863 0.943 (average over 744 mutual instances, PF01412 1021 appearances, SSF57863 1344 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 544 ) 6708789_PF02774_PF02800 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02774 is 6592245 with Jaccard = 0.8988 |PF02774|=521 [ 506 42 1099648 15 ]
parent [ 6592245 ] : 6708789 0.0773083 (=44486/(607*948)) 93.7601
given [ 6592245 ] : 6592245 0.45068 (=40810/(264*343)) 58.0641
best keyword for cluster 6592245 is PF02774 with Jaccard = 0.8988 [ 506 42 1099648 15 ] 0.9234 0.9712
sibling [ 6592245 ] : 6702179 0.161563 (=153/(1*947)) 92.5745
best keyword for cluster 6702179 is PF02800 with Jaccard = 0.9434 [ 833 48 1099328 2 ] 0.9455 0.9976
SUGGESTING RELATEDNESS OF:
A> PF02774 ( PF02774 Semialdehyde dehydrogenase, dimerisation domain )
B> PF02800 ( PF02800 Glyceraldehyde 3-phosphate dehydrogenase, C-terminal domain )
they come from the same clan: CL0139.6 : PF02800 PF02774
the two keywords do not coincide on UniRef90 proteins
both PF02774 and PF02800 have PDB structures
PF02800 d.81.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 545 ) 6697766_PF03151_PF08449 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF08449 is 6695887 with Jaccard = 0.8957 |PF08449|=114 [ 103 1 1100096 11 ]
parent [ 6695887 ] : 6697766 0.100189 (=3400/(303*112)) 91.8457
given [ 6695887 ] : 6695887 0.0917431 (=30/(3*109)) 91.5001
best keyword for cluster 6695887 is PF08449 with Jaccard = 0.8957 [ 103 1 1100096 11 ] 0.9904 0.9035
sibling [ 6695887 ] : 6615930 0.334001 (=5167/(65*238)) 68.8752
best keyword for cluster 6615930 is PF03151 with Jaccard = 0.8301 [ 254 6 1099905 46 ] 0.9769 0.8467
SUGGESTING RELATEDNESS OF:
A> PF08449 ( PF08449 UAA transporter family )
B> PF03151 ( PF03151 Triose-phosphate Transporter family )
they come from the same clan: CL0184.5 : PF07857 PF04342 PF00892 PF05653 PF06027 PF00893 PF04142 PF06379 PF06800 PF03151 PF08449 PF02694
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 546 ) 6732346_PF02498_PF08346 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02498 is 6725496 with Jaccard = 0.8941 |PF02498|=233 [ 211 3 1099975 22 ]
parent [ 6725496 ] : 6732346 0.0319589 (=1809/(178*318)) 96.9893
given [ 6725496 ] : 6725496 0.0459426 (=261/(19*299)) 96.1671
best keyword for cluster 6725496 is PF02498 with Jaccard = 0.8941 [ 211 3 1099975 22 ] 0.9860 0.9056
sibling [ 6725496 ] : 6707402 0.0669749 (=478/(117*61)) 93.5264
best keyword for cluster 6707402 is PF08346 with Jaccard = 0.7231 [ 47 16 1100146 2 ] 0.7460 0.9592
SUGGESTING RELATEDNESS OF:
A> PF02498 ( PF02498 BRO family, N-terminal domain )
B> PF08346 ( PF08346 AntA/AntB antirepressor )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 547 ) 6781628_PF04159_PF04277 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04277 is 6761709 with Jaccard = 0.8929 |PF04277|=55 [ 50 1 1100155 5 ]
parent [ 6761709 ] : 6781628 0.00047619 (=4/(84*100)) 99.9752
given [ 6761709 ] : 6761709 0.0108108 (=8/(10*74)) 99.356
best keyword for cluster 6761709 is PF04277 with Jaccard = 0.8929 [ 50 1 1100155 5 ] 0.9804 0.9091
sibling [ 6761709 ] : 6779296 0.00080289 (=2/(53*47)) 99.9423
best keyword for cluster 6779296 is PF04159 with Jaccard = 1.0000 [ 4 0 1100207 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04277 ( PF04277 Oxaloacetate decarboxylase, gamma chain )
B> PF04159 ( PF04159 NB glycoprotein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 548 ) 6768152_PF01105_PF04776 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04776 is 6740322 with Jaccard = 0.8929 |PF04776|=27 [ 25 1 1100183 2 ]
parent [ 6740322 ] : 6768152 0.00428762 (=61/(41*347)) 99.645
given [ 6740322 ] : 6740322 0.0252101 (=6/(34*7)) 97.8073
best keyword for cluster 6740322 is PF04776 with Jaccard = 0.8929 [ 25 1 1100183 2 ] 0.9615 0.9259
sibling [ 6740322 ] : 6766010 0.00581395 (=6/(3*344)) 99.5556
best keyword for cluster 6766010 is PF01105 with Jaccard = 0.6914 [ 177 0 1099955 79 ] 1.0000 0.6914
SUGGESTING RELATEDNESS OF:
A> PF04776 ( PF04776 Protein of unknown function (DUF626) )
B> PF01105 ( PF01105 emp24/gp25L/p24 family/GOLD )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF04776 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 549 ) 6576467_PF02434_PF04646 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02434 is 6535808 with Jaccard = 0.8913 |PF02434|=45 [ 41 1 1100165 4 ]
parent [ 6535808 ] : 6576467 0.518541 (=867/(38*44)) 52.6529
given [ 6535808 ] : 6535808 0.730159 (=230/(9*35)) 29.372
best keyword for cluster 6535808 is PF02434 with Jaccard = 0.8913 [ 41 1 1100165 4 ] 0.9762 0.9111
sibling [ 6535808 ] : 6514429 0.835227 (=294/(22*16)) 17.294
best keyword for cluster 6514429 is PF04646 with Jaccard = 0.8333 [ 20 0 1100187 4 ] 1.0000 0.8333
SUGGESTING RELATEDNESS OF:
A> PF02434 ( PF02434 Fringe-like )
B> PF04646 ( PF04646 Protein of unknown function, DUF604 )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF02434| = 45 , |PF04646| = 24 , |PF02434^PF04646| = 1 ( 2.2% and 4.2% )
Neither A nor B has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 550 ) 6664257_PF00430_PF05103 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00430 is 6615487 with Jaccard = 0.8909 |PF00430|=394 [ 351 0 1099817 43 ]
parent [ 6615487 ] : 6664257 0.193881 (=17160/(406*218)) 84.1173
given [ 6615487 ] : 6615487 0.335271 (=7615/(67*339)) 68.6452
best keyword for cluster 6615487 is PF00430 with Jaccard = 0.8909 [ 351 0 1099817 43 ] 1.0000 0.8909
sibling [ 6615487 ] : 6642868 0.269444 (=1843/(180*38)) 77.9038
best keyword for cluster 6642868 is PF05103 with Jaccard = 0.8919 [ 99 7 1100100 5 ] 0.9340 0.9519
SUGGESTING RELATEDNESS OF:
A> PF00430 ( PF00430 ATP synthase B/B' CF(0) )
B> PF05103 ( PF05103 DivIVA protein )
Only A has a clan ( CL0255.4 ).
the two keywords do not coincide on UniRef90 proteins
only PF00430 has a PDB structure (may not be up to date)
PF00430 f.23.21.1 j.35.1.1
SUPERFAM mapping significantly overlapping:
1 PF00430 SSF82607 0.684 (average over 1 mutual instances, PF00430 10 appearances, SSF82607 761 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 551 ) 6653706_PF00505_PF03531 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00505 is 6648722 with Jaccard = 0.8889 |PF00505|=864 [ 768 0 1099347 96 ]
parent [ 6648722 ] : 6653706 0.195702 (=10838/(65*852)) 81.2733
given [ 6648722 ] : 6648722 0.240575 (=1423/(7*845)) 79.5867
best keyword for cluster 6648722 is PF00505 with Jaccard = 0.8889 [ 768 0 1099347 96 ] 1.0000 0.8889
sibling [ 6648722 ] : 6624418 0.301333 (=226/(15*50)) 72.328
best keyword for cluster 6624418 is PF03531 with Jaccard = 0.7667 [ 46 14 1100151 0 ] 0.7667 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00505 ( PF00505 HMG (high mobility group) box )
B> PF03531 ( PF03531 Structure-specific recognition protein (SSRP1) )
A and B come from different clans ( CL0114.6 , CL0215.5 ).
the two keywords coincide on UniRef90 proteins: |PF00505| = 864 , |PF03531| = 46 , |PF00505^PF03531| = 17 ( 2.0% and 37.0% )
only PF00505 has a PDB structure (may not be up to date)
PF00505 a.21.1.1
SUPERFAM mapping significantly overlapping:
1 PF00505 SSF47095 0.800 (average over 2604 mutual instances, PF00505 2716 appearances, SSF47095 3113 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 552 ) 6647472_PF03168_PF07427 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07427 is 6625919 with Jaccard = 0.8889 |PF07427|=9 [ 8 0 1100202 1 ]
parent [ 6625919 ] : 6647472 0.254261 (=179/(64*11)) 79.2263
given [ 6625919 ] : 6625919 0.277778 (=5/(2*9)) 72.9905
best keyword for cluster 6625919 is PF07427 with Jaccard = 0.8889 [ 8 0 1100202 1 ] 1.0000 0.8889
sibling [ 6625919 ] : 6607992 0.394917 (=404/(31*33)) 65.8917
best keyword for cluster 6607992 is PF03168 with Jaccard = 1.0000 [ 26 0 1100185 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF07427 ( PF07427 Protein of unknown function (DUF1511) )
B> PF03168 ( PF03168 Late embryogenesis abundant protein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF07427 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 553 ) 6681333_PF03032_PF08018 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF08018 is 6651888 with Jaccard = 0.8889 |PF08018|=27 [ 24 0 1100184 3 ]
parent [ 6651888 ] : 6681333 0.128254 (=202/(25*63)) 88.5089
given [ 6651888 ] : 6651888 0.282609 (=13/(2*23)) 80.637
best keyword for cluster 6651888 is PF08018 with Jaccard = 0.8889 [ 24 0 1100184 3 ] 1.0000 0.8889
sibling [ 6651888 ] : 6678437 0.127119 (=30/(4*59)) 87.805
best keyword for cluster 6678437 is PF03032 with Jaccard = 0.6769 [ 44 0 1100146 21 ] 1.0000 0.6769
SUGGESTING RELATEDNESS OF:
A> PF08018 ( PF08018 Frog antimicrobial peptide )
B> PF03032 ( PF03032 Brevenin/esculentin/gaegurin/rugosin family )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03032| = 65 , |PF08018| = 28 , |PF03032^PF08018| = 8 ( 12.3% and 28.6% )
Neither A nor B has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 554 ) 6625262_PF00308_PF01695 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01695 is 6607912 with Jaccard = 0.8885 |PF01695|=292 [ 287 31 1099888 5 ]
parent [ 6607912 ] : 6625262 0.314167 (=52665/(417*402)) 72.5964
given [ 6607912 ] : 6607912 0.365012 (=603/(4*413)) 65.8168
best keyword for cluster 6607912 is PF01695 with Jaccard = 0.8885 [ 287 31 1099888 5 ] 0.9025 0.9829
sibling [ 6607912 ] : 6622739 0.28625 (=229/(2*400)) 71.6495
best keyword for cluster 6622739 is PF00308 with Jaccard = 0.9633 [ 315 12 1099884 0 ] 0.9633 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01695 ( PF01695 IstB-like ATP binding protein )
B> PF00308 ( PF00308 Bacterial dnaA protein )
they come from the same clan: CL0023.26 : PF02367 PF02534 PF02463 PF01202 PF00158 PF08542 PF03215 PF05729 PF00488 PF01078 PF00493 PF08433 PF01695 PF00437 PF05872 PF06144 PF00308 PF01583 PF00005 PF08298 PF07728 PF07726 PF07724 PF00004 PF05707
the two keywords do not coincide on UniRef90 proteins
only PF01695 has a PDB structure (may not be up to date)
PF00308 c.37.1.20
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 555 ) 6651089_PF01583_PF01747 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01747 is 6541006 with Jaccard = 0.8881 |PF01747|=134 [ 119 0 1100077 15 ]
parent [ 6541006 ] : 6651089 0.197373 (=5409/(203*135)) 80.3545
given [ 6541006 ] : 6541006 0.680451 (=181/(2*133)) 33.1475
best keyword for cluster 6541006 is PF01747 with Jaccard = 0.8881 [ 119 0 1100077 15 ] 1.0000 0.8881
sibling [ 6541006 ] : 6536186 0.738333 (=443/(3*200)) 29.6961
best keyword for cluster 6536186 is PF01583 with Jaccard = 0.7027 [ 182 0 1099952 77 ] 1.0000 0.7027
SUGGESTING RELATEDNESS OF:
A> PF01747 ( PF01747 ATP-sulfurylase )
B> PF01583 ( PF01583 Adenylylsulphate kinase )
Only B has a clan ( CL0023.26 ).
the two keywords coincide on UniRef90 proteins: |PF01583| = 259 , |PF01747| = 134 , |PF01583^PF01747| = 33 ( 12.7% and 24.6% )
both PF01747 and PF01583 have PDB structures
PF01583 c.37.1.15 c.37.1.4
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 556 ) 6705094_PF00891_PF05891 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00891 is 6648966 with Jaccard = 0.8862 |PF00891|=380 [ 366 33 1099798 14 ]
parent [ 6648966 ] : 6705094 0.0886865 (=1636/(43*429)) 93.1238
given [ 6648966 ] : 6648966 0.220012 (=741/(8*421)) 79.6904
best keyword for cluster 6648966 is PF00891 with Jaccard = 0.8862 [ 366 33 1099798 14 ] 0.9173 0.9632
sibling [ 6648966 ] : 6610234 0.371795 (=58/(39*4)) 66.8698
best keyword for cluster 6610234 is PF05891 with Jaccard = 0.9286 [ 39 3 1100169 0 ] 0.9286 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00891 ( PF00891 O-methyltransferase )
B> PF05891 ( PF05891 Eukaryotic protein of unknown function (DUF858) )
they come from the same clan: CL0102.14 : PF06962 PF00398 PF06325 PF03291 PF01135 PF01358 PF06460 PF01189 PF05401 PF01234 PF01555 PF02384 PF07942 PF05175 PF05063 PF07109 PF02475 PF07021 PF08003 PF05148 PF01795 PF02390 PF01596 PF00891 PF09445 PF08242 PF08241 PF05971 PF02086 PF02527 PF08704 PF01728 PF01269 PF07669 PF06080 PF05891 PF05430 PF04816 PF04672 PF04445 PF04378 PF01861 PF03269 PF03141 PF07757 PF07279 PF05219 PF08123 PF00145 PF03602 PF02353 PF01739 PF06859 PF09243 PF01564 PF03848 PF05724 PF02005 PF05958 PF01209 PF01170
the two keywords do not coincide on UniRef90 proteins
both PF00891 and PF05891 have PDB structures
PF05891 c.66.1.42
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 557 ) 6667538_PF02493_PF07661 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02493 is 6662460 with Jaccard = 0.8847 |PF02493|=449 [ 399 2 1099760 50 ]
parent [ 6662460 ] : 6667538 0.17733 (=12406/(159*440)) 84.8563
given [ 6662460 ] : 6662460 0.180619 (=315/(4*436)) 83.7714
best keyword for cluster 6662460 is PF02493 with Jaccard = 0.8847 [ 399 2 1099760 50 ] 0.9950 0.8886
sibling [ 6662460 ] : 6648173 0.228632 (=107/(3*156)) 79.4327
best keyword for cluster 6648173 is PF07661 with Jaccard = 0.9138 [ 106 1 1100095 9 ] 0.9907 0.9217
SUGGESTING RELATEDNESS OF:
A> PF02493 ( PF02493 MORN repeat )
B> PF07661 ( PF07661 MORN repeat variant )
they come from the same clan: CL0251.3 : PF07661 PF02493
the two keywords coincide on UniRef90 proteins: |PF02493| = 449 , |PF07661| = 115 , |PF02493^PF07661| = 2 ( 0.4% and 1.7% )
only PF02493 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 558 ) 6716047_PF00590_PF02602 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00590 is 6685379 with Jaccard = 0.8814 |PF00590|=1270 [ 1122 3 1098938 148 ]
parent [ 6685379 ] : 6716047 0.0535455 (=21959/(300*1367)) 94.8904
given [ 6685379 ] : 6685379 0.133467 (=11625/(67*1300)) 89.3396
best keyword for cluster 6685379 is PF00590 with Jaccard = 0.8814 [ 1122 3 1098938 148 ] 0.9973 0.8835
sibling [ 6685379 ] : 6669701 0.165375 (=1792/(42*258)) 85.4332
best keyword for cluster 6669701 is PF02602 with Jaccard = 0.8272 [ 249 0 1099910 52 ] 1.0000 0.8272
SUGGESTING RELATEDNESS OF:
A> PF00590 ( PF00590 Tetrapyrrole (Corrin/Porphyrin) Methylases )
B> PF02602 ( PF02602 Uroporphyrinogen-III synthase HemD )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00590| = 1270 , |PF02602| = 301 , |PF00590^PF02602| = 58 ( 4.6% and 19.3% )
both PF00590 and PF02602 have PDB structures
PF02602 c.113.1.1
SUPERFAM mapping significantly overlapping:
1 PF02602 SSF69618 0.927 (average over 844 mutual instances, PF02602 1033 appearances, SSF69618 1108 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 559 ) 6715698_PF02945_PF03175 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03175 is 6692611 with Jaccard = 0.8812 |PF03175|=101 [ 89 0 1100110 12 ]
parent [ 6692611 ] : 6715698 0.0615757 (=483/(74*106)) 94.8307
given [ 6692611 ] : 6692611 0.107843 (=44/(4*102)) 90.7988
best keyword for cluster 6692611 is PF03175 with Jaccard = 0.8812 [ 89 0 1100110 12 ] 1.0000 0.8812
sibling [ 6692611 ] : 6664680 0.184942 (=253/(36*38)) 84.2093
best keyword for cluster 6664680 is PF02945 with Jaccard = 0.8462 [ 33 6 1100172 0 ] 0.8462 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03175 ( PF03175 DNA polymerase type B, organellar and viral )
B> PF02945 ( PF02945 Recombination endonuclease VII )
A and B come from different clans ( CL0194.5 , CL0263.2 ).
the two keywords do not coincide on UniRef90 proteins
both PF03175 and PF02945 have PDB structures
PF02945 d.4.1.5
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 560 ) 6698382_PF00046_PF00412 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00046 is 6695559 with Jaccard = 0.8808 |PF00046|=3370 [ 3171 230 1096611 199 ]
parent [ 6695559 ] : 6698382 0.086379 (=207982/(645*3733)) 91.9378
given [ 6695559 ] : 6695559 0.114319 (=21467/(51*3682)) 91.4494
best keyword for cluster 6695559 is PF00046 with Jaccard = 0.8808 [ 3171 230 1096611 199 ] 0.9324 0.9409
sibling [ 6695559 ] : 6686524 0.128315 (=329/(4*641)) 89.5542
best keyword for cluster 6686524 is PF00412 with Jaccard = 0.7463 [ 562 32 1099458 159 ] 0.9461 0.7795
SUGGESTING RELATEDNESS OF:
A> PF00046 ( PF00046 Homeobox domain )
B> PF00412 ( PF00412 LIM domain )
Only A has a clan ( CL0123.12 ).
the two keywords coincide on UniRef90 proteins: |PF00046| = 3370 , |PF00412| = 721 , |PF00046^PF00412| = 105 ( 3.1% and 14.6% )
both PF00046 and PF00412 have PDB structures
PF00046 a.4.1.1 j.92.1.1
SUPERFAM mapping significantly overlapping:
1 PF00046 SSF46689 0.773 (average over 9143 mutual instances, PF00046 9568 appearances, SSF46689 68153 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 561 ) 6743478_PF00551_PF02769 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02769 is 6686008 with Jaccard = 0.8803 |PF02769|=1069 [ 1059 134 1099008 10 ]
parent [ 6686008 ] : 6743478 0.0193508 (=24548/(1367*928)) 98.0966
given [ 6686008 ] : 6686008 0.120242 (=52036/(869*498)) 89.476
best keyword for cluster 6686008 is PF02769 with Jaccard = 0.8803 [ 1059 134 1099008 10 ] 0.8877 0.9906
sibling [ 6686008 ] : 6733475 0.0442287 (=41/(1*927)) 97.1063
best keyword for cluster 6733475 is PF00551 with Jaccard = 0.9203 [ 808 7 1099333 63 ] 0.9914 0.9277
SUGGESTING RELATEDNESS OF:
A> PF02769 ( PF02769 AIR synthase related protein, C-terminal domain )
B> PF00551 ( PF00551 Formyl transferase )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00551| = 871 , |PF02769| = 1069 , |PF00551^PF02769| = 30 ( 3.4% and 2.8% )
both PF02769 and PF00551 have PDB structures
PF02769 d.139.1.1
PF00551 c.65.1.1
SUPERFAM mapping significantly overlapping:
1 PF02769 SSF56042 0.865 (average over 3183 mutual instances, PF02769 6538 appearances, SSF56042 6282 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 562 ) 6628159_PF01579_PF03236 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01579 is 6621820 with Jaccard = 0.8800 |PF01579|=49 [ 44 1 1100161 5 ]
parent [ 6621820 ] : 6628159 0.295733 (=506/(29*59)) 73.7389
given [ 6621820 ] : 6621820 0.307018 (=35/(2*57)) 71.2946
best keyword for cluster 6621820 is PF01579 with Jaccard = 0.8800 [ 44 1 1100161 5 ] 0.9778 0.8980
sibling [ 6621820 ] : 6555485 0.607143 (=102/(8*21)) 43.0595
best keyword for cluster 6555485 is PF03236 with Jaccard = 0.9000 [ 18 2 1100191 0 ] 0.9000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01579 ( PF01579 Domain of unknown function DUF19 )
B> PF03236 ( PF03236 Domain of unknown function DUF263 )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 563 ) 6706799_PF03544_PF05569 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03544 is 6705391 with Jaccard = 0.8798 |PF03544|=234 [ 227 24 1099953 7 ]
parent [ 6705391 ] : 6706799 0.0705605 (=3298/(82*570)) 93.4258
given [ 6705391 ] : 6705391 0.0848109 (=287/(6*564)) 93.1689
best keyword for cluster 6705391 is PF03544 with Jaccard = 0.8798 [ 227 24 1099953 7 ] 0.9044 0.9701
sibling [ 6705391 ] : 6649350 0.225 (=36/(2*80)) 79.8615
best keyword for cluster 6649350 is PF05569 with Jaccard = 0.6700 [ 67 1 1100111 32 ] 0.9853 0.6768
SUGGESTING RELATEDNESS OF:
A> PF03544 ( PF03544 Gram-negative bacterial tonB protein )
B> PF05569 ( PF05569 BlaR1 peptidase M56 )
Only B has a clan ( CL0150.6 ).
the two keywords coincide on UniRef90 proteins: |PF03544| = 234 , |PF05569| = 99 , |PF03544^PF05569| = 13 ( 5.6% and 13.1% )
only PF03544 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 564 ) 6729734_PF00320_PF04855 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00320 is 6728520 with Jaccard = 0.8789 |PF00320|=385 [ 341 3 1099823 44 ]
parent [ 6728520 ] : 6729734 0.037004 (=746/(48*420)) 96.6962
given [ 6728520 ] : 6728520 0.0373171 (=153/(10*410)) 96.548
best keyword for cluster 6728520 is PF00320 with Jaccard = 0.8789 [ 341 3 1099823 44 ] 0.9913 0.8857
sibling [ 6728520 ] : 6671414 0.141304 (=13/(2*46)) 85.8777
best keyword for cluster 6671414 is PF04855 with Jaccard = 1.0000 [ 43 0 1100168 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00320 ( PF00320 GATA zinc finger )
B> PF04855 ( PF04855 SNF5 / SMARCB1 / INI1 )
Only A has a clan ( CL0167.10 ).
the two keywords coincide on UniRef90 proteins: |PF00320| = 385 , |PF04855| = 43 , |PF00320^PF04855| = 2 ( 0.5% and 4.7% )
only PF00320 has a PDB structure (may not be up to date)
PF00320 g.39.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 565 ) 6680903_PF04717_PF06890 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06890 is 6463517 with Jaccard = 0.8788 |PF06890|=33 [ 29 0 1100178 4 ]
parent [ 6463517 ] : 6680903 0.134473 (=346/(31*83)) 88.41
given [ 6463517 ] : 6463517 0.976923 (=127/(26*5)) 2.79599
best keyword for cluster 6463517 is PF06890 with Jaccard = 0.8788 [ 29 0 1100178 4 ] 1.0000 0.8788
sibling [ 6463517 ] : 6669671 0.166915 (=224/(22*61)) 85.4093
best keyword for cluster 6669671 is PF04717 with Jaccard = 0.9273 [ 51 2 1100156 2 ] 0.9623 0.9623
SUGGESTING RELATEDNESS OF:
A> PF06890 ( PF06890 Bacteriophage Mu Gp45 protein )
B> PF04717 ( PF04717 Phage-related baseplate assembly protein )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF04717| = 53 , |PF06890| = 33 , |PF04717^PF06890| = 3 ( 5.7% and 9.1% )
Neither A nor B has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 566 ) 6598068_PF02797_PF08541 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF08541 is 6563150 with Jaccard = 0.8767 |PF08541|=448 [ 398 6 1099757 50 ]
parent [ 6563150 ] : 6598068 0.427155 (=60902/(469*304)) 60.6578
given [ 6563150 ] : 6563150 0.534261 (=499/(2*467)) 49.6942
best keyword for cluster 6563150 is PF08541 with Jaccard = 0.8767 [ 398 6 1099757 50 ] 0.9851 0.8884
sibling [ 6563150 ] : 6597133 0.402318 (=243/(2*302)) 60.3392
best keyword for cluster 6597133 is PF02797 with Jaccard = 0.8990 [ 258 24 1099924 5 ] 0.9149 0.9810
SUGGESTING RELATEDNESS OF:
A> PF08541 ( PF08541 3-Oxoacyl-[acyl-carrier-protein (ACP)] synthase III C terminal )
B> PF02797 ( PF02797 Chalcone and stilbene synthases, C-terminal domain )
they come from the same clan: CL0046.10 : PF02803 PF02801 PF00109 PF01154 PF08392 PF00195 PF02797 PF08545 PF08541 PF00108
the two keywords do not coincide on UniRef90 proteins
both PF08541 and PF02797 have PDB structures
PF02797 c.95.1.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 567 ) 6738378_PF00753_PF07522 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00753 is 6735710 with Jaccard = 0.8752 |PF00753|=3135 [ 2897 175 1096901 238 ]
parent [ 6735710 ] : 6738378 0.0311918 (=10091/(81*3994)) 97.6263
given [ 6735710 ] : 6735710 0.038499 (=65869/(488*3506)) 97.3396
best keyword for cluster 6735710 is PF00753 with Jaccard = 0.8752 [ 2897 175 1096901 238 ] 0.9430 0.9241
sibling [ 6735710 ] : 6717203 0.0579151 (=30/(74*7)) 95.049
best keyword for cluster 6717203 is PF07522 with Jaccard = 0.9153 [ 54 3 1100152 2 ] 0.9474 0.9643
SUGGESTING RELATEDNESS OF:
A> PF00753 ( PF00753 Metallo-beta-lactamase superfamily )
B> PF07522 ( PF07522 DNA repair metallo-beta-lactamase )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF00753 has a PDB structure (may not be up to date)
PF00753 d.157.1.1 d.157.1.2 d.157.1.3 d.157.1.7 d.157.1.9
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 568 ) 6581327_PF03516_PF05474 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05474 is 6534316 with Jaccard = 0.8750 |PF05474|=15 [ 14 1 1100195 1 ]
parent [ 6534316 ] : 6581327 0.478632 (=112/(18*13)) 54.1525
given [ 6534316 ] : 6534316 0.732143 (=41/(4*14)) 28.4559
best keyword for cluster 6534316 is PF05474 with Jaccard = 0.8750 [ 14 1 1100195 1 ] 0.9333 0.9333
sibling [ 6534316 ] : 6560600 0.545455 (=12/(2*11)) 47.2386
best keyword for cluster 6560600 is PF03516 with Jaccard = 0.7143 [ 5 2 1100204 0 ] 0.7143 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05474 ( PF05474 Semenogelin )
B> PF03516 ( PF03516 Filaggrin )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither A nor B has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 569 ) 6712639_PF00004_PF05695 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05695 is 6701117 with Jaccard = 0.8750 |PF05695|=35 [ 35 5 1100171 0 ]
parent [ 6701117 ] : 6712639 0.0646679 (=30521/(65*7261)) 94.3402
given [ 6701117 ] : 6701117 0.0778689 (=19/(4*61)) 92.3812
best keyword for cluster 6701117 is PF05695 with Jaccard = 0.8750 [ 35 5 1100171 0 ] 0.8750 1.0000
sibling [ 6701117 ] : 6711543 0.0669631 (=24623/(51*7210)) 94.1712
best keyword for cluster 6711543 is PF00004 with Jaccard = 0.6365 [ 3979 2107 1093960 165 ] 0.6538 0.9602
SUGGESTING RELATEDNESS OF:
A> PF05695 ( PF05695 Plant protein of unknown function (DUF825) )
B> PF00004 ( PF00004 ATPase family associated with various cellular activities (AAA) )
Only B has a clan ( CL0023.26 ).
the two keywords coincide on UniRef90 proteins: |PF00004| = 4144 , |PF05695| = 35 , |PF00004^PF05695| = 17 ( 0.4% and 48.6% )
only PF05695 has a PDB structure (may not be up to date)
PF00004 c.37.1.1 c.37.1.20
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:
-------------------====== ( 570 ) 6737235_PF00033_PF03161 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03161 is 6632032 with Jaccard = 0.8710 |PF03161|=93 [ 81 0 1100118 12 ]
parent [ 6632032 ] : 6737235 0.0250365 (=7058/(86*3278)) 97.5055
given [ 6632032 ] : 6632032 0.267857 (=45/(2*84)) 75.1462
best keyword for cluster 6632032 is PF03161 with Jaccard = 0.8710 [ 81 0 1100118 12 ] 1.0000 0.8710
sibling [ 6632032 ] : 6735269 0.042722 (=140/(1*3277)) 97.2996
best keyword for cluster 6735269 is PF00033 with Jaccard = 0.9322 [ 2927 199 1097071 14 ] 0.9363 0.9952
SUGGESTING RELATEDNESS OF:
A> PF03161 ( PF03161 LAGLIDADG DNA endonuclease family )
B> PF00033 ( PF00033 Cytochrome b(N-terminal)/b6/petB )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00033| = 2941 , |PF03161| = 93 , |PF00033^PF03161| = 3 ( 0.1% and 3.2% )
both PF03161 and PF00033 have PDB structures
PF03161 d.95.2.1
PF00033 f.21.1.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 571 ) 6724780_PF03724_PF04170 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03724 is 6685476 with Jaccard = 0.8692 |PF03724|=107 [ 93 0 1100104 14 ]
parent [ 6685476 ] : 6724780 0.0420338 (=291/(43*161)) 96.0822
given [ 6685476 ] : 6685476 0.112609 (=618/(49*112)) 89.369
best keyword for cluster 6685476 is PF03724 with Jaccard = 0.8692 [ 93 0 1100104 14 ] 1.0000 0.8692
sibling [ 6685476 ] : 6666141 0.158537 (=13/(2*41)) 84.5004
best keyword for cluster 6666141 is PF04170 with Jaccard = 0.9524 [ 20 1 1100190 0 ] 0.9524 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03724 ( PF03724 Domain of unknown function (306) )
B> PF04170 ( PF04170 Uncharacterized lipoprotein NlpE involved in copper resistance )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03724| = 107 , |PF04170| = 20 , |PF03724^PF04170| = 1 ( 0.9% and 5.0% )
Neither PF03724 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 572 ) 6774576_PF02674_PF06900 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02674 is 6772591 with Jaccard = 0.8684 |PF02674|=188 [ 165 2 1100021 23 ]
parent [ 6772591 ] : 6774576 0.00203037 (=23/(48*236)) 99.8453
given [ 6772591 ] : 6772591 0.00274725 (=15/(26*210)) 99.7927
best keyword for cluster 6772591 is PF02674 with Jaccard = 0.8684 [ 165 2 1100021 23 ] 0.9880 0.8777
sibling [ 6772591 ] : 6770289 0.0037037 (=2/(18*30)) 99.7204
best keyword for cluster 6770289 is PF06900 with Jaccard = 1.0000 [ 5 0 1100206 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF02674 ( PF02674 Colicin V production protein )
B> PF06900 ( PF06900 Protein of unknown function (DUF1270) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF02674 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 573 ) 6721711_PF04271_PF04492 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04271 is 6711946 with Jaccard = 0.8673 |PF04271|=92 [ 85 6 1100113 7 ]
parent [ 6711946 ] : 6721711 0.0456731 (=475/(52*200)) 95.6656
given [ 6711946 ] : 6711946 0.0652778 (=235/(20*180)) 94.2464
best keyword for cluster 6711946 is PF04271 with Jaccard = 0.8673 [ 85 6 1100113 7 ] 0.9341 0.9239
sibling [ 6711946 ] : 6673410 0.139881 (=94/(24*28)) 86.4985
best keyword for cluster 6673410 is PF04492 with Jaccard = 0.6667 [ 18 5 1100184 4 ] 0.7826 0.8182
SUGGESTING RELATEDNESS OF:
A> PF04271 ( PF04271 DnaD-like domain )
B> PF04492 ( PF04492 Bacteriophage replication protein O )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04271 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 574 ) 6622276_PF02309_PF02362 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02309 is 6606108 with Jaccard = 0.8662 |PF02309|=142 [ 123 0 1100069 19 ]
parent [ 6606108 ] : 6622276 0.293164 (=6459/(153*144)) 71.4924
given [ 6606108 ] : 6606108 0.369718 (=105/(2*142)) 64.5477
best keyword for cluster 6606108 is PF02309 with Jaccard = 0.8662 [ 123 0 1100069 19 ] 1.0000 0.8662
sibling [ 6606108 ] : 6599331 0.405721 (=1773/(38*115)) 61.2884
best keyword for cluster 6599331 is PF02362 with Jaccard = 0.6256 [ 137 1 1099992 81 ] 0.9928 0.6284
SUGGESTING RELATEDNESS OF:
A> PF02309 ( PF02309 AUX/IAA family )
B> PF02362 ( PF02362 B3 DNA binding domain )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF02309| = 142 , |PF02362| = 218 , |PF02309^PF02362| = 14 ( 9.9% and 6.4% )
only PF02309 has a PDB structure (may not be up to date)
PF02362 b.142.1.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 575 ) 6475621_PF00749_PF03950 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03950 is 6323607 with Jaccard = 0.8655 |PF03950|=216 [ 193 7 1099988 23 ]
parent [ 6323607 ] : 6475621 0.959779 (=101845/(217*489)) 4.78999
given [ 6323607 ] : 6323607 1 (=5610/(30*187)) 7.13173e-08
best keyword for cluster 6323607 is PF03950 with Jaccard = 0.8655 [ 193 7 1099988 23 ] 0.9650 0.8935
sibling [ 6323607 ] : 6445136 0.991964 (=5678/(12*477)) 0.950106
best keyword for cluster 6445136 is PF00749 with Jaccard = 0.6706 [ 450 0 1099540 221 ] 1.0000 0.6706
SUGGESTING RELATEDNESS OF:
A> PF03950 ( PF03950 tRNA synthetases class I (E and Q), anti-codon binding domain )
B> PF00749 ( PF00749 tRNA synthetases class I (E and Q), catalytic domain )
Only B has a clan ( CL0038.9 ).
the two keywords coincide on UniRef90 proteins: |PF00749| = 671 , |PF03950| = 216 , |PF00749^PF03950| = 210 ( 31.3% and 97.2% )
both PF03950 and PF00749 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF03950 SSF50715 0.905 (average over 619 mutual instances, PF03950 710 appearances, SSF50715 2049 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 576 ) 6753665_PF00337_PF04099 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00337 is 6752320 with Jaccard = 0.8651 |PF00337|=212 [ 186 3 1099996 26 ]
parent [ 6752320 ] : 6753665 0.0114014 (=343/(138*218)) 98.8814
given [ 6752320 ] : 6752320 0.0143541 (=27/(9*209)) 98.7911
best keyword for cluster 6752320 is PF00337 with Jaccard = 0.8651 [ 186 3 1099996 26 ] 0.9841 0.8774
sibling [ 6752320 ] : 6681927 0.134615 (=630/(78*60)) 88.6667
best keyword for cluster 6681927 is PF04099 with Jaccard = 0.7653 [ 75 23 1100113 0 ] 0.7653 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00337 ( PF00337 Galactoside-binding lectin )
B> PF04099 ( PF04099 Sybindin-like family )
A and B come from a different clan ( CL0004.14 , CL0212.4 ).
the two keywords coincide on UniRef90 proteins: |PF00337| = 212 , |PF04099| = 75 , |PF00337^PF04099| = 2 ( 0.9% and 2.7% )
only PF00337 has a PDB structure (may not be up to date)
PF00337 b.29.1.3
SUPERFAM mapping significantly overlapping:
1 PF00337 SSF49899 0.926 (average over 447 mutual instances, PF00337 453 appearances, SSF49899 14070 appearances)
2 PF04099 SSF64356 0.954 (average over 167 mutual instances, PF04099 170 appearances, SSF64356 1711 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 577 ) 6595370_PF00386_PF01391 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00386 is 6579905 with Jaccard = 0.8649 |PF00386|=174 [ 160 11 1100026 14 ]
parent [ 6579905 ] : 6595370 0.42048 (=68339/(182*893)) 59.5189
given [ 6579905 ] : 6579905 0.481875 (=2313/(32*150)) 53.6781
best keyword for cluster 6579905 is PF00386 with Jaccard = 0.8649 [ 160 11 1100026 14 ] 0.9357 0.9195
sibling [ 6579905 ] : 6593540 0.434637 (=12335/(33*860)) 58.72
best keyword for cluster 6593540 is PF01391 with Jaccard = 0.6285 [ 751 36 1099016 408 ] 0.9543 0.6480
SUGGESTING RELATEDNESS OF:
A> PF00386 ( PF00386 C1q domain )
B> PF01391 ( PF01391 Collagen triple helix repeat (20 copies) )
Only A has a clan ( CL0100.7 ).
the two keywords coincide on UniRef90 proteins: |PF00386| = 174 , |PF01391| = 1159 , |PF00386^PF01391| = 90 ( 51.7% and 7.8% )
both PF00386 and PF01391 have PDB structures
PF00386 b.22.1.1
PF01391 d.169.1.5 h.1.1.1
SUPERFAM mapping significantly overlapping:
1 PF00386 SSF49842 0.910 (average over 373 mutual instances, PF00386 377 appearances, SSF49842 1081 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 578 ) 6682017_PF01895_PF02690 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01895 is 6601900 with Jaccard = 0.8649 |PF01895|=277 [ 256 19 1099915 21 ]
parent [ 6601900 ] : 6682017 0.125154 (=6827/(319*171)) 88.6983
given [ 6601900 ] : 6601900 0.413011 (=5555/(50*269)) 62.519
best keyword for cluster 6601900 is PF01895 with Jaccard = 0.8649 [ 256 19 1099915 21 ] 0.9309 0.9242
sibling [ 6601900 ] : 6630073 0.257396 (=87/(2*169)) 74.6247
best keyword for cluster 6630073 is PF02690 with Jaccard = 0.9936 [ 155 1 1100055 0 ] 0.9936 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01895 ( PF01895 PhoU family )
B> PF02690 ( PF02690 Na+/Pi-cotransporter )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01895| = 277 , |PF02690| = 155 , |PF01895^PF02690| = 11 ( 4.0% and 7.1% )
only PF01895 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 579 ) 6741682_PF02992_PF03004 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03004 is 6735857 with Jaccard = 0.8630 |PF03004|=71 [ 63 2 1100138 8 ]
parent [ 6735857 ] : 6741682 0.0225235 (=2994/(134*992)) 97.9417
given [ 6735857 ] : 6735857 0.0282258 (=35/(10*124)) 97.3586
best keyword for cluster 6735857 is PF03004 with Jaccard = 0.8630 [ 63 2 1100138 8 ] 0.9692 0.8873
sibling [ 6735857 ] : 6737777 0.0292634 (=29/(1*991)) 97.5643
best keyword for cluster 6737777 is PF02992 with Jaccard = 0.7841 [ 385 98 1099720 8 ] 0.7971 0.9796
SUGGESTING RELATEDNESS OF:
A> PF03004 ( PF03004 Plant transposase (Ptta/En/Spm family) )
B> PF02992 ( PF02992 Transposase family tnp2 )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF02992| = 393 , |PF03004| = 71 , |PF02992^PF03004| = 5 ( 1.3% and 7.0% )
Neither PF03004 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 580 ) 6693891_PF00132_PF00483 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00132 is 6676082 with Jaccard = 0.8619 |PF00132|=2358 [ 2053 24 1097829 305 ]
parent [ 6676082 ] : 6693891 0.101199 (=372306/(2438*1509)) 91.0764
given [ 6676082 ] : 6676082 0.162283 (=9014/(23*2415)) 87.1826
best keyword for cluster 6676082 is PF00132 with Jaccard = 0.8619 [ 2053 24 1097829 305 ] 0.9884 0.8707
sibling [ 6676082 ] : 6677300 0.133046 (=401/(2*1507)) 87.5139
best keyword for cluster 6677300 is PF00483 with Jaccard = 0.7386 [ 1297 58 1098455 401 ] 0.9572 0.7638
SUGGESTING RELATEDNESS OF:
A> PF00132 ( PF00132 Bacterial transferase hexapeptide (three repeats) )
B> PF00483 ( PF00483 Nucleotidyl transferase )
Only B has a clan ( CL0110.6 ).
the two keywords coincide on UniRef90 proteins: |PF00132| = 2358 , |PF00483| = 1698 , |PF00132^PF00483| = 331 ( 14.0% and 19.5% )
both PF00132 and PF00483 have PDB structures
PF00132 b.81.1.1 b.81.1.2 b.81.1.3 b.81.1.5 b.81.1.6
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 581 ) 6608488_PF00378_PF00725 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00378 is 6602365 with Jaccard = 0.8614 |PF00378|=2033 [ 1752 1 1098177 281 ]
parent [ 6602365 ] : 6608488 0.346156 (=428907/(644*1924)) 66.0146
given [ 6602365 ] : 6602365 0.413111 (=1588/(2*1922)) 62.9686
best keyword for cluster 6602365 is PF00378 with Jaccard = 0.8614 [ 1752 1 1098177 281 ] 0.9994 0.8618
sibling [ 6602365 ] : 6593490 0.479459 (=922/(3*641)) 58.6697
best keyword for cluster 6593490 is PF00725 with Jaccard = 0.9474 [ 576 12 1099603 20 ] 0.9796 0.9664
SUGGESTING RELATEDNESS OF:
A> PF00378 ( PF00378 Enoyl-CoA hydratase/isomerase family )
B> PF00725 ( PF00725 3-hydroxyacyl-CoA dehydrogenase, C-terminal domain )
A and B come from a different clan ( CL0127.6 , CL0106.7 ).
the two keywords coincide on UniRef90 proteins: |PF00378| = 2033 , |PF00725| = 596 , |PF00378^PF00725| = 241 ( 11.9% and 40.4% )
both PF00378 and PF00725 have PDB structures
PF00378 c.14.1.3
PF00725 a.100.1.3
SUPERFAM mapping significantly overlapping:
1 PF00725 SSF48179 0.569 (average over 1877 mutual instances, PF00725 3749 appearances, SSF48179 20570 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 582 ) 6692694_PF00795_PF02540 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00795 is 6676809 with Jaccard = 0.8609 |PF00795|=1201 [ 1034 0 1099010 167 ]
parent [ 6676809 ] : 6692694 0.10092 (=53595/(459*1157)) 90.8178
given [ 6676809 ] : 6676809 0.1492 (=31811/(230*927)) 87.4004
best keyword for cluster 6676809 is PF00795 with Jaccard = 0.8609 [ 1034 0 1099010 167 ] 1.0000 0.8609
sibling [ 6676809 ] : 6651913 0.228995 (=8280/(358*101)) 80.6531
best keyword for cluster 6651913 is PF02540 with Jaccard = 0.7392 [ 326 83 1099770 32 ] 0.7971 0.9106
SUGGESTING RELATEDNESS OF:
A> PF00795 ( PF00795 Carbon-nitrogen hydrolase )
B> PF02540 ( PF02540 NAD synthase )
Only B has a clan ( CL0039.7 ).
the two keywords coincide on UniRef90 proteins: |PF00795| = 1201 , |PF02540| = 358 , |PF00795^PF02540| = 150 ( 12.5% and 41.9% )
both PF00795 and PF02540 have PDB structures
PF00795 d.160.1.1 d.160.1.2
SUPERFAM mapping significantly overlapping:
1 PF00795 SSF56317 0.652 (average over 3308 mutual instances, PF00795 3415 appearances, SSF56317 3928 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 583 ) 6636556_PF00346_PF00374 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00374 is 6600211 with Jaccard = 0.8606 |PF00374|=287 [ 247 0 1099924 40 ]
parent [ 6600211 ] : 6636556 0.271283 (=26452/(281*347)) 76.0845
given [ 6600211 ] : 6600211 0.456989 (=255/(2*279)) 61.7438
best keyword for cluster 6600211 is PF00374 with Jaccard = 0.8606 [ 247 0 1099924 40 ] 1.0000 0.8606
sibling [ 6600211 ] : 6612724 0.347826 (=240/(2*345)) 67.6607
best keyword for cluster 6612724 is PF00346 with Jaccard = 0.9904 [ 309 3 1099899 0 ] 0.9904 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00374 ( PF00374 Nickel-dependent hydrogenase )
B> PF00346 ( PF00346 Respiratory-chain NADH dehydrogenase, 49 Kd subunit )
Neither A nor B are assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00346| = 309 , |PF00374| = 287 , |PF00346^PF00374| = 37 ( 12.0% and 12.9% )
only PF00374 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 584 ) 6687705_PF06151_PF08395 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08395 is 6658598 with Jaccard = 0.8598 |PF08395|=164 [ 141 0 1100047 23 ]
parent [ 6658598 ] : 6687705 0.131944 (=475/(25*144)) 89.7932
given [ 6658598 ] : 6658598 0.198354 (=241/(9*135)) 82.987
best keyword for cluster 6658598 is PF08395 with Jaccard = 0.8598 [ 141 0 1100047 23 ] 1.0000 0.8598
sibling [ 6658598 ] : 6613064 0.378788 (=25/(22*3)) 67.8268
best keyword for cluster 6613064 is PF06151 with Jaccard = 0.8333 [ 20 4 1100187 0 ] 0.8333 1.0000
SUGGESTING RELATEDNESS OF:
A> PF08395 ( PF08395 7tm Chemosensory receptor )
B> PF06151 ( PF06151 Trehalose receptor )
they come from the same clan: CL0176.5 : PF02949 PF08395 PF03268 PF06151
the two keywords do not coincide on UniRef90 proteins
Neither PF08395 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 585 ) 6737788_PF05514_PF07681 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF07681 is 6729047 with Jaccard = 0.8595 |PF07681|=443 [ 422 48 1099720 21 ]
parent [ 6729047 ] : 6737788 0.0283 (=280/(17*582)) 97.5656
given [ 6729047 ] : 6729047 0.0440906 (=1340/(58*524)) 96.6143
best keyword for cluster 6729047 is PF07681 with Jaccard = 0.8595 [ 422 48 1099720 21 ] 0.8979 0.9526
sibling [ 6729047 ] : 6701508 0.1 (=6/(12*5)) 92.4433
best keyword for cluster 6701508 is PF05514 with Jaccard = 0.9231 [ 12 1 1100198 0 ] 0.9231 1.0000
SUGGESTING RELATEDNESS OF:
A> PF07681 ( PF07681 DoxX )
B> PF05514 ( PF05514 HR-like lesion-inducing )
Only A has a clan ( CL0131.6 ).
the two keywords do not coincide on UniRef90 proteins
Neither PF07681 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 586 ) 6646199_PF00131_PF01439 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01439 is 6630315 with Jaccard = 0.8581 |PF01439|=144 [ 133 11 1100056 11 ]
parent [ 6630315 ] : 6646199 0.292884 (=9261/(186*170)) 78.7596
given [ 6630315 ] : 6630315 0.375 (=369/(6*164)) 74.6983
best keyword for cluster 6630315 is PF01439 with Jaccard = 0.8581 [ 133 11 1100056 11 ] 0.9236 0.9236
sibling [ 6630315 ] : 6628254 0.315164 (=1353/(27*159)) 73.8056
best keyword for cluster 6628254 is PF00131 with Jaccard = 0.7431 [ 81 23 1100102 5 ] 0.7788 0.9419
SUGGESTING RELATEDNESS OF:
A> PF01439 ( PF01439 Metallothionein )
B> PF00131 ( PF00131 Metallothionein )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF01439 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 587 ) 6556301_PF04877_PF07132 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04877 is 5618755 with Jaccard = 0.8571 |PF04877|=7 [ 6 0 1100204 1 ]
parent [ 5618755 ] : 6556301 0.625 (=30/(6*8)) 43.9395
given [ 5618755 ] : 5618755 1 (=9/(3*3)) 2.47778e-77
best keyword for cluster 5618755 is PF04877 with Jaccard = 0.8571 [ 6 0 1100204 1 ] 1.0000 0.8571
sibling [ 5618755 ] : 6363179 1 (=7/(1*7)) 3.91429e-05
best keyword for cluster 6363179 is PF07132 with Jaccard = 0.8750 [ 7 1 1100203 0 ] 0.8750 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04877 ( PF04877 HrpZ )
B> PF07132 ( PF07132 Harpin protein (HrpN) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF04877 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 588 ) 6734062_PF01663_PF01676 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01663 is 6689075 with Jaccard = 0.8527 |PF01663|=306 [ 301 47 1099858 5 ]
parent [ 6689075 ] : 6734062 0.0374571 (=6857/(439*417)) 97.1713
given [ 6689075 ] : 6689075 0.124902 (=640/(12*427)) 90.0879
best keyword for cluster 6689075 is PF01663 with Jaccard = 0.8527 [ 301 47 1099858 5 ] 0.8649 0.9837
sibling [ 6689075 ] : 6713780 0.0596591 (=735/(385*32)) 94.5262
best keyword for cluster 6713780 is PF01676 with Jaccard = 0.9446 [ 341 11 1099850 9 ] 0.9688 0.9743
SUGGESTING RELATEDNESS OF:
A> PF01663 ( PF01663 Type I phosphodiesterase / nucleotide pyrophosphatase )
B> PF01676 ( PF01676 Metalloenzyme superfamily )
they come from the same clan: CL0088.10 : PF00884 PF01663 PF08665 PF01676 PF02995 PF07394 PF00245
the two keywords do not coincide on UniRef90 proteins
both PF01663 and PF01676 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 589 ) 6674985_PF05646_PF07019 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05646 is 6664519 with Jaccard = 0.8519 |PF05646|=27 [ 23 0 1100184 4 ]
parent [ 6664519 ] : 6674985 0.16129 (=110/(22*31)) 86.8959
given [ 6664519 ] : 6664519 0.166667 (=18/(27*4)) 84.1916
best keyword for cluster 6664519 is PF05646 with Jaccard = 0.8519 [ 23 0 1100184 4 ] 1.0000 0.8519
sibling [ 6664519 ] : 6602333 0.4 (=16/(2*20)) 62.9261
best keyword for cluster 6602333 is PF07019 with Jaccard = 0.9444 [ 17 1 1100193 0 ] 0.9444 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05646 ( PF05646 Protein of unknown function (DUF786) )
B> PF07019 ( PF07019 Rab5-interacting protein (Rab5ip) )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05646 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 590 ) 6651854_PF00069_PF08311 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08311 is 6580339 with Jaccard = 0.8491 |PF08311|=46 [ 45 7 1100158 1 ]
parent [ 6580339 ] : 6651854 0.225494 (=165733/(58*12672)) 80.6142
given [ 6580339 ] : 6580339 0.5 (=56/(2*56)) 53.832
best keyword for cluster 6580339 is PF08311 with Jaccard = 0.8491 [ 45 7 1100158 1 ] 0.8654 0.9783
sibling [ 6580339 ] : 6650440 0.237647 (=57132/(19*12653)) 80.1955
best keyword for cluster 6650440 is PF00069 with Jaccard = 0.7752 [ 10205 1790 1087046 1170 ] 0.8508 0.8971
SUGGESTING RELATEDNESS OF:
A> PF08311 ( PF08311 Mad3/BUB1 homology region 1 )
B> PF00069 ( PF00069 Protein kinase domain )
Only B has a clan ( CL0016.14 ).
the two keywords coincide on UniRef90 proteins: |PF00069| = 11375 , |PF08311| = 46 , |PF00069^PF08311| = 24 ( 0.2% and 52.2% )
only PF08311 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF00069 SSF56112 0.797 (average over 32363 mutual instances, PF00069 36405 appearances, SSF56112 66637 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 591 ) 6468685_PF02134_PF05237 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF02134 is 6428878 with Jaccard = 0.8485 |PF02134|=157 [ 140 8 1100046 17 ]
parent [ 6428878 ] : 6468685 0.970412 (=79959/(149*553)) 3.51969
given [ 6428878 ] : 6428878 0.997629 (=4208/(38*111)) 0.254901
best keyword for cluster 6428878 is PF02134 with Jaccard = 0.8485 [ 140 8 1100046 17 ] 0.9459 0.8917
sibling [ 6428878 ] : 6461015 0.980108 (=53608/(129*424)) 2.44493
best keyword for cluster 6461015 is PF05237 with Jaccard = 0.6440 [ 331 178 1099697 5 ] 0.6503 0.9851
SUGGESTING RELATEDNESS OF:
A> PF02134 ( PF02134 Repeat in ubiquitin-activating (UBA) protein )
B> PF05237 ( PF05237 MoeZ/MoeB domain )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF02134 and PF05237 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 592 ) 6778122_PF04406_PF06882 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06882 is 6770500 with Jaccard = 0.8485 |PF06882|=64 [ 56 2 1100145 8 ]
parent [ 6770500 ] : 6778122 0.000804934 (=65/(412*196)) 99.9224
given [ 6770500 ] : 6770500 0.00282636 (=112/(153*259)) 99.7276
best keyword for cluster 6770500 is PF06882 with Jaccard = 0.8485 [ 56 2 1100145 8 ] 0.9655 0.8750
sibling [ 6770500 ] : 6774757 0.0019971 (=11/(34*162)) 99.8496
best keyword for cluster 6774757 is PF04406 with Jaccard = 0.9500 [ 57 3 1100151 0 ] 0.9500 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06882 ( PF06882 Protein of unknown function (DUF1263) )
B> PF04406 ( PF04406 Type IIB DNA topoisomerase )
Neither A nor B are assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF06882 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 593 ) 6754793_PF04977_PF04999 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04977 is 6713845 with Jaccard = 0.8462 |PF04977|=208 [ 176 0 1100003 32 ]
parent [ 6713845 ] : 6754793 0.0135887 (=481/(207*171)) 98.9564
given [ 6713845 ] : 6713845 0.0691942 (=407/(34*173)) 94.5381
best keyword for cluster 6713845 is PF04977 with Jaccard = 0.8462 [ 176 0 1100003 32 ] 1.0000 0.8462
sibling [ 6713845 ] : 6745174 0.0247302 (=110/(32*139)) 98.247
best keyword for cluster 6745174 is PF04999 with Jaccard = 0.7558 [ 65 21 1100125 0 ] 0.7558 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04977 ( PF04977 Septum formation initiator )
B> PF04999 ( PF04999 Cell division protein FtsL )
they come from the same clan: CL0225.3 : PF04977 PF04999
the two keywords coincide on UniRef90 proteins: |PF04977| = 208 , |PF04999| = 65 , |PF04977^PF04999| = 1 ( 0.5% and 1.5% )
Neither PF04977 have structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 594 ) 6643783_PF00060_PF00497 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00497 is 6639432 with Jaccard = 0.8442 |PF00497|=995 [ 840 0 1099216 155 ]
parent [ 6639432 ] : 6643783 0.252819 (=79745/(303*1041)) 78.0743
given [ 6639432 ] : 6639432 0.245878 (=2535/(10*1031)) 76.851
best keyword for cluster 6639432 is PF00497 with Jaccard = 0.8442 [ 840 0 1099216 155 ] 1.0000 0.8442
sibling [ 6639432 ] : 6629027 0.273288 (=487/(6*297)) 74.0437
best keyword for cluster 6629027 is PF00060 with Jaccard = 0.8930 [ 242 29 1099940 0 ] 0.8930 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00497 ( PF00497 Bacterial extracellular solute-binding proteins, family 3 )
B> PF00060 ( PF00060 Ligand-gated ion channel )
Only A has a clan ( CL0177.7 ).
the two keywords coincide on UniRef90 proteins: |PF00060| = 242 , |PF00497| = 995 , |PF00060^PF00497| = 5 ( 2.1% and 0.5% )
both PF00497 and PF00060 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 595 ) 6646403_PF02117_PF02175 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
====================== best cluster for keyword PF02117 is 6561562 with Jaccard = 0.8438 |PF02117|=64 [ 54 0 1100147 10 ] parent [ 6561562 ] : 6646403 0.229237 (=817/(54*66)) 78.9067 given [ 6561562 ] : 6561562 0.529412 (=81/(3*51)) 48.0511 best keyword for cluster 6561562 is PF02117 with Jaccard = 0.8438 [ 54 0 1100147 10 ] 1.0000 0.8438 sibling [ 6561562 ] : 6626574 0.302419 (=75/(4*62)) 73.1491 best keyword for cluster 6626574 is PF02175 with Jaccard = 0.7021 [ 33 12 1100164 2 ] 0.7333 0.9429 SUGGESTING RELATEDNESS OF: A> PF02117 ( PF02117 C.elegans Sra family integral membrane protein ) B> PF02175 ( PF02175 C.elegans integral membrane protein Srb ) they come from the same clan: CL0138.6 : PF02117 PF02175 PF03125 the two keywords do not coincide on UniRef90 proteins Neither PF02117 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 596 ) 6753694_PF05154_PF07754 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF05154 is 6725797 with Jaccard = 0.8417 |PF05154|=139 [ 117 0 1100072 22 ] parent [ 6725797 ] : 6753694 0.0157154 (=1347/(176*487)) 98.8841 given [ 6725797 ] : 6725797 0.0416667 (=82/(164*12)) 96.208 best keyword for cluster 6725797 is PF05154 with Jaccard = 0.8417 [ 117 0 1100072 22 ] 1.0000 0.8417 sibling [ 6725797 ] : 6750033 0.0229825 (=131/(12*475)) 98.6209 best keyword for cluster 6750033 is PF07754 with Jaccard = 0.6667 [ 18 8 1100184 1 ] 0.6923 0.9474 SUGGESTING RELATEDNESS OF: A> PF05154 ( PF05154 TM2 domain ) B> PF07754 ( PF07754 Domain of unknown function (DUF1610) ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins
Neither PF05154 nor PF07754 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 597 ) 6613166_PF00441_PF08028 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02770 is 6605561 with Jaccard = 0.8409 |PF02770|=1925 [ 1813 231 1098055 112 ]
parent [ 6605561 ] : 6613166 0.341475 (=217145/(281*2263)) 67.8683
given [ 6605561 ] : 6605561 0.382416 (=125697/(2107*156)) 64.4188
best keyword for cluster 6605561 is PF00441 with Jaccard = 0.9296 [ 1927 117 1098138 29 ] 0.9428 0.9852
sibling [ 6605561 ] : 6555437 0.599206 (=4379/(252*29)) 43.0147
best keyword for cluster 6555437 is PF08028 with Jaccard = 0.8674 [ 229 23 1099947 12 ] 0.9087 0.9502
SUGGESTING RELATEDNESS OF:
A> PF00441 ( PF00441 Acyl-CoA dehydrogenase, C-terminal domain )
B> PF08028 ( PF08028 Acyl-CoA dehydrogenase, C-terminal domain )
they come from the same clan: CL0087.7 : PF01756 PF00441 PF08028
the two keywords do not coincide on UniRef90 proteins
only PF00441 has a PDB structure (may not be up to date)
PF00441 a.29.3.1
SUPERFAM mapping significantly overlapping:
1 PF08028 SSF47203 0.844 (average over 653 mutual instances, PF08028 1323 appearances, SSF47203 17996 appearances)
2 PF00441 SSF47203 0.910 (average over 6570 mutual instances, PF00441 13147 appearances, SSF47203 17996 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 598 ) 6516641_PF00759_PF02927 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF02927 is 6455799 with Jaccard = 0.8409 |PF02927|=39 [ 37 5 1100167 2 ]
parent [ 6455799 ] : 6516641 0.84375 (=7425/(44*200)) 18.4615
given [ 6455799 ] : 6455799 0.983333 (=472/(20*24)) 1.8588
best keyword for cluster 6455799 is PF02927 with Jaccard = 0.8409 [ 37 5 1100167 2 ] 0.8810 0.9487
sibling [ 6455799 ] : 6514071 0.831633 (=652/(4*196)) 17.0268
best keyword for cluster 6514071 is PF00759 with Jaccard = 0.7471 [ 195 0 1099950 66 ] 1.0000 0.7471
SUGGESTING RELATEDNESS OF:
A> PF02927 ( PF02927 N-terminal ig-like domain of cellulase )
B> PF00759 ( PF00759 Glycosyl hydrolase family 9 )
Only B has a clan ( CL0059.10 ).
the two keywords coincide on UniRef90 proteins: |PF00759| = 261 , |PF02927| = 39 , |PF00759^PF02927| = 39 ( 14.9% and 100.0% )
both PF02927 and PF00759 have PDB structures
PF02927 b.1.18.2 b.18.1.24
PF00759 a.102.1.2
SUPERFAM mapping significantly overlapping:
1 PF00759 SSF48208 0.965 (average over 621 mutual instances, PF00759 956 appearances, SSF48208 6032 appearances)
2 PF02927 SSF81296 0.845 (average over 77 mutual instances, PF02927 221 appearances, SSF81296 30857 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 599 ) 6709420_PF00266_PF01053 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00266 is 6704692 with Jaccard = 0.8407 |PF00266|=1537 [ 1304 14 1098660 233 ]
parent [ 6704692 ] : 6709420 0.0748695 (=117157/(1046*1496)) 93.8493
given [ 6704692 ] : 6704692 0.0738941 (=441/(4*1492)) 93.014
best keyword for cluster 6704692 is PF00266 with Jaccard = 0.8407 [ 1304 14 1098660 233 ] 0.9894 0.8484
sibling [ 6704692 ] : 6687938 0.123536 (=11055/(94*952)) 89.8447
best keyword for cluster 6687938 is PF01053 with Jaccard = 0.8562 [ 810 131 1099265 5 ] 0.8608 0.9939
SUGGESTING RELATEDNESS OF:
A> PF00266 ( PF00266 Aminotransferase class-V )
B> PF01053 ( PF01053 Cys/Met metabolism PLP-dependent enzyme )
they come from the same clan: CL0061.8 : PF05889 PF00464 PF03841 PF00282 PF01276 PF02347 PF01041 PF01053 PF01212 PF00266 PF00202 PF00155 PF06838 PF04864
the two keywords coincide on UniRef90 proteins: |PF00266| = 1537 , |PF01053| = 815 , |PF00266^PF01053| = 1 ( 0.1% and 0.1% )
both PF00266 and PF01053 have PDB structures
PF00266 c.67.1.3 c.67.1.4
PF01053 c.67.1.3
SUPERFAM mapping significantly overlapping:
1 PF00266 SSF53383 0.863 (average over 4864 mutual instances, PF00266 4914 appearances, SSF53383 34644 appearances)
2 PF01053 SSF53383 0.965 (average over 2570 mutual instances, PF01053 2583 appearances, SSF53383 34644 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 600 ) 6676383_PF00969_PF07654 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00969 is 6668343 with Jaccard = 0.8404 |PF00969|=1642 [ 1380 0 1098569 262 ]
parent [ 6668343 ] : 6676383 0.1361 (=403911/(1433*2071)) 87.2854
given [ 6668343 ] : 6668343 0.161995 (=1387/(6*1427)) 85.0401
best keyword for cluster 6668343 is PF00969 with Jaccard = 0.8404 [ 1380 0 1098569 262 ] 1.0000 0.8404
sibling [ 6668343 ] : 6672045 0.150629 (=1556/(5*2066)) 86.0136
best keyword for cluster 6672045 is PF07654 with Jaccard = 0.6767 [ 1411 592 1098126 82 ] 0.7044 0.9451
SUGGESTING RELATEDNESS OF:
A> PF00969 ( PF00969 Class II histocompatibility antigen, beta domain )
B> PF07654 ( PF07654 Immunoglobulin C1-set domain )
Only B has a clan ( CL0011.18 ).
the two keywords coincide on UniRef90 proteins: |PF00969| = 1642 , |PF07654| = 1493 , |PF00969^PF07654| = 281 ( 17.1% and 18.8% )
both PF00969 and PF07654 have PDB structures
PF07654 b.1.1.2
SUPERFAM mapping significantly overlapping:
1 PF00969 SSF54452 0.877 (average over 8969 mutual instances, PF00969 8969 appearances, SSF54452 25772 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 601 ) 6635446_PF03534_PF05593 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03534 is 6514144 with Jaccard = 0.8400 |PF03534|=23 [ 21 2 1100186 2 ]
parent [ 6514144 ] : 6635446 0.253386 (=3180/(25*502)) 75.8558
given [ 6514144 ] : 6514144 0.833333 (=20/(1*24)) 17.076
best keyword for cluster 6514144 is PF03534 with Jaccard = 0.8400 [ 21 2 1100186 2 ] 0.9130 0.9130
sibling [ 6514144 ] : 6633457 0.287075 (=1688/(12*490)) 75.4311
best keyword for cluster 6633457 is PF05593 with Jaccard = 0.9171 [ 321 11 1099861 18 ] 0.9669 0.9469
SUGGESTING RELATEDNESS OF:
A> PF03534 ( PF03534 Salmonella virulence plasmid 65kDa B protein )
B> PF05593 ( PF05593 RHS Repeat )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03534| = 23 , |PF05593| = 339 , |PF03534^PF05593| = 4 ( 17.4% and 1.2% )
Neither PF03534 nor PF05593 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 602 ) 6703032_PF00332_PF03198 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00332 is 6657782 with Jaccard = 0.8398 |PF00332|=313 [ 304 49 1099849 9 ]
parent [ 6657782 ] : 6703032 0.0838167 (=3150/(437*86)) 92.7213
given [ 6657782 ] : 6657782 0.209872 (=7220/(334*103)) 82.6347
best keyword for cluster 6657782 is PF00332 with Jaccard = 0.8398 [ 304 49 1099849 9 ] 0.8612 0.9712
sibling [ 6657782 ] : 6676339 0.136546 (=34/(83*3)) 87.2635
best keyword for cluster 6676339 is PF03198 with Jaccard = 0.9878 [ 81 1 1100129 0 ] 0.9878 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00332 ( PF00332 Glycosyl hydrolases family 17 )
B> PF03198 ( PF03198 Glycolipid anchored surface protein (GAS1) )
they come from the same clan: CL0058.10 : PF07971 PF02446 PF03198 PF02324 PF02057 PF01630 PF07745 PF02449 PF01229 PF01301 PF01055 PF02055 PF00933 PF02836 PF02156 PF01183 PF00728 PF00704 PF00332 PF01373 PF00331 PF00232 PF02638 PF00150 PF00128 PF02065
the two keywords do not coincide on UniRef90 proteins
only PF00332 has a PDB structure (may not be up to date)
PF00332 c.1.8.3
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 603 ) 6670332_PF00534_PF08323 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08323 is 6656397 with Jaccard = 0.8397 |PF08323|=281 [ 262 31 1099899 19 ]
parent [ 6656397 ] : 6670332 0.167036 (=198174/(336*3531)) 85.6249
given [ 6656397 ] : 6656397 0.22006 (=147/(2*334)) 82.1344
best keyword for cluster 6656397 is PF08323 with Jaccard = 0.8397 [ 262 31 1099899 19 ] 0.8942 0.9324
sibling [ 6656397 ] : 6667843 0.17602 (=1863/(3*3528)) 84.954
best keyword for cluster 6667843 is PF00534 with Jaccard = 0.8054 [ 3112 7 1096347 745 ] 0.9978 0.8068
SUGGESTING RELATEDNESS OF:
A> PF08323 ( PF08323 Starch synthase catalytic domain )
B> PF00534 ( PF00534 Glycosyl transferases group 1 )
Only B has a clan ( CL0113.8 ).
the two keywords coincide on UniRef90 proteins: |PF00534| = 3857 , |PF08323| = 281 , |PF00534^PF08323| = 237 ( 6.1% and 84.3% )
both PF08323 and PF00534 have PDB structures
PF08323 c.87.1.8
PF00534 c.87.1.8
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 604 ) 6707025_PF00102_PF00782 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00782 is 6702473 with Jaccard = 0.8354 |PF00782|=648 [ 614 87 1099476 34 ]
parent [ 6702473 ] : 6707025 0.0898443 (=53604/(697*856)) 93.4771
given [ 6702473 ] : 6702473 0.110245 (=2644/(29*827)) 92.6183
best keyword for cluster 6702473 is PF00782 with Jaccard = 0.8354 [ 614 87 1099476 34 ] 0.8759 0.9475
sibling [ 6702473 ] : 6678150 0.147908 (=410/(4*693)) 87.7265
best keyword for cluster 6678150 is PF00102 with Jaccard = 0.8047 [ 618 49 1099443 101 ] 0.9265 0.8595
SUGGESTING RELATEDNESS OF:
A> PF00782 ( PF00782 Dual specificity phosphatase, catalytic domain )
B> PF00102 ( PF00102 Protein-tyrosine phosphatase )
they come from the same clan: CL0031.8 : PF00102 PF04273 PF00782 PF05706 PF03162
the two keywords do not coincide on UniRef90 proteins
both PF00782 and PF00102 have PDB structures
PF00782 c.45.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 605 ) 6551245_PF00971_PF01045 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01045 is 6325717 with Jaccard = 0.8333 |PF01045|=6 [ 5 0 1100205 1 ]
parent [ 6325717 ] : 6551245 0.6 (=69/(23*5)) 40
given [ 6325717 ] : 6325717 1 (=4/(1*4)) 1e-07
best keyword for cluster 6325717 is PF01045 with Jaccard = 0.8333 [ 5 0 1100205 1 ] 1.0000 0.8333
sibling [ 6325717 ] : 6275750 1 (=76/(19*4)) 2.17657e-11
best keyword for cluster 6275750 is PF00971 with Jaccard = 0.8519 [ 23 0 1100184 4 ] 1.0000 0.8519
SUGGESTING RELATEDNESS OF:
A> PF01045 ( PF01045 EIAV glycoprotein, gp45 )
B> PF00971 ( PF00971 EIAV coat protein, gp90 )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00971| = 27 , |PF01045| = 6 , |PF00971^PF01045| = 4 ( 14.8% and 66.7% )
Neither PF01045 nor PF00971 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 606 ) 6475606_PF01509_PF08068 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08068 is 6457596 with Jaccard = 0.8333 |PF08068|=65 [ 60 7 1100139 5 ]
parent [ 6457596 ] : 6475606 0.959075 (=19826/(304*68)) 4.78632
given [ 6457596 ] : 6457596 0.980114 (=1035/(44*24)) 2.03885
best keyword for cluster 6457596 is PF08068 with Jaccard = 0.8333 [ 60 7 1100139 5 ] 0.8955 0.9231
sibling [ 6457596 ] : 6413094 0.999554 (=6717/(24*280)) 0.046832
best keyword for cluster 6413094 is PF01509 with Jaccard = 0.8023 [ 280 0 1099862 69 ] 1.0000 0.8023
SUGGESTING RELATEDNESS OF:
A> PF08068 ( PF08068 DKCLD (NUC011) domain )
B> PF01509 ( PF01509 TruB family pseudouridylate synthase (N terminal domain) )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01509| = 349 , |PF08068| = 65 , |PF01509^PF08068| = 63 ( 18.1% and 96.9% )
both PF08068 and PF01509 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 607 ) 6561138_PF01037_PF08394 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF08394 is 6336483 with Jaccard = 0.8333 |PF08394|=12 [ 10 0 1100199 2 ]
parent [ 6336483 ] : 6561138 0.554545 (=4758/(10*858)) 47.8405
given [ 6336483 ] : 6336483 1 (=21/(3*7)) 5.85996e-07
best keyword for cluster 6336483 is PF08394 with Jaccard = 0.8333 [ 10 0 1100199 2 ] 1.0000 0.8333
sibling [ 6336483 ] : 6558108 0.602662 (=5615/(11*847)) 45.1333
best keyword for cluster 6558108 is PF01037 with Jaccard = 0.8715 [ 739 18 1099363 91 ] 0.9762 0.8904
SUGGESTING RELATEDNESS OF:
A> PF08394 ( PF08394 Archaeal TRASH domain )
B> PF01037 ( PF01037 AsnC family )
A and B come from different clans ( CL0175.5 , CL0032.9 ).
the two keywords do not coincide on UniRef90 proteins
only PF08394 has a PDB structure (may not be up to date)
PF01037 d.58.4.2
SUPERFAM mapping significantly overlapping:
1 PF01037 SSF54909 0.871 (average over 3219 mutual instances, PF01037 3221 appearances, SSF54909 7040 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 608 ) 6667472_PF00102_PF02206 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00102 is 6659227 with Jaccard = 0.8331 |PF00102|=719 [ 599 0 1099492 120 ]
parent [ 6659227 ] : 6667472 0.155901 (=7226/(75*618)) 84.8418
given [ 6659227 ] : 6659227 0.206026 (=506/(4*614)) 83.1854
best keyword for cluster 6659227 is PF00102 with Jaccard = 0.8331 [ 599 0 1099492 120 ] 1.0000 0.8331
sibling [ 6659227 ] : 6659164 0.189815 (=41/(3*72)) 83.1415
best keyword for cluster 6659164 is PF02206 with Jaccard = 0.7808 [ 57 9 1100138 7 ] 0.8636 0.8906
SUGGESTING RELATEDNESS OF:
A> PF00102 ( PF00102 Protein-tyrosine phosphatase )
B> PF02206 ( PF02206 Domain of unknown function )
Only A has a clan ( CL0031.8 ).
the two keywords coincide on UniRef90 proteins: |PF00102| = 719 , |PF02206| = 64 , |PF00102^PF02206| = 16 ( 2.2% and 25.0% )
only PF00102 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 609 ) 6753502_PF00339_PF03643 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00339 is 6745088 with Jaccard = 0.8321 |PF00339|=252 [ 228 22 1099937 24 ]
parent [ 6745088 ] : 6753502 0.0148073 (=395/(78*342)) 98.871
given [ 6745088 ] : 6745088 0.021021 (=63/(9*333)) 98.2416
best keyword for cluster 6745088 is PF00339 with Jaccard = 0.8321 [ 228 22 1099937 24 ] 0.9120 0.9048
sibling [ 6745088 ] : 6744992 0.0519481 (=4/(1*77)) 98.2338
best keyword for cluster 6744992 is PF03643 with Jaccard = 0.9853 [ 67 0 1100143 1 ] 1.0000 0.9853
SUGGESTING RELATEDNESS OF:
A> PF00339 ( PF00339 Arrestin (or S-antigen), N-terminal domain )
B> PF03643 ( PF03643 Vacuolar protein sorting-associated protein 26 )
they come from the same clan: CL0135.6 : PF00339 PF07070 PF03643
the two keywords coincide on UniRef90 proteins: |PF00339| = 252 , |PF03643| = 68 , |PF00339^PF03643| = 2 ( 0.8% and 2.9% )
only PF00339 has a PDB structure (may not be up to date)
PF00339 b.1.18.11
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 610 ) 6733269_PF00264_PF03723 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03722 is 6702906 with Jaccard = 0.8301 |PF03722|=171 [ 171 35 1100005 0 ]
parent [ 6702906 ] : 6733269 0.0433528 (=3034/(216*324)) 97.0829
given [ 6702906 ] : 6702906 0.0744186 (=16/(1*215)) 92.6929
best keyword for cluster 6702906 is PF03723 with Jaccard = 0.9130 [ 189 17 1100004 1 ] 0.9175 0.9947
sibling [ 6702906 ] : 6732209 0.0309598 (=10/(1*323)) 96.9679
best keyword for cluster 6732209 is PF00264 with Jaccard = 0.9864 [ 290 0 1099917 4 ] 1.0000 0.9864
SUGGESTING RELATEDNESS OF:
A> PF03723 ( PF03723 Hemocyanin, ig-like domain )
B> PF00264 ( PF00264 Common central domain of tyrosinase )
Only B has a clan ( CL0205.5 ).
the two keywords do not coincide on UniRef90 proteins
both PF03723 and PF00264 have PDB structures
PF03723 b.1.18.3
SUPERFAM mapping significantly overlapping:
1 PF00264 SSF48056 0.819 (average over 1601 mutual instances, PF00264 1604 appearances, SSF48056 2598 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 611 ) 6592227_PF00519_PF01057 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01057 is 6531307 with Jaccard = 0.8293 |PF01057|=41 [ 34 0 1100170 7 ]
parent [ 6531307 ] : 6592227 0.483186 (=2845/(46*128)) 58.0319
given [ 6531307 ] : 6531307 0.752688 (=350/(31*15)) 26.5954
best keyword for cluster 6531307 is PF01057 with Jaccard = 0.8293 [ 34 0 1100170 7 ] 1.0000 0.8293
sibling [ 6531307 ] : 6539179 0.694444 (=175/(2*126)) 31.8946
best keyword for cluster 6539179 is PF00519 with Jaccard = 0.9920 [ 124 1 1100086 0 ] 0.9920 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01057 ( PF01057 Parvovirus non-structural protein NS1 )
B> PF00519 ( PF00519 Papillomavirus helicase )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF01057 and PF00519 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 612 ) 6686807_PF03706_PF04329 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03706 is 6674998 with Jaccard = 0.8273 |PF03706|=138 [ 115 1 1100072 23 ]
parent [ 6674998 ] : 6686807 0.124997 (=4922/(169*233)) 89.6224
given [ 6674998 ] : 6674998 0.157307 (=958/(30*203)) 86.9023
best keyword for cluster 6674998 is PF03706 with Jaccard = 0.8273 [ 115 1 1100072 23 ] 0.9914 0.8333
sibling [ 6674998 ] : 6653191 0.228819 (=740/(22*147)) 81.0397
best keyword for cluster 6653191 is PF04329 with Jaccard = 0.8280 [ 77 6 1100118 10 ] 0.9277 0.8851
SUGGESTING RELATEDNESS OF:
A> PF03706 ( PF03706 Uncharacterised protein family (UPF0104) )
B> PF04329 ( PF04329 Family of unknown function (DUF470) )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF03706| = 138 , |PF04329| = 87 , |PF03706^PF04329| = 13 ( 9.4% and 14.9% )
Neither PF03706 nor PF04329 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 613 ) 6692803_PF03205_PF07683 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF03205 is 6524027 with Jaccard = 0.8261 |PF03205|=138 [ 114 0 1100073 24 ]
parent [ 6524027 ] : 6692803 0.128002 (=7504/(128*458)) 90.8378
given [ 6524027 ] : 6524027 0.797333 (=299/(3*125)) 22.2543
best keyword for cluster 6524027 is PF03205 with Jaccard = 0.8261 [ 114 0 1100073 24 ] 1.0000 0.8261
sibling [ 6524027 ] : 6690468 0.10022 (=182/(454*4)) 90.356
best keyword for cluster 6690468 is PF07683 with Jaccard = 0.8228 [ 339 70 1099799 3 ] 0.8289 0.9912
SUGGESTING RELATEDNESS OF:
A> PF03205 ( PF03205 Molybdopterin guanine dinucleotide synthesis protein B )
B> PF07683 ( PF07683 Cobalamin synthesis protein cobW C-terminal domain )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF03205 and PF07683 have PDB structures
PF07683 d.237.1.1
SUPERFAM mapping significantly overlapping:
1 PF07683 SSF90002 0.980 (average over 1038 mutual instances, PF07683 1046 appearances, SSF90002 2049 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 614 ) 6701651_PF00155_PF00392 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00155 is 6694887 with Jaccard = 0.8258 |PF00155|=3377 [ 2816 33 1096801 561 ]
parent [ 6694887 ] : 6701651 0.0820302 (=567380/(2193*3154)) 92.4733
given [ 6694887 ] : 6694887 0.101111 (=1592/(5*3149)) 91.2929
best keyword for cluster 6694887 is PF00155 with Jaccard = 0.8258 [ 2816 33 1096801 561 ] 0.9884 0.8339
sibling [ 6694887 ] : 6658415 0.186615 (=1634/(4*2189)) 82.8631
best keyword for cluster 6658415 is PF00392 with Jaccard = 0.7866 [ 1950 14 1097732 515 ] 0.9929 0.7911
SUGGESTING RELATEDNESS OF:
A> PF00155 ( PF00155 Aminotransferase class I and II )
B> PF00392 ( PF00392 Bacterial regulatory proteins, gntR family )
A and B come from different clans ( CL0061.8 , CL0123.12 ).
the two keywords coincide on UniRef90 proteins: |PF00155| = 3377 , |PF00392| = 2465 , |PF00155^PF00392| = 432 ( 12.8% and 17.5% )
both PF00155 and PF00392 have PDB structures
PF00155 c.67.1.1 c.67.1.3 c.67.1.4
PF00392 a.4.5.6
SUPERFAM mapping significantly overlapping:
1 PF00155 SSF53383 0.849 (average over 10819 mutual instances, PF00155 10880 appearances, SSF53383 34644 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 615 ) 6723269_PF06283_PF06439 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06439 is 6628229 with Jaccard = 0.8246 |PF06439|=57 [ 47 0 1100154 10 ]
parent [ 6628229 ] : 6723269 0.0421941 (=180/(54*79)) 95.8968
given [ 6628229 ] : 6628229 0.326531 (=80/(5*49)) 73.7849
best keyword for cluster 6628229 is PF06439 with Jaccard = 0.8246 [ 47 0 1100154 10 ] 1.0000 0.8246
sibling [ 6628229 ] : 6686033 0.135093 (=174/(23*56)) 89.4827
best keyword for cluster 6686033 is PF06283 with Jaccard = 0.7778 [ 21 6 1100184 0 ] 0.7778 1.0000
SUGGESTING RELATEDNESS OF:
A> PF06439 ( PF06439 Domain of Unknown Function (DUF1080) )
B> PF06283 ( PF06283 Protein of unknown function (DUF1037) )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF06439 nor PF06283 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 616 ) 6589675_PF00063_PF00784 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00063 is 6580724 with Jaccard = 0.8237 |PF00063|=638 [ 528 3 1099570 110 ]
parent [ 6580724 ] : 6589675 0.43471 (=12787/(53*555)) 57.1727
given [ 6580724 ] : 6580724 0.471636 (=1297/(5*550)) 53.9676
best keyword for cluster 6580724 is PF00063 with Jaccard = 0.8237 [ 528 3 1099570 110 ] 0.9944 0.8276
sibling [ 6580724 ] : 6552525 0.692308 (=36/(1*52)) 40.8609
best keyword for cluster 6552525 is PF00784 with Jaccard = 0.6234 [ 48 4 1100134 25 ] 0.9231 0.6575
SUGGESTING RELATEDNESS OF:
A> PF00063 ( PF00063 Myosin head (motor domain) )
B> PF00784 ( PF00784 MyTH4 domain )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF00063| = 638 , |PF00784| = 73 , |PF00063^PF00784| = 34 ( 5.3% and 46.6% )
only PF00063 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 617 ) 6716126_PF00076_PF01805 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00076 is 6714471 with Jaccard = 0.8228 |PF00076|=4044 [ 3441 138 1096029 603 ]
parent [ 6714471 ] : 6716126 0.0578344 (=29681/(127*4041)) 94.9031
given [ 6714471 ] : 6714471 0.0651384 (=1577/(6*4035)) 94.6378
best keyword for cluster 6714471 is PF00076 with Jaccard = 0.8228 [ 3441 138 1096029 603 ] 0.9614 0.8509
sibling [ 6714471 ] : 6711407 0.0606557 (=37/(5*122)) 94.1557
best keyword for cluster 6711407 is PF01805 with Jaccard = 0.7661 [ 95 2 1100087 27 ] 0.9794 0.7787
SUGGESTING RELATEDNESS OF:
A> PF00076 ( PF00076 RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain) )
B> PF01805 ( PF01805 Surp module )
Only A has a clan ( CL0221.5 ).
the two keywords coincide on UniRef90 proteins: |PF00076| = 4044 , |PF01805| = 122 , |PF00076^PF01805| = 18 ( 0.4% and 14.8% )
both PF00076 and PF01805 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 618 ) 6722123_PF00096_PF06524 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00096 is 6720078 with Jaccard = 0.8226 |PF00096|=4886 [ 4234 261 1095064 652 ]
parent [ 6720078 ] : 6722123 0.0569141 (=6368/(21*5328)) 95.7328
given [ 6720078 ] : 6720078 0.052689 (=2802/(10*5318)) 95.4265
best keyword for cluster 6720078 is PF00096 with Jaccard = 0.8226 [ 4234 261 1095064 652 ] 0.9419 0.8666
sibling [ 6720078 ] : 6690125 0.122222 (=11/(6*15)) 90.2811
best keyword for cluster 6690125 is PF06524 with Jaccard = 0.7273 [ 8 3 1100200 0 ] 0.7273 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00096 ( PF00096 Zinc finger, C2H2 type )
B> PF06524 ( PF06524 NOA36 protein )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins only PF00096 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 619 ) 6756152_PF01167_PF03478 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF03478 is 6737410 with Jaccard = 0.8221 |PF03478|=212 [ 208 41 1099958 4 ] parent [ 6737410 ] : 6756152 0.0141183 (=377/(69*387)) 99.0409 given [ 6737410 ] : 6737410 0.0287947 (=140/(13*374)) 97.5271 best keyword for cluster 6737410 is PF03478 with Jaccard = 0.8221 [ 208 41 1099958 4 ] 0.8353 0.9811 sibling [ 6737410 ] : 6666452 0.157692 (=41/(65*4)) 84.5977 best keyword for cluster 6666452 is PF01167 with Jaccard = 0.8592 [ 61 0 1100140 10 ] 1.0000 0.8592 SUGGESTING RELATEDNESS OF: A> PF03478 ( PF03478 Protein of unknown function (DUF295) ) B> PF01167 ( PF01167 Tub family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF03478 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF01167 SSF54518 0.784 (average over 177 mutual instances, PF01167 199 appearances, SSF54518 251 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 620 ) 6700926_PF05170_PF05359 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF05170 is 6698578 with Jaccard = 0.8210 |PF05170|=162 [ 133 0 1100049 29 ] parent [ 6698578 ] : 6700926 0.0945652 (=1218/(70*184)) 92.3402 given [ 6698578 ] : 6698578 0.0917603 (=98/(6*178)) 91.9775 best keyword for cluster 6698578 is PF05170 with Jaccard = 0.8210 [ 133 0 1100049 29 ] 1.0000 0.8210 sibling [ 6698578 ] : 6587305 0.477823 (=237/(8*62)) 56.2 best keyword for cluster 6587305 is PF05359 with Jaccard = 0.9800 [ 49 1 1100161 0 ] 0.9800 1.0000 SUGGESTING RELATEDNESS OF: A> PF05170 ( PF05170 AsmA family ) B> PF05359 ( PF05359 Domain of Unknown Function (DUF748) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins neither PF05170 nor PF05359 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 621 ) 6720440_PF00014_PF00095 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00014 is 6716513 with Jaccard = 0.8202 |PF00014|=425 [ 365 20 1099766 60 ] parent [ 6716513 ] : 6720440 0.0496207 (=3087/(151*412)) 95.4803 given [ 6716513 ] : 6716513 0.0722359 (=147/(5*407)) 94.971 best keyword for cluster 6716513 is PF00014 with Jaccard = 0.8202 [ 365 20 1099766 60 ] 0.9481 0.8588 sibling [ 6716513 ] : 6691549 0.12415 (=73/(147*4)) 90.5824 best keyword for cluster 6691549 is PF00095 with Jaccard = 0.6913 [ 103 12 1100062 34 ] 0.8957 0.7518 SUGGESTING RELATEDNESS OF: A> PF00014 ( PF00014 Kunitz/Bovine pancreatic trypsin inhibitor domain ) B> PF00095 ( PF00095 WAP-type (Whey Acidic Protein) 'four-disulfide core' ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF00014| = 426 , |PF00095| = 137 , |PF00014^PF00095| = 31 ( 7.3% and 22.6% ) both PF00014 and PF00095 have PDB structures PF00014 g.8.1.1 g.8.1.2 k.35.1.1 SUPERFAM mapping significantly overlapping: 1 PF00095 SSF57256 0.887 (average over 257 mutual instances, PF00095 361 appearances, SSF57256 386 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 622 ) 6731550_PF00021_PF00087 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00021 is 6727290 with Jaccard = 0.8182 |PF00021|=133 [ 126 21 1100057 7 ] parent [ 6727290 ] : 6731550 0.0447345 (=2005/(180*249)) 96.8914 given [ 6727290 ] : 6727290 0.050223 (=563/(59*190)) 96.3975 best keyword for cluster 6727290 is PF00021 with Jaccard = 0.8182 [ 126 21 1100057 7 ] 0.8571 0.9474 sibling [ 6727290 ] : 6726427 0.0446927 (=8/(1*179)) 96.2886 best keyword for cluster 6726427 is PF00087 with Jaccard = 0.9819 [ 163 2 1100045 1 ] 0.9879 0.9939 SUGGESTING RELATEDNESS OF: A> PF00021 ( PF00021 u-PAR/Ly-6 domain ) B> PF00087 ( PF00087 Snake toxin ) they come from the same clan: CL0117.6 : PF01064 PF06211 PF02988 PF00087 PF00021 the two keywords do not coincide on UniRef90 proteins both PF00021 and PF00087 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 623 ) 6446230_PF00912_PF06832 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords 
related? ====================== best cluster for keyword PF06832 is 6335824 with Jaccard = 0.8182 |PF06832|=54 [ 54 12 1100145 0 ] parent [ 6335824 ] : 6446230 0.990878 (=46708/(74*637)) 1.01074 given [ 6335824 ] : 6335824 1 (=73/(1*73)) 5.08114e-07 best keyword for cluster 6335824 is PF06832 with Jaccard = 0.8182 [ 54 12 1100145 0 ] 0.8182 1.0000 sibling [ 6335824 ] : 6430004 0.997592 (=14088/(23*614)) 0.282846 best keyword for cluster 6430004 is PF00912 with Jaccard = 0.8807 [ 576 0 1099557 78 ] 1.0000 0.8807 SUGGESTING RELATEDNESS OF: A> PF06832 ( PF06832 Penicillin-Binding Protein C-terminus Family ) B> PF00912 ( PF00912 Transglycosylase ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF00912| = 654 , |PF06832| = 54 , |PF00912^PF06832| = 53 ( 8.1% and 98.1% ) only PF06832 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 624 ) 6756416_PF00789_PF03556 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00789 is 6733568 with Jaccard = 0.8163 |PF00789|=247 [ 231 36 1099928 16 ] parent [ 6733568 ] : 6756416 0.0134178 (=307/(65*352)) 99.056 given [ 6733568 ] : 6733568 0.0374626 (=763/(73*279)) 97.1196 best keyword for cluster 6733568 is PF00789 with Jaccard = 0.8163 [ 231 36 1099928 16 ] 0.8652 0.9352 sibling [ 6733568 ] : 6554719 0.578125 (=37/(1*64)) 42.5896 best keyword for cluster 6554719 is PF03556 with Jaccard = 0.9831 [ 58 0 1100152 1 ] 1.0000 0.9831 SUGGESTING RELATEDNESS OF: A> PF00789 ( PF00789 UBX domain ) B> PF03556 ( PF03556 Domain of unknown function (DUF298) ) Only A has a clan ( CL0072.14 ). 
the two keywords do not coincide on UniRef90 proteins only PF00789 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 625 ) 6585690_PF00289_PF02844 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00289 is 6562669 with Jaccard = 0.8162 |PF00289|=1011 [ 857 39 1099161 154 ] parent [ 6562669 ] : 6585690 0.489895 (=135458/(281*984)) 55.5826 given [ 6562669 ] : 6562669 0.510682 (=502/(1*983)) 49.0745 best keyword for cluster 6562669 is PF00289 with Jaccard = 0.8162 [ 857 39 1099161 154 ] 0.9565 0.8477 sibling [ 6562669 ] : 6464215 0.97491 (=544/(2*279)) 2.89589 best keyword for cluster 6464215 is PF02844 with Jaccard = 0.8194 [ 245 7 1099912 47 ] 0.9722 0.8390 SUGGESTING RELATEDNESS OF: A> PF00289 ( PF00289 Carbamoyl-phosphate synthase L chain, N-terminal domain ) B> PF02844 ( PF02844 Phosphoribosylglycinamide synthetase, N domain ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins both PF00289 and PF02844 have PDB structures PF00289 c.30.1.1 PF02844 c.30.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 626 ) 6611315_PF00046_PF00292 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00292 is 6572234 with Jaccard = 0.8150 |PF00292|=199 [ 163 1 1100011 36 ] parent [ 6572234 ] : 6611315 0.338824 (=155865/(169*2722)) 67.1606 given [ 6572234 ] : 6572234 0.553892 (=185/(2*167)) 51.5431 best keyword for cluster 6572234 is PF00292 with Jaccard = 0.8150 [ 163 1 1100011 36 ] 0.9939 0.8191 sibling [ 6572234 ] : 6607061 0.353877 (=31402/(33*2689)) 65.2037 best keyword for cluster 6607061 is PF00046 with Jaccard = 0.7494 [ 2545 26 1096815 825 ] 0.9899 0.7552 SUGGESTING RELATEDNESS OF: A> PF00292 ( PF00292 'Paired box' domain ) B> PF00046 ( PF00046 Homeobox domain ) Only B has a clan ( CL0123.12 ). the two keywords coincide on Uniref90 proteins: |PF00046| = 3370 , |PF00292| = 199 , |PF00046^PF00292| = 91 ( 2.7% and 45.7% ) both PF00292 and PF00046 have PDB structures PF00046 a.4.1.1 j.92.1.1 SUPERFAM mapping significantly overlapping: 1 PF00292 SSF46689 0.566 (average over 602 mutual instances, PF00292 602 appearances, SSF46689 68153 appearances) 2 PF00046 SSF46689 0.773 (average over 9143 mutual instances, PF00046 9568 appearances, SSF46689 68153 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 627 ) 6718245_PF02552_PF02776 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02552 is 6534419 with Jaccard = 0.8095 |PF02552|=21 [ 17 0 1100190 4 ] parent [ 6534419 ] : 6718245 0.0652965 (=1382/(17*1245)) 95.1822 given [ 6534419 ] : 6534419 0.733333 (=22/(15*2)) 28.5316 best keyword for cluster 6534419 is PF02552 with Jaccard = 0.8095 [ 17 0 1100190 4 ] 1.0000 0.8095 sibling [ 6534419 ] : 6710583 0.0746774 (=463/(5*1240)) 94.0099 best keyword for cluster 6710583 is PF02776 with Jaccard = 0.9328 [ 1028 72 1099109 2 ] 0.9345 0.9981 SUGGESTING RELATEDNESS OF: A> PF02552 ( PF02552 CO dehydrogenase beta subunit/acetyl-CoA synthase epsilon subunit ) B> PF02776 ( PF02776 Thiamine pyrophosphate enzyme, N-terminal TPP binding domain ) Only A has a clan ( CL0085.9 ). the two keywords do not coincide on UniRef90 proteins both PF02552 and PF02776 have PDB structures PF02552 c.31.1.6 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 628 ) 6755285_PF04488_PF05785 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF04488 is 6752513 with Jaccard = 0.8095 |PF04488|=164 [ 153 25 1100022 11 ] parent [ 6752513 ] : 6755285 0.0139186 (=215/(271*57)) 98.9885 given [ 6752513 ] : 6752513 0.017738 (=202/(219*52)) 98.8029 best keyword for cluster 6752513 is PF04488 with Jaccard = 0.8095 [ 153 25 1100022 11 ] 0.8596 0.9329 sibling [ 6752513 ] : 6753581 0.0178571 (=1/(1*56)) 98.875 best keyword for cluster 6753581 is PF05785 with Jaccard = 0.7000 [ 7 3 1100201 0 ] 0.7000 1.0000 SUGGESTING RELATEDNESS OF: A> PF04488 ( PF04488 Glycosyltransferase sugar-binding region containing DXD motif ) B> PF05785 ( PF05785 Rho-activating domain of cytotoxic necrotizing factor ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins both PF04488 and PF05785 have PDB structures PF05785 d.194.1.1 SUPERFAM mapping significantly overlapping: 1 PF05785 SSF64438 0.861 (average over 21 mutual instances, PF05785 21 appearances, SSF64438 688 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 629 ) 6760626_PF00144_PF00933 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00144 is 6745544 with Jaccard = 0.8072 |PF00144|=1031 [ 833 1 1099179 198 ] parent [ 6745544 ] : 6760626 0.00740457 (=5569/(963*781)) 99.2978 given [ 6745544 ] : 6745544 0.0193717 (=148/(8*955)) 98.2752 best keyword for cluster 6745544 is PF00144 with Jaccard = 0.8072 [ 833 1 1099179 198 ] 0.9988 0.8080 sibling [ 6745544 ] : 6759392 0.00897436 (=7/(1*780)) 99.2326 best keyword for cluster 6759392 is PF00933 with Jaccard = 0.9658 [ 649 18 1099539 5 ] 0.9730 0.9924 SUGGESTING RELATEDNESS OF: A> PF00144 ( PF00144 Beta-lactamase ) B> PF00933 ( PF00933 Glycosyl hydrolase family 3 N terminal domain ) A and B come from different clans ( CL0013.12 , CL0058.10 ). the two keywords coincide on UniRef90 proteins: |PF00144| = 1031 , |PF00933| = 654 , |PF00144^PF00933| = 7 ( 0.7% and 1.1% ) both PF00144 and PF00933 have PDB structures PF00144 e.3.1.1 PF00933 c.1.8.7 SUPERFAM mapping significantly overlapping: 1 PF00144 SSF56601 0.955 (average over 4139 mutual instances, PF00144 4197 appearances, SSF56601 18812 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 630 ) 6736944_PF01553_PF04028 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01553 is 6731479 with Jaccard = 0.8064 |PF01553|=1152 [ 1008 98 1098961 144 ] parent [ 6731479 ] : 6736944 0.0328781 (=2470/(57*1318)) 97.479 given [ 6731479 ] : 6731479 0.0403242 (=5797/(120*1198)) 96.8814 best keyword for cluster 6731479 is PF01553 with Jaccard = 0.8064 [ 1008 98 1098961 144 ] 0.9114 0.8750 sibling [ 6731479 ] : 6514859 0.845455 (=93/(2*55)) 17.586 best keyword for cluster 6514859 is PF04028 with Jaccard = 0.9556 [ 43 1 1100166 1 ] 0.9773 0.9773 SUGGESTING RELATEDNESS OF: A> PF01553 ( PF01553 Acyltransferase ) B> PF04028 ( PF04028 Domain of unknown function (DUF374) ) Only A has a clan ( CL0228.3 ). the two keywords do not coincide on UniRef90 proteins only PF01553 has a PDB structure (may not be up to date) PF01553 c.112.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 631 ) 6688186_PF01943_PF03023 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01943 is 6683234 with Jaccard = 0.8053 |PF01943|=569 [ 459 1 1099641 110 ] parent [ 6683234 ] : 6688186 0.11534 (=31047/(377*714)) 89.9014 given [ 6683234 ] : 6683234 0.125408 (=1229/(14*700)) 88.998 best keyword for cluster 6683234 is PF01943 with Jaccard = 0.8053 [ 459 1 1099641 110 ] 0.9978 0.8067 sibling [ 6683234 ] : 6682609 0.116667 (=217/(372*5)) 88.872 best keyword for cluster 6682609 is PF03023 with Jaccard = 0.6888 [ 228 101 1099880 2 ] 0.6930 0.9913 SUGGESTING RELATEDNESS OF: A> PF01943 ( PF01943 Polysaccharide biosynthesis protein ) B> PF03023 ( PF03023 MviN-like protein ) they come from the same clan: CL0222.3 : PF01554 PF03023 PF01943 PF04506 the two keywords coincide on UniRef90 proteins: |PF01943| = 569 , |PF03023| = 230 , |PF01943^PF03023| = 1 ( 0.2% and 0.4% ) neither PF01943 nor PF03023 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 632 ) 6688257_PF00840_PF02015 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF02015 is 6441890 with Jaccard = 0.8043 |PF02015|=46 [ 37 0 1100165 9 ] parent [ 6441890 ] : 6688257 0.106549 (=410/(37*104)) 89.9297 given [ 6441890 ] : 6441890 0.992424 (=131/(33*4)) 0.759409 best keyword for cluster 6441890 is PF02015 with Jaccard = 0.8043 [ 37 0 1100165 9 ] 1.0000 0.8043 sibling [ 6441890 ] : 6625832 0.274157 (=366/(89*15)) 72.9157 best keyword for cluster 6625832 is PF00840 with Jaccard = 0.8788 [ 87 12 1100112 0 ] 0.8788 1.0000 SUGGESTING RELATEDNESS OF: A> PF02015 ( PF02015 Glycosyl hydrolase family 45 ) B> PF00840 ( PF00840 Glycosyl hydrolase family 7 ) A and B come from different clans ( CL0199.7 , CL0004.14 ). the two keywords do not coincide on UniRef90 proteins both PF02015 and PF00840 have PDB structures PF02015 b.52.1.1 PF00840 b.29.1.10 SUPERFAM mapping significantly overlapping: 1 PF00840 SSF49899 0.989 (average over 255 mutual instances, PF00840 318 appearances, SSF49899 14070 appearances) 2 PF02015 SSF50685 0.935 (average over 100 mutual instances, PF02015 129 appearances, SSF50685 2549 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 633 ) 6698573_PF00465_PF01761 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01761 is 6659594 with Jaccard = 0.8028 |PF01761|=360 [ 289 0 1099851 71 ] parent [ 6659594 ] : 6698573 0.0944991 (=21216/(314*715)) 91.9766 given [ 6659594 ] : 6659594 0.175777 (=164/(311*3)) 83.2773 best keyword for cluster 6659594 is PF01761 with Jaccard = 0.8028 [ 289 0 1099851 71 ] 1.0000 0.8028 sibling [ 6659594 ] : 6695679 0.092437 (=66/(1*714)) 91.4994 best keyword for cluster 6695679 is PF00465 with Jaccard = 0.8457 [ 592 50 1099511 58 ] 0.9221 0.9108 SUGGESTING RELATEDNESS OF: A> PF01761 ( PF01761 3-dehydroquinate synthase ) B> PF00465 ( PF00465 Iron-containing alcohol dehydrogenase ) they come from the same clan: CL0224.3 : PF01761 PF00465 the two keywords do not coincide on UniRef90 proteins both PF01761 and PF00465 have PDB structures PF01761 e.22.1.1 PF00465 e.22.1.2 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 634 ) 6744375_PF00929_PF02811 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00929 is 6741904 with Jaccard = 0.8025 |PF00929|=1073 [ 1032 213 1098925 41 ] parent [ 6741904 ] : 6744375 0.019284 (=32249/(1040*1608)) 98.1808 given [ 6741904 ] : 6741904 0.0252419 (=1234/(31*1577)) 97.9624 best keyword for cluster 6741904 is PF00929 with Jaccard = 0.8025 [ 1032 213 1098925 41 ] 0.8289 0.9618 sibling [ 6741904 ] : 6739449 0.0295172 (=749/(25*1015)) 97.7263 best keyword for cluster 6739449 is PF02811 with Jaccard = 0.7896 [ 747 56 1099265 143 ] 0.9303 0.8393 SUGGESTING RELATEDNESS OF: A> PF00929 ( PF00929 Exonuclease ) B> PF02811 ( PF02811 PHP domain ) A and B come from different clans ( CL0219.6 , CL0034.9 ). 
the two keywords coincide on Uniref90 proteins: |PF00929| = 1073 , |PF02811| = 890 , |PF00929^PF02811| = 55 ( 5.1% and 6.2% ) both PF00929 and PF02811 have PDB structures PF00929 c.55.3.5 SUPERFAM mapping significantly overlapping: 1 PF00929 SSF53098 0.885 (average over 3310 mutual instances, PF00929 3861 appearances, SSF53098 65670 appearances) 2 PF02811 SSF89550 0.796 (average over 2526 mutual instances, PF02811 3397 appearances, SSF89550 5217 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 635 ) 6703698_PF00565_PF00567 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00565 is 6695678 with Jaccard = 0.8000 |PF00565|=320 [ 256 0 1099891 64 ] parent [ 6695678 ] : 6703698 0.08248 (=3552/(135*319)) 92.8462 given [ 6695678 ] : 6695678 0.0981013 (=93/(3*316)) 91.4993 best keyword for cluster 6695678 is PF00565 with Jaccard = 0.8000 [ 256 0 1099891 64 ] 1.0000 0.8000 sibling [ 6695678 ] : 6678034 0.141221 (=74/(4*131)) 87.6765 best keyword for cluster 6678034 is PF00567 with Jaccard = 0.7548 [ 117 4 1100056 34 ] 0.9669 0.7748 SUGGESTING RELATEDNESS OF: A> PF00565 ( PF00565 Staphylococcal nuclease homologue ) B> PF00567 ( PF00567 Tudor domain ) Only B has a clan ( CL0049.9 ). 
the two keywords coincide on Uniref90 proteins: |PF00565| = 320 , |PF00567| = 151 , |PF00565^PF00567| = 38 ( 11.9% and 25.2% ) only PF00565 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF00565 SSF50199 0.792 (average over 816 mutual instances, PF00565 827 appearances, SSF50199 989 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 636 ) 6706354_PF06158_PF06528 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF06528 is 6557661 with Jaccard = 0.8000 |PF06528|=15 [ 12 0 1100196 3 ] parent [ 6557661 ] : 6706354 0.0666667 (=28/(14*30)) 93.3533 given [ 6557661 ] : 6557661 0.692308 (=9/(1*13)) 44.9869 best keyword for cluster 6557661 is PF06528 with Jaccard = 0.8000 [ 12 0 1100196 3 ] 1.0000 0.8000 sibling [ 6557661 ] : 6646478 0.243386 (=46/(21*9)) 78.9664 best keyword for cluster 6646478 is PF06158 with Jaccard = 1.0000 [ 19 0 1100192 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF06528 ( PF06528 Phage P2 GpE ) B> PF06158 ( PF06158 Phage tail protein E ) Neither A nor B are assigned a clan. 
the two keywords coincide on UniRef90 proteins: |PF06158| = 19 , |PF06528| = 15 , |PF06158^PF06528| = 2 ( 10.5% and 13.3% ) neither PF06528 nor PF06158 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 637 ) 6666168_PF02121_PF02862 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF02862 is 6585801 with Jaccard = 0.7975 |PF02862|=79 [ 63 0 1100132 16 ] parent [ 6585801 ] : 6666168 0.163288 (=733/(67*67)) 84.5138 given [ 6585801 ] : 6585801 0.476923 (=62/(2*65)) 55.6951 best keyword for cluster 6585801 is PF02862 with Jaccard = 0.7975 [ 63 0 1100132 16 ] 1.0000 0.7975 sibling [ 6585801 ] : 6628148 0.30303 (=20/(1*66)) 73.7258 best keyword for cluster 6628148 is PF02121 with Jaccard = 0.8730 [ 55 8 1100148 0 ] 0.8730 1.0000 SUGGESTING RELATEDNESS OF: A> PF02862 ( PF02862 DDHD domain ) B> PF02121 ( PF02121 Phosphatidylinositol transfer protein ) Only B has a clan ( CL0209.4 ). the two keywords coincide on UniRef90 proteins: |PF02121| = 55 , |PF02862| = 79 , |PF02121^PF02862| = 11 ( 20.0% and 13.9% ) only PF02862 has a PDB structure (may not be up to date) PF02121 d.129.3.4 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 638 ) 6769866_PF00903_PF01261 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01261 is 6767388 with Jaccard = 0.7973 |PF01261|=992 [ 952 202 1099017 40 ] parent [ 6767388 ] : 6769866 0.00384732 (=15568/(3022*1339)) 99.7065 given [ 6767388 ] : 6767388 0.00496776 (=886/(1189*150)) 99.6129 best keyword for cluster 6767388 is PF01261 with Jaccard = 0.7973 [ 952 202 1099017 40 ] 0.8250 0.9597 sibling [ 6767388 ] : 6767046 0.00463576 (=28/(2*3020)) 99.5995 best keyword for cluster 6767046 is PF00903 with Jaccard = 0.8550 [ 2111 196 1097742 162 ] 0.9150 0.9287 SUGGESTING RELATEDNESS OF: A> PF01261 ( PF01261 Xylose isomerase-like TIM barrel ) B> PF00903 ( PF00903 Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily ) A and B come from different clans ( CL0152.6 , CL0104.8 ). the two keywords coincide on UniRef90 proteins: |PF00903| = 2273 , |PF01261| = 992 , |PF00903^PF01261| = 14 ( 0.6% and 1.4% ) both PF01261 and PF00903 have PDB structures PF01261 c.1.15.1 c.1.15.3 c.1.15.4 c.1.15.5 PF00903 d.32.1.1 d.32.1.10 d.32.1.2 d.32.1.3 d.32.1.4 d.32.1.6 d.32.1.8 SUPERFAM mapping significantly overlapping: 1 PF01261 SSF51658 0.729 (average over 3023 mutual instances, PF01261 3048 appearances, SSF51658 3985 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 639 ) 6720806_PF00462_PF04908 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF00462 is 6692246 with Jaccard = 0.7954 |PF00462|=902 [ 719 2 1099307 183 ] parent [ 6692246 ] : 6720806 0.0563709 (=1544/(33*830)) 95.531 given [ 6692246 ] : 6692246 0.110169 (=364/(4*826)) 90.7467 best keyword for cluster 6692246 is PF00462 with Jaccard = 0.7954 [ 719 2 1099307 183 ] 0.9972 0.7971 sibling [ 6692246 ] : 6605613 0.40625 (=13/(1*32)) 64.4917 best keyword for cluster 6605613 is PF04908 with Jaccard = 0.8158 [ 31 1 1100173 6 ] 0.9688 0.8378 SUGGESTING RELATEDNESS OF: A> PF00462 ( PF00462 Glutaredoxin ) B> PF04908 ( PF04908 SH3-binding, glutamic acid-rich protein ) they come from the same clan: CL0172.11 : PF00837 PF04908 PF02630 PF08534 PF02114 PF04756 PF07449 PF02798 PF00255 PF00462 PF07912 PF06110 PF05768 PF07955 PF01323 PF01216 PF03960 PF00578 PF00085 the two keywords do not coincide on UniRef90 proteins both PF00462 and PF04908 have PDB structures PF00462 c.47.1.1 SUPERFAM mapping significantly overlapping: 1 PF00462 SSF52833 0.710 (average over 2554 mutual instances, PF00462 2661 appearances, SSF52833 34965 appearances) 2 PF04908 SSF52833 0.972 (average over 75 mutual instances, PF04908 78 appearances, SSF52833 34965 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 640 ) 6709396_PF01871_PF02900 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02900 is 6664524 with Jaccard = 0.7924 |PF02900|=236 [ 187 0 1099975 49 ] parent [ 6664524 ] : 6709396 0.0704607 (=2340/(162*205)) 93.8444 given [ 6664524 ] : 6664524 0.191218 (=1768/(138*67)) 84.1975 best keyword for cluster 6664524 is PF02900 with Jaccard = 0.7924 [ 187 0 1099975 49 ] 1.0000 0.7924 sibling [ 6664524 ] : 6648931 0.205364 (=1248/(103*59)) 79.6683 best keyword for cluster 6648931 is PF01871 with Jaccard = 0.7143 [ 110 36 1100057 8 ] 0.7534 0.9322 SUGGESTING RELATEDNESS OF: A> PF02900 ( PF02900 Catalytic LigB subunit of aromatic ring-opening dioxygenase ) B> PF01871 ( PF01871 AMMECR1 ) Only A has a clan ( CL0283.2 ). the two keywords coincide on Uniref90 proteins: |PF01871| = 118 , |PF02900| = 236 , |PF01871^PF02900| = 13 ( 11.0% and 5.5% ) both PF02900 and PF01871 have PDB structures PF01871 d.309.1.1 SUPERFAM mapping significantly overlapping: 1 PF02900 SSF53213 0.960 (average over 651 mutual instances, PF02900 671 appearances, SSF53213 704 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 641 ) 6653242_PF00547_PF00699 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00699 is 6436120 with Jaccard = 0.7923 |PF00699|=130 [ 103 0 1100081 27 ] parent [ 6436120 ] : 6653242 0.189592 (=1971/(92*113)) 81.0735 given [ 6436120 ] : 6436120 0.995495 (=221/(111*2)) 0.486867 best keyword for cluster 6436120 is PF00699 with Jaccard = 0.7923 [ 103 0 1100081 27 ] 1.0000 0.7923 sibling [ 6436120 ] : 6263173 1 (=91/(1*91)) 2.5738e-12 best keyword for cluster 6263173 is PF00547 with Jaccard = 0.7545 [ 83 0 1100101 27 ] 1.0000 0.7545 SUGGESTING RELATEDNESS OF: A> PF00699 ( PF00699 Urease beta subunit ) B> PF00547 ( PF00547 Urease, gamma subunit ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF00547| = 110 , |PF00699| = 130 , |PF00547^PF00699| = 36 ( 32.7% and 27.7% ) both PF00699 and PF00547 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF00699 SSF51278 0.899 (average over 491 mutual instances, PF00699 687 appearances, SSF51278 729 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 642 ) 6765418_PF01127_PF02313 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
======================
best cluster for keyword PF01127 is 6735057 with Jaccard = 0.7904 |PF01127|=230 [ 230 61 1099920 0 ]
parent [ 6735057 ] : 6765418 0.00629277 (=103/(496*33)) 99.5309
given [ 6735057 ] : 6735057 0.0358274 (=2002/(173*323)) 97.2782
best keyword for cluster 6735057 is PF01127 with Jaccard = 0.7904 [ 230 61 1099920 0 ] 0.7904 1.0000
sibling [ 6735057 ] : 6762758 0.03125 (=1/(1*32)) 99.4062
best keyword for cluster 6762758 is PF02313 with Jaccard = 1.0000 [ 21 0 1100190 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01127 ( PF01127 Succinate dehydrogenase cytochrome b subunit )
B> PF02313 ( PF02313 Fumarate reductase subunit D )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
both PF01127 and PF02313 have PDB structures
PF02313 f.21.2.2
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 643 ) 6580212_PF00129_PF07654 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07654 is 6546979 with Jaccard = 0.7900 |PF07654|=1493 [ 1189 12 1098706 304 ]
parent [ 6546979 ] : 6580212 0.471989 (=428027/(1214*747)) 53.7949
given [ 6546979 ] : 6546979 0.652078 (=2369/(3*1211)) 36.6568
best keyword for cluster 6546979 is PF07654 with Jaccard = 0.7900 [ 1189 12 1098706 304 ] 0.9900 0.7964
sibling [ 6546979 ] : 6558890 0.573154 (=854/(2*745)) 45.9789
best keyword for cluster 6558890 is PF00129 with Jaccard = 0.6092 [ 725 0 1099021 465 ] 1.0000 0.6092
SUGGESTING RELATEDNESS OF:
A> PF07654 ( PF07654 Immunoglobulin C1-set domain )
B> PF00129 ( PF00129 Class I Histocompatibility antigen, domains alpha 1 and 2 )
Only A has a clan ( CL0011.18 ).
the two keywords coincide on UniRef90 proteins: |PF00129| = 1190 , |PF07654| = 1493 , |PF00129^PF07654| = 606 ( 50.9% and 40.6% )
both PF07654 and PF00129 have PDB structures
PF07654 b.1.1.2
SUPERFAM mapping significantly overlapping:
1 PF00129 SSF54452 0.972 (average over 8055 mutual instances, PF00129 8055 appearances, SSF54452 25772 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 644 ) 6740496_PF05378_PF06032 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06032 is 6509800 with Jaccard = 0.7872 |PF06032|=47 [ 37 0 1100164 10 ]
parent [ 6509800 ] : 6740496 0.0219609 (=353/(38*423)) 97.8272
given [ 6509800 ] : 6509800 0.847926 (=184/(7*31)) 15.2818
best keyword for cluster 6509800 is PF06032 with Jaccard = 0.7872 [ 37 0 1100164 10 ] 1.0000 0.7872
sibling [ 6509800 ] : 6737708 0.0267786 (=67/(6*417)) 97.5561
best keyword for cluster 6737708 is PF05378 with Jaccard = 0.6702 [ 254 122 1099832 3 ] 0.6755 0.9883
SUGGESTING RELATEDNESS OF:
A> PF06032 ( PF06032 Protein of unknown function (DUF917) )
B> PF05378 ( PF05378 Hydantoinase/oxoprolinase N-terminal region )
Only B has a clan ( CL0108.10 ).
the two keywords coincide on UniRef90 proteins: |PF05378| = 257 , |PF06032| = 47 , |PF05378^PF06032| = 10 ( 3.9% and 21.3% )
Neither PF06032 nor PF05378 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF05378 SSF53383 0.793 (average over 1 mutual instances, PF05378 1 appearances, SSF53383 34644 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 645 ) 6689861_PF00256_PF00828 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00256 is 6632191 with Jaccard = 0.7863 |PF00256|=325 [ 298 54 1099832 27 ]
parent [ 6632191 ] : 6689861 0.11918 (=4331/(92*395)) 90.2474
given [ 6632191 ] : 6632191 0.28181 (=8146/(97*298)) 75.1802
best keyword for cluster 6632191 is PF00256 with Jaccard = 0.7863 [ 298 54 1099832 27 ] 0.8466 0.9169
sibling [ 6632191 ] : 6549048 0.644522 (=1106/(26*66)) 38.0505
best keyword for cluster 6549048 is PF00828 with Jaccard = 0.7126 [ 62 25 1100124 0 ] 0.7126 1.0000
SUGGESTING RELATEDNESS OF:
A> PF00256 ( PF00256 Ribosomal protein L15 )
B> PF00828 ( PF00828 Eukaryotic ribosomal protein L18 )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF00256 has a PDB structure (may not be up to date)
PF00256 c.12.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 646 ) 6695975_PF00069_PF01453 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00069 is 6694181 with Jaccard = 0.7833 |PF00069|=11375 [ 10402 1905 1086931 973 ]
parent [ 6694181 ] : 6695975 0.0868221 (=389625/(338*13277)) 91.541
given [ 6694181 ] : 6694181 0.10485 (=11130/(8*13269)) 91.1396
best keyword for cluster 6694181 is PF00069 with Jaccard = 0.7833 [ 10402 1905 1086931 973 ] 0.8452 0.9145
sibling [ 6694181 ] : 6676105 0.172108 (=4254/(231*107)) 87.1973
best keyword for cluster 6676105 is PF01453 with Jaccard = 0.6419 [ 294 13 1099753 151 ] 0.9577 0.6607
SUGGESTING RELATEDNESS OF:
A> PF00069 ( PF00069 Protein kinase domain )
B> PF01453 ( PF01453 D-mannose binding lectin )
Only A has a clan ( CL0016.14 ).
the two keywords coincide on UniRef90 proteins: |PF00069| = 11375 , |PF01453| = 445 , |PF00069^PF01453| = 162 ( 1.4% and 36.4% )
both PF00069 and PF01453 have PDB structures
PF01453 b.78.1.1
SUPERFAM mapping significantly overlapping:
1 PF01453 SSF51110 0.759 (average over 1305 mutual instances, PF01453 2025 appearances, SSF51110 3868 appearances)
2 PF00069 SSF56112 0.797 (average over 32363 mutual instances, PF00069 36405 appearances, SSF56112 66637 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 647 ) 6742870_PF05101_PF05245 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05101 is 6702318 with Jaccard = 0.7818 |PF05101|=55 [ 43 0 1100156 12 ]
parent [ 6702318 ] : 6742870 0.0311966 (=73/(45*52)) 98.0401
given [ 6702318 ] : 6702318 0.0965909 (=34/(8*44)) 92.5928
best keyword for cluster 6702318 is PF05101 with Jaccard = 0.7818 [ 43 0 1100156 12 ] 1.0000 0.7818
sibling [ 6702318 ] : 6552559 0.612903 (=266/(14*31)) 40.9073
best keyword for cluster 6552559 is PF05245 with Jaccard = 0.7059 [ 12 4 1100194 1 ] 0.7500 0.9231
SUGGESTING RELATEDNESS OF:
A> PF05101 ( PF05101 Type IV secretory pathway, VirB3-like protein )
B> PF05245 ( PF05245 Conjugal transfer protein TrbD )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF05101 nor PF05245 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 648 ) 6754528_PF01575_PF07977 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01575 is 6732221 with Jaccard = 0.7770 |PF01575|=570 [ 453 13 1099628 117 ]
parent [ 6732221 ] : 6754528 0.015873 (=6144/(672*576)) 98.9393
given [ 6732221 ] : 6732221 0.0309345 (=144/(7*665)) 96.9692
best keyword for cluster 6732221 is PF01575 with Jaccard = 0.7770 [ 453 13 1099628 117 ] 0.9721 0.7947
sibling [ 6732221 ] : 6736699 0.0330435 (=19/(1*575)) 97.4489
best keyword for cluster 6736699 is PF07977 with Jaccard = 0.6835 [ 324 123 1099737 27 ] 0.7248 0.9231
SUGGESTING RELATEDNESS OF:
A> PF01575 ( PF01575 MaoC like domain )
B> PF07977 ( PF07977 FabA-like domain )
they come from the same clan: CL0050.7 : PF03061 PF01643 PF02551 PF07977 PF01575
the two keywords do not coincide on UniRef90 proteins
both PF01575 and PF07977 have PDB structures
PF07977 d.38.1.2 d.38.1.6
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 649 ) 6745191_PF04535_PF07911 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04535 is 6690734 with Jaccard = 0.7765 |PF04535|=85 [ 66 0 1100126 19 ]
parent [ 6690734 ] : 6745191 0.0191053 (=41/(74*29)) 98.2489
given [ 6690734 ] : 6690734 0.117457 (=109/(16*58)) 90.4268
best keyword for cluster 6690734 is PF04535 with Jaccard = 0.7765 [ 66 0 1100126 19 ] 1.0000 0.7765
sibling [ 6690734 ] : 6704172 0.0714286 (=2/(1*28)) 92.9429
best keyword for cluster 6704172 is PF07911 with Jaccard = 1.0000 [ 23 0 1100188 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04535 ( PF04535 Domain of unknown function (DUF588) )
B> PF07911 ( PF07911 Protein of unknown function (DUF1677) )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF04535| = 85 , |PF07911| = 23 , |PF04535^PF07911| = 1 ( 1.2% and 4.3% )
Neither PF04535 nor PF07911 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 650 ) 6619387_PF00836_PF05672 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF05672 is 6606333 with Jaccard = 0.7742 |PF05672|=30 [ 24 1 1100180 6 ]
parent [ 6606333 ] : 6619387 0.346915 (=1822/(101*52)) 70.2099
given [ 6606333 ] : 6606333 0.402174 (=111/(6*46)) 64.7943
best keyword for cluster 6606333 is PF05672 with Jaccard = 0.7742 [ 24 1 1100180 6 ] 0.9600 0.8000
sibling [ 6606333 ] : 6602233 0.427673 (=1088/(48*53)) 62.8107
best keyword for cluster 6602233 is PF00836 with Jaccard = 0.8788 [ 29 4 1100178 0 ] 0.8788 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05672 ( PF05672 MAP7 (E-MAP-115) family )
B> PF00836 ( PF00836 Stathmin family )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF05672 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF00836 SSF101494 0.963 (average over 91 mutual instances, PF00836 91 appearances, SSF101494 91 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 651 ) 6762276_PF01381_PF02486 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01381 is 6760686 with Jaccard = 0.7684 |PF01381|=3353 [ 3006 559 1096299 347 ]
parent [ 6760686 ] : 6762276 0.00821969 (=4470/(104*5229)) 99.3842
given [ 6760686 ] : 6760686 0.0099152 (=4034/(79*5150)) 99.301
best keyword for cluster 6760686 is PF01381 with Jaccard = 0.7684 [ 3006 559 1096299 347 ] 0.8432 0.8965
sibling [ 6760686 ] : 6755100 0.0134921 (=17/(14*90)) 98.9755
best keyword for cluster 6755100 is PF02486 with Jaccard = 0.9506 [ 77 3 1100130 1 ] 0.9625 0.9872
SUGGESTING RELATEDNESS OF:
A> PF01381 ( PF01381 Helix-turn-helix )
B> PF02486 ( PF02486 Replication initiation factor )
Only A has a clan ( CL0123.12 ).
the two keywords coincide on UniRef90 proteins: |PF01381| = 3353 , |PF02486| = 78 , |PF01381^PF02486| = 8 ( 0.2% and 10.3% )
only PF01381 has a PDB structure (may not be up to date)
PF01381 a.35.1.11 a.35.1.2 a.35.1.3
SUPERFAM mapping significantly overlapping:
1 PF01381 SSF47413 0.810 (average over 8999 mutual instances, PF01381 10797 appearances, SSF47413 20047 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 652 ) 6757628_PF05978_PF07690 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05978 is 6736932 with Jaccard = 0.7670 |PF05978|=79 [ 79 24 1100108 0 ]
parent [ 6736932 ] : 6757628 0.0123495 (=26074/(145*14561)) 99.1309
given [ 6736932 ] : 6736932 0.0275689 (=66/(126*19)) 97.4776
best keyword for cluster 6736932 is PF05978 with Jaccard = 0.7670 [ 79 24 1100108 0 ] 0.7670 1.0000
sibling [ 6736932 ] : 6757401 0.0111875 (=2441/(15*14546)) 99.1154
best keyword for cluster 6757401 is PF07690 with Jaccard = 0.7945 [ 10339 2422 1087197 253 ] 0.8102 0.9761
SUGGESTING RELATEDNESS OF:
A> PF05978 ( PF05978 Eukaryotic protein of unknown function (DUF895) )
B> PF07690 ( PF07690 Major Facilitator Superfamily )
they come from the same clan: CL0015.13 : PF00083 PF03209 PF00854 PF03137 PF03825 PF01733 PF06813 PF07672 PF07690 PF01306 PF01770 PF05978 PF05977 PF05631 PF04332 PF07673 PF06779 PF02487 PF03092 PF06609
the two keywords coincide on UniRef90 proteins: |PF05978| = 79 , |PF07690| = 10592 , |PF05978^PF07690| = 1 ( 1.3% and 0.0% )
only PF05978 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF07690 SSF103473 0.840 (average over 31421 mutual instances, PF07690 31552 appearances, SSF103473 39293 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 653 ) 6701988_PF00406_PF01202 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01202 is 6659576 with Jaccard = 0.7660 |PF01202|=481 [ 370 2 1099728 111 ]
parent [ 6659576 ] : 6701988 0.105209 (=52905/(531*947)) 92.5249
given [ 6659576 ] : 6659576 0.230781 (=10543/(423*108)) 83.2649
best keyword for cluster 6659576 is PF01202 with Jaccard = 0.7660 [ 370 2 1099728 111 ] 0.9946 0.7692
sibling [ 6659576 ] : 6693889 0.106285 (=301/(3*944)) 91.0758
best keyword for cluster 6693889 is PF00406 with Jaccard = 0.6344 [ 479 274 1099456 2 ] 0.6361 0.9958
SUGGESTING RELATEDNESS OF:
A> PF01202 ( PF01202 Shikimate kinase )
B> PF00406 ( PF00406 Adenylate kinase )
Only A has a clan ( CL0023.26 ).
the two keywords do not coincide on UniRef90 proteins
both PF01202 and PF00406 have PDB structures
PF00406 c.37.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 654 ) 6763647_PF04138_PF04794 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF04138 is 6725663 with Jaccard = 0.7650 |PF04138|=232 [ 179 2 1099977 53 ]
parent [ 6725663 ] : 6763647 0.0065849 (=182/(249*111)) 99.4482
given [ 6725663 ] : 6725663 0.0493724 (=118/(10*239)) 96.1989
best keyword for cluster 6725663 is PF04138 with Jaccard = 0.7650 [ 179 2 1099977 53 ] 0.9890 0.7716
sibling [ 6725663 ] : 6762612 0.00909091 (=1/(1*110)) 99.4
best keyword for cluster 6762612 is PF04794 with Jaccard = 0.9773 [ 86 2 1100123 0 ] 0.9773 1.0000
SUGGESTING RELATEDNESS OF:
A> PF04138 ( PF04138 GtrA-like protein )
B> PF04794 ( PF04794 YdjC-like protein )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF04138| = 232 , |PF04794| = 86 , |PF04138^PF04794| = 3 ( 1.3% and 3.5% )
Neither PF04138 nor PF04794 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 655 ) 6453480_PF00005_PF00664 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF00664 is 6450609 with Jaccard = 0.7598 |PF00664|=2604 [ 2420 581 1097026 184 ]
parent [ 6450609 ] : 6453480 0.987075 (=43874842/(13969*3182)) 1.60299
given [ 6450609 ] : 6450609 0.988648 (=176103/(3125*57)) 1.39078
best keyword for cluster 6450609 is PF00664 with Jaccard = 0.7598 [ 2420 581 1097026 184 ] 0.8064 0.9293
sibling [ 6450609 ] : 6448285 0.991553 (=14455140/(1136*12833)) 1.19485
best keyword for cluster 6448285 is PF00005 with Jaccard = 0.7048 [ 12863 1 1081961 5386 ] 0.9999 0.7049
SUGGESTING RELATEDNESS OF:
A> PF00664 ( PF00664 ABC transporter transmembrane region )
B> PF00005 ( PF00005 ABC transporter )
A and B come from different clans ( CL0241.3 , CL0023.26 ).
the two keywords coincide on UniRef90 proteins: |PF00005| = 18249 , |PF00664| = 2604 , |PF00005^PF00664| = 2535 ( 13.9% and 97.4% )
both PF00664 and PF00005 have PDB structures
PF00664 f.37.1.1 PF00005 c.37.1.12 j.35.1.1
SUPERFAM mapping significantly overlapping:
1 PF00664 SSF90123 0.746 (average over 7613 mutual instances, PF00664 7751 appearances, SSF90123 18042 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 656 ) 6527357_PF01500_PF05287 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01500 is 6524428 with Jaccard = 0.7586 |PF01500|=52 [ 44 6 1100153 8 ]
parent [ 6524428 ] : 6527357 0.792008 (=773/(16*61)) 24.5323
given [ 6524428 ] : 6524428 0.808786 (=626/(18*43)) 22.6091
best keyword for cluster 6524428 is PF01500 with Jaccard = 0.7586 [ 44 6 1100153 8 ] 0.8800 0.8462
sibling [ 6524428 ] : 6324943 1 (=60/(10*6)) 9.24524e-08
best keyword for cluster 6324943 is PF05287 with Jaccard = 0.7273 [ 16 0 1100189 6 ] 1.0000 0.7273
SUGGESTING RELATEDNESS OF:
A> PF01500 ( PF01500 Keratin, high sulfur B2 protein )
B> PF05287 ( PF05287 PMG protein )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
Neither PF01500 nor PF05287 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 657 ) 6729673_PF03435_PF06408 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF03435 is 6665268 with Jaccard = 0.7562 |PF03435|=277 [ 214 6 1099928 63 ]
parent [ 6665268 ] : 6729673 0.0421345 (=349/(251*33)) 96.6883
given [ 6665268 ] : 6665268 0.175639 (=2747/(115*136)) 84.3269
best keyword for cluster 6665268 is PF03435 with Jaccard = 0.7562 [ 214 6 1099928 63 ] 0.9727 0.7726
sibling [ 6665268 ] : 6687189 0.133333 (=12/(30*3)) 89.6812
best keyword for cluster 6687189 is PF06408 with Jaccard = 1.0000 [ 25 0 1100186 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF03435 ( PF03435 Saccharopine dehydrogenase )
B> PF06408 ( PF06408 Homospermidine synthase )
Only A has a clan ( CL0063.17 ).
the two keywords do not coincide on UniRef90 proteins
only PF03435 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 658 ) 6750141_PF00001_PF01748 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01748 is 6744778 with Jaccard = 0.7557 |PF01748|=317 [ 266 35 1099859 51 ]
parent [ 6744778 ] : 6750141 0.018445 (=44072/(5689*420)) 98.6287
given [ 6744778 ] : 6744778 0.0247355 (=886/(119*301)) 98.2175
best keyword for cluster 6744778 is PF01748 with Jaccard = 0.7557 [ 266 35 1099859 51 ] 0.8837 0.8391
sibling [ 6744778 ] : 6744405 0.023841 (=2164/(16*5673)) 98.1833
best keyword for cluster 6744405 is PF00001 with Jaccard = 0.9786 [ 5032 43 1095069 67 ] 0.9915 0.9869
SUGGESTING RELATEDNESS OF:
A> PF01748 ( PF01748 Caenorhabditis serpentine receptor-like protein )
B> PF00001 ( PF00001 7 transmembrane receptor (rhodopsin family) )
they come from the same clan: CL0192.7 : PF05296 PF03383 PF01748 PF04789 PF06976 PF01036 PF01461 PF00001 PF03402
the two keywords do not coincide on UniRef90 proteins
only PF01748 has a PDB structure (may not be up to date)
PF00001 f.13.1.2 i.22.1.1 j.101.1.1 j.35.1.1 j.82.1.1 j.94.1.1
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 659 ) 6674928_PF05899_PF06249 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF05899 is 6558535 with Jaccard = 0.7554 |PF05899|=139 [ 105 0 1100072 34 ]
parent [ 6558535 ] : 6674928 0.161402 (=838/(118*44)) 86.8651
given [ 6558535 ] : 6558535 0.580531 (=328/(5*113)) 45.5925
best keyword for cluster 6558535 is PF05899 with Jaccard = 0.7554 [ 105 0 1100072 34 ] 1.0000 0.7554
sibling [ 6558535 ] : 6644741 0.254826 (=66/(37*7)) 78.3594
best keyword for cluster 6644741 is PF06249 with Jaccard = 0.6129 [ 19 12 1100180 0 ] 0.6129 1.0000
SUGGESTING RELATEDNESS OF:
A> PF05899 ( PF05899 Protein of unknown function (DUF861) )
B> PF06249 ( PF06249 Ethanolamine utilisation protein EutQ )
they come from the same clan: CL0029.13 : PF01238 PF05726 PF02678 PF01050 PF02373 PF04209 PF06560 PF05523 PF06249 PF06339 PF04074 PF07385 PF00908 PF06172 PF08007 PF05899 PF07883 PF00190 PF05995 PF02041 PF05118 PF03079 PF02311 PF06052
the two keywords do not coincide on UniRef90 proteins
only PF05899 has a PDB structure (may not be up to date)
PF05899 b.82.1.11 b.82.1.8
SUPERFAM mapping significantly overlapping:
1 PF06249 SSF51182 0.695 (average over 75 mutual instances, PF06249 75 appearances, SSF51182 14255 appearances)
2 PF05899 SSF51182 0.709 (average over 422 mutual instances, PF05899 437 appearances, SSF51182 14255 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 660 ) 6676374_PF00359_PF00874 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00874 is 6602247 with Jaccard = 0.7517 |PF00874|=275 [ 215 11 1099925 60 ]
parent [ 6602247 ] : 6676374 0.136794 (=18645/(235*580)) 87.2803
given [ 6602247 ] : 6602247 0.41523 (=289/(3*232)) 62.8244
best keyword for cluster 6602247 is PF00874 with Jaccard = 0.7517 [ 215 11 1099925 60 ] 0.9513 0.7818
sibling [ 6602247 ] : 6668523 0.158868 (=275/(3*577)) 85.1586
best keyword for cluster 6668523 is PF00359 with Jaccard = 0.6166 [ 386 137 1099585 103 ] 0.7380 0.7894
SUGGESTING RELATEDNESS OF:
A> PF00874 ( PF00874 PRD domain )
B> PF00359 ( PF00359 Phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 2 )
Only A has a clan ( CL0166.7 ).
the two keywords coincide on UniRef90 proteins: |PF00359| = 489 , |PF00874| = 275 , |PF00359^PF00874| = 76 ( 15.5% and 27.6% )
both PF00874 and PF00359 have PDB structures
SUPERFAM mapping significantly overlapping:
1 PF00359 SSF55804 0.923 (average over 2008 mutual instances, PF00359 2875 appearances, SSF55804 4723 appearances)
2 PF00874 SSF63520 0.839 (average over 960 mutual instances, PF00874 1775 appearances, SSF63520 2527 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 661 ) 6719639_PF00593_PF01640 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF01640 is 6652465 with Jaccard = 0.7500 |PF01640|=7 [ 6 1 1100203 1 ]
parent [ 6652465 ] : 6719639 0.0485463 (=2012/(15*2763)) 95.3643
given [ 6652465 ] : 6652465 0.2 (=10/(10*5)) 80.82
best keyword for cluster 6652465 is PF01640 with Jaccard = 0.7500 [ 6 1 1100203 1 ] 0.8571 0.8571
sibling [ 6652465 ] : 6718932 0.054407 (=900/(6*2757)) 95.2683
best keyword for cluster 6718932 is PF00593 with Jaccard = 0.9224 [ 2188 104 1097839 80 ] 0.9546 0.9647
SUGGESTING RELATEDNESS OF:
A> PF01640 ( PF01640 Peptidase C10 family )
B> PF00593 ( PF00593 TonB dependent receptor )
A and B come from different clans ( CL0125.9 , CL0193.8 ).
the two keywords do not coincide on UniRef90 proteins
both PF01640 and PF00593 have PDB structures
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 662 ) 6705148_PF00096_PF07400 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF07400 is 6507050 with Jaccard = 0.7500 |PF07400|=3 [ 3 1 1100207 0 ]
parent [ 6507050 ] : 6705148 0.0702479 (=4386/(12*5203)) 93.1329
given [ 6507050 ] : 6507050 0.9 (=18/(2*10)) 14.069
best keyword for cluster 6507050 is PF07400 with Jaccard = 0.7500 [ 3 1 1100207 0 ] 0.7500 1.0000
sibling [ 6507050 ] : 6703989 0.0788276 (=4502/(11*5192)) 92.9129
best keyword for cluster 6703989 is PF00096 with Jaccard = 0.8179 [ 4205 255 1095070 681 ] 0.9428 0.8606
SUGGESTING RELATEDNESS OF:
A> PF07400 ( PF07400 Interleukin 11 )
B> PF00096 ( PF00096 Zinc finger, C2H2 type )
Only A has a clan ( CL0053.9 ).
the two keywords do not coincide on UniRef90 proteins
only PF07400 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF07400 SSF47266 0.807 (average over 2 mutual instances, PF07400 2 appearances, SSF47266 2488 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 663 ) 6748487_PF01257_PF06999 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF06999 is 6605528 with Jaccard = 0.7326 |PF06999|=86 [ 63 0 1100125 23 ]
parent [ 6605528 ] : 6748487 0.0197256 (=552/(66*424)) 98.5005
given [ 6605528 ] : 6605528 0.425366 (=436/(25*41)) 64.39
best keyword for cluster 6605528 is PF06999 with Jaccard = 0.7326 [ 63 0 1100125 23 ] 1.0000 0.7326
sibling [ 6605528 ] : 6694889 0.115498 (=4376/(296*128)) 91.2943
best keyword for cluster 6694889 is PF01257 with Jaccard = 0.8487 [ 258 22 1099907 24 ] 0.9214 0.9149
SUGGESTING RELATEDNESS OF:
A> PF06999 ( PF06999 Sucrase/ferredoxin-like )
B> PF01257 ( PF01257 Respiratory-chain NADH dehydrogenase 24 Kd subunit )
Neither A nor B is assigned a clan.
the two keywords do not coincide on UniRef90 proteins
only PF06999 has a PDB structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
1 PF01257 SSF52833 0.515 (average over 774 mutual instances, PF01257 782 appearances, SSF52833 34965 appearances)
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 664 ) 6696382_PF01585_PF07713 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01585 is 6690768 with Jaccard = 0.7322 |PF01585|=338 [ 257 13 1099860 81 ]
parent [ 6690768 ] : 6696382 0.119826 (=1297/(33*328)) 91.6347
given [ 6690768 ] : 6690768 0.125045 (=1378/(38*290)) 90.435
best keyword for cluster 6690768 is PF01585 with Jaccard = 0.7322 [ 257 13 1099860 81 ] 0.9519 0.7604
sibling [ 6690768 ] : 6591104 0.439655 (=51/(29*4)) 57.791
best keyword for cluster 6591104 is PF07713 with Jaccard = 1.0000 [ 26 0 1100185 0 ] 1.0000 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01585 ( PF01585 G-patch domain )
B> PF07713 ( PF07713 Protein of unknown function (DUF1604) )
Neither A nor B is assigned a clan.
the two keywords coincide on UniRef90 proteins: |PF01585| = 338 , |PF07713| = 26 , |PF01585^PF07713| = 10 ( 3.0% and 38.5% )
Neither PF01585 nor PF07713 has a structure (may not be up to date)
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 665 ) 6708570_PF00073_PF00915 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related?
======================
best cluster for keyword PF00073 is 6701144 with Jaccard = 0.7265 |PF00073|=433 [ 433 163 1099615 0 ]
parent [ 6701144 ] : 6708570 0.0678761 (=11112/(214*765)) 93.7389
given [ 6701144 ] : 6701144 0.0871217 (=9908/(202*563)) 92.3862
best keyword for cluster 6701144 is PF00073 with Jaccard = 0.7265 [ 433 163 1099615 0 ] 0.7265 1.0000
sibling [ 6701144 ] : 6682948 0.116904 (=74/(3*211)) 88.9262
best keyword for cluster 6682948 is PF00915 with Jaccard = 0.9675 [ 149 2 1100057 3 ] 0.9868 0.9803
SUGGESTING RELATEDNESS OF:
A> PF00073 ( PF00073 picornavirus capsid protein )
B> PF00915 ( PF00915 Calicivirus coat protein )
they come from the same clan: CL0055.7 : PF01318 PF00915 PF00760 PF01829 PF00073 PF00983 PF00729
the two keywords do not coincide on UniRef90 proteins
both PF00073 and PF00915 have PDB structures
PF00915 b.121.4.3
SUPERFAM mapping significantly overlapping:
HMM-Logos (old pfam site):A , B
MSA-full-fasta: A,B
manual inspection notes:

-------------------====== ( 666 ) 6651087_PF00390_PF01515 ==========------------------
====== the next two brothers (given, sibling) merge two good clusters ======================
====== (given is best cluster, sibling has J > 0.6 threshold) ======================
====== for two separate keywords - are the keywords related? ======================
best cluster for keyword PF01515 is 6545808 with Jaccard = 0.7197 |PF01515|=314 [ 226 0 1099897 88 ]
parent [ 6545808 ] : 6651087 0.197697 (=22161/(248*452)) 80.351
given [ 6545808 ] : 6545808 0.643725 (=159/(1*247)) 35.919
best keyword for cluster 6545808 is PF01515 with Jaccard = 0.7197 [ 226 0 1099897 88 ] 1.0000 0.7197
sibling [ 6545808 ] : 6565475 0.505556 (=455/(2*450)) 50.0974
best keyword for cluster 6565475 is PF00390 with Jaccard = 0.9739 [ 410 11 1099790 0 ] 0.9739 1.0000
SUGGESTING RELATEDNESS OF:
A> PF01515 ( PF01515 Phosphate acetyl/butaryl transferase )
B> PF00390 ( PF00390 Malic enzyme, N-terminal domain )
Only A has a clan ( CL0270.2 ).
the two keywords coincide on UniRef90 proteins: |PF00390| = 410 , |PF01515| = 314 , |PF00390^PF01515| = 88 ( 21.5% and 28.0% ) both PF01515 and PF00390 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 667 ) 6765896_PF05051_PF06747 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF06747 is 6761481 with Jaccard = 0.7155 |PF06747|=213 [ 166 19 1099979 47 ] parent [ 6761481 ] : 6765896 0.00650878 (=119/(47*389)) 99.5518 given [ 6761481 ] : 6761481 0.0109244 (=351/(119*270)) 99.3437 best keyword for cluster 6761481 is PF06747 with Jaccard = 0.7155 [ 166 19 1099979 47 ] 0.8973 0.7793 sibling [ 6761481 ] : 6756568 0.0217391 (=1/(1*46)) 99.0652 best keyword for cluster 6756568 is PF05051 with Jaccard = 1.0000 [ 35 0 1100176 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF06747 ( PF06747 CHCH domain ) B> PF05051 ( PF05051 Cytochrome C oxidase copper chaperone (COX17) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF06747 has a PDB structure (may not be up to date) PF05051 a.17.1.2 SUPERFAM mapping significantly overlapping: 1 PF06747 SSF47072 0.856 (average over 4 mutual instances, PF06747 4 appearances, SSF47072 24 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 668 ) 6664467_PF01391_PF07212 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01391 is 6664282 with Jaccard = 0.7152 |PF01391|=1159 [ 924 133 1098919 235 ] parent [ 6664282 ] : 6664467 0.170602 (=3360/(15*1313)) 84.1743 given [ 6664282 ] : 6664282 0.186005 (=731/(3*1310)) 84.1281 best keyword for cluster 6664282 is PF01391 with Jaccard = 0.7152 [ 924 133 1098919 235 ] 0.8742 0.7972 sibling [ 6664282 ] : 6550373 0.611111 (=22/(3*12)) 39.063 best keyword for cluster 6550373 is PF07212 with Jaccard = 1.0000 [ 9 0 1100202 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF01391 ( PF01391 Collagen triple helix repeat (20 copies) ) B> PF07212 ( PF07212 Hyaluronidase protein (HylP) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF01391 has a PDB structure (may not be up to date) PF01391 d.169.1.5 h.1.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 669 ) 6748873_PF04569_PF05754 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF05754 is 6736150 with Jaccard = 0.7147 |PF05754|=384 [ 278 5 1099822 106 ] parent [ 6736150 ] : 6748873 0.0204209 (=15965/(619*1263)) 98.5341 given [ 6736150 ] : 6736150 0.0357762 (=3425/(302*317)) 97.3905 best keyword for cluster 6736150 is PF05754 with Jaccard = 0.7147 [ 278 5 1099822 106 ] 0.9823 0.7240 sibling [ 6736150 ] : 6747727 0.0224261 (=4729/(198*1065)) 98.4455 best keyword for cluster 6747727 is PF04569 with Jaccard = 0.7317 [ 150 51 1100006 4 ] 0.7463 0.9740 SUGGESTING RELATEDNESS OF: A> PF05754 ( PF05754 Domain of unknown function (DUF834) ) B> PF04569 ( PF04569 Protein of unknown function ) Neither A nor B are assigned a clan. 
the two keywords coincide on UniRef90 proteins: |PF04569| = 154 , |PF05754| = 384 , |PF04569^PF05754| = 1 ( 0.6% and 0.3% ) Neither PF05754 nor PF04569 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 670 ) 6678846_PF00090_PF00200 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00090 is 6658050 with Jaccard = 0.7099 |PF00090|=489 [ 389 59 1099663 100 ] parent [ 6658050 ] : 6678846 0.127163 (=24693/(522*372)) 87.8929 given [ 6658050 ] : 6658050 0.218269 (=2086/(503*19)) 82.7142 best keyword for cluster 6658050 is PF00090 with Jaccard = 0.7099 [ 389 59 1099663 100 ] 0.8683 0.7955 sibling [ 6658050 ] : 6668806 0.159687 (=408/(7*365)) 85.1875 best keyword for cluster 6668806 is PF00200 with Jaccard = 0.7820 [ 269 75 1099867 0 ] 0.7820 1.0000 SUGGESTING RELATEDNESS OF: A> PF00090 ( PF00090 Thrombospondin type 1 domain ) B> PF00200 ( PF00200 Disintegrin ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF00090 and PF00200 have PDB structures PF00200 g.20.1.1 SUPERFAM mapping significantly overlapping: 1 PF00200 SSF57552 0.970 (average over 566 mutual instances, PF00200 567 appearances, SSF57552 1832 appearances) 2 PF00090 SSF82895 0.795 (average over 1393 mutual instances, PF00090 1737 appearances, SSF82895 3242 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 671 ) 6780465_PF01659_PF08015 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF08015 is 6773994 with Jaccard = 0.7045 |PF08015|=44 [ 31 0 1100167 13 ] parent [ 6773994 ] : 6780465 0.000589449 (=8/(87*156)) 99.9608 given [ 6773994 ] : 6773994 0.0021164 (=4/(42*45)) 99.8306 best keyword for cluster 6773994 is PF08015 with Jaccard = 0.7045 [ 31 0 1100167 13 ] 1.0000 0.7045 sibling [ 6773994 ] : 6778177 0.000988142 (=5/(46*110)) 99.9232 best keyword for cluster 6778177 is PF01659 with Jaccard = 0.8621 [ 25 4 1100182 0 ] 0.8621 1.0000 SUGGESTING RELATEDNESS OF: A> PF08015 ( PF08015 Fungal mating-type pheromone ) B> PF01659 ( PF01659 Luteovirus putative VPg genome linked protein ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins Neither PF08015 nor PF01659 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 672 ) 6752562_PF00096_PF05485 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF05485 is 6723534 with Jaccard = 0.7033 |PF05485|=91 [ 64 0 1100120 27 ] parent [ 6723534 ] : 6752562 0.0131453 (=11888/(118*7664)) 98.8076 given [ 6723534 ] : 6723534 0.0451632 (=155/(66*52)) 95.9325 best keyword for cluster 6723534 is PF05485 with Jaccard = 0.7033 [ 64 0 1100120 27 ] 1.0000 0.7033 sibling [ 6723534 ] : 6752225 0.0171924 (=5634/(43*7621)) 98.7845 best keyword for cluster 6752225 is PF00096 with Jaccard = 0.6418 [ 4375 1931 1093394 511 ] 0.6938 0.8954 SUGGESTING RELATEDNESS OF: A> PF05485 ( PF05485 THAP domain ) B> PF00096 ( PF00096 Zinc finger, C2H2 type ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF00096| = 4886 , |PF05485| = 91 , |PF00096^PF05485| = 8 ( 0.2% and 8.8% ) only PF05485 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 673 ) 6771366_PF00702_PF00982 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF00702 is 6769089 with Jaccard = 0.7019 |PF00702|=4884 [ 4772 1915 1093412 112 ] parent [ 6769089 ] : 6771366 0.00313522 (=10159/(8245*393)) 99.7556 given [ 6769089 ] : 6769089 0.00472453 (=78112/(3443*4802)) 99.6794 best keyword for cluster 6769089 is PF00702 with Jaccard = 0.7019 [ 4772 1915 1093412 112 ] 0.7136 0.9771 sibling [ 6769089 ] : 6769131 0.00510204 (=2/(1*392)) 99.6811 best keyword for cluster 6769131 is PF00982 with Jaccard = 0.6836 [ 229 105 1099876 1 ] 0.6856 0.9957 SUGGESTING RELATEDNESS OF: A> PF00702 ( PF00702 haloacid dehalogenase-like hydrolase ) B> PF00982 ( PF00982 Glycosyltransferase family 20 ) A and B come from different clans ( CL0137.9 , CL0113.8 ). the two keywords do not coincide on UniRef90 proteins both PF00702 and PF00982 have PDB structures PF00702 c.108.1.1 c.108.1.10 c.108.1.11 c.108.1.14 c.108.1.2 c.108.1.22 c.108.1.3 c.108.1.4 c.108.1.5 c.108.1.6 d.220.1.1 i.18.1.1 PF00982 c.87.1.6 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 674 ) 6672121_PF07018_PF07201 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07201 is 6650337 with Jaccard = 0.7000 |PF07201|=23 [ 21 7 1100181 2 ] parent [ 6650337 ] : 6672121 0.15505 (=109/(19*37)) 86.0511 given [ 6650337 ] : 6650337 0.211538 (=66/(13*24)) 80.1255 best keyword for cluster 6650337 is PF07201 with Jaccard = 0.7000 [ 21 7 1100181 2 ] 0.7500 0.9130 sibling [ 6650337 ] : 6603328 0.465909 (=41/(8*11)) 63.3105 best keyword for cluster 6603328 is PF07018 with Jaccard = 0.8000 [ 8 2 1100201 0 ] 0.8000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07201 ( PF07201 Hypersensitivity response secretion protein HrpJ ) B> PF07018 ( PF07018 SepL/SsaL protein ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF07201 nor PF07018 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 675 ) 6773436_PF00631_PF01990 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00631 is 6747899 with Jaccard = 0.6947 |PF00631|=94 [ 66 1 1100116 28 ] parent [ 6747899 ] : 6773436 0.0028909 (=23/(78*102)) 99.817 given [ 6747899 ] : 6747899 0.0267241 (=31/(58*20)) 98.4594 best keyword for cluster 6747899 is PF00631 with Jaccard = 0.6947 [ 66 1 1100116 28 ] 0.9851 0.7021 sibling [ 6747899 ] : 6763173 0.00990099 (=1/(1*101)) 99.4257 best keyword for cluster 6763173 is PF01990 with Jaccard = 1.0000 [ 92 0 1100119 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF00631 ( PF00631 GGL domain ) B> PF01990 ( PF01990 ATP synthase (F/14-kDa) subunit ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF00631 and PF01990 have PDB structures PF00631 a.137.3.1 j.103.1.1 SUPERFAM mapping significantly overlapping: 1 PF00631 SSF48670 0.846 (average over 252 mutual instances, PF00631 331 appearances, SSF48670 404 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 676 ) 6558673_PF02875_PF08353 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01225 is 6530772 with Jaccard = 0.6944 |PF01225|=774 [ 768 332 1099105 6 ] parent [ 6530772 ] : 6558673 0.597316 (=45218/(62*1221)) 45.7388 given [ 6530772 ] : 6530772 0.762885 (=284045/(591*630)) 26.1032 best keyword for cluster 6530772 is PF02875 with Jaccard = 0.7965 [ 1057 43 1098884 227 ] 0.9609 0.8232 sibling [ 6530772 ] : 6493015 0.916667 (=110/(2*60)) 9.11813 best keyword for cluster 6493015 is PF08353 with Jaccard = 0.8947 [ 51 6 1100154 0 ] 0.8947 1.0000 SUGGESTING RELATEDNESS OF: A> PF02875 ( PF02875 Mur ligase family, glutamate ligase domain ) B> PF08353 ( PF08353 Domain of unknown function (DUF1727) ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF02875 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 677 ) 6695940_PF01537_PF02400 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01537 is 6647811 with Jaccard = 0.6944 |PF01537|=36 [ 25 0 1100175 11 ] parent [ 6647811 ] : 6695940 0.112795 (=67/(27*22)) 91.5255 given [ 6647811 ] : 6647811 0.22 (=11/(2*25)) 79.2617 best keyword for cluster 6647811 is PF01537 with Jaccard = 0.6944 [ 25 0 1100175 11 ] 1.0000 0.6944 sibling [ 6647811 ] : 6668407 0.15 (=6/(2*20)) 85.08 best keyword for cluster 6668407 is PF02400 with Jaccard = 0.8636 [ 19 0 1100189 3 ] 1.0000 0.8636 SUGGESTING RELATEDNESS OF: A> PF01537 ( PF01537 Herpesvirus glycoprotein D ) B> PF02400 ( PF02400 Glycoprotein GG/GX ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF01537 has a PDB structure (may not be up to date) PF01537 b.1.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 678 ) 6643964_PF02403_PF03129 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03129 is 6626433 with Jaccard = 0.6929 |PF03129|=1189 [ 846 32 1098990 343 ] parent [ 6626433 ] : 6643964 0.240807 (=89125/(390*949)) 78.2006 given [ 6626433 ] : 6626433 0.320898 (=40308/(159*790)) 73.0202 best keyword for cluster 6626433 is PF03129 with Jaccard = 0.6929 [ 846 32 1098990 343 ] 0.9636 0.7115 sibling [ 6626433 ] : 6608037 0.413882 (=161/(1*389)) 65.9262 best keyword for cluster 6608037 is PF02403 with Jaccard = 0.9190 [ 329 27 1099853 2 ] 0.9242 0.9940 SUGGESTING RELATEDNESS OF: A> PF03129 ( PF03129 Anticodon binding domain ) B> PF02403 ( PF02403 Seryl-tRNA synthetase N-terminal domain ) Neither A nor B are assigned a clan. 
the two keywords do not coincide on UniRef90 proteins both PF03129 and PF02403 have PDB structures PF03129 c.51.1.1 SUPERFAM mapping significantly overlapping: 1 PF02403 SSF46589 0.950 (average over 1002 mutual instances, PF02403 1005 appearances, SSF46589 5268 appearances) 2 PF03129 SSF52954 0.857 (average over 3463 mutual instances, PF03129 5178 appearances, SSF52954 9421 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 679 ) 6574812_PF04968_PF05002 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF05002 is 6383908 with Jaccard = 0.6901 |PF05002|=71 [ 49 0 1100140 22 ] parent [ 6383908 ] : 6574812 0.494505 (=945/(49*39)) 52.1661 given [ 6383908 ] : 6383908 1 (=138/(46*3)) 0.000849054 best keyword for cluster 6383908 is PF05002 with Jaccard = 0.6901 [ 49 0 1100140 22 ] 1.0000 0.6901 sibling [ 6383908 ] : 6530945 0.77027 (=57/(2*37)) 26.2604 best keyword for cluster 6530945 is PF04968 with Jaccard = 0.9730 [ 36 0 1100174 1 ] 1.0000 0.9730 SUGGESTING RELATEDNESS OF: A> PF05002 ( PF05002 SGS domain ) B> PF04968 ( PF04968 CHORD ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF05002 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 680 ) 6646364_PF00059_PF00193 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF00193 is 6527604 with Jaccard = 0.6882 |PF00193|=93 [ 64 0 1100118 29 ] parent [ 6527604 ] : 6646364 0.218952 (=13787/(68*926)) 78.8773 given [ 6527604 ] : 6527604 0.778605 (=837/(43*25)) 24.845 best keyword for cluster 6527604 is PF00193 with Jaccard = 0.6882 [ 64 0 1100118 29 ] 1.0000 0.6882 sibling [ 6527604 ] : 6643806 0.24105 (=4794/(22*904)) 78.0925 best keyword for cluster 6643806 is PF00059 with Jaccard = 0.7384 [ 830 18 1099087 276 ] 0.9788 0.7505 SUGGESTING RELATEDNESS OF: A> PF00193 ( PF00193 Extracellular link domain ) B> PF00059 ( PF00059 Lectin C-type domain ) they come from the same clan: CL0056.7 : PF03440 PF01413 PF07979 PF00059 PF00193 the two keywords coincide on UniRef90 proteins: |PF00059| = 1106 , |PF00193| = 93 , |PF00059^PF00193| = 27 ( 2.4% and 29.0% ) both PF00193 and PF00059 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF00193 SSF56436 0.607 (average over 209 mutual instances, PF00193 295 appearances, SSF56436 4895 appearances) 2 PF00059 SSF56436 0.797 (average over 2340 mutual instances, PF00059 2804 appearances, SSF56436 4895 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 681 ) 6524585_PF00019_PF04709 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF04709 is 6391041 with Jaccard = 0.6875 |PF04709|=12 [ 11 4 1100195 1 ] parent [ 6391041 ] : 6524585 0.798955 (=4129/(16*323)) 22.7655 given [ 6391041 ] : 6391041 1 (=63/(7*9)) 0.00236828 best keyword for cluster 6391041 is PF04709 with Jaccard = 0.6875 [ 11 4 1100195 1 ] 0.7333 0.9167 sibling [ 6391041 ] : 6516282 0.84345 (=2640/(10*313)) 18.1837 best keyword for cluster 6516282 is PF00019 with Jaccard = 0.8119 [ 315 0 1099823 73 ] 1.0000 0.8119 SUGGESTING RELATEDNESS OF: A> PF04709 ( PF04709 Anti-Mullerian hormone, N terminal region ) B> PF00019 ( PF00019 Transforming growth factor beta like domain ) Only B has a clan ( CL0079.7 ). the two keywords coincide on UniRef90 proteins: |PF00019| = 388 , |PF04709| = 12 , |PF00019^PF04709| = 11 ( 2.8% and 91.7% ) only PF04709 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 682 ) 6755081_PF00615_PF00787 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00615 is 6750739 with Jaccard = 0.6855 |PF00615|=250 [ 194 33 1099928 56 ] parent [ 6750739 ] : 6755081 0.0116959 (=2024/(342*506)) 98.9747 given [ 6750739 ] : 6750739 0.0166147 (=112/(21*321)) 98.6738 best keyword for cluster 6750739 is PF00615 with Jaccard = 0.6855 [ 194 33 1099928 56 ] 0.8546 0.7760 sibling [ 6750739 ] : 6752634 0.0118812 (=6/(1*505)) 98.8126 best keyword for cluster 6752634 is PF00787 with Jaccard = 0.6625 [ 424 21 1099571 195 ] 0.9528 0.6850 SUGGESTING RELATEDNESS OF: A> PF00615 ( PF00615 Regulator of G protein signaling domain ) B> PF00787 ( PF00787 PX domain ) Only A has a clan ( CL0272.2 ). 
the two keywords coincide on UniRef90 proteins: |PF00615| = 250 , |PF00787| = 619 , |PF00615^PF00787| = 7 ( 2.8% and 1.1% ) both PF00615 and PF00787 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF00787 SSF64268 0.897 (average over 1424 mutual instances, PF00787 1915 appearances, SSF64268 2671 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 683 ) 6443664_PF03131_PF08383 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF08383 is 6209650 with Jaccard = 0.6818 |PF08383|=22 [ 15 0 1100189 7 ] parent [ 6209650 ] : 6443664 0.991398 (=461/(15*31)) 0.861333 given [ 6209650 ] : 6209650 1 (=56/(7*8)) 1.80004e-16 best keyword for cluster 6209650 is PF08383 with Jaccard = 0.6818 [ 15 0 1100189 7 ] 1.0000 0.6818 sibling [ 6209650 ] : 6427635 1 (=30/(1*30)) 0.227124 best keyword for cluster 6427635 is PF03131 with Jaccard = 0.6200 [ 31 0 1100161 19 ] 1.0000 0.6200 SUGGESTING RELATEDNESS OF: A> PF08383 ( PF08383 Maf N-terminal region ) B> PF03131 ( PF03131 bZIP Maf transcription factor ) Only B has a clan ( CL0018.10 ). 
the two keywords coincide on UniRef90 proteins: |PF03131| = 50 , |PF08383| = 22 , |PF03131^PF08383| = 22 ( 44.0% and 100.0% ) only PF08383 has a PDB structure (may not be up to date) PF03131 a.37.1.1 SUPERFAM mapping significantly overlapping: 1 PF03131 SSF47454 0.500 (average over 115 mutual instances, PF03131 115 appearances, SSF47454 529 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 684 ) 6764539_PF00428_PF00466 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00428 is 6753376 with Jaccard = 0.6791 |PF00428|=296 [ 201 0 1099915 95 ] parent [ 6753376 ] : 6764539 0.00800582 (=781/(458*213)) 99.4906 given [ 6753376 ] : 6753376 0.0141509 (=3/(1*212)) 98.8632 best keyword for cluster 6753376 is PF00428 with Jaccard = 0.6791 [ 201 0 1099915 95 ] 1.0000 0.6791 sibling [ 6753376 ] : 6731084 0.0371991 (=17/(1*457)) 96.8418 best keyword for cluster 6731084 is PF00466 with Jaccard = 0.9791 [ 374 6 1099829 2 ] 0.9842 0.9947 SUGGESTING RELATEDNESS OF: A> PF00428 ( PF00428 60s Acidic ribosomal protein ) B> PF00466 ( PF00466 Ribosomal protein L10 ) Neither A nor B are assigned a clan. 
the two keywords coincide on UniRef90 proteins: |PF00428| = 296 , |PF00466| = 376 , |PF00428^PF00466| = 73 ( 24.7% and 19.4% ) both PF00428 and PF00466 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 685 ) 6745345_PF00590_PF01890 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF01890 is 6712665 with Jaccard = 0.6765 |PF01890|=135 [ 92 1 1100075 43 ] parent [ 6712665 ] : 6745345 0.0177431 (=3430/(115*1681)) 98.2598 given [ 6712665 ] : 6712665 0.0582524 (=72/(12*103)) 94.3492 best keyword for cluster 6712665 is PF01890 with Jaccard = 0.6765 [ 92 1 1100075 43 ] 0.9892 0.6815 sibling [ 6712665 ] : 6727857 0.0459337 (=1072/(14*1667)) 96.4642 best keyword for cluster 6727857 is PF00590 with Jaccard = 0.7680 [ 1152 230 1098711 118 ] 0.8336 0.9071 SUGGESTING RELATEDNESS OF: A> PF01890 ( PF01890 CbiG ) B> PF00590 ( PF00590 Tetrapyrrole (Corrin/Porphyrin) Methylases ) Neither A nor B are assigned a clan. the two keywords coincide on UniRef90 proteins: |PF00590| = 1270 , |PF01890| = 135 , |PF00590^PF01890| = 43 ( 3.4% and 31.9% ) only PF01890 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 686 ) 6604579_PF05048_PF07602 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF07602 is 6559946 with Jaccard = 0.6765 |PF07602|=32 [ 23 2 1100177 9 ] parent [ 6559946 ] : 6604579 0.400558 (=3015/(39*193)) 63.9963 given [ 6559946 ] : 6559946 0.591667 (=213/(15*24)) 46.7929 best keyword for cluster 6559946 is PF07602 with Jaccard = 0.6765 [ 23 2 1100177 9 ] 0.9200 0.7188 sibling [ 6559946 ] : 6560014 0.581081 (=3870/(148*45)) 46.8474 best keyword for cluster 6560014 is PF05048 with Jaccard = 0.6259 [ 92 13 1100064 42 ] 0.8762 0.6866 SUGGESTING RELATEDNESS OF: A> PF07602 ( PF07602 Protein of unknown function (DUF1565) ) B> PF05048 ( PF05048 Periplasmic copper-binding protein (NosD) ) Only A has a clan ( CL0268.2 ). the two keywords do not coincide on UniRef90 proteins Neither PF07602 nor PF05048 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 687 ) 6543609_PF00516_PF00517 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF00517 is 6535611 with Jaccard = 0.6737 |PF00517|=2096 [ 1412 0 1098115 684 ] parent [ 6535611 ] : 6543609 0.65931 (=13143489/(14029*1421)) 34.3182 given [ 6535611 ] : 6535611 0.745948 (=2117/(2*1419)) 29.169 best keyword for cluster 6535611 is PF00517 with Jaccard = 0.6737 [ 1412 0 1098115 684 ] 1.0000 0.6737 sibling [ 6535611 ] : 6532858 0.758554 (=10641/(1*14028)) 27.5467 best keyword for cluster 6532858 is PF00516 with Jaccard = 0.8922 [ 13759 0 1084789 1663 ] 1.0000 0.8922 SUGGESTING RELATEDNESS OF: A> PF00517 ( PF00517 Envelope Polyprotein GP41 ) B> PF00516 ( PF00516 Envelope glycoprotein GP120 ) Neither A nor B are assigned a clan. 
the two keywords coincide on UniRef90 proteins: |PF00516| = 15422 , |PF00517| = 2096 , |PF00516^PF00517| = 1568 ( 10.2% and 74.8% ) both PF00517 and PF00516 have PDB structures PF00517 h.3.2.1 j.85.1.1 PF00516 d.172.1.1 j.53.1.1 SUPERFAM mapping significantly overlapping: 1 PF00516 SSF56502 0.967 (average over 73444 mutual instances, PF00516 73449 appearances, SSF56502 79042 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 688 ) 6756749_PF00587_PF04073 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF04073 is 6689420 with Jaccard = 0.6721 |PF04073|=427 [ 287 0 1099784 140 ] parent [ 6689420 ] : 6756749 0.0106988 (=8799/(335*2455)) 99.0757 given [ 6689420 ] : 6689420 0.113965 (=1105/(32*303)) 90.1349 best keyword for cluster 6689420 is PF04073 with Jaccard = 0.6721 [ 287 0 1099784 140 ] 1.0000 0.6721 sibling [ 6689420 ] : 6752593 0.0126175 (=247/(8*2447)) 98.81 best keyword for cluster 6752593 is PF00587 with Jaccard = 0.7547 [ 1652 525 1098022 12 ] 0.7588 0.9928 SUGGESTING RELATEDNESS OF: A> PF04073 ( PF04073 YbaK / prolyl-tRNA synthetases associated domain ) B> PF00587 ( PF00587 tRNA synthetase class II core domain (G, H, P, S and T) ) Only B has a clan ( CL0040.10 ). 
the two keywords coincide on UniRef90 proteins: |PF00587| = 1664 , |PF04073| = 427 , |PF00587^PF04073| = 136 ( 8.2% and 31.9% ) both PF04073 and PF00587 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF04073 SSF55826 0.873 (average over 1486 mutual instances, PF04073 1998 appearances, SSF55826 2765 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 689 ) 6719656_PF03029_PF06807 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? ====================== best cluster for keyword PF03029 is 6581374 with Jaccard = 0.6699 |PF03029|=206 [ 138 0 1100005 68 ] parent [ 6581374 ] : 6719656 0.0655866 (=1008/(141*109)) 95.3684 given [ 6581374 ] : 6581374 0.482014 (=134/(2*139)) 54.1992 best keyword for cluster 6581374 is PF03029 with Jaccard = 0.6699 [ 138 0 1100005 68 ] 1.0000 0.6699 sibling [ 6581374 ] : 6693227 0.107143 (=45/(4*105)) 90.937 best keyword for cluster 6693227 is PF06807 with Jaccard = 0.7083 [ 34 13 1100163 1 ] 0.7234 0.9714 SUGGESTING RELATEDNESS OF: A> PF03029 ( PF03029 Conserved hypothetical ATP binding protein ) B> PF06807 ( PF06807 Pre-mRNA cleavage complex II protein Clp1 ) Only A has a clan ( CL0017.14 ). the two keywords do not coincide on UniRef90 proteins Neither PF03029 nor PF06807 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 690 ) 6666877_PF01291_PF06875 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF01291 is 5775091 with Jaccard = 0.6667 |PF01291|=12 [ 8 0 1100199 4 ] parent [ 5775091 ] : 6666877 0.181818 (=32/(8*22)) 84.6952 given [ 5775091 ] : 5775091 1 (=7/(1*7)) 4.29043e-57 best keyword for cluster 5775091 is PF01291 with Jaccard = 0.6667 [ 8 0 1100199 4 ] 1.0000 0.6667 sibling [ 5775091 ] : 6628228 0.302083 (=29/(6*16)) 73.7835 best keyword for cluster 6628228 is PF06875 with Jaccard = 0.6667 [ 12 5 1100193 1 ] 0.7059 0.9231 SUGGESTING RELATEDNESS OF: A> PF01291 ( PF01291 LIF / OSM family ) B> PF06875 ( PF06875 Plethodontid receptivity factor PRF ) they come from the same clan: CL0053.9 : PF06875 PF01291 PF02024 PF00143 PF00489 PF02025 PF00727 PF02059 PF00715 PF03487 PF03039 PF07400 PF00726 PF00714 PF00103 PF01109 PF02947 PF00758 PF01110 PF02404 the two keywords do not coincide on UniRef90 proteins only PF01291 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF06875 SSF47266 0.873 (average over 209 mutual instances, PF06875 212 appearances, SSF47266 2488 appearances) 2 PF01291 SSF47266 0.910 (average over 29 mutual instances, PF01291 29 appearances, SSF47266 2488 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 691 ) 6726719_PF05747_PF06225 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two separate keywords - are the keywords related? 
====================== best cluster for keyword PF05747 is 6677373 with Jaccard = 0.6667 |PF05747|=9 [ 6 0 1100202 3 ] parent [ 6677373 ] : 6726719 0.0395349 (=17/(10*43)) 96.3268 given [ 6677373 ] : 6677373 0.125 (=3/(6*4)) 87.5379 best keyword for cluster 6677373 is PF05747 with Jaccard = 0.6667 [ 6 0 1100202 3 ] 1.0000 0.6667 sibling [ 6677373 ] : 6696267 0.101307 (=31/(34*9)) 91.6104 best keyword for cluster 6696267 is PF06225 with Jaccard = 0.8235 [ 28 6 1100177 0 ] 0.8235 1.0000 SUGGESTING RELATEDNESS OF: A> PF05747 ( PF05747 Poxvirus N2L protein ) B> PF06225 ( PF06225 Poxvirus A4/B15 family ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins Neither PF05747 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 692 ) 6636414_PF05318_PF06692 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF06692 is 6563892 with Jaccard = 0.6667 |PF06692|=2 [ 2 1 1100208 0 ] parent [ 6563892 ] : 6636414 0.282051 (=11/(3*13)) 76.0387 given [ 6563892 ] : 6563892 0.5 (=1/(1*2)) 50 best keyword for cluster 6563892 is PF06692 with Jaccard = 0.6667 [ 2 1 1100208 0 ] 0.6667 1.0000 sibling [ 6563892 ] : 6590933 0.454545 (=10/(2*11)) 57.6162 best keyword for cluster 6590933 is PF05318 with Jaccard = 0.8000 [ 12 0 1100196 3 ] 1.0000 0.8000 SUGGESTING RELATEDNESS OF: A> PF06692 ( PF06692 Melon necrotic spot virus P7B protein ) B> PF05318 ( PF05318 Tombusvirus movement protein ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF05318| = 15 , |PF06692| = 2 , |PF05318^PF06692| = 1 ( 6.7% and 50.0% ) Neither PF06692 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 693 ) 6750681_PF04239_PF07870 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF07870 is 6661366 with Jaccard = 0.6667 |PF07870|=21 [ 14 0 1100190 7 ] parent [ 6661366 ] : 6750681 0.0154044 (=48/(164*19)) 98.6673 given [ 6661366 ] : 6661366 0.166667 (=3/(1*18)) 83.5733 best keyword for cluster 6661366 is PF07870 with Jaccard = 0.6667 [ 14 0 1100190 7 ] 1.0000 0.6667 sibling [ 6661366 ] : 6560505 0.565217 (=273/(3*161)) 47.1195 best keyword for cluster 6560505 is PF04239 with Jaccard = 1.0000 [ 135 0 1100076 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF07870 ( PF07870 Protein of unknown function (DUF1657) ) B> PF04239 ( PF04239 Protein of unknown function (DUF421) ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF04239| = 135 , |PF07870| = 21 , |PF04239^PF07870| = 7 ( 5.2% and 33.3% ) Neither PF07870 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 694 ) 6561542_PF02955_PF08443 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF08443 is 6526079 with Jaccard = 0.6667 |PF08443|=207 [ 142 6 1099998 65 ] parent [ 6526079 ] : 6561542 0.53425 (=11738/(127*173)) 48.0237 given [ 6526079 ] : 6526079 0.777273 (=1026/(8*165)) 23.7269 best keyword for cluster 6526079 is PF08443 with Jaccard = 0.6667 [ 142 6 1099998 65 ] 0.9595 0.6860 sibling [ 6526079 ] : 6486085 0.935484 (=348/(3*124)) 7.1594 best keyword for cluster 6486085 is PF02955 with Jaccard = 1.0000 [ 112 0 1100099 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF08443 ( PF08443 RimK-like ATP-grasp domain ) B> PF02955 ( PF02955 Prokaryotic glutathione synthetase, ATP-grasp domain ) they come from the same clan: CL0179.8 : PF01740 PF08443 PF05770 PF02955 PF01071 PF04174 PF07478 PF02786 PF02655 PF08442 PF02222 PF02750 PF03133 the two keywords do not coincide on UniRef90 proteins both PF08443 and PF02955 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 695 ) 6756225_PF02810_PF05118 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02810 is 6743318 with Jaccard = 0.6580 |PF02810|=409 [ 379 167 1099635 30 ] parent [ 6743318 ] : 6756225 0.0100039 (=820/(752*109)) 99.046 given [ 6743318 ] : 6743318 0.0272359 (=222/(11*741)) 98.0826 best keyword for cluster 6743318 is PF02810 with Jaccard = 0.6580 [ 379 167 1099635 30 ] 0.6941 0.9267 sibling [ 6743318 ] : 6744839 0.037037 (=4/(1*108)) 98.2222 best keyword for cluster 6744839 is PF05118 with Jaccard = 0.9425 [ 82 3 1100124 2 ] 0.9647 0.9762 SUGGESTING RELATEDNESS OF: A> PF02810 ( PF02810 SEC-C motif ) B> PF05118 ( PF05118 Aspartyl/Asparaginyl beta-hydroxylase ) Only B has a clan ( CL0029.13 ). the two keywords coincide on Uniref90 proteins: |PF02810| = 409 , |PF05118| = 84 , |PF02810^PF05118| = 2 ( 0.5% and 2.4% ) both PF02810 and PF05118 have PDB structures PF05118 b.82.2.4 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 696 ) 6731963_PF03446_PF03807 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF03807 is 6581897 with Jaccard = 0.6541 |PF03807|=554 [ 363 1 1099656 191 ] parent [ 6581897 ] : 6731963 0.0407252 (=22407/(393*1400)) 96.9353 given [ 6581897 ] : 6581897 0.479381 (=930/(5*388)) 54.3751 best keyword for cluster 6581897 is PF03807 with Jaccard = 0.6541 [ 363 1 1099656 191 ] 0.9973 0.6552 sibling [ 6581897 ] : 6722456 0.0544 (=17952/(300*1100)) 95.7783 best keyword for cluster 6722456 is PF03446 with Jaccard = 0.6202 [ 769 453 1098971 18 ] 0.6293 0.9771 SUGGESTING RELATEDNESS OF: A> PF03807 ( PF03807 NADP oxidoreductase coenzyme F420-dependent ) B> PF03446 ( PF03446 NAD binding domain of 6-phosphogluconate dehydrogenase ) they come from the same clan: CL0063.17 : PF03721 PF04820 PF02254 PF00899 PF01946 PF02882 PF01488 PF01118 PF08491 PF03435 PF04321 PF07992 PF00070 PF02719 PF02153 PF02423 PF05368 PF01210 PF07994 PF07993 PF03447 PF03446 PF01225 PF06039 PF01232 PF03949 PF05834 PF00056 PF08659 PF07991 PF03486 PF00044 PF00732 PF01134 PF01408 PF00996 PF00479 PF00743 PF01494 PF00890 PF03807 PF01370 PF00208 PF02670 PF01113 PF01266 PF02629 PF02558 PF01593 PF01262 PF00670 PF00107 PF00106 PF02737 PF01073 PF02826 the two keywords coincide on Uniref90 proteins: |PF03446| = 787 , |PF03807| = 554 , |PF03446^PF03807| = 2 ( 0.3% and 0.4% ) both PF03807 and PF03446 have PDB structures PF03807 c.2.1.6 SUPERFAM mapping significantly overlapping: 1 PF03807 SSF51735 0.687 (average over 1471 mutual instances, PF03807 1503 appearances, SSF51735 164772 appearances) 2 PF03446 SSF51735 0.944 (average over 2816 mutual instances, PF03446 5512 appearances, SSF51735 164772 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 697 ) 6774609_PF01402_PF03681 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for 
two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01402 is 6772743 with Jaccard = 0.6460 |PF01402|=408 [ 312 75 1099728 96 ] parent [ 6772743 ] : 6774609 0.0023806 (=1361/(581*984)) 99.8461 given [ 6772743 ] : 6772743 0.00334821 (=624/(256*728)) 99.7975 best keyword for cluster 6772743 is PF01402 with Jaccard = 0.6460 [ 312 75 1099728 96 ] 0.8062 0.7647 sibling [ 6772743 ] : 6766847 0.00561542 (=345/(442*139)) 99.5924 best keyword for cluster 6766847 is PF03681 with Jaccard = 0.7405 [ 234 81 1099895 1 ] 0.7429 0.9957 SUGGESTING RELATEDNESS OF: A> PF01402 ( PF01402 Ribbon-helix-helix protein, copG family ) B> PF03681 ( PF03681 Uncharacterised protein family (UPF0150) ) Only A has a clan ( CL0057.9 ). the two keywords coincide on Uniref90 proteins: |PF01402| = 408 , |PF03681| = 235 , |PF01402^PF03681| = 14 ( 3.4% and 6.0% ) only PF01402 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF01402 SSF47598 0.835 (average over 310 mutual instances, PF01402 324 appearances, SSF47598 883 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 698 ) 6740759_PF01822_PF03659 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01822 is 6732267 with Jaccard = 0.6441 |PF01822|=114 [ 76 4 1100093 38 ] parent [ 6732267 ] : 6740759 0.0221798 (=151/(46*148)) 97.8541 given [ 6732267 ] : 6732267 0.0326087 (=45/(138*10)) 96.9757 best keyword for cluster 6732267 is PF01822 with Jaccard = 0.6441 [ 76 4 1100093 38 ] 0.9500 0.6667 sibling [ 6732267 ] : 6625911 0.278049 (=57/(5*41)) 72.9853 best keyword for cluster 6625911 is PF03659 with Jaccard = 0.9268 [ 38 1 1100170 2 ] 0.9744 0.9500 SUGGESTING RELATEDNESS OF: A> PF01822 ( PF01822 WSC domain ) B> PF03659 ( PF03659 Glycosyl hydrolase family 71 ) Neither A nor B are assigned a clan. the two keywords coincide on Uniref90 proteins: |PF01822| = 114 , |PF03659| = 40 , |PF01822^PF03659| = 2 ( 1.8% and 5.0% ) Neither PF01822 have structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 699 ) 6509980_PF00906_PF08290 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF08290 is 6379878 with Jaccard = 0.6429 |PF08290|=28 [ 18 0 1100183 10 ] parent [ 6379878 ] : 6509980 0.84639 (=551/(21*31)) 15.4148 given [ 6379878 ] : 6379878 1 (=20/(1*20)) 0.00049646 best keyword for cluster 6379878 is PF08290 with Jaccard = 0.6429 [ 18 0 1100183 10 ] 1.0000 0.6429 sibling [ 6379878 ] : 6452822 1 (=30/(1*30)) 1.56867 best keyword for cluster 6452822 is PF00906 with Jaccard = 0.6444 [ 29 0 1100166 16 ] 1.0000 0.6444 SUGGESTING RELATEDNESS OF: A> PF08290 ( PF08290 Hepatitis core protein, putative zinc finger ) B> PF00906 ( PF00906 Hepatitis core antigen ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF00906| = 45 , |PF08290| = 28 , |PF00906^PF08290| = 22 ( 48.9% and 78.6% ) only PF08290 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: 1 PF00906 SSF47852 0.837 (average over 2080 mutual instances, PF00906 2081 appearances, SSF47852 3170 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 700 ) 6737655_PF00037_PF00384 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF00037 is 6720702 with Jaccard = 0.6357 |PF00037|=4327 [ 3387 1001 1094883 940 ] parent [ 6720702 ] : 6737655 0.0304517 (=283631/(5218*1785)) 97.5548 given [ 6720702 ] : 6720702 0.0707303 (=369/(1*5217)) 95.5031 best keyword for cluster 6720702 is PF00037 with Jaccard = 0.6357 [ 3387 1001 1094883 940 ] 0.7719 0.7828 sibling [ 6720702 ] : 6730037 0.0391315 (=3064/(45*1740)) 96.7255 best keyword for cluster 6730037 is PF00384 with Jaccard = 0.8944 [ 1381 150 1098667 13 ] 0.9020 0.9907 SUGGESTING RELATEDNESS OF: A> PF00037 ( PF00037 4Fe-4S binding domain ) B> PF00384 ( PF00384 Molybdopterin oxidoreductase ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF00037| = 4327 , |PF00384| = 1394 , |PF00037^PF00384| = 103 ( 2.4% and 7.4% ) both PF00037 and PF00384 have PDB structures PF00037 d.58.1.1 d.58.1.2 d.58.1.5 i.4.1.1 SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 701 ) 6770483_PF00235_PF03259 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF03259 is 6764885 with Jaccard = 0.6301 |PF03259|=170 [ 109 3 1100038 61 ] parent [ 6764885 ] : 6770483 0.00354773 (=116/(173*189)) 99.7271 given [ 6764885 ] : 6764885 0.00632911 (=31/(31*158)) 99.5073 best keyword for cluster 6764885 is PF03259 with Jaccard = 0.6301 [ 109 3 1100038 61 ] 0.9732 0.6412 sibling [ 6764885 ] : 6759556 0.0144928 (=28/(12*161)) 99.2412 best keyword for cluster 6759556 is PF00235 with Jaccard = 1.0000 [ 119 0 1100092 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF03259 ( PF03259 Roadblock/LC7 domain ) B> PF00235 ( PF00235 Profilin ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins both PF03259 and PF00235 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF00235 SSF55770 0.955 (average over 395 mutual instances, PF00235 395 appearances, SSF55770 397 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 702 ) 6735807_PF00165_PF01965 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01965 is 6672831 with Jaccard = 0.6287 |PF01965|=1072 [ 674 0 1099139 398 ] parent [ 6672831 ] : 6735807 0.0287198 (=79647/(760*3649)) 97.3532 given [ 6672831 ] : 6672831 0.146631 (=333/(3*757)) 86.3036 best keyword for cluster 6672831 is PF01965 with Jaccard = 0.6287 [ 674 0 1099139 398 ] 1.0000 0.6287 sibling [ 6672831 ] : 6732735 0.0314039 (=1939/(17*3632)) 97.0212 best keyword for cluster 6732735 is PF00165 with Jaccard = 0.8007 [ 2559 492 1097015 145 ] 0.8387 0.9464 SUGGESTING RELATEDNESS OF: A> PF01965 ( PF01965 DJ-1/PfpI family ) B> PF00165 ( PF00165 Bacterial regulatory helix-turn-helix proteins, AraC family ) A and B come from a different clan ( CL0014.17 , CL0123.12 ). the two keywords coincide on Uniref90 proteins: |PF00165| = 2704 , |PF01965| = 1072 , |PF00165^PF01965| = 271 ( 10.0% and 25.3% ) both PF01965 and PF00165 have PDB structures PF01965 c.23.16.2 PF00165 a.4.1.8 i.11.1.1 SUPERFAM mapping significantly overlapping: 1 PF00165 SSF46689 0.817 (average over 10023 mutual instances, PF00165 14372 appearances, SSF46689 68153 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 703 ) 6621544_PF02189_PF07213 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF02189 is 6488687 with Jaccard = 0.6200 |PF02189|=49 [ 31 1 1100161 18 ] parent [ 6488687 ] : 6621544 0.313636 (=138/(40*11)) 71.0381 given [ 6488687 ] : 6488687 0.944862 (=377/(19*21)) 7.88283 best keyword for cluster 6488687 is PF02189 with Jaccard = 0.6200 [ 31 1 1100161 18 ] 0.9688 0.6327 sibling [ 6488687 ] : 6538677 0.833333 (=25/(5*6)) 31.3245 best keyword for cluster 6538677 is PF07213 with Jaccard = 1.0000 [ 4 0 1100207 0 ] 1.0000 1.0000 SUGGESTING RELATEDNESS OF: A> PF02189 ( PF02189 Immunoreceptor tyrosine-based activation motif ) B> PF07213 ( PF07213 DAP10 membrane protein ) Neither A nor B are assigned a clan. the two keywords do not coincide on UniRef90 proteins only PF02189 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 704 ) 6626512_PF00095_PF02822 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF02822 is 6517172 with Jaccard = 0.6154 |PF02822|=26 [ 16 0 1100185 10 ] parent [ 6517172 ] : 6626512 0.279123 (=484/(17*102)) 73.0889 given [ 6517172 ] : 6517172 0.865385 (=45/(4*13)) 18.8895 best keyword for cluster 6517172 is PF02822 with Jaccard = 0.6154 [ 16 0 1100185 10 ] 1.0000 0.6154 sibling [ 6517172 ] : 6625748 0.33 (=66/(2*100)) 72.8359 best keyword for cluster 6625748 is PF00095 with Jaccard = 0.6715 [ 92 0 1100074 45 ] 1.0000 0.6715 SUGGESTING RELATEDNESS OF: A> PF02822 ( PF02822 Antistasin family ) B> PF00095 ( PF00095 WAP-type (Whey Acidic Protein) 'four-disulfide core' ) Neither A nor B are assigned a clan. 
the two keywords coincide on Uniref90 proteins: |PF00095| = 137 , |PF02822| = 26 , |PF00095^PF02822| = 6 ( 4.4% and 23.1% ) both PF02822 and PF00095 have PDB structures PF02822 g.3.15.1 SUPERFAM mapping significantly overlapping: 1 PF00095 SSF57256 0.887 (average over 257 mutual instances, PF00095 361 appearances, SSF57256 386 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 705 ) 6719519_PF00488_PF01713 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01713 is 6700202 with Jaccard = 0.6142 |PF01713|=316 [ 199 8 1099887 117 ] parent [ 6700202 ] : 6719519 0.0521141 (=9058/(687*253)) 95.3499 given [ 6700202 ] : 6700202 0.1002 (=1504/(95*158)) 92.214 best keyword for cluster 6700202 is PF01713 with Jaccard = 0.6142 [ 199 8 1099887 117 ] 0.9614 0.6297 sibling [ 6700202 ] : 6694179 0.0912873 (=373/(6*681)) 91.1387 best keyword for cluster 6694179 is PF00488 with Jaccard = 0.9400 [ 580 29 1099594 8 ] 0.9524 0.9864 SUGGESTING RELATEDNESS OF: A> PF01713 ( PF01713 Smr domain ) B> PF00488 ( PF00488 MutS domain V ) Only B has a clan ( CL0023.26 ). 
the two keywords coincide on Uniref90 proteins: |PF00488| = 588 , |PF01713| = 316 , |PF00488^PF01713| = 94 ( 16.0% and 29.7% ) only PF01713 has a PDB structure (may not be up to date) SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 706 ) 6771123_PF00579_PF01479 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? ====================== best cluster for keyword PF01479 is 6750128 with Jaccard = 0.6071 |PF01479|=2079 [ 1817 914 1097218 262 ] parent [ 6750128 ] : 6771123 0.00392019 (=10137/(3123*828)) 99.7475 given [ 6750128 ] : 6750128 0.0192498 (=46916/(1594*1529)) 98.6269 best keyword for cluster 6750128 is PF01479 with Jaccard = 0.6071 [ 1817 914 1097218 262 ] 0.6653 0.8740 sibling [ 6750128 ] : 6769686 0.00483676 (=4/(1*827)) 99.7001 best keyword for cluster 6769686 is PF00579 with Jaccard = 0.9934 [ 755 1 1099451 4 ] 0.9987 0.9947 SUGGESTING RELATEDNESS OF: A> PF01479 ( PF01479 S4 domain ) B> PF00579 ( PF00579 tRNA synthetases class I (W and Y) ) Only B has a clan ( CL0038.9 ). the two keywords coincide on Uniref90 proteins: |PF00579| = 759 , |PF01479| = 2079 , |PF00579^PF01479| = 198 ( 26.1% and 9.5% ) both PF01479 and PF00579 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 707 ) 6704263_PF01436_PF08450 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF01436 is 6692435 with Jaccard = 0.6070 |PF01436|=306 [ 190 7 1099898 116 ] parent [ 6692435 ] : 6704263 0.090432 (=9857/(367*297)) 92.9729 given [ 6692435 ] : 6692435 0.110837 (=225/(7*290)) 90.7586 best keyword for cluster 6692435 is PF01436 with Jaccard = 0.6070 [ 190 7 1099898 116 ] 0.9645 0.6209 sibling [ 6692435 ] : 6634810 0.273158 (=7201/(269*98)) 75.6875 best keyword for cluster 6634810 is PF08450 with Jaccard = 0.6741 [ 211 94 1099898 8 ] 0.6918 0.9635 SUGGESTING RELATEDNESS OF: A> PF01436 ( PF01436 NHL repeat ) B> PF08450 ( PF08450 SMP-30/Gluconolaconase/LRE-like region ) they come from the same clan: CL0186.8 : PF03088 PF08450 PF06739 PF07494 PF01011 PF02897 PF07676 PF08801 PF01436 PF06433 PF00058 PF01839 PF00930 PF02239 PF01731 PF00400 the two keywords coincide on Uniref90 proteins: |PF01436| = 306 , |PF08450| = 219 , |PF01436^PF08450| = 1 ( 0.3% and 0.5% ) both PF01436 and PF08450 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 708 ) 6701752_PF00069_PF00560 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00560 is 6701479 with Jaccard = 0.6048 |PF00560|=5445 [ 3346 87 1094679 2099 ] parent [ 6701479 ] : 6701752 0.0780373 (=4048390/(13652*3800)) 92.4987 given [ 6701479 ] : 6701479 0.0982917 (=4839/(13*3787)) 92.4359 best keyword for cluster 6701479 is PF00560 with Jaccard = 0.6048 [ 3346 87 1094679 2099 ] 0.9747 0.6145 sibling [ 6701479 ] : 6699288 0.0817025 (=27834/(25*13627)) 92.0576 best keyword for cluster 6699288 is PF00069 with Jaccard = 0.7678 [ 10431 2211 1086625 944 ] 0.8251 0.9170 SUGGESTING RELATEDNESS OF: A> PF00560 ( PF00560 Leucine Rich Repeat ) B> PF00069 ( PF00069 Protein kinase domain ) A and B come from a different clan ( CL0022.25 , CL0016.14 ). the two keywords coincide on Uniref90 proteins: |PF00069| = 11375 , |PF00560| = 5445 , |PF00069^PF00560| = 530 ( 4.7% and 9.7% ) both PF00560 and PF00069 have PDB structures SUPERFAM mapping significantly overlapping: 1 PF00069 SSF56112 0.797 (average over 32363 mutual instances, PF00069 36405 appearances, SSF56112 66637 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 709 ) 6728229_PF00313_PF00545 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF00545 is 6584449 with Jaccard = 0.6032 |PF00545|=63 [ 38 0 1100148 25 ] parent [ 6584449 ] : 6728229 0.0355311 (=1455/(45*910)) 96.502 given [ 6584449 ] : 6584449 0.497696 (=216/(31*14)) 55.0605 best keyword for cluster 6584449 is PF00545 with Jaccard = 0.6032 [ 38 0 1100148 25 ] 1.0000 0.6032 sibling [ 6584449 ] : 6690491 0.116722 (=423/(4*906)) 90.3619 best keyword for cluster 6690491 is PF00313 with Jaccard = 0.9095 [ 724 65 1099415 7 ] 0.9176 0.9904 SUGGESTING RELATEDNESS OF: A> PF00545 ( PF00545 ribonuclease ) B> PF00313 ( PF00313 'Cold-shock' DNA-binding domain ) Only B has a clan ( CL0021.12 ). the two keywords coincide on Uniref90 proteins: |PF00313| = 731 , |PF00545| = 63 , |PF00313^PF00545| = 1 ( 0.1% and 1.6% ) both PF00545 and PF00313 have PDB structures PF00313 b.40.4.5 SUPERFAM mapping significantly overlapping: 1 PF00313 SSF50249 0.971 (average over 2759 mutual instances, PF00313 2782 appearances, SSF50249 52669 appearances) 2 PF00545 SSF53933 0.914 (average over 162 mutual instances, PF00545 165 appearances, SSF53933 193 appearances) HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: -------------------====== ( 710 ) 6673211_PF02568_PF03054 ==========------------------ ====== the next two brothers (given, sibling) merge two good clusters ====================== ====== (given is best cluster, sibling has J > 0.6 threshold) ====================== ====== for two seperate keywords - are the keywords related? 
====================== best cluster for keyword PF03054 is 6536179 with Jaccard = 0.6004 |PF03054|=457 [ 275 1 1099753 182 ] parent [ 6536179 ] : 6673211 0.170046 (=9469/(301*185)) 86.3916 given [ 6536179 ] : 6536179 0.764214 (=457/(2*299)) 29.6917 best keyword for cluster 6536179 is PF03054 with Jaccard = 0.6004 [ 275 1 1099753 182 ] 0.9964 0.6018 sibling [ 6536179 ] : 6625216 0.292549 (=746/(15*170)) 72.55 best keyword for cluster 6625216 is PF02568 with Jaccard = 0.7976 [ 134 32 1100043 2 ] 0.8072 0.9853 SUGGESTING RELATEDNESS OF: A> PF03054 ( PF03054 tRNA methyl transferase ) B> PF02568 ( PF02568 Thiamine biosynthesis protein (ThiI) ) they come from the same clan: CL0039.7 : PF00764 PF00733 PF01171 PF01902 PF06508 PF02540 PF01507 PF02568 PF03054 the two keywords do not coincide on UniRef90 proteins both PF03054 and PF02568 have PDB structures SUPERFAM mapping significantly overlapping: HMM-Logos (old pfam site):A , B MSA-full-fasta: A,B manual inspection notes: