0. KIJ_redundant_Proteins/
   - Redundant protein set predicted from 29,082 KIJ_Genomes (protein count: 64.7M)
1. KIJ_unique_Proteins/
   - Identical proteins were removed from the redundant protein set (protein count: 22.1M)
2. KIJ_CD-HIT-100_Proteins/
   - CD-HIT with a 100% similarity cutoff was run on 1. KIJ_unique_Proteins (protein count: 20.6M)
3. KIJ-UHGP_unique_Proteins/
   - KIJ_CD-HIT-100_Proteins and UHGP-100 were merged and identical sequences were removed (protein count: 107.0M)
4. HRGM_Proteins
   - FINAL HRGM protein catalog.
   - CD-HIT at 100%, 95%, 90%, 70%, and 50% was performed sequentially on KIJ_CD-HIT-100_Proteins. (See the Methods section of the original paper.)
   - Protein count
       i  ) HRGM-100: 103.7M
       ii ) HRGM-95 : 20.0M
       iii) HRGM-90 : 14.8M
       iv ) HRGM-70 : 8.5M
       v  ) HRGM-50 : 4.7M
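
The two operations this listing relies on, removing identical sequences (steps 1 and 3) and running CD-HIT at successively lower cutoffs (step 4), can be sketched roughly as follows. This is only an illustration and not the pipeline used to build the catalog: the file names, the helper names (`read_fasta`, `drop_identical`, `word_size`, `cdhit_chain`), and the choice to feed each CD-HIT output into the next run are assumptions. The `-i`, `-o`, `-c`, and `-n` options and the word-size ranges follow the CD-HIT user guide; the options actually used are described in the original paper.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_identical(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first record of each distinct sequence (exact string match)."""
    seen = set()
    unique = []
    for header, seq in records:
        if seq not in seen:
            seen.add(seq)
            unique.append((header, seq))
    return unique


def word_size(identity_pct: int) -> int:
    """Word size (-n) suggested by the CD-HIT user guide for a protein cutoff."""
    if identity_pct >= 70:
        return 5
    if identity_pct >= 60:
        return 4
    if identity_pct >= 50:
        return 3
    return 2


def cdhit_chain(first_input: str, identities_pct: List[int],
                prefix: str = "HRGM") -> List[List[str]]:
    """Build one cd-hit command per cutoff; each run reads the previous output.

    Chaining outputs is an assumed reading of "sequentially"; output file
    names are hypothetical. Commands are only built here, not executed.
    """
    commands = []
    current = first_input
    for pct in identities_pct:
        out = f"{prefix}-{pct}.faa"
        commands.append([
            "cd-hit", "-i", current, "-o", out,
            "-c", f"{pct / 100:.2f}", "-n", str(word_size(pct)),
        ])
        current = out
    return commands


if __name__ == "__main__":
    # "merged_unique.faa" is a placeholder input name.
    for cmd in cdhit_chain("merged_unique.faa", [100, 95, 90, 70, 50]):
        print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run` once CD-HIT is installed. For inputs on the scale of this catalog, holding every sequence in a Python set would be memory-hungry, so a hash of each sequence or a dedicated deduplication tool would be a more practical choice.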