

The 9.6 kb RNA genome of hepatitis C virus (HCV) is transcribed and replicated by an RNA-dependent RNA polymerase, an error-prone enzyme. A high mutation rate is associated with RNA viruses such as HCV. Based on genetic variability, HCV has been classified into 6 major genotypes and 11 subtypes. However, this classification system does not provide significant information about the origin of the virus, primarily because of the high mutation rate at the nucleotide level. The HCV genome codes for a single polyprotein of about 3011 amino acids, which is processed into structural and nonstructural proteins inside the host cell by viral and cellular proteases. We have identified a conserved NS4A protein sequence for HCV genotype 3a reported from four different continents of the world. We investigated 346 sequences, compared the amino acid composition of the NS4A protein of different HCV genotypes through multiple sequence alignment, and observed the amino acid substitutions C 22, V 29, V 30, V 38, Q 46 and Q 47 in the NS4A protein of genotype 1b. Furthermore, we observed C 22 and V 30 as more consistent members of the NS4A protein of genotype 1a. Similarly, we observed Q 46 and Q 47 in genotype 5; V 29, V 30, Q 46 and Q 47 in genotype 4; C 22, Q 46 and Q 47 in genotype 6; C 22, V 38, Q 46 and Q 47 in genotype 3; and C 22 in genotype 2 as more consistent members of the NS4A protein of these genotypes. Thus, the different amino acids that were introduced as substitutions in the NS4A protein of genotype 1 subtype 1b have been retained as consistent members of the NS4A protein of the other known genotypes. These observations indicate that the NS4A protein of the different HCV genotypes originally evolved from the NS4A protein of genotype 1 subtype 1b, which in turn indicates that HCV genotype 1 subtype 1b established itself earlier in the human population and that all other known genotypes evolved later as a result of mutations in HCV genotype 1b. These results were further confirmed through phylogenetic analysis, by constructing a phylogenetic tree using the NS4A protein as a phylogenetic marker.

Hepatitis C virus belongs to the Flaviviridae family of viruses, and its chronic infection has affected 350 million people worldwide. HCV has a positive-sense single-stranded RNA genome of about 9.6 kb that has a single open reading frame and conserved untranslated regions (UTRs) at the 5' and 3' ends. Within the host cell, the encoded polyprotein is processed into structural (Core, E1, E2 and P7) and nonstructural proteins (NS2, NS3, NS4A, NS4B, NS5A and NS5B). The nonstructural 5B (NS5B) protein is an RNA-dependent RNA polymerase that is responsible for viral genome replication. The error-prone nature of this enzyme is responsible for the high mutation rate in HCV. Based on nucleotide sequence comparison in the 5'UTR, Core/E1 and NS5B regions, six major HCV genotypes (HCV-1 to HCV-6) have been described, each containing multiple subtypes (e.g., 1a, 1b, 1c). In terms of genetic variability, genotypes differ from each other by 31 to 33% and subtypes by 20 to 25%. Although the HCV classification system has evolved considerably, it does not provide convincing information about the origin of the virus. Suzuki and Nei used amino acid sequences of hemagglutinin genes instead of nucleotide sequences in their work on the origin and evolution of influenza virus, and they reported that amino acid sequences provide more reliable information for establishing evolutionary relationships than nucleotide sequences when sequence divergence is high.

During our protein BLAST analysis of the NS4A gene (HCV genotype 3a) isolated from the Pakistani population, we observed a relatively conserved nature of the NS4A protein. Furthermore, we observed occasional amino acid substitutions in the NS4A protein sequences from genotype 3a.
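The position-wise comparison described in this section (asking which residue is most consistent at NS4A positions such as 22, 29, 30, 38, 46 and 47 within each genotype) can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function name `most_consistent_residues` is hypothetical, the toy sequences are invented placeholders and not real NS4A data, and the sketch assumes the sequences have already been aligned to equal length by a separate multiple sequence alignment tool.

```python
from collections import Counter


def most_consistent_residues(aligned_seqs, positions):
    """Return {position: (residue, fraction)} for the most frequent residue
    at each 1-based alignment position (hypothetical helper)."""
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")

    result = {}
    for pos in positions:
        if not 1 <= pos <= length:
            raise ValueError(f"position {pos} is outside the alignment")
        # Collect the alignment column (positions are 1-based, as in the text).
        column = [seq[pos - 1] for seq in aligned_seqs]
        residue, count = Counter(column).most_common(1)[0]
        result[pos] = (residue, count / len(column))
    return result


# Invented toy alignment for one genotype (NOT real NS4A sequences).
toy_genotype = ["ACVQQL", "ACVQQL", "ACIQQL", "TCVQKL"]

summary = most_consistent_residues(toy_genotype, positions=[2, 3, 5])
for pos, (residue, fraction) in summary.items():
    print(f"position {pos}: {residue} in {fraction:.0%} of sequences")
```

Running the same tally separately for each genotype's aligned sequences and comparing the results at the positions of interest gives the kind of per-genotype summary reported here. A real analysis would read the alignment from a file and would need to decide how to treat gap characters.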
