Year | Discovery/landmark/reference |
---|---|
Pregenomic era | |
1871 | Discovery of nucleic acids |
1889 | Hugo de Vries postulated “pangene” to be a living, self-replicating unit of heredity. His postulation was adapted from Darwin’s “pangenesis” (the process by which cells might produce offspring) |
1909 | Introduction of the word “gene” (second half of pangene) into the German language as “Gen” by Wilhelm Ludvig Johannsen |
1940 | Beadle and Tatum linked genes to unique protein products and formulated the “one gene, one protein” concept |
1951 | Discovery of the first protein sequence |
1953 | Identification of the double-stranded structure of DNA (Watson and Crick 1953) |
1960s | Modern concept of gene expression developed following discovery of messenger RNA, deciphering of genetic code, and description of the theory of genetic regulation of protein synthesis. Establishment of the complete genetic code |
Dawn of the genomic age | |
1972 | Production of the first recombinant DNA organism (Cohen et al. 1972) |
1975 | DNA hybridization analysis (Southern 1975) |
1975 | Introduction of 2-dimensional electrophoresis of proteins (O’Farrell 1975) |
1977 | Advent of DNA sequencing |
1978 | Discovery of restriction fragment length polymorphism (Maat and Smith 1978) |
1981 | Gene mapping by in situ hybridization becomes a standard method |
1982 | GenBank is established |
1983 | Demonstration of Huntington’s disease gene (Gusella et al. 1983) |
1985 | Discovery of polymerase chain reaction (Mullis et al. 1986) |
1986 | Dr. Roderick coined the word “genomics” as the title of the journal that started publication in 1987 (Kuska 1998) |
1987 | Identification of dystrophin, the protein product of Duchenne muscular dystrophy gene, which now forms basis of gene therapy for this disorder (Hoffman et al. 1987) |
Genomic age | |
1990 | Launch of the Human Genome Project, National Institutes of Health, United States (a $3 billion/15-year project) |
1990 | First human gene therapy experiment. Correction of adenosine deaminase deficiency in T lymphocytes using retroviral-mediated gene transfer (Blaese et al. 1990) |
1991 | Venter found that expressed sequence tags can provide a cheap, rapid way to skim the genome for practical information. Starting point of commercialization of genomics |
1995 | Definition of the proteome (Wilkins et al. 1995) |
1996 | Completion of the first whole-genome sequence of an organism: the budding yeast Saccharomyces cerevisiae |
1999 | First human chromosome sequenced: chromosome 22 |
2000 | Completion of the sequencing of the human genome ahead of the anticipated date |
Postgenomic era | |
2000–2010 | Increase in amount of sequence data; integration of information from genomics with that from other omics, such as proteomics and metabolomics; and applications for the development of personalized medicine |
1.2 Variations in the Human Genome
Because of the diversity of the human species, there is no such thing as a normal human genome sequence. Variations are specific locations in the human genome where differences between individuals are found, and the term “normal” or “wild type” refers to the most common variant at a location in a given population group. Variants are referred to as “alleles”; alleles with a population frequency greater than 1 % are called polymorphisms. The term “mutation” is generally used for changes in DNA that are associated with disease.
Events contributing to genomic variation fall into three categories: (1) single-base-pair changes or point mutations that disturb the “normal” DNA nucleotide sequence, (2) insertions and deletions of nucleotides from the DNA, and (3) structural rearrangements that reshuffle the DNA sequence, thus changing the order of nucleotides (Feero et al. 2010). Replication-based mechanisms can result in complex genomic rearrangements. Genetic variations in the human genome are listed in Table 2.
Table 2
Genetic variations in the human genome
Variation | Features |
---|---|
Complex chromosomal rearrangements (CCRs) | CCRs account for a large fraction of nonrecurrent rearrangements at a given locus |
Copy number variation (CNV) | DNA segments >1 kb in length, whose copy number varies with respect to a reference genome. ~12 % of human genes vary in the copy number of the DNA sequences they contain |
Insertions and deletions in the human genome (indel) | Indels are an alternative form of natural genetic variation that differs from SNPs |
Interspersed repeated elements | Long and short interspersed nuclear elements are a significant portion of the human genome |
Large-scale variation in human genome | Large portions of DNA can be repeated or missing for no known reason in healthy persons |
Segmental duplication | Duplicons have >90 % sequence homology to another region in the genome |
Single-nucleotide polymorphisms (SNPs) | SNPs are sequence variations at single-base-pair level with a population frequency of >1 % |
Structural variations (SVs) | SVs involve kilobase- to megabase-sized deletions, duplications, insertions, inversions, and complex combinations of rearrangements |
Tandem repeats | Tandem sequence repetitions represent ~10 % of the genome |
1.3 Neurogenomics in Relation to Other Omics
There are numerous “omics,” and the relationships among some of them are shown in Fig. 1. More are listed at the website: http://www.genomicglossaries.com/content/omes.asp.

Fig. 1
Relationships of neurogenomics with other omics
Proteomics is the systematic analysis of protein profiles of tissues and parallels the related field of genomics. The term “proteomics” combines the words “protein” and “genome”; the spelling indicates PROTEins expressed by a genOME. Neuroproteomics refers to the protein profile of the nervous system. The massive amount of information generated by genomics and other omics has led to the development of bioinformatics and of the various tools required to analyze these data.
2 Methods of Study of Neurogenomics
2.1 Gene Expression
The activity of a gene, so-called gene “expression,” means that its DNA is used as a blueprint to produce a specific protein. Only a limited number of genes are expressed in a typical human cell, and the expressed genes vary from one cell to another. Gene expression can be detected by various techniques. The discovery that eukaryotic genes are not contiguous sequences of DNA but consist of coding sequences (exons) interrupted by intervening sequences (introns) led to a more complex view of gene expression. The temporal, developmental, topographical, histological, and physiological patterns in which a gene is expressed provide clues to its biological role. Malfunctioning of genes is involved in most diseases, not only inherited ones.
All functions of cells, tissues, and organs are controlled by differential gene expression. Gene expression is used for studying gene function. Genes are now routinely expressed in cultured cell lines by using viral vectors carrying cDNA, the transcription of which yields the gene’s mRNA. RNA–RNA interaction can induce gene expression and RNA can regulate its activities without necessarily requiring a protein. The protein produced from mRNA may confer specific and detectable function on the cells used to express the gene. It is also possible to manipulate cDNA so that proteins are expressed in a soluble form fused to polypeptide tags. This allows purification of large amounts of proteins that can be used to raise antibodies or to probe protein function in vivo in animals. Knowledge of which genes are expressed in healthy and diseased tissues would allow us to identify both the protein required for normal function and the abnormalities causing disease. This information will help in the development of new diagnostic tests for various illnesses as well as new drugs to alter the activity of the affected genes or proteins.
Current techniques for analysis of gene expression either monitor one gene at a time, e.g., RT-PCR methods, or can do simultaneous analysis of thousands of genes, e.g., microarray hybridization or serial analysis of gene expression. A flexible, alternative PCR-based method, RAGE (rapid analysis of gene expression) has been developed which enables expression changes to be determined in either a directed search of known genes or an undirected survey of unknown genes. A single set of reagents and reaction conditions allows analyses of most genes in any eukaryote. The method is useful for assaying on the order of tens to hundreds of genes in multiple samples. Control experiments indicate reliable detection of changes in gene expression twofold and greater and sensitivity of detection better than 1 in 10,000.
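The twofold detection threshold and the 1-in-10,000 sensitivity quoted above can be illustrated with a minimal sketch. The gene names, abundance values, and the function `fold_change_calls` are hypothetical and are not part of RAGE itself; abundances are assumed to be expressed as a fraction of total transcripts, and values below the assumed detection floor are clamped to it.

```python
import math

def fold_change_calls(control, treated, threshold=2.0, floor=1e-4):
    """Return {gene: log2 ratio} for genes changing at least `threshold`-fold.

    `floor` is an assumed detection limit (1 in 10,000 of total transcripts);
    abundances below it are clamped so that ratios stay finite.
    """
    calls = {}
    for gene in control.keys() & treated.keys():
        c = max(control[gene], floor)
        t = max(treated[gene], floor)
        log2_ratio = math.log2(t / c)
        # Keep both increases and decreases of at least `threshold`-fold.
        if abs(log2_ratio) >= math.log2(threshold):
            calls[gene] = log2_ratio
    return calls

# Hypothetical relative abundances (fraction of total transcripts).
control = {"geneA": 0.010, "geneB": 0.200, "geneC": 0.0005}
treated = {"geneA": 0.025, "geneB": 0.210, "geneC": 0.0001}

changed = fold_change_calls(control, treated)
for gene, ratio in sorted(changed.items()):
    print(f"{gene}: log2 ratio = {ratio:+.2f}")
```

Working on the log2 scale treats a twofold increase and a twofold decrease symmetrically, which is why the threshold is applied to the absolute value of the ratio.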
2.1.1 Methods for the Study of Gene Expression in the Brain
The human brain has a more complex pattern of gene expression than any other region of the body. The molecular events in neurologic disorders are caused or paralleled by specific gene expression changes. Analysis of these changes provides an understanding of the disease at the molecular level. Gene expression profiling also provides some information about mitochondrial disorders because of a bidirectional information flow between the mitochondrion and the cell nucleus (Mende et al. 2007).
Several technological advances enable the analysis of thousands of expressed genes in a small brain sample. These techniques include expressed sequence tags, sequencing of cDNA libraries, differential display, subtractive hybridization, serial analysis of gene expression, and high-density DNA microarrays. Gene expression measurements may be used to identify genes that are abnormally regulated as a secondary consequence of a disease state or to identify the response of brain cells to pharmacological treatments.
The usual method for the study of gene expression in the brain is to obtain tissue sections and examine them for the expression of a particular gene using a fluorescent probe. When the probed sections are viewed under a fluorescence microscope, the regions of the nervous system where the gene is most highly expressed are clearly shown. The nervous system provides abundant opportunities to study gene expression because of the presence of numerous genes that carry out a wide range of functions. However, developing a probe for each gene that could potentially be expressed in the brain, and then using these probes to test for the presence or absence of gene expression, is a challenging task.
2.1.2 Study of Gene Expression by Brain Imaging
Molecular imaging is an emerging field of study that deals with imaging of disease on a cellular and molecular level. It can be considered as an extension of molecular diagnostics. In contradistinction to “classical” diagnostic imaging, it sets forth to probe the molecular abnormalities that are the basis of disease rather than to image the end effects of these molecular alterations. Radionuclide imaging, magnetic resonance imaging (MRI), and positron emission tomography (PET) can be used to visualize gene expression.
Three-dimensional gene expression patterns in the brain can be mapped by analysis of spatially registered voxels (cubes) by a process analogous to the images reconstructed in functional brain imaging systems. Consistent gene expression differences between normal and Alzheimer’s disease brains can be demonstrated by this approach.
2.1.3 Study of Genetic Variation by Brain Imaging
Large-scale neuroimaging studies can be used to discover genetic variants that affect the brain. Screening of brain circuits for testing genetic associations in connectome-wide and genome-wide scans is feasible (Medland et al. 2014). Analysis of massive data, however, will be challenging.
2.2 Genotyping
Single-nucleotide polymorphisms (SNPs) serve to distinguish one individual’s genetic material from that of another. There are no exact figures on the frequency of occurrence of single SNPs in the human genome, but they occur about once every 1,250 bases along the six billion base pairs, i.e., the “letters” that make up the genetic code. Studies suggest ~5 SNPs per gene, but not every gene has an SNP. Approximately nine million SNPs have been identified already in various databases but only a small fraction of these are well characterized and validated. SNPs comprise ~80 % of all known polymorphisms. Several technologies are used for their identification, of which the most important are based on DNA microarrays or biochip technology. SNPs have the following relation to an individual’s disease and drug response:
- SNPs are linked to disease susceptibility.
- SNPs are linked to drug response, e.g., insertions or deletions of ACE gene determine the response to beta-blockers.
- SNPs can be used as biomarkers to segregate individuals with different levels of response to treatment (beneficial or adverse) in clinical settings.
2.3 Copy Number Variations
Copy number variations (CNVs) refer to variation from one person to another in the number of copies of a particular gene or DNA sequence. CNV is a source of genetic diversity in humans. Numerous CNVs are being identified with various genome analysis technologies including array comparative genomic hybridization, SNP genotyping, and DNA sequencing. Some diseases are associated with CNVs rather than SNPs. Although CNVs confer a risk of disease, they may not be sufficient by themselves to lead to a specific disease outcome, and additional risk factors may account for the variation. Considerable variation has been observed in the phenotypes associated with several recurrent specific CNVs that are relatively prevalent (Girirajan et al. 2012). This study, by showing that the phenotypic variation of some genomic disorders may be partially explained by the presence of additional large variants, may help in understanding the causes of some neurologic diseases.
2.4 Biochips/Microarrays
Microarray or DNA chip technology (also called gene chip or “biochip”) is a rapid method of sequencing and analyzing genes. It comprises DNA probes formatted on a microscale and the instruments needed to handle the samples (automated robotics), read the reporter molecules (scanners), and analyze the data (bioinformatic tools). Hybridization of RNA- or DNA-derived samples on chips allows the monitoring of expression of mRNAs or the occurrence of polymorphisms in genomic DNA. Examples of biochip technology are:
2.4.1 Automated Programmable Electronic Matrix
This microchip technology consists of a multisite, electronically controlled array of independent test areas, each capable of attracting, binding, or repelling DNA under specific conditions of charge, polarity, current, and voltage. The automated programmable electronic matrix microchip takes advantage of the well-established principles of electrophoresis by moving charged molecules in an electric field, but on a greatly miniaturized scale. For example, DNA (which carries a net negative charge because of its phosphate backbone) can be moved in an electric field to an area of net positive charge. The sample DNA is significantly concentrated over time in the area of positive charge. This concentrating effect facilitates and greatly speeds up the hybridization of DNA. This effect can simultaneously occur at each test site, permitting rapid, multiple tests on a single sample. Unwanted, nonspecific DNA is repelled from the area of the electrode under closely controlled electronic conditions.
2.4.2 Microfluidic Devices
These are complete biochemical analysis systems that use nanoliter quantities of reagents and are referred to as “labs-on-a-chip.” Disposable chips are economical diagnostic devices.
2.4.3 Chromosome on a Chip
This technique is slightly different from a DNA chip in that it uses genomic DNA instead of cDNA. This technique has been found to be useful for tracking the chromosomal whereabouts of a gene. Further development of the technology will involve construction of a whole-genome chip containing all the chromosomes on it and will be the equivalent of the present-day genetic linkage map.
2.4.4 Protein Chip
This is comparable to DNA chip technology in the field of genome analysis and has important applications in the field of proteomics. The protein chip system uses small arrays or plates with chemically or biologically treated surfaces to interact with proteins. Unknown proteins are affinity captured on treated surfaces, desorbed and ionized by laser excitation, and detected according to molecular weight. Known proteins are analyzed using on-chip functional assays. For example, chip surfaces can contain enzymes, receptor proteins, or antibodies, enabling on-chip protein-to-protein interaction studies, ligand binding studies, or immunoassays. The system enables the detection and analysis of trace amounts of proteins directly from biological tissues and fluids, including proteins differentially expressed in disease (Jain 2014a).
2.4.5 Bioelectronic Microchips
These chips contain numerous electronically active microelectrodes with specific DNA capture probes linked to the electrodes through molecular wires. Target DNA or RNA is labeled in this system by hybridization to specific signaling probes covalently labeled with ferrocene, a redox label. The microelectrode surface is electrically insulated with a monolayer coating to prevent unwanted redox species in the sample chamber from interfering with measurements. Signals, therefore, depend on specific probe and target interaction (i.e., hybridization). Minimal specimen preparation is required, and the system works in whole blood and contaminated specimens. This technology detects, among other targets, SNPs and matches conventional DNA testing for genetic mutations.
3 DNA Sequencing
Most genetic disorders are caused by point mutations. Deletions are less frequent and may be overlooked by DNA mapping. It is difficult to find the location of a gene buried in the tangle of chromosomal DNA in the nucleus; sequencing of individual nucleotide bases may be required. DNA sequence analysis is a multistep process comprising sample preparation, generation of labeled fragments by sequencing reactions, electrophoretic separation of fragments, data acquisition, assembly into a finished sequence, and, most importantly, functional interpretation. Sequencing is also used to determine protein sequences, but it is difficult to determine protein function from sequence. Sequencing is now automated. Sequencing technologies are described in a special report on this topic (Jain 2014a, b, c). Apart from their impact on hereditary neurologic diseases, high-throughput genome-sequencing technologies will improve our understanding of sporadic neurologic diseases as well, particularly those with low-penetrant mutations in the gene for hereditary diseases or de novo mutations (Tsuji 2013).
3.1 Microarray-Based DNA-Sequencing Technologies
Sequencing whole genomes requires resources that are currently beyond those of a single laboratory, and it is therefore not a practical approach for resequencing hundreds of individual genomes. High-throughput microarrays, which were initially developed to analyze the expression of many RNA transcripts in parallel, have since been adapted to a variety of applications, one of which is DNA sequencing. Advances in microarray fabrication and completion of large-scale genome-sequencing projects have enabled the rapid development of affordable array-based methods for high-resolution genome-wide assessment of DNA alterations. The main forms of genomic variation (amplifications, deletions, insertions, rearrangements, and base-pair changes) can be detected using techniques that are readily performed in individual laboratories using simple experimental approaches. A number of array-based technologies are in development and some examples are:
3.1.1 Arrayit’s® H25K
This is the world’s only human genome microarray based on the completely sequenced human genome. H25K is a multipurpose long oligonucleotide microarray that allows karyotyping, gene expression profiling, chromatin structure analysis, and protein–DNA interaction studies on a genomic scale. Its glass substrate slide format is fully compatible with every major microarray scanner brand including the Arrayit InnoScan and SpotLight Scanner series.
3.1.2 High-Throughput Array-Based Resequencing
Although genome-wide association studies have successfully identified associations of many common SNPs with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genome-wide association studies. High-throughput, high-accuracy resequencing technologies are needed to identify these rare alleles. Although array-based genotyping has allowed genome-wide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for two main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing and failure to achieve sufficient performance. Scientists at Affymetrix and Genentech in collaboration with Stanford Genome Technology Center have solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data (Zheng et al. 2009). The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA, and ends with interrogation of purified DNA on resequencing arrays. They have used this pipeline to resequence ≈5 Mb of DNA (on three arrays) corresponding to the exons of 1,500 genes in >473 samples; in total >2,350 Mb were sequenced. In the context of this large-scale study, they obtained a false-positive rate of ≈1 in 500,000 bp and a false-negative rate of ≈10 %. Some of the advantages of this approach are:
- The researchers identified almost 30,000 previously unidentified variants when they applied the approach to HapMap samples, resequencing exonic sequences for about 1,500 genes in nearly 500 samples.
- Because this approach can distinguish between variant and nonvariant DNA, it may be possible to decrease the cost of sequencing by an order of magnitude simply by focusing on the variant DNA pool alone rather than resequencing the entire genome.
- Genotyping arrays have enabled large association studies through genotyping tens of thousands of samples. By creating appropriate “upfront” processes for resequencing arrays, they have created the potential to conduct similar large-scale resequencing-based association studies.
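The figures reported for this study can be checked with simple arithmetic. The sketch below uses only the approximate values quoted in the text (≈5 Mb per sample, 473 samples taken as a lower bound, ≈1 false positive per 500,000 bp, ≈10 % false negatives); the variable names are illustrative, and the results are order-of-magnitude estimates rather than values reported by the authors.

```python
# Approximate figures quoted in the text for the array-based resequencing study.
TARGET_MB_PER_SAMPLE = 5          # ~5 Mb of exonic sequence per sample
SAMPLES = 473                     # lower bound on the number of samples
BP_PER_FALSE_POSITIVE = 500_000   # ~1 false positive per 500,000 bp
FALSE_NEGATIVE_RATE = 0.10        # ~10 % of true variants missed

# Total sequence interrogated across all samples.
total_mb = TARGET_MB_PER_SAMPLE * SAMPLES
total_bp = total_mb * 1_000_000

# Expected number of false-positive calls over the whole study.
expected_false_positives = total_bp / BP_PER_FALSE_POSITIVE

# Sensitivity is the complement of the false-negative rate.
sensitivity = 1 - FALSE_NEGATIVE_RATE

print(f"Total sequenced: {total_mb:,} Mb")
print(f"Expected false positives: {expected_false_positives:,.0f}")
print(f"Sensitivity: {sensitivity:.0%}")
```

The total of 2,365 Mb is consistent with the “>2,350 Mb” stated above, and the estimate suggests on the order of a few thousand false-positive calls across the entire data set.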
3.2 Next-Generation Sequencing Versus Microarrays for Gene Expression Profiling
Like next-generation sequencing (NGS), microarrays can be used to examine thousands of genes in one experiment and obtain gene profiles, but the drawback of microarrays is that they are based on hybridization. Gene expression levels are measured by fluorescence from hybridization, but quantification of the fluorescence of the vast number of spots on a chip is often unreliable and varies from one experiment to another. Furthermore, DNA samples can hybridize to more than one spot, thus generating misleading results. Next-generation sequencing overcomes these problems of microarrays by generating actual sequence reads and is ideal for detecting genetic mutations. Gene expression can be more accurately obtained by counting sequence reads.
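Counting sequence reads only yields comparable expression values once the counts are scaled for the total number of reads in each sample. A minimal sketch of one common scaling, counts per million (CPM), is shown below; the gene names and counts are hypothetical, and CPM is used here as an illustrative choice rather than as the method of any study cited in this chapter.

```python
def counts_per_million(read_counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {gene: count * 1_000_000 / total
            for gene, count in read_counts.items()}

# Hypothetical raw read counts for three genes in one sample.
sample = {"geneA": 500, "geneB": 1500, "geneC": 8000}

cpm = counts_per_million(sample)
for gene, value in cpm.items():
    print(f"{gene}: {value:,.0f} CPM")
```

Because the measurement is a count rather than a fluorescence intensity, there is no saturation of a spot; the detectable range grows with the number of reads sequenced.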
The Center for Human and Clinical Genetics at Leiden University Medical Center (Leiden, The Netherlands) has done the first large-scale comparison between NGS and microarray-based gene expression profiling. Using the Illumina digital gene expression (DGE) assay, the scientists obtained ~2.4 million sequence tags per sample, their abundance spanning four orders of magnitude. Results were highly reproducible, even across laboratories. The correlation with five different microarray platforms was modest and most significant for Affymetrix. The changes in DGE observed by NGS were larger than those observed by microarrays or quantitative PCR. While undetectable by microarrays, antisense transcription was found for 51 % of all genes and alternative polyadenylation for 47 %. The study concluded that next-generation sequencing provides a major advance in robustness, comparability, and richness of DGE profiling data and is expected to boost collaborative, comparative, and integrative genomics studies.
Gene expression profiles of an in vitro cell model were used to compare the quality of the data generated by microarray and DGE; the correlation coefficients between the technical replicates were >0.99 and the detection variance was <9 % for both platforms (Feng et al. 2010). The dynamic range of the microarray was fixed at four orders of magnitude, whereas that of DGE was extendable. The consistency of the two platforms was high, especially for abundant genes. It was more difficult to distinguish the expression variation of less abundant genes with the microarray. Although microarrays might eventually be replaced by DGE or transcriptome sequencing, they are still reliable, practical, and useful for most biological researchers.
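The two quality measures mentioned above, correlation between technical replicates and dynamic range in orders of magnitude, can be computed as in the following sketch. The replicate values are invented for illustration, and the use of Pearson correlation on log10-transformed values is an assumption; the cited study may have computed its coefficients differently.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def dynamic_range_orders(values):
    """Orders of magnitude between the largest and smallest positive value."""
    positive = [v for v in values if v > 0]
    return math.log10(max(positive) / min(positive))

# Hypothetical expression values for five genes in two technical replicates.
rep1 = [2, 15, 130, 1100, 20000]
rep2 = [3, 14, 120, 1200, 19000]

r = pearson([math.log10(v) for v in rep1], [math.log10(v) for v in rep2])
orders = dynamic_range_orders(rep1)
print(f"replicate correlation (log scale): {r:.4f}")
print(f"dynamic range: {orders:.1f} orders of magnitude")
```

The log transformation keeps the few highly abundant genes from dominating the coefficient, which matters when values span several orders of magnitude.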
3.3 RNA Sequencing
With the recognition of the importance of RNA metabolism for brain function, as well as malfunction, there is an interest in understanding posttranscriptional gene regulation through many new and recently discovered mechanisms. Earlier transcriptomics studies were mostly based on hybridization-based microarray technologies and offered a limited ability to fully catalog and quantify the diverse RNA molecules that are expressed from genomes over wide ranges of levels. The introduction of high-throughput NGS technologies has revolutionized transcriptomics by enabling RNA analysis through cDNA sequencing at massive scale (RNA-seq). This development has overcome several challenges posed by microarray technologies, including the limited dynamic range of detection (Ozsolak and Milos 2011). NGS platforms used for RNA-seq are commercially available from several companies, whereas new technologies are in development by others.
RNA-seq is a powerful tool for studying the effect of the transcriptome on phenotypes such as disease susceptibility, cancer progression, and response to pharmaceuticals. Applications include the following:
- Transcript identification: mapping results reveal the identity of transcripts present in a sample, with ability to detect rare transcripts by increasing sequencing depth.
- Splice variant analysis: relative expression of exons across a single transcript can elucidate the presence of splice variants.
- Differential expression: differential expression levels of two transcripts in a single sample or of a single transcript in two disparate samples can be ascertained from relative sequencing depths.
- RNA measurements for clinical diagnostics, e.g., analysis of circulating extracellular nucleic acid and cells, such as fetal RNA. By enabling earlier diagnosis and the detection of disease recurrence or mutational status, this will help in the realization of the full potential of genomic information and its growing impact on the personalization of healthcare.
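The splice variant application listed above can be sketched as a comparison of read density across the exons of one transcript. The exon names, read counts, exon lengths, and the 0.5 cutoff are all hypothetical; real analyses use dedicated statistical tools, and this is only a simplified illustration of the idea of relative exon expression.

```python
from statistics import median

def exon_usage(exon_reads, exon_lengths):
    """Read density of each exon relative to the transcript's median density."""
    density = {exon: exon_reads[exon] / exon_lengths[exon]
               for exon in exon_reads}
    reference = median(density.values())
    if reference == 0:
        raise ValueError("transcript has no coverage")
    return {exon: d / reference for exon, d in density.items()}

# Hypothetical read counts and exon lengths (bp) for a four-exon transcript.
reads = {"exon1": 200, "exon2": 310, "exon3": 20, "exon4": 150}
lengths = {"exon1": 100, "exon2": 150, "exon3": 100, "exon4": 80}

usage = exon_usage(reads, lengths)
# Exons covered at less than half the typical density are candidates for skipping.
skipped = [exon for exon, u in usage.items() if u < 0.5]
print("candidate skipped exons:", skipped)
```

Dividing by exon length before comparing is the key step, since longer exons collect more reads even when they are expressed at the same level.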
3.3.1 Strand-Specific RNA Sequencing
Strand-specific RNA sequencing is rapidly replacing conventional cDNA sequencing as an approach for assessing information about the transcriptome. Alongside improved laboratory protocols, the development of bioinformatic tools is steadily progressing. Currently the Illumina TruSeq library preparation kit is used, along with additional reagents, to make stranded libraries in an automated fashion, which are then sequenced on Illumina HiSeq 2000. By use of freely available bioinformatic tools, it was shown through quality metrics that the protocol is robust and reproducible (Sigurgeirsson et al. 2014). This study further highlights the practicality of strand-specific libraries by comparing them to non-stranded libraries, by looking at known antisense transcription of pseudogenes, and by identifying novel transcription. The non-stranded Illumina TruSeq kit can be adapted to generate strand-specific libraries and can be used to access detailed information on the transcriptome. The Ribo-Zero kit is very effective in removing ribosomal RNA from total RNA, and the STAR aligner produces high mapping yield in a short time. Strand-specific data give more detailed and correct results than do non-stranded data, as was shown when estimating expression values and in assembling transcripts. Even well-annotated genomes need improvements and corrections, which can be achieved using strand-specific data. It is recommended that researchers in this area use strand-specific data, as it provides more confidence in the data analysis and is less likely to lead to false conclusions. If faced with analyzing non-stranded data, researchers should be well aware of the caveats of that approach.
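Why strand information improves expression estimates can be seen in a toy example with two overlapping genes on opposite strands. The gene coordinates, read positions, and the function `assign_reads` are hypothetical, and each read's strand is assumed to have already been converted to the strand of its originating transcript; this is a sketch of the principle, not of the pipeline used in the cited study.

```python
from collections import Counter

# Hypothetical annotation: (name, start, end, strand) for two overlapping genes.
GENES = [("geneS", 100, 500, "+"), ("geneAS", 300, 700, "-")]

def assign_reads(reads, stranded):
    """Count reads per gene; reads matching several genes are 'ambiguous'."""
    counts = Counter()
    for position, strand in reads:
        hits = [name for name, start, end, gene_strand in GENES
                if start <= position <= end
                and (not stranded or gene_strand == strand)]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif len(hits) > 1:
            counts["ambiguous"] += 1
    return counts

# Hypothetical reads as (position, strand of originating transcript).
reads = [(150, "+"), (350, "+"), (400, "-"), (450, "-"), (650, "-")]

print("non-stranded:", dict(assign_reads(reads, stranded=False)))
print("stranded:    ", dict(assign_reads(reads, stranded=True)))
```

Without strand information, the three reads falling in the overlap cannot be attributed to either gene; with it, every read is assigned.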
3.4 Exome Sequencing
The exome is the part of the genome formed by exons, i.e., the sequences that, when transcribed, remain within the mature RNA after introns are removed by RNA splicing. It differs from a transcriptome in that it consists of all DNA that is transcribed into mature RNA in cells of any type. The exome of the human genome consists of ~180,000 exons constituting ~1 % of the total genome or 30 megabases of DNA (Ng et al. 2009). Although it comprises a very small fraction of the genome, the exome harbors ~85 % of disease-causing mutations, and exome sequencing is considered to be an efficient strategy to determine the genetic basis of several Mendelian or single-gene disorders. Exome sequencing could enable the discovery of much of the copy number variation that is responsible for many common and rare diseases.
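The exome figures quoted above imply two further quantities that are easy to derive. The sketch below uses only the approximate values from the text (~180,000 exons, ~30 Mb, ~1 % of the genome); the derived numbers are rough estimates and the variable names are illustrative.

```python
# Approximate exome figures quoted in the text.
EXOME_BP = 30_000_000    # ~30 megabases
EXON_COUNT = 180_000     # ~180,000 exons
EXOME_PERCENT = 1        # exome is ~1 % of the genome

# Average exon length implied by these figures.
mean_exon_length_bp = EXOME_BP / EXON_COUNT

# Genome size implied by the exome being ~1 % of the total.
implied_genome_bp = EXOME_BP * 100 / EXOME_PERCENT

print(f"mean exon length: ~{mean_exon_length_bp:.0f} bp")
print(f"implied genome size: ~{implied_genome_bp / 1e6:,.0f} Mb")
```

An average exon of well under 200 bp spread over a genome of about 3,000 Mb illustrates why targeted capture of exons is far more economical than sequencing the whole genome.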
3.4.1 Human Exome Microarrays
Human exome sequencing is considered to be “the holy grail” of resequencing studies that will ultimately lead to significant biomedical breakthroughs. The exome comprises the parts of genes that encode amino acids. As such, exome sequencing could enable the discovery of much of the CNV that is responsible for many common and rare diseases (e.g., cancer and Alzheimer’s disease). Exome resequencing may also shed light on why some diseases occur more often in certain populations and could help uncover why drugs are effective only in a subset of individuals or populations. Applications of array-based whole-exome sequencing (WES) include some rare genetic disorders. Exome sequencing will ultimately produce technology to feed the research pipeline and nourish the development of personalized healthcare. However, sequencing of the exome by conventional PCR methods is neither technically nor economically feasible because the preparation of human coding exons is very expensive and time consuming.
3.4.2 Sequence Capture
Human exome microarrays have made human exome sequencing feasible. These high-density arrays of long oligonucleotide probes provide greater information content and higher data quality necessary for studying the full diversity of genomic and epigenomic variation. The enhanced performance is made possible by Maskless Array Synthesis (Roche) technology, which uses digital light processing and rapid, high-yield photochemistry to synthesize long oligonucleotide, high-density DNA microarrays with extreme flexibility.
Human Exon 1.0 ST Array (Affymetrix) is used for expression profiling at both gene and exon levels. Although it provides accurate assessments of gene expression, only 20 % of its probe sets are supported by high-confidence annotations, and each probe set contains only one to four probes. As a result, the array is susceptible to false positives in the analysis of exons and alternative splicing. An alternative design is to use probes targeting exon–exon junctions in addition to exons, which has been applied to several custom arrays. However, further improvements are needed for cost-effective high-throughput applications in clinical trials.
3.5 Transcriptomics
The focus of decoding genomic information previously has been mostly on proteomics and mRNA (cDNA) analysis. A limitation of this approach is that the information contained within the genome is first expressed in the form of “primary transcripts” before it is processed into mRNA and proteins. The primary transcripts may not lead to the formation of mRNA and proteins but perform crucial cellular functions directly. Transcriptomics is the study of the entire set of RNA transcripts of an organism. For study of transcriptome, samples for genotyping and whole exome/genome sequencing generally come from peripheral blood or saliva. Transcriptome and epigenome profiling can also be performed on peripheral blood and CSF in addition to other biomarker studies.
3.6 Nanotechnology-Based Sequencing
Several nanotechnology-based methods are being used for sequencing. Some are included in this section, whereas those relevant to single-molecule sequencing are described in the following section.
3.6.1 DNA Sequence by Use of Nanoparticles
An optical technique has been developed for the parallel manipulation of nanoscale structures with molecular resolution (Csaki et al. 2007). Bioconjugated metal nanoparticles are positioned at the location of interest, such as certain DNA sequences along metaphase chromosomes, prior to pulsed laser light irradiation of the whole sample. Nanoparticles are designed to absorb the introduced energy highly efficiently, thus acting as nanoantennas. As a result of the interaction, structural changes of the sample with subwavelength dimensions and nanoscale precision are observed at the location of the particles. The nanolocalized destruction is caused by particle ablation as well as thermal damage of the surrounding material. The procedure is highly parallel and can potentially be multiplexed by addressing several different sequences (such as genes). Potential applications are in DNA analysis such as DNA fingerprinting or mutation analysis as well as single-molecule manipulation.
3.6.2 Denaturation Mapping of DNA in Nanofluidic Channels
By partially denaturing YOYO®-1-labeled DNA in nanofluidic channels with a combination of formamide and local heating, a sequence-dependent “barcode” was obtained corresponding to a series of local dips and peaks in the intensity trace along the extended molecule (Reisner et al. 2010). This structure arises from the physics of local denaturation, and statistical mechanical calculations of sequence-dependent melting probability can predict the barcode to be observed experimentally for a given sequence. Consequently, the technique is sensitive to sequence variation without requiring enzymatic labeling or a restriction step. This technique may serve as the basis for a new mapping technology ideally suited for investigating the long-range structure of entire genomes extracted from single cells.
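The idea that a barcode can be predicted from sequence can be illustrated with a deliberately crude sketch. It does not reproduce the statistical mechanical calculation mentioned above; it simply assumes that AT-rich windows melt first and that melted regions appear as dips in the intensity trace. The function names, window size, and threshold are hypothetical choices for illustration.

```python
def at_fraction_profile(seq, window):
    """Return the AT fraction of every sliding window along seq.

    Assumption: AT fraction is used as a rough stand-in for local
    melting propensity; it is not the melting-probability model
    used in the cited work.
    """
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    is_at = [1 if base in "AT" else 0 for base in seq]
    count = sum(is_at[:window])
    profile = [count / window]
    for i in range(window, len(seq)):
        # Slide the window one base to the right.
        count += is_at[i] - is_at[i - window]
        profile.append(count / window)
    return profile


def barcode(profile, threshold=0.5):
    """Mark windows above the AT threshold as dips (0), others as peaks (1)."""
    return [0 if fraction > threshold else 1 for fraction in profile]


profile = at_fraction_profile("ATATATGCGCGC", window=4)
print(barcode(profile))
```

In this toy sequence the AT-rich half produces a run of dips and the GC-rich half a run of peaks, which is the kind of coarse pattern a denaturation map is meant to capture.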
3.6.3 Nanopore Sequencing
The translocation of polymers across nanometer-scale apertures in cell membranes is a common phenomenon in biological systems. The change in electrical conductance of a single nanopore as a polymer transits the pore can be reliably detected and used to characterize the polymer. As saline solution flows through the nanopore, it creates an electric current. However, when a DNA molecule passes through the pore, the current is disrupted, and the amount of current disruption depends on which of the nucleobases adenine (A), thymine (T), cytosine (C), or guanine (G) is in the pore. To read the sequence of nucleobases, a scientist simply has to find out how much each base disrupts the electric current. This information enables one to read the sequence of DNA bases simply by logging the sequence of electrical disruptions as a DNA molecule passes through.
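The read-out principle described above can be sketched as a nearest-level lookup. This is a toy model under strong assumptions: each base is taken to produce one distinct current level, and the reference values below are invented for illustration rather than measured.

```python
# Hypothetical mean blockade currents in pA, one per base (illustrative only).
REFERENCE_LEVELS = {"A": 50.0, "C": 42.0, "G": 35.0, "T": 28.0}


def call_base(current_pa, levels=REFERENCE_LEVELS):
    """Return the base whose reference level is closest to the measurement."""
    return min(levels, key=lambda base: abs(levels[base] - current_pa))


def call_sequence(trace, levels=REFERENCE_LEVELS):
    """Call one base per measured current level, in order of arrival."""
    return "".join(call_base(current, levels) for current in trace)


print(call_sequence([49.2, 41.0, 36.1, 27.5, 50.9]))  # prints ACGTA
```

A practical system would also have to segment the raw trace into levels and cope with noise, which this sketch leaves out.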
It is possible to refine this method to the point where the base sequence of a DNA strand can be read with single-base resolution as the DNA transits the pore, which measures 2–5 nm. This would provide a sequencing method that is faster and cheaper than existing ones by many orders of magnitude (Ghosal 2007). A method of identifying single molecules that does not rely on expensive and complex fluorescent labeling is important for reducing the cost and increasing the speed of genome analysis. It is possible to achieve this goal by monitoring a simple electric current passing through a nanopore. As single DNA bases pass through the nanopore, each base causes a characteristic disruption of current that allows the molecule to be identified. A study has shown that this system can also directly identify methylated cytosine, which can be distinguished from the four DNA bases using the same method (Clarke et al. 2009).
Controlling DNA translocation speed is critically important for nanopore sequencing as free electrophoretic threading is far too rapid to resolve individual bases. A number of promising strategies have been explored in recent years, largely driven by the demands of NGS. Engineering DNA–nanopore interactions with organic coatings is an attractive method as it does not require sample modification, processive enzymes, or complicated and expensive fabrication steps. For the first time, fourfold tuning of unfolded, single-file translocation time through small, amine-functionalized solid-state nanopores has been demonstrated by varying the solution pH in situ (Anderson et al. 2013). Additionally, the authors developed a simple analytical model based on electrostatic interactions to explain this effect which will be a useful tool in designing future devices and experiments. This method of reading DNA sequences is still in development. A nanopore chip could be built into a portable device that could be brought right to the patient at the POC.
The Hybridization-Assisted Nanopore Sequencing (NABsys) platform combines nanopore sequencing with sequencing by hybridization. Unlike other nanopore-based sequencing approaches, the NABsys platform does not depend on single-base resolution of the nanopore detector in order to obtain accurate sequence information.
Preliminary findings have been described on the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing (Derrington et al. 2010). Based on the short, narrow nature of the MspA pore, it is considered better than the commonly used alpha-hemolysin protein for nanopore-sequencing applications. It is just short enough to accommodate one or two nucleotides. MspA-based protein channels could be used to distinguish between all four nucleotides in ssDNA based on ion current signals and to resolve single nucleotides in ssDNA when dsDNA temporarily holds the nucleotides in the pore constriction. By adding DNA hairpins or dsDNA, it is possible to slow DNA progression through the pores so that the nucleotide sequence can be observed. It is also possible to genetically engineer MspA to optimize the constriction zone for nanopore sequencing. For example, a mutant version of MspA was designed that has a neutral, uncharged constriction zone allowing charged ssDNA to move through it unhindered. Similarly, subsequent experiments suggest that punctuating ssDNA with double-stranded sequence (duplex-interrupted nanopore sequencing) can also slow nucleotide movement through the channel to aid sequencing, since dsDNA tends to get stuck in the pore and cannot pass through until dissociated. The ultimate aim is to develop an MspA-based nanopore-sequencing strategy that can read the sequence of unmodified DNA and is cheap as well as easy to perform. For this purpose, other strategies for slowing ssDNA movement through the channels are being devised and tested. This approach has been patented and is expected to become commercially available.
IBM Research’s DNA Transistor technology offers true single-molecule sequencing by decoding molecules of DNA while they are threaded through a nanometer-sized pore in a silicon chip. Ultimately, the technology could improve throughput and reduce costs to where the whole human genome can be sequenced at a cost of between $100 and $1,000.
In nanopore strand sequencing, a single strand of DNA moves through a narrow pore and the bases are identified as they pass a reading head (Cockroft et al. 2008). This is a rapid real-time technology, which is the cornerstone of a number of advanced single-molecule DNA-sequencing concepts and does not require the time-consuming cyclic addition of reagents. After implementing a chip with a million pores, the researchers expect nanopore sequencing to achieve a 15-min genome by end of 2014 with a very short sample preparation time.
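The projected figures can be turned into a rough per-pore requirement. The sketch below assumes a genome of about 3 billion bases, that all pores read in parallel for the whole run, and that sample preparation and analysis time are ignored; the function name and the coverage parameter are illustrative additions, not part of the cited projection.

```python
def required_rate_per_pore(genome_bases, pores, minutes, coverage=1.0):
    """Bases per second each pore must read to finish within the run time."""
    seconds = minutes * 60
    return genome_bases * coverage / (pores * seconds)


# One pass over ~3e9 bases with a million pores in 15 minutes.
print(round(required_rate_per_pore(3.0e9, 1_000_000, 15), 2))  # prints 3.33

# The same run at an assumed 30-fold coverage.
print(round(required_rate_per_pore(3.0e9, 1_000_000, 15, coverage=30), 1))  # prints 100.0
```

Under these assumptions a single pass needs only a few bases per second per pore, while any realistic coverage depth raises the requirement proportionally.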
Although the DNA sequence information obtained from nanopores comes from the signal collected during DNA translocation, the throughput of the method is determined by the rate at which molecules arrive and thread into the pores. The process of DNA capture into nanofabricated SiN pores of molecular dimensions has been investigated (Wanunu et al. 2010). For fixed analyte concentrations, there is an increase in capture rate as the DNA length increases from 800 to 8,000 base pairs, a length-independent capture rate for longer molecules, and increasing capture rates when ionic gradients are established across the pore. Furthermore, the study showed that application of a 20-fold salt gradient allows the detection of picomolar DNA concentrations at high throughput. The salt gradients enhance the electric field, focusing more molecules into the pore, thereby advancing the possibility of analyzing unamplified DNA samples using nanopores. This technology has been licensed to NobleGen Biosciences Inc. for further development.
Many of the technical hurdles preventing the use of nanopores in DNA sequencing (slowing transit through the pore, aligning bases properly for reading, designing the proper-sized nanopore) have been largely addressed. Nanoscale cover plates have been used to convert nanopores in solid-state membranes into versatile devices for label-free molecular sensing, and the custom apertures in the nanoplates can be chemically addressed for sequence-specific detection of DNA (Wei et al. 2012). This is called “DNA origami,” i.e., the art of programming strands of DNA to fold into custom-designed structures with specified chemical properties. Different chemical components beyond DNA could be attached to the appropriate site on a DNA nanoplate.
The commercial potential of a nanopore-based sequencer is enormous: rapid speed, single-molecule analysis, no need for imaging, and tremendous scalability. Oxford Nanopore Technologies® is developing the GridION™ system and miniaturized MinION™ device. These are a new generation of electronic molecular analysis systems for use in scientific research and personalized medicine. The platform technology uses nanopores to analyze single molecules including DNA/RNA and proteins.
One problem with nanopore-based sequencing has been that DNA strands move too quickly through nanopores for electric signatures to be captured. But a technology has demonstrated the ability to resolve changes in current that correspond to a known DNA sequence by combining the high sensitivity of a mutated form of the protein pore, Mycobacterium smegmatis porin A (MspA) with phi29 DNA polymerase (DNAP), which controls the rate of DNA translocation through the pore (Manrao et al. 2012). As phi29 DNAP synthesizes DNA and functions like a motor to pull a single-stranded template through MspA, well-resolved and reproducible ionic current levels were observed with median durations of ∼28 ms and ionic current differences of up to 40 pA. Using six different DNA sequences with readable regions 42–53 nucleotides long, the authors recorded current traces that map to the known DNA sequences. With single-nucleotide resolution and DNA translocation control, this system integrates solutions to two long-standing hurdles to nanopore sequencing.
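Resolving discrete current levels from a sampled trace can be sketched as a simple step detector. This is not the analysis used in the cited study; it is a minimal illustration that assumes a noise tolerance chosen by hand, and the sample values, sampling interval, and function name are hypothetical.

```python
def segment_levels(samples, dt_ms, tol_pa):
    """Group consecutive samples into levels.

    A new level starts when a sample deviates from the running mean of
    the current level by more than tol_pa. Returns a list of
    (mean_current_pa, duration_ms) tuples.
    """
    if not samples:
        return []
    levels = []
    current = [samples[0]]
    for value in samples[1:]:
        mean = sum(current) / len(current)
        if abs(value - mean) > tol_pa:
            # Close the finished level and start a new one.
            levels.append((mean, len(current) * dt_ms))
            current = [value]
        else:
            current.append(value)
    levels.append((sum(current) / len(current), len(current) * dt_ms))
    return levels


trace = [60, 61, 59, 30, 31, 45, 44, 46, 45]
print(segment_levels(trace, dt_ms=7, tol_pa=5))
```

Each tuple pairs a mean current with a dwell time, which is the form in which level durations and current differences such as those reported above would be summarized.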
INanoBio is developing a nanosensor device for high-speed genome sequencing in collaboration with other institutions. Rather than slowing down the DNA as it translocates through a nanopore, its strategy is to sequence the DNA at high speed using its FET nanosensor devices. Field-effect transistor sensors can have high switching frequencies of up to 1 GHz or more, which enables faster signal acquisition with minimal background noise. The device is expected to sense individual bases at higher speeds, and with larger signal-to-noise ratios, than other strategies.
3.7 Detection of Single Molecules for Sequencing
Direct observation of single molecules is the most elegant sequencing technology because it enables rapid sequencing of even small amounts of DNA. Also referred to by the term “single-molecule genomics,” it includes a group of molecular methods in which single molecules are detected or sequenced. Single-molecule sequencing (SMS) enables analysis of genomic information without the need for cloning or amplification, enabling data generation in only a few hours versus days or weeks with current systems. This is an important requirement for eventual use in a clinical setting. Although technically challenging, the analysis of single molecules has the potential to play a major role in the delivery of truly personalized medicine (McCaughan and Dear 2010). The two main subgroups of single-molecule genomic methods are single-molecule digital PCR and SMS. Single-molecule PCR has a number of advantages over competing technologies, including improved detection of rare genetic variants and more precise analysis of CNVs, and is more easily adapted to the often small amount of material that is available in clinical samples.
Whereas NGS systems, such as SOLiD™, are suitable for WGS and expression profiling, SMS is more useful for analysis of clinically relevant genes in cancer and immunology and for deciphering RNA structure, viruses, and patterns of methylation. High-quality sequence data, obtained from a genome isolated from a single cell, would be a substantial breakthrough, particularly for cancer genomics. To achieve this, advances will be required in techniques for efficiently isolating intact long DNA molecules and for accurately reading the sequence content of these molecules (Metzker 2010).
Various systems for single-molecule sequencing are shown in Table 3. Most of these involve nanobiotechnology and are in development in the commercial sector.
Table 3
Systems for single-molecule sequencing
System | Technology/basis | Company/institute |
---|---|---|
DNA sequence by use of nanoparticles | Optical technique for the parallel manipulation of nanoscale structures with molecular resolution | Institute for Physical High Technology (Jena, Germany) |
High-throughput, single-molecule DNA sequencing | Sequencing-by-synthesis strategy using less expensive optics and clonally amplified DNA | GE Global Research (Niskayuna, NY) |
Helicos™ Genetic Analysis System | tSMS™: directly sequences single DNA molecules without amplification | Helicos Biosciences |
Molecular combing | Direct visualization of single DNA molecules attached to specially treated glass surfaces | Genomic Vision |
Nanopore sequencing | BASE™: used for accurate and continuous identification of DNA bases using nanopores | Oxford Nanopore Technologies |
Nanopore-based DNA sequencing instrument | Passing single strand of DNA through a nanoscale pore and direct optical reading of the sequence | Base4 Innovation/Hitachi High-Tech |
PNA-based single-molecule detection of specific DNA sequences | Hybridizing dsDNA with PNA probes and electrophoretically threading the DNA through nanopores | Boston University (Boston, MA) |
Single-molecule DNA sequencing by use of carbon nanotubes | ssDNA can translocate through single-walled carbon nanotubes | Arizona State University (Tempe, AZ) |
Single-molecule sequencing using Qdot nanocrystals | Light flashes from Qdot nanocrystals determine the sequence of each individual DNA strand | Life Technologies Corporation |
Single-molecule DNA sequencing in a sTOP chip nanowell | Incorporation of a nucleotide into the nascent DNA strand by the DNA polymerase and conversion of a photon signal into an electrical signal | Crack (Taiwan) |
SMRT (single-molecule real-time) sequencing | Zero-mode waveguide nanostructure arrays provide optical observation of parallel, simultaneous detection of thousands of single-molecule sequencing reactions | Pacific Biosciences |
3.8 Molecular Combing
Molecular Combing (Genomic Vision) is a powerful technique for the direct visualization of single DNA molecules attached, uniformly and irreversibly, to specially treated glass surfaces. This technology considerably improves the structural and functional analysis of DNA across the genome. Molecular Combing is a technology capable of exploring the entire genome at high resolution in a single analysis. Molecular Combing provides clear visualization of genomic anomalies in multiple aligned DNA molecules and has led to novel findings with implications in cancer genomics and medicine.
3.9 Optical Mapping
Optical Mapping (OpGen Inc.) involves the capture of multiple copies of whole genomes, as collections of long single DNA molecules isolated directly from cells without amplification or cloning, immobilized in dense arrays. The shotgun optical-mapping approach can directly map genomic DNA by the random mapping of single molecules. Markers are scored simultaneously, in a single cost-effective manipulation, to produce high-resolution Optical Maps that can be used to characterize and compare genomes from any organism with no need for prior sequence information. This is a case of the right technology at the right time. Recent publications report that insertions and deletions (indels) appear to be more important than SNPs in accounting for sequence variation, evolutionary change, and gene defects. Although Optical Mapping does detect SNPs, the system is primarily designed to identify genomic rearrangements, including indels, translocations, and repetitive elements, in any genome. As attention shifts from SNPs to indels, Optical Mapping is perhaps the only system that can detect these events quickly, cheaply, and with high resolution across entire genomes. The presence or absence of markers and their distance apart are scored to compare closely related genomes, to identify organisms, and to detect genomic rearrangements such as indels. Optical Mapping has the following advantages over other methods for whole-genome genetic analysis:
- The process involves only a single addition of reagents directly to native DNA, with no requirement for PCR, primers, or probes, providing massively parallel, low-cost marker analysis.
- Optical Mapping efficiently finds insertions, deletions, duplications, inversions, and translocations, which are not readily detected by other methods such as SNP assays and shotgun DNA sequencing.
- Optical Mapping can detect completely new and unsuspected genetic variation, whereas probe-based systems are limited to measuring differences that have been found previously in other samples.
- Optical Mapping can survey entire human genomes for insertions/deletions, which account for a significantly greater proportion of genetic variation between closely related genomes as compared to SNPs and are a major cause of gene defects.
The advantage of Optical Mapping platform’s freedom from dependence on sequence for de novo variant discovery has a downside to it, i.e., lower resolution than sequence-based approaches. The end points of any individual event can only be resolved to the nearest restriction site. This limitation is being addressed by developing alternative enzyme-based methods that increase marker density and add sequence information to mapped molecules. Algorithms are being developed to take advantage of the additional information for separating multiple genotypes at a single genomic locus. With further advances it will be possible to elucidate complex sequence-level events such as the somatic rearrangements that are a hallmark of cancer genomes. Optical Mapping has been used to create genome-wide restriction maps of a complete hydatidiform mole and three lymphoblast-derived cell lines (Teague et al. 2010). This approach was validated by demonstrating a strong concordance with existing methods.
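The way a restriction map reveals an insertion, and why the event is localized only to the nearest restriction sites, can be shown with a small in silico sketch. It is a simplification: the cut is placed at the start of each recognition site, the two maps are assumed to contain the same number of fragments, and the toy sequences and function names are invented for illustration. GAATTC is used as the recognition site (the EcoRI site).

```python
def restriction_map(seq, site):
    """Return ordered fragment lengths after cutting seq at every site.

    Simplification: the cut is placed at the first base of the site.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def size_differences(ref_map, sample_map, tol=0):
    """List (fragment_index, size_change) where two maps disagree."""
    if len(ref_map) != len(sample_map):
        raise ValueError("this sketch only compares maps of equal length")
    return [
        (i, s - r)
        for i, (r, s) in enumerate(zip(ref_map, sample_map))
        if abs(s - r) > tol
    ]


SITE = "GAATTC"
reference = "AAAA" + SITE + "CCCCCCCC" + SITE + "TTTT"
sample = "AAAA" + SITE + "CCCCCCCC" + "GGGGG" + SITE + "TTTT"  # 5-base insertion

ref_map = restriction_map(reference, SITE)
sample_map = restriction_map(sample, SITE)
print(ref_map, sample_map)                    # prints [4, 14, 10] [4, 19, 10]
print(size_differences(ref_map, sample_map))  # prints [(1, 5)]
```

The comparison reports that the second fragment grew by five bases but cannot say where inside that fragment the insertion lies, which mirrors the resolution limit described above.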
3.9.1 Nanopore-Based Single-Molecule Detection of Specific DNA Sequences
A purely electrical method has been described for the single-molecule detection of specific DNA sequences, achieved by hybridizing dsDNA with peptide nucleic acid (PNA) probes and electrophoretically threading the DNA through sub-5 nm silicon nitride pores (Singer et al. 2010). The single-molecule detection device is a solitary nanopore fabricated in a freestanding silicon nitride membrane using a focused electron beam. When a positive current is applied across the membrane using electrodes, negatively charged PNA-tagged DNA molecules are captured and guided through the nanopore. This provides control over the speed of the strand and enables researchers to identify bases by evaluating changes in ion current. The current is reduced to a value that reflects the displacement of electrolytes from the nanopore by the DNA segment. Sequence detection is performed by reading the ion current traces of individual translocating DNA molecules, which display a characteristic secondary blockade level, absent in untagged molecules. The potential for barcoding DNA has been demonstrated through nanopore analysis of once-tagged and twice-tagged DNA at different locations on the same genomic fragment.
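Counting tags from the secondary blockade level can be sketched as follows. The sketch assumes that each event is given as current samples normalized to the open-pore current, that bare DNA and PNA-tagged segments produce clearly separated levels, and that a tag must persist for a minimum number of samples to be counted. The threshold, sample values, and function name are hypothetical.

```python
def count_tags(event, tag_threshold, min_samples=2):
    """Count separate runs of samples below tag_threshold.

    A run is counted once it reaches min_samples consecutive samples,
    so isolated noise spikes are ignored.
    """
    tags = 0
    run = 0
    for value in event:
        if value < tag_threshold:
            run += 1
            if run == min_samples:
                tags += 1
        else:
            run = 0
    return tags


untagged = [0.80, 0.81, 0.79, 0.80]
once_tagged = [0.80, 0.60, 0.61, 0.80, 0.79]
twice_tagged = [0.80, 0.60, 0.60, 0.80, 0.59, 0.60, 0.61, 0.80]

for event in (untagged, once_tagged, twice_tagged):
    print(count_tags(event, tag_threshold=0.7))  # prints 0, then 1, then 2
```

Distinguishing once-tagged from twice-tagged molecules in this way is the basis of the barcoding idea mentioned above.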
This purely electrical method vastly improves nanopore sequencing, which is highly regarded for its potential to sequence whole genomes quickly and cheaply. Although the method solves the problem of how to detect single-nucleotide bases, which has plagued nanopore technology, it is unlikely to be used for WGS. PNAs have previously been used as part of a microfluidic approach to DNA sequencing. Sequence detection through PNA invasion of the DNA strand was problematic because it was primarily achieved through electrophoresis gel assays. Utilization of the inherent single-molecule nature of the nanopore system enables one to look at individual molecules, one by one, and judge whether or not they contain the PNA tag and, with it, the DNA sequence of interest.
This method has advantages over other single-molecule sensing methods that require the molecules to be located, which is a difficult task. With the nanopore method, long-range electric fields can be used to focus the DNA molecules from far away into the nanopore. The same electrodes that are used to generate this focusing field also induce ion flow across the pore, giving rise to the ion current, which is the main detection signal. One can draw the DNA toward and through the nanopore by manipulating the electric field, without having to modify either the DNA strand or the pore itself. PNA binding induces a structural change in the DNA. By detecting the PNA tags along the DNA molecule, one is therefore detecting the presence of the corresponding predefined sequence, which makes this method well matched to the nanopore system. The PNA signal is direct (enables quantification), fast, and does not require large sample sizes. This high-throughput long-read length method can be used to identify key sequences embedded in individual DNA molecules. This opens up a wide range of possibilities in human genomics as well as in pathogen detection for treating infectious diseases. Further advances in this approach may detect short DNA sequences embedded in a long DNA molecule. Thus, the single-molecule detection mechanism based on PNA-tagged ssDNA could help facilitate the use of nanopores for molecular diagnostics. Further research should explore methods geared toward detection of specific sequences for molecular diagnostics. By developing a method that searches for key sequences, one can forgo the need to sequence blood samples for every disease and instead look for key molecular signatures. Given the single-molecule nature of this method, it is quite feasible that it will be possible to forgo the current practice of target amplification. This would not only reduce errors but also simplify the sample preparation process, ultimately reducing costs dramatically.
Nanopatch™ technology (Electronic Biosciences) measures the activity of ion channels at the single-molecule level and can be adapted for DNA sequencing. Electronic Biosciences and its collaborators at the University of Utah have also successfully demonstrated translocation of single-stranded DNA oligomers through a protein ion channel (alpha-hemolysin) reconstituted in a lipid bilayer suspended across the quartz nanopore membrane’s orifice (Schibel et al. 2010). Nanopore detection of individual DNA abasic sites in single molecules was also achieved by this method (An et al. 2012).
GE Global Research (http://ge.geglobalresearch.com/) is developing a third-generation sequencing technology that aims to be less expensive and more accurate. It is working on a means to interrogate DNA that can be used for high-throughput, single-molecule DNA sequencing. This method can utilize less expensive optics by sequencing clonally amplified DNA or, alternatively, can be developed for a high-end system that can interrogate single molecules. The proposed method is a sequencing-by-synthesis strategy. It uses two separate innovations together to accomplish DNA sequencing. First, terminal phosphate-labeled nucleotides are used; the attached dye is removed from the growing DNA strand when the nucleotide is incorporated. Second, a method was developed to “freeze” the DNA polymerase as it incorporates this unique nucleotide. In this state, the three-part complex of DNA strand, polymerase, and incoming base is quite stable and can be washed and interrogated to determine the identity of the trapped base by the color of the dye in the complex. After interrogation, the polymerase is allowed to add that single base and proceed to form the complex on the next template base. This cycle is repeated over and over. The procedure takes place on a solid support in a microfluidic system. The advantages of this method are:
- It can be used to step DNA synthesis even through homopolymers, reducing some of the issues associated with blocks of identical bases.
- It can also be used for single-molecule sequencing and provides a stable signal that can be interrogated carefully but requires no chemistry besides DNA polymerization to completely remove the label used. Since a new enzyme is added for each step, enzyme stability is not an issue.
GE Global Research has demonstrated proof of concept for the method. It is now optimizing various aspects of its new sequencing system.
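The trap, interrogate, and incorporate cycle can be summarized in a toy simulation. It is only a schematic of the stepwise logic described above, with no chemistry modeled; the dye colors assigned to each nucleotide and the function name are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical dye color for each labeled nucleotide (illustrative only).
DYE_COLOR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOR_TO_BASE = {color: base for base, color in DYE_COLOR.items()}


def stepwise_read(template):
    """Simulate one cycle per template base and return (read, cycles)."""
    read = []
    cycles = 0
    for template_base in template:
        trapped = COMPLEMENT[template_base]  # nucleotide held in the frozen complex
        color = DYE_COLOR[trapped]           # interrogation step: observe dye color
        read.append(COLOR_TO_BASE[color])    # color identifies the trapped base
        cycles += 1                          # polymerase then adds the base and moves on
    return "".join(read), cycles


print(stepwise_read("AAAC"))  # prints ('TTTG', 4)
```

Because exactly one base is called per cycle, a run of identical template bases yields the same number of calls as cycles, which is the homopolymer advantage listed above.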
3.9.2 Single-Molecule DNA Sequencing by Use of Carbon Nanotubes
Carbon nanotubes, which have unique arrangements of carbon atoms, exhibit many special physical and chemical properties, and researchers have used these properties for DNA sequencing. ssDNA can translocate through a single-walled carbon nanotube (SWCNT) with a diameter of 1–2 nm (Liu et al. 2010). The authors fabricated devices in which one SWCNT spans a barrier between two fluid reservoirs, enabling direct electrical measurement of ion transport through the tube. A fraction of the tubes pass anomalously high ionic currents. Electrophoretic transport of small ssDNA oligomers through these tubes was marked by large transient increases in ion current and confirmed by PCR analysis. SWCNTs simplify the construction of nanopores, permit new types of electrical measurements, and may open avenues for control of DNA translocation. It is possible to slow the rate of translocation to a speed where reading the sequence may actually be possible. Researchers are refining the technique to make DNA analysis much faster and more accurate than the currently available methods.
A reader has been developed with this technology that can discriminate between DNA’s four chemical components (Chang et al. 2010). The authors constructed two electrodes, one on the end of a microscope probe and another on the surface. The end of each was chemically modified to attract and catch the DNA in the gap between them, like a pair of chemical tweezers. The gap between these functionalized electrodes had to be adjusted to find the chemical bonding sweet spot so that when a single chemical base of DNA passes through a 2.5-nm gap between two gold electrodes, it momentarily sticks to the electrodes and a small increase in the current is detected. With a smaller gap, the molecules would be able to bind in many configurations, confusing the readout. With a larger gap, smaller bases would not be detected. At this scale, which is just a few atomic diameters wide, quantum phenomena are at play: electrons can leak from one electrode to the other, tunneling through the DNA bases in the process. Each of the chemical bases of the DNA genetic code gives a unique electrical signature as it passes through the gap between the electrodes. It was discovered that just a single chemical modification to both electrodes could distinguish between all four DNA bases. The group is trying to adapt the reader to work in water-based solutions, which is a critical practical step for DNA-sequencing applications. It will also be desirable to combine reader capabilities with the carbon nanotube technology for reading short stretches of DNA.
3.9.3 Single-Molecule Sequencing Using Qdot Nanocrystals
Qdot® nanocrystals (Life Technologies), already known for detection of individual protein molecules, have been used in an SMS core sequencing engine. Compared to conventional fluorescence detection with organic dye molecules, the Qdot approach generates signals more than 100 times greater, enabling simple single-molecule detection. The SMS system uses specially designed sequencing versions of these nanocrystals, attached to proprietary DNA polymerase molecules. The system monitors the real-time incorporation of nucleotides into individual growing DNA strands. As nucleotides are incorporated, they are energized by photons transferred from the Qdot nanocrystal, generating a characteristic colored flash of fluorescent light. The prototype SMS system records the time and color series of these light flashes to determine the DNA sequence of each individual DNA strand. A key feature of this approach is that it looks for correlated fluorescence flashes as Qdot signals decrease, which should improve the error profile of noisy single-molecule data.
A unique aspect of the SMS system is reagent exchange, where individual Qdot polymerases and synthesized templates can be removed and replaced with new ones. This enables immobilized individual DNA templates to be resequenced several times, enabling highly accurate reads with minimal sample preparation. In addition, reagent exchange enables linking of multiple long reads to enable virtually unlimited read lengths.
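How repeated reads of the same immobilized template can raise accuracy can be sketched with a per-position majority vote. This is an illustration only, not the vendor's method: it assumes the repeated reads are already aligned, have equal length, and contain substitution errors only. The function name and the example reads are hypothetical.

```python
from collections import Counter


def consensus(reads):
    """Return the per-position majority base across equal-length reads."""
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("this sketch assumes reads of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in this column
        for column in zip(*reads)
    )


# Three passes over one template, each with at most one substitution error.
passes = ["ACGTAC", "ACGTTC", "ACCTAC"]
print(consensus(passes))  # prints ACGTAC
```

Errors that occur at different positions in different passes are outvoted, so accuracy improves with the number of passes as long as the errors are independent.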
Early stage results from this SMS technology show promise to combine unlimited continuous long read lengths with unmatched accuracy for delivering targeted genomic sequence data in a matter of hours, which will facilitate adoption of sequencing as a routine tool in research laboratories as well as clinical settings. Potential applications include haplotype phasing of the human genome.
3.9.4 Single-Molecule DNA Sequencing in a sTOP Chip Nanowell
sTOP (sequencing is carried out on Top of a Photodiode) technology is developed by Crack of Taiwan with a pending patent. Upon incorporation of a nucleotide into the nascent DNA strand by the DNA polymerase at the bottom of the reaction site, the fluorophore is activated and emits light, the nature of which depends on the nucleotide. This photon signal passes through the filter layer (whereas excitation light does not), to reach the photodiode where photons are converted into an electric signal. The properties of this signal are in turn decoded by the logical circuit adjacent to the diode, leading to the identification of the incorporated nucleotide. If the DNA synthesis reaction does proceed, signals are measured, which correspond to the successive steps of nucleotide incorporation, and enable sequence determination.


