Genes in individuals
Microbial genes in different individuals
The next challenge we addressed was to determine which genes of the catalog are present in each individual, and at what frequency. This makes it possible to search for associations between genes and disease. For this purpose we followed a two-pronged approach, using either very high-throughput DNA sequencing or DNA arrays.
Very high-throughput sequencing generates short sequence “tags” that originate from genes present in the sample and thus in our gene catalog. We developed procedures to map the tags onto the catalog genes, and thereby to count the genes present in each individual. Routinely, we generate some 30 million tags for each sample and can thus count genes with high accuracy, as the number of genes is substantially lower (slightly above half a million, on average). We developed the bioinformatics procedures required to determine gene frequency efficiently in our samples. Almost 400 individuals were analyzed in this way.
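The counting step can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: it assumes the tags have already been mapped, so that each mapped tag is represented by the identifier of the catalog gene it hit, and it assumes that counts are normalized by gene length before being converted to relative frequencies. The function name `gene_frequencies`, the gene identifiers, and the gene lengths are all hypothetical.

```python
from collections import Counter


def gene_frequencies(tag_hits, gene_lengths):
    """Estimate relative gene frequencies in one sample.

    tag_hits     -- iterable of gene identifiers, one per mapped tag
                    (hypothetical input; mapping is assumed to be done already)
    gene_lengths -- dict mapping gene identifier to gene length in base pairs
    """
    # Count how many tags were mapped onto each catalog gene.
    counts = Counter(tag_hits)

    # Assumption: divide by gene length so that long genes, which
    # attract more tags, are not over-represented.
    coverage = {gene: n / gene_lengths[gene] for gene, n in counts.items()}

    total = sum(coverage.values())
    if total == 0:
        return {}

    # Rescale so that the frequencies of all detected genes sum to 1.
    return {gene: c / total for gene, c in coverage.items()}


# Toy example with made-up genes and lengths.
hits = ["geneA", "geneA", "geneB", "geneA", "geneC", "geneB"]
lengths = {"geneA": 300, "geneB": 200, "geneC": 100}
freqs = gene_frequencies(hits, lengths)
```

In this toy sample each gene receives one tag per 100 base pairs, so the three genes end up with equal frequencies even though their raw tag counts differ.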
We also developed arrays that make it possible to measure gene frequency and, importantly, gene expression. Gene frequency is measured by analyzing the DNA from each sample, and gene expression by analyzing the RNA. Present-day technology cannot accommodate all of the catalog genes on a single array (two are required), which makes routine use of arrays containing the complete gene set laborious and costly. We therefore also developed a simplified version that carries the most informative genes on a single array, together with protocols for its efficient use. This array has already been used to analyze over 150 individuals.