Begun in November 2009, the Genomic Data Analysis Innovation Project addresses the data management needs of the next-generation sequencing community. Next-generation sequencing is a fundamental technology in research areas involving the diagnosis and treatment of disease (especially cancer and autoimmune illnesses), food production and processing, and environmental remediation.
The project is an opportunity to centralise the efforts of several major institutions to make effective use of gene sequencing instruments. Each institution will house its own data repository. Ultimately, a central repository with the same structure will be available. Users will be able to download their results from the repository and, optionally, share results with other users by granting permission. Users will then be able to push the results of tertiary analysis back into the repository for archiving or sharing.
Next-generation sequencing machines are changing the face of gene sequence analysis. Quantities of data are now best measured in terabytes, and the curve is trending upwards. The bottleneck has moved from obtaining sequence data to storing and analysing the data.
The Intersect project is centralising the efforts of several major institutions to enable genomic sequencing, storage, sharing and analysis for next-generation sequencing platforms. The project will be completed in March 2010.