Accessing the data

Posted: Tue Sep 01, 2009 1:40 pm
by jcbradley
Where are the data, processed or not, located on this site? What tools do you need to visualize them?

Re: Accessing the data

Posted: Tue Sep 08, 2009 11:17 am
by donpellegrino
Thanks for the post. I am working on a new page for the site to document the methodology in terms of the data transformations. I will include links to all of the data involved in each experiment. I'll let you know when I have the new page on-line.

The source code for the visualization tool can be found in the source control management system at ... 05/commit/. The page includes links to download the source code of the latest version in zip and tar formats. To make the tool easier for non-programmers to run, I am working on a distribution via a Firefox plug-in. More on that effort can be found in another post here in the forums (viewtopic.php?f=2&t=6).

Data Documentation

Posted: Tue Sep 08, 2009 1:21 pm
by donpellegrino
I have added a new page to the site that describes the data aspects of the project. It is on-line at []. The page documents the general data collection process for the sequence data. It also lists the available meta-data for the sequence records. I plan to add additional pages for each run and experiment. Links to download specific sets of data could be added to those pages. Let me know what you think, and whether you would like to see specific sets made available in specific formats to support analyses.

Re: Accessing the data

Posted: Sat Jan 16, 2010 10:51 am
by donpellegrino
I have created a new code repository to expose, as a single HDF5 file, all of the original influenza data that is collected as well as the post-processed results. The new code repository, "exp007: Influenza Data Processing", is online at []. The attached diagram is an export of the Dia file in that repository; it documents the NCBI sources and the aggregator that I am building to read the records and calculate the derived fields.
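
For anyone who wants a feel for what an aggregator of this kind does, here is a minimal sketch in Python. It is not the code in the exp007 repository. The FASTA-style input, the SequenceRecord class, and the two derived fields (sequence length and GC fraction) are all assumptions picked for illustration, and the example records are made up. Writing the result to HDF5 is left out because that needs a third-party library.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    """One source record plus its derived fields (all names hypothetical)."""
    accession: str
    sequence: str
    length: int = 0
    gc_fraction: float = 0.0


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def aggregate(text):
    """Read every record and calculate its derived fields."""
    records = []
    for header, seq in parse_fasta(text):
        seq = seq.upper()
        gc_count = seq.count("G") + seq.count("C")
        records.append(SequenceRecord(
            # Assumption: the first token of the header is the accession.
            accession=(header.split() or [""])[0],
            sequence=seq,
            length=len(seq),
            gc_fraction=gc_count / len(seq) if seq else 0.0,
        ))
    return records


# Made-up input, only to exercise the code above.
EXAMPLE = """>SEQ1 made-up record
ATGCGC
AT
>SEQ2 another made-up record
aaaa
"""

records = aggregate(EXAMPLE)
for rec in records:
    print(rec.accession, rec.length, rec.gc_fraction)
```

Keeping the derived fields next to the original sequence in one record mirrors the idea of shipping source data and post-processed results together in a single file.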