-rw-r--r-- README 35
-rw-r--r-- data/ProteinNames.txt 275
-rw-r--r-- doc/Data Deployments.dia bin 0 -> 3566 bytes
3 files changed, 310 insertions, 0 deletions
diff --git a/README b/README
new file mode 100644
index 0000000..9caedb8
--- /dev/null
+++ b/README
@@ -0,0 +1,35 @@
Experiment 007
Don Pellegrino [don@drexel.edu]

Collection and inventory of influenza data.

INTRODUCTION

The "Influenza Virus Resource" at NCBI
[http://www.ncbi.nlm.nih.gov/genomes/FLU/] exposes the sequence records and
their meta-data in a number of different ways. An exploration of the
phylogenetic properties of the records first requires that the available data
be collected and inventoried.

Two primary alternatives have been identified for managing the data. A
relational database can be used. IBM DB2 has been used for this. The use of
a relational database is limited by the difficulty in sharing the data. Each
vendor uses incompatible import and export routines. Additionally, installing
an instance of a database management system (DBMS) often requires a large
amount of effort and may not be practical on hosted environments which do not
support the running of user daemons. Finally, proper parallelization of a DBMS
will require additional system-specific configuration for each machine used.

An alternative to the DBMS is to use a container file format such as HDF5.
This has the advantage that all of the data can be collected into a single
file which can then be shared with others. It has the disadvantage that it
lacks the robust search and SQL operations provided by a DBMS. In addition, the
two alternatives use fundamentally different storage strategies, with the DBMS
using a relational model and the container file format using a hierarchical
model.

The "doc/Data Deployments.dia" diagram shows the source systems that
expose the various records as well as the transform routines that are
used for aggregation of the data on the local system.

 LocalWords: NCBI parallelization HDF SQL Pellegrino phylogenetic DBMS dia
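
Review note: the contrast the README draws between a relational and a hierarchical store can be sketched as follows. This is only an illustration under stated assumptions, not code from the commit: Python's built-in sqlite3 stands in for DB2, a nested dict stands in for an HDF5 group hierarchy, and the table name, column names, and function names are hypothetical. The sequences are the first ten residues of three entries in data/ProteinNames.txt.

```python
import sqlite3

# Hypothetical records: (virus type, protein name, first ten residues).
RECORDS = [
    ("A", "PB2", "MERIKELRNL"),
    ("A", "HA", "MKTIIALSYI"),
    ("B", "HA", "MKAIIVLLMV"),
]


def build_relational(records):
    """Load records into an in-memory SQLite table (stand-in for a DBMS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE protein (virus TEXT, name TEXT, seq TEXT)")
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", records)
    return conn


def build_hierarchical(records):
    """Nest records as virus -> protein -> sequence (stand-in for HDF5 groups)."""
    tree = {}
    for virus, name, seq in records:
        tree.setdefault(virus, {})[name] = seq
    return tree


conn = build_relational(RECORDS)
# Relational model: a search across all virus types is one SQL statement.
ha_rows = conn.execute(
    "SELECT virus, seq FROM protein WHERE name = ? ORDER BY virus", ("HA",)
).fetchall()

tree = build_hierarchical(RECORDS)
# Hierarchical model: lookup by path is direct, but the same
# cross-cutting search has to walk the tree by hand.
ha_from_tree = sorted(
    (virus, proteins["HA"]) for virus, proteins in tree.items() if "HA" in proteins
)
```

Both stores hold the same data. The difference the README points to is where the search logic lives: in the SQL engine for the relational case, and in user code for the container case.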
diff --git a/data/ProteinNames.txt b/data/ProteinNames.txt
new file mode 100644
index 0000000..13ae313
--- /dev/null
+++ b/data/ProteinNames.txt
@@ -0,0 +1,275 @@
>A_PB2

MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPER
NEQGQTLWSKMSDAGSDRVMVSPLAVTWWNRNGPVTSTVHYPKVYKTYFDKVERLKHGTFGPVHFRNQVK
IRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSESQLTITKEKKEELRDCKISPLMVAYMLERE
LVRKTRFLPVAGGTSSIYIEVLHLTQGTCWEQMYTPGGGVRNDDVDQSLIIAARNIVRRAAVSADPLASL
LEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTSGSSVKKEEEVLTGNLQ
TLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNF
VNRANQRLNPMHQLLRHFQKDAKVLFQNWGVEHIDSVMGMIGVLPDMTPSTEMSMRGIRVSKMGVDEYSS
TERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTERLTITYSSSMMWEINGPESVLVNTYQWIIRNWEAV
KIQWSQNPAMLYNKMEFEPFQSLVPKAIRSQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSR
MQFSSLTVNVRGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLIEDPDESTSGVESAVLRGFLIIGKE
DRRYGPALSINELSNLAKGEKANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN

>A_PB1
MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPID
GPLPEDNEPSGYAQTDCVLEAMAFLEESHPGIFENSCLETMEAVQQTRVDKLTQGRQTYDWTLNRNQPAA
TALANTIEVFRSNGLTANESGRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQ
RVNKRGYLIRALTLNTMTKDAERGKLKRRAIATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKA
KLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLAMITYITKNQPEWFRNILSIAPIMFSNKMAR
LGKGYMFESKRMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGTASLSPGMMMGMFNMLSTVLG
VSILNLGQKKYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKKKSYINKTGTF
EFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYR
CHRGDTQIQTRRSFELKKLWDQTQSRAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDENYRGRLCNPLNP
FVSHKEIESVNNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFF
PSSSYRRPIGISSMVEAMVSRARIDARIDFESGRIKKEEFSEIMKICSTIEELRRQK

>A_PB1-F2
MEQEQGTPWTQSTEHTNIQRRGSGRQIQKLGHPNSTQLMDHYLRIMNQVDMHKQTVSWRLWPSLKNPTQV
SLRTHALKQWKPFNRQGWTN

>A_PA

MEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIVVELDDPNA
LLKHRFEIIEGRDRTMAWTVVNSICNTTGAGKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKS
ENTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMANRGLWDSFRQSERGEETIEEKFEITGT
MRRLADQSLPPNFSCLENFRAYVDGFEPNGCIEGKLSQMSKEVNAQIEPFLKTTPRPIKLPNGPPCYQRS
KFLLMDALKLSIEDPSHEGEGIPLYDAIKCIKTFFGWKEPYIVKPHEKGINSNYLLSWKQVLSELQDIEN
EEKIPRTKNMKKTSQLKWALGENMAPEKVDFENCRDISDLKQYDSDEPELRSLSSWIQNEFNKACELTDS
VWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDFQLIPMISKCR
TKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQISRP
MFLYVRTNGTSKVKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSEAWPIGESPKGVEE
GSIGKVCRTLLAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWV
LLNASWFNSFLTHALK

>A_HA
MKTIIALSYILCLVFAQKLPGNDNSTATLCLGHHAVPNGTIVKTITNDQIEVTNATELVQSSSTGEICDS
PHQILDGENCTLIDALLGDPQCDGFQNKKWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNES
FNWTGVTQNGTSSACIRRSNNSFFSRLNWLTHLKFKYPALNVTMPNNEKFDKLYIWGVHHPGTDNDQIFL
YAQASGRITVSTKRSQQTVIPNIGSRPRVRNIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRSGK
SSIMRSDAPIGKCNSECITPNGSIPNDKPFQNVNRITYGACPRYVKQNTLKLATGMRNVPEKQTRGIFGA
IAGFIENGWEGMVDGWYGFRHQNSEGIGQAADLKSTQAAIDQINGKLNRLIGKTNEKFHQIEKEFSEVEG
RIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCD
NACIGSIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVALLGFIMWACQKGNI
RCNICI

>A_NP
MASQGTKRSYEQMETDGDRQNATEIRASVGKMIDGIGRFYIQMCTELKLSDHEGRLIQNSLTIEKMVLSA
FDERRNKYLEEHPSAGKDPKKTGGPIYRRVDGKWMRELVLYDKEEIRRIWRQANNGEDATSGLTHIMIWH
SNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGIGTMVMELIRMVKRGINDRNFWRGE
NGRKTRSAYERMCNILKGKFQTAAQRAMVDQVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACA
YGPAVSSGYDFEKEGYSLVGIDPFKLLQNSQIYSLIRPNENPAHKSQLVWMACHSAAFEDLRLLSFIRGT
KVSPRGKLSTRGVQIASNENMDNMGSSTLELRSGYWAIRTRSGGNTNQQRASAGQTSVQPTFSVQRNLPF
EKSTIMAAFTGNTEGRTSDMRAEIIRMMEGAKPEEVSFRGRGVFELSDEKATNPIVPSFDMSNEGSYFFG
DNAEEYDN

>A_NA
MNPNQKIITIGSVSLTISTICFFMQTAILITTVTLHFKQYEFNSPPNNQVMLCEPTIIERNITEIVYLTN
TTIEKEICPKLAEYRNWSKPQCDITGFAPFSKDNSIRLSAGGDIWVTREPYVSCDPDKCYQFALGQGTTL
NNVHSNDTVRDRTPYRTLLMNELGVPFHLGTKQVCIAWSSSSCHDGKAWLHVCITGDDKNATASFIYNGR
LVDSIVSWSKEILRTQESECVCINGTCTVVMTDGSASGKADTKILFIEEGKIVHTSTLSGSAQHVEECSC
YPRYPGVRCVCRDNWKGSNRPIVDINIKDHSIVSSYVCSGLVGDTPRKNDSSSSSHCLDPNNEEGGHGVK
GWAFDDGNDVWMGRTISEKSRLGYETFKVIEGWSNPKSKLQINRQVIVDRGNRSGYSGIFSVEGKSCINR
CFYVELIRGRKEETEVLWTSNSIVVFCGTSGTYGTGSWPDGADINLMPI

>A_NA
MNTNQRIITIGTICLIVGIISLLLQIGNIILLWMSHSIQTGEKSHPKVCNQSVITYENNTWVNQTYVNIS
NTNIAAGQGVTPIILAGNSSLCPISGWAIYSKDNSIRIGSKGDIFVMREPFISCSHLECRTFFLTQGALL
NDRHSNGTVKDRSPYRTLMSCPIGEAPSPYNSRFESVAWSASACHDGMGWLTIGISGPDNGAVAVLKYNG
IITDTIKSWRNKILRTQESECVCINGSCFTIMTDGPSNGQASYKLFKMEKGKIIRSIELDAPNYHYEECS
CYPDTGKVVCVCRDNWHASNRPWVSFDQNLDYQIGYICSGVFGDNPRSNDGKGNCGPVLSNGANGVKGFS
FRYGNGVWIGRTKSISSRSGFEMIWDPNGWTETDSSFSMKQDIIALTDWSGYSGSFVQHPELTGMNCIRP
CFWVELIRGQPKESTIWTSGSSISFCGVNSGTASWSWPDGADLPFTIDK

>A_M1
MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFSGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPS
ERGLQRRRFVQNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTT
EVAFGLVCATCEQIADSQHRSHRQMVATTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEIASQAR
QMVQAMRAIGTHPSSSTGLRDDLLENLQTYQKRMGVQMQRFK

>A_M2
PIRNEWGCRCNDSSDPLVVAANIIGILHLILWILDRLFFKCVYRLFKHGLKRGPSTEGVPE
SMREEYRKEQQNAVDADDSHFVSIELE

>A_NS1
MDSNTVSSFQVDCFLWHIRKQVVDQELSDAPFLDRLRRDQRSLRGRGNTLGLDIKAATHVGKQIVEKILK
EESDEALKMTMVSTPASRYITDMTIEELSRNWFMLMPKQKVEGPLCIRMDQAIMEKNIMLKANFSVIFDR
LETIVLLRAFTEEGAIVGEISPLPSFPGHTIEDVKNAIGVLIGGLEWNDNTVRVSKNLQRFAWRSSNENG
GPPLTPKQKREMARTARSKV

>A_NS2
DILLRMSKMQLGSSSEDLNGMITQFESLKIYRDSLGEAVMRMGDLHLLQNRNGKWREQLG
QKFEEIRWLIEEVRHRLKTTENSFEQITFMQALQLLFEVEQEIRTFSFQLI
>B_PB1
MNINPYFLFIDVPIQAAISTTFPYTGVPPYSHGTGTGYTIDTVIRTHEYSNKGKQYISDVTGCTMVDPTN
GPLPEDNEPSAYAQLDCVLEALDRMDEEHPGLFQAASQNAMEALMVTTVDKLTQGRQTFDWTVCRNQPAA
TALNTTITSFRLNDLNGADKGGLIPFCQDIIDSLDRPEMTFFSVKNIKKKLPAKNRKGFLIKRIPMKVKD
KITKVEYIKRALSLNTMTKDAERGKLKRRAIATAGIQIRGFVLVVENLAKNICENLEQSGLPVGGNEKKA
KLSNAVAKMLSNCPPGGISMTVTGDNTKWNECLNPRIFLAMTERITRDSPIWFRDFCSIAPVLFSNKIAR
LGKGFMITSKTKRLKAQIPCPDLFSIPLERYNEETRAKLKKLKPFFNEEGTASLSPGMMMGMFNMLSTVL
GVAALGIKNIGNKEYLWDGLQSSDDFALFVNAKDEETCMEGINDFYRTCKLLGVNMSKKKSYCNETGMFE
FTSMFYRDGFVSNFAMELPSFGVAGVNESADMAIGMTIIKNNMINNGMGPATAQTAIQLFIADYRYTYKC
HRGDSKVEGKRMKIIKELWENTKGRDGLLVADGGPNIYNLRNLHIPEIVLKYNLMDPEYKGRLLHPQNPF
VGHLSIEGIKEADITPAHGPVKKMDYDAVSGTHSWRTKRNRSILNTDQRNMILEEQCYAKCCNLFEACFN
SASYRKPVGQHSMLEAMAHRLRMDARLDYESGRMSKDDFEKAMAHLGEIGYI

>B_PB2

MTLAKIELLKQLLRDNEAKTVLKQTTVDQYNIIRKFNTSRIEKNPSLRMKWAMCSNFPLALTKGDMANRI
PLEYKGIQLKTNAEDIGTKGQMCSIAAVTWWNTYGPIGDTEGFERVYESFFLRKMRLDNATWGRITFGPV
ERVRKRVLLNPLTKEMPPDEASNVIMEILFPKEAGIPRESTWIHRELIKEKREKLKGTMITPIVLAYMLE
RELVARRRFLPVAGATSAEFIEMLHCLQGENWRQIYHPGGNKLTESRSQSMIVACRKIIRRSIVASNPLE
LAVEIANKTVIDTEPLKSCLAAIDGGDVACDIIRAALGLKIRQRQRFGRLELKRISGRGFKNDEEILIGN
GTIQKIGIWDGEEEFHVRCGECRGILKKSKMKLEKLLINSAKKEDMRDLIILCMVFSQDTRMFQGVRGEI
NFLNRAGQLLSPMYQLQRYFLNRSNDLFDQWGYEESPKASELHGINESMNASDYTLKGVVVTRNVIDDFS
STETEKVSITKNLSLIKRTGEVIMGANDVSELESQAQLMITYDTPKMWEMGTTKELVQNTYQWVLKNLVT
LKAQFLLGKEDMFQWDAFEAFESIIPQKMAGQYSGFARAVLKQMRDQEVMKTDQFIKLLPFCFSPPKLRS
NGEPYQFLKLVLKGGGENFIEVRKGSPLFSYNPQTEVLTICGRMMSLKGKIEDEERNRSMGNAVLAGFLV
SGKYDPDLGDFKTIEELEKLKPGEKANILLYQGKPVKVVKRKRYSALSNDISQGIKRQRMTVESMGWALS

>B_PA

MDTFITRNFQTTIIQKAKNTMAEFSEDPELQPAMLFNICVHLEVCYVISDMNFLDEEGKAYTALEGQGKE
QNLRPQYEVIEGMPRTIAWMVQRSLAQEHGIETPKYLADLFDYKTKRFIEVGITKGLADDYFWKKKEKLG
NSMELMIFSYNQDYSLSNESSLDEEGKGRVLSRLTELQAELSLKNLWQVLIGEEDVEKGIDFKLGQTISR
LRDISVPAGFSNFEGMRSYIDNIDPKGAIERNLARMSPLVSVTPKKLTWEDLRPIGPHIYDHELPEVPYN
AFLLMSDELGLANMTEGKSKKPKTLAKECLEKYSTLRDQTDPILIMKSEKANENFLWKLWRDCVNTISNE
ETSNELQKTNYAKWATGDGLTYQKIMKEVAIDDETMCQEEPKIPNKCRVAAWVQTEMNLLSTLTSKRALD
LPEIGPDVAPVEHVGSERRKYFVNEINYCKASTVMMKYVLFHTSLLNESNASMGKYKVIPITNRVVNEKG
ESFDMLYGLAVKGQSHLRGDTDVVTVVTFEFSSTDPRVDSGKWPKYTVFRIGSLFVSGREKSVYLYCRVN
GTNKIQMKWGMEARRCLLQSMQQMEAIVEQESSIQGYDMTKACFKGDRVNSPKTFSIGTQEGKLVKGSFG
KALRVIFTKCLMHYVFGNAQLEGFSAESRRLLLLIQALKDRKGPWVFDLEGMYSGIEECISNNPWVIQSA
YWFNEWLGFEKEGSKVLESVDEIMDE

>B_HA
MKAIIVLLMVVTSNADRICTGITSSNSPHVVKTATQGEVNVTGVIPLTTTPTKSHFANLKGTETRGKLCP
KCLNCTDLDVALGRPKCTGNIPSARVSILHEVRPVTSGCFPIMHDRTKIRQLPNLLRGYEHIRLSTHNVI
NAENAPGGPYKIGTSGSCPNVTNGNGFFATMAWAVPKNDNNKTATNSLTIEVPYICTEGEDQITVWGFHS
DNETQMAKLYGDSKPQKFTSSANGVTTHYVSQIGGFPNQTEDGGLPQSGRIVVDYMVQKSGKTGTITYQR
GILLPQKVWCASGRSKVIKGSLPLIGEADCLHEKYGGLNKSKPYYTGEHAKAIGNCPIWVKTPLKLANGT
KYRPPAKLLKERGFFGAIAGFLEGGWEGMIAGWHGYTSHGAHGVAVAADLKSTQEAINKITKNLNSLSEL
EVKNLQRLSGAMDELHNEILELDEKVDDLRADTISSQIELAVLLSNEGIINSEDEHLLALERKLKKMLGP
SAVEIGNGCFETKHKCNQTCLDRIAAGTFDAGEFSLPTFDSLNITAASLNDDGLDNHTILLYYSTAASSL
AVTLMIAIFVVYMVSRDNVSCSICL

>B_NP
MSNMDIDGINTGTIDKAPEEITSGTSGTTRPIIRPATLAPPSNKRTRNPSPERATTIGEADVGRKTQKKQ
TPTEIKKSVYNMVVKLGEFYNQMMVKAGLNDDMERNLIQNAHAVERILLAATDDKKTEFQKKKNARDVKE
GKEEIDHNKTGGTFYKMVRDDKTIYFSPIRVTFLKEEVKTMYKTTMGSDGFSGLNHIMIGHSQMNDVCFQ
RSKALKRVGLDPSLISTFAGSTLPRRSGATGVAIKGGGTLVAEAIRFIGRAMADRGLLRDIKAKTAYEKI
LLNLKNKCSAPQQKALVDQVIGSRNPGIADIEDLTLLARSMVVVRPSVASKVVLPISIYAKIPQLGFNVE
EYSMVGYEAMALYNMATPVSILRVGDDAKDKSQLFFMSCFGAAYEDLRVLSALTGTEFKPRSALKCKGFH
VPAKEQVEGMGAALMSIKLQFWAPMTRSGGNEVGGDGGSGQISCSPVFAVERPIALSKQAVRRMLSMNIE
GRDADVKGNLLKMMNDSMAKKTNGNAFIGKKMFQISDKNKTNPVEIPIKQTIPNFFFGRDTAEDYDDLDY
>B_NB
MNNATFNYTNVNPISHIRGSIIITICVSFIIILTIFGYIAKILTNRNNCTNNAIGLCKCIKCSGCEPFCN
KRGDTSSPRTGVDIPAFILPGLNLSESTPN

>B_NA
MLPSTIQTLTLFLTSGGVLLSLYVSASLSYLLYSDILLKFSPTEITAPTMPLDCANASNVQAVNRSATKG
VTLLLPEPEWTYPRLSCPGSTFQKALLISPHRFGETKGNSAPLIIREPFIACGPNECKHFALTHYAAQPG
GYYNGTRGDRNKLRHLISVKLGKIPTVENSIFHMAAWSGSACHDGKEWTYIGVDGPDNNALLKIKYGEAY
TDTYHSYANKILRTQESACNCIGGNCYLMITDGSASGVSECRFLKIREGRIIKEIFPTGRVKHTEECTCG
FASNKTIECACRDNSYTAKRPFVKLNVETDTAEIRLMCTDTYLDTPRPDDGSITGPCESNGDKGSGGIKG
GFVHQRMASKIGRWYSRTMSKTERMGMGLYVKYDGDPWADSDALAFSGVMVSMKEPGWYSFGFEIKDKKC
DVPCIGIEMVHDGGKETWHSAATAIYCLMGSGQLLWDTVTGVDMAL

>B_M1

MSLFGDTIAYLLSLTEDGEGKAELAEKLHCWFGGKEFDLDSALEWIKNKRCLTDIQKALIGASICFLKPK
DQERKRRFITEPLSGMGTTATKKKGLILAERKMRRCVSFHEAFEIAEGHESSALLYCLMVMYLNPGNYSM
QVKLGTLCALCEKQASHSHRAHSRAARSSVPGVRREMQMVSAMNTAKTMNGMGKGEDVQKLAEELQSNIG
VLRSLGASQKNGEGIAKDVMEVLKQSSMGNSALVKKYL

>B_BM2

MLEPFQILSICSFILSALHFMAWTIGHLNQIKRGINMKIRIKGPNKETINREVSILRHSYQKEIQAKETM
KEVLSDNMEVLSDHIIIEGLSAEEIIKMGETVLEIEELH

>B_NS1
MANNMTTTQIEVGPGATNATINFEAGILECYERLSWQRALDYPGQDRLNRLKRKLESRIKTHNKSEPESK
RMSLEERKAIGVKMMKVLLFMNPSAGIEGFEPYCMKSSSNSNCTKYNWTDYPSTPGRCLDDIEEEPEDVD
GPTEIVLRDMNNKDARQKIKEEVNTQKEGKFRLTIKRDMRNVLSLRVLVNGTFLKHPNGYKSLSTLHRLN
AYDQSGRLVAKLVATDDLTVEDEEDGHRILNSLFERLNEGHSKPIRAAETAVGVLSQFGQEHRLSPEEGD
N

>B_NS2
WRMKKMAIGSSTHSSSVLMKDIQSQFEQLKLRWESYPNLVKSTDYHQKRETIRLVTEEL
YLLSKRIDDNILFHKTVIANSSIIADMVVSLSLLETLYEMKDVVEVYSRQCL

>C_CM2
MGRMAMKWLVVIICFSITSQPASACNLKTCLKLFNNTDAVTVHCFNENQGYMLTLASLGLGIITMLYLLV
KIIIELVNGFVLGRWERWCGDIKTTIMPEIDSMEKDIALSRERLDLGEDAPDETDNSPIPFSNDGIFEI
>C_M1
MAHEILIAETEAFLKNVAPETRTAIISAITGGKSACKSAAKLIKNEHLPLMSGEATTMHIVMRCLYPEIK
PWKKASDMLNKATSSLKKSEGRDIRKQMKAAGDFLGVESMMKMRAFRDDQIMEMVEEVYDHPDDYTPDIR
IGTITAWLRCKNKKSERYRSNVSESGRTALKIHEVRKASTAMNEIAGITGLGEEALSLQRQTESLAILCN
HTFGSNIMRPHLEKAIKGVEGRVGEMGRMAMK
>C_NP
MSDRRQNRKTPDEQRKANALIINENIEAYIAICKEVGLNGDEMLILENGIAIEKAIRICCDGKYQEKREK
KAREAQRADSNFNADSIGIRLVKRAGSGTNITYHAVVELTSRSRIVQILKSHWGNELNRAKIAGKRLGFS
ALFASNLEAIIYQRGRNAARRNGSAELFTLTQGAGIETRYKWIMEKHIGIGVLIADAKGLINGKREGKRG
VDANVKLRAGTTGSPLERAMQGIEKKAFPGPLRALARRVVKANYNDAREALNVIAEASLLLKPQITNKMT
MPWCMWLAARLTLKDEFANFCAYAGRRAFEVFNIAMEKIGICSFQGTIMNDDEIESIEDKAQVLMMACFG
LAYEDFSLVSAMVSHPLKLRNRMKIGNFRVGEKVSTVLSPLLRFTRWAEFAQRFALQANTSREGAQISNS
AVFAVERKITTDVQRVEELLNKVQAHEDEPLQTLYKKVREQISIIGRNKSEIKEFLGSSMYDLNDQEKQN
PINFRSGAHPFFFEFDPDYNPIRVKRPKKPIAKRNSNISRLEEEGMDENSEIGQAKKMKPLDQLTSTSSN
IPGKN
>C_HE
MFFSLLLMLGLTEAEKIKICLQKQVNSSFSLHNGFGGNLYATEEKRMFELVKPKAGASVLNQSTWIGFGD
SRTDKSNSAFPRSADVSAKTADKFRSLSGGSLMLSMFGPPGKVDYLYQGCGKHKVFYEGVNWSPHAAINC
YRKNWTDIKLNFQKNIYELASQSHCMSLVNALDKTIPLQATAGVAKNCNNSFLKNPALYTQEVNPSVEKC
GKENLAFFTLPTQFGTYECKLHLVASCYFIYDSKEVYNKRGCDNYFQVIYDSSGKVVGGLDNRVSPYTGN
SGDTPTMQCDMLQLKPGRYSVRSSPRFLLMPERSYCFDMKEKGPVTAVQSIWGKGRESDHAVDQACLSTP
GCMLIQKQKPYIGEADDHHGDQEMRELLSGLDYEARCISQSGWVNETSPFTEEYLLPPKFGRCPLAAKEE
SIPKIPDGLLIPTSGTDTTVTKPKSRIFGIDDLIIGLLFVAIVEAGIGGYLLGSRKVSGGGVTKESAEKG
FEKIGNDIQILRSSTNIAIEKLNDRISHDEQAIRDLTLEIENARSEALLGELGIIRALLVGNISIGLQES
LWELASEITNRAGDLAVEVSPGCWVIDNNICDQSCQNFIFKFNETAPVPTIPPLDTKIDLQSDPFYWGSS
LGLAITAAISLAALVISGIAICRTK
>C_P3
MSKTFAEIAEAFLEPEAVRIAKEAVEEYGDHERKIIQIGIHFQVCCMFCDEYLSTNGSDRFVLIEGRKRG
TAVSLQNELCKSYDLEPLPFLCDIFDREEKQFVEIGITRKADDSYFQSKFGKLGNSCKIFVFSYDGRLDK
NCEGPMEEQKLRIFSFLATAADFLRKENMFNEIFLPDNEETIIEMKKGKTFLKLRDESVPLPFQTYEQMK
DYCEKFKGNPRELASKVSQMQSNIKLPIKHYEQNKFRQIRLPKGPMAPYTHKFLMEEAWMFTKISDPERS
RAGEILIDFFKKGNLSAIRPKDKPLQGKYPIHYKNLWNQIKAAIADRTMVINENDHSEFLGGIGRASKKI
PEVSLTQDVITTEGLKQSENKLPEPRSFPKWFNAEWMWAIKDSDLTGWVPMAEYPPADNELEDYAEHLNK
TMEGVLQGTNCAREMGKCILTVGALMTECRLFPGKIKVVPIYARSKERKSMQEGLPVPSEMDCLFGICVK
SKSHLNKDDGMYTIITFEFSIREPNLEKHQKYTVFEAGHTTVRMKKGESVIGREVPLYLYCRTTALSKIK
NDWLSKARRCFITTMDTVETICLRESAKAEENLVEKTLNEKQMWIGKKNGELIAQPLREALRVQLVQQFY
FCIYNDSQLEGFCNEQKKILMALEGDKKNKSSFGFNPEGLLEKIEECLINNPMCLFMAQRLNELVIEASK
RGAKFFKID
>C_PB1
MEINPYLMFLNNDVTSLISTTYPYTGPPPMSHGSSTKYTLETIKRTYDYSRTSVEKTSKVFNIPRRKFCN
CLEDKDELVKPTGNVDISSLLGLAEMMEKRMGEGFFKHCVMEAETEILKMHFSRLTEGRQTYDWTSERNM
PAATALQLTVDAIKETEGPFKGTTMLEYCNKMIEMLDWKEVKFRKVKTMVRREKDKRSGKEIKTKVPVMG
IDSIKHDEFLIRALTINTMAKDGERGKLQRRAIATPGMIVRPFSKIVETVAQKICEKLKESGLPVGGNEK
KAKLKTTVTSLNARMNSDQFAVNITGDNSKWNECQQPEAYLALLAYITKDSSDLMKDLCSVAPVLFCNKF
VKLGQGIRLSNKRKTKEVIIKAEKMGKYKNLMREEYKNLFEPLEKYIQKDVCFLPGGMLMGMFNMLSTVL
GVSTLCYMDEELKAKGCFWTGLQSSDDFVLFAVASNWSNIHWTIRRFNAVCKLIGINMSLEKSYGSLPEL
FEFTSMFFDGEFVSNLAMELPAFTTAGVNEGVDFTAAMSIIKTNMINNSLSPSTALMALRICLQEFRATY
RVHPWDSRVKGGRMKIINEFIKTIENKDGLLIADGGKLMNNISTLHIPEEVLKFEKMDEQYRNRVFNPKN
PFTNFDKTIDIFRAHGPIRVEENEAVVSTHSFRTRANRTLLNTDMRAMMAEEKRYQMVCDMFKSVFESAD
INPPIGAMSIGEAIEEKLLERAKMKRDIGAIEDSEYEEIKDIIRDAKKARIESR
>C_PB2
MSFLLTIAKEYKRLCQDAKAAQMMTVGTVSNYTTFKKWTTSRKEKNPSLRMRWAMSSKFPIIANKRMLEE
AQIPKEHNNVALWEDTEDVSKRDHVLASASCINYWNFCGPCVNNSEVIKEVYKSRFGRLERRKEIMWKEL
RFTLVDRQRRRVDTQPVEQRLRTGEIKDLQMWTLFEDEAPLASKFILDNYGLVKEMRSKFANKPLNKEVV
AHMLEKQFNPESRFLPVFGAIRPERMELIHALGGETWIQEANTAGISNVDQRKNDMRAVCRKVCLAANAS
IMNAKSKLVEYIKSTSMRIGETERKLEELILETDDVSPEVTLCKSALGGPLGKTLSFGPMLLKKISGSGV
KVKDTVYIQGVRAVQFEYWSEQEEFYGEYKSATALFSRKERSLEWITIGGGINEDRKRLLAMCMIFCRDG
DYFKDAPATITMADLSTKLGREIPYQYVMMNWIQKSEDNLEALLYSRGIVETNPGKMGSSMGIDGSKRAI
KSLRAVTIQSGKIDMPESKEKIHLELSDNLEAFDSSGRIVATILDLPSDKKVTFQDVSFQHPDLAVLRDE
KTAITKGYEALIKRLGTGDNDIPSLIAKKDYLSLYNLPEVKLMAPLIRPNRKGVYSRVARKLVSTQVTTG
HYSLHELIKVLPFTYFAPKQGMFEGRLFFSNDSFVEPGVNNNVFSWSKADSSKIYCHGIAIRVPLVVGDE
HMDTSLALLEGFSVCENDPRAPMVTRQDLIDVGFGQKVRLFVGQGSVRTFKRTASQRAASSDVNKNVKKI
KMSN
>C_NS1
MSDKTVKSTNLMAFVATKMLERQEDLDTCTEMQVEKMKTSTKARLRTESSFAPRTWEDAIKDGELLFNGT
ILQAESPTMTPASVEMKGKKFPIDFAPSNIAPIGQNPIYLSPCIPNFDGNVWEATMYHHRGATLTKTMNC
NCFQRTIWCHPNPSRMRLSYAFVLYCRNTKKICGYLIAKQVAGIETGIRKCFRCIKSGFVMATDEISLTI
LQSIKSGAQLDPYWGNETPDIDKTEAYMLSLREAGP
>C_NS2
EILRRSVD
TSSLNKWPELKQELENVSDALKADSLWLPMKSLSLYSKVSNQEPSSIPIGEMKHQILTRLKLICSRLEKL
DLNLSKAVLGIQNSEDLILIIYNRDVCKNTILMIKSLCNSLI
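
Review note: data/ProteinNames.txt follows the FASTA layout, with a header line starting with `>` followed by one or more wrapped lines of residues, sometimes with blank lines in between. A minimal sketch of reading such a file is given below, assuming Python. The helper name `parse_fasta` is hypothetical and does not appear in the commit. Records are returned as a list rather than a dict because the header `A_NA` occurs twice in the file and a dict would silently drop one entry.

```python
import io


def parse_fasta(handle):
    """Return a list of (header, sequence) pairs from a FASTA-style stream.

    Blank lines are skipped and wrapped sequence lines are joined.
    A list is used so that repeated headers are preserved in order.
    """
    records = []
    header, chunks = None, []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the record that was being collected.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Excerpt in the same layout as the file: the A_PB1-F2 entry and
# only the first line of the A_M2 entry.
SAMPLE = """>A_PB1-F2
MEQEQGTPWTQSTEHTNIQRRGSGRQIQKLGHPNSTQLMDHYLRIMNQVDMHKQTVSWRLWPSLKNPTQV
SLRTHALKQWKPFNRQGWTN

>A_M2
PIRNEWGCRCNDSSDPLVVAANIIGILHLILWILDRLFFKCVYRLFKHGLKRGPSTEGVPE
"""

records = parse_fasta(io.StringIO(SAMPLE))
names = [name for name, _ in records]
```

For the committed file, the corresponding call would be `parse_fasta(open("data/ProteinNames.txt"))`, ideally inside a `with` block so the file is closed afterwards.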
diff --git a/doc/Data Deployments.dia b/doc/Data Deployments.dia
new file mode 100644
index 0000000..b8ad4af
--- /dev/null
+++ b/doc/Data Deployments.dia
Binary files differ

Copyright © 2009 Don Pellegrino All Rights Reserved.