<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>ggd consists of: — GGD documentation</title>
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="_static/alabaster.css" type="text/css" />
<link rel="stylesheet" type="text/css" href="_static/style.css" />
<link rel="stylesheet" type="text/css" href="_static/font-awesome-4.7.0/css/font-awesome.min.css" />
<script id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
<script src="_static/jquery.js"></script>
<script src="_static/underscore.js"></script>
<script src="_static/doctools.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="GGD Quick Start" href="quick-start.html" />
<link href="https://fonts.googleapis.com/css?family=Lato|Raleway" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Inconsolata" rel="stylesheet">
<meta name="msapplication-TileColor" content="#ffffff">
<meta name="msapplication-TileImage" content="_static/ms-icon-144x144.png">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/selectize.js/0.12.6/css/selectize.bootstrap3.min.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.3.1/css/bootstrap.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/datatables/1.10.21/js/jquery.dataTables.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/selectize.js/0.12.6/js/standalone/selectize.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.3.1/js/bootstrap.bundle.min.js"></script>
</head><body>
<div class="document">
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
<div class="sphinxsidebarwrapper">
<p class="logo">
<a href="#">
<img class="logo" src="_static/logo/GoGetData_name_logo.png" alt="Logo"/>
</a>
</p>
<h3>Navigation</h3>
<ul>
<li class="toctree-l1"><a class="reference internal" href="quick-start.html">GGD Quick Start</a></li>
<li class="toctree-l1"><a class="reference internal" href="using-ggd.html">Using GGD</a></li>
<li class="toctree-l1"><a class="reference internal" href="GGD-CLI.html">GGD Commands</a></li>
<li class="toctree-l1"><a class="reference internal" href="meta-recipes.html">GGD meta-recipes</a></li>
<li class="toctree-l1"><a class="reference internal" href="contribute.html">Contribute</a></li>
<li class="toctree-l1"><a class="reference internal" href="private_recipes.html">Private Recipes</a></li>
<li class="toctree-l1"><a class="reference internal" href="workflows.html">Using GGD in Workflows</a></li>
<li class="toctree-l1"><a class="reference internal" href="cite.html">Citing GGD</a></li>
<li class="toctree-l1"><a class="reference internal" href="recipes.html">Available Data Packages</a></li>
</ul>
<ul>
<li class="toctree-l1"><a href="https://github.com/gogetdata/ggd-recipes">ggd-recipes @ Github</a></li>
<li class="toctree-l1"><a href="https://github.com/gogetdata/ggd-cli">ggd-cli @ Github</a></li>
</ul>
<div id="searchbox" style="display: none" role="search">
<h3 id="searchlabel">Quick search</h3>
<div class="searchformwrapper">
<form class="search" action="search.html" method="get">
<input type="text" name="q" aria-labelledby="searchlabel" />
<input type="submit" value="Go" />
</form>
</div>
</div>
<script>$('#searchbox').show(0);</script>
</div>
</div>
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<img alt="_images/GoGetData_name_logo.png" id="home-page" src="_images/GoGetData_name_logo.png" />
<p><strong>To see and/or search for data packages available through GGD, see:</strong> <a class="reference internal" href="recipes.html#recipes"><span class="std std-ref">Available data packages</span></a></p>
<p><strong>For a quick-start guide to using ggd see:</strong> <a class="reference internal" href="quick-start.html#quick-start"><span class="std std-ref">GGD Quick Start</span></a></p>
<p><strong>To request a new data recipe please fill out the</strong> <a class="reference external" href="https://forms.gle/3WEWgGGeh7ohAjcJA">GGD Recipe Request</a> <strong>Form.</strong></p>
<p><strong>For details on creating and adding recipes to ggd see the</strong> <a class="reference internal" href="contribute.html#make-data-packages"><span class="std std-ref">Contribute</span></a> <strong>page.</strong></p>
<div class="admonition important">
<p class="admonition-title">Important</p>
<p>If you use GGD, please cite the <a class="reference external" href="https://www.nature.com/articles/s41467-021-22381-z">Nature Communications GGD paper</a></p>
</div>
<p>Go Get Data (ggd) is a data management system that provides access to data packages containing automatically curated genomic data.
ggd data packages contain all the information needed for data extraction, handling, and processing. As the number
of scientific datasets grows, ggd provides access to these datasets without the hassle of finding, downloading, and processing them
yourself. ggd leverages the <a class="reference external" href="http://conda.pydata.org/docs/intro.html">conda</a> package management system
and the infrastructure of <a class="reference external" href="https://bioconda.github.io/index.html">Bioconda</a> to provide a fast and easy way to
retrieve processed annotations and datasets while supporting data provenance and reproducibility.
With ggd, any user can quickly access the datasets they need, manage that data within an environment,
and cite data access and use by way of the ggd data package name and version.</p>
<img alt="_images/ggd-Figure1.png" src="_images/ggd-Figure1.png" />
<p>Go Get Data acts as a multi-use tool for genomic data access and management. It provides everything from simple data access to
complete data management, including version tracking, dependency handling, data format standards, and more. Whether you use ggd for data
access in genomic workflows, for an analysis, or for any other reason, it has been developed to work in many situations.</p>
<p>We want to briefly highlight ggd’s ability to work with multiple conda environments at the same time. Multiple tools in ggd use a
<code class="code docutils literal notranslate"><span class="pre">--prefix</span></code> flag. This flag allows a user to install, manage, and access data in a specific conda environment. Therefore,
a scientist could install ggd data packages into a conda environment set apart for data storage, and then access and use that data, all
without being in that environment. This helps avoid the common problem of the same dataset or annotation being installed multiple times
on your system. It also makes it easier to use the data on different platforms, in different environments, and with
different software tools.</p>
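<p>As a rough sketch of this pattern, the commands below install a package into a dedicated environment and then look up its files from outside that environment. The environment name <code class="code docutils literal notranslate"><span class="pre">data-store</span></code> is hypothetical and is assumed to already exist, the package name is borrowed from the example further down this page, and the <code class="code docutils literal notranslate"><span class="pre">run_or_show</span></code> helper is only an illustration that prints the commands when <code class="code docutils literal notranslate"><span class="pre">ggd</span></code> is not on the <code class="code docutils literal notranslate"><span class="pre">PATH</span></code>.</p>

```shell
#!/bin/sh
# Sketch: keep ggd data in one conda environment and reach it from another.
# Assumptions: "data-store" is a hypothetical, already-created conda environment.
set -eu

DATA_ENV="data-store"
PKG="grch37-reference-genome-ensembl-v1"

run_or_show() {
    # Run the given command when ggd is available, otherwise just print it.
    if command -v ggd >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Install into the data environment without activating it.
run_or_show ggd install "$PKG" --prefix "$DATA_ENV"

# Later, from any other environment, list the installed files.
run_or_show ggd get-files "$PKG" --prefix "$DATA_ENV"
```

<p>Keeping the environment name in one variable makes it easy to point several projects at the same data store.</p>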
<div class="section" id="ggd-consists-of">
<h1>ggd consists of:<a class="headerlink" href="#ggd-consists-of" title="Permalink to this headline">¶</a></h1>
<ul class="simple">
<li><p>a <a class="reference external" href="https://github.com/gogetdata/ggd-recipes">repository of data recipes</a> hosted on Github</p></li>
<li><p>a <a class="reference external" href="https://github.com/gogetdata/ggd-cli">command line interface (cli)</a> to communicate with the ggd ecosystem</p></li>
<li><p>a continually growing list of genomic recipes to provide quick and easy access to processed genomic data
using the ggd cli tool</p></li>
</ul>
</div>
<div class="section" id="capabilities">
<h1>Capabilities<a class="headerlink" href="#capabilities" title="Permalink to this headline">¶</a></h1>
<p>See <a class="reference internal" href="quick-start.html#quick-start"><span class="std std-ref">GGD Quick Start</span></a> to start using ggd with minimal information.</p>
<p>Use <code class="code docutils literal notranslate"><span class="pre">ggd</span></code> to search for, find, and install a data package hosted by ggd. The data package will be installed and processed
on your system, giving you ready-to-use data files. For additional information see <a class="reference internal" href="using-ggd.html#using-ggd"><span class="std std-ref">Using GGD</span></a>.</p>
</div>
<div class="section" id="how-go-get-data-works">
<h1>How ‘Go Get Data’ works:<a class="headerlink" href="#how-go-get-data-works" title="Permalink to this headline">¶</a></h1>
<p>ggd is built upon principles for data access, usability, provenance, and management in a local environment. The basic principles
are:</p>
<ol class="arabic simple">
<li><p>Data recipe creation</p></li>
<li><p>Data package infrastructure</p></li>
<li><p>Data access and management</p></li>
</ol>
<div class="section" id="data-recipe-creation">
<h2>Data recipe creation<a class="headerlink" href="#data-recipe-creation" title="Permalink to this headline">¶</a></h2>
<p>Data recipes are the core of ggd. They provide standardized, reproducible access to genomic datasets and
annotations. Whether simple or complex, recipes make up the infrastructure of ggd and are what make simple
data access possible. Therefore, recipe creation is a vital part of ggd.</p>
<p>Although the ggd team is working to add more data recipes to the ggd ecosystem, we intend for ggd to be a community
driven project. As such, ggd relies on community contributions, and the process of creating data
recipes and contributing them to ggd has been simplified. You simply combine the steps required to access and process a
desired dataset into a bash script. Once the bash script has been created, you can run <code class="code docutils literal notranslate"><span class="pre">ggd</span> <span class="pre">make-recipe</span></code> to create
a ggd data recipe.</p>
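<p>To give a feel for what such a script might contain, here is a minimal sketch. It is not taken from an actual ggd recipe: the toy BED records stand in for a real download from a data provider, the file names are made up, and plain <code class="code docutils literal notranslate"><span class="pre">sort</span></code> and <code class="code docutils literal notranslate"><span class="pre">gzip</span></code> stand in for whatever genomics tools a real recipe would use.</p>

```shell
#!/bin/sh
# Sketch of the kind of steps a recipe script bundles: fetch, sort, compress.
# Assumption: toy data is generated locally so the sketch runs offline.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# 1. Stand-in for the download step (a real script would fetch from the provider).
printf 'chr2\t500\t900\tgapB\nchr1\t300\t400\tgapC\nchr1\t100\t200\tgapA\n' > raw.bed

# 2. Sort by chromosome, then numerically by start position.
LC_ALL=C sort -k1,1 -k2,2n raw.bed > toy-gaps.bed

# 3. Compress the processed file and drop the intermediate download.
gzip -c toy-gaps.bed > toy-gaps.bed.gz
rm raw.bed

echo "processed file: $workdir/toy-gaps.bed.gz"
```

<p>Once a script like this runs cleanly on its own, it can be handed to <code class="code docutils literal notranslate"><span class="pre">ggd</span> <span class="pre">make-recipe</span></code>; the arguments that command requires are described on the Contribute page.</p>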
<p>A data recipe consists of a metadata file, a data curation script, a local file and environment handling script, and
a checksum file. Together, these files represent the instructions ggd will use to install and manage a data
recipe on your local system.</p>
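<p>The role of the checksum file can be illustrated with a small, generic sketch. This is not ggd's own checksum format or code: it uses the POSIX <code class="code docutils literal notranslate"><span class="pre">cksum</span></code> utility on a made-up file, and the <code class="code docutils literal notranslate"><span class="pre">verify</span></code> helper and file names are hypothetical.</p>

```shell
#!/bin/sh
# Sketch: record a checksum when data is prepared, then re-check it later.
# Assumption: cksum and the file names here are illustrative, not ggd's format.
set -eu

workdir=$(mktemp -d)
printf 'chr1\t100\t200\n' > "$workdir/toy.bed"

# Record the checksum at "recipe creation" time.
( cd "$workdir" && cksum toy.bed > checksums.txt )

verify() {
    # Recompute the checksum and compare it with the recorded value.
    ( cd "$workdir" && [ "$(cksum toy.bed)" = "$(cat checksums.txt)" ] )
}

verify && echo "checksum OK"

# Simulate a corrupted or altered data file.
printf 'chr1\t100\t999\n' > "$workdir/toy.bed"
verify || echo "checksum mismatch"
```

<p>The same idea is what lets an install step confirm that the files on disk match what the recipe author produced.</p>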
<p>Validation of the new recipe is a required part of the contribution process. After creating a recipe, simply run
<code class="code docutils literal notranslate"><span class="pre">ggd</span> <span class="pre">check-recipe</span></code> to validate that the recipe is working correctly.</p>
<p>For further details about contributing data recipes to ggd see <a class="reference internal" href="contribute.html#make-data-packages"><span class="std std-ref">Contribute</span></a>.</p>
<img alt="_images/Make_GGD_Recipe.png" src="_images/Make_GGD_Recipe.png" />
</div>
<div class="section" id="data-package-infrastructure">
<h2>Data package infrastructure<a class="headerlink" href="#data-package-infrastructure" title="Permalink to this headline">¶</a></h2>
<p>Another vital part of ggd is the infrastructure used to provide access to available data recipes. ggd has integrated multiple
ecosystems together in order to control such access.</p>
<p>First, recipes are compiled together as “cookbooks” on github. This github repo acts as the first stage of recipe access.</p>
<p>Second, a continuous integration (CI) system provides automatic data recipe validation. This CI system acts as the second stage
to ensure that each data recipe works correctly.</p>
<p>Third, the CI system packages the recipe into a data package. These data packages contain the same information as the recipe, but
are easier to manage and transfer.</p>
<p>Fourth, we utilize the ecosystem of the Anaconda cloud to store data package(s). Once a recipe has been validated and packaged, the package
is pushed to the Anaconda cloud.</p>
<p>Fifth, we also utilize the ecosystem of Amazon cloud storage. Processed package contents are stored on Amazon cloud storage to improve speed
and accuracy of data curation.</p>
<p>Sixth, the CI system automatically maintains local and global levels of ggd metadata. This metadata is a crucial part of the ggd ecosystem. Automatic
maintenance provides a stable and structured way to ensure metadata is maintained and properly accounted for.</p>
<img alt="_images/GGD_Cloud.png" src="_images/GGD_Cloud.png" />
</div>
<div class="section" id="data-access-and-management">
<h2>Data access and management<a class="headerlink" href="#data-access-and-management" title="Permalink to this headline">¶</a></h2>
<p>Finally, and most importantly, ggd lets you access and manage the data on your local system. This is the main purpose of the ggd command line interface (CLI).
The CLI provides tools for installing data packages on your local system and, more importantly, tools for accessing and using that data once it is installed. For
more information about the CLI tools see <a class="reference internal" href="GGD-CLI.html#ggd-cli-page"><span class="std std-ref">GGD CLI</span></a>.</p>
<img alt="_images/Data_managment.png" src="_images/Data_managment.png" />
</div>
<div class="section" id="ggd-meta-recipes">
<h2>GGD meta-recipes<a class="headerlink" href="#ggd-meta-recipes" title="Permalink to this headline">¶</a></h2>
<p>Meta-recipes are recipes that provide access to a database of data using an identifier specific to that database. This is different from a normal ggd recipe,
which consists of a set of instructions for installing and processing one or more specific data files. Meta-recipes are powerful tools for accessing databases
that hold so much data that creating a separate ggd recipe for each data type or identifier would be unreasonable.</p>
<p>For more information about meta-recipes see: <a class="reference internal" href="meta-recipes.html#meta-recipes"><span class="std std-ref">Meta-Recipes</span></a></p>
</div>
</div>
<hr class="docutils" />
<div class="section" id="example">
<h1>Example:<a class="headerlink" href="#example" title="Permalink to this headline">¶</a></h1>
<ol class="arabic simple">
<li><p>Let’s say you need to align some sequence(s) to the human reference genome for an analysis you are doing.
You will need to download the reference genome from one of the sites that hosts it, making sure it is
the correct genome build and the right reference genome. You will then need
to sort and index the reference genome before you can use it. GGD simplifies this process by allowing you to search for
and install available processed genomic data packages using the ggd tool.</p></li>
</ol>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>The example below is not necessarily the
correct data package for your needs; it is simply an example of using ggd.</p>
</div>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="c1">#******************************</span>
<span class="c1">#1. Search for a reference genome</span>
<span class="c1"># (See ggd search)</span>
<span class="c1">#******************************</span>
$ ggd search grch37 reference-genome
----------------------------------------------------------------------------------------------------
grch37-reference-genome-ensembl-v1
<span class="o">==================================</span>
Summary: The GRCh37 unmasked genomic DNA sequence reference genome from Ensembl-Release <span class="m">75</span>. Includes all sequence regions EXCLUDING haplotypes and patches. <span class="s1">'Primary Assembly file'</span>
Species: Homo_sapiens
Genome Build: GRCh37
Keywords: Primary-Assembly, Release-75, ref, reference, Ensembl-ref, DNA-Sequence, Fasta-Sequence, fasta-file
Data Provider: Ensembl
Data Version: release-75_2-3-14
File type<span class="o">(</span>s<span class="o">)</span>: fa
Data file coordinate base: NA
Included Data Files:
grch37-reference-genome-ensembl-v1.fa
grch37-reference-genome-ensembl-v1.fa.fai
Approximate Data File Sizes:
grch37-reference-genome-ensembl-v1.fa: <span class="m">3</span>.15G
grch37-reference-genome-ensembl-v1.fa.fai: <span class="m">2</span>.74K
To install run:
ggd install grch37-reference-genome-ensembl-v1
----------------------------------------------------------------------------------------------------
grch37-reference-genome-gencode-v1
<span class="o">==================================</span>
Summary: The GRCh37 DNA nucleotide sequence primary assembly. Sequence regions include reference chromosomes and scaffoldings.
Species: Homo_sapiens
Genome Build: GRCh37
Keywords: Reference-Genome, Fasta, DNA-Sequence, GENCODE-34, Fasta-sequence, primary-assembly
Data Provider: GENCODE
Data Version: release-34
File type<span class="o">(</span>s<span class="o">)</span>: fa
Data file coordinate base: NA
Included Data Files:
grch37-reference-genome-gencode-v1.fa.gz
grch37-reference-genome-gencode-v1.fa.gz.fai
grch37-reference-genome-gencode-v1.fa.gz.gzi
Approximate Data File Sizes:
grch37-reference-genome-gencode-v1.fa.gz: <span class="m">881</span>.99M
grch37-reference-genome-gencode-v1.fa.gz.fai: <span class="m">2</span>.74K
grch37-reference-genome-gencode-v1.fa.gz.gzi: <span class="m">772</span>.92K
To install run:
ggd install grch37-reference-genome-gencode-v1
----------------------------------------------------------------------------------------------------
grch37-toplevel-reference-genome-ensembl-v1
<span class="o">===========================================</span>
Summary: The GRCh37 unmasked genomic DNA sequence reference genome from Ensembl-Release <span class="m">75</span>. Includes all sequence regions flagged as toplevel by Ensembl including chromosomes, regions not assembled into chromosomes, and N padded haplotype/patch regions. <span class="s1">'Top Level file'</span>
Species: Homo_sapiens
Genome Build: GRCh37
Keywords: Top-Level, Release-75, ref, reference, Ensembl-ref, DNA-Sequence, Fasta-Sequence, fasta-file
Data Provider: Ensembl
Data Version: release-75_2-3-14
File type<span class="o">(</span>s<span class="o">)</span>: fa
Data file coordinate base: NA
Included Data Files:
grch37-toplevel-reference-genome-ensembl-v1.fa.gz
grch37-toplevel-reference-genome-ensembl-v1.fa.gz.fai
grch37-toplevel-reference-genome-ensembl-v1.fa.gz.gzi
Approximate Data File Sizes:
grch37-toplevel-reference-genome-ensembl-v1.fa.gz: <span class="m">1</span>.09G
grch37-toplevel-reference-genome-ensembl-v1.fa.gz.fai: <span class="m">11</span>.65K
grch37-toplevel-reference-genome-ensembl-v1.fa.gz.gzi: <span class="m">7</span>.98M
To install run:
ggd install grch37-toplevel-reference-genome-ensembl-v1
----------------------------------------------------------------------------------------------------
.
.
.
<span class="c1">#******************************</span>
<span class="c1">#2. Install the grch37 reference genome from Ensembl</span>
<span class="c1"># (See ggd install)</span>
<span class="c1">#******************************</span>
$ ggd install grch37-reference-genome-ensembl-v1
:ggd:install: Looking <span class="k">for</span> grch37-reference-genome-ensembl-v1 in the <span class="s1">'ggd-genomics'</span> channel
:ggd:install: grch37-reference-genome-ensembl-v1 exists in the ggd-genomics channel
:ggd:install: grch37-reference-genome-ensembl-v1 version <span class="m">1</span> is not installed on your system
:ggd:install: grch37-reference-genome-ensembl-v1 has not been installed by conda
:ggd:install: The grch37-reference-genome-ensembl-v1 package is uploaded to an aws S3 bucket. To reduce processing <span class="nb">time</span> the package will be downloaded from an aws S3 bucket
:ggd:install: Attempting to install the following cached package<span class="o">(</span>s<span class="o">)</span>:
grch37-reference-genome-ensembl-v1
:ggd:utils:bypass: Installing grch37-reference-genome-ensembl-v1 from the ggd-genomics conda channel
Collecting package metadata: <span class="k">done</span>
Processing data: <span class="k">done</span>
<span class="c1">## Package Plan ##</span>
environment location: <env>
added / updated specs:
- grch37-reference-genome-ensembl-v1
The following packages will be downloaded:
package <span class="p">|</span> build
---------------------------<span class="p">|</span>--------------------------------
grch37-reference-genome-ensembl-v1-1<span class="p">|</span> <span class="m">1</span> <span class="m">6</span> KB ggd-genomics
------------------------------------------------------------
Total: <span class="m">6</span> KB
The following NEW packages will be INSTALLED:
grch37-reference-~ ggd-genomics/noarch::grch37-reference-genome-ensembl-v1-1-1
Downloading and Extracting Packages
grch37-reference-gen <span class="p">|</span> <span class="m">6</span> KB <span class="p">|</span> <span class="c1">############################################################################ | 100%</span>
Preparing transaction: <span class="k">done</span>
Verifying transaction: <span class="k">done</span>
Executing transaction: <span class="k">done</span>
:ggd:install: Updating installed package list
:ggd:install: Initiating data file content validation using checksum
:ggd:install: Checksum <span class="k">for</span> grch37-reference-genome-ensembl-v1
:ggd:install: ** Successful Checksum **
:ggd:install: Install Complete
:ggd:install: Installed file <span class="nv">locations</span>
<span class="o">======================================================================================================================</span>
GGD Package Environment Variable<span class="o">(</span>s<span class="o">)</span>
----------------------------------------------------------------------------------------------------
-> grch37-reference-genome-ensembl-v1 <span class="nv">$ggd_grch37_reference_genome_ensembl_v1_dir</span>
<span class="nv">$ggd_grch37_reference_genome_ensembl_v1_file</span>
Install Path: <conda root>/share/ggd/Homo_sapiens/GRCh37/grch37-reference-genome-ensembl-v1/1
:ggd:install: To activate environment variables run <span class="sb">`</span><span class="nb">source</span> activate base<span class="sb">`</span> in the environment the packages were installed in
:ggd:install: NOTE: These environment variables are specific to the <env> conda environment and can only be accessed from within that <span class="nv">environment</span>
<span class="o">======================================================================================================================</span>
:ggd:install: Environment Variables
*****************************
Inactive or out-of-date environment variables:
> <span class="nv">$ggd_grch37_reference_genome_ensembl_v1_dir</span>
> <span class="nv">$ggd_grch37_reference_genome_ensembl_v1_file</span>
To activate inactive or out-of-date vars, run:
<span class="nb">source</span> activate base
*****************************
:ggd:install: DONE
<span class="c1">#******************************</span>
<span class="c1">#3. Identify the data environment variable or the file location</span>
<span class="c1"># (See ggd list, ggd show-env, or ggd get-files)</span>
<span class="c1">#******************************</span>
$ ggd list
<span class="c1"># Packages in environment: <env></span>
<span class="c1">#</span>
------------------------------------------------------------------------------------------------------------------------
Name Pkg-Version Pkg-Build Channel Environment-Variables
------------------------------------------------------------------------------------------------------------------------
-> grch37-reference-genome-ensembl-v1 <span class="m">1</span> <span class="m">1</span> ggd-genomics <span class="nv">$ggd_grch37_reference_genome_ensembl_v1_dir</span>, <span class="nv">$ggd_grch37_reference_genome_ensembl_v1_file</span>
<span class="c1"># To use the environment variables run `source activate base`</span>
<span class="c1"># You can see the available ggd data package environment variables by running `ggd show-env`</span>
$ ggd show-env
***************************
Active environment variables:
> <span class="nv">$ggd_grch37_reference_genome_ensembl_v1_dir</span>
> <span class="nv">$ggd_grch37_reference_genome_ensembl_v1_file</span>
***************************
$ ggd get-files grch37-reference-genome-ensembl-v1
<conda root>/share/ggd/Homo_sapiens/GRCh37/grch37-reference-genome-ensembl-v1/1/grch37-reference-genome-ensembl-v1.fa
<conda root>/share/ggd/Homo_sapiens/GRCh37/grch37-reference-genome-ensembl-v1/1/grch37-reference-genome-ensembl-v1.fa.fai
<span class="c1">#******************************</span>
<span class="c1">#4. Use files</span>
<span class="c1"># To use the downloaded data packages you can use the full file path from running `ggd get-files`</span>
<span class="c1"># or the environment variables created during installation</span>
<span class="c1"># For more info see the `Using installed data` tab.</span>
<span class="c1">#******************************</span>
</pre></div>
</div>
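<p>Step 4 can be sketched as a small helper that locates the reference FASTA before handing it to another tool. This is only one possible approach: the <code class="code docutils literal notranslate"><span class="pre">resolve_reference</span></code> function is hypothetical, and it assumes the environment variable name and file layout shown in the output above.</p>

```shell
#!/bin/sh
# Sketch: find the installed reference FASTA via the ggd environment variable,
# falling back to `ggd get-files` when the variable is not active.
# Assumption: resolve_reference is an illustrative helper, not part of ggd.
set -eu

PKG="grch37-reference-genome-ensembl-v1"

resolve_reference() {
    if [ -n "${ggd_grch37_reference_genome_ensembl_v1_file:-}" ]; then
        # Environment variable created during installation is active.
        printf '%s\n' "$ggd_grch37_reference_genome_ensembl_v1_file"
    elif command -v ggd >/dev/null 2>&1; then
        # Keep only the FASTA path, not the .fai index.
        ggd get-files "$PKG" | grep '\.fa$'
    else
        echo "reference not found: install $PKG with ggd first" >&2
        return 1
    fi
}

if ref=$(resolve_reference); then
    echo "reference FASTA: $ref"
fi
```

<p>The resulting path can then be passed to whichever aligner or indexing tool the analysis uses.</p>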
<div class="section" id="available-data-packages">
<h2>Available Data Packages<a class="headerlink" href="#available-data-packages" title="Permalink to this headline">¶</a></h2>
<p>You can see and search for available packages using the <a class="reference internal" href="recipes.html#recipes"><span class="std std-ref">Available packages</span></a> page of the
ggd documentation.</p>
<p>If you have the ggd cli tool installed, you can use <code class="code docutils literal notranslate"><span class="pre">ggd</span> <span class="pre">search</span></code> to search for available packages.</p>
<p>Contents:</p>
<div class="toctree-wrapper compound">
<ul>
<li class="toctree-l1"><a class="reference internal" href="quick-start.html">GGD Quick Start</a><ul>
<li class="toctree-l2"><a class="reference internal" href="quick-start.html#installing-ggd">1) Installing GGD</a></li>
<li class="toctree-l2"><a class="reference internal" href="quick-start.html#searching-for-data-packages">2) Searching for data packages</a></li>
<li class="toctree-l2"><a class="reference internal" href="quick-start.html#installing-a-data-package">3) Installing a data package</a></li>
<li class="toctree-l2"><a class="reference internal" href="quick-start.html#listing-installed-packages">4) Listing installed packages</a></li>
<li class="toctree-l2"><a class="reference internal" href="quick-start.html#using-the-environment-variables">5) Using the environment variables</a></li>
<li class="toctree-l2"><a class="reference internal" href="quick-start.html#fetching-the-data-files-with-get-files">6) Fetching the data files with “get-files”</a></li>
<li class="toctree-l2"><a class="reference internal" href="quick-start.html#using-the-data-packages">7) Using the data packages</a></li>
<li class="toctree-l2"><a class="reference internal" href="quick-start.html#additional-info">8) Additional Info</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="using-ggd.html">Using GGD</a><ul>
<li class="toctree-l2"><a class="reference internal" href="using-ggd.html#install-conda">1. Install conda</a></li>
<li class="toctree-l2"><a class="reference internal" href="using-ggd.html#configure-the-conda-channels">2. Configure the conda channels</a></li>
<li class="toctree-l2"><a class="reference internal" href="using-ggd.html#install-ggd">3. Install ggd</a></li>
<li class="toctree-l2"><a class="reference internal" href="using-ggd.html#ggd-tools">4. ggd tools</a></li>
<li class="toctree-l2"><a class="reference internal" href="using-ggd.html#contributing-to-ggd">5. Contributing to ggd</a></li>
<li class="toctree-l2"><a class="reference internal" href="using-ggd.html#ggd-use-case">ggd Use Case</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="GGD-CLI.html">GGD Commands</a><ul>
<li class="toctree-l2"><a class="reference internal" href="ggd-search.html">ggd search</a></li>
<li class="toctree-l2"><a class="reference internal" href="install.html">ggd install</a></li>
<li class="toctree-l2"><a class="reference internal" href="predict-path.html">ggd predict-path</a></li>
<li class="toctree-l2"><a class="reference internal" href="uninstall.html">ggd uninstall</a></li>
<li class="toctree-l2"><a class="reference internal" href="list.html">ggd list</a></li>
<li class="toctree-l2"><a class="reference internal" href="list-file.html">ggd get-files</a></li>
<li class="toctree-l2"><a class="reference internal" href="pkg-info.html">ggd pkg-info</a></li>
<li class="toctree-l2"><a class="reference internal" href="show-env.html">ggd show-env</a></li>
<li class="toctree-l2"><a class="reference internal" href="make-recipe.html">ggd make-recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="make-metarecipe.html">ggd make-meta-recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="check-recipe.html">ggd check-recipe</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="meta-recipes.html">GGD meta-recipes</a><ul>
<li class="toctree-l2"><a class="reference internal" href="meta-recipes.html#ggd-commands-with-meta-recipes">GGD commands with meta-recipes</a></li>
<li class="toctree-l2"><a class="reference internal" href="meta-recipes.html#conda-environments-and-prefix">Conda Environments and Prefix</a></li>
<li class="toctree-l2"><a class="reference internal" href="meta-recipes.html#installation-time">Installation Time</a></li>
<li class="toctree-l2"><a class="reference internal" href="meta-recipes.html#installing-a-meta-recipe">Installing a meta-recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="meta-recipes.html#accessing-installed-id-specific-meta-recipe">Accessing installed ID specific meta-recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="meta-recipes.html#creating-and-testing-meta-recipes">Creating and Testing meta-recipes</a></li>
<li class="toctree-l2"><a class="reference internal" href="meta-recipes.html#meta-recipe-caveats">Meta-Recipe Caveats</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="contribute.html">Contribute</a><ul>
<li class="toctree-l2"><a class="reference internal" href="github-setup.html">Setting up with Github</a></li>
<li class="toctree-l2"><a class="reference internal" href="contribute-recipe.html">Contributing a ggd recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="making-meta-recipes.html">Creating a ggd meta-recipe</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="private_recipes.html">Private Recipes</a><ul>
<li class="toctree-l2"><a class="reference internal" href="private_recipes.html#create-a-private-github-repository-to-store-private-data-recipes">1. Create a private github repository to store private data recipes</a></li>
<li class="toctree-l2"><a class="reference internal" href="private_recipes.html#create-a-ggd-recipe">2. Create a ggd recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="private_recipes.html#check-and-install-the-data-recipe">3. Check and install the data recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="private_recipes.html#add-the-data-recipe-to-github">4. Add the data recipe to github</a></li>
<li class="toctree-l2"><a class="reference internal" href="private_recipes.html#how-to-access-installed-data-from-private-recipe">5. How to access installed data from private recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="private_recipes.html#ggd-commands-the-won-t-work-with-private-recipes">6. GGD commands that won’t work with private recipes</a></li>
<li class="toctree-l2"><a class="reference internal" href="private_recipes.html#uninstalling-a-previously-installed-private-data-recipe">7. Uninstalling a previously installed private data recipe</a></li>
<li class="toctree-l2"><a class="reference internal" href="private_recipes.html#finally">Finally</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="workflows.html">Using GGD in Workflows</a><ul>
<li class="toctree-l2"><a class="reference internal" href="workflows.html#ggd-and-workflows">GGD and Workflows</a></li>
<li class="toctree-l2"><a class="reference internal" href="workflows.html#snakemake-workflow-example"><span class="xref std std-ref">Snakemake Workflow Example</span></a></li>
<li class="toctree-l2"><a class="reference internal" href="workflows.html#nextflow-workflow-example"><span class="xref std std-ref">Nextflow Workflow Example</span></a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="cite.html">Citing GGD</a></li>
<li class="toctree-l1"><a class="reference internal" href="recipes.html">Available Data Packages</a></li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer">
©2016-2021, The GoGetData team.
|
<a href="_sources/index.rst.txt"
rel="nofollow">Page source</a>
</div>
</body>
</html>