This repository has been archived by the owner on Nov 14, 2023. It is now read-only.

Commit

Merge pull request #288 from NFDI4Chem/development
fix: add horizon command to initial setup steps
NishaSharma14 authored Mar 3, 2023
2 parents f3e45c7 + 3ed88a6 commit 837b4ba
Showing 17 changed files with 299 additions and 19 deletions.
4 changes: 2 additions & 2 deletions docs/advanced-guides/nmr-repositories/organism.md
@@ -31,7 +31,7 @@ Organisms are found only in metabolomics-related repositories, i.e., MTBLS and M
<td>Organism</td>
<td>ontology-driven</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or the study "assays" value is "null".</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or the study "assays" value is "null"; or the organism is not found in NCBI taxonomy.</td>
</tr>
<tr>
<td><b>MW</b></td>
@@ -69,7 +69,7 @@ Some values were ambiguous such as "Various", "Extract", "Multi-species non-defi
However, even after taking into account all that was mentioned above, it is still clear that the most studied species are humans (Homo sapiens) and mice (Mus musculus).
<div style={{textAlign: 'center'}}>
<img src="/img/analysis/org/all.png" width="700"/>
<figcaption>A rough estimate of the percentages of all studies in MTBLS and MW repositories based on the sample pH</figcaption>
<figcaption>A rough estimate of the percentages of all studies in MTBLS and MW repositories based on the organism</figcaption>
</div>
<br></br>

80 changes: 80 additions & 0 deletions docs/advanced-guides/nmr-repositories/part.md
@@ -0,0 +1,80 @@
---
sidebar_position: 11
title: Organism Part
---

# Organism Part
All the graphs can be found in the [linked notebook](https://github.com/NFDI4Chem/repo-scripts/blob/main/notebooks/organism-part.ipynb).

Data created on 17.10.2022 at 19:32:45

Data updated on 17.10.2022 at 19:32:45

## Support by Ontologies
[The BRENDA Tissue Ontology - BTO](https://www.ebi.ac.uk/ols/ontologies/bto) and [Experimental Factor Ontology - EFO](https://www.ebi.ac.uk/ols/ontologies/efo) are excellent sources for organism parts.

## Data Sanitisation and Missing Values
Organism parts are found only in metabolomics-related repositories, i.e., MTBLS and MW.


<table>
<tr>
<th></th>
<th>Field Type</th>
<th>Field Name</th>
<th>Values Readability</th>
<th>Unit</th>
<th>Missing</th>
</tr>
<tr>
<td><b>MTBLS</b></td>
<td>dedicated</td>
<td>Organism part</td>
<td>ontology-driven</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or the study "assays" value is "null"; or the organism is not found in NCBI taxonomy.</td>
</tr>
<tr>
<td><b>MW</b></td>
<td>dedicated</td>
<td>SAMPLE_TYPE</td>
<td>free text</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or decoding the JSON file containing the study details failed due to a syntax error; or the organism was not found in NCBI taxonomy.</td>
</tr>
</table>
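The missing-value rules in the table above can be sketched in Python (the linked notebooks are Python-based, but the helper below is a hypothetical illustration, not the actual notebook code):

```python
def is_missing(value, assays="present"):
    """Rough sketch of the missing-value rules in the table above:
    the field is not provided, the value is an N/A-like placeholder,
    or the study "assays" value is null."""
    na_like = {"n/a", "na", "none", "not applicable", "-", ""}
    if assays is None:  # study "assays" value is "null"
        return True
    if value is None:   # field not provided at all
        return True
    return value.strip().lower() in na_like

print(is_missing("N/A"))                 # True
print(is_missing("liver"))               # False
print(is_missing("liver", assays=None))  # True
```

The NCBI-taxonomy check is omitted here, since it requires an external lookup.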

<table>
<tr>
<th></th>
<th>Input Examples</th>
<th>Output</th>
</tr>
<tr>
<td><b>MTBLS</b></td>
<td>["blood serum", "serum", "A2780cisR cell", "muscle", "feces", "Acetonitrile:H2O (1:3)"]</td>
<td>["blood serum", "urine", etc.]</td>
</tr>
<tr>
<td><b>MW</b></td>
<td>["Urine", "urine", "BLOOD", "Serum", "Plasma, Liver"]</td>
<td>["blood serum", "urine", etc.]</td>
</tr>
</table>
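The mapping from free-text inputs to the harmonised outputs shown above can be sketched roughly as follows (the synonym map here is hypothetical and far smaller than whatever the notebook actually uses):

```python
# Hypothetical synonym map: keys are cleaned-up raw values,
# values are the harmonised organism-part terms.
SYNONYMS = {
    "serum": "blood serum",
    "blood serum": "blood serum",
    "urine": "urine",
    "plasma": "blood plasma",
    "liver": "liver",
    "feces": "feces",
}

def normalise_part(raw):
    """Trim and lower-case a free-text value ("Urine", "BLOOD",
    "Serum") and map it onto a harmonised term; values that are
    not organism parts (e.g. solvent mixtures) yield None."""
    return SYNONYMS.get(raw.strip().lower())

print(normalise_part("Serum"))                   # blood serum
print(normalise_part("urine"))                   # urine
print(normalise_part("Acetonitrile:H2O (1:3)"))  # None
```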

## Results
Organism part details are available in metabolomics repositories. The use of different ontology sources was easy to see in inconsistent terms such as "Blood" vs. "blood". Additionally, values other than organism parts were sometimes provided, such as "Acetonitrile:H2O (1:3)".

The most used part was blood serum, followed by urine, blood plasma, liver, and others.

<div style={{textAlign: 'center'}}>
<img src="/img/analysis/part/all.png" width="700"/>
<figcaption>A rough estimate of the percentages of all studies in MTBLS and MW repositories based on the organism part</figcaption>
</div>
<br></br>

Here one can see the number of studies providing the organism part and its value.
<div style={{textAlign: 'center'}}>
<img src="/img/analysis/part/h.png" width="1000"/>
<figcaption>The number of studies in MTBLS and MW based on the organism part</figcaption>
</div>
76 changes: 76 additions & 0 deletions docs/advanced-guides/nmr-repositories/variant.md
@@ -0,0 +1,76 @@
---
sidebar_position: 12
title: Variant
---

# Variant
All the graphs can be found in the [linked notebook](https://github.com/NFDI4Chem/repo-scripts/blob/main/notebooks/variant.ipynb).

Data created on 17.10.2022 at 19:32:45

Data updated on 17.10.2022 at 19:32:45

## Support by Ontologies
[The BRENDA Tissue Ontology - BTO](https://www.ebi.ac.uk/ols/ontologies/bto) and [Experimental Factor Ontology - EFO](https://www.ebi.ac.uk/ols/ontologies/efo) are good sources for variants.

Variants are found only in metabolomics-related repositories, i.e., MTBLS and MW.

<table>
<tr>
<th></th>
<th>Field Type</th>
<th>Field Name</th>
<th>Values Readability</th>
<th>Unit</th>
<th>Missing</th>
</tr>
<tr>
<td><b>MTBLS</b></td>
<td>dedicated</td>
<td>Variant</td>
<td>free text</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or the study "assays" value is "null"; or the organism is not found in NCBI taxonomy.</td>
</tr>
<tr>
<td><b>MW</b></td>
<td>dedicated</td>
<td>GENOTYPE_STRAIN</td>
<td>free text</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or decoding the JSON file containing the study details failed due to a syntax error; or the organism was not found in NCBI taxonomy.</td>
</tr>
</table>

<table>
<tr>
<th></th>
<th>Input Examples</th>
<th>Output</th>
</tr>
<tr>
<td><b>MTBLS</b></td>
<td>["Mus musculus str. SAMP1/YitFc", "BY4741", "Thoroughbred", "EFO:Thalassiosira pseudonana CCMP1335"]</td>
<td>["c57Bl-6", "c3h-hen", etc.]</td>
</tr>
<tr>
<td><b>MW</b></td>
<td>["C57BL/6", "Swiss Webster Mice", "C3H/HeN"]</td>
<td>["c57Bl-6", "c3h-hen", etc.]</td>
</tr>
</table>
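A rough sketch of how free-text strain names might be collapsed into comparable keys like those in the output column above (the exact normalisation in the notebook may differ; casing such as "c57Bl-6" suggests extra rules not reproduced here):

```python
import re

def slugify_variant(raw):
    """Lower-case a strain name and replace separators with
    hyphens so that differently formatted entries such as
    "C57BL/6" and "c57bl 6" group together."""
    s = raw.strip().lower()
    return re.sub(r"[/\s]+", "-", s)

print(slugify_variant("C57BL/6"))   # c57bl-6
print(slugify_variant("C3H/HeN"))   # c3h-hen
```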

## Results
Variant details are available in metabolomics repositories. The most used variant was "C57BL/6J".

<div style={{textAlign: 'center'}}>
<img src="/img/analysis/var/all.png" width="700"/>
<figcaption>A rough estimate of the percentages of all studies in MTBLS and MW repositories based on the variant</figcaption>
</div>
<br></br>

Here one can see the number of studies providing the variant and its value.
<div style={{textAlign: 'center'}}>
<img src="/img/analysis/var/h.png" width="1000"/>
<figcaption>The number of studies in MTBLS and MW based on the variant</figcaption>
</div>
27 changes: 27 additions & 0 deletions docs/developer-guides/installation/mac.md
@@ -57,6 +57,33 @@ composer install
```bash
npm install && npm run dev
```
* For background jobs to run, nmrXiv relies on [Redis](https://redis.com/) and ships with [Horizon](https://github.com/laravel/horizon).
Run the commands below to publish the Horizon assets and start the worker that executes the background jobs.
```bash
./vendor/bin/sail artisan horizon:publish
./vendor/bin/sail artisan horizon
```

* To configure file object storage, you should have a [Minio](https://min.io/) instance already running locally (see your `docker-compose` file for details). The first time, you have to generate the access keys, create the buckets, and configure them in your `.env` file.
* Open the Minio console running at [http://localhost:8900](http://localhost:8900/)
* Log in with user `sail` and password `password`
* Go to Access Keys and create a new access key.
* Create the two buckets `nmrxiv` and `nmrxiv-public` with read/write access.
* Update the filesystem driver and the AWS keys in the `.env` file as shown below. Make sure your `AWS_URL` points to the Minio API, which runs on port 9000.

```bash
FILESYSTEM_DRIVER=minio
FILESYSTEM_DRIVER_PUBLIC=minio_public

AWS_ACCESS_KEY_ID=RjcSdMxMiiGYycQV
AWS_SECRET_ACCESS_KEY=jCq9hAvsW4lmMzLzdyuvmoX7dqBpSc7W
AWS_DEFAULT_REGION=us-east-1
AWS_BUCKET=nmrxiv
AWS_ENDPOINT=http://localhost:9000/
AWS_URL=http://localhost:9000/
AWS_USE_PATH_STYLE_ENDPOINT=false
AWS_BUCKET_PUBLIC=nmrxiv-public
```
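As an optional sanity check (not part of the official setup), and assuming the containers are up with Minio's API exposed on port 9000 as configured above, Minio's documented liveness endpoint can confirm the service is reachable:

```bash
# Prints 200 when the Minio API is reachable on port 9000.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9000/minio/health/live
```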

Once the application's Docker containers have been started, you can access the application in your web browser at [http://localhost](http://localhost). But first, you will be prompted to <b>Generate app key</b>. After pressing the generation button, the following message is shown on the screen: "The solution was executed successfully. Refresh now." After refreshing, you can access the application.

28 changes: 28 additions & 0 deletions docs/developer-guides/installation/ubuntu.md
Expand Up @@ -11,6 +11,7 @@ The whole project is a package of below services and features.
* [Selenium](https://www.selenium.dev/documentation/)
* [Meilisearch](https://docs.meilisearch.com/)
* [MailHog](https://mailtrap.io/blog/mailhog-explained/)
* [Minio](https://min.io/)

#### Ubuntu 20.04

@@ -63,6 +64,33 @@ Don't forget to note down the admin's user id and password provided at the end o
```bash
npm install && npm run dev
```
* For background jobs to run, nmrXiv relies on [Redis](https://redis.com/) and ships with [Horizon](https://github.com/laravel/horizon).
Run the commands below to publish the Horizon assets and start the worker that executes the background jobs.
```bash
./vendor/bin/sail artisan horizon:publish
./vendor/bin/sail artisan horizon
```

* To configure file object storage, you should have a [Minio](https://min.io/) instance already running locally (see your `docker-compose` file for details). The first time, you have to generate the access keys, create the buckets, and configure them in your `.env` file.
* Open the Minio console running at [http://localhost:8900](http://localhost:8900/)
* Log in with user `sail` and password `password`
* Go to Access Keys and create a new access key.
* Create the two buckets `nmrxiv` and `nmrxiv-public` with read/write access.
* Update the filesystem driver and the AWS keys in the `.env` file as shown below. Make sure your `AWS_URL` points to the Minio API, which runs on port 9000.

```bash
FILESYSTEM_DRIVER=minio
FILESYSTEM_DRIVER_PUBLIC=minio_public

AWS_ACCESS_KEY_ID=RjcSdMxMiiGYycQV
AWS_SECRET_ACCESS_KEY=jCq9hAvsW4lmMzLzdyuvmoX7dqBpSc7W
AWS_DEFAULT_REGION=us-east-1
AWS_BUCKET=nmrxiv
AWS_ENDPOINT=http://localhost:9000/
AWS_URL=http://localhost:9000/
AWS_USE_PATH_STYLE_ENDPOINT=false
AWS_BUCKET_PUBLIC=nmrxiv-public
```

Once the application's Docker containers have been started, you can access the application in your web browser at [http://localhost](http://localhost). But first, you will be prompted to <b>Generate app key</b>. After pressing the generation button, the following message is shown on the screen: "The solution was executed successfully. Refresh now." After refreshing, you can access the application.

28 changes: 28 additions & 0 deletions docs/developer-guides/installation/windows.md
Expand Up @@ -83,6 +83,34 @@ npm install
npm run dev
```

* For background jobs to run, nmrXiv relies on [Redis](https://redis.com/) and ships with [Horizon](https://github.com/laravel/horizon).
Run the commands below to publish the Horizon assets and start the worker that executes the background jobs.
```bash
./vendor/bin/sail artisan horizon:publish
./vendor/bin/sail artisan horizon
```

* To configure file object storage, you should have a [Minio](https://min.io/) instance already running locally (see your `docker-compose` file for details). The first time, you have to generate the access keys, create the buckets, and configure them in your `.env` file.
* Open the Minio console running at [http://localhost:8900](http://localhost:8900/)
* Log in with user `sail` and password `password`
* Go to Access Keys and create a new access key.
* Create the two buckets `nmrxiv` and `nmrxiv-public` with read/write access.
* Update the filesystem driver and the AWS keys in the `.env` file as shown below. Make sure your `AWS_URL` points to the Minio API, which runs on port 9000.

```bash
FILESYSTEM_DRIVER=minio
FILESYSTEM_DRIVER_PUBLIC=minio_public

AWS_ACCESS_KEY_ID=RjcSdMxMiiGYycQV
AWS_SECRET_ACCESS_KEY=jCq9hAvsW4lmMzLzdyuvmoX7dqBpSc7W
AWS_DEFAULT_REGION=us-east-1
AWS_BUCKET=nmrxiv
AWS_ENDPOINT=http://localhost:9000/
AWS_URL=http://localhost:9000/
AWS_USE_PATH_STYLE_ENDPOINT=false
AWS_BUCKET_PUBLIC=nmrxiv-public
```

Once the application's Docker containers have been started, you can access the application in your web browser at [http://localhost](http://localhost). But first, you will be prompted to <b>Generate app key</b>. After pressing the generation button, the following message is shown on the screen: "The solution was executed successfully. Refresh now." After refreshing, you can access the application.

Run `code .` to open the code base in your VSCode editor.
15 changes: 8 additions & 7 deletions docs/submission-guides/data-model/dataset.md
@@ -37,8 +37,7 @@ If any structure is added to `Chemical structures` field in NMRium, it appears d
- **Meta** is the second table coming after **Info**. It includes the metadata from the instrument file.

## Create
There are two ways to create datasets. First is through the [submission pipeline](/docs/submission-guides/submission/upload.md), where the datasets will be automatically detected. The second is after submission from the `Datasets` tab within a study by dragging and dropping files or folders to the opened [submission pipeline](/docs/submission-guides/submission/upload.md), but the second option is possible only for private ones.

There are two ways to create datasets. The first is through the [submission pipeline](/docs/submission-guides/submission/upload.md), where the datasets are detected automatically. The second, possible only for private studies, is after submission from the `Datasets` tab within a study, by clicking the [`Manage Datasets` button](#manage-datasets) and dragging and dropping files or folders into the opened [submission pipeline](/docs/submission-guides/submission/upload.md). Please take care to drag the datasets into the correct study folder.

## Access
You can access your created datasets and the ones shared with you by [entering their parent studies](/docs/submission-guides/data-model/study/#access) and going to the `Datasets` or `Files` tabs. All the public datasets on **[nmrXiv](https://nmrxiv.org/)** are in the `Datasets` folder.
@@ -47,7 +46,7 @@ You can access your created datasets and the ones shared with you by [entering t
To edit a dataset, you should have **editing** access to it, which is the case when you are its creator or when it is shared with you as an owner or a collaborator. The dataset should also still be private. You can edit the dataset through NMRium, and a history of changes is kept.

### Manage Datasets
From the `Datasets` tab within a study, you can add more datasets by dragging and dropping files or folders to the opened [submission pipeline](/docs/submission-guides/submission/upload.md), or you can delete them by selecting a dataset **in the left panel** and pressing `Delete`.
From the `Datasets` tab within a study, you can add more datasets by dragging and dropping files or folders into the opened [submission pipeline](/docs/submission-guides/submission/upload.md). Please take care to drag the datasets into the correct study folder. You can also delete a dataset by selecting it **in the left panel** and pressing `Delete`.

## Validation

Expand All @@ -57,13 +56,15 @@ To publish a dataset, i.e., to make it public, you need to publish its parent pr
<img src="/img/project/publish.png" width="1000"/>
</div>

Clicking on **Why can't I publish?** leads to a new page similar to the [step-3 of the submission pipeline](/docs/submission-guides/submission/upload#complete---step-3). Here you can find either a red <span style={{color:"red"}}>x</span> or a green <span style={{color:"green"}}>✓</span> to indicate the absence or presence of the metadata, respectively. Whenever the red <span style={{color:"red"}}>x</span> exists, it is accompanied by an `Edit` button to facilitate providing the missing data. If the `NMRium info` validation fails, this means NMRium didn't manage to extract the metadata from the files. Please click `Edit`, which will lead you to the corresponding dataset, and there click on `Preview`, which will update the preview and save it. You can make sure that the metadata is generated by checking the `Info` table below NMRium.

If `Assignments` validation fails, this means you have no structure at the dataset level. Click `Edit`, which will lead you to the corresponding dataset, and in NMRium `Chemical structures`, you have to ensure providing a structure there. The datasets where validation failed can be found from the red highlighting of their name.
Clicking on **Why can't I publish?** leads to a new page similar to the [step-3 of the submission pipeline](/docs/submission-guides/submission/upload#complete---step-3). Here you can find either a red <span style={{color:"red"}}>x</span> or a green <span style={{color:"green"}}>✓</span> to indicate the absence or presence of the mandatory metadata, respectively. A yellow <span style={{color:"orange"}}>⚠</span> indicates the absence of recommended (not mandatory) metadata. Whenever a red <span style={{color:"red"}}>x</span> or a yellow <span style={{color:"orange"}}>⚠</span> appears, it is accompanied by an `Edit` button to facilitate providing the missing data. Here are more details about why a certain dataset validation fails:
- Files: This field checks whether there are spectral NMR data files. Since the [project submission](/docs/submission-guides/submission/upload.md) (not publishing) is not possible without spectral files, this field always passes the validation before submission <span style={{color:"green"}}>✓</span>.
- NMRium info: This field is about whether the dataset was processed completely by NMRium, where its metadata gets parsed and the spectrum gets viewed. If this field doesn't pass the validation <span style={{color:"red"}}>x</span>, please go to the respective dataset (with the edit button), and there click on `Preview`, which will update the preview and save it. You can make sure that the metadata is generated by checking the existence of the `Info` table below NMRium <span style={{color:"green"}}>✓</span>.
- Assay (Metadata): This feature is in development and, at the moment, the corresponding field will always pass the validation <span style={{color:"green"}}>✓</span>.
- Assignments: The molecular assignment with NMRium is recommended by **[nmrXiv](https://nmrxiv.org/)**. Therefore, whenever it is not provided, the field gets marked with a yellow <span style={{color:"orange"}}>⚠</span>, but it is still possible to publish without it. To provide the molecular assignment, please refer to [NMRium documentation](https://docs.nmrium.org/structure_assignment/assign/add).

<div style={{textAlign: 'center'}}>
<img src="/img/dataset/validation.png" width="1000"/>
<figcaption>Validation Checklist of the Dataset</figcaption>
<figcaption>Validation Checklist of the Dataset (within a study)</figcaption>
</div>

## Share
