This repository has been archived by the owner on Nov 14, 2023. It is now read-only.

Commit

Merge pull request #288 from NFDI4Chem/development
fix: add horizon command to initial setup steps
NishaSharma14 authored Mar 3, 2023
2 parents f3e45c7 + 3ed88a6 commit 837b4ba
Showing 17 changed files with 299 additions and 19 deletions.
4 changes: 2 additions & 2 deletions docs/advanced-guides/nmr-repositories/organism.md
@@ -31,7 +31,7 @@ Organisms are found only in metabolomics-related repositories, i.e., MTBLS and M
<td>Organism</td>
<td>ontology-driven</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or the study "assays" value is "null".</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or the study "assays" value is "null"; or the organism is not found in NCBI taxonomy.</td>
</tr>
<tr>
<td><b>MW</b></td>
@@ -69,7 +69,7 @@ Some values were ambiguous such as "Various", "Extract", "Multi-species non-defi
However, even after taking into account all that was mentioned above, it is still clear that the most studied species are humans (Homo sapiens) and mice (Mus musculus).
<div style={{textAlign: 'center'}}>
<img src="/img/analysis/org/all.png" width="700"/>
<figcaption>A rough estimate of the percentages of all studies in MTBLS and MW repositories based on the sample pH</figcaption>
<figcaption>A rough estimate of the percentages of all studies in MTBLS and MW repositories based on the organism</figcaption>
</div>
<br></br>

80 changes: 80 additions & 0 deletions docs/advanced-guides/nmr-repositories/part.md
@@ -0,0 +1,80 @@
---
sidebar_position: 11
title: Organism Part
---

# Organism Part
All the graphs can be found in the [linked notebook](https://github.com/NFDI4Chem/repo-scripts/blob/main/notebooks/organism-part.ipynb).

Data created on 17.10.2022 at 19:32:45

Data updated on 17.10.2022 at 19:32:45

## Support by Ontologies
[The BRENDA Tissue Ontology - BTO](https://www.ebi.ac.uk/ols/ontologies/bto) and [Experimental Factor Ontology - EFO](https://www.ebi.ac.uk/ols/ontologies/efo) are excellent sources for organism parts.

## Data Sanitisation and Missing Values
Organism parts are found only in metabolomics-related repositories, i.e., MTBLS and MW.


<table>
<tr>
<th></th>
<th>Field Type</th>
<th>Field Name</th>
<th>Values Readability</th>
<th>Unit</th>
<th>Missing</th>
</tr>
<tr>
<td><b>MTBLS</b></td>
<td>dedicated</td>
<td>Organism part</td>
<td>ontology-driven</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or the study "assays" value is "null"; or the organism is not found in NCBI taxonomy.</td>
</tr>
<tr>
<td><b>MW</b></td>
<td>dedicated</td>
<td>SAMPLE_TYPE</td>
<td>free text</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or decoding the JSON file containing the study details failed due to a syntax error; or the organism was not found in NCBI taxonomy.</td>
</tr>
</table>
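The missing-value rules in the table above can be sketched in Python (the linked notebooks are Python-based, but the helper below is a hypothetical illustration, not the actual notebook code):

```python
def is_missing(value, assays="present"):
    """Rough sketch of the missing-value rules in the table above:
    the field is not provided, the value is an N/A-like placeholder,
    or the study "assays" value is null."""
    na_like = {"n/a", "na", "none", "not applicable", "-", ""}
    if assays is None:  # study "assays" value is "null"
        return True
    if value is None:   # field not provided at all
        return True
    return value.strip().lower() in na_like

print(is_missing("N/A"))                 # True
print(is_missing("liver"))               # False
print(is_missing("liver", assays=None))  # True
```

The NCBI-taxonomy check is omitted here, since it requires an external lookup.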

<table>
<tr>
<th></th>
<th>Input Examples</th>
<th>Output</th>
</tr>
<tr>
<td><b>MTBLS</b></td>
<td>["blood serum", "serum", "A2780cisR cell", "muscle", "feces", "Acetonitrile:H2O (1:3)"]</td>
<td>["blood serum", "urine", etc.]</td>
</tr>
<tr>
<td><b>MW</b></td>
<td>["Urine", "urine", "BLOOD", "Serum", "Plasma, Liver"]</td>
<td>["blood serum", "urine", etc.]</td>
</tr>
</table>
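The mapping from free-text inputs to the harmonised outputs shown above can be sketched roughly as follows (the synonym map here is hypothetical and far smaller than whatever the notebook actually uses):

```python
# Hypothetical synonym map: keys are cleaned-up raw values,
# values are the harmonised organism-part terms.
SYNONYMS = {
    "serum": "blood serum",
    "blood serum": "blood serum",
    "urine": "urine",
    "plasma": "blood plasma",
    "liver": "liver",
    "feces": "feces",
}

def normalise_part(raw):
    """Trim and lower-case a free-text value ("Urine", "BLOOD",
    "Serum") and map it onto a harmonised term; values that are
    not organism parts (e.g. solvent mixtures) yield None."""
    return SYNONYMS.get(raw.strip().lower())

print(normalise_part("Serum"))                   # blood serum
print(normalise_part("urine"))                   # urine
print(normalise_part("Acetonitrile:H2O (1:3)"))  # None
```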

## Results
Organism part details are available in metabolomics repositories. The use of different ontology sources was easy to see in inconsistent terms such as "Blood" vs. "blood". Additionally, values other than organism parts were sometimes provided, such as "Acetonitrile:H2O (1:3)".

The most used part was blood serum, followed by urine, blood plasma, liver, and others.

<div style={{textAlign: 'center'}}>
<img src="/img/analysis/part/all.png" width="700"/>
<figcaption>A rough estimate of the percentages of all studies in MTBLS and MW repositories based on the organism part</figcaption>
</div>
<br></br>

Here one can see the number of studies providing the organism part and its value.
<div style={{textAlign: 'center'}}>
<img src="/img/analysis/part/h.png" width="1000"/>
<figcaption>The number of studies in MTBLS and MW based on the organism part</figcaption>
</div>
76 changes: 76 additions & 0 deletions docs/advanced-guides/nmr-repositories/variant.md
@@ -0,0 +1,76 @@
---
sidebar_position: 12
title: Variant
---

# Variant
All the graphs can be found in the [linked notebook](https://github.com/NFDI4Chem/repo-scripts/blob/main/notebooks/variant.ipynb).

Data created on 17.10.2022 at 19:32:45

Data updated on 17.10.2022 at 19:32:45

## Support by Ontologies
[The BRENDA Tissue Ontology - BTO](https://www.ebi.ac.uk/ols/ontologies/bto) and [Experimental Factor Ontology - EFO](https://www.ebi.ac.uk/ols/ontologies/efo) are good sources for variants.

Variants are found only in metabolomics-related repositories, i.e., MTBLS and MW.

<table>
<tr>
<th></th>
<th>Field Type</th>
<th>Field Name</th>
<th>Values Readability</th>
<th>Unit</th>
<th>Missing</th>
</tr>
<tr>
<td><b>MTBLS</b></td>
<td>dedicated</td>
<td>Variant</td>
<td>free text</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or the study "assays" value is "null"; or the organism is not found in NCBI taxonomy.</td>
</tr>
<tr>
<td><b>MW</b></td>
<td>dedicated</td>
<td>GENOTYPE_STRAIN</td>
<td>free text</td>
<td>none</td>
<td>The field is not provided; or the value is provided as N/A or other similar expressions; or decoding the JSON file containing the study details failed due to a syntax error; or the organism was not found in NCBI taxonomy.</td>
</tr>
</table>

<table>
<tr>
<th></th>
<th>Input Examples</th>
<th>Output</th>
</tr>
<tr>
<td><b>MTBLS</b></td>
<td>["Mus musculus str. SAMP1/YitFc", "BY4741", "Thoroughbred", "EFO:Thalassiosira pseudonana CCMP1335"]</td>
<td>["c57Bl-6", "c3h-hen", etc.]</td>
</tr>
<tr>
<td><b>MW</b></td>
<td>["C57BL/6", "Swiss Webster Mice", "C3H/HeN"]</td>
<td>["c57Bl-6", "c3h-hen", etc.]</td>
</tr>
</table>
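A rough sketch of how free-text strain names might be collapsed into comparable keys like those in the output column above (the exact normalisation in the notebook may differ; casing such as "c57Bl-6" suggests extra rules not reproduced here):

```python
import re

def slugify_variant(raw):
    """Lower-case a strain name and replace separators with
    hyphens so that differently formatted entries such as
    "C57BL/6" and "c57bl 6" group together."""
    s = raw.strip().lower()
    return re.sub(r"[/\s]+", "-", s)

print(slugify_variant("C57BL/6"))   # c57bl-6
print(slugify_variant("C3H/HeN"))   # c3h-hen
```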

## Results
Variant details are available in metabolomics repositories. The most used variant was "C57BL/6J".

<div style={{textAlign: 'center'}}>
<img src="/img/analysis/var/all.png" width="700"/>
<figcaption>A rough estimate of the percentages of all studies in MTBLS and MW repositories based on the variant</figcaption>
</div>
<br></br>

Here one can see the number of studies providing the variant and its value.
<div style={{textAlign: 'center'}}>
<img src="/img/analysis/var/h.png" width="1000"/>
<figcaption>The number of studies in MTBLS and MW based on the variant</figcaption>
</div>
27 changes: 27 additions & 0 deletions docs/developer-guides/installation/mac.md
@@ -57,6 +57,33 @@ composer install
```bash
npm install && npm run dev
```
* For background jobs to run, nmrXiv relies on [Redis](https://redis.com/) and ships with [Horizon](https://github.com/laravel/horizon).
Run the commands below to publish the Horizon assets and start the worker that executes the background jobs.
```bash
./vendor/bin/sail artisan horizon:publish
./vendor/bin/sail artisan horizon
```

* To configure file object storage, you should have a [Minio](https://min.io/) instance already running locally (see your `docker-compose` file for details). The first time, you have to generate the access keys, create the buckets, and configure them in your `.env` file.
* Open the Minio console running at [http://localhost:8900](http://localhost:8900/)
* Log in with user `sail` and password `password`
* Go to Access Keys and create a new access key.
* Create the two buckets `nmrxiv` and `nmrxiv-public` with read/write access.
* Update the filesystem driver and the AWS keys in the `.env` file as shown below. Make sure your `AWS_URL` points to the Minio API, which runs on port 9000.

```bash
FILESYSTEM_DRIVER=minio
FILESYSTEM_DRIVER_PUBLIC=minio_public

AWS_ACCESS_KEY_ID=RjcSdMxMiiGYycQV
AWS_SECRET_ACCESS_KEY=jCq9hAvsW4lmMzLzdyuvmoX7dqBpSc7W
AWS_DEFAULT_REGION=us-east-1
AWS_BUCKET=nmrxiv
AWS_ENDPOINT=http://localhost:9000/
AWS_URL=http://localhost:9000/
AWS_USE_PATH_STYLE_ENDPOINT=false
AWS_BUCKET_PUBLIC=nmrxiv-public
```
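As an optional sanity check (not part of the official setup), and assuming the containers are up with Minio's API exposed on port 9000 as configured above, Minio's documented liveness endpoint can confirm the service is reachable:

```bash
# Prints 200 when the Minio API is reachable on port 9000.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9000/minio/health/live
```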

Once the application's Docker containers have been started, you can access the application in your web browser at [http://localhost](http://localhost). But first, you will be prompted to <b>Generate app key</b>. After pressing the generation button, the following message is shown on the screen: "The solution was executed successfully. Refresh now." After refreshing, you can access the application.

28 changes: 28 additions & 0 deletions docs/developer-guides/installation/ubuntu.md
Expand Up @@ -11,6 +11,7 @@ The whole project is a package of below services and features.
* [Selenium](https://www.selenium.dev/documentation/)
* [Meilisearch](https://docs.meilisearch.com/)
* [MailHog](https://mailtrap.io/blog/mailhog-explained/)
* [Minio](https://min.io/)

#### Ubuntu 20.04

@@ -63,6 +64,33 @@ Don't forget to note down the admin's user id and password provided at the end o
```bash
npm install && npm run dev
```
* For background jobs to run, nmrXiv relies on [Redis](https://redis.com/) and ships with [Horizon](https://github.com/laravel/horizon).
Run the commands below to publish the Horizon assets and start the worker that executes the background jobs.
```bash
./vendor/bin/sail artisan horizon:publish
./vendor/bin/sail artisan horizon
```

* To configure file object storage, you should have a [Minio](https://min.io/) instance already running locally (see your `docker-compose` file for details). The first time, you have to generate the access keys, create the buckets, and configure them in your `.env` file.
* Open the Minio console running at [http://localhost:8900](http://localhost:8900/)
* Log in with user `sail` and password `password`
* Go to Access Keys and create a new access key.
* Create the two buckets `nmrxiv` and `nmrxiv-public` with read/write access.
* Update the filesystem driver and the AWS keys in the `.env` file as shown below. Make sure your `AWS_URL` points to the Minio API, which runs on port 9000.

```bash
FILESYSTEM_DRIVER=minio
FILESYSTEM_DRIVER_PUBLIC=minio_public

AWS_ACCESS_KEY_ID=RjcSdMxMiiGYycQV
AWS_SECRET_ACCESS_KEY=jCq9hAvsW4lmMzLzdyuvmoX7dqBpSc7W
AWS_DEFAULT_REGION=us-east-1
AWS_BUCKET=nmrxiv
AWS_ENDPOINT=http://localhost:9000/
AWS_URL=http://localhost:9000/
AWS_USE_PATH_STYLE_ENDPOINT=false
AWS_BUCKET_PUBLIC=nmrxiv-public
```

Once the application's Docker containers have been started, you can access the application in your web browser at [http://localhost](http://localhost). But first, you will be prompted to <b>Generate app key</b>. After pressing the generation button, the following message is shown on the screen: "The solution was executed successfully. Refresh now." After refreshing, you can access the application.

28 changes: 28 additions & 0 deletions docs/developer-guides/installation/windows.md
Expand Up @@ -83,6 +83,34 @@ npm install
npm run dev
```

* For background jobs to run, nmrXiv relies on [Redis](https://redis.com/) and ships with [Horizon](https://github.com/laravel/horizon).
Run the commands below to publish the Horizon assets and start the worker that executes the background jobs.
```bash
./vendor/bin/sail artisan horizon:publish
./vendor/bin/sail artisan horizon
```

* To configure file object storage, you should have a [Minio](https://min.io/) instance already running locally (see your `docker-compose` file for details). The first time, you have to generate the access keys, create the buckets, and configure them in your `.env` file.
* Open the Minio console running at [http://localhost:8900](http://localhost:8900/)
* Log in with user `sail` and password `password`
* Go to Access Keys and create a new access key.
* Create the two buckets `nmrxiv` and `nmrxiv-public` with read/write access.
* Update the filesystem driver and the AWS keys in the `.env` file as shown below. Make sure your `AWS_URL` points to the Minio API, which runs on port 9000.

```bash
FILESYSTEM_DRIVER=minio
FILESYSTEM_DRIVER_PUBLIC=minio_public

AWS_ACCESS_KEY_ID=RjcSdMxMiiGYycQV
AWS_SECRET_ACCESS_KEY=jCq9hAvsW4lmMzLzdyuvmoX7dqBpSc7W
AWS_DEFAULT_REGION=us-east-1
AWS_BUCKET=nmrxiv
AWS_ENDPOINT=http://localhost:9000/
AWS_URL=http://localhost:9000/
AWS_USE_PATH_STYLE_ENDPOINT=false
AWS_BUCKET_PUBLIC=nmrxiv-public
```

Once the application's Docker containers have been started, you can access the application in your web browser at [http://localhost](http://localhost). But first, you will be prompted to <b>Generate app key</b>. After pressing the generation button, the following message is shown on the screen: "The solution was executed successfully. Refresh now." After refreshing, you can access the application.

Run `code .` to open the code base in your VSCode editor.
15 changes: 8 additions & 7 deletions docs/submission-guides/data-model/dataset.md
@@ -37,8 +37,7 @@ If any structure is added to `Chemical structures` field in NMRium, it appears d
- **Meta** is the second table coming after **Info**. It includes the metadata from the instrument file.

## Create
There are two ways to create datasets. First is through the [submission pipeline](/docs/submission-guides/submission/upload.md), where the datasets will be automatically detected. The second is after submission from the `Datasets` tab within a study by dragging and dropping files or folders to the opened [submission pipeline](/docs/submission-guides/submission/upload.md), but the second option is possible only for private ones.

There are two ways to create datasets. The first is through the [submission pipeline](/docs/submission-guides/submission/upload.md), where the datasets are detected automatically. The second, possible only for private studies, is after submission from the `Datasets` tab within a study, by clicking the [`Manage Datasets` button](#manage-datasets) and dragging and dropping files or folders into the opened [submission pipeline](/docs/submission-guides/submission/upload.md). Please take care to drag the datasets into the correct study folder.

## Access
You can access your created datasets and the ones shared with you by [entering their parent studies](/docs/submission-guides/data-model/study/#access) and going to the `Datasets` or `Files` tabs. All the public datasets on **[nmrXiv](https://nmrxiv.org/)** are in the `Datasets` folder.
@@ -47,7 +46,7 @@ You can access your created datasets and the ones shared with you by [entering t
To edit a dataset, you should have **editing** access to it, which is the case when you are its creator or when it is shared with you as an owner or a collaborator. The dataset should also still be private. You can edit the dataset through NMRium, and a history of changes is kept.

### Manage Datasets
From the `Datasets` tab within a study, you can add more datasets by dragging and dropping files or folders to the opened [submission pipeline](/docs/submission-guides/submission/upload.md), or you can delete them by selecting a dataset **in the left panel** and pressing `Delete`.
From the `Datasets` tab within a study, you can add more datasets by dragging and dropping files or folders into the opened [submission pipeline](/docs/submission-guides/submission/upload.md). Please take care to drag the datasets into the correct study folder. You can also delete a dataset by selecting it **in the left panel** and pressing `Delete`.

## Validation

Expand All @@ -57,13 +56,15 @@ To publish a dataset, i.e., to make it public, you need to publish its parent pr
<img src="/img/project/publish.png" width="1000"/>
</div>

Clicking on **Why can't I publish?** leads to a new page similar to the [step-3 of the submission pipeline](/docs/submission-guides/submission/upload#complete---step-3). Here you can find either a red <span style={{color:"red"}}>x</span> or a green <span style={{color:"green"}}>✓</span> to indicate the absence or presence of the metadata, respectively. Whenever the red <span style={{color:"red"}}>x</span> exists, it is accompanied by an `Edit` button to facilitate providing the missing data. If the `NMRium info` validation fails, this means NMRium didn't manage to extract the metadata from the files. Please click `Edit`, which will lead you to the corresponding dataset, and there click on `Preview`, which will update the preview and save it. You can make sure that the metadata is generated by checking the `Info` table below NMRium.

If `Assignments` validation fails, this means you have no structure at the dataset level. Click `Edit`, which will lead you to the corresponding dataset, and in NMRium `Chemical structures`, you have to ensure providing a structure there. The datasets where validation failed can be found from the red highlighting of their name.
Clicking on **Why can't I publish?** leads to a new page similar to the [step-3 of the submission pipeline](/docs/submission-guides/submission/upload#complete---step-3). Here you can find either a red <span style={{color:"red"}}>x</span> or a green <span style={{color:"green"}}>✓</span> to indicate the absence or presence of the mandatory metadata, respectively. A yellow <span style={{color:"orange"}}>⚠</span> indicates the absence of recommended (not mandatory) metadata. Whenever a red <span style={{color:"red"}}>x</span> or a yellow <span style={{color:"orange"}}>⚠</span> appears, it is accompanied by an `Edit` button to facilitate providing the missing data. Here are more details about why a certain dataset validation fails:
- Files: This field checks whether there are spectral NMR data files. Since the [project submission](/docs/submission-guides/submission/upload.md) (not publishing) is not possible without spectral files, this field always passes the validation before submission <span style={{color:"green"}}>✓</span>.
- NMRium info: This field is about whether the dataset was processed completely by NMRium, where its metadata gets parsed and the spectrum gets viewed. If this field doesn't pass the validation <span style={{color:"red"}}>x</span>, please go to the respective dataset (with the edit button), and there click on `Preview`, which will update the preview and save it. You can make sure that the metadata is generated by checking the existence of the `Info` table below NMRium <span style={{color:"green"}}>✓</span>.
- Assay (Metadata): This feature is in development and, at the moment, the corresponding field will always pass the validation <span style={{color:"green"}}>✓</span>.
- Assignments: The molecular assignment with NMRium is recommended by **[nmrXiv](https://nmrxiv.org/)**. Therefore, whenever it is not provided, the field gets marked with a yellow <span style={{color:"orange"}}>⚠</span>, but it is still possible to publish without it. To provide the molecular assignment, please refer to [NMRium documentation](https://docs.nmrium.org/structure_assignment/assign/add).

<div style={{textAlign: 'center'}}>
<img src="/img/dataset/validation.png" width="1000"/>
<figcaption>Validation Checklist of the Dataset</figcaption>
<figcaption>Validation Checklist of the Dataset (within a study)</figcaption>
</div>

## Share
