Add action to spell check docs and README
Westwooo committed Sep 24, 2024
1 parent 569707f commit 2a0b7b7
Showing 17 changed files with 187 additions and 31 deletions.
27 changes: 27 additions & 0 deletions .github/workflows/.spellcheck.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
matrix:
- name: Markdown
expect_match: false
aspell:
lang: en
dictionary:
wordlists:
- .github/workflows/.wordlist.txt
output: wordlist.dic
encoding: utf-8
pipeline:
- pyspelling.filters.markdown:
markdown_extensions:
- markdown.extensions.extra:
- pyspelling.filters.html:
comments: false
attributes:
- alt
ignores:
- ':matches(code, pre)'
- 'code'
- 'pre'
- 'blockquote'
sources:
- 'README.md'
- 'docs/*.adoc'
- 'docs/**/*.adoc'
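The job above runs pyspelling with aspell's English dictionary plus the custom wordlist, after the pipeline strips `code`, `pre` and `blockquote` elements. The filtering idea can be sketched in Python (a rough stand-in for pyspelling, with a toy base dictionary, not the actual implementation):

```python
import re

# Toy stand-in for aspell's "en" dictionary used by the workflow.
BASE_DICT = {"the", "spell", "check", "runs", "on", "docs", "and"}

# Extra accepted words, one per line, like .github/workflows/.wordlist.txt.
WORDLIST = "cbsh\nNushell\nCouchbase\n"

def unknown_words(text, wordlist):
    """Return words covered by neither the base dictionary nor the wordlist."""
    allowed = BASE_DICT | {w.strip() for w in wordlist.splitlines() if w.strip()}
    # Drop inline code spans, mirroring the ':matches(code, pre)' ignores.
    text = re.sub(r"`[^`]*`", " ", text)
    words = re.findall(r"[A-Za-z']+", text)
    return [w for w in words if w not in allowed and w.lower() not in allowed]

print(unknown_words("the spell check runs on `cbsh --help` docs and Nushell and Couchbbase", WORDLIST))
# only the misspelling "Couchbbase" is flagged; `cbsh --help` is skipped as code
```

This is why terms like `cbenv` and `contentVector` belong in the wordlist: they are correct in context but unknown to the base dictionary.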
118 changes: 118 additions & 0 deletions .github/workflows/.wordlist.txt
@@ -0,0 +1,118 @@
aarch
adoc
analytics
Analytics
api
aws

capella
cb
cbenv
cbsh
cbshell
CBShell
CIDR
CIDRs
cli
CLI
config
Config
connstr
contentVector
couchbaselabs
couchbase
Couchbase
csv
CSV

darwin
datasets
dataverses
descriptionEmbedding
dotfile
dotfiles

EE
embeddings
env

fieldName
FileSize

github
gz

Homebrew
hostnames
html
http
https

ints
InVpc

json
JSON

kv

linux
llm
localdev
localhost
lookups

macOS
memcached
MiB
msvc

namespace
netlify
nowrap
nushell
Nushell
Nushell's

OpenAI
OpenSSL

pc
plaintext
png
pre
projectcapella

QL
quickstart
Quickstart

rustup

sectnums
sqlite
SRV
subcommands
subdoc

templating
tera
tls
TLS
toclevels
toml
toolchain

uments
unix
upsert
upserted
userguide

whoami
www

xattrs
xml

yaml
13 changes: 12 additions & 1 deletion .github/workflows/ci.yml
@@ -159,4 +159,15 @@ jobs:
- uses: hustcer/setup-nu@main
with:
version: "*"
- run: nu docs/sample_config/prompt_tests.nu
- run: nu docs/sample_config/prompt_tests.nu

check-spelling:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
- name: Check Spelling
uses: rojopolis/[email protected]
with:
config_path: .github/workflows/.spellcheck.yml
task_name: Markdown
4 changes: 2 additions & 2 deletions README.md
@@ -93,8 +93,8 @@ On top of [nushell](https://www.nushell.sh/) built-in commands, the following co
- `cb-env project` - Sets the active cloud project based on its name
- `cb-env scope` - Sets the active scope based on its name
- `cb-env timeouts` - Sets the default timeouts
- `clouds` - Lists all clusters on the active Capella organisation
- `clusters`- Lists all clusters on the active Capella organisation
- `clouds` - Lists all clusters on the active Capella organization
- `clusters`- Lists all clusters on the active Capella organization
- `clusters create` - Creates a new cluster against the active Capella organization
- `clusters drop` - Deletes a cluster from the active Capella organization
- `clusters get` - Gets a cluster from the active Capella organization
8 changes: 4 additions & 4 deletions docs/commands.adoc
@@ -1,7 +1,7 @@
== Couchbase Commands

The following sections discuss the individual couchbase specific commands in greater detail. Remember, you can always mix and match
them with built-in other shell commands as well as executables from your environment.
The following sections discuss the individual Couchbase specific commands in greater detail. Remember, you can always mix and match
them with other built-in shell commands as well as executable programs from your environment.

include::commands/buckets.adoc[]

@@ -52,7 +52,7 @@ You can retrieve a document with `doc get`:
```

To distinguish the actual content from the metadata, the content is nested in the `content` field.
If you want to have everything at the toplevel, you can pipe to the `flatten` command:
If you want to have everything at the top level, you can pipe to the `flatten` command:
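In Python terms, what `flatten` does to such a row looks roughly like this (illustrative field values, not the actual travel-sample document):

```python
# A `doc get` style row: metadata columns plus the nested `content` record.
row = {
    "id": "airport_3719",
    "content": {"airportname": "Example Airport", "city": "Columbia"},
    "cas": 1712321,
}

def flatten_row(row):
    """Merge the nested `content` record into the top level, like `flatten`."""
    flat = {k: v for k, v in row.items() if k != "content"}
    flat.update(row["content"])
    return flat

print(flatten_row(row))
# the nested fields now sit alongside id and cas at the top level
```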

[options="nowrap"]
```
@@ -288,7 +288,7 @@ The answering of questions with supplied context can be used to easily implement

=== `version`

The `version` command lists the version of the couchbase shell.
The `version` command lists the version of the Couchbase shell.

```
> version
2 changes: 1 addition & 1 deletion docs/commands/query.adoc
@@ -5,7 +5,7 @@ The query commands can be used to explore/create indexes and execute queries.

==== `query`

Takes a n1ql statement and executes it against the active cluster.
Takes a N1QL statement and executes it against the active cluster.

```
👤 Charlie 🏠 local in 🗄 travel-sample._default._default
2 changes: 1 addition & 1 deletion docs/commands/vector.adoc
@@ -309,7 +309,7 @@ Embedding batch 1/1
The resulting document is the same as the original, but with a new field `contentVector` which contains the result of embedding the content field with the <<_cb_env_llm,active llm>>.
The name of the field that the embedding will be written to will default to the name of the original field with "Vector" appended.
This default behaviour can be overwritten with the `vectorField` flag.
This default behavior can be overwritten with the `vectorField` flag.
The resulting document is formatted with an id and content column which allows it to be piped into a `doc upsert` command to store it in the connected Couchbase cluster.
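The default naming rule described above is simple enough to sketch (a hypothetical helper, not cbsh code; the second argument plays the role of the `vectorField` flag):

```python
def vector_field_name(source_field, vector_field=None):
    """Default: append "Vector" to the source field name.
    An explicit vector_field overrides it, as the vectorField flag does."""
    return vector_field if vector_field is not None else source_field + "Vector"

print(vector_field_name("content"))               # default naming
print(vector_field_name("content", "embedding"))  # overridden name
```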
```
4 changes: 2 additions & 2 deletions docs/exporting-data.adoc
@@ -12,7 +12,7 @@ If you want to only store the document body then you can use `doc get <id> | get

===== To JSON

From KeyValue
From key-value
```
> doc get airport_3719 --bucket travel-sample
╭───┬──────────────┬────────────────────────────────────┬─────────────────────┬───────┬─────────╮
@@ -152,7 +152,7 @@ To Multiple Documents

===== To CSV

From KeyValue
From key-value

[options="nowrap"]
```
2 changes: 1 addition & 1 deletion docs/importing-data.adoc
@@ -3,7 +3,7 @@
Couchbase Shell supports loading data from a variety of formats and sources.

The simplest way to import data is using `doc import` as covered in <<_loading_data_into_the_shell,Loading data into the shell>>.
These recipes will cover more advanced usecases.
These recipes will cover more advanced use cases.

==== A Note On Data format

12 changes: 6 additions & 6 deletions docs/intro.adoc
@@ -1,16 +1,16 @@
== Introduction

Couchbase Shell is fully featured, so it does not only contain commands related to couchbase but is actually built on top of a
general purpose shell called https://www.nushell.sh/[nushell]. This allows you to interact with the file system or any other
Couchbase Shell is fully featured, so it does not only contain commands related to Couchbase but is actually built on top of a
general purpose shell called https://www.nushell.sh/[Nushell]. This allows you to interact with the file system or any other
command available on your machine, making it a great tool for both operational and development tasks on top of Couchbase.

The following introduction only touches on the basic concepts to make you productive quickly. We recommend also checking out the
great https://www.nushell.sh/book[nushell documentation] so you can get the most out of it.
great https://www.nushell.sh/book[Nushell documentation] so you can get the most out of it.

=== Navigating the Shell

Commands take inputs and produce output in a structured manner, most often represented as tables. Note how both the generic `ls`
command and the couchbase-specific `buckets` command both produce a table as their output:
command and the Couchbase-specific `buckets` command both produce a table as their output:

```
> ls
@@ -166,7 +166,7 @@ If we ran a `doc get` it would fetch the doc from travel-sample.inventory.landma
=== Loading Data into the Shell

If you want to import data into Couchbase, or just load it into the shell for further processing, there are different commands available to help you.
Once the data is loaded into the shell it can be sent to one of the couchbase save commands like `doc upsert` and `doc import`.
Once the data is loaded into the shell it can be sent to one of the Couchbase save commands like `doc upsert` and `doc import`.
Depending on the structure of the data, and the command used, you may also need to tweak it a little bit so it can be properly stored.

==== Doc import
@@ -259,7 +259,7 @@ In our case we use `from json`:
```

TIP: look at the many different import formats `from` supports, including csv, xml, yaml and even sqlite. With this simple tool
at hand you are able to load many different data formats quickly and import them into couchbase!
at hand you are able to load many different data formats quickly and import them into Couchbase!

We cannot use this format directly with commands like `doc upsert` as the command expects two "columns" in the data - id and content.
This means that we have to perform some translation from the above format to one that `doc upsert` understands.
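The translation into the id and content columns that `doc upsert` expects can be sketched as follows (hypothetical records; which field becomes the id is up to you):

```python
# Records as they might come out of `from json`: no id/content split yet.
rows = [
    {"iata": "SFO", "airportname": "San Francisco Intl"},
    {"iata": "LAX", "airportname": "Los Angeles Intl"},
]

def to_upsert_shape(rows, id_field):
    """Wrap each record in the id and content columns `doc upsert` expects."""
    return [{"id": r[id_field], "content": r} for r in rows]

for entry in to_upsert_shape(rows, "iata"):
    print(entry["id"])  # SFO, then LAX
```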
4 changes: 2 additions & 2 deletions docs/quickstart.adoc
@@ -41,7 +41,7 @@ image::mac-binary-unsigned.png[macOS Warning,600]

==== Homebrew

If running on macOS you can install via the https://formulae.brew.sh/formula/couchbase-shell[homebrew] formula:
If running on macOS you can install via the https://formulae.brew.sh/formula/couchbase-shell[Homebrew] formula:

```
$ brew install couchbase-shell
@@ -99,7 +99,7 @@ To start experimenting with data operations load some sample data onto the clust
╰───┴─────────┴───────────────┴─────────╯
```

Now you can try running n1ql queries using the <<_query,query>> command.
Now you can try running N1QL queries using the <<_query,query>> command.

```
👤 Administrator 🏠 default
2 changes: 1 addition & 1 deletion docs/recipes.adoc
@@ -3,7 +3,7 @@
:sectnums:

Welcome to the recipes section of the Couchbase Shell `cbsh` documentation.
Here you can find how powerful tasks can be performed by a combination of pipelined statements using `cbsh`.
Here you can find how powerful tasks can be performed by combining `cbsh` statements.

include::recipes/register_cluster.adoc[]

6 changes: 3 additions & 3 deletions docs/recipes/managing_multiple_clusters.adoc
@@ -62,7 +62,7 @@ To focus on the free memory that each cluster has, we can https://www.nushell.sh
╰────┴────────────┴─────────────╯
```
We can reformat the tables to make the the data more readable, but nushell's understanding of various data types allows us to reformat the values within the table.
We can reformat the tables to make the data more readable, and Nushell's understanding of various data types allows us to reformat the values within the table.
For example we could convert the `memory_free` values from bytes to gigabytes:
[options="nowrap"]
@@ -85,10 +85,10 @@ For example we could convert the `memory_free` values from bytes to gigabytes:
╰───┴─────────────┴─────────────╯
```
We do this by iterating over each node and https://www.nushell.sh/commands/docs/update.html[updating] the value in the `memory_free` column by multiplying the current value by nushell's inbuilt https://www.nushell.sh/book/types_of_data.html#file-sizes[File Size] datatype.
We do this by iterating over each node and https://www.nushell.sh/commands/docs/update.html[updating] the value in the `memory_free` column by multiplying the current value by Nushell's inbuilt https://www.nushell.sh/book/types_of_data.html#file-sizes[File Size] datatype.
We can take this one step further and use the values returned to calculate new metrics about our clusters.
When performing a healthcheck it's be useful to know the memory utilization for each cluster.
When performing a health check it'd be useful to know the memory utilization for each cluster.
There are two columns that can be used to calculate this: `memory_free` and `memory_total`.
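The utilization figure follows directly from those two columns; as a sketch in Python (names mirror the table columns, the rounding is a presentational choice):

```python
def memory_utilization_pct(memory_free, memory_total):
    """Percentage of total memory in use, from the two byte-count columns."""
    used = memory_total - memory_free
    return round(100 * used / memory_total, 1)

# e.g. 6 GiB free out of 16 GiB total
print(memory_utilization_pct(6 * 1024**3, 16 * 1024**3))  # 62.5
```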
[options="nowrap"]
2 changes: 1 addition & 1 deletion docs/recipes/moving_data.adoc
@@ -35,7 +35,7 @@ The first thing to do is to recreate all of the buckets that we have on the `loc
```
Here we simply get all of the buckets, then iterate over the list with https://www.nushell.sh/commands/docs/each.html[each] and create buckets with the same name and ram quota, specifying the `remote` cluster with the https://couchbase.sh/docs/#_the_clusters_flag[--clusters] flag.
Since the value for the ram quote is returned in bytes from `buckets` we convert it to MiB by dividing by nushell's 1MB https://www.nushell.sh/book/types_of_data.html#file-sizes[FileSize] datatype.
Since the value for the ram quota is returned in bytes from `buckets` we convert it to MiB by dividing by Nushell's 1MB https://www.nushell.sh/book/types_of_data.html#file-sizes[FileSize] datatype.
We can check that this has worked by running the `buckets` command against the remote cluster:
[options="nowrap"]
2 changes: 1 addition & 1 deletion docs/recipes/similarity_search.adoc
@@ -121,5 +121,5 @@ Embedding batch 1/1
```
Here we have done another similarity search using the same index, but our source vector is the result of embedding the phrase "physical exercise".
One important detail to remeber is that the embedding generated from `vector enrich-text` must have the same dimension as those over which the index was created, otherwise `vector search` will return no results.
One important detail to remember is that the embedding generated from `vector enrich-text` must have the same dimension as those over which the index was created, otherwise `vector search` will return no results.
See https://couchbase.sh/docs/#_vector_enrich_text[vector enrich-text] for how to specify the dimension of the generated embeddings.
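A toy in-memory similarity search shows why the dimensions must agree (a stand-in for the real vector index, not cbsh code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def vector_search(index, query, dim):
    """Doc ids ranked by similarity; an off-dimension query matches nothing."""
    if len(query) != dim:
        return []  # mirrors `vector search` returning no results on a mismatch
    return sorted(index, key=lambda doc_id: cosine(index[doc_id], query), reverse=True)

index = {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0]}
print(vector_search(index, [0.9, 0.1], dim=2))       # ranked doc ids
print(vector_search(index, [0.9, 0.1, 0.0], dim=2))  # wrong dimension -> []
```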
6 changes: 3 additions & 3 deletions docs/recipes/simple_rag.adoc
@@ -1,7 +1,7 @@
== Simple RAG

Couchbase Shell's https://couchbase.sh/docs/#_vector_commands[vector commands] along with https://couchbase.sh/docs/#_ask[ask] can be used to implement simple Retrieval Augmented Generation, more commonly known as RAG.
In this process similarity search is used over chunks of a larger body of text to contextualise questions sent to a Large Language model to improve the answers given.
In this process similarity search is used over chunks of a larger body of text to contextualize questions sent to a Large Language Model to improve the answers given.
For this demo we will use a text version of the Couchbase Shell docs as the source text for our chunks of data; we have this stored locally as a text file.
```
@@ -147,7 +147,7 @@ Then we use the question to generate an embedding which we then pipe to https://c
This returns the vector docs with the most semantically similar chunks to our question.
Using the returned doc ids we can use the https://couchbase.sh/docs/#_subdoc_get[subdoc get] command to retrieve the chunks.
These chunks can then be piped directly into `ask` where they will be used to contextualise the question:
These chunks can then be piped directly into `ask` where they will be used to contextualize the question:
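Piping chunks into `ask` effectively prepends them as context to the question. A minimal sketch of that prompt assembly, with an invented template (cbsh's actual prompt wording is not shown here):

```python
def build_rag_prompt(question, chunks):
    """Join retrieved chunks and prepend them as context to the question."""
    context = "\n\n".join(chunks)
    return f"Use only the following context to answer.\n\n{context}\n\nQuestion: {question}"

chunks = ["cbsh is the Couchbase Shell.", "vector search returns the closest chunks."]
print(build_rag_prompt("What is cbsh?", chunks))
```

The LLM then answers from the supplied chunks rather than from its training data alone, which is what makes the answers below more accurate.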
```
👤 Charlie 🏠 remote in ☁️ RagChunks._default._default
@@ -175,4 +175,4 @@ Remember to consult the available flags and options for more customization and f
```
This allows `ask` to produce a much more accurate and informative answer using the context it was given.
Changing the size of the chunks, number og neighbours returned as well as the dimension of the embeddings can all have an impact on the result of RAG, and `cbsh` should help experimenting with these variables quick and easy.
Changing the size of the chunks, the number of neighbors returned, as well as the dimension of the embeddings can all have an impact on the result of RAG, and `cbsh` should make experimenting with these variables quick and easy.
4 changes: 2 additions & 2 deletions docs/release_notes.adoc
@@ -22,7 +22,7 @@

=== 0.75.2 - 2023-04-10

* Updated macos build to not have any dependency on OpenSSL.
* Updated macOS build to not have any dependency on OpenSSL.
* Fixed the release workflow so that Linux release tarballs contain the `cbsh` binary.

=== 0.75.1 - 2023-04-13
@@ -32,7 +32,7 @@ As our versioning continues to track the underlying Nushell minor version this h

* Updated config file to rename `[[cluster]]` to `[[database]]` (`[[cluster]]` will continue to work).
* **Breaking** Updated config file to rename `hostnames` to `connstr` and changed the format to be a string.
* Added support, and detection, for different "cluster types"; Capella and Other. This allows us to modify behaviour based on cluster type.
* Added support, and detection, for different "cluster types"; Capella and Other. This allows us to modify behavior based on cluster type.
* *Breaking* Renamed `clusters health` to `health`.
* *Breaking* Renamed other `clusters ...` commands to `database ...`
* Replaced references to cluster with database.
