Releases: COMBINE-lab/piscem

piscem v0.7.1

03 Feb 18:56

This release adds the ability to (optionally) provide a seed parameter for SSHash construction. In rare situations, SSHash construction can fail due to empty buckets in the skew index (this is a technical detail you need not be concerned with as a user). Usually, this issue can be resolved by just attempting to build the index again with a different seed. The --seed option to piscem build will allow you to set the seed used in construction (the default seed is 1).
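The retry-with-a-new-seed advice above can be scripted. The following is a minimal sketch, not part of piscem: the helper name `build_with_seed_retry` is hypothetical, `base_cmd` is whatever `piscem build` invocation you already use, and it assumes that a failed construction is reported through a non-zero exit code.

```python
import subprocess
from typing import Callable, Iterable, Optional, Sequence


def build_with_seed_retry(
    base_cmd: Sequence[str],
    seeds: Iterable[int] = (1, 2, 3),
    run: Callable = subprocess.run,
) -> Optional[int]:
    """Try the build once per seed; return the first seed that succeeds.

    Hypothetical helper. Assumes a failed build exits with a non-zero
    status. `base_cmd` is the caller's own `piscem build ...` argument
    list; only the `--seed` option from this release is appended.
    """
    for seed in seeds:
        cmd = [*base_cmd, "--seed", str(seed)]
        if run(cmd).returncode == 0:
            return seed
    return None  # every seed failed
```

Passing `run` as a parameter keeps the helper easy to dry-run without invoking piscem itself.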

piscem v0.7.0

19 Dec 01:46

This release of piscem adds the ability to index decoy sequences using the "distinguishing flanking k-mer" methodology described in Hjörleifsson and Sullivan et al. [1]. This variant of considering decoy sequences is optimized to work with pseudoalignment and pseudoalignment-like approaches where alignment scores are unavailable (unlike the approach of [2], which is designed to work with selective-alignment).

The implementation in piscem adopts the terminology of "poison" k-mers: the decoy sequence is used to create a separate table of poison k-mers whose presence will cause a read to be discarded, rather than to map to some target in the index. Poison k-mers are simply distinguishing flanking k-mers that belong to some decoy sequence, and hence their presence in a mapping should "poison" the mapping (i.e., lead to it being discarded).
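As a toy illustration of the filtering idea only, the check can be pictured as a set lookup over a read's k-mers. This sketch is not piscem's implementation: it uses a plain Python set instead of the poison table, ignores canonical k-mers and reverse complements, and does not model how distinguishing flanking k-mers are selected. The function names are hypothetical.

```python
from typing import Iterator, Set


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every length-k substring of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def is_poisoned(read: str, poison_kmers: Set[str], k: int) -> bool:
    """Return True if any k-mer of the read is in the poison set.

    Simplification: a real implementation would canonicalize k-mers and
    use a compact table rather than a Python set.
    """
    return any(km in poison_kmers for km in kmers(read, k))
```

A read for which `is_poisoned` returns True would be discarded rather than reported as mapping to a target.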

To build a decoy-aware index, one simply passes the --decoy-paths argument to piscem build. This accepts a comma-separated list of FASTA files that will be used to generate the poison k-mer set. This will create a separate data structure (the poison table) that will be used to filter fragments that are potentially mapped spuriously to the index.
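If the decoy FASTA files are collected programmatically, the comma-separated value can be assembled as below. This is a small sketch with a hypothetical helper name; it only formats the --decoy-paths argument described above and leaves the rest of the piscem build command line to the caller.

```python
from os import PathLike
from typing import Iterable, List, Union


def decoy_args(fasta_paths: Iterable[Union[str, PathLike]]) -> List[str]:
    """Format decoy FASTA paths as a `--decoy-paths a.fa,b.fa` argument pair.

    Hypothetical helper. Returns an empty list when no decoys are given,
    and rejects paths containing a comma because the comma is the
    list separator.
    """
    paths = [str(p) for p in fasta_paths]
    if not paths:
        return []
    for p in paths:
        if "," in p:
            raise ValueError(f"path contains a comma: {p!r}")
    return ["--decoy-paths", ",".join(paths)]
```

The returned list can be appended to an existing argument list for piscem build.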

Likewise, when performing mapping, if a poison table has been built, it will be used by default. However, you can pass the --no-poison flag to map-bulk and map-sc to avoid considering poison k-mers, even if the index was constructed with a poison table.
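For scripted pipelines, the opt-out can be expressed as a small toggle. The helper below is hypothetical and assumes the caller supplies the rest of the map-bulk or map-sc argument list; it only appends the --no-poison flag described above.

```python
from typing import List, Sequence


def with_poison_setting(base_cmd: Sequence[str], use_poison: bool = True) -> List[str]:
    """Return a copy of a mapping command, adding --no-poison if requested.

    Hypothetical helper. The poison table is used by default, so nothing
    is added when use_poison is True.
    """
    if not any(sub in base_cmd for sub in ("map-bulk", "map-sc")):
        raise ValueError("--no-poison applies to map-bulk and map-sc only")
    cmd = list(base_cmd)
    if not use_poison and "--no-poison" not in cmd:
        cmd.append("--no-poison")
    return cmd
```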

  1. Eldjárn Hjörleifsson, Kristján, et al. "Accurate quantification of single-nucleus and single-cell RNA-seq transcripts." bioRxiv (2022): 2022-12.

  2. Srivastava, Avi, et al. "Alignment and mapping methodology influence transcript abundance estimation." Genome biology 21.1 (2020): 1-29.

piscem v0.7.0-beta

03 Oct 00:59
version 0.7.0-beta

piscem v0.6.3

28 Aug 05:33

Fixes a silly bug in the streaming iterator used for mapping that could sometimes cause a k-mer lookup to fail. Observed differences should be very small, but this fix can lead to more precise (i.e. better) mappings.

piscem v0.6.2

24 Aug 03:27

Expands the fix from v0.6.1 to cover the alternative (non-shared minimizer) streaming query.

piscem v0.6.1

11 Aug 15:44

This release fixes a bug in the streaming query that could, in rare circumstances, result in a segmentation fault (coming from the C++ piscem-cpp library). The corresponding issue was tracked here for those interested, and is resolved in this release.

piscem v0.6.0

12 Apr 04:30

This version fixes an issue in the streaming canonical query class that is used to speed up index search for streaming queries. There were specific variables tracking state that were not always properly reset. As a result, a small number of queries could return invalid results (usually causing the corresponding reads not to map). This commit addresses this issue. The effect was quite rare (occurring in ~0.01-0.02% of reads), but the streaming query results now always match the results one would get without using the streaming optimization.

piscem v0.5.1

10 Apr 20:11

This release fixes an issue related to the construction of the index on very small references (or with very small values of k and m) when the number of threads used for construction was set to a large number.

The specific fix limits the number of threads used in minimizer index construction (the first step of SSHash construction) to 8, matching upstream sshash. This avoids some crashes observed when building the index on very small reference sets or with very small k / m.
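The effect of the cap can be pictured as a simple clamp on the requested thread count. This is an illustrative sketch only: the names are hypothetical, and the actual limit is applied inside the index construction code rather than in user scripts.

```python
# Cap stated in the release note for minimizer index construction.
MAX_MINIMIZER_THREADS = 8


def minimizer_build_threads(requested: int) -> int:
    """Clamp a requested thread count to the minimizer-construction cap."""
    if requested < 1:
        raise ValueError("thread count must be at least 1")
    return min(requested, MAX_MINIMIZER_THREADS)
```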

piscem v0.5.0

10 Apr 04:43
7443ab2
piscem v0.5.0

piscem v0.4.3

13 Feb 19:18
1ca7996
piscem v0.4.3