WIP clean large files
tpillone committed Oct 16, 2023
1 parent 9c7a7ea commit 6efc517
Showing 3 changed files with 73 additions and 4 deletions.
33 changes: 33 additions & 0 deletions bin/clean_work_files.sh
@@ -0,0 +1,33 @@
#!/bin/bash
# https://raw.githubusercontent.com/SystemsGenetics/GEMmaker/master/bin/clean_work_files.sh
# This script is meant for cleaning any file in a Nextflow work directory.
# The list of files to clean is passed as the first argument ($1) by the
# Nextflow process. This can be done by creating a channel in a process that
# creates files, and merging that channel with a signal from another process
# indicating the files are ready for cleaning.
#
# The cleaning process empties the file and then converts it to a sparse file,
# so it has an actual size of zero but appears to have its original size. The
# access and modify times are kept the same.
files_list="$1"

for file in ${files_list}; do
    # Remove cruft added by Nextflow (list brackets and commas)
    file=$(echo "$file" | perl -p -e 's/[\[,\]]//g')
    if [ -e "$file" ]; then
        # Log some info about the file for debugging purposes
        echo "cleaning $file"
        stat "$file"
        # Get file info: size, access and modify times
        size=$(stat --printf="%s" "$file")
        atime=$(stat --printf="%X" "$file")
        mtime=$(stat --printf="%Y" "$file")

        # Make the file size 0 and set as a sparse file
        > "$file"
        truncate -s "$size" "$file"
        # Reset the timestamps on the file
        touch -a -d "@$atime" "$file"
        touch -m -d "@$mtime" "$file"
    fi
done
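
The empty-then-extend steps in the loop above can be checked in isolation. The following is a minimal sketch, not part of the commit: it assumes GNU coreutils (`stat --printf`, `truncate`, `touch -d @EPOCH`) and uses a hypothetical scratch file from `mktemp`. It applies the same steps and reports whether the apparent size and modify time survive, and how many blocks remain allocated.

```shell
#!/bin/bash
# Sketch: verify the "empty, then re-extend as sparse" technique on a scratch file.
# Assumes GNU coreutils; the scratch file is hypothetical and removed on exit.
set -euo pipefail

tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT

# Write 1 MiB of real data so the file occupies disk blocks.
head -c 1048576 /dev/urandom > "$tmp"

# Record size, access time and modify time before cleaning.
size=$(stat --printf="%s" "$tmp")
atime=$(stat --printf="%X" "$tmp")
mtime=$(stat --printf="%Y" "$tmp")

# Same steps as the cleaning loop: empty, extend to original size, restore times.
: > "$tmp"
truncate -s "$size" "$tmp"
touch -a -d "@$atime" "$tmp"
touch -m -d "@$mtime" "$tmp"

new_size=$(stat --printf="%s" "$tmp")
new_mtime=$(stat --printf="%Y" "$tmp")
blocks=$(stat --printf="%b" "$tmp")   # blocks actually allocated on disk

echo "apparent size: $new_size (was $size)"
echo "mtime preserved: $([ "$new_mtime" = "$mtime" ] && echo yes || echo no)"
echo "allocated blocks: $blocks"
```

The allocated block count depends on the filesystem, so the sketch prints it rather than assuming a particular value.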
37 changes: 37 additions & 0 deletions modules/local/clean_work.nf
@@ -0,0 +1,37 @@

process clean_work_dirs {
    input:
    tuple val(directory)

    output:
    val(1), emit: IS_CLEAN

    script:
    """
    for dir in ${directory}; do
        if [ -e \$dir ]; then
            echo "Cleaning: \$dir"
            files=`find \$dir -type f`
            echo "Files to clean: \$files"
            clean_work_files.sh "\$files" "null"
        fi
    done
    """
}

process clean_work_files {

    cache 'lenient'

    input:
    val(file)

    output:
    val(1), emit: IS_CLEAN

    script:
    """
    clean_work_files.sh "${file}"
    """
}
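
When `file` is a list value, Nextflow interpolates it into the script using Groovy's list rendering, so the shell script receives a single string such as `[a, b]`. The sketch below, with hypothetical paths, shows how word splitting plus stripping brackets and commas turns that string back into individual file names. It uses `tr` where the script uses `perl`; this is an illustrative substitution, not the commit's code.

```shell
#!/bin/bash
# Sketch: turn a Groovy-rendered list into individual paths.
# The paths are hypothetical and are not expected to exist.
files_list="[/work/ab/123/sample1.bam, /work/cd/456/sample2.bam]"

cleaned=()
# Unquoted on purpose: word splitting yields one bracketed/comma-suffixed word per path.
for file in ${files_list}; do
    # Strip the brackets and commas that the list rendering adds.
    file=$(printf '%s' "$file" | tr -d '[],')
    cleaned+=("$file")
done

printf '%s\n' "${cleaned[@]}"
```

Because the loop relies on word splitting, paths containing spaces would be broken into several words; the approach assumes work-directory paths without whitespace.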
