Commit
Fix lbann half issues (elemental#93)
* tick up version to 1.3.3
* fix issues with CPU half
* rework the copy interface
* work on Copy
* add a warmup run to the gemm test
* Various updates to copy dispatch; no more ETI for Copy
* fix an issue with the cuda half type's assignment operators
* make gpu_half_type assignment operators into templates
* Update include/El/blas_like/level1/CopyLocal.hpp
  Co-Authored-By: Tim Moon <[email protected]>
* remove some debugging output
* add decls for BaseDistMatrix copy and copyasync
* make gpu_half_type streamable
* be a little more clever about casting to __half
* fix some things
* add overloads of sqrt and pow for half types
* add unary minus for gpu half type
* fix an issue where NVCC decided that rvalue references have value semantics
* add an overload of Log for gpu_half_type
* add exception-throwing bitwise operators for half. This is to appease Aluminum at compile-time and should never be encountered IRL.
* add a bunch of missing library symbols
* add a bunch of transcendental functions, etc, for half types
* fix some missing symbols when compiling without half support
* Fixes to instantiate Read/Write with gpu half type
* Add overload for instantiate with half types
* add a few missing symbols
* add a write impl for gpu half matrices
* temp: dispatch gemv through Gemm for __half
* Fix the GEMV as GEMM call
* temporary error-throw for unhandled case
* remove the incy != 1 gemv case
* patch around an issue in the Half library
* expose AbstractMatrix interface to Print()

Co-authored-by: Tim Moon <[email protected]>
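
For readers unfamiliar with the "gemv through Gemm" item, the general idea can be sketched as follows. This is only an illustration of the technique, not the code from this commit: `naive_gemm` and `gemv_via_gemm` are hypothetical names, the storage is assumed to be column-major with no transposes, and both vectors are assumed to have unit stride (in line with the removal of the `incy != 1` case).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference GEMM (not the library's API):
// C = alpha*A*B + beta*C, column-major, no transposes.
// A is m x k, B is k x n, C is m x n.
template <typename T>
void naive_gemm(std::size_t m, std::size_t n, std::size_t k,
                T alpha, const T* A, std::size_t lda,
                const T* B, std::size_t ldb,
                T beta, T* C, std::size_t ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (std::size_t j = 0; j < n; ++j)
    {
        for (std::size_t i = 0; i < m; ++i)
        {
            T acc = T(0);
            for (std::size_t p = 0; p < k; ++p)
                acc += A[i + p*lda] * B[p + j*ldb];
            C[i + j*ldc] = alpha*acc + beta*C[i + j*ldc];
        }
    }
}

// y = alpha*A*x + beta*y with unit-stride x and y, expressed as a GEMM
// in which x is viewed as a k x 1 matrix and y as an m x 1 matrix.
template <typename T>
void gemv_via_gemm(std::size_t m, std::size_t k,
                   T alpha, const T* A, std::size_t lda,
                   const T* x, T beta, T* y)
{
    naive_gemm(m, std::size_t(1), k, alpha, A, lda, x, k, beta, y, m);
}
```

Viewing a vector as a single-column matrix only works directly when its elements are contiguous, which is presumably why a non-unit `incy` would need separate handling.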