Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Better Data Types #678

Open
arthurschreiber opened this issue Jan 10, 2018 · 9 comments
Open

[RFC] Better Data Types #678

arthurschreiber opened this issue Jan 10, 2018 · 9 comments
Labels
Action Required An action is needed by Tedious member discussion feature-request Follow up Recent question asked by tedious members for old issues or discussion that requires member input Response needed Response by tedious member is needed

Comments

@arthurschreiber
Copy link
Collaborator

arthurschreiber commented Jan 10, 2018

inspired by issues like #608, #675, #487, #474, #163, #339, #1058 and comments made by @chdh over at #674, I want to propose the following changes to the data type system to make it more robust and extensible.

These will be breaking changes, so I'm trying to collect some feedback first. 😅

What's currently wrong with data types?

Many (but not all) of the data types supported in SQL Server do not map 1:1 to JavaScript types. This causes a lot of friction in the form of precision loss or conversion failures. E.g. numeric/decimal (depending on their precision and scale), bigint, and many of the date and time formats, cannot be converted to/from JavaScript without a loss of precision or information.

There are JavaScript libraries (big-number, bigdecimal, moment, js-joda), that implement custom versions of compatible types in JavaScript itself, but I don't want to add these libraries as dependencies of tedious and "force" our users to use them, especially because there are often many competing libraries for the same use-case, and it's not always clear which one is the "best".

How do we fix this?

I want data types to be pluggable with custom encoder/decoder functions. A encoder function would be responsible for converting a JavaScript object into it's TDS binary representation, and the decoder function would be responsible for converting the binary representation back into a JavaScript object.

tedious would ship with a default set of encoder and decoder functions. Data Types which can be mapped to JavaScript types directly would be fully supported out-of the box, while all other data types will come with default implementations that will encode from and decode to plain JavaScript strings. This would ensure that tedious can work with all data-types out of the box, without loss of information.

Separate plug-in libraries can then provide more specific conversion functions.

Fantasy API

Here's a very rought example of how this could look like:

const tedious = require('tedious');
const BigInt = tedious.TYPES.BigInt;
const BigInteger = require("big-integer");

const MAX_BIGINT = new BigInteger('9223372036854775807')
const MIN_BIGINT = new BigInteger('-9223372036854775808')

tedious.TYPES.BigInt.defineEncoder(function(value: any) : Buffer {
  if (!(value instanceof BigInteger) || !value.isFinite() || value.lt(MIN_BIGINT) || value.gt(MAX_BIGINT)) {
    // The given value is either not a BigInteger or out of bounds.
    throw new tedious.DataTypeError();
  }

  const result = new Buffer.alloc(8);
  // ... write the value to the buffer
  return result;
});

BigInt.defineDecoder(function(data: Buffer) {
  // ... convert the given buffer back to a BigInt and return it here
});

Other changes

While we're at it, I'd like to propose another change to how the data types API works and is used. I think we should no longer pass data type conversion errors to the Request callback, but instead throw the error right when request.addParameter is called. This would improve the error stack trace (it will point to exactly where the incorrect data type value was passed) and would allow us to clean some of the tedious internals to not have to hold onto these errors. It would also allow us to change the internal storage of parameters - we'd no longer have to store the original value for the parameter, just the Buffer result of the conversion. This'd allow V8 to perform better optimizations, as we'd move from a data structure with a megamorphic value property to one where value is always a buffer.

@rossipedia
Copy link
Contributor

(nitpick)

const result = new Buffer.alloc(8);

should probably be:

const result = Buffer.alloc(8);

(carry on)

@jimmont
Copy link

jimmont commented May 24, 2018

Given the little I see it seems to make more sense to limit scope and try mapping many types to the fewer supported in JavaScript (rather than the opposite: supporting more types and features). The statement "cannot be converted to/from JavaScript without a loss of precision or information" would seem to suggest more time and effort than is currently expended on this project and version. Since there are 104 open issues some more than 2 years old, 21 open pull requests and 46 branches I'd suggest time constraints and quality as more relevant to this project.

@ferk6a
Copy link
Contributor

ferk6a commented Feb 12, 2019

Is this being worked on? This is a very important issue for projects involving decimal values, and I was thinking about starting a prototype for this RFC.

@IanChokS IanChokS added Follow up Recent question asked by tedious members for old issues or discussion that requires member input Action Required An action is needed by Tedious member Response needed Response by tedious member is needed labels Dec 13, 2019
@ethankale
Copy link

An example of where this would help is in the datetimeoffset type, which AFAIK right now just uses the local timezone of the server when fed a Javascript date... which completely defeats the point of having a datetimeoffset value in the first place.

@KAMAELUA
Copy link

@IanChokS any updates so far?

@MichaelSun90
Copy link
Contributor

Hi @KAMAELUA , Unfortunately, there is no progress for now. The task is currently in the backlog and there is no one working on it yet.

@ephys
Copy link

ephys commented Sep 24, 2022

This feature would be absolutely lovely. we're working on a full rewrite of our Data Types in Sequelize (sequelize/sequelize#14505) and the way Tedious currently represents some values in JS leads to unavoidable precision loss:

  • datetimeoffset (and friends), and time are parsed as a JS Date which discards precision past the millisecond (precision 3, SQL server supports precision up to 7)
  • decimal is parsed as a JS number. Retrieving 9007199254740993 from the db results in 9007199254740994
  • date (with no time) is also parsed as a JS Date. It doesn't lead to a loss of precision but it still creates an intermediary object we always convert back to string (because we don't believe Date is appropriate for dateonly, and we're preparing the arrival of Temporal).

We'd prefer to receive strings for all of the above types. Being able to customize their parsing would enable us to fix this on our end.

@trexshw
Copy link

trexshw commented Nov 30, 2022

This feature would be absolutely lovely. we're working on a full rewrite of our Data Types in Sequelize (sequelize/sequelize#14505) and the current way Tedious represents some values in JS leads to unavoidable precision loss:

  • datetimeoffset (and friends), and time are parsed as a JS Date which discards precision past the millisecond (precision 3, SQL server supports precision up to 7)
  • decimal is parsed a JS number. Retrieving 9007199254740993 from the db results in 9007199254740994
  • date (with no time) is also parsed as a JS Date. It doesn't lead to a loss of precision but it still creates an intermediary object we always convert back to string (because we don't believe Date is appropriate for dateonly, and we're preparing the arrival of Temporal).

We'd prefer to receive strings for all of the above types. Being able to customize their parsing would enable us to fix this on our end.

To ride on this, would be great if tedious can provide an option to allow column returning string.
The javascript precision problem is painful and even if we want to adopt an arbitrary-precision arithmetic library like BigNumber.js, it is not possible since the value retrieved from database is already losing precision.

@khkiley
Copy link

khkiley commented Jan 4, 2024

To simplify this, could it be possible to just have an option to treat certain SQL types as binary, and just return the raw response for that column in a buffer ? We could then parse the buffer.

A pluggable encoder/decoder system would be great but looks like a lot of work, and it looks like that project has stalled.

Also, encoding isn't as important as decoding, SQL server does a pretty good job of casting strings to the required type when needed. I'm happy with casting to strings as needed when sending data to SQL server.

If all my queries were SELECTS I'd cast the columns to binary (or string) during the select, but many result sets I deal with come from stored procedures which I don't have control over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Action Required An action is needed by Tedious member discussion feature-request Follow up Recent question asked by tedious members for old issues or discussion that requires member input Response needed Response by tedious member is needed
Projects
None yet
Development

No branches or pull requests