[RFC] Better Data Types #678

arthurschreiber · 2018-01-10T08:32:33Z

inspired by issues like #608, #675, #487, #474, #163, #339, #1058 and comments made by @chdh over at #674, I want to propose the following changes to the data type system to make it more robust and extensible.

These will be breaking changes, so I'm trying to collect some feedback first. 😅

What's currently wrong with data types?

Many (but not all) of the data types supported in SQL Server do not map 1:1 to JavaScript types. This causes a lot of friction in the form of precision loss or conversion failures. E.g. numeric/decimal (depending on their precision and scale), bigint, and many of the date and time formats, cannot be converted to/from JavaScript without a loss of precision or information.

There are JavaScript libraries (big-number, bigdecimal, moment, js-joda), that implement custom versions of compatible types in JavaScript itself, but I don't want to add these libraries as dependencies of tedious and "force" our users to use them, especially because there are often many competing libraries for the same use-case, and it's not always clear which one is the "best".

How do we fix this?

I want data types to be pluggable with custom encoder/decoder functions. A encoder function would be responsible for converting a JavaScript object into it's TDS binary representation, and the decoder function would be responsible for converting the binary representation back into a JavaScript object.

tedious would ship with a default set of encoder and decoder functions. Data Types which can be mapped to JavaScript types directly would be fully supported out-of the box, while all other data types will come with default implementations that will encode from and decode to plain JavaScript strings. This would ensure that tedious can work with all data-types out of the box, without loss of information.

Separate plug-in libraries can then provide more specific conversion functions.

Fantasy API

Here's a very rought example of how this could look like:

const tedious = require('tedious');
const BigInt = tedious.TYPES.BigInt;
const BigInteger = require("big-integer");

const MAX_BIGINT = new BigInteger('9223372036854775807')
const MIN_BIGINT = new BigInteger('-9223372036854775808')

tedious.TYPES.BigInt.defineEncoder(function(value: any) : Buffer {
  if (!(value instanceof BigInteger) || !value.isFinite() || value.lt(MIN_BIGINT) || value.gt(MAX_BIGINT)) {
    // The given value is either not a BigInteger or out of bounds.
    throw new tedious.DataTypeError();
  }

  const result = new Buffer.alloc(8);
  // ... write the value to the buffer
  return result;
});

BigInt.defineDecoder(function(data: Buffer) {
  // ... convert the given buffer back to a BigInt and return it here
});

Other changes

While we're at it, I'd like to propose another change to how the data types API works and is used. I think we should no longer pass data type conversion errors to the Request callback, but instead throw the error right when request.addParameter is called. This would improve the error stack trace (it will point to exactly where the incorrect data type value was passed) and would allow us to clean some of the tedious internals to not have to hold onto these errors. It would also allow us to change the internal storage of parameters - we'd no longer have to store the original value for the parameter, just the Buffer result of the conversion. This'd allow V8 to perform better optimizations, as we'd move from a data structure with a megamorphic value property to one where value is always a buffer.

The text was updated successfully, but these errors were encountered:

rossipedia · 2018-01-10T17:12:57Z

(nitpick)

const result = new Buffer.alloc(8);

should probably be:

const result = Buffer.alloc(8);

(carry on)

jimmont · 2018-05-24T02:23:03Z

Given the little I see it seems to make more sense to limit scope and try mapping many types to the fewer supported in JavaScript (rather than the opposite: supporting more types and features). The statement "cannot be converted to/from JavaScript without a loss of precision or information" would seem to suggest more time and effort than is currently expended on this project and version. Since there are 104 open issues some more than 2 years old, 21 open pull requests and 46 branches I'd suggest time constraints and quality as more relevant to this project.

ferk6a · 2019-02-12T15:31:59Z

Is this being worked on? This is a very important issue for projects involving decimal values, and I was thinking about starting a prototype for this RFC.

ethankale · 2020-04-20T22:17:43Z

An example of where this would help is in the datetimeoffset type, which AFAIK right now just uses the local timezone of the server when fed a Javascript date... which completely defeats the point of having a datetimeoffset value in the first place.

KAMAELUA · 2021-10-12T13:17:36Z

@IanChokS any updates so far?

MichaelSun90 · 2021-10-12T18:38:28Z

Hi @KAMAELUA , Unfortunately, there is no progress for now. The task is currently in the backlog and there is no one working on it yet.

ephys · 2022-09-24T18:00:04Z

This feature would be absolutely lovely. we're working on a full rewrite of our Data Types in Sequelize (sequelize/sequelize#14505) and the way Tedious currently represents some values in JS leads to unavoidable precision loss:

datetimeoffset (and friends), and time are parsed as a JS Date which discards precision past the millisecond (precision 3, SQL server supports precision up to 7)
decimal is parsed as a JS number. Retrieving 9007199254740993 from the db results in 9007199254740994
date (with no time) is also parsed as a JS Date. It doesn't lead to a loss of precision but it still creates an intermediary object we always convert back to string (because we don't believe Date is appropriate for dateonly, and we're preparing the arrival of Temporal).

We'd prefer to receive strings for all of the above types. Being able to customize their parsing would enable us to fix this on our end.

trexshw · 2022-11-30T18:18:45Z

This feature would be absolutely lovely. we're working on a full rewrite of our Data Types in Sequelize (sequelize/sequelize#14505) and the current way Tedious represents some values in JS leads to unavoidable precision loss:

datetimeoffset (and friends), and time are parsed as a JS Date which discards precision past the millisecond (precision 3, SQL server supports precision up to 7)

decimal is parsed a JS number. Retrieving 9007199254740993 from the db results in 9007199254740994

date (with no time) is also parsed as a JS Date. It doesn't lead to a loss of precision but it still creates an intermediary object we always convert back to string (because we don't believe Date is appropriate for dateonly, and we're preparing the arrival of Temporal).

We'd prefer to receive strings for all of the above types. Being able to customize their parsing would enable us to fix this on our end.

To ride on this, would be great if tedious can provide an option to allow column returning string.
The javascript precision problem is painful and even if we want to adopt an arbitrary-precision arithmetic library like BigNumber.js, it is not possible since the value retrieved from database is already losing precision.

khkiley · 2024-01-04T20:55:59Z

To simplify this, could it be possible to just have an option to treat certain SQL types as binary, and just return the raw response for that column in a buffer ? We could then parse the buffer.

A pluggable encoder/decoder system would be great but looks like a lot of work, and it looks like that project has stalled.

Also, encoding isn't as important as decoding, SQL server does a pretty good job of casting strings to the required type when needed. I'm happy with casting to strings as needed when sending data to SQL server.

If all my queries were SELECTS I'd cast the columns to binary (or string) during the select, but many result sets I deal with come from stored procedures which I don't have control over.

asage-me mentioned this issue Jan 11, 2018

Tedious driver changes tediousjs/node-mssql#594

Closed

Suraiya-Hameed mentioned this issue Apr 19, 2018

Feat: option to parse decimal and numeric types as string #732

Closed

Suraiya-Hameed mentioned this issue May 11, 2018

TediousType.Money saving floats incorrectly #741

Open

Suraiya-Hameed mentioned this issue Jul 3, 2018

Decimal conversion in value-parser.js has problems with DECIMAL(38,20) #760

Open

Hadis-Knj added feature-request discussion labels Jul 20, 2018

willmorgan mentioned this issue Nov 21, 2018

Difference between mssql result and actual query result tediousjs/node-mssql#638

Closed

fennibay mentioned this issue Mar 19, 2019

String support for Decimal and Numeric values #163

Open

IanChokS added Follow up Recent question asked by tedious members for old issues or discussion that requires member input Action Required An action is needed by Tedious member Response needed Response by tedious member is needed labels Dec 13, 2019

IanChokS mentioned this issue Dec 18, 2019

Inserts into varbinary(max) col with FILESTREAM or reads from that column takes 10's of seconds #971

Closed

IanChokS mentioned this issue Feb 26, 2020

writeParameterData with a large scale value won't write correct buffer value #1058

Open

nibblesnbits mentioned this issue Jun 10, 2020

Include timezone offset in return value from readDateTimeOffset #1107

Open

IanChokS added this to the Refactor Better Data Types milestone Jun 23, 2020

MichaelSun90 mentioned this issue Aug 17, 2022

UDT parameter for requests #487

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Better Data Types #678

[RFC] Better Data Types #678

arthurschreiber commented Jan 10, 2018 •

edited by IanChokS

Loading

rossipedia commented Jan 10, 2018

jimmont commented May 24, 2018

ferk6a commented Feb 12, 2019

ethankale commented Apr 20, 2020

KAMAELUA commented Oct 12, 2021

MichaelSun90 commented Oct 12, 2021

ephys commented Sep 24, 2022 •

edited

Loading

trexshw commented Nov 30, 2022 •

edited

Loading

khkiley commented Jan 4, 2024 •

edited

Loading

[RFC] Better Data Types #678

[RFC] Better Data Types #678

Comments

arthurschreiber commented Jan 10, 2018 • edited by IanChokS Loading

What's currently wrong with data types?

How do we fix this?

Fantasy API

Other changes

rossipedia commented Jan 10, 2018

jimmont commented May 24, 2018

ferk6a commented Feb 12, 2019

ethankale commented Apr 20, 2020

KAMAELUA commented Oct 12, 2021

MichaelSun90 commented Oct 12, 2021

ephys commented Sep 24, 2022 • edited Loading

trexshw commented Nov 30, 2022 • edited Loading

khkiley commented Jan 4, 2024 • edited Loading

arthurschreiber commented Jan 10, 2018 •

edited by IanChokS

Loading

ephys commented Sep 24, 2022 •

edited

Loading

trexshw commented Nov 30, 2022 •

edited

Loading

khkiley commented Jan 4, 2024 •

edited

Loading