We need to be able to use parameterised insert queries to populate a table during migration #1292

OracPrime · 2024-10-29T16:35:38Z

Description

It is currently quite messy to use parameterised inserts. If you naively use pgm.sql with $1 style parameters you get an error message related to regular expressions which is somewhat impenetrable if your not expecting mustache templating. If you use mustache templating, you are open to injection attacks. The only solution I can find is using pg-format for build the string (and I'm a bit worried if the result of that itself contains, for example $1.

Alternative

pg-format is the only one I'm aware of.

Additional context

At minimum the regex error should be decorated with some explanation?

Shinigami92 · 2024-10-29T18:56:40Z

Please note that we don't really need to be careful/sanitize parameters, because you should not use node-pg-migrate at runtime, but only for migration progresses.

That aside, what would you propose as an API?

And how does a current use-case look like right now (rough example).

OracPrime · 2024-10-30T00:16:44Z

The current use case if for populating a static lookup table. I'd agree in theory you're only using your own data, so you should feel slightly safe, but in an open source environment an innocuous update of a data file could become an injection attack, so I want to be paranoid.

I think my proposed api is to have a function that looks exactly like sql but which doesn't mustache.

Current use case it this (previously insert with $1... $5)

	async up(pgm) {
		pgm.createTable("uk_stations", {
			station_id: { notNull: true, primaryKey: true, type: "bigserial" },
			station_name: { notNull: true, type: "text" },
			sixteen_character_name: { notNull: true, type: "text" },
			longitude: { notNull: true, type: "float8" },
			latitude: { notNull: true, type: "float8" },
			crs_code: { notNull: true, type: "char(3)" },
		});

		const filePath = path.join(__dirname, "../ukStations.csv");
		const fileContent = fs.readFileSync(filePath, "utf8");
		const records = parse(fileContent, {
			columns: true,
			skip_empty_lines: true,
		});

		// Prepare data for bulk insertion
		const values = records.map(
			({
				station_name,
				sixteen_character_name,
				latitude,
				longitude,
				crs_code,
			}) => [
				station_name,
				sixteen_character_name,
				latitude,
				longitude,
				crs_code,
			],
		);

		const query = format(
			`INSERT INTO uk_stations (station_name, sixteen_character_name, latitude, longitude, crs_code) VALUES %L`,
			values,
		);
		pgm.sql(query);
	},

Shinigami92 · 2024-10-30T14:41:21Z

Just me thinking out loud: Not sure if we should provide a insert (and then update and delete) statement, because these are not really anymore migrations but populations 🤔
Currently I would suggest to use node-pg-migration as a migration tool to have your DB versioned and use other tools (potentially knex) for population after the migration was executed.

However, I even know from myself in company work that there is best practice and real world scenario 🤣 ...

So maybe we could add just the absolutely simplest form of such functions that in the end will more or less just to what you already do yourself with the format function.

pgm.insert('uk_station', {
  station_name: '...',
  // data for one station...
});

pgm.insertBulk('uk_station', [
  {
    station_name: '...',
    // data for one station...
  },
  // ...
]);

I have not thought that fully out yet, but I would be okay with it if you explore an initial draft PR on your own with that info and then I can review and see where it is going.
Sounds good?

firelizzard18 · 2024-11-06T18:33:47Z

@Shinigami92 Some kinds of data are part of the schema, or in other words some tables are pointless without data. If I have an entity such as book and another entity to classify the first such as mediaType (e.g. paper, ebook, audiobook, etc), the class entity is effectively part of the schema IMO and should be populated by the migration. pgm.insert would work for my purposes.

Shinigami92 · 2024-11-06T18:38:04Z

@Shinigami92 Some kinds of data are part of the schema, or in other words some tables are pointless without data. If I have an entity such as book and another entity to classify the first such as mediaType (e.g. paper, ebook, audiobook, etc), the class entity is effectively part of the schema IMO and should be populated by the migration. pgm.insert would work for my purposes.

Do you know https://www.postgresql.org/docs/current/datatype-enum.html ?

firelizzard18 · 2024-11-06T18:40:31Z

Do you know https://www.postgresql.org/docs/current/datatype-enum.html ?

I was not aware of that. If I were designing a new schema I might use that, but I've inherited an existing schema and reworking it is not an option at this time. That being said, I often like to have metadata attached to my classes/enumerations.

Shinigami92 · 2024-11-06T18:45:04Z

As I said, I'm not against such an implementation, because I know how real-life and company work is. However, some real-talk... I'm currently in a mental health breakdown (partly because of today...).
So I reduce currently as much as possible to work for others if I do not benefit from it myself.
But what we can do here, is to work in a community and someone of the persons in this issue can start a PR (implementation, documentation and tests), and I can review and help from there. I hope that sounds fair.

firelizzard18 · 2024-11-06T18:52:07Z

That's certainly fair, I am not entitled to your free labor. If this becomes a blocker for me I will open a PR.

OracPrime added the s: pending triage Pending Triage label Oct 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

We need to be able to use parameterised insert queries to populate a table during migration #1292

We need to be able to use parameterised insert queries to populate a table during migration #1292

OracPrime commented Oct 29, 2024

Shinigami92 commented Oct 29, 2024

OracPrime commented Oct 30, 2024

Shinigami92 commented Oct 30, 2024

firelizzard18 commented Nov 6, 2024

Shinigami92 commented Nov 6, 2024

firelizzard18 commented Nov 6, 2024

Shinigami92 commented Nov 6, 2024

firelizzard18 commented Nov 6, 2024

We need to be able to use parameterised insert queries to populate a table during migration #1292

We need to be able to use parameterised insert queries to populate a table during migration #1292

Comments

OracPrime commented Oct 29, 2024

Description

Suggested solution

Alternative

Additional context

Shinigami92 commented Oct 29, 2024

OracPrime commented Oct 30, 2024

Shinigami92 commented Oct 30, 2024

firelizzard18 commented Nov 6, 2024

Shinigami92 commented Nov 6, 2024

firelizzard18 commented Nov 6, 2024

Shinigami92 commented Nov 6, 2024

firelizzard18 commented Nov 6, 2024