Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abstract base classes high max_length value #1477

Closed
koeaw opened this issue Dec 4, 2024 · 14 comments
Closed

Abstract base classes high max_length value #1477

koeaw opened this issue Dec 4, 2024 · 14 comments
Labels
app-entities Anything to do with entities – the models central to an APIS app documentation Improvements of or additions to documentation – project docs, docstrings, code comments

Comments

@koeaw
Copy link
Contributor

koeaw commented Dec 4, 2024

I noticed the CharFields in our abstract base classes provide unusually high max_length values of 4096 when I seemed to remember 255 to be the default. Example:

class E21_Person(models.Model):
forename = models.CharField(blank=True, default="", max_length=4096)
surname = models.CharField(blank=True, default="", max_length=4096)
gender = models.CharField(blank=True, default="", max_length=4096)

According to the Django docs on CharField and Character fields, the value is fine for Postgres, which we use, but could become a problem for MySQL users. (The MySQL docs suggest VARCHAR values exceeding the limit are truncated.)

AFAIK we don't caution against or rule out using MySQL, so if we want our base classes to be more DB agnostic/inclusive of MySQL, we should probably reduce the max_lengths to 255. Alternatively, a general recommendation for Postgres over other databases + stating that we specifically develop with Postgres in mind (for which we could cite the base classes as example) would be helpful.

@koeaw koeaw added documentation Improvements of or additions to documentation – project docs, docstrings, code comments app-entities Anything to do with entities – the models central to an APIS app labels Dec 4, 2024
@b1rger
Copy link
Contributor

b1rger commented Dec 4, 2024

which default was 255?

@b1rger
Copy link
Contributor

b1rger commented Dec 5, 2024

Could you please specify which part of the documentation you refer to in this sentence:

According to the Django docs on CharField and Character fields, the value is fine for Postgres, which we use, but could become a problem for MySQL users.

especially in regards to which parts lead you to the conclusion that this "could become a problem for MySQL users"?

@koeaw
Copy link
Contributor Author

koeaw commented Dec 9, 2024

which default was 255?

Sorry, imprecise wording. Substitute "typically used".

255 is the number I had memorised as (max) safe value for max_length for CharField. The use of 4096 in our classes confused me so I checked if I'd assumed incorrectly for ages or if 255 was outdated info. A search for the number in our repo just now turns up results which match the image I had in my head.

Re: could become a problem:

Frst sentence from the "Character fields" section I linked earlier:

Any fields that are stored with VARCHAR column types may have their max_length restricted to 255 characters if you are using unique=True for the field.

We are not setting the fields unique=True in Core, but I wouldn't assume that people who inherit from our classes will leave everything as-is. E.g. we want to modify the fields to add verbose_names for TBF; someone else may need to add a unique constraint (assuming this can be done just as easily?) but not realise they might then also have to modify the max_length value. Setting this to 255 would be a precaution/us acting in a more user-friendly way.

@b1rger
Copy link
Contributor

b1rger commented Dec 9, 2024

E.g. we want to acdh-oeaw/apis-instance-tbf#69 for TBF; someone else may need to add a unique constraint

Interesting... how do you change a field you inherited?

@gythaogg
Copy link
Contributor

gythaogg commented Dec 9, 2024

Interesting... how do you change a field you inherited?

I have done this by redefining (overriding?) the inherited fields in my models.

Fields inherited from abstract base classes can be overridden with another field or value, or be removed with None.

@b1rger
Copy link
Contributor

b1rger commented Dec 9, 2024

Interesting... how do you change a field you inherited?

I have done this by redefining (overriding?) the inherited fields in my models.

Fields inherited from abstract base classes can be overridden with another field or value, or be removed with None.

Yeah, but isn't this then a new field? It doesn't inherit the attributes of the field you redefined/overwrote (i.e. the max_length value)?

@gythaogg
Copy link
Contributor

gythaogg commented Dec 9, 2024

Yeah, but isn't this then a new field? It doesn't inherit the attributes of the field you redefined/overwrote (i.e. the max_length value)?

You're right, it's usually the max_length or choices that I overwrite and when I do that, the other attributes for this field defined in the base class aren't inherited; I copy those I want to keep from the base class.

@koeaw
Copy link
Contributor Author

koeaw commented Dec 9, 2024

Hmm, I thought a model's __init__ method could be overridden to change field attributes. Though I've admittedly only tried this with verbose_names so far, for which I don't care about what happens on the DB level but only the UI side.

@koeaw
Copy link
Contributor Author

koeaw commented Dec 9, 2024

If this is not a thing that should be done and therefore not something we want to support, then the 2nd part of my initial suggestion is the only revelant thing left to discuss.

Oh wait, maybe not. Because of the unique=True thing.

@gythaogg
Copy link
Contributor

gythaogg commented Dec 9, 2024

@koeaw I don't get it, in the MySQL documentation you linked (https://dev.mysql.com/doc/refman/8.4/en/char.html) I can't find any information about truncation after 255 characters. I see a confusing limit of "maximum row size" in MySQL but that's not our problem, MySQL users should know that). What am I missing?

Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 65,535. The effective maximum length of a VARCHAR is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used. See Section 10.4.7, “Limits on Table Column Count and Row Size”.

In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

If strict SQL mode is not enabled and you assign a value to a CHAR or VARCHAR column that exceeds the column's maximum length, the value is truncated to fit and a warning is generated. For truncation of nonspace characters, you can cause an error to occur (rather than a warning) and suppress insertion of the value by using strict SQL mode. See Section 7.1.11, “Server SQL Modes”.

For VARCHAR columns, trailing spaces in excess of the column length are truncated prior to insertion and a warning is generated, regardless of the SQL mode in use. For CHAR columns, truncation of excess trailing spaces from inserted values is performed silently regardless of the SQL mode.

@koeaw
Copy link
Contributor Author

koeaw commented Dec 9, 2024

@koeaw I don't get it, in the MySQL documentation you linked (https://dev.mysql.com/doc/refman/8.4/en/char.html) I can't find any information about truncation after 255 characters. I see a confusing limit of "maximum row size" in MySQL but that's not our problem, MySQL users should know that). What am I missing?

My statements were primarily based on the linked sections in the Django docs. It's possible they are wrong/ambiguous/not reflective of the MySQL docs (or that I misinterpreted them).

@b1rger
Copy link
Contributor

b1rger commented Dec 10, 2024

Maybe you could give us an example code that leads to the problems you see. I tried using the __init__ method, but that did not have any effect on the datamodel

@koeaw
Copy link
Contributor Author

koeaw commented Dec 10, 2024

Maybe you could give us an example code that leads to the problems you see.

I'm not using MySQL (I don't think any of us are?), would def. have included more info if I was experiencing problems because of this myself!

I opened the issue based on what I read in the docs; this earlier comment has the chain of events:

255 is the number I had memorised as (max) safe value for max_length for CharField. The use of 4096 in our classes confused me so I checked if I'd assumed incorrectly for ages or if 255 was outdated info. A search for the number in our repo just now turns up results which match the image I had in my head.

I tried using the __init__ method, but that did not have any effect on the datamodel

Ah, interesting. So this more of a "hack", a way to manipulate the UI side of things only, hmm. Thank you.

@b1rger
Copy link
Contributor

b1rger commented Dec 10, 2024

Oke, thanks! Given that we don't even know if there is a problem, I'm closing this

@b1rger b1rger closed this as completed Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app-entities Anything to do with entities – the models central to an APIS app documentation Improvements of or additions to documentation – project docs, docstrings, code comments
Projects
None yet
Development

No branches or pull requests

3 participants