hacky special-case solution for links in feed excerpts #265

brabster · 2024-04-06T13:51:15Z

Hack part-fix for #263 for info

The best solution I could find quickly was to use the html output from the markdown rendering process. Seems that it rendered using the settings on the mkdocs site and link/image URLs have been turned into links relative to the site root. This works fine in the condition that looks for the excerpt tag, think it's gong to get the same content as would have been from the raw markdown.

This also takes care of snippets I noticed were also getting literally passed through in the feed.

I regex to find any src= and href= attributes, pull out the value, resolve relative to the site root using built in urllib, and replace each link. It's not tooooo hacky I guess.

Added a basic test to check that the links are resolved correctly in an example post.

Using my branch as a direct git import I can see that the link processing seems to be effective.

Before

https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fdeploy-preview-16--tw-mkdocs.netlify.app%2Ffeed_rss_created.xml - several relative link validation failures, link broken in rss preview site

After

https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Ftempered.works%2Ffeed_rss_created.xml - no relative link issues, image visible in rss preview site

If this approach is of interest, let me know what you'd like me to do to make mergeable?

Thanks

codecov · 2024-04-22T20:31:44Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.56%. Comparing base (68c62e5) to head (8c5db54).

❗ Current head 8c5db54 differs from pull request most recent head d447151

Please upload reports for the commit d447151 to get more accurate results.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #265      +/-   ##
==========================================
+ Coverage   80.66%   84.56%   +3.89%     
==========================================
  Files          10       10              
  Lines         631      570      -61     
  Branches      131      121      -10     
==========================================
- Hits          509      482      -27     
+ Misses         84       56      -28     
+ Partials       38       32       -6

Flag	Coverage Δ
unittests	`?`

Flags with carried forward coverage won't be shown. Click here to find out more.

Files	Coverage Δ
mkdocs_rss_plugin/plugin.py	`92.53% <ø> (+2.46%)`	⬆️
mkdocs_rss_plugin/util.py	`75.47% <100.00%> (+1.73%)`	⬆️

... and 8 files with indirect coverage changes

Guts · 2024-04-22T20:39:09Z

Good evening @brabster ,

Thank you for your interest in this project, for investigating and even proposing a solution, it's great!

Indeed, it's a... pragmatic approach! It's a problem I've encountered before, and I opted for a... phlegmatic approach! Pragmatic > phlegmatic, so why not?

Before doing a more in-depth review, have you looked at the Mkdocs utilities for handling this kind of case? After all, it's part of the Markdown generator's job to manage HTML tags when it's about to convert. Because ideally you don't want to add homemade code that's complex to maintain, as it can't handle every possible and unimaginable case. Especially with regex.

…links-in-feed

brabster · 2024-05-05T20:12:35Z

Hi there! I looked in a couple of places for a solution but came up empty - mkdocs/catalog and pymdown-extensions. MagicLink doesn't look helpful. I'm not seeing anything in markdown extensions either but perhaps a fairly trivial markdown extension to write and avoid the nasty post-processing approach. I recall I started down this road initially and got stuck because my mkdocs-material blog has some complexity in the way it produces URLs - blog posts get yyyy/mm/dd and images don't, for example. Can't just glue the site URL on, it wont work. I'll have a go if I have time and see what I find out

brabster · 2024-05-05T21:49:26Z

Hmm, first attempt not going well. Found PathConverter but not able to get it working so far:

elif (
            abstract_delimiter
            and (
                excerpt_separator_position := in_page.markdown.find(abstract_delimiter)
            )
            > -1
        ):
            return markdown.markdown(
                    in_page.markdown[:excerpt_separator_position],
                    output_format="html5",
                    extensions=['pymdownx.pathconverter'],
                    extension_configs={
                        'pymdownx.pathconverter': {
                            'base_url': 'https://foo.bar'
                        }
                    })

I've also remembers that there's other little things that I have in mkdocs-material like snippets that don't get converted in the feed but do if you process the html like this

<description><p><img alt="A photo of a crab on a night dive in the red sea. Credit: me" src="./assets/phone.webp"></p><p>Why I've been trialling GitHub Codespaces as a more secure alternative to local development. I never expected to be pushing changes from my phone!</p><p>--8&lt;-- "ee.md"</p></description>

I remember now I was seeing little things not working correctly like this and it seemed you'd need to mirror the mkdocs-material configuration somehow - at that point, just processing the output HTML from the standard processing seemed like a pretty good idea!

for more information, see https://pre-commit.ci

…' into hacky-fix-relative-links-in-feed

for more information, see https://pre-commit.ci

brabster · 2024-05-05T23:04:36Z

Seems like this is just a fairly hard problem - pretty much every one of the feeds we are dropping into Slack have some kind of mess in the descriptions, even Medium.

That failing test is one that checks a description length below a fixed size - naturally my change makes that test fail because there's an h1 wrapper with an id attr now - don't think it's a real problem

To solve my remaining problems, I've added an additional step to clean up some of the wrapper tags that can appear in mkdocs-material - glightbox and grid-cards.

I wouldn't take it as-is if I were you. If you're interested in a version of it, perhaps protected by a specific opt-in config to avoid breaking anyone and naturally more tests, I can probably do that without much trouble. I'd rather clean it up and have it supported than effectively fork it, and I'm not seeing a better option right now 🤷

Guts · 2024-06-10T19:42:03Z

Hi @brabster

As you may have noticed, I've made a set of improvements lately and published them in a 1.13.0 version. Just to say that I did not forget your PR but I'm still a bit confused about how to implement it without slowing mkdocs build and I don't understand why you don't use the excerpt function, which lets you cut content where you want it?

EDIT: hum, sorry, I've spent too much time away from this project and my answer is a little off the mark. I'll discuss on the issue on the issue and I'll get back here when it's clear enougth.

sonarqubecloud · 2024-06-12T21:30:16Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

for more information, see https://pre-commit.ci

sonarqubecloud · 2024-10-27T09:54:50Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

hacky special-case solution for links in feed excerpts

8c5db54

Guts self-assigned this Apr 22, 2024

Merge remote-tracking branch 'upstream/main' into hacky-fix-relative-…

a344855

…links-in-feed

github-actions bot added bug Something isn't working quality Tests, project resiliency, etc. labels May 5, 2024

brabster and others added 7 commits May 5, 2024 22:11

remove glightbox image wrapper if present; assume only one image

0a214d6

[pre-commit.ci] auto fixes from pre-commit.com hooks

6b04234

for more information, see https://pre-commit.ci

remove grid cards wrapper too, one failing test

89414e8

Merge remote-tracking branch 'origin/hacky-fix-relative-links-in-feed…

f15850a

…' into hacky-fix-relative-links-in-feed

[pre-commit.ci] auto fixes from pre-commit.com hooks

ad64f98

for more information, see https://pre-commit.ci

match any new lines in the cleanups

29af0d4

[pre-commit.ci] auto fixes from pre-commit.com hooks

7bb6b4b

for more information, see https://pre-commit.ci

Merge branch 'main' into hacky-fix-relative-links-in-feed

8468e94

Merge branch 'main' into hacky-fix-relative-links-in-feed

d424d11

brabster and others added 3 commits June 25, 2024 06:59

merge upstream chanhges

c00ec68

[pre-commit.ci] auto fixes from pre-commit.com hooks

d447151

for more information, see https://pre-commit.ci

Merge branch 'Guts:main' into hacky-fix-relative-links-in-feed

684c169

Guts linked an issue Dec 2, 2024 that may be closed by this pull request

Links incorrect in feed #263

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

hacky special-case solution for links in feed excerpts #265

hacky special-case solution for links in feed excerpts #265

brabster commented Apr 6, 2024 •

edited

Loading

codecov bot commented Apr 22, 2024 •

edited

Loading

Guts commented Apr 22, 2024 •

edited

Loading

brabster commented May 5, 2024 •

edited

Loading

brabster commented May 5, 2024

brabster commented May 5, 2024 •

edited

Loading

Guts commented Jun 10, 2024 •

edited

Loading

sonarqubecloud bot commented Jun 12, 2024

sonarqubecloud bot commented Oct 27, 2024

hacky special-case solution for links in feed excerpts #265

Are you sure you want to change the base?

hacky special-case solution for links in feed excerpts #265

Conversation

brabster commented Apr 6, 2024 • edited Loading

Before

After

codecov bot commented Apr 22, 2024 • edited Loading

Codecov Report

Guts commented Apr 22, 2024 • edited Loading

brabster commented May 5, 2024 • edited Loading

brabster commented May 5, 2024

brabster commented May 5, 2024 • edited Loading

Guts commented Jun 10, 2024 • edited Loading

sonarqubecloud bot commented Jun 12, 2024

Quality Gate passed

sonarqubecloud bot commented Oct 27, 2024

Quality Gate passed

brabster commented Apr 6, 2024 •

edited

Loading

codecov bot commented Apr 22, 2024 •

edited

Loading

Guts commented Apr 22, 2024 •

edited

Loading

brabster commented May 5, 2024 •

edited

Loading

brabster commented May 5, 2024 •

edited

Loading

Guts commented Jun 10, 2024 •

edited

Loading