Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hacky special-case solution for links in feed excerpts #265

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

brabster
Copy link

@brabster brabster commented Apr 6, 2024

Hack part-fix for #263 for info

The best solution I could find quickly was to use the html output from the markdown rendering process. Seems that it rendered using the settings on the mkdocs site and link/image URLs have been turned into links relative to the site root. This works fine in the condition that looks for the excerpt tag, think it's gong to get the same content as would have been from the raw markdown.

This also takes care of snippets I noticed were also getting literally passed through in the feed.

I regex to find any src= and href= attributes, pull out the value, resolve relative to the site root using built in urllib, and replace each link. It's not tooooo hacky I guess.

Added a basic test to check that the links are resolved correctly in an example post.

Using my branch as a direct git import I can see that the link processing seems to be effective.

Before

https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fdeploy-preview-16--tw-mkdocs.netlify.app%2Ffeed_rss_created.xml - several relative link validation failures, link broken in rss preview site

image

After

https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Ftempered.works%2Ffeed_rss_created.xml - no relative link issues, image visible in rss preview site

image

If this approach is of interest, let me know what you'd like me to do to make mergeable?

Thanks

@Guts Guts self-assigned this Apr 22, 2024
Copy link

codecov bot commented Apr 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.56%. Comparing base (68c62e5) to head (8c5db54).

Current head 8c5db54 differs from pull request most recent head d447151

Please upload reports for the commit d447151 to get more accurate results.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #265      +/-   ##
==========================================
+ Coverage   80.66%   84.56%   +3.89%     
==========================================
  Files          10       10              
  Lines         631      570      -61     
  Branches      131      121      -10     
==========================================
- Hits          509      482      -27     
+ Misses         84       56      -28     
+ Partials       38       32       -6     
Flag Coverage Δ
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
mkdocs_rss_plugin/plugin.py 92.53% <ø> (+2.46%) ⬆️
mkdocs_rss_plugin/util.py 75.47% <100.00%> (+1.73%) ⬆️

... and 8 files with indirect coverage changes

@Guts
Copy link
Owner

Guts commented Apr 22, 2024

Good evening @brabster ,

Thank you for your interest in this project, for investigating and even proposing a solution, it's great!

Indeed, it's a... pragmatic approach! It's a problem I've encountered before, and I opted for a... phlegmatic approach! Pragmatic > phlegmatic, so why not?

Before doing a more in-depth review, have you looked at the Mkdocs utilities for handling this kind of case? After all, it's part of the Markdown generator's job to manage HTML tags when it's about to convert. Because ideally you don't want to add homemade code that's complex to maintain, as it can't handle every possible and unimaginable case. Especially with regex.

@github-actions github-actions bot added bug Something isn't working quality Tests, project resiliency, etc. labels May 5, 2024
@brabster
Copy link
Author

brabster commented May 5, 2024

Hi there! I looked in a couple of places for a solution but came up empty - mkdocs/catalog and pymdown-extensions. MagicLink doesn't look helpful. I'm not seeing anything in markdown extensions either but perhaps a fairly trivial markdown extension to write and avoid the nasty post-processing approach. I recall I started down this road initially and got stuck because my mkdocs-material blog has some complexity in the way it produces URLs - blog posts get yyyy/mm/dd and images don't, for example. Can't just glue the site URL on, it wont work. I'll have a go if I have time and see what I find out

@brabster
Copy link
Author

brabster commented May 5, 2024

Hmm, first attempt not going well. Found PathConverter but not able to get it working so far:

elif (
            abstract_delimiter
            and (
                excerpt_separator_position := in_page.markdown.find(abstract_delimiter)
            )
            > -1
        ):
            return markdown.markdown(
                    in_page.markdown[:excerpt_separator_position],
                    output_format="html5",
                    extensions=['pymdownx.pathconverter'],
                    extension_configs={
                        'pymdownx.pathconverter': {
                            'base_url': 'https://foo.bar'
                        }
                    })

I've also remembers that there's other little things that I have in mkdocs-material like snippets that don't get converted in the feed but do if you process the html like this

<description><p><img alt="A photo of a crab on a night dive in the red sea. Credit: me" src="./assets/phone.webp"></p><p>Why I've been trialling GitHub Codespaces as a more secure alternative to local development. I never expected to be pushing changes from my phone!</p><p>--8&lt;-- "ee.md"</p></description>

I remember now I was seeing little things not working correctly like this and it seemed you'd need to mirror the mkdocs-material configuration somehow - at that point, just processing the output HTML from the standard processing seemed like a pretty good idea!

@brabster
Copy link
Author

brabster commented May 5, 2024

Seems like this is just a fairly hard problem - pretty much every one of the feeds we are dropping into Slack have some kind of mess in the descriptions, even Medium.

That failing test is one that checks a description length below a fixed size - naturally my change makes that test fail because there's an h1 wrapper with an id attr now - don't think it's a real problem

To solve my remaining problems, I've added an additional step to clean up some of the wrapper tags that can appear in mkdocs-material - glightbox and grid-cards.

I wouldn't take it as-is if I were you. If you're interested in a version of it, perhaps protected by a specific opt-in config to avoid breaking anyone and naturally more tests, I can probably do that without much trouble. I'd rather clean it up and have it supported than effectively fork it, and I'm not seeing a better option right now 🤷

@Guts
Copy link
Owner

Guts commented Jun 10, 2024

Hi @brabster

As you may have noticed, I've made a set of improvements lately and published them in a 1.13.0 version. Just to say that I did not forget your PR but I'm still a bit confused about how to implement it without slowing mkdocs build and I don't understand why you don't use the excerpt function, which lets you cut content where you want it?

EDIT: hum, sorry, I've spent too much time away from this project and my answer is a little off the mark. I'll discuss on the issue on the issue and I'll get back here when it's clear enougth.

Copy link

Quality Gate Passed Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Copy link

@Guts Guts linked an issue Dec 2, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working quality Tests, project resiliency, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Links incorrect in feed
2 participants