1 parent
dda32c4
commit 74e57fe
Showing 1 changed file with 64 additions and 0 deletions.
src/_til/2023/2023-11-30-prefer-xml-in-the-wikimedia-apis.md
@@ -0,0 +1,64 @@
---
layout: til
date: 2023-11-30 00:27:25 +0100
title: Why I prefer XML to JSON in the Wikimedia Commons APIs
summary: |
  The XML-to-JSON conversion leads to some inconsistent behaviour, especially in corner cases of the API.
tags:
  - wikimedia-commons
---
The Wikimedia APIs I've used can return results in three formats: HTML, JSON, and XML.
Initially I was using the JSON APIs because JSON is easy, it's familiar, and there are built-in methods for it in my HTTP client libraries.

It seems like at least some of the APIs are doing an automated XML-to-JSON translation, which has inconsistent results in certain corner cases.
This is why I'm gradually leaning towards the XML APIs, which seem to be more consistent in how they behave.

This is a useful example of the risks of automated XML-to-JSON conversion in general.

## The `languagesearch` API

First, let's use the [Languagesearch API](https://www.mediawiki.org/wiki/API:Languagesearch) to find a list of languages which match the query "english":

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>curl <span class="s1">'https://en.wikipedia.org/w/api.php?action=languagesearch&amp;search=english&amp;format=json'</span> | jq <span class="nb">.</span>
<span class="go">{
  "languagesearch": {
    "en": "english",
    "en-us": "english sa america",
    "en-au": "english sa australia",
    …
  }
}

</span><span class="gp">$</span><span class="w"> </span>curl <span class="s1">'https://en.wikipedia.org/w/api.php?action=languagesearch&amp;search=english&amp;format=xml'</span> | xmllint <span class="nt">--format</span> -
<span class="go">&lt;?xml version="1.0"?&gt;</span><span class="w">
</span><span class="go">&lt;api&gt;</span><span class="w">
</span><span class="go">  &lt;languagesearch
    en="english"
    en-us="english sa america"
    en-au="english sa australia"
    …
</span><span class="go">  /&gt;</span><span class="w">
</span><span class="go">&lt;/api&gt;</span><span class="w">
</span></code></pre></div></div>

The JSON contains an object which maps language ID to name; the XML uses language IDs as attribute names and language names as attribute values.
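
To make that correspondence concrete, here is a minimal Python sketch that reads both shapes into the same dictionary. It assumes trimmed, hard-coded copies of the two responses above (the elided entries are simply left out) rather than live API calls, and the names `languages_from_json` and `languages_from_xml` are made up for this illustration.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copies of the two responses shown above; the elided entries are omitted.
JSON_RESPONSE = (
    '{"languagesearch": {"en": "english", "en-us": "english sa america", '
    '"en-au": "english sa australia"}}'
)
XML_RESPONSE = (
    '<?xml version="1.0"?>'
    '<api><languagesearch en="english" en-us="english sa america" '
    'en-au="english sa australia" /></api>'
)


def languages_from_json(text):
    """Return the {language ID: name} mapping from a JSON response."""
    return json.loads(text)["languagesearch"]


def languages_from_xml(text):
    """Return the {language ID: name} mapping from an XML response."""
    root = ET.fromstring(text)
    # The attributes of <languagesearch> are the ID-to-name pairs.
    return dict(root.find("languagesearch").attrib)


# Both formats carry the same information when there are results.
print(languages_from_json(JSON_RESPONSE) == languages_from_xml(XML_RESPONSE))  # True
```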

Now let's try again, with a query that won't return any results:

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>curl <span class="s1">'https://en.wikipedia.org/w/api.php?action=languagesearch&amp;search=doesnotexist&amp;format=json'</span> | jq <span class="nb">.</span>
<span class="go">{
  "languagesearch": []
}

</span><span class="gp">$</span><span class="w"> </span>curl <span class="s1">'https://en.wikipedia.org/w/api.php?action=languagesearch&amp;search=doesnotexist&amp;format=xml'</span> | xmllint <span class="nt">--format</span> -
<span class="go">&lt;?xml version="1.0"?&gt;</span><span class="w">
</span><span class="go">&lt;api&gt;</span><span class="w">
</span><span class="go">  &lt;languagesearch/&gt;</span><span class="w">
</span><span class="go">&lt;/api&gt;</span><span class="w">
</span></code></pre></div></div>

Notice that the structure of the JSON response has changed slightly -- where previously it returned an object, now it returns an array.
Meanwhile, the XML response has the same structure as before, just without any attributes.
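
The change of type is easy to see when the two empty responses are parsed. This is a small Python sketch using hard-coded copies of the responses above, not a live API call; the variable names are illustrative.

```python
import json
import xml.etree.ElementTree as ET

# Hard-coded copies of the two empty responses shown above.
EMPTY_JSON = '{"languagesearch": []}'
EMPTY_XML = '<?xml version="1.0"?><api><languagesearch/></api>'

json_value = json.loads(EMPTY_JSON)["languagesearch"]
xml_value = dict(ET.fromstring(EMPTY_XML).find("languagesearch").attrib)

# The JSON value is now a list, so dict-style access such as .items() would fail.
print(type(json_value).__name__)  # list

# The XML attributes are still a mapping, just an empty one.
print(type(xml_value).__name__)   # dict
```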

This broke my JSON-using code, because I was assuming the `languagesearch` value would always be a mapping, and that worked until I tested the empty case.
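
One way to guard against this when staying with JSON is to normalise the value before using it. The sketch below is only an assumption about how that could look, not the code that broke: `parse_languagesearch` is a hypothetical helper name, and treating any other shape as an error is a design choice of this sketch.

```python
import json


def parse_languagesearch(text):
    """Return the {language ID: name} mapping from a JSON languagesearch response.

    An empty result arrives as [] rather than {}, so that case is mapped to
    an empty dict; anything else that isn't a mapping is treated as an error.
    """
    value = json.loads(text)["languagesearch"]
    if isinstance(value, dict):
        return value
    if value == []:
        return {}
    raise ValueError(f"unexpected languagesearch value: {value!r}")


print(parse_languagesearch('{"languagesearch": {"en": "english"}}'))  # {'en': 'english'}
print(parse_languagesearch('{"languagesearch": []}'))                 # {}
```

Callers can then rely on always getting a dictionary, which is the same guarantee the XML attributes give without any special-casing.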
🎉 Published on https://alexwlchan.net as production
🚀 Deployed on https://6653c932165fa1f67ce85072--alexwlchan.netlify.app
🎉 Published on https://alexwlchan.net as production
🚀 Deployed on https://66543762f3c7b7b060de645f--alexwlchan.netlify.app
🎉 Published on https://alexwlchan.net as production
🚀 Deployed on https://665588f788d445be86d37f69--alexwlchan.netlify.app