Merge behaviors of user/channel types and add support for forHandle #339

Merged
merged 2 commits into from
Sep 24, 2024
8 changes: 8 additions & 0 deletions CHANGELOG → CHANGELOG.md
@@ -7,10 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]

+### Deprecated
+
+- `--type user` is now deprecated (will be removed in next major)
+
 ### Fixed

 - Ignore empty playlists (#340)

+### Changed
+
+- Merge behaviors of user/channel types and add support for `forHandle` (#339, fix for #338)
+
 ## [3.1.0] - 2024-09-05

 ### Added
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -41,11 +41,11 @@ To add a new locale (`fr` in this example, use only ISO-639-1):
 ## releasing

 * Update your dependencies: `pip install -U setuptools wheel twine`
-* Make sure CHANGELOG is up-to-date
+* Make sure CHANGELOG.md is up-to-date
 * Bump version on `youtube2zim/VERSION`
 * Build packages `python ./setup.py sdist bdist_wheel`
 * Upload to PyPI `twine upload dist/youtube2zim-2.0.0*`.
-* Commit your CHANGELOG + version bump changes
+* Commit your CHANGELOG.md + version bump changes
 * Tag version on git `git tag -a v2.0.0`

 ## developing the ZIM UI in Vue.JS
2 changes: 1 addition & 1 deletion Dockerfile
@@ -35,7 +35,7 @@ RUN pip install --no-cache-dir /src/scraper

 # Copy code + associated artifacts
 COPY scraper/src /src/scraper/src
-COPY *.md LICENSE CHANGELOG /src/
+COPY *.md LICENSE CHANGELOG.md /src/

 # Install + cleanup
 RUN pip install --no-cache-dir /src/scraper \
6 changes: 6 additions & 0 deletions scraper/src/youtube2zim/scraper.py
@@ -123,6 +123,12 @@
     ):
         # data-retrieval info
         self.collection_type = collection_type
+        if self.collection_type == USER:
+            logger.warning(
+                "Collection type 'user' is deprecated. Please use 'channel' type,"
+                " behaviors have been merged. 'user' type is going to be dropped in "
+                " next major release"
+            )
         self.youtube_id = youtube_id
         self.api_key = api_key
         self.dateafter = dateafter
53 changes: 24 additions & 29 deletions scraper/src/youtube2zim/youtube.py
@@ -80,32 +80,32 @@
return False


-def get_channel_json(channel_id, *, for_username=False):
+def get_channel_json(channel_id):
     """fetch or retieve-save and return the Youtube ChannelResult JSON"""
     fname = f"channel_{channel_id}"
     channel_json = load_json(YOUTUBE.cache_dir, fname)
     if channel_json is None:
-        logger.debug(f"query youtube-api for Channel #{channel_id}")
-        req = requests.get(
-            CHANNELS_API,
-            params={
-                "forUsername" if for_username else "id": channel_id,
-                "part": "brandingSettings,snippet,contentDetails",
-                "key": YOUTUBE.api_key,
-            },
-            timeout=REQUEST_TIMEOUT,
-        )
-        if req.status_code >= HTTPStatus.BAD_REQUEST:
-            logger.error(f"HTTP {req.status_code} Error response: {req.text}")
-        req.raise_for_status()
-        try:
-            channel_json = req.json()["items"][0]
-        except (KeyError, IndexError):
-            if for_username:
-                logger.error(f"Invalid username `{channel_id}`: Not Found")
-            else:
-                logger.error(f"Invalid channelId `{channel_id}`: Not Found")
-            raise
+        for criteria in ["forHandle", "id", "forUsername"]:
+            logger.debug(f"query youtube-api for {channel_id} by {criteria}")
+            req = requests.get(
+                CHANNELS_API,
+                params={
+                    criteria: channel_id,
+                    "part": "brandingSettings,snippet,contentDetails",
+                    "key": YOUTUBE.api_key,
+                },
+                timeout=REQUEST_TIMEOUT,
+            )
+            if req.status_code >= HTTPStatus.BAD_REQUEST:
+                logger.error(f"HTTP {req.status_code} Error response: {req.text}")
+            req.raise_for_status()
+            req_json = req.json()
+            if "items" not in req_json:
+                logger.warning(f"Failed to find {channel_id} by {criteria}")
+                continue
+            channel_json = req_json["items"][0]
+        if channel_json is None:
+            raise Exception(f"Impossible to find {channel_id}, check for typos")
         save_json(YOUTUBE.cache_dir, fname, channel_json)
     return channel_json

@@ -324,13 +324,8 @@
     uploads_playlist_id = None
     main_channel_id = None
     if collection_type in (USER, CHANNEL):
-        if collection_type == USER:
-            # youtube_id is a Username, fetch actual channelId through channel
-            channel_json = get_channel_json(youtube_id, for_username=True)
-        else:
-            # youtube_id is a channelId
-            channel_json = get_channel_json(youtube_id)
-
+        # get_channel_json is capable to retrieve user and channel
+        channel_json = get_channel_json(youtube_id)
         main_channel_id = channel_json["id"]

         # retrieve list of playlists for that channel
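
Review note on the `get_channel_json` change: the fallback order (`forHandle`, then `id`, then `forUsername`) can be exercised offline with a small sketch like the one below. It is not the merged code. `find_channel`, `fake_fetch` and the `UC_...` IDs are hypothetical names introduced only for illustration, and `fetch` stands in for the `requests.get(...).json()` call. Unlike the loop shown in the diff, which keeps querying the remaining criteria after a match, this variant returns at the first hit and also treats an empty `items` list as a miss.

```python
from typing import Callable

# Lookup order taken from the diff: handle first, then channel ID,
# then legacy username.
CRITERIA = ("forHandle", "id", "forUsername")


def find_channel(identifier: str, fetch: Callable[[str, str], dict]) -> dict:
    """Return the first channel item matched by any criterion.

    `fetch(criteria, identifier)` is a hypothetical stand-in for the HTTP
    call; it must return the decoded JSON body of the channels endpoint.
    """
    for criteria in CRITERIA:
        # A missing key and an empty list are both treated as "not found".
        items = fetch(criteria, identifier).get("items") or []
        if items:
            # Stop at the first hit so later criteria cost no extra requests.
            return items[0]
    raise LookupError(f"Impossible to find {identifier}, check for typos")


def fake_fetch(criteria: str, identifier: str) -> dict:
    """Offline stub that only knows one handle and one legacy username."""
    known = {
        ("forHandle", "@example"): {"id": "UC_handle"},
        ("forUsername", "olduser"): {"id": "UC_user"},
    }
    item = known.get((criteria, identifier))
    # Assumption, based on the diff's `"items" not in req_json` check:
    # a miss comes back without an "items" key.
    return {"items": [item]} if item else {}
```

With the stub, `find_channel("@example", fake_fetch)` resolves through `forHandle`, `find_channel("olduser", fake_fetch)` falls through to `forUsername`, and an unknown identifier raises `LookupError`. Returning early would also avoid the "Failed to find ..." warnings that the merged loop logs for the criteria tried after a successful match.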