-
Notifications
You must be signed in to change notification settings - Fork 345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raw images first #76
base: master
Are you sure you want to change the base?
raw images first #76
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the redundant binary file.
tumblr-photo-video-ripper.py
Outdated
@@ -68,8 +68,27 @@ def run(self): | |||
def download(self, medium_type, post, target_folder): | |||
try: | |||
medium_url = self._handle_medium_url(medium_type, post) | |||
medium_url_bak = medium_url | |||
medium_url =re.sub(u'[^/]*media.tumblr.com', u'data.tumblr.com', medium_url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why changing to this?
tumblr-photo-video-ripper.py
Outdated
@@ -68,8 +68,27 @@ def run(self): | |||
def download(self, medium_type, post, target_folder): | |||
try: | |||
medium_url = self._handle_medium_url(medium_type, post) | |||
medium_url_bak = medium_url | |||
medium_url =re.sub(u'[^/]*media.tumblr.com', u'data.tumblr.com', medium_url) | |||
if (b'_100.' in medium_url): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this exhaustive way. Hard coded is not a good choice.
Why not splitting the string and replacing with raw
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And you should not change at here. Only photos/images are applicable with raw.
Method _download(**)
is the right place.
@@ -68,8 +68,23 @@ def run(self): | |||
def download(self, medium_type, post, target_folder): | |||
try: | |||
medium_url = self._handle_medium_url(medium_type, post) | |||
if medium_url is not None: | |||
self._download(medium_type, medium_url, target_folder) | |||
#print("medium url is %s", medium_url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this comment line.
self._download(medium_type, medium_url, target_folder, resp_raw) | ||
elif medium_type == "photo": | ||
medium_url_bak = medium_url | ||
medium_url_dot = medium_url.split('.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The url parsing here seems complex and error-prone.
Below part is a better way. WDYT?
def download(self, medium_type, post, target_folder):
try:
medium_url = self._handle_medium_url(medium_type, post)
if medium_url is not None:
if medium_type == "photo":
try:
# try to download raw image
medium_url_raw = medium_url.replace("68.media.tumblr.com", "data.tumblr.com")
raw_matched = self.hd_photo_regex.match(medium_url_raw)
if raw_matched is not None:
replace_raw = raw_matched.groups()[0]
replace_raw = replace_raw.replace(raw_matched.groups()[1], "raw")
medium_url_raw = medium_url_raw.replace(raw_matched.groups()[0], replace_raw)
self._download(medium_type, medium_url_raw, target_folder)
return
except:
pass
self._download(medium_type, medium_url, target_folder)
except TypeError:
pass
# can register differnet regex match rules
def _register_regex_match_rules(self):
# will iterate all the rules
# the first matched result will be returned
self.regex_rules = [video_hd_match(), video_default_match()]
self.hd_photo_regex = re.compile(r".*(tumblr_\w+_(\d+))", re.IGNORECASE)
|
@liyiecho So just use |
Fixes: #65