raw images first #76

liyiecho · 2018-03-19T04:24:43Z

Fixes: #65

dixudx

Remove the redundant binary file.

dixudx · 2018-03-19T04:38:01Z

tumblr-photo-video-ripper.py

@@ -68,8 +68,27 @@ def run(self):
    def download(self, medium_type, post, target_folder):
        try:
            medium_url = self._handle_medium_url(medium_type, post)
+            medium_url_bak = medium_url
+            medium_url =re.sub(u'[^/]*media.tumblr.com', u'data.tumblr.com', medium_url)


Why change it to this?

dixudx · 2018-03-19T04:42:22Z

tumblr-photo-video-ripper.py

@@ -68,8 +68,27 @@ def run(self):
    def download(self, medium_type, post, target_folder):
        try:
            medium_url = self._handle_medium_url(medium_type, post)
+            medium_url_bak = medium_url
+            medium_url =re.sub(u'[^/]*media.tumblr.com', u'data.tumblr.com', medium_url)
+            if (b'_100.' in medium_url):


I don't like this exhaustive approach. Hard-coding is not a good choice.

Why not split the string and replace the size with raw instead?

Also, this is not the right place for the change. Raw URLs only apply to photos/images.

The _download(**) method is the right place.
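A minimal sketch of the "split the string and replace with raw" idea from the comment above. The helper name `to_raw_photo_url` is hypothetical and not part of the PR; it assumes Python 3 and that photo filenames end in a numeric size suffix such as `_1280`. It only rewrites the size suffix and leaves the host untouched.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def to_raw_photo_url(medium_type, medium_url):
    """Swap a numeric size suffix for "raw"; hypothetical helper, photos only."""
    if medium_type != "photo":
        return medium_url  # videos and other media are returned unchanged

    parts = urlsplit(medium_url)
    directory, filename = posixpath.split(parts.path)
    stem, ext = posixpath.splitext(filename)   # e.g. "tumblr_xyz_1280", ".jpg"
    name, sep, size = stem.rpartition("_")     # e.g. "tumblr_xyz", "_", "1280"
    if not sep or not size.isdigit():
        return medium_url  # no numeric size suffix, nothing to replace

    raw_path = posixpath.join(directory, name + "_raw" + ext)
    return urlunsplit(parts._replace(path=raw_path))
```

Because the size is taken from the last underscore-separated field, no list of known sizes has to be hard-coded.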

dixudx · 2018-03-19T07:45:28Z

tumblr-photo-video-ripper.py

@@ -68,8 +68,23 @@ def run(self):
    def download(self, medium_type, post, target_folder):
        try:
            medium_url = self._handle_medium_url(medium_type, post)
-            if medium_url is not None:
-                self._download(medium_type, medium_url, target_folder)
+            #print("medium url is %s", medium_url)


Remove this comment line.

dixudx · 2018-03-19T09:16:56Z

tumblr-photo-video-ripper.py

+                self._download(medium_type, medium_url, target_folder, resp_raw)
+            elif medium_type == "photo":
+                medium_url_bak = medium_url
+                medium_url_dot = medium_url.split('.')


The url parsing here seems complex and error-prone.

The version below is a better way. WDYT?

def download(self, medium_type, post, target_folder):
    try:
        medium_url = self._handle_medium_url(medium_type, post)
        if medium_url is not None:
            if medium_type == "photo":
                try:
                    # try to download raw image
                    medium_url_raw = medium_url.replace("68.media.tumblr.com", "data.tumblr.com")
                    raw_matched = self.hd_photo_regex.match(medium_url_raw)
                    if raw_matched is not None:
                        replace_raw = raw_matched.groups()[0]
                        replace_raw = replace_raw.replace(raw_matched.groups()[1], "raw")
                        medium_url_raw = medium_url_raw.replace(raw_matched.groups()[0], replace_raw)
                        self._download(medium_type, medium_url_raw, target_folder)
                        return
                except:
                    pass
            self._download(medium_type, medium_url, target_folder)
    except TypeError:
        pass

# can register different regex match rules
def _register_regex_match_rules(self):
    # will iterate all the rules
    # the first matched result will be returned
    self.regex_rules = [video_hd_match(), video_default_match()]
    self.hd_photo_regex = re.compile(r".*(tumblr_\w+_(\d+))", re.IGNORECASE)
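To make the regex in the suggestion above easier to follow, here is a standalone sketch of what its two groups capture. The pattern is copied from the comment; the function name `raw_url_from_match` and the sample URLs are made up for illustration. One assumption differs from the suggestion: the size is cut off the end of the matched name rather than replaced with `str.replace`, since `str.replace` would also rewrite the same digits if they happened to appear inside the image id.

```python
import re

# Pattern copied from the review comment: group 1 is the file name
# (without extension), group 2 is the trailing numeric size.
HD_PHOTO_REGEX = re.compile(r".*(tumblr_\w+_(\d+))", re.IGNORECASE)


def raw_url_from_match(medium_url):
    """Return the URL with its size suffix replaced by "raw", or None."""
    matched = HD_PHOTO_REGEX.match(medium_url)
    if matched is None:
        return None  # URL does not look like tumblr_<id>_<size>
    name, size = matched.groups()
    # Drop only the trailing size, so digits inside the id stay untouched.
    raw_name = name[:-len(size)] + "raw"
    return medium_url.replace(name, raw_name)
```

Returning `None` when nothing matches lets the caller fall back to the original URL, which mirrors the fallback in the suggested `download` method.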

liyiecho · 2018-03-19T09:55:04Z

medium_url_raw = medium_url.replace("68.media.tumblr.com", "data.tumblr.com")
The host isn't always 68.media.tumblr.com.

dixudx · 2018-03-19T13:39:24Z

The host isn't always 68.media.tumblr.com.

@liyiecho So just use regex to match and replace it.
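A short sketch of the regex-based host replacement suggested here, so that any numbered `media.tumblr.com` host is handled instead of only `68.media.tumblr.com`. The names `MEDIA_HOST_RE` and `to_data_host` are hypothetical, and whether `data.tumblr.com` actually serves a given image is an assumption carried over from the PR, not something this snippet checks.

```python
import re

# Matches the host part of URLs such as "//68.media.tumblr.com/" or
# "//78.media.tumblr.com/". Dots are escaped so they match literally.
MEDIA_HOST_RE = re.compile(r"//[^/]*media\.tumblr\.com/")


def to_data_host(medium_url):
    """Point a media.tumblr.com URL at data.tumblr.com; other URLs pass through."""
    return MEDIA_HOST_RE.sub("//data.tumblr.com/", medium_url, count=1)
```

Anchoring the pattern between `//` and the first `/` keeps the substitution limited to the host, so a path that happens to contain `media.tumblr.com` is not rewritten.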

Delete Extra

4f925df

dixudx requested changes Mar 19, 2018


liyiecho added 2 commits March 19, 2018 14:33

raw images

755ae29

fix url error

6cbe325

dixudx requested changes Mar 19, 2018

