Transfer-Encoding: chunked with proxy enabled #112
Comments
Can you post the contents of your local composer.json and production composer.json & composer.lock?
same on local & server
Hmm, looks alright. Can you check that your production server IP is whitelisted and isn't blocked by the proxy service?
On Fri, 5 Oct 2018 at 09:39, Gilles Migliori wrote:
composer.zip
<https://github.com/serp-spider/search-engine-google/files/2449573/composer.zip>
same on local & server
@migliori Please check your curl version. If the curl version on the server isn't the same as on your localhost, try upgrading it and let us know how it goes.
No, it isn't. If you open https://www.hack-hunt.com/scraping-simple-test.php you'll see the string added before the Google content: `simpsons - Recherche Google(function(){window.google=...`. I suspected the headers might be added by the Apache mod_pagespeed module, but disabling it didn't help. I can't change my PHP curl version; it's built in with Plesk PHP. I just tested with nginx instead of Apache: same result.
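For context on that leaked string: "Transfer-Encoding: chunked" text appearing inside the body usually means the chunked framing was never decoded, so the hex chunk sizes and the tail of the header block spill into the content. A toy decoder (a sketch for illustration only; the `decode_chunked` helper below is hypothetical and has nothing to do with the library's or curl's internals) shows what that framing looks like:

```shell
# Illustration only: a chunked HTTP body interleaves hex sizes with data.
# When a broken proxy/curl stack fails to strip this framing, these extra
# bytes are what end up visible at the start of the scraped content.
decode_chunked() {
  # Each chunk is "<hex-size>\r\n<data>\r\n"; a zero-size chunk ends the body.
  while IFS=$'\r' read -r size _; do
    if [ "$size" = "0" ]; then break; fi
    dd bs=1 count="$((16#$size))" 2>/dev/null  # copy exactly <size> data bytes
    read -r _                                  # consume the CRLF after the data
  done
}

# Two chunks ("Hello" and ", world") followed by the terminating 0 chunk:
printf '5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n' | decode_chunked
# prints: Hello, world
```

If that framing is left undecoded, the raw `5\r\nHello\r\n...` bytes appear verbatim in the body, which matches the garbage prefix described above.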
@migliori Not php-curl, just curl itself. Run
I already did it: apt-get update && apt-get install curl libcurl |
Your version of curl is very old. Try upgrading to version 7.61 and see if it works. Additionally, curl < 7.48 has an issue with cookies that prevents SERPs from working correctly.
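To check quickly where a given machine stands, a minimal sketch (the `curl_ok` helper and the 7.48 threshold wiring are illustrative, not part of serp-spider):

```shell
# Compare a curl version string against 7.48, the release that fixed the
# cookie handling mentioned above. curl_ok is a hypothetical helper.
curl_ok() {
  ver=$1
  major=${ver%%.*}
  rest=${ver#*.}
  minor=${rest%%.*}
  [ "$major" -gt 7 ] || { [ "$major" -eq 7 ] && [ "$minor" -ge 48 ]; }
}

# In practice you would feed it the version reported by the binary:
#   ver=$(curl --version | awk 'NR==1 {print $2}')
curl_ok "7.61.0" && echo "7.61.0: ok"
curl_ok "7.38.0" || echo "7.38.0: too old, cookie handling is broken"
```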
I'm in touch with my server provider and will let you know whether it works as soon as the upgrade is done; it may take a day or two. Thanks so much for your responsiveness and help.
Hi,
I've been using search-engine-google for a while; it worked perfectly until now, but I've run into a recent issue with proxies.
The Google scraper works fine on my localhost, but on the production server it throws a 500 error: Unable to check javascript status
The scraped results come with dom => textContent starting with "ncoding Transfer-Encoding: chunked"
I put a simple test online here: https://www.hack-hunt.com/scraping-simple-test.php
The code is taken from your example here: http://serp-spider.github.io/documentation/search-engine/google/#installation
I just added:
$proxy = Proxy::createFromString('https://xxx:proxy@ip');
$browser->setProxy($proxy);
It works fine on localhost, and on the production server if I remove the proxy, but it fails on production with the proxy.
Not sure if the issue comes from my server or search-engine-google.
Any help much appreciated, thanks