Avoid webDriver->quit() on __destruct() when scraping *remote* websites #466

Open
ThomasLandauer opened this issue Apr 20, 2021 · 8 comments

Comments

@ThomasLandauer
Contributor

When connecting to remote webpages, I'm sometimes getting this exception:

Curl error thrown for http DELETE to /session/5db262bc-961f-4cbf-9983-8d602f00d89a
Operation timed out after 30000 milliseconds with 0 bytes received

And at the bottom of Symfony's exception page:

Curl error thrown for http POST to /session/5db262bc-961f-4cbf-9983-8d602f00d89a/url with params: {"url":"https://www.example.com"}
Operation timed out after 30001 milliseconds with 0 bytes received

As far as I can see, the cause is:
When done, Panther tries to clean up, and Client::quit() calls $this->webDriver->quit().
And from this I'm guessing:

  • Some servers just respond with 5xx. Possible side effect: After doing this "forbidden" request repeatedly, I might get blocked.
  • Some don't send a response at all. Side effect: Panther waits for 30 seconds (=general timeout), i.e. my command hangs.

So the solution looks pretty clear to me: Don't send that request remotely ;-)

So the first question towards a PR would be: Do you want an automatic check, or rather some user-configurable option to suppress this cleanup?

Related: #169 (don't know if it's really the same, or some Docker-related problem)

@trbsi

trbsi commented May 10, 2021

I run 30 instances of Panther in parallel on different ports, each connecting to a different proxy. I often get that error and I'm not sure why.

@Mepcuk

Mepcuk commented May 31, 2021

@ThomasLandauer which client do you use for scraping? Curl? Chrome? Firefox?

@ThomasLandauer
Contributor Author

Firefox.

@dunglas
Member

dunglas commented May 31, 2021

It may be related to the bug I tried to fix in #425. However, I didn't manage to get this patch working, and I'm not sure when I'll have the time to work on it again.

Help welcome on this one (yes, destructors are hard to deal with).

@gravitiq-cm

gravitiq-cm commented Aug 18, 2022

I think I'm experiencing the same issue (i.e. I get the DELETE error when using Panther on remote sites).

Is there any fix suggested? Or where should I look to try and patch it myself?

Maybe we could add some setting on Client to tell it not to call $this->webDriver->close() from Client::close()?
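
A setting like the one suggested above might look roughly like the sketch below. It uses stand-in classes so that it runs on its own: `StubClient`, `StubDriver`, and the `quitOnDestruct` flag are hypothetical names for illustration, not part of Panther or php-webdriver.

```php
<?php
declare(strict_types=1);

// Stand-in for the real WebDriver; it only counts quit() calls.
final class StubDriver
{
    public int $quitCalls = 0;

    public function quit(): void
    {
        // The real driver would send DELETE /session/{id} here.
        $this->quitCalls++;
    }
}

// Stand-in for Panther's Client with a hypothetical opt-out flag.
final class StubClient
{
    public function __construct(
        private StubDriver $webDriver,
        private bool $quitOnDestruct = true,
    ) {
    }

    public function quit(): void
    {
        $this->webDriver->quit();
    }

    public function __destruct()
    {
        // Only clean up automatically when the user has not opted out.
        if ($this->quitOnDestruct) {
            $this->quit();
        }
    }
}

// Default behaviour: destroying the client quits the driver.
$driverA = new StubDriver();
$clientA = new StubClient($driverA);
unset($clientA);

// Opt-out: destroying the client leaves the session alone.
$driverB = new StubDriver();
$clientB = new StubClient($driverB, quitOnDestruct: false);
unset($clientB);
```

With the flag off, nothing ends the session automatically, so the caller would presumably have to call quit() explicitly at a safe point or accept that the browser session stays open.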

@dunglas
Member

dunglas commented Aug 19, 2022

Hi @gravitiq-cm, as explained previously, this error is most likely caused by the bug I tried to fix in #425.

Unfortunately, I didn't find the time to finish it and it's at the very end of my todo list. Help on fixing this would be much appreciated!

@gravitiq-cm

I think a valid solution would be to allow users to define a different class to use instead of the hard-coded RemoteWebDriver.

For example, change from:

/**
 * @throws \RuntimeException
 */
public function start(): WebDriver
{
    // ...
    return RemoteWebDriver::create(...);
}

to:

/**
 * @throws \RuntimeException
 */
public function start(): WebDriver
{
    // ...
    $webDriverClass = $this->options['web_driver_class'];
    return $webDriverClass::create(...);
    // (or could use call_user_func_array() if preferred style)
}

Users could then create a custom class that extends RemoteWebDriver and adds their own customisation. RemoteWebDriver looks extensible (it has no private methods), so it would be easy to extend it and, in this case, override CustomRemoteWebDriver::quit() so that it doesn't try to delete the session.
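
To make the idea concrete, here is a self-contained sketch that runs without php-webdriver installed. `StubRemoteWebDriver` is a stand-in for RemoteWebDriver, and `NoDeleteWebDriver`, `StubBrowserManager`, and the server URL are hypothetical names made up for the example; only the `web_driver_class` option comes from the proposal above.

```php
<?php
declare(strict_types=1);

// Stand-in for RemoteWebDriver; it records the requests it would send.
class StubRemoteWebDriver
{
    /** @var string[] requests "sent" to the WebDriver server */
    public array $sent = [];

    public static function create(string $serverUrl): static
    {
        $driver = new static();
        $driver->sent[] = "POST $serverUrl/session";

        return $driver;
    }

    public function quit(): void
    {
        // The real driver issues the DELETE /session/{id} request here.
        $this->sent[] = 'DELETE /session/{id}';
    }
}

// Hypothetical user-defined subclass: quit() no longer deletes the session.
final class NoDeleteWebDriver extends StubRemoteWebDriver
{
    public function quit(): void
    {
        // Intentionally empty: no request is sent.
    }
}

// Hypothetical manager mirroring the proposed change to start().
final class StubBrowserManager
{
    public function __construct(private array $options = [])
    {
    }

    /**
     * @throws \RuntimeException
     */
    public function start(): StubRemoteWebDriver
    {
        $webDriverClass = $this->options['web_driver_class'] ?? StubRemoteWebDriver::class;
        // Reject classes that are not drivers before calling create() on them.
        if (!is_a($webDriverClass, StubRemoteWebDriver::class, true)) {
            throw new \RuntimeException("$webDriverClass must extend StubRemoteWebDriver");
        }

        return $webDriverClass::create('http://127.0.0.1:4444');
    }
}

$default = (new StubBrowserManager())->start();
$default->quit();

$custom = (new StubBrowserManager(['web_driver_class' => NoDeleteWebDriver::class]))->start();
$custom->quit();
```

In this sketch the default driver records both the session creation and the DELETE, while the custom one records only the creation. One trade-off to keep in mind: if the session is never deleted, the browser presumably stays open until the driver process itself is stopped.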

@dunglas
Member

dunglas commented Aug 20, 2022

IMHO it would be better to fix the bug for everybody without asking the user to write custom code.
