Seems to be broken with playwright 1.45+ #31

Open
Rafiot opened this issue Jul 9, 2024 · 8 comments

@Rafiot

Rafiot commented Jul 9, 2024

I'll dig a bit further, but the short version is that when I pass a page to playwright_stealth using playwright 1.45, I get 20+ JS errors in the debug console and none of the JavaScript on the page loads after that.

I'm guessing a few things changed in playwright and it's causing this module to barf.

I poked a bit more and it seems to be working fine again after I disabled the scripts:

  • navigator_languages
  • navigator_user_agent
  • navigator_vendor
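
For readers who want to try the same thing, here is a minimal sketch of disabling those three scripts. It assumes that `StealthConfig` in playwright_stealth exposes one boolean field per script, named exactly like the script; `DISABLED_SCRIPTS`, `stealth_overrides` and `apply_partial_stealth` are hypothetical names used only for illustration.

```python
# Names of the scripts reported as problematic in this issue.
DISABLED_SCRIPTS = ("navigator_languages", "navigator_user_agent", "navigator_vendor")


def stealth_overrides() -> dict:
    """Build keyword arguments that switch the listed scripts off."""
    return {name: False for name in DISABLED_SCRIPTS}


async def apply_partial_stealth(page) -> None:
    """Apply playwright_stealth to `page` with the listed scripts disabled."""
    # Imported lazily so this module can be loaded without the library installed.
    # Assumption: StealthConfig accepts these names as boolean keyword arguments.
    from playwright_stealth import StealthConfig, stealth_async

    await stealth_async(page, StealthConfig(**stealth_overrides()))
```

Call `await apply_partial_stealth(page)` wherever `stealth_async(page)` was called before.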
@b-hairston

b-hairston commented Jul 26, 2024

Yep, this fix works for me on playwright 1.45.1.

@shriyashfinov8

Won't blocking the navigator_user_agent script defeat the whole purpose, since the user agent would then be exposed?

@Rafiot
Author

Rafiot commented Aug 12, 2024

Afaict, not if you pass a useragent to playwright when you initialize the context: https://playwright.dev/python/docs/api/class-browser#browser-new-context-option-user-agent
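
A minimal sketch of that approach, using the `user_agent` option of `browser.new_context` from the linked docs. The user agent string is a placeholder, and `report_user_agent` is a hypothetical helper name, not part of either library.

```python
import asyncio

# Placeholder value; substitute the user agent you actually want to present.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
)


async def report_user_agent(url: str, user_agent: str = USER_AGENT) -> str:
    """Open `url` in a context with a custom user agent and return what the page sees."""
    # Imported lazily so this module can be loaded without playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        # The user agent is set on the context, so no init script is needed for it.
        context = await browser.new_context(user_agent=user_agent)
        page = await context.new_page()
        await page.goto(url)
        seen = await page.evaluate("navigator.userAgent")
        await browser.close()
        return seen


# Usage: asyncio.run(report_user_agent("https://example.com"))
```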

@shriyashfinov8

Can we pass the user agent when launching the browser instead? I have a use case where I want to avoid using a context, because it overloads the scraping instances and causes timeout issues.

@Rafiot
Author

Rafiot commented Aug 13, 2024

As far as I can tell, using a context is strongly recommended. But to answer your question (and after a quick look at the docs), no, it doesn't seem to be possible. I have never had that problem with contexts, though, so this is fine for me (note that I'm just a playwright_stealth user).

@Mattwmaster58
Contributor

The lib assumes that "All init scripts are combined", which doesn't seem to be the case anymore. To keep passing config options to the init scripts, you can use this naive workaround, which works for now:

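        import random
        import string
        from playwright_stealth import StealthConfig, stealth_async

        class FixedConfig(StealthConfig):
            @property
            def enabled_scripts(self):
                # Random global name that replaces the shared `opts` constant.
                key = "".join(random.choices(string.ascii_letters, k=10))
                for script in super().enabled_scripts:
                    if "const opts" in script:
                        # Turn the declaration into an assignment on `window`.
                        yield script.replace("const opts", f"window.{key}")
                        continue
                    # Point every other script at the same global.
                    yield script.replace("opts", f"window.{key}")

        # Inside an async function, with `page` being a Playwright Page:
        await stealth_async(page, FixedConfig())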
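
To make the effect of that rewrite concrete, here is a small library-free sketch that applies the same string replacement to two made-up script strings. `rename_opts` and `sample_scripts` are hypothetical and only stand in for the real init scripts, which are not reproduced here.

```python
import random
import string
from typing import Iterable, Iterator


def rename_opts(scripts: Iterable[str], key: str) -> Iterator[str]:
    """Apply the same replacement as the workaround, outside of StealthConfig."""
    for script in scripts:
        if "const opts" in script:
            # The declaration becomes an assignment to a property of `window`.
            yield script.replace("const opts", f"window.{key}")
            continue
        # Plain substring replacement: every occurrence of "opts" is rewritten,
        # including any that happen to sit inside a longer identifier.
        yield script.replace("opts", f"window.{key}")


# Made-up stand-ins for the library's scripts.
sample_scripts = [
    'const opts = {"vendor": "Google Inc."}',
    "Object.defineProperty(navigator, 'vendor', {get: () => opts.vendor})",
]

key = "".join(random.choices(string.ascii_letters, k=10))
for rewritten in rename_opts(sample_scripts, key):
    print(rewritten)
```

Because the replacement is a plain substring match, it would also rewrite "opts" inside unrelated identifiers, which is presumably why the workaround is described as naive.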

@FelipeSanchezCalzada

In my case the problem was that the added scripts are executed in a different order (idk why). This works for me:

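from typing import Optional
from playwright.async_api import Page
from playwright_stealth import StealthConfig

async def custom_stealth_async(page: Page, config: Optional[StealthConfig] = None):
    """teaches asynchronous playwright Page to be stealthy like a ninja!"""
    real_config = config or StealthConfig()
    # Join all enabled scripts into a single init script so they run in order.
    merged_scripts = '\n'.join(real_config.enabled_scripts)
    await page.add_init_script(merged_scripts)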

You may need to modify it to support sync; I only use async.
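
A synchronous variant could look like the sketch below. It is an untested adaptation that assumes the sync `Page.add_init_script` accepts the script source as its first argument, as the async one does; `merge_scripts` and `custom_stealth_sync` are hypothetical names.

```python
from typing import Any, Iterable, Optional


def merge_scripts(scripts: Iterable[str]) -> str:
    """Join the scripts in order so they are evaluated as one init script."""
    return "\n".join(scripts)


def custom_stealth_sync(page: Any, config: Optional[Any] = None) -> None:
    """Sync counterpart of custom_stealth_async: register one merged init script."""
    if config is None:
        # Imported lazily so this module can be loaded without the library installed.
        from playwright_stealth import StealthConfig

        config = StealthConfig()
    page.add_init_script(merge_scripts(config.enabled_scripts))
```

With the sync API this would be called as `custom_stealth_sync(page)` before navigating.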

@Mattwmaster58
Contributor

Fixed in my fork: https://github.com/Mattwmaster58/playwright_stealth
