Playwright Anti-Detection Automation Script Explained
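
Code overview

This article presents a Python script that uses Playwright for web automation and is tuned specifically against anti-bot detection. The script imitates human behavior: it opens the Baidu search page, types a query, and submits the search, while using several techniques to evade the site's anti-crawler checks.

Environment setup

    pip install playwright
    python -m playwright install chromium

Full code with detailed comments

    from playwright.sync_api import sync_playwright
    import time
    import random


    def human_delay(a=0.5, b=1.5):
        # Sleep for a random interval to imitate human pauses
        time.sleep(random.uniform(a, b))


    with sync_playwright() as p:
        # Disable the automation flag: --disable-blink-features=AutomationControlled
        browser = p.chromium.launch(
            headless=False,
            args=["--disable-blink-features=AutomationControlled"],
        )
        page = browser.new_page()
        # Inject a script that hides the webdriver property
        page.add_init_script(
            "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
        )
        page.goto("https://baidu.com")
        page.fill("#chat-textarea", "ChatGPT")
        human_delay()
        page.click("#chat-submit-button")
        text = page.inner_text("body")
        print(text)
        input("Press Enter to close the browser...")
        browser.close()

Version that saves cookies

This version launches a persistent context, so cookies and other profile data are written to the my_profile directory and reused on the next run. It also parses the page with BeautifulSoup, which requires pip install beautifulsoup4.

    from playwright.sync_api import sync_playwright
    from bs4 import BeautifulSoup
    import time
    import random


    def human_delay(a=0.5, b=1.5):
        # Sleep for a random interval to imitate human pauses
        time.sleep(random.uniform(a, b))


    with sync_playwright() as p:
        context = p.chromium.launch_persistent_context(
            user_data_dir="my_profile",  # this parameter only exists on launch_persistent_context
            headless=False,
            args=["--disable-blink-features=AutomationControlled"],
        )
        page = context.new_page()
        # Inject a script that hides the webdriver property
        page.add_init_script(
            "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
        )
        page.goto("https://baidu.com")
        page.fill("#chat-textarea", "ChatGPT")
        human_delay()
        page.wait_for_timeout(2000)  # or wait for a result selector
        page.click("#chat-submit-button")
        page.wait_for_timeout(2000)  # or wait for a result selector
        text = page.inner_text("body")
        html = page.content()
        # print(html)
        soup = BeautifulSoup(html, "html.parser")
        print(soup.title.text)
        print(soup.get_text())
        # Save the rendered page for later inspection
        with open("baidu.html", "w", encoding="utf-8") as f:
            f.write(html)
        input("Press Enter to close the browser...")
        try:
            context.close()
        except Exception:
            # The window may already have been closed by hand
            pass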
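
The cookie-saving script pauses with a fixed page.wait_for_timeout(2000) and notes in a comment that waiting for a result selector is an alternative. The sketch below shows one way that alternative could look. The helper name submit_and_wait, its delay parameter, and the result selector #content_left are assumptions made for illustration and are not part of the original script; check the live page for the real selector before relying on it.

```python
import random
import time


def submit_and_wait(page, input_selector, submit_selector, result_selector,
                    query, timeout_ms=10_000, delay=(0.5, 1.5)):
    """Fill a search box, click submit, and block until a result element appears.

    `page` is expected to be a Playwright sync Page. `result_selector` is
    whatever element signals that results have rendered (an assumption here).
    """
    page.fill(input_selector, query)
    # Random human-like pause between typing and clicking
    time.sleep(random.uniform(*delay))
    page.click(submit_selector)
    # Raises Playwright's TimeoutError if the element does not appear in time
    page.wait_for_selector(result_selector, timeout=timeout_ms)
    return page.inner_text("body")


def demo():
    """Hypothetical usage; requires Playwright and an installed Chromium."""
    from playwright.sync_api import sync_playwright
    from playwright.sync_api import TimeoutError as PlaywrightTimeoutError

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto("https://baidu.com")
        try:
            text = submit_and_wait(
                page,
                "#chat-textarea",        # selector taken from the article
                "#chat-submit-button",   # selector taken from the article
                "#content_left",         # assumed result container, verify it
                "ChatGPT",
            )
            print(text)
        except PlaywrightTimeoutError:
            print("Results did not appear before the timeout")
        finally:
            browser.close()
```

Calling demo() runs the flow end to end. Compared with a fixed two-second sleep, waiting on a selector returns as soon as the results render and fails loudly when they never do, at the cost of depending on a selector that may change.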