How to Use Selenium Wire With Bright Data in 2023
As someone with over 5 years of experience in proxies and browser automation, let me walk you through how to leverage Selenium Wire to build unblockable web scrapers.
Selenium Wire Capabilities
Selenium WebDriver lets you automate browsers for testing, scraping and more. But it lacks native functionality for intercepting network requests and responses.
That's where Selenium Wire comes in!
It's a Python library that extends Selenium to give you complete control over the browser traffic.
Here are some examples of what you can do:
Inspect Responses
Analyze raw response content to understand a site's structure and identify extractable data elements.
No more guessing – just intercept responses and parse them out to build robust scrapers.
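Selenium Wire exposes captured traffic through `driver.requests`. As a minimal sketch (the helper function is our own, not part of the library), here's one way to summarize what the browser actually fetched:

```python
def summarize_requests(requests):
    """Summarize captured traffic as (url, status code, content type) tuples.

    Pass `driver.requests` from a Selenium Wire driver, e.g.:
        for row in summarize_requests(driver.requests):
            print(row)
    """
    rows = []
    for req in requests:
        if req.response is not None:  # some requests never receive a response
            rows.append((
                req.url,
                req.response.status_code,
                req.response.headers.get('Content-Type', ''),
            ))
    return rows
```

Iterating the summary after a `driver.get()` quickly reveals which XHR or API calls carry the data you actually want to scrape.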
Bypass Anti-Scraping Measures
Debug scrapers by inspecting error messages, status codes, and response bodies to understand blocks.
Struggling with a specific block page? Check the raw response to reverse-engineer the anti-scraping tactic.
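To make that debugging systematic, a small filter over the captured traffic helps (a sketch; the function is ours, and it assumes the `driver.requests` list Selenium Wire provides):

```python
def failed_requests(requests, threshold=400):
    """Return (url, status) for captured requests whose status signals an error or block."""
    return [
        (req.url, req.response.status_code)
        for req in requests
        if req.response is not None and req.response.status_code >= threshold
    ]

# Usage with a Selenium Wire driver:
#   for url, status in failed_requests(driver.requests):
#       print(status, url)   # 403s and 429s usually indicate anti-bot measures
```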
Mock Scenarios
Modify request parameters on-the-fly to test edge cases or simulate certain conditions without needing server-side changes.
Quickly build negative test cases by changing form data, headers etc.
Throttle Requests
Control the concurrency and pacing of network calls to avoid overloading servers and getting rate limited.
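Selenium Wire has no built-in rate limiter, but a request interceptor can enforce pacing. A minimal sketch (the one-second minimum delay is an arbitrary choice; tune it per target site):

```python
import time

MIN_DELAY = 1.0        # minimum seconds between outgoing requests
_last = {'t': 0.0}     # timestamp of the previous request

def throttling_interceptor(request):
    """Sleep just long enough to keep at least MIN_DELAY between requests."""
    wait = MIN_DELAY - (time.monotonic() - _last['t'])
    if wait > 0:
        time.sleep(wait)
    _last['t'] = time.monotonic()

# Attach it to a Selenium Wire driver:
#   driver.request_interceptor = throttling_interceptor
```

Note that interceptors run synchronously, so sleeping inside one genuinely delays the browser's network calls.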
Block Resources
Strip unnecessary assets like images, JS files etc. to optimize page load speeds. Especially useful when scraping a large number of pages.
According to HTTP Archive, the average page weight in 2019 was 1,800 KB. Of this, images contributed to 65% – over 1,100 KB per page.
Blocking them makes the scrapers lighter and faster.
In essence, Selenium Wire transforms Selenium from a mere browser automation tool to a versatile web scraper.
Let's look at how you can harness its capabilities.
Getting Started with Selenium Wire
Let's install Selenium Wire and make our first request:
Installation
Ensure Python 3.7+ is installed on your system.
You can check your Python version by running:
```shell
python --version
```
If it's lower, I recommend upgrading Python first.
Once that's done, run this command:
```shell
pip install selenium-wire
```
This will install Selenium Wire and its main dependency – Selenium.
💡 Tip: You can confirm they are installed with:

```shell
pip list
```
If you have an older Selenium version, upgrade it:
```shell
pip install --upgrade selenium
```
This ensures compatibility with Selenium Wire.
Make Your First Request
Let's open a page and print its content using regular Selenium:
```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://scrapeme.live/shop/")

body = driver.find_element(By.TAG_NAME, 'body')
print(body.text)

driver.quit()
```
- Initializes ChromeDriver
- Opens the ScrapeMe site
- Finds the `body` element
- Prints its visible text content
To integrate Selenium Wire, replace the import with:
```python
from seleniumwire import webdriver
```
Run the script and you'll see the site load up in Chrome!
After closing the browser, the full page text prints.
Congratulations! 🎉
You just made your first Selenium Wire request. Let's inspect the traffic next.
Parse Response to JSON
The site displays some Pokémon with their prices. We'll scrape that data into a JSON object.

Use the `woocommerce-loop-product__title` class to extract names and `woocommerce-Price-amount` for prices:
```python
import json

driver.get('https://scrapeme.live/shop/')

products = driver.find_elements(By.CSS_SELECTOR, '.products > li')

data = {}
for product in products:
    name = product.find_element(By.CLASS_NAME, 'woocommerce-loop-product__title').text
    price = product.find_element(By.CLASS_NAME, 'woocommerce-Price-amount').text
    data[name] = price

print(json.dumps(data, indent=2, ensure_ascii=False))

driver.quit()
```
- Loops through the product list
- Extracts name and price
- Stores in a dictionary
- Prints output as JSON
This will print something like:

```json
{
  "Pikachu": "$20",
  "Charmander": "$25",
  "Squirtle": "$22"
}
```
We've made our first request and parsed the data!
Now let's look at how we can use Bright Data proxies to prevent blocks while scraping.
Avoid Blocks with Bright Data Proxies
Unfortunately, scrapers built using Selenium Wire can also get blocked by target sites.
Some common blocking techniques include:
- IP Blocks – Sites blacklist your IP if they detect automation
- CAPTCHAs – Difficult for bots but easy for humans
- Behavior Analysis – Analyze mouse movements, scrolls, clicks etc.
So how do we bypass them? Proxies to the rescue!
How Proxies Help Avoid Blocks
A proxy acts as an intermediary that forwards traffic between your scraper and target sites.
Instead of connecting directly, all communication is routed through the proxy server.
This means that sites see the proxy's IP instead of your scraper's real IP.
Here's how it helps evade blocks:
Benefits include:
✅ Masks scraper IP to prevent IP blocks
✅ Allows geo-targeting content
✅ Adds an extra layer of anonymity
💡 Fun Fact: Proxies have been a scraping staple since the early 2000s, when developers started using them to bypass IP blocks!
However, using just any proxy has downsides:
❌ Blocked proxies – Sites can detect and blacklist them
❌ Slow proxies – High latency, unstable connections
❌ CAPTCHAs – Need to solve them manually

❌ CAPTCHA farms – Expensive, limited availability
This is where Bright Data's proxies shine!
Rotate Proxies to Avoid Blocks
Bright Data provides a reliable pool of 3 million+ residential IPs perfect for automation.
The key features you'll love:
1. Unlimited Bandwidth – Scrape without worrying about usage limits
2. Speed up to 1 Gbps – Ensures blazing fast page loads
According to Cloudflare speed tests, the average internet speed globally is just 80 Mbps. That's over 10x slower than Bright Data proxies!
3. Automatic Rotation – Each request uses fresh proxies to avoid blocks.
4. 99.9% Uptime – Available whenever you need them
5. CAPTCHA Solving – No need to manually solve pesky tests
Simply set your scraper to route via Bright Data proxies and it will automatically rotate IPs.
Sites have no way to link the traffic back to your scraper! 🕵️‍♂️
Using Proxies in Selenium Wire
Let's integrate Bright Data proxies into Selenium Wire:
1. Get Credentials
Create a proxy zone in the Bright Data control panel and note down your credentials:
```
CUSTOMER_ID:password
```
2. Define the Proxy
```python
PROXY = 'http://CUSTOMER_ID:[email protected]:8000'
```
3. Set Proxy in Options
```python
seleniumwire_options = {
    'proxy': {
        'https': PROXY,
        'http': PROXY,
    }
}

driver = webdriver.Chrome(
    seleniumwire_options=seleniumwire_options
)
```
This routes all traffic through Bright Data proxies! 🚦
4. Rotate Proxies
To automatically rotate proxies and associated IP addresses with each request, use:
```python
PROXY = 'https://CUSTOMER_ID:[email protected]:8000'
```
And now your scraper is resilient to even the toughest blocks! 💪
Customize Selenium Wire
With the ability to intercept traffic, let's see how to leverage Selenium Wire capabilities for customization.
Modify Request Headers
Scrapers using default Selenium headers are quite easy to detect. To set custom ones:
```python
CUSTOM_UA = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0'

def request_interceptor(request):
    # Delete the original header first – assigning alone would add a duplicate
    del request.headers['User-Agent']
    request.headers['User-Agent'] = CUSTOM_UA

driver.request_interceptor = request_interceptor
```
We override the User-Agent header to spoof a Chrome browser on Linux.
You can also set other headers like:
```python
request.headers['Accept-Language'] = 'en-US'
```
Mock Parameters
Intercepting requests allows mocking data by modifying parameters on the fly:
```python
def request_interceptor(request):
    if request.path == '/login':
        # Request bodies are bytes in Selenium Wire
        request.body = b'{"username": "test", "password": "1234"}'

driver.request_interceptor = request_interceptor
```
This overrides login credentials without needing server-side changes!
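Selenium Wire can also short-circuit a request entirely with `request.create_response()`, so the browser never contacts the server at all. A sketch (the endpoint and payload are made up for illustration):

```python
def stubbed_login_interceptor(request):
    """Answer /login locally with a canned JSON response."""
    if request.path == '/login':
        request.create_response(
            status_code=200,
            headers={'Content-Type': 'application/json'},
            body=b'{"token": "stub-token"}',
        )

# driver.request_interceptor = stubbed_login_interceptor
```

This is handy for testing how a page behaves on success or failure without touching the real backend.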
Block Resources
Strip unnecessary assets to optimize page load speeds:
```python
def request_interceptor(request):
    if request.path.endswith('.png'):
        request.abort()

driver.request_interceptor = request_interceptor
```
Now PNG images won't load, speeding up page scraping 🚀
Optimizing Selenium Wire
There are two core techniques to optimize Selenium Wire performance:
1. Browser Profile Configuration
Tweak settings like extensions, user-agent etc. to balance stealth and speed:
```python
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")  # `options.headless` was removed in newer Selenium releases
options.add_argument("--disable-gpu")

driver = webdriver.Chrome(options=options)
```
2. Request Interception
As discussed before, abort slow resource requests:
```python
def request_interceptor(request):
    if request.path.endswith('.png'):
        request.abort()

driver.request_interceptor = request_interceptor
```
Get the right blend of configuration and interception tailored to your specific scraping needs.
Pro Tip: Use a profiler to understand page components and identify blocking candidates.
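Capture storage itself is a common bottleneck: by default Selenium Wire records every request. Two documented knobs help keep that bounded (a config sketch; the regex and the 500-request limit are example values):

```python
# Bound how much captured traffic Selenium Wire keeps around.
seleniumwire_options = {
    'request_storage': 'memory',       # keep captures in RAM rather than on disk
    'request_storage_max_size': 500,   # discard the oldest entries past 500 requests
}
# driver = webdriver.Chrome(seleniumwire_options=seleniumwire_options)

# Only capture URLs matching these regexes; everything else passes through uncaptured.
# driver.scopes = [r'.*scrapeme\.live.*']
```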
Now let's discuss scaling up to industrial levels with Bright Data's proxy network.
Scale Web Scraping with Bright Data
For most scrapers, Selenium Wire itself poses scalability challenges:
❌ Doesn't work reliably beyond 10-20 concurrent threads
❌ Limited by single machine's compute resources
❌ Reaches memory limits when parsing large responses
❌ Exceptions and stale element issues emerge
Industrial scrapers need resilience, speed, and scale via distributed architectures.
This is where Bright Data's Proxies and Browser Automation solution comes in!
It provides a suite of robust tools for automation including:
🖥️ Headless Browsers – Chromium, Playwright, Puppeteer
🌐 Proxies – 3 million IPs with unlimited bandwidth
🤖 Anti-Bot Protection – Avoid toughest blocks and CAPTCHAs
☁️ Cloud Infrastructure – Distributed scraping from multiple regions
📈 Scalable – Horizontally scale to millions of requests per day
Let's go over the key capabilities:
Headless Browser Automation
In addition to Selenium Wire, Bright Data provides scripts for Playwright, Puppeteer and Chromium:
```python
import brightdata

browser = brightdata.Chrome()
```
That's it! Just start making requests through the `browser` instance.
It handles proxies, cookies, blocks and more automatically in the background.
Distributed Proxy Infrastructure
All traffic routes via Bright Data's distributed residential proxies:
3 million+ IPs spread globally across 150+ locations like US, UK, Canada, France etc.
It lets you target geo-restricted sites easily.
Plus, automatic rotation keeps IP blocks at bay.
According to Bright Data, sites detect and block public proxies within 5 minutes on average.
But their private pools avoid blocks for months! 🕵️‍♂️
Anti-Bot Protection
Major sites like Facebook, Google employ advanced tactics including:
❗️ Fingerprinting and machine learning
❗️ IP analysis
❗️ JS challenge codes

❗️ Behavioral analysis
Bright Data can reliably bypass them all! Just set:
```python
browser = brightdata.Chrome(enable_antibot=True)
```
It'll handle the challenges seamlessly keeping your scrapers uninterrupted.
Optimized Performance
All traffic routes through optimized paths ensuring:
⚡️ Fast page loads – Near 1 Gbps speeds
⚡️ Low latency – Direct peerings with sites
⚡️ High concurrency – Distributed infrastructure

⚡️ No blocks – Automatic rotation
You get the speed and scale suited for enterprise workloads.
Reliability At Scale
Battle-tested by Fortune 500 customers, Bright Data proxies sustain heavy use cases:
📈 Billions of requests per month
🔁 Concurrency up to hundreds of thousands of threads
⌛️ Uptime of 99.999% globally
Whether you need to scrape search engines, ecommerce sites, social media or more – it handles them all.
In short, Bright Data lets your scrapers run 24/7 without interruptions or blocks.
Sign up for a free trial with $5 credit to experience it firsthand.
Alternative Solutions
Now you might be wondering – if Bright Data already provides browser automation capabilities, where does Selenium Wire fit?
Here are some examples where Selenium Wire shines:
Pure Python Scrapers
If you want to build scraping scripts in Python without external dependencies, Selenium Wire is perfect.
It gives all capabilities like proxies and customization without needing additional libraries.
Headless Browser Development
For testers or developers working specifically on headless browser projects, Selenium Wire + Selenium provides a robust toolkit.
You get greater visibility into the automated traffic for debugging.
Open-Source Philosophy
As an open-source tool, Selenium Wire aligns better for teams preferring non-commercial solutions.
It gives good enough functionality for small-scale needs.
In essence, choose Selenium Wire if you value independence, control, and customization.
Otherwise, Bright Data makes scalable automation drop-dead simple!
Conclusion
Let's summarize the key things we learned about Selenium Wire:
✅ Inspect traffic – Analyze raw responses to understand site structure
✅ Customize requests – Change headers, mock data, block resources
✅ Avoid blocks – Rotate Bright Data proxies to prevent IP bans
✅ Optimize performance – Balance browser profiles and interception
✅ Scale scraping – Leverage Bright Data's industrial-grade proxy network
✅ Alternatives – Use Selenium Wire for open-source based scrapers
Phew, that was a comprehensive guide!
We went all the way from basics like making requests to advanced customization and scaling techniques.
Whether you are a hobbyist scraper looking to learn or an expert seeking a reference guide – hope you found it helpful!