Puppeteer Integration
Puppeteer integration with Incogniton enables browser automation using custom profiles. This combination allows for web scraping, testing, and data extraction while maintaining consistent browser fingerprints. Learn how to set up and use Puppeteer with Incogniton profiles below.
Official Incogniton SDKs are currently under development and are expected to be available in the coming weeks. In the meantime, this guide provides a reliable solution for leveraging Puppeteer with Incogniton to automate browser tasks.
Prerequisites
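- Node.js: Version 14.x or higher recommended (the example below relies on the built-in fetch API, which is available without flags from Node.js 18)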
- Puppeteer-core: Install via npm or yarn
How to Set Up Puppeteer with Incogniton
Follow these steps to set up Puppeteer with Incogniton for browser automation:
Install Puppeteer-core
Run the following command in your Node.js project. Unlike regular Puppeteer, puppeteer-core doesn't download Chromium, making it ideal for use with Incogniton's browser profiles.
npm install puppeteer-core
Usage (JavaScript)
Here's an example of using Puppeteer to scrape a page:
import puppeteer from 'puppeteer-core'

const main = async () => {
  try {
    const profileId = 'your-profile-id' // Replace with your actual profile ID
    const launchUrl = 'http://localhost:35000/automation/launch/puppeteer'
    const requestBody = {
      profileID: profileId,
    }

    // Ask the locally running Incogniton app to launch the profile
    const response = await fetch(launchUrl, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(requestBody),
    })
    if (!response.ok) {
      throw new Error(`Launch request failed with status ${response.status}`)
    }
    const data = await response.json()
    const { puppeteerUrl } = data
    if (!puppeteerUrl) {
      throw new Error('Launch response did not include puppeteerUrl')
    }

    console.log('The Incogniton browser is launching...')
    await new Promise((resolve) => setTimeout(resolve, 30000)) // Wait for browser to launch

    // Attach Puppeteer to the already running Incogniton browser
    const browser = await puppeteer.connect({
      browserURL: puppeteerUrl,
      acceptInsecureCerts: true,
    })

    const page = await browser.newPage()
    await page.goto('https://example.com', { waitUntil: 'networkidle2' })

    // Read the text content of the element with id "sample-id"
    const text = await page.$eval('#sample-id', (el) => el.textContent)
    console.log(`Extracted text: ${text}`)

    await browser.close()
    console.log('Browser session closed successfully.')
  } catch (error) {
    console.error('Error during scraping session ->', error)
  }
}

main()
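The fixed 30-second wait in the example is simple but either wastes time or falls short on a slow machine. One possible alternative, sketched below, is to retry the connection until the browser is ready. The `retry` and `sleep` helpers and their default values are hypothetical names introduced for illustration; they are not part of Puppeteer or Incogniton.

```javascript
// Small delay helper built on setTimeout.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Call an async function until it succeeds or the attempts run out.
// Rethrows the last error if every attempt fails.
async function retry(fn, { attempts = 10, delayMs = 3000 } = {}) {
  let lastError
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn(attempt)
    } catch (error) {
      lastError = error
      if (attempt < attempts) await sleep(delayMs)
    }
  }
  throw lastError
}

// Hypothetical usage inside main(), replacing the fixed wait (not executed here):
// const browser = await retry(
//   () => puppeteer.connect({ browserURL: puppeteerUrl, acceptInsecureCerts: true }),
//   { attempts: 10, delayMs: 3000 }
// )
```

With these assumed defaults the script waits at most about 30 seconds in total, but continues as soon as a connection attempt succeeds.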
Key Features
- Headless Mode: Run Chromium in the background for efficient automation.
- Rich APIs: Take screenshots, emulate devices, or intercept network requests.
- Fast Integration: Seamlessly interact with any web page, even those with dynamic content.
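As a rough illustration of the network interception mentioned above, the sketch below skips heavy resources to speed up scraping. It uses Puppeteer's `page.setRequestInterception`, `request.resourceType`, `request.abort`, and `request.continue`; the helper names and the list of blocked types are assumptions chosen for this example.

```javascript
// Resource types to skip; this list is an illustrative choice, not a recommendation.
const BLOCKED_RESOURCE_TYPES = new Set(['image', 'media', 'font'])

function shouldBlock(resourceType) {
  return BLOCKED_RESOURCE_TYPES.has(resourceType)
}

// `page` is a Puppeteer Page, for example the result of browser.newPage().
async function blockHeavyResources(page) {
  await page.setRequestInterception(true)
  page.on('request', (request) => {
    if (shouldBlock(request.resourceType())) {
      request.abort()
    } else {
      request.continue()
    }
  })
}

// Hypothetical usage before navigating (not executed here):
// await blockHeavyResources(page)
// await page.goto('https://example.com', { waitUntil: 'networkidle2' })
// await page.screenshot({ path: 'example.png' })
```

Call the helper before `page.goto` so that interception is active for the first request.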
Troubleshooting
- Use try/catch for robust error handling in async code
- Increase the timeout value to resolve timeout errors when navigating to a page
- Use the --no-sandbox flag if Puppeteer is not launching correctly due to permission issues
- Run in non-headless mode or spoof the user-agent string to bypass blocked network requests or CAPTCHAs
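The first two tips can be combined in a small helper, sketched here under some assumptions. `gotoWithLongerTimeout` is a hypothetical name and the 60-second value is arbitrary; `page.setDefaultNavigationTimeout` and the `timeout` option of `page.goto` are Puppeteer APIs, and Puppeteer's timeout errors carry the name `TimeoutError`.

```javascript
// `page` is a Puppeteer Page; the 60-second default is an arbitrary example value.
async function gotoWithLongerTimeout(page, url, timeoutMs = 60000) {
  // Also applies to later navigations on this page
  page.setDefaultNavigationTimeout(timeoutMs)
  try {
    await page.goto(url, { waitUntil: 'networkidle2', timeout: timeoutMs })
    return true
  } catch (error) {
    if (error.name === 'TimeoutError') {
      console.error(`Navigation to ${url} timed out after ${timeoutMs} ms`)
      return false
    }
    throw error // Anything other than a timeout is left to the caller
  }
}

// Hypothetical usage (not executed here):
// const loaded = await gotoWithLongerTimeout(page, 'https://example.com')
```

Returning a boolean for timeouts while rethrowing other errors keeps slow pages distinguishable from genuine failures such as a dropped connection.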