ParseHub Tutorial: Scraping 2 eCommerce Websites in 1 Project
TLDR
This tutorial shows how to scrape data from one website and use it as input on another. It walks through creating a project in ParseHub, selecting and extracting product names and prices from Amazon, and then using those product names as search terms on eBay. The process uses several ParseHub commands, including 'Select', 'Relative Select', 'Loop', and 'Go to Template', and stresses testing the project to confirm it behaves as expected. It is a practical guide for anyone who needs to scrape and combine data across different e-commerce platforms.
Takeaways
- Start by opening the ParseHub client and creating a new project with the URL of the website to be scraped.
- Use the interactive view within the ParseHub client to inspect the website and identify elements for data extraction.
- The ParseHub client interface is divided into three areas: project structure and settings, an interactive website view, and a data preview in CSV or JSON format.
- A new project starts with an empty Select command, which ParseHub places in the command structure by default.
- Use the Select command to identify and select elements such as product names on the website.
- Use the Relative Select command to relate one element to another, such as associating a product with its price.
- Move from one website to another within the same project by using the Go to Template command and creating a new template for each distinct HTML structure.
- Use the Loop command to iterate through a list of items, such as product names from Amazon, and perform an action for each one, such as searching on eBay.
- Browse mode lets you navigate the website as a normal visitor would while you build and test the data extraction.
- Test runs execute locally on your computer and help you debug and understand the project's behavior.
- After running the project, results can be downloaded in CSV or JSON format, or integrated with other applications through the provided API.
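The last takeaway, working with the downloaded results, can be sketched in Python. This is a minimal illustration, not ParseHub's own tooling: the key names (`amazon_product`, `ebay_product`, `name`, `price`) are assumptions based on how selections might be named in this tutorial, and the sample data is made up. It flattens a nested JSON result, where each Amazon product carries its eBay matches, into flat rows suitable for CSV.

```python
import csv
import io
import json

# Hypothetical results file. The key names mirror selection names a user
# might choose in the project; they are assumptions, not a fixed schema.
SAMPLE_JSON = """
{
  "amazon_product": [
    {
      "name": "USB-C Cable",
      "price": "$9.99",
      "ebay_product": [
        {"name": "USB-C Cable 2m", "price": "$7.50"},
        {"name": "USB-C Cable 1m", "price": "$5.25"}
      ]
    },
    {"name": "Desk Lamp", "price": "$24.00", "ebay_product": []}
  ]
}
"""

FIELDS = ["amazon_name", "amazon_price", "ebay_name", "ebay_price"]


def flatten_results(results):
    """Return one flat row per (Amazon product, eBay match) pair."""
    rows = []
    for amazon in results.get("amazon_product", []):
        # Keep Amazon products that have no eBay match as a single row
        # with empty eBay columns.
        ebay_items = amazon.get("ebay_product") or [{}]
        for ebay in ebay_items:
            rows.append({
                "amazon_name": amazon.get("name", ""),
                "amazon_price": amazon.get("price", ""),
                "ebay_name": ebay.get("name", ""),
                "ebay_price": ebay.get("price", ""),
            })
    return rows


def rows_to_csv(rows):
    """Serialize the flat rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten_results(json.loads(SAMPLE_JSON))
csv_text = rows_to_csv(rows)
print(csv_text)
```

With a real export, the string passed to `json.loads` would come from the downloaded file, and the key names would need to match the selection names used in the project.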
Q & A
What is the main topic of the tutorial?
- The tutorial demonstrates how to scrape data from one website and use it as input for another website using ParseHub.
What is the recommended approach for scraping websites with ParseHub?
- ParseHub normally recommends building a separate project for each website, but in some cases it is necessary to combine two different websites in one project.
How does one begin a new project in the ParseHub client?
- Open the ParseHub client, click 'New Project', and enter the URL of the website you want to scrape.
What are the three areas visible within the ParseHub client when a project is loaded?
- The left side contains the project structure and settings, the middle contains an interactive view of the website, and the bottom section previews the data in CSV or JSON format.
How can you select and extract product names from a website using ParseHub?
- With the 'Select' command active, click the title of the first product to select it. ParseHub then highlights similar elements in yellow. Click one of the highlighted products to select the rest, and ParseHub automatically extracts the names and URLs of these products.
How can you rename a selection in ParseHub?
- Double-click the command and enter the new name, for example 'Amazon Product' for the extracted product names.
What is the purpose of the 'Relative Select' command in ParseHub?
- The 'Relative Select' command ties each extracted piece of data, such as a price, to its corresponding product in the results file.
How does one switch between different templates in ParseHub?
- Click the 'Go to Template' command, enter the URL for the new template, and create a new template if necessary.
What is the 'Loop' command used for in ParseHub?
- The 'Loop' command iterates through a list of items, such as product names from Amazon, and performs actions on each item, such as searching for it on eBay.
How can you test a project in ParseHub?
- Click 'Get Data' at the bottom of the page and then click 'Test Run'. This runs the project locally on your computer so you can observe its behavior.
What are the different ways to run a project in ParseHub and view the results?
- During a test run, the 'Step In' button runs the project one step at a time, the 'Play' button runs it slowly, the 'Fast-Forward' button quickly shows the extracted data, and the 'Stop' button ends the test run. Once the project has finished, you can download the results in CSV or JSON format.
How can users get help with their specific projects in ParseHub?
- Users can contact ParseHub support at 'hello@parsehub.com' with any questions or issues related to their particular projects.
Outlines
Scraping Data from a Website and Using it on Another
This section introduces the process of scraping data from one website and using it on another. It begins with the recommended practice of creating separate projects for different websites, while acknowledging that some cases call for combining sources in one project. It then gives a step-by-step guide to setting up a project in the ParseHub client: starting a new project, choosing the target website (Amazon in this case), and navigating the ParseHub interface. It details the use of the Select command to extract product names and URLs, renaming selections for clarity, and the use of the Relative Select command to associate prices with their corresponding products. It also covers adjusting a selection so that it captures complete data, such as the full product price.
Using Extracted Data for Searching on a Different E-commerce Platform
This section explains how to use the data extracted from Amazon to search for products on eBay, a different e-commerce platform with a distinct HTML structure. It walks through creating a new template for eBay, using a Loop command to feed the list of Amazon product names in as search terms, and extracting the relevant information from eBay's search results. It also covers the Input command for typing into the search bar, selecting the search button, and creating new entries for eBay products. Finally, it describes testing the project locally with test runs, the different playback modes available for review, and the steps to retrieve and download the extracted data in the desired format. The tutorial closes with an offer of assistance for project-related questions and notes that the technique applies to other website combinations.
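Why Relative Select matters for pairing prices with products can be illustrated with a small Python model. This is a conceptual sketch only, not how ParseHub works internally: `ProductCard` and both functions are hypothetical names invented for the illustration. It contrasts two independent selections, which produce flat lists that drift out of alignment when a product has no price, with a relative selection that keeps each price attached to its product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProductCard:
    """Hypothetical stand-in for one product block on a listing page."""
    title: str
    price: Optional[str]  # None when the page shows no price for the product


def independent_select(cards):
    """Model two unrelated selections: all titles, and all prices found."""
    names = [card.title for card in cards]
    prices = [card.price for card in cards if card.price is not None]
    return names, prices


def relative_select(cards):
    """Model a relative selection: each price stays with its own product."""
    return [{"name": card.title, "price": card.price} for card in cards]


cards = [
    ProductCard("Keyboard", "$30.00"),
    ProductCard("Mouse", None),  # no price listed
    ProductCard("Monitor", "$120.00"),
]

names, prices = independent_select(cards)
records = relative_select(cards)

# The flat lists have different lengths, so pairing by position would
# wrongly give "Mouse" the monitor's price.
print(len(names), len(prices))
# The relative records keep the missing price in the right place.
print(records)
```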
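The Loop step can be pictured as an ordinary loop over the Amazon product names. The sketch below is an analogy in Python, not ParseHub's implementation: `loop_over_products`, `fake_search`, and the `FAKE_EBAY` catalog are all hypothetical, and the search is simulated with a dictionary lookup instead of a real request to eBay.

```python
from typing import Callable, Dict, List

Product = Dict[str, object]

# Made-up search results keyed by lowercase search term. In the real
# project these would come from eBay's results page.
FAKE_EBAY: Dict[str, List[Dict[str, str]]] = {
    "usb-c cable": [
        {"name": "USB-C Cable 2m", "price": "$7.50"},
        {"name": "USB-C Cable 1m", "price": "$5.25"},
    ],
}


def fake_search(term: str) -> List[Dict[str, str]]:
    """Simulate typing a term into the search bar and reading the results."""
    return FAKE_EBAY.get(term.lower(), [])


def loop_over_products(
    amazon_products: List[Product],
    search: Callable[[str], List[Dict[str, str]]],
) -> List[Product]:
    """For each Amazon product, search by name and attach the matches."""
    combined = []
    for product in amazon_products:
        matches = search(str(product["name"]))
        # Nest the eBay results under the Amazon product they belong to.
        combined.append({**product, "ebay_product": matches})
    return combined


amazon_products = [
    {"name": "USB-C Cable", "price": "$9.99"},
    {"name": "Desk Lamp", "price": "$24.00"},
]

combined = loop_over_products(amazon_products, fake_search)
for item in combined:
    print(item["name"], "->", len(item["ebay_product"]), "eBay matches")
```

Passing the search function in as a parameter keeps the loop logic separate from how the search is performed, which is roughly the role the separate eBay template plays in the project.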
Keywords
Web Scraping
ParseHub Client
Project Structure
Select Command
Relative Select Command
Template
Loop Command
Search Bar
Data Extraction
CSV and JSON Formats
Test Run
Highlights
Demonstrating the process of scraping data from one website and using it as input for another.
Recommendation to build separate projects for each website, but acknowledging exceptions.
Starting a new project in the ParseHub client by entering the URL of the website to scrape.
Using the interactive view to inspect the website and the preview panel to view data in CSV or JSON format.
The automatic placement of an empty selection command in the command structure.
Selecting and extracting product names from the first website using the Select command.
Renaming selections for clarity and better organization of the project.
Utilizing the relative select command to relate product prices to their corresponding products.
Adjusting the selection to capture the entire product price using zoom out functionality.
Creating a new template for eBay due to its different HTML structure compared to Amazon.
Using a loop command to iterate through the list of Amazon product names as search terms on eBay.
Extracting and organizing eBay product information based on the Amazon product names.
Selecting and extracting product names and prices on eBay following the same steps as on Amazon.
Testing the project using test runs to understand the project's behavior and functionality.
Running the project on the server and downloading the results in CSV or JSON formats.
Providing an API for integrating the scraped data with other applications.
Offering support for any project-related questions through contact with ParseHub.