Living in today's digital world has made our lives easier in many respects, as the internet has become the go-to source for finding almost everything we need. This digital transformation has also created new challenges in how data can be assessed, collected, stored, and analyzed.
The number of…
Added by Paul Black on October 22, 2018 at 3:00am —
Originally posted at: https://www.octoparse.com/blog/what-is-web-scraping
What Is Web Scraping?
Web scraping is the process of extracting information from a website and transforming what is on the webpage into structured data for further analysis. It is also known as web harvesting or web data extraction. With the…
Added by Paul Black on August 30, 2018 at 7:00am —
What is web scraping?
Web scraping, also known as web extraction or web crawling, refers to the process of obtaining unstructured information from websites and turning it into structured, clean data in formats such as XLS, CSV, or TXT, or loading the captured data directly into a database. Some common uses of web scraping include lead generation, data collection for academic research, price monitoring from…
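As a rough illustration of turning page markup into structured data, here is a minimal Python sketch that uses only the standard library. The sample HTML, the class names `product`, `name`, and `price`, and the helper `html_to_csv` are all assumptions made up for this example; a real page would need its own parsing rules.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this HTML first.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">9.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">24.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text of the name and price spans inside each product item."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one dict per product found so far
        self._field = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            self.rows.append({})  # start a new record
        elif tag == "span" and attrs.get("class") in ("name", "price") and self.rows:
            self._field = attrs["class"]

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def html_to_csv(html):
    """Parse the markup and return the extracted records as CSV text."""
    parser = ProductParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()


print(html_to_csv(SAMPLE_HTML))
```

The same records could just as well be written to a spreadsheet or inserted into a database; CSV is used here only because it needs no extra libraries.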
Added by Paul Black on September 1, 2017 at 3:43am —
There are thousands of big data tools available for data analysis today. Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making. To save you time, in this post I will list 30 top big data tools for data analysis in the areas of open source data…
Added by Paul Black on August 27, 2017 at 10:42pm —
Web scraping software, also known as a data extraction tool, is software that collects data from websites. Picking a web scraping tool is rarely easy because so many are available now (refer to Top 30 Free Web Scraping Software to learn more). That’s why I decided to put the web scraping tool…
Added by Paul Black on August 3, 2017 at 11:30pm —
Web scraping, also known as web crawling, (web) data extraction, data mining, or screen scraping, is the process of collecting large amounts of data from the web and then saving it to a file, database, etc. Let’s dig deeper into web…
Added by Paul Black on May 15, 2017 at 4:00am —
Web crawling (also known as web scraping) is widely applied in many areas today. It aims to fetch new or updated data from websites and store that data for easy access. Web crawler tools are becoming well known to the general public, since they simplify and automate the entire crawling process and make web data easily accessible to everyone. Using a web crawler tool frees people from repetitive typing or copy-pasting, and we could expect a…
Added by Paul Black on April 26, 2017 at 4:35am —
Octoparse enables you to scrape the latest news articles from news sources.
Getting real-time data in Octoparse takes two steps: make a scraping task, then schedule it to run in the Octoparse cloud.
In this web scraping tutorial, we will scrape the latest news articles from TheStreet.com to get article information, such as the article title, body text, publication date/time, author, and article URL, with…
Added by Paul Black on January 8, 2017 at 10:57pm —
Octoparse enables you to scrape finance data from financial websites. Getting real-time data in Octoparse takes two steps: make a scraping task, then schedule it to run in the Octoparse cloud.
In this web scraping tutorial, we will use Octoparse to scrape stock data, such as the most active stocks, stock gainers, and stock losers, from Yahoo Finance.
The website URLs we will use are as follows.…
Added by Paul Black on January 5, 2017 at 11:05pm —
Octoparse enables you to scrape stock information from financial websites. Getting real-time data in Octoparse takes two steps: make a scraping task, then schedule it to run in the Octoparse cloud.
In this web scraping tutorial, we will use Octoparse to scrape stock data from the CNN Money website and get detailed information, such as the title, body, publication date/time, author, and article URL.
The website URL we will use is…
Added by Paul Black on January 5, 2017 at 11:01pm —
Q: How do I add a fixed value when scraping in Octoparse?
How do I add a fixed value as one of the data fields when making a scraping task in Octoparse?
The simplest method:
You can add a fixed value when you are in the "Extract Data" action:
1. Click "Add Pre-defined Fields".
2. Choose the “Add a fixed value…
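Conceptually, a fixed value is a constant column attached to every scraped row. The Python sketch below shows that idea outside Octoparse; it is not Octoparse's implementation, and the helper `add_fixed_value`, the field name `source`, and the sample rows are hypothetical.

```python
def add_fixed_value(rows, field, value):
    """Return new records, each with the same constant field attached."""
    return [{**row, field: value} for row in rows]


# Hypothetical scraped rows, for illustration only.
scraped = [{"title": "Item A"}, {"title": "Item B"}]
tagged = add_fixed_value(scraped, "source", "example-site")

for row in tagged:
    print(row)
```

A constant field like this is handy for recording where or when a batch of rows was collected, so that data from several tasks can be merged later without losing its origin.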
Added by Paul Black on January 5, 2017 at 5:20am —
Octoparse enables you to scrape reviews from yelp.com.
In this tutorial, we will use Octoparse to scrape all reviews about car audio in Brooklyn, NY, United States from yelp.com.
The website URL we will use is …
Added by Paul Black on January 3, 2017 at 1:41am —
In this tutorial we will scrape the phone numbers of all the hotels and their customer reviews in London from TripAdvisor.com with Octoparse.
The website URL we will use is https://www.tripadvisor.com/Hotels-g186338-London_England-Hotels.html.
The data fields include hotel name, number of reviews, address, ranking, phone number, customer…
Added by Paul Black on January 3, 2017 at 1:39am —
Octoparse enables you to scrape the search results from Yell.com. After you enter the items you want to search for in a certain region, clicking the “Search” button redirects you to the search results page.
In this tutorial, we will use Octoparse to scrape data about all restaurants in London from yell.com.
Then we will use the URL of the…
Added by Paul Black on January 3, 2017 at 1:37am —
Octoparse enables you to scrape data from eBay.com. To speed up the extraction, you can use our Cloud Extraction to split the scraping task into many sub-tasks. Our cloud servers will then collect the data quickly and provide you with a structured dataset.
To scrape product details from eBay.com as fast as possible, you can make two scraping tasks: Task 1 and Task 2. Task 1 scrapes the URLs of the product detail pages, and Task 2 scrapes all the product details from …
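The two-task idea can be sketched in plain Python as follows. This is only an illustration of the pattern, not how Octoparse or its Cloud Extraction works internally: the in-memory `LISTING_PAGES` and `DETAIL_PAGES` dictionaries stand in for real web pages, and every URL, field, and function name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "site" so the sketch runs without network access.
LISTING_PAGES = {
    "list?page=1": ["item/1", "item/2"],
    "list?page=2": ["item/3"],
}
DETAIL_PAGES = {
    "item/1": {"title": "Camera", "price": "120.00"},
    "item/2": {"title": "Tripod", "price": "35.00"},
    "item/3": {"title": "Lens", "price": "210.00"},
}


def collect_detail_urls(listing_urls):
    """Task 1: walk the listing pages and gather every product-detail URL."""
    urls = []
    for listing in listing_urls:
        urls.extend(LISTING_PAGES[listing])
    return urls


def scrape_detail(url):
    """Task 2 (one sub-task): extract the fields from a single detail page."""
    return {"url": url, **DETAIL_PAGES[url]}


def run_pipeline(listing_urls, workers=4):
    detail_urls = collect_detail_urls(listing_urls)
    # Each detail URL is independent, so the work can be split across workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scrape_detail, detail_urls))  # map keeps input order


records = run_pipeline(["list?page=1", "list?page=2"])
for record in records:
    print(record)
```

Separating URL collection from detail extraction means the second stage is a flat list of independent jobs, which is what makes it straightforward to divide among parallel workers.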
Added by Paul Black on January 3, 2017 at 1:35am —
Octoparse enables you to scrape yellowpages.com (www.yp.com). You can capture names, addresses, cities, phone numbers, websites, etc. for a certain profession in a region listed on yellowpages.com.
In this tutorial, we will use Octoparse to scrape all anesthesiologists in New York, NY, United States from yellowpages.com.
The website URL we will use is …
Added by Paul Black on January 3, 2017 at 1:33am —
Octoparse enables you to scrape an online dictionary into an organized list by entering a list of words. It is easy to use: a Loop mode for entering a text list retrieves the definition and examples of each word you want.
In this tutorial, I will show you how to scrape the definitions of some words from merriam-webster.com.
The website URL we will use is …
Added by Paul Black on December 29, 2016 at 9:43pm —
Collecting online customer reviews, including star ratings, comments, likes, dislikes, images, videos, share channels, etc., can help an online retailer better understand whether a product is a good purchase and popular among customers, and adjust marketing strategies accordingly. There are many web scraping tools available online to live up to your expectations to scrape…
Added by Paul Black on December 29, 2016 at 9:00pm —
Octoparse offers the most convenient way to scrape data from websites. Although little programming knowledge is required, some users still say they have no idea how to use Octoparse. This post therefore aims to help our new users settle into Octoparse smoothly.
Below you will find links to 10 of the most helpful tutorials to support your first steps in Octoparse. These guides will not only help you in scraping different kinds of website structures,…
Added by Paul Black on December 29, 2016 at 8:47pm —
We all want a neat Excel spreadsheet of the scraped data before moving on to further analysis.
With Octoparse, you can fetch the data you want from websites and have it ready for use. Our cloud services enable you to fetch large amounts of data by running your scraping task with Cloud Extraction. This assumes you know how to handle the situations that arise when using Cloud Extraction to scrape sites.
We summarize several problems encountered by our…
Added by Paul Black on December 27, 2016 at 4:14am —