By Christian Prokopp on 2022-05-10
Get huge, valuable datasets with 4.9 million Amazon bestsellers for free. No payment, registration or credit card is needed.
Find the latest and biggest dataset yet at Free Amazon Product and Bestseller Data.
Download the bestsellers of Amazon.com (2,090,907 products), Amazon.de (1,431,524 products), and Amazon.co.uk (1,377,192 products). The products list all categories in which they rank in the top 100, the product name, reviews, review average, offer price, number of offers, and extra tag data like author, type, brand, etc.
The datasets* contain all Amazon Germany, UK and US bestsellers, i.e. the top 100 products of every category. Bold Data retrieved them on the 8th and 9th of May 2022. For a detailed list of the data attributes (column names) and their descriptions, go to the end of the post.
Note that Bold Data can provide this data on request, updated at any frequency and as time-series data, including trends in categories, prices, reviews, rankings, etc., or any other data. Get in touch by email to start a conversation.
Startups and established businesses use Amazon bestseller data to analyse their pricing, supply, new product and category strategies. Students use it to study data analytics, business intelligence, data science or machine learning, or for online courses and competitions, for example with Coursera or Kaggle. Researchers at universities use it to analyse market changes, retail and e-commerce.
A random sample of one thousand products from Amazon.com.
A random sample of one million products from Amazon.com.
The full bestseller dataset consisting of 2,090,907 products was retrieved from Amazon.com and stored as a gzipped CSV file. Please send an email request to receive free access.
A random sample of one thousand products from Amazon.co.uk.
A random sample of one million products from Amazon.co.uk.
The full bestseller dataset consisting of 1,377,192 products was retrieved from Amazon.co.uk and stored as a gzipped CSV file. Please send an email request to receive free access.
A random sample of one thousand products from Amazon.de.
A random sample of one million products from Amazon.de.
The full bestseller dataset consisting of 1,431,524 products was retrieved from Amazon.de and stored as a gzipped CSV file. Please send an email request to receive free access.
If you need help with the data, contact Christian, the founder of Bold Data. If you want to stay up to date with information on these and future datasets, subscribe to the email list (see the bottom or top right for links).
If you have specific dataset needs and want to inquire about Bold Data's services do contact Christian. This can be specific to Amazon, e.g. frequent updates, detailed product data or different countries. It can also be completely different websites, datasets or analyses you are interested in.
Below are the dataset column names and their meaning.
sku: The unique product identifier (ASINs in these datasets).
name: The product name.
review_avg: The average review rating.
review_count: The total number of reviews.
ranks: All category identifiers and the associated bestseller rank in each category.
min_rank: Best (smallest) rank across bestseller categories.
max_rank: Worst (largest) rank across bestseller categories.
ranks_count: Number of bestseller categories the product was found in.
offer: Best offer price in local currency, e.g. GBP, USD, EUR.
offers: Number of offers (may include used or warehouse offers).
tag1: Additional data like product type, author, brand, etc.
tag2: Additional data like product type, author, brand, etc.
request_date: Date the data was retrieved.
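As a starting point for working with the columns above, here is a minimal Python sketch that reads one of the gzipped CSV files using only the standard library. It rests on several assumptions not confirmed in this post: that the file has a header row with exactly these column names, that it is UTF-8 encoded, and that review_count, min_rank, max_rank, ranks_count and offers hold whole numbers while review_avg and offer hold decimals. The function names are hypothetical, and the ranks column is left as raw text because its format is not described here.

```python
import csv
import gzip
from statistics import mean

# Column names come from the list above; the numeric types are assumptions.
INT_COLUMNS = ("review_count", "min_rank", "max_rank", "ranks_count", "offers")
FLOAT_COLUMNS = ("review_avg", "offer")


def _to_float(value):
    """Convert a CSV cell to float, returning None for empty or malformed cells."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def _to_int(value):
    """Convert a CSV cell to int (accepting values like '12.0'), or None if not possible."""
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return None


def load_products(path):
    """Read a gzipped CSV into a list of dicts with the numeric columns parsed."""
    products = []
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        for row in csv.DictReader(handle):
            for column in INT_COLUMNS:
                row[column] = _to_int(row.get(column))
            for column in FLOAT_COLUMNS:
                row[column] = _to_float(row.get(column))
            products.append(row)  # 'ranks' and the tags stay as raw strings
    return products


def top_ranked(products, limit=10):
    """Return up to `limit` products with the best (smallest) min_rank."""
    ranked = [p for p in products if p["min_rank"] is not None]
    return sorted(ranked, key=lambda p: p["min_rank"])[:limit]


def average_offer(products):
    """Mean offer price over products that have one, or None if none do."""
    prices = [p["offer"] for p in products if p["offer"] is not None]
    return mean(prices) if prices else None
```

To try it, pass the path of a downloaded sample to load_products, then hand the result to top_ranked or average_offer. Loading every row into memory is fine for the one-thousand-product samples; for the full datasets, processing rows one at a time inside the loop may be a better fit.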
*Note that no guarantees are made about the completeness or accuracy of the data and no liabilities arise from the download or use of the data. The data was collected from public sources as is and may contain restricted data like trademarks, language deemed inappropriate in certain circumstances, erroneous data or other unforeseen limitations.
The data may be used for private or commercial analysis and decision-making use only. Redistribution or resale of the data is prohibited unless explicitly agreed in writing by Bold Data Ltd. Where the data is used, e.g. for analysis, diagrams, charts or otherwise, attribution to the source, e.g. "Bold Data, https://www.bolddata.org" or equivalent, must be made.
Christian Prokopp, PhD, is an experienced data and AI advisor and founder who has worked with Cloud Computing, Data and AI for decades, from hands-on engineering in startups to senior executive positions in global corporations. You can contact him at christian@bolddata.biz for inquiries.