The challenge of running an e-commerce site is amplified by the vast number of SKUs involved. With millions of products, the associated content, such as pricing information, product descriptions, images and reviews, can easily run into the hundreds of millions of items for some of the largest e-tailers. Though most sites predominantly use their own data, shooting their own images and filling out the other associated details in-house, some data is obtained by crawling the sites of product sellers or suppliers.
To understand how a database works, think of an Excel file. You have columns and rows. The columns define specific types of information, such as product name, price, color and many other fields for eCommerce websites. Rows contain the actual values for each of those columns. For example, row one may be a shoe product and will have "Nike Air Jordan" in the product column, "$185.00" in the price column and "White, Black" in the color column. As mentioned above, if you want customers to be able to search and sort by designer or manufacturer, you'll also need a column for "Designer" or "Manufacturer", and each product's row must hold the designer name in that column (e.g. "Nike"). This would then allow a customer to search for Nike-only products, and the website would look for all rows that have "Nike" in the "Designer" column.
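The designer lookup described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration, not how any particular storefront is built: the table name `products`, the column names, the `search_by_designer` helper and every row except the "Nike Air Jordan" example are assumptions made up for the sketch.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Columns define the kinds of information stored for every product.
conn.execute(
    "CREATE TABLE products (product TEXT, price REAL, color TEXT, designer TEXT)"
)

# Each tuple is one row. Only the first row comes from the text above;
# the other two are placeholders invented for illustration.
rows = [
    ("Nike Air Jordan", 185.00, "White, Black", "Nike"),
    ("Placeholder Runner", 90.00, "Blue", "ExampleBrand"),
    ("Placeholder Slide", 35.00, "Black", "Nike"),
]
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", rows)


def search_by_designer(connection, designer):
    """Return (product, price) for every row whose designer column matches."""
    cursor = connection.execute(
        "SELECT product, price FROM products WHERE designer = ? ORDER BY price",
        (designer,),
    )
    return cursor.fetchall()


nike_products = search_by_designer(conn, "Nike")
for product, price in nike_products:
    print(product, price)
```

Without the designer column there would be nothing for the `WHERE` clause to match against, which is why the field has to exist before customers can filter on it.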
We have a secret ingredient that helped us build an estimate from the ground up: proprietary data. Here at RJMetrics, we work with hundreds of online retailers who generously allow us to anonymize high-level data points for analyses like these.
By combining our proprietary data with size and revenue information from third-party sources like the Internet Retailer Top 500 Guide, Alexa, and BuiltWith, we've conducted a comprehensive bottom-up analysis of the ecommerce industry.
Obviously, the long tail is going to be very long here. Using BuiltWith to identify which websites have ecommerce technologies installed, we found 180,000 live websites with just the Magento shopping cart. When you extrapolate to include the full universe of competing ecommerce technologies, you can see how some estimates approach the one-million mark. As you might have guessed, however, the majority of these sites are not generating revenue on any meaningful scale.
Alexa rank is an easily obtained proxy for traffic. Alexa ranks every website in the world based on traffic volume. A global rank of 1 represents the website with the most traffic in the world (currently Google). Since ecommerce revenue is directly correlated with the number of visitors to a site, we theorized that Alexa rank could serve as a proxy for revenue. To test this, we needed revenue data for a set of ecommerce companies that spanned a broad spectrum of Alexa ranks.
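As a rough illustration of how such a test might be run once rank and revenue pairs are in hand, the sketch below computes a Pearson correlation between the logarithm of rank and the logarithm of revenue. The log-log approach is an assumption chosen for the example, not necessarily the method used in this analysis, and the four (rank, revenue) pairs are placeholders invented for illustration rather than real company data.

```python
import math

# Placeholder (Alexa rank, annual revenue in USD) pairs.
# These values are invented for illustration only.
sample = [
    (100, 800_000_000),
    (1_000, 150_000_000),
    (10_000, 9_000_000),
    (100_000, 1_200_000),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Both quantities span several orders of magnitude, so compare them on a log scale.
log_rank = [math.log10(rank) for rank, _ in sample]
log_revenue = [math.log10(revenue) for _, revenue in sample]

r = pearson(log_rank, log_revenue)
print(f"correlation between log(rank) and log(revenue): {r:.3f}")
```

A coefficient close to -1 would support the proxy idea, since a numerically lower rank means more traffic; a coefficient near 0 would suggest rank says little about revenue.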
To get revenue data, we turned to the Internet Retailer Top 500 guide and augmented it with our own proprietary benchmarking data set. The IR 500 includes the heaviest hitters in ecommerce, while our own data covers mid-sized and smaller companies. Between these two data sets, we had Alexa rank and revenue data on the full spectrum of ecommerce companies. Here's what we saw: