Problem Statement
Given 2015 and 2016 Iowa liquor sales data, determine the best locations for a new storefront.
Risks and Assumptions
The team assumed that the best place to build a new store would be near stores that are performing well, but a store can perform well for several different reasons. They can have very high advertising budgets, could have monopolies on certain areas (maybe a University only allows 2 liquor stores near campus), great supply chain management, etc. To determine which location in Iowa we looked at County for the main predictor. This is because we wanted to start broad and then hone in on a specific city. All of subsequent information is dependent upon data from 2015 and Q1 2016, which is assumed to be representative of the larger data set of 2.7 million transactions. There were also several errors in the original dataset, but a majority of the data was able to be cleaned up.
Data Analysis
First, the current information was evaluated to see which stores had the highest sales and where they were located.
To determine where the client should build a new store, ‘Profit per Store’ was evaluated (the top 10 counties shown in graph). Johnson county had the highest profit per store followed by Polk, Scott, Dallas, Woodbury, Cerro Gordo, Dickinson, Howard, Kossuth, and Linn. To make sure the recommendation was not too risky, building a store in a county that had very few stores was prioritized. For example, if a county only had 1 store but a high average profit per store then it may indicated there wasn’t a big enough market.
Des Moines had the two highest performing stores (2633 and 4829) which had total sales of $407,000 and $363,000 respectively. There is over a 50% drop off to the next best store which was in Iowa City and had sales of $177,000 dollars.
A linear regression model was created to predict where sales would be highest in the future based on ‘County’, ‘State Bottle Retail’ and ‘Bottles Sold’. 2015 data was used as training data to predict what sales would be like in 2016.
In order to achieve our main goal of determining where the best performing stores are, we plotted the top 10 counties with the highest sales.
Taking it one step further, the sales were evaluated to show average sales were per store in those counties. After correcting to show the results per store, we found that sales per store did not vary nearly as much as sales per county.
The top 10 counties with the highest total of stores are Polk (207), Linn (101), Black Hawk (73), Scott (66), Johnson (53), Pottawattamie (53), Dubuque (40), Story (38), Woodbury (38), and Des Moines(21). So while Dallas had the 3rd highest profit per store, we’d be reluctant to build there since they don’t have many stores (17).
Conclusion
Polk seems like a great place to build a new store since there are already 207 stores and it still has 2nd highest profit per store. If we built the 208th store, we’d expect for their to be a minimal decrease on the profit per store average in the county whereas if we built in Dallas we’d expect a more significant decline. Other counties that appear to be great locations to build a new store are Johnson ($130,000/store, 53 stores) and Scott ($125,000/store, 66 stores).
Find Jupyter Notebook for this project here