Data Visualization.
The first chart (a pie chart) shows the frequency distribution of the shipping mode variable. The variable has four levels: standard class, second class, first class, and same-day shipping. The pie chart shows that 60% of the orders are shipped by standard class, 20% by second class, 15% by first class, and 5% by same-day shipping.
Graph 1:
The second graph is a bar graph of the number of products shipped by each shipping mode. The number of products is labelled quantity and was calculated by summing the products shipped in each mode. The graph shows that the most items were shipped by standard class (22,797 items) and the fewest by same-day shipping (1,960 items); 7,423 items were shipped by second class and 5,693 by first class. Clients clearly prefer standard class and rarely choose same-day shipping, probably because they do not expect to send items back.
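As a quick check on these figures, the quantities quoted above can be turned into shares of all shipped items. This is a minimal sketch that uses only the four totals read from the bar graph; the variable names are illustrative.

```python
# Quantities per shipping mode, as quoted in the text for graph 2.
quantities = {
    "Standard Class": 22797,
    "Second Class": 7423,
    "First Class": 5693,
    "Same Day": 1960,
}

total = sum(quantities.values())

# Share of all shipped items per mode, in percent.
shares = {mode: 100 * count / total for mode, count in quantities.items()}

for mode, share in shares.items():
    print(f"{mode}: {share:.1f}%")
```

Rounded to whole percentages, the shares of items come out very close to the shares of orders in the pie chart (60%, 20%, 15%, and 5%).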
Graph 2:
The third graph is also a bar graph, showing the discounts granted on goods by shipping method. Goods shipped by standard class received the highest discounts, followed by second class and first class, then same-day shipping. This is the same ordering as in the previous graph (graph 2), where standard class carried the most goods and same-day shipping the fewest.
I believe that standard class shows the highest discounts because far more items were shipped that way than by any other method; the same reasoning, in reverse, applies to same-day shipping. Standard class did not necessarily receive higher or lower discount rates. Instead, its much larger number of products produced a larger cumulative discount. In statistics, however, an informed guess has to be justified by putting it to the test. That is why the next step is a correlation test accompanied by a regression.
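The difference between a cumulative discount and an average discount can be illustrated with a small sketch. The rows below are hypothetical values made up for illustration, not figures from the assignment's data set; they only show how a group with more rows can have a larger total while its average is the same.

```python
from collections import defaultdict

# Hypothetical (shipping mode, discount) rows; NOT taken from the real data.
rows = [
    ("Standard Class", 0.2), ("Standard Class", 0.0),
    ("Standard Class", 0.2), ("Standard Class", 0.2),
    ("Same Day", 0.2), ("Same Day", 0.1),
]

totals = defaultdict(float)
counts = defaultdict(int)
for mode, discount in rows:
    totals[mode] += discount
    counts[mode] += 1

# Average discount per row removes the effect of group size.
means = {mode: totals[mode] / counts[mode] for mode in totals}

for mode in totals:
    print(f"{mode}: total={totals[mode]:.2f}, mean={means[mode]:.2f}")
```

In this toy example the standard class total is twice the same-day total, yet both modes have the same mean discount, which is the pattern the paragraph above suspects in graph 3.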
Graph 3:
As mentioned above, the next analysis is a regression of the number of goods, labelled quantity, on the discount. A correlation test checks whether there is a significant relationship between two variables: the null hypothesis is that there is no relationship, and the alternative hypothesis is that there is one. In this case, we test whether the quantity ordered is significantly related to the discount. Table 1 shows the results of the test, and graph 4 is the scatter plot for the regression.
The table shows that the correlation coefficient (Multiple R) is 0.009, so the two variables are positively correlated in this sample: larger discounts go with slightly larger quantities, which is the direction my theory predicted. However, is the relationship significant? It is not: the p-value is 0.3887, which is greater than 0.05, so we cannot reject the null hypothesis of no relationship. In addition, even if the relationship were real, it would be very weak, for two reasons. One, the correlation coefficient of 0.009 is close to zero; a coefficient close to zero indicates a weak relationship, and one close to plus or minus one indicates a strong relationship. Two, the points in the scatter plot are widely spread out, which also shows that the relationship is weak.
SUMMARY OUTPUT

Regression Statistics
Multiple R | 0.008623
R Square | 7.44E-05
Adjusted R Square | -2.6E-05
Standard Error | 2.225138
Observations | 9994

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 3.678854 | 3.678854 | 0.743017 | 0.388717
Residual | 9992 | 49472.79 | 4.95124
Total | 9993 | 49476.47

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0%
Intercept | 3.775057 | 0.027912 | 135.2507 | 0 | 3.720344 | 3.829769 | 3.720344 | 3.829769
Discount | 0.092937 | 0.107818 | 0.861984 | 0.388717 | -0.11841 | 0.304282 | -0.11841 | 0.304282
Table 1:
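The figures in Table 1 are tied together by simple formulas, and recomputing them is a useful sanity check. The sketch below uses only values copied from the table. One assumption is labelled in the code: the p-value is approximated with the standard normal distribution, which is very close to the t distribution at 9992 degrees of freedom, so that only the Python standard library is needed.

```python
import math

# Values copied from Table 1.
ss_regression = 3.678854
ss_total = 49476.47
ms_residual = 4.95124
coef_discount = 0.092937
std_err_discount = 0.107818

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)

# With 1 regression degree of freedom, MS regression equals SS regression.
f_stat = ss_regression / ms_residual

# t Stat for the slope is the coefficient divided by its standard error.
t_stat = coef_discount / std_err_discount

# Assumption: normal approximation to the t distribution (df = 9992).
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"R Square   = {r_squared:.3e}")
print(f"Multiple R = {multiple_r:.6f}")
print(f"F          = {f_stat:.6f}")
print(f"t Stat     = {t_stat:.6f}")
print(f"p-value    = {p_value:.4f} (normal approximation)")
```

With a single predictor, F is the square of the slope's t Stat, which is why Significance F and the P-value for Discount are the same number in the table.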
Graph 4:
Graph 5:
Graph 5 is a bar graph of profit by shipping method. It shows that the most profit comes from goods shipped by standard class, followed by second class and first class, while same-day shipping records the lowest profit.
Table 2.
The most important analysis of the data provided is examining whether sales are related to profit. Therefore, we need to carry out a correlation test. Table 2 shows that the correlation coefficient equals 0.479064; the variables are therefore positively related, so higher sales go with higher profit. The relationship is not only positive but also significant, because the p-value is less than 0.05. Graph six shows that the association between sales and profit is moderate: some points lie close to the regression line, but others are far from it. This agrees with the correlation coefficient of 0.479, which is categorized as a medium correlation.
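The labels "weak" and "medium" depend on which cut-offs are used. The sketch below assumes Cohen-style thresholds of 0.1, 0.3, and 0.5 on the absolute value of the coefficient; this is one common convention and not necessarily the one used in this assignment. The function name is illustrative.

```python
def correlation_strength(r):
    """Label |r| using Cohen-style cut-offs (an assumed convention)."""
    size = abs(r)
    if size < 0.1:
        return "negligible"
    if size < 0.3:
        return "weak"
    if size < 0.5:
        return "medium"
    return "strong"


# Coefficients reported in the text for the two tests.
reported = [
    ("quantity vs discount", 0.008623),
    ("sales vs profit", 0.479064),
]

for name, r in reported:
    # r squared is the share of variance one variable explains in the other.
    print(f"{name}: {correlation_strength(r)}, r^2 = {r * r:.4f}")
```

Under these cut-offs the sales and profit coefficient falls in the medium band, while the quantity and discount coefficient is too small to count even as weak.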
What makes statistics amazing is its power to confirm or refute our speculations and theories. Graphs, tables, charts, and other visuals help us identify trends and make predictions, as I did in this assignment. However, the information presented in this paper is only a rough review of the inferences that can be drawn from the data provided.