Sunday 9 August 2015

Data Visualization in Python

Introduction:

In the previous section we did some basic data processing to learn a few details about our data. In this section we will see how to use Python's data visualization capabilities to communicate results in a clearer, more interpretable way. We will be using Python's matplotlib library.

matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in Python scripts, the Python and IPython shells (à la MATLAB® or Mathematica®), web application servers, and six graphical user interface toolkits.

Bar Plot of Radish Votes:
Let's see how to generate a bar plot of the data. We will display the number of votes each radish variety received across the data.

Code:
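
A minimal sketch that follows the steps explained in this section. Several things in it are assumptions rather than the original listing: the vote counts are hypothetical placeholders for the "counts" dictionary built in the previous section, the "Agg" backend and the savefig() call are there only so the script also runs without a display, and align="edge" is spelled out because matplotlib 2.0 and later centre bars on their X positions by default.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical vote counts; the real numbers come from the previous section.
counts = {"Champion": 76, "April Cross": 72, "Red King": 56, "Plum Purple": 56}

# Split the dictionary into two parallel lists: bar labels and bar heights.
names = []
votes = []
for radish in counts:
    names.append(radish)
    votes.append(counts[radish])

# One X position per variety: 0, 1, 2, ...
x = np.arange(len(counts))

# align="edge" starts each bar at its X position, which matches the
# x + 0.5 tick offset discussed in the text.
plt.bar(x, votes, align="edge")
plt.xticks(x + 0.5, names, rotation=90)
plt.tight_layout()
plt.savefig("radish_votes.png")  # use plt.show() instead for an interactive window
```

With plt.show() in place of plt.savefig(), the plot opens in a window instead of being written to a file.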

This gives us the following plot:
One look at the plot tells us that Champion is the favorite radish variety. This is the power of visualization. We have used a lot of code here, so let's walk through it.
We import two libraries, matplotlib and numpy. matplotlib is modeled on how charting is done in MATLAB, while numpy provides many numeric functions that come in handy.
In the for loop we generate output that is easy for matplotlib to interpret. names holds the labels for the x-axis, while votes holds the actual data to plot.
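
The splitting step described above can be sketched on its own, without any plotting. The dictionary contents here are hypothetical and stand in for the real "counts" dictionary.

```python
# Hypothetical vote counts standing in for the "counts" dictionary.
counts = {"Champion": 76, "April Cross": 72, "Red King": 56}

names = []  # labels for the x-axis
votes = []  # bar heights, in the same order as names
for radish in counts:
    names.append(radish)
    votes.append(counts[radish])

# The two lists stay aligned: votes[i] belongs to names[i].
for name, vote in zip(names, votes):
    print(name, vote)

# A shorter equivalent: keys() and values() iterate in the same order.
short_names = list(counts.keys())
short_votes = list(counts.values())
```

The explicit loop and the keys()/values() shorthand produce the same two lists; the loop simply makes each step visible.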

We create a range of indexes for the X values in the graph, one for each entry in the "counts" dictionary (i.e. len(counts)), numbered 0, 1, 2, 3, etc. This spreads the bars evenly across the X axis of the plot.
np.arange is a NumPy function like the range() function in Python, only the result it produces is a "NumPy array".
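plt.xticks() sets the positions of the tick marks on the X axis and, optionally, the labels to show at those positions.
x + 0.5 is a special expression because x is a NumPy array. NumPy arrays have some special capabilities that normal lists or range() objects don't have: adding 0.5 to the array produces a new array with 0.5 added to every element, which shifts each tick half a unit to the right.
The explanation above is from the OpenTechSchool Python tutorial.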
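
The NumPy behaviour described above can be checked in a few lines. This is a small sketch with an arbitrary array length of 4, not part of the plotting script.

```python
import numpy as np

x = np.arange(4)   # NumPy array holding 0, 1, 2, 3
shifted = x + 0.5  # the addition is applied to every element at once

print(type(x))
print(x)
print(shifted)

# A plain list does not support adding a number to every element;
# the same shift needs an explicit loop or comprehension.
plain = list(range(4))
shifted_plain = [value + 0.5 for value in plain]
print(shifted_plain)
```

Both approaches yield the same values; the NumPy version just expresses the shift as a single arithmetic operation.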
To add labels to the x-axis and y-axis we can use:
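
One way to do this is with plt.xlabel() and plt.ylabel(), sketched here on a small bar plot. The vote counts and the label texts are made up for illustration, and the "Agg" backend with savefig() is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data for illustration only.
names = ["Champion", "April Cross", "Red King"]
votes = [76, 72, 56]
x = np.arange(len(votes))

plt.figure()  # start from a fresh figure
plt.bar(x, votes, align="edge")
plt.xticks(x + 0.5, names, rotation=90)

# Axis labels: one call per axis.
plt.xlabel("Radish variety")
plt.ylabel("Number of votes")

plt.tight_layout()
plt.savefig("radish_votes_labelled.png")  # or plt.show() for an interactive window
```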

Conclusion:
Here we have seen the power of visualization in Python using the matplotlib library.
