NumPy (from “numerical Python”) is a Python package that handles multi-dimensional arrays and matrices, as well as more extensive mathematical functions than are available in base Python. NumPy applies mathematical operations to whole arrays at once (vectorization) using compiled code, so calculations on NumPy arrays are typically much faster than iterating through a for loop in base Python.
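To make the contrast with a for loop concrete, here is a minimal sketch. The temperature values and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data: five temperatures in degrees Celsius.
celsius = [12.5, 18.0, 21.3, 9.8, 30.1]

# Base Python: convert one element at a time in a loop.
fahrenheit_loop = []
for c in celsius:
    fahrenheit_loop.append(c * 9 / 5 + 32)

# NumPy: the same arithmetic is written once and applied to every element.
celsius_arr = np.array(celsius)
fahrenheit_arr = celsius_arr * 9 / 5 + 32

# A two-dimensional array (matrix) and the mean of each column.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
column_means = matrix.mean(axis=0)  # array([2.5, 3.5, 4.5])
```

On five values the speed difference is negligible; it tends to matter once arrays hold thousands or millions of elements.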
NumPy functions are useful in data science for cleaning and transforming numerical data. Because Pandas is built on top of NumPy, many of them can be applied directly to Pandas objects.
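As one possible illustration of cleaning and transforming with NumPy functions on a Pandas object, the sketch below assumes a hypothetical column in which -999.0 marks a missing reading. The data and the sentinel value are assumptions, not part of the original guide.

```python
import numpy as np
import pandas as pd

# Hypothetical readings; -999.0 is assumed to be a "missing value" sentinel.
raw = pd.Series([10.0, -999.0, 1000.0, 100.0])

# Cleaning: np.where swaps the sentinel for NaN and returns a NumPy array.
values = np.where(raw == -999.0, np.nan, raw)
cleaned = pd.Series(values)

# Transforming: NumPy functions such as np.log10 accept a Series directly
# and return a Series; NaN values stay NaN.
log_values = np.log10(cleaned)

# Summarizing while ignoring the missing value.
mean_log = np.nanmean(log_values)
```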
Pandas (from “panel data”) is a Python package that is primarily used to handle tabular data in the form of a DataFrame object. DataFrames are similar in many ways to spreadsheets. Pandas is a powerful and versatile toolkit that allows its user to merge, join, summarize, concatenate, compare, and reshape data. It can even be used to visualize data, although there are more sophisticated tools for that.
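A short sketch of three of the operations named above (merging, summarizing, and reshaping) follows. The two small tables, their column names, and the scores are hypothetical.

```python
import pandas as pd

# Two hypothetical tables that share a student_id column.
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Cy"],
})
scores = pd.DataFrame({
    "student_id": [1, 1, 2, 2, 3, 3],
    "exam": ["midterm", "final"] * 3,
    "score": [80, 90, 70, 75, 95, 85],
})

# Merge: attach each student's name to their score rows.
merged = students.merge(scores, on="student_id")

# Summarize: mean score per student.
mean_scores = merged.groupby("name")["score"].mean()

# Reshape: one row per student, one column per exam.
wide = merged.pivot(index="name", columns="exam", values="score")
```

Here `merged` has one row per student per exam, while `wide` has the spreadsheet-like layout with an exam in each column.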
Pandas has many reader functions that allow data to be read into Python from a variety of sources, including CSV, JSON, and Excel (.xlsx) files, SQL databases, HDF5, HTML, and many others. Most of these functions return a DataFrame object built from the data that’s read in (a few, such as the HTML reader, return a list of DataFrames). Pandas also includes corresponding writer functions that can write a DataFrame’s contents to a file.
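The reader and writer functions come in pairs, such as `read_csv` and `to_csv`. The sketch below writes a small, hypothetical DataFrame to a temporary directory and reads it back; the file names and data are assumptions made for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data to write out and read back.
inventory = pd.DataFrame({"item": ["pen", "notebook"], "quantity": [12, 5]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # CSV round trip: index=False keeps the row index out of the file.
    csv_path = os.path.join(tmp_dir, "inventory.csv")
    inventory.to_csv(csv_path, index=False)
    from_csv = pd.read_csv(csv_path)

    # JSON round trip: orient="records" writes one JSON object per row.
    json_path = os.path.join(tmp_dir, "inventory.json")
    inventory.to_json(json_path, orient="records")
    from_json = pd.read_json(json_path)
```

Other formats follow the same pattern, though some (for example Excel or HDF5) need an additional package installed.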