What is a Series?
In pandas, a Series is a one-dimensional labeled array capable of holding any data type. It is a fundamental pandas data structure and can be thought of as a single column of a spreadsheet or of a DataFrame. A Series is similar to a NumPy array but adds a labeled index, so values can be looked up by label as well as by position.
import pandas as pd
# Creating a Series with custom indices
data = [1, 2, 3, 4, 5]
custom_indices = ['a', 'b', 'c', 'd', 'e']
series = pd.Series(data, index=custom_indices)  # 1st argument is the data, next is the index
# Displaying the Series with custom indices
print(series)
O/P:
a    1
b    2
c    3
d    4
e    5
dtype: int64
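To make the comparison with NumPy arrays concrete, here is a minimal sketch showing what the labeled index adds. The names s1, s2 and total and their values are made up for illustration. Arithmetic between two Series matches values by label rather than by position.

```python
import pandas as pd

# Two Series with the same labels in a different order (illustrative values)
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['c', 'b', 'a'])

# Addition pairs up values that share a label, not values at the same position
total = s1 + s2
print(total)

# The underlying values remain available as a plain NumPy array
print(s1.to_numpy())
```

Here total['a'] is 1 + 30 because both values carry the label 'a', even though they sit at opposite ends of the two Series.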
How to access a value from a Series?
print(series['c'])
O/p:
3
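Besides a single label, a Series can also be indexed by position, by a list of labels, or by a label slice. The sketch below rebuilds the same Series; the variable names by_label, by_position, several and label_slice are just illustrative.

```python
import pandas as pd

series = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

by_label = series.loc['c']         # label-based lookup, same result as series['c']
by_position = series.iloc[2]       # position-based lookup (positions start at 0)
several = series[['a', 'e']]       # list of labels -> returns a Series
label_slice = series.loc['b':'d']  # label slices include both endpoints

print(by_label, by_position)
print(several)
print(label_slice)
```

Using .loc and .iloc explicitly makes it clear whether a lookup is by label or by position, which matters most when the index itself contains integers.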
What is a DataFrame?
In pandas, a DataFrame is a two-dimensional labeled data structure that is widely used for handling and manipulating tabular data. It can be thought of as a table, similar to a spreadsheet or a SQL table, where data is organized in rows and columns.
import pandas as pd
import numpy as np
from numpy.random import randn
np.random.seed(121)
# Create DataFrame: 5 rows (A-E) and 4 columns (W-Z) of random values
df = pd.DataFrame(randn(5, 4), index=['A', 'B', 'C', 'D', 'E'], columns=['W', 'X', 'Y', 'Z'])
print("DataFrame:\n")
print(df)
# Access Column: Return type -> Series
print("Access Column named 'W'\n")
print(df['W'])
# Access a list of columns : Return type -> DataFrame
print("Access Columns named Y & Z\n")
print(df[['Y', 'Z']])
# Create a new column
df['V'] = df['Y'] + df['Z']
print("DataFrame:\n")
print(df)
# Drop the new column in place (axis=1 targets columns; use axis=0 to drop a row)
df.drop('V', axis=1, inplace=True)
print("DataFrame:\n")
print(df)
# Access Row using its label : Return type -> Series
print("Access Row labeled as 'B':\n")
print(df.loc['B'])
# Access Row using its integer position (0-based) : Return type -> Series
print("Access Row 1:\n")
print(df.iloc[1])
#Access multiple rows : Return type-> DataFrame
print("Access Row labeled as B & C:\n")
print(df.loc[['B','C']])
O/p:
DataFrame:

          W         X         Y         Z
A -0.212033 -0.284929 -0.573898 -0.440310
B -0.330111  1.183695  1.615373  0.367062
C -0.014119  0.629642  1.709641 -1.326987
D  0.401873 -0.191427  1.403826 -1.968769
E -0.790415 -0.732722  0.087744 -0.500286
Access Column named 'W'

A   -0.212033
B   -0.330111
C   -0.014119
D    0.401873
E   -0.790415
Name: W, dtype: float64
Access Columns named Y & Z

          Y         Z
A -0.573898 -0.440310
B  1.615373  0.367062
C  1.709641 -1.326987
D  1.403826 -1.968769
E  0.087744 -0.500286
DataFrame:

          W         X         Y         Z         V
A -0.212033 -0.284929 -0.573898 -0.440310 -1.014208
B -0.330111  1.183695  1.615373  0.367062  1.982435
C -0.014119  0.629642  1.709641 -1.326987  0.382653
D  0.401873 -0.191427  1.403826 -1.968769 -0.564943
E -0.790415 -0.732722  0.087744 -0.500286 -0.412542
DataFrame:

          W         X         Y         Z
A -0.212033 -0.284929 -0.573898 -0.440310
B -0.330111  1.183695  1.615373  0.367062
C -0.014119  0.629642  1.709641 -1.326987
D  0.401873 -0.191427  1.403826 -1.968769
E -0.790415 -0.732722  0.087744 -0.500286
Access Row labeled as 'B':

W   -0.330111
X    1.183695
Y    1.615373
Z    0.367062
Name: B, dtype: float64
Access Row 1:

W   -0.330111
X    1.183695
Y    1.615373
Z    0.367062
Name: B, dtype: float64
Access Row labeled as B & C:

          W         X         Y         Z
B -0.330111  1.183695  1.615373  0.367062
C -0.014119  0.629642  1.709641 -1.326987
Similarly,
# Access a particular cell
print(df.loc['B', 'Y'])
# Access a subset of DF
print(df.loc[['B','C'],['X','Y']])
O/p:
1.6153729283493805
          X         Y
B  1.183695  1.615373
C  0.629642  1.709641
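The label-based lookups above have position-based counterparts through iloc, and drop can also remove rows. Below is a minimal sketch on a small DataFrame with fixed, made-up values (not the random data used earlier) so the results are easy to check by eye; the names cell, subset and without_a are illustrative.

```python
import pandas as pd

# Small DataFrame with fixed values (illustrative, not the random data above)
df = pd.DataFrame(
    [[1, 2], [3, 4], [5, 6]],
    index=['A', 'B', 'C'],
    columns=['X', 'Y'],
)

# Position-based equivalents of the label-based lookups
cell = df.iloc[1, 1]        # same cell as df.loc['B', 'Y']
subset = df.iloc[1:3, 0:2]  # rows B and C, columns X and Y

# Drop a row: axis=0 targets the index. Without inplace=True,
# drop returns a new DataFrame and leaves df unchanged.
without_a = df.drop('A', axis=0)

print(cell)
print(subset)
print(without_a)
```

Note that iloc slices exclude the end position (1:3 selects positions 1 and 2), whereas loc label slices include both endpoints.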