
Intro Data Science NumPy Pandas: Concept Notes

"""
Topic 08: Introduction to Data Science (NumPy & Pandas) - Concept Notes

1. NumPy (Numerical Python)
   - Fundamental package for scientific computing.
   - Core object: ndarray (n-dimensional array).
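   - Faster and more memory-efficient than standard Python lists for large numeric
     datasets: elements share a single dtype and are stored in a compact block of
     memory, so operations run in compiled code instead of a Python loop.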
   - Operations: Element-wise arithmetic, linear algebra, statistical methods.

2. Pandas
   - Library for data manipulation and analysis.
   - Core objects:
     - Series: 1D labeled array (values plus an index).
     - DataFrame: 2D labeled data structure with named columns (like a spreadsheet or SQL table).
   - Features: Data cleaning, filtering, merging, grouping, and handling missing data.
"""

# 1. NumPy Basics
import numpy as np

# Creating an array
arr = np.array([1, 2, 3, 4, 5])
print(f"NumPy Array: {arr}")
print(f"Array Shape: {arr.shape}")
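
# A minimal sketch of the "n-dimensional" part of ndarray, assuming a small
# illustrative 2x3 matrix (the name `matrix` is not from the original notes).
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
print(f"2D Shape: {matrix.shape}")            # (rows, columns)
print(f"Dimensions: {matrix.ndim}, Dtype: {matrix.dtype}")
column_sums = matrix.sum(axis=0)              # axis=0 collapses the rows
print(f"Column sums: {column_sums}")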

# Vectorized Operations
arr_double = arr * 2
print(f"Vectorized (x2): {arr_double}")
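
# Element-wise arithmetic also works between two arrays of the same shape.
# A small sketch, assuming a hypothetical second array `b`; the list
# comprehension shows what the same doubling looks like with a plain list.
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([10, 20, 30, 40, 50])
elementwise_sum = a + b                        # adds matching positions
print(f"Element-wise sum: {elementwise_sum}")

plain_list = [1, 2, 3, 4, 5]
list_doubled = [x * 2 for x in plain_list]     # explicit loop needed for lists
list_repeated = plain_list * 2                 # on a list, * repeats instead of multiplying
print(f"List doubled: {list_doubled}, list * 2 has {len(list_repeated)} items")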

# Statistical Methods
print(f"Mean: {np.mean(arr)}, Std Dev: {np.std(arr)}")
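
# Two follow-ups to the operations listed in the notes, as a hedged sketch with
# illustrative names (`values`, `m`, `v` are assumptions, not from the original):
# np.std defaults to the population formula (ddof=0); pass ddof=1 for the
# sample standard deviation. The second part shows basic linear algebra.
import numpy as np

values = np.array([1, 2, 3, 4, 5])
population_std = np.std(values)            # divides by N
sample_std = np.std(values, ddof=1)        # divides by N - 1
print(f"Population Std: {population_std:.4f}, Sample Std: {sample_std:.4f}")

m = np.array([[2, 0],
              [0, 3]])
v = np.array([1, 2])
product = m @ v                            # matrix-vector product
recovered = np.linalg.solve(m, product)    # solves m @ x = product for x
print(f"m @ v: {product}, solved x: {recovered}")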

# 2. Pandas Basics
import pandas as pd

# Creating a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'London', 'Paris']
}
df = pd.DataFrame(data)
# An f-string avoids the stray leading space that print(a, b) inserts before the table
print(f"\nPandas DataFrame:\n{df}")
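
# A quick way to inspect a DataFrame before working with it. This is a sketch
# on a copy of the sample data; the name `inspect_df` is only illustrative.
import pandas as pd

inspect_df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'London', 'Paris']
})
print(f"\nShape (rows, columns): {inspect_df.shape}")
print(f"Columns: {list(inspect_df.columns)}")
print(f"Dtypes:\n{inspect_df.dtypes}")
print(f"Summary of numeric columns:\n{inspect_df.describe()}")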

# Selecting Columns
print(f"\nAge Column:\n{df['Age']}")
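
# Selecting one column returns a Series, the 1D labeled array from the notes.
# A minimal sketch that builds a Series directly, assuming names as the index
# (the variable `ages` is illustrative, not from the original).
import pandas as pd

ages = pd.Series([25, 30, 35], index=['Alice', 'Bob', 'Charlie'], name='Age')
bob_age = ages['Bob']          # label-based lookup
first_age = ages.iloc[0]       # position-based lookup
next_year = ages + 1           # vectorized, like a NumPy array
print(f"\nBob: {bob_age}, first: {first_age}")
print(f"Ages next year:\n{next_year}")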

# Filtering Data
over_30 = df[df['Age'] >= 30]
print(f"\nUsers >= 30:\n{over_30}")
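
# Filters can be combined with & (and), | (or), and ~ (not). Each condition
# needs its own parentheses because these operators bind tighter than >=.
# A sketch on a copy of the sample data (`people` is an illustrative name).
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'London', 'Paris']
})
over_30_not_paris = people[(people['Age'] >= 30) & (people['City'] != 'Paris')]
print(f"\nUsers >= 30 outside Paris:\n{over_30_not_paris}")

# .loc filters rows and picks columns in a single step
names_over_30 = people.loc[people['Age'] >= 30, 'Name']
print(f"\nNames of users >= 30:\n{names_over_30}")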

# 3. Reading Data (Conceptual)
# 'data.csv' and 'data.json' are placeholder paths; uncomment once such files exist.
# df = pd.read_csv('data.csv')
# df = pd.read_json('data.json')
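
# A runnable sketch of reading data, assuming hypothetical CSV text held in
# memory (io.StringIO stands in for a real file). It also touches the
# missing-data, grouping, and merging features listed in the notes. All names
# and values here are illustrative, not from the original.
import io
import pandas as pd

csv_text = (
    "Name,Age,City\n"
    "Alice,25,NY\n"
    "Bob,,London\n"        # Bob's age is missing on purpose
    "Charlie,35,Paris\n"
    "Dana,40,NY\n"
)
loaded = pd.read_csv(io.StringIO(csv_text))

# Handling missing data: count the gaps, then drop rows without an age
missing_ages = loaded['Age'].isna().sum()
cleaned = loaded.dropna(subset=['Age'])
print(f"\nMissing ages: {missing_ages}, rows kept: {len(cleaned)}")

# Grouping: mean age per city
mean_age_by_city = cleaned.groupby('City')['Age'].mean()
print(f"\nMean age by city:\n{mean_age_by_city}")

# Merging: attach a country to each row via the shared 'City' column
countries = pd.DataFrame({
    'City': ['NY', 'London', 'Paris'],
    'Country': ['USA', 'UK', 'France']
})
merged = cleaned.merge(countries, on='City', how='left')
print(f"\nMerged with countries:\n{merged}")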