Learn Netverks

Introduction to Pandas

Last reviewed May 28, 2026 Content v20260528
Track mode: server_script
Means: Server runner
Reading: ~4 min
Level: beginner

This lesson

An orientation to the Pandas track—Series, DataFrames, wrangling, groupby, merges, and links to SciPy/sklearn next.

You need labeled-table fluency before sklearn and production ETL—otherwise groupby, merge, and datetime bugs dominate every sprint.

You will apply these skills in contexts like CSV/Parquet analysis, ETL notebooks, and ad hoc reporting.

Read the narrative, run the `import pandas as pd` snippets with in-memory DataFrames (install pandas and numpy with pip if the runner lacks them), and complete the MCQs and interview prep blocks. Print `df.shape`, `df.dtypes`, and `df.head()` after every transform.

Take this lesson after /python/intro and /numpy/intro, when you are ready for labeled tables and daily wrangling workflows.

How this Pandas track works

  • Python playground — lessons use execution_profile: server_script. Snippets use import pandas as pd with in-memory DataFrames; install with pip install pandas numpy if the runner lacks them.
  • Tables first — Series, DataFrames, IO concepts, filtering, groupby, merges, time series, and performance—after NumPy arrays.
  • Prerequisites: finish Python and NumPy basics, and skim the Data Science workflow track. SQL helps when work is split between a warehouse (large aggregates) and pandas (local analysis).
  • Pair with SciPy for scientific routines, SQL for large aggregates, and AI for ML product context.

Run each lesson and inspect .shape, .dtypes, and .head() after every transform. These habits help catch silent data bugs early.
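As a rough illustration of that inspection habit, the sketch below wraps the three checks in a small helper and calls it before and after one transform. The helper name `quick_check` and the `revenue` column are made up for this example; they are not part of the track's code.

```python
import pandas as pd


def quick_check(df: pd.DataFrame, label: str) -> None:
    """Print shape, dtypes, and the first rows of a DataFrame (hypothetical helper)."""
    print(f"--- {label} ---")
    print("shape:", df.shape)
    print(df.dtypes)
    print(df.head())


orders = pd.DataFrame({
    "product": ["Widget", "Gadget", "Widget"],
    "price": [9.99, 14.50, 9.99],
    "qty": [10, 5, 8],
})
quick_check(orders, "raw")

# Transform: add a revenue column, then inspect again.
# assign() returns a new DataFrame and leaves `orders` unchanged.
with_revenue = orders.assign(revenue=orders["price"] * orders["qty"])
quick_check(with_revenue, "after adding revenue")
```

Comparing the two printouts makes it easy to see that the transform added exactly one float column and kept the row count.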

Install on your device (macOS, Linux, Windows)

Install Python 3.11+ locally for notebooks and frameworks; the on-site playground uses the dev runner when enabled.

macOS

  1. brew install python@3.12 or install from python.org.
  2. Create a project folder: mkdir ~/python-practice && cd ~/python-practice.
  3. python3 -m venv .venv && source .venv/bin/activate
  4. pip install --upgrade pip

Linux

  1. Debian/Ubuntu: sudo apt update && sudo apt install -y python3 python3-pip python3-venv
  2. Fedora: sudo dnf install -y python3 python3-pip
  3. python3 -m venv .venv && source .venv/bin/activate
  4. pip install --upgrade pip

Windows

  1. Install from python.org and enable Add python.exe to PATH.
  2. Or: winget install Python.Python.3.12
  3. PowerShell: py -3 -m venv .venv; .\.venv\Scripts\Activate.ps1
  4. pip install --upgrade pip

Verify: python3 --version (or py --version on Windows) shows 3.11+.
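The same check can be done from inside Python, which is handy when you are unsure which interpreter a notebook or venv is using. This is a minimal sketch using only the standard library; the function names are illustrative.

```python
import importlib.util
import sys

MIN_VERSION = (3, 11)  # the minimum version this lesson asks for


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True when the (major, minor) pair meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def is_installed(package: str) -> bool:
    """Return True when a top-level package can be found without importing it."""
    return importlib.util.find_spec(package) is not None


print("Python", sys.version.split()[0])
print("meets 3.11+:", python_is_recent_enough())
for name in ("pandas", "numpy"):
    print(f"{name} installed:", is_installed(name))
```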

Run code on this site (Backend & language playgrounds)

  1. Clone or open this project locally; copy .env.example to .env.
  2. Ensure LEARNING_RUNNER_ENABLED=true and LEARNING_RUNNER_URL=http://127.0.0.1:9999/v1/execute.
  3. Terminal 1: php artisan serve (or composer run dev for Laravel + Vite + runner together).
  4. Terminal 2: npm run runner — keep it running while you click Run on server.

In your venv: pip install pandas numpy

Pandas is Python's primary library for labeled tabular data. Its Series and DataFrame types sit on top of NumPy ndarrays and add row/column labels, file IO, groupby, merge, and time-series tools—essential after Python basics, NumPy, and Data Science workflow concepts.

Prerequisites and how this track works

Complete Python basics and NumPy intro, and skim Data Science intro for EDA and ethics context. Lessons run Python with execution_profile: server_script. Pandas and NumPy should already be available in the playground, so you normally do not need pip install there; if the runner lacks them, run pip install pandas numpy.

What you will learn

  • Creating and inspecting Series and DataFrame objects
  • Selection with loc/iloc, filtering, sorting, and missing data
  • Groupby, pivot, merge, concat, and duplicate handling
  • Categoricals, time series, rolling windows, and performance habits
  • How Pandas connects to NumPy, Matplotlib, scikit-learn, SciPy, and SQL
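To give a feel for a few of the topics listed above, here is a small preview of label vs. position selection, a groupby, and a merge. It is only a sketch: the `categories` lookup table and its values are invented for illustration, and later lessons cover each operation properly.

```python
import pandas as pd

sales = pd.DataFrame({
    "product": ["Widget", "Gadget", "Widget"],
    "price": [9.99, 14.50, 9.99],
    "qty": [10, 5, 8],
})

# Label-based vs. position-based selection.
first_by_label = sales.loc[0, "product"]   # row label 0, column "product"
first_by_position = sales.iloc[0, 0]       # first row, first column

# Groupby: total quantity per product.
qty_per_product = sales.groupby("product")["qty"].sum()

# Merge: attach a hypothetical category lookup table with a left join.
categories = pd.DataFrame({
    "product": ["Widget", "Gadget"],
    "category": ["hardware", "electronics"],
})
enriched = sales.merge(categories, on="product", how="left")

print(first_by_label, first_by_position)
print(qty_per_product)
print(enriched)
```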

First run

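import pandas as pd
import numpy as np  # not used in this snippet; imported as in later lessons

# Build a small in-memory DataFrame from a dict of columns
df = pd.DataFrame({
    'product': ['Widget', 'Gadget', 'Widget'],
    'price': [9.99, 14.50, 9.99],
    'qty': [10, 5, 8],
})
print('Pandas version:', pd.__version__)
print(df)
print('shape:', df.shape, 'columns:', list(df.columns))
print(df.dtypes)  # per-column dtypes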

Why Pandas after NumPy?

NumPy excels at homogeneous numeric arrays. Real analytics data has column names, mixed types, missing values, and joins across tables. Pandas adds labels and relational-style operations while keeping vectorized performance for numeric columns.
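A short sketch of that difference, assuming a toy price list with one missing value: the NumPy array has a single dtype and its mean is poisoned by NaN, while the DataFrame keeps a dtype per named column and skips missing values by default.

```python
import numpy as np
import pandas as pd

# A NumPy array holds one dtype; here every element is a float.
prices = np.array([9.99, 14.50, np.nan])

# A DataFrame keeps a separate dtype for each named column.
df = pd.DataFrame({
    "product": ["Widget", "Gadget", "Gizmo"],
    "price": prices,
})

# np.nan propagates through a plain NumPy mean...
numpy_mean = prices.mean()
# ...while pandas skips missing values by default (skipna=True).
pandas_mean = df["price"].mean()

# Count missing values per column.
missing_per_column = df.isna().sum()

print("NumPy mean:", numpy_mean)
print("pandas mean:", pandas_mean)
print(missing_per_column)
```

NumPy offers `np.nanmean` for the same purpose; the point here is only that pandas treats missing data as a first-class concern.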

Important interview questions and answers

  1. Q: Is Pandas required for data science?
    A: Nearly always for tabular EDA and feature engineering in Python. sklearn and plotting libraries accept Pandas objects or NumPy ndarrays as inputs.
  2. Q: What is a DataFrame?
    A: A 2D labeled table: rows indexed (often 0..n-1), columns named, each column often backed by a 1D NumPy array.
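The DataFrame answer can be checked directly. The sketch below, using a toy two-column table, shows the default 0..n-1 row index and pulls one numeric column out as a 1D NumPy array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [9.99, 14.50, 9.99], "qty": [10, 5, 8]})

# Without an explicit index, rows are labeled 0..n-1.
print(df.index)

# A numeric column can be viewed as a 1D NumPy array.
price_values = df["price"].to_numpy()
print(type(price_values), price_values.dtype, price_values.ndim)
```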

Self-check

  1. Which three tracks should you finish before deep Pandas work?
  2. What two core Pandas types will you use most?

Challenge

First Pandas run in this track

  1. Click Run with the default code.
  2. Confirm the terminal shows DataFrame shape and column dtypes.
  3. Filter one row or column and run again.

Done when: the terminal shows DataFrame output and your filter result.
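If you are unsure what a filter looks like for step 3, here is one possible approach using the same DataFrame as the first run. The threshold of 8 and the chosen columns are arbitrary examples.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["Widget", "Gadget", "Widget"],
    "price": [9.99, 14.50, 9.99],
    "qty": [10, 5, 8],
})

# Row filter: a boolean mask keeps rows where qty is at least 8.
big_orders = df[df["qty"] >= 8]

# Column filter: a list of names keeps only those columns.
names_and_qty = df[["product", "qty"]]

print(big_orders)
print(names_and_qty)
```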

Tip: Run the playground challenge—every lesson uses import pandas as pd and import numpy as np.

Interview prep

Prerequisite?

Python basics (/python/intro), NumPy (/numpy/intro), and data science workflow (/data-science/intro).

Core types?

Series (1D labeled) and DataFrame (2D labeled table).
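A quick way to see the two types side by side, assuming a toy table with string row labels: selecting with a single column name returns a Series, while selecting with a list of names returns a DataFrame.

```python
import pandas as pd

df = pd.DataFrame(
    {"qty": [10, 5, 8], "price": [9.99, 14.50, 9.99]},
    index=["a", "b", "c"],  # custom row labels instead of 0..n-1
)

column = df["qty"]        # single name -> Series (1D, labeled)
sub_table = df[["qty"]]   # list of names -> DataFrame (2D, labeled)

print(type(column).__name__, column.ndim)
print(type(sub_table).__name__, sub_table.ndim)
print(column["b"])        # look up a value by its row label
```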

Next tracks?

SciPy (/scipy/intro) for scientific algorithms; SQL (/sql/intro) for warehouse-scale queries.


