Learn Netverks

Introduction to Data Science

Last reviewed May 28, 2026 Content v20260528
Track mode: server_script (means: server runner)
Reading: ~3 min
Level: beginner

This lesson

An orientation to the Data Science track—workflow, ethics, Python playground practice, and links to NumPy/Pandas next.

You need a clear map of the Data Science lifecycle so exploration, leakage, and stakeholder communication do not feel like ad hoc guessing.

You will apply this material in contexts such as analytics teams, product experimentation, research labs, and ML-adjacent engineering at data-driven companies.

Read the narrative, run Python in the playground (stdlib snippets now; install Jupyter, pandas, and scikit-learn locally for full notebooks), and complete MCQs to lock in vocabulary. Also read the interview prep blocks; write one measurable question for a dataset you care about.

Take this lesson after the /python/intro basics, and ideally some of /sql/intro, but before deep NumPy/Pandas specialization.

How this Data Science track works

  • Python playground — lessons use execution_profile: server_script. Run snippets in the playground; heavier stacks (Jupyter, pandas, scikit-learn) install locally with pip.
  • Workflow-first — questions, data quality, exploration, cleaning, modeling concepts, ethics, and communication—before deep dives on NumPy, Pandas, and SciPy.
  • Prerequisites — finish Python basics and skim SQL for warehouse queries. Statistics intuition helps but is taught from scratch here.
  • Pair with AI and Generative AI for product context after you understand the data science lifecycle.

Playground code uses Python stdlib where possible. Install pandas, jupyter, and matplotlib locally for full notebook workflows.

Install on your device (macOS, Linux, Windows)

Install Python 3.11+ locally for notebooks and frameworks; the on-site playground uses the dev runner when enabled.

macOS

  1. brew install python@3.12, or install from python.org.
  2. Create a project folder: mkdir ~/python-practice && cd ~/python-practice.
  3. python3 -m venv .venv && source .venv/bin/activate
  4. pip install --upgrade pip

Linux

  1. Debian/Ubuntu: sudo apt update && sudo apt install -y python3 python3-pip python3-venv
  2. Fedora: sudo dnf install -y python3 python3-pip
  3. python3 -m venv .venv && source .venv/bin/activate
  4. pip install --upgrade pip

Windows

  1. Install from python.org and enable Add python.exe to PATH.
  2. Or: winget install Python.Python.3.12
  3. PowerShell: py -3 -m venv .venv; .\.venv\Scripts\Activate.ps1
  4. pip install --upgrade pip

Verify: python3 --version (or py --version on Windows) shows 3.11+.
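The same check can be run from inside Python, which is handy once a virtual environment is active. This is a minimal sketch: the helper name meets_minimum is hypothetical, and the 3.11 threshold is taken from the text above.

```python
import sys

# Minimum version taken from the lesson text (3.11+).
MIN_VERSION = (3, 11)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when (major, minor) is at least the minimum version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old for this track"
    print(f"Python {current}: {status}")
```

Comparing tuples avoids parsing the version string, so 3.10 correctly sorts below 3.11.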

Run code on this site (Backend & language playgrounds)

  1. Clone or open this project locally; copy .env.example to .env.
  2. Ensure LEARNING_RUNNER_ENABLED=true and LEARNING_RUNNER_URL=http://127.0.0.1:9999/v1/execute.
  3. Terminal 1: php artisan serve (or composer run dev for Laravel + Vite + runner together).
  4. Terminal 2: npm run runner — keep it running while you click Run on server.
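Before starting the two terminals, it can help to confirm that .env holds the two runner settings from step 2. The sketch below is an assumption-laden illustration: parse_env and runner_problems are hypothetical helpers, and the parser only handles simple KEY=VALUE lines, not the full syntax a real dotenv loader accepts.

```python
from pathlib import Path

# Values copied from step 2 above.
EXPECTED = {
    "LEARNING_RUNNER_ENABLED": "true",
    "LEARNING_RUNNER_URL": "http://127.0.0.1:9999/v1/execute",
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def runner_problems(env, expected=EXPECTED):
    """Return one message per setting that differs from the expected value."""
    return [
        f"{key} should be {want!r}, found {env.get(key)!r}"
        for key, want in expected.items()
        if env.get(key) != want
    ]


if __name__ == "__main__":
    env_path = Path(".env")  # assumes you run this from the project root
    if not env_path.exists():
        print("No .env found; copy .env.example to .env first.")
    else:
        problems = runner_problems(parse_env(env_path.read_text()))
        print("\n".join(problems) if problems else "Runner settings look right.")
```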

Starter stack: pip install jupyter pandas matplotlib seaborn scikit-learn
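After installing, a quick way to see whether the starter stack is importable is to look up each module without actually importing it. This is a sketch, not an official check: the mapping below is an assumption about import names (scikit-learn imports as sklearn, and jupyter is probed through jupyter_core, which the jupyter metapackage is expected to pull in).

```python
from importlib.util import find_spec

# pip name -> import name (assumed; note scikit-learn imports as "sklearn").
STARTER_STACK = {
    "jupyter": "jupyter_core",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def missing_packages(stack=STARTER_STACK):
    """Return the pip names whose import name cannot be located."""
    return [
        pip_name
        for pip_name, import_name in stack.items()
        if find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
        print("Try: pip install " + " ".join(missing))
    else:
        print("Starter stack is importable.")
```

Run it with the virtual environment activated; otherwise it reports on the system interpreter instead.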

Data science turns raw data into decisions: ask questions, collect data, clean and explore, model uncertainty, and communicate findings. This track teaches the workflow and thinking—using Python in the playground and pointing to NumPy, Pandas, and local Jupyter for deeper tooling.
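The workflow above can be sketched end to end with the standard library alone. Everything here is illustrative: the numbers are made-up values rather than real data, and clean and summarize are hypothetical helper names. The standard error stands in for "model uncertainty" in the simplest possible way.

```python
import statistics

# Question: how long does a support ticket take, on average?
# Made-up measurements in minutes; None marks a missing value.
raw_minutes = [12.0, 15.5, None, 11.0, 14.5, None, 13.0]


def clean(values):
    """Drop missing entries (the simplest possible cleaning rule)."""
    return [v for v in values if v is not None]


def summarize(values):
    """Return the sample size, mean, sample stdev, and standard error of the mean."""
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return {"n": n, "mean": mean, "stdev": stdev, "std_error": stdev / n ** 0.5}


summary = summarize(clean(raw_minutes))

# Communicate: one sentence a stakeholder can act on.
print(
    f"Average handling time: {summary['mean']:.1f} min "
    f"(standard error {summary['std_error']:.1f}, n={summary['n']})"
)
```

Dropping missing values is only one option; later lessons on missing data cover when that choice is risky.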

Prerequisites and how this track works

Finish Python basics (variables, functions, lists, dicts). Skim SQL if you will query warehouses. Lessons run Python with execution_profile: server_script; install jupyter and pandas locally for notebook workflows.

What you will learn

  • Framing business questions and data types
  • Exploration, quality, missing data, and outliers
  • Cleaning, train/test splits, and modeling concepts
  • Metrics, cross-validation, bias, and ethics
  • Visualization, storytelling, reproducibility, SQL in pipelines
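As a preview of the train/test split idea in the list above, here is a minimal standard-library sketch. The function name split_rows, the 20% test fraction, and the seed are illustrative assumptions; scikit-learn offers its own splitter, which later lessons may use instead.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and return (train_rows, test_rows)."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for ten records
train_rows, test_rows = split_rows(rows)
print(len(train_rows), "training rows,", len(test_rows), "held-out rows")
```

Fixing the seed is a small reproducibility habit: anyone rerunning the script gets the same split.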

First run

print("Data Science track")
print("Next: NumPy, Pandas, and local Jupyter for full stacks")

Data science vs software engineering

Engineers ship features; data scientists validate hypotheses with evidence. Both write Python—but DS emphasizes distributions, leakage, and stakeholder communication.
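Leakage means information from held-out data slips into the training step. One common form is computing scaling statistics on all rows before splitting. The sketch below uses made-up numbers and hypothetical helper names (fit_scaler, transform) to contrast statistics fitted on training values only with statistics that also saw the test value.

```python
import statistics


def fit_scaler(train_values):
    """Learn the mean and sample stdev from training values only."""
    return statistics.mean(train_values), statistics.stdev(train_values)


def transform(values, mean, stdev):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / stdev for v in values]


# Made-up values: three training rows and one extreme held-out row.
train_values = [10.0, 12.0, 14.0]
test_values = [100.0]

mean, stdev = fit_scaler(train_values)             # no test data involved
scaled_test = transform(test_values, mean, stdev)

# Leaky variant: the held-out value shifts the statistic it is judged against.
leaky_mean = statistics.mean(train_values + test_values)

print("train-only mean:", mean, "| leaky mean:", leaky_mean)
```

The gap between the two means shows how a single held-out row can change what the model is allowed to know.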

Important interview questions and answers

  1. Q: Is this the same as ML?
    A: Data science includes exploration and communication; ML engineering focuses on training and serving models at scale.
  2. Q: Why Python?
    A: Readable syntax and PyPI ecosystem—pandas, scikit-learn, and notebooks are industry defaults.

Self-check

  1. What two tracks should you complete first?
  2. What execution profile does this topic use?

Challenge

First Python run in Data Science track

  1. Click Run with the default code.
  2. Confirm output appears in the terminal.
  3. Add a line printing one data science question you care about.

Done when: the terminal shows the default message and your custom question.

Tip: Run the playground challenge before moving on—later lessons assume Python runs.

Interview prep

Prerequisite?

Python basics (/python/intro) and SQL literacy (/sql/intro) help.

Playground?

server_script runs Python; install Jupyter/pandas locally for full stacks.

Interview tip: lesson completion confidence

Can you explain this lesson in 30 seconds without reading notes?

Playground

Runs on the configured server runner (dev: npm run runner with LEARNING_RUNNER_ENABLED=true). Output appears below the editor.

Discussion

Starter discussion topics

  • Why DS after Python?
  • Jupyter local setup?
