In Python, every value has a type. A data type defines what kind of value something is and what you can do with it. Think of it like this:
A string holds text.
An int holds whole numbers.
A float holds decimal numbers.
A list holds a collection of items.
And so on...
Knowing data types is critical in machine learning because models and libraries expect specific types of data: numbers for training, labels that often start out as strings, and structured data such as lists and arrays.
Built-in Data Types in Python (Core Types)
1. Numeric Types
int: Integer (whole number)
float: Floating point (decimal)
complex: Complex number
2. Text Type
str: String
3. Sequence Types
list: Ordered, changeable, allows duplicates
tuple: Ordered, unchangeable (immutable)
range: Immutable sequence of integers, most often used in for loops
4. Mapping Type
dict: Key-value pairs
5. Set Types
set, frozenset
6. Boolean Type
bool: True or False
7. Binary Types
bytes, bytearray, memoryview
Practical Examples + Notes for ML
1. Numbers: int, float, complex
ML Note:
When feeding data to machine learning models:
Use int or float.
Avoid complex unless working with signal processing or advanced math.
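The three numeric types can be compared side by side. This is a minimal sketch; the variable names and values (num_rooms, price, signal) are made up for illustration.

```python
# Hypothetical feature values, chosen only for illustration.
num_rooms = 3        # int: whole number
price = 249999.99    # float: decimal number
signal = 2 + 3j      # complex: real part 2, imaginary part 3

print(type(num_rooms))  # <class 'int'>
print(type(price))      # <class 'float'>
print(type(signal))     # <class 'complex'>

# Dividing with / always produces a float, even for two ints.
price_per_room = price / num_rooms
print(type(price_per_room))  # <class 'float'>

# A complex number exposes its two parts as floats.
print(signal.real, signal.imag)  # 2.0 3.0
```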
2. String: str
Strings are used to hold text like:
Category labels ("spam", "ham")
Column names in pandas
File paths ("data/train.csv")
ML Note: Use label encoding or one-hot encoding to convert strings into numbers for ML models.
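Both encodings mentioned in the note can be sketched in plain Python. The names label_to_id, encoded, and one_hot are illustrative; in practice a library encoder (for example, those in scikit-learn) would usually do this work.

```python
labels = ["spam", "ham", "ham", "spam"]

# Label encoding: give each distinct label an integer id.
# Sorting the classes makes the mapping deterministic.
classes = sorted(set(labels))  # ['ham', 'spam']
label_to_id = {name: i for i, name in enumerate(classes)}
encoded = [label_to_id[name] for name in labels]
print(encoded)  # [1, 0, 0, 1]

# One-hot encoding: one 0/1 column per class, with a single 1 per row.
one_hot = [
    [1 if label_to_id[name] == i else 0 for i in range(len(classes))]
    for name in labels
]
print(one_hot)  # [[0, 1], [1, 0], [1, 0], [0, 1]]
```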
3. List: list
Lists hold ordered items.
Can hold mixed types, but that’s discouraged in ML input.
ML Use Case:
Store a row of features.
Hold dataset samples before converting to NumPy or pandas.
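The two use cases above might look like the following sketch. The feature values are made up, and the conversion step itself is left out so the example runs without NumPy or pandas installed.

```python
# One row of features for a single sample (hypothetical values).
row = [5.1, 3.5, 1.4, 0.2]

# Lists are mutable: items can be replaced or appended in place.
row[0] = 5.0
row.append(1.0)
print(row)  # [5.0, 3.5, 1.4, 0.2, 1.0]

# A dataset collected as a list of rows before conversion.
samples = []
samples.append([5.1, 3.5, 1.4, 0.2])
samples.append([4.9, 3.0, 1.4, 0.2])
print(len(samples), len(samples[0]))  # 2 4

# Rows should all have the same length before they become an array.
all_same_length = len({len(r) for r in samples}) == 1
print(all_same_length)  # True
```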
4. Tuple: tuple
Similar to a list, but immutable (cannot be changed).
Often used for coordinates and other fixed-size data.
ML Note: Use when you want fixed data that should not be changed (e.g., image shape (224, 224, 3)).
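A short sketch of the image-shape example, showing unpacking and what happens when code tries to modify a tuple. The variable names are illustrative.

```python
# Image shape as (height, width, channels).
image_shape = (224, 224, 3)

# Tuples unpack cleanly into named variables.
height, width, channels = image_shape
print(height * width * channels)  # 150528

# Attempting to change an item raises TypeError.
try:
    image_shape[0] = 256
except TypeError as err:
    print("tuples are immutable:", err)
```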
5. Dictionary: dict
Holds key-value pairs.
Fast lookup, widely used for configurations and mappings.
ML Use:
Store model settings: {"learning_rate": 0.01}
Mapping labels: {"cat": 0, "dog": 1}
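Both uses can be sketched as follows. The "epochs" and "batch_size" keys and their values are hypothetical additions for illustration only.

```python
# Model settings stored as a config dict.
config = {"learning_rate": 0.01}
config["epochs"] = 10  # hypothetical extra setting

lr = config["learning_rate"]               # direct lookup by key
batch_size = config.get("batch_size", 32)  # fallback when the key is missing
print(lr, batch_size)  # 0.01 32

# Label mapping, plus the reverse mapping for decoding predictions.
label_to_id = {"cat": 0, "dog": 1}
id_to_label = {i: name for name, i in label_to_id.items()}
print(id_to_label[1])  # dog
```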
6. Set: set
Unordered, no duplicates.
Useful to remove duplicates or check membership.
ML Tip: Use set() to find unique classes in your target column:
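A minimal sketch, with a plain list standing in for the target column (the values are made up):

```python
# A plain list stands in for the target column of a dataset.
target = ["cat", "dog", "cat", "bird", "dog"]

# set() drops duplicates; the result has no guaranteed order.
unique_classes = set(target)
print(len(unique_classes))  # 3

# Sort when a stable order is needed, e.g. for building a label mapping.
print(sorted(unique_classes))  # ['bird', 'cat', 'dog']

# Membership checks on a set are fast.
print("cat" in unique_classes)   # True
print("fish" in unique_classes)  # False
```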
7. Boolean: bool
Used in:
Conditional statements
Controlling training loops
Evaluation (e.g., accuracy > 0.9)
ML Use:
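The sketch below shows a boolean produced by an evaluation check and a boolean flag that stops a training loop early. The accuracy, loss values, and target_loss threshold are hypothetical.

```python
# A comparison produces a bool.
accuracy = 0.93  # hypothetical evaluation result
is_good_enough = accuracy > 0.9
print(is_good_enough, type(is_good_enough))  # True <class 'bool'>

# A flag that controls a (simulated) training loop.
losses = [0.9, 0.5, 0.2, 0.05, 0.04]  # hypothetical loss per epoch
target_loss = 0.1
converged = False
epochs_run = 0

for loss in losses:
    epochs_run += 1
    if loss < target_loss:
        converged = True
        break  # stop early once the target is reached

print(converged, epochs_run)  # True 4
```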
8. Type Checking with type() and isinstance()
Tip: Always validate your data types before passing data to ML models!
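A sketch of both checks, followed by a small validator. validate_features is a hypothetical helper written for this example, not part of any library.

```python
value = 3.14
print(type(value))                      # <class 'float'>
print(isinstance(value, float))         # True
print(isinstance(value, (int, float)))  # True: a tuple checks several types

# bool is a subclass of int, so isinstance and type() can disagree.
print(isinstance(True, int))  # True
print(type(True) is int)      # False


def validate_features(row):
    """Raise TypeError unless every item is an int or float (bools rejected)."""
    for item in row:
        if isinstance(item, bool) or not isinstance(item, (int, float)):
            raise TypeError(f"non-numeric feature: {item!r}")
    return True


print(validate_features([5.1, 3, 1.4]))  # True
```

isinstance() is usually the better choice for validation because it accepts subclasses and several candidate types at once, while type() answers the narrower question of the exact type.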
9. Type Casting (Conversion)
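Real-world ML Example: Values read from a CSV file with Python's csv module arrive as str, and loaders that infer types can still fall back to strings for messy columns. Convert such values to int or float before training.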
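A sketch of that conversion using the standard csv module. The column names and values are made up, and io.StringIO stands in for an opened file so the example is self-contained.

```python
import csv
import io

# io.StringIO stands in for an opened CSV file (hypothetical contents).
raw = io.StringIO("age,height,label\n25,1.75,cat\n31,1.62,dog\n")
rows = list(csv.DictReader(raw))
print(dict(rows[0]))  # {'age': '25', 'height': '1.75', 'label': 'cat'}

# Every value is a str, so numeric columns are cast explicitly.
ages = [int(r["age"]) for r in rows]
heights = [float(r["height"]) for r in rows]
print(ages, heights)  # [25, 31] [1.75, 1.62]

# Other common conversions and their pitfalls.
print(int(3.99))               # 3 (truncates, does not round)
print(float("2.5"))            # 2.5
print(str(42))                 # 42
print(bool(0), bool("False"))  # False True (any non-empty string is True)

# int() rejects a decimal string; go through float() first.
try:
    int("3.7")
except ValueError:
    print(int(float("3.7")))  # 3
```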
Summary Table
Data Type | Example    | ML Usage
int       | 5          | ID, count, label
float     | 3.14       | Feature value, weight
str       | "cat"      | Label, path, text
bool      | True       | Condition, flag
list      | [1, 2]     | Features, samples
tuple     | (2, 3)     | Shape, coordinates
dict      | {"a": 1}   | Configs, mappings
set       | {"a", "b"} | Unique values
Example: Simple ML Data Representation
id: int
features: list of float
label: str
This is a common structure before converting to a pandas DataFrame or a NumPy array.
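One way to write such a record is a dict per sample, sketched below. The field values are made up, and the dict-per-sample layout is an assumption; the same fields could equally live in a tuple or a small class.

```python
# One sample: id is an int, features is a list of float, label is a str.
sample = {
    "id": 1,
    "features": [5.1, 3.5, 1.4, 0.2],
    "label": "cat",
}

# A small dataset is then a list of such records (hypothetical values).
dataset = [
    sample,
    {"id": 2, "features": [4.9, 3.0, 1.4, 0.2], "label": "dog"},
]

# Split into a feature matrix X and a label list y.
X = [s["features"] for s in dataset]
y = [s["label"] for s in dataset]
print(X)  # [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]]
print(y)  # ['cat', 'dog']

# Check the field types match the structure described above.
for s in dataset:
    assert isinstance(s["id"], int)
    assert all(isinstance(v, float) for v in s["features"])
    assert isinstance(s["label"], str)
```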