Data Types
Nerd Cafe
In Python, every value has a type. A data type defines what kind of value something is and what you can do with it. Think of it like this:
A string holds text.
An int holds whole numbers.
A float holds decimal numbers.
A list holds a collection of items.
And so on...
Knowing data types is critical in machine learning because models expect specific types of data: numeric features for training, labels that often start out as strings, and structured containers such as lists and arrays.
Built-in Data Types in Python (Core Types)
1. Numeric Types
int: Integer (whole number)
float: Floating point (decimal)
complex: Complex number
2. Text Type
str: String
3. Sequence Types
list: Ordered, changeable, allows duplicates
tuple: Ordered, unchangeable (immutable)
range: Used for loops
4. Mapping Type
dict: Key-value pairs
5. Set Types
set, frozenset
6. Boolean Type
bool: True or False
7. Binary Types
bytes, bytearray, memoryview
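As a quick orientation, the sketch below builds one illustrative value for each built-in type listed above and prints its type name. The values themselves are arbitrary and chosen only for demonstration.

```python
# One illustrative value for each built-in type listed above.
examples = [
    42,                   # int
    3.14,                 # float
    2 + 3j,               # complex
    "hello",              # str
    [1, 2, 2],            # list
    (1, 2),               # tuple
    range(3),             # range
    {"a": 1},             # dict
    {1, 2},               # set
    frozenset({1, 2}),    # frozenset
    True,                 # bool
    b"abc",               # bytes
    bytearray(b"abc"),    # bytearray
    memoryview(b"abc"),   # memoryview
]

# type(value).__name__ gives the short name of the value's type.
type_names = [type(value).__name__ for value in examples]

for value, name in zip(examples, type_names):
    print(f"{name:<12} {value!r}")
```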
Practical Examples + Notes for ML
1. Numbers: int, float, complex
ML Note:
When feeding data to machine learning models:
Use int or float.
Avoid complex unless working with signal processing or advanced math.
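A minimal sketch of the three numeric types in use. The variable names and values (n_samples, learning_rate, z) are made up for illustration.

```python
n_samples = 150          # int: a whole-number count
learning_rate = 0.01     # float: a fractional value
z = 3 + 4j               # complex: real part 3, imaginary part 4

# Mixing int and float in arithmetic produces a float.
scaled = n_samples * learning_rate
print(type(scaled).__name__)            # float

# True division always returns a float; floor division of two ints stays an int.
print(type(n_samples / 2).__name__)     # float
print(type(n_samples // 2).__name__)    # int

# abs() of a complex number is its magnitude.
print(abs(z))                           # 5.0
```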
2. String: str
Strings are used to hold text like:
Category labels ("spam", "ham")
Column names in pandas
File paths ("data/train.csv")
ML Note: Use label encoding or one-hot encoding to convert strings into numbers for ML models.
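The two encodings mentioned in the note can be sketched in plain Python as follows. This is only an illustration of the idea on a made-up list of labels; in practice a library encoder (for example the ones in scikit-learn) would usually be used instead.

```python
labels = ["spam", "ham", "ham", "spam"]

# Label encoding: give each distinct class an integer.
# Sorting makes the class order stable across runs.
classes = sorted(set(labels))                          # ['ham', 'spam']
class_to_index = {name: i for i, name in enumerate(classes)}
encoded = [class_to_index[name] for name in labels]
print(encoded)    # [1, 0, 0, 1]

# One-hot encoding: one column per class, with a 1 in the matching column.
one_hot = [[1 if name == c else 0 for c in classes] for name in labels]
print(one_hot)    # [[0, 1], [1, 0], [1, 0], [0, 1]]
```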
3. List: list
Lists hold ordered items.
Can hold mixed types, but that’s discouraged in ML input.
ML Use Case:
Store a row of features.
Hold dataset samples before converting to NumPy or pandas.
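A short sketch of both use cases, using made-up feature values. It also shows why mixed types are discouraged: a simple check for "all numeric" fails as soon as a string or None slips in.

```python
# One row of features for a single sample (values are made up).
row = [5.1, 3.5, 1.4, 0.2]

# A small dataset as a list of rows, built up before any conversion.
samples = []
samples.append(row)
samples.append([4.9, 3.0, 1.4, 0.2])

print(len(samples))      # 2
print(samples[0][2])     # 1.4

# Lists are mutable, so values can be changed in place.
samples[1][0] = 5.0

# Mixed types are legal Python but make numeric conversion harder later.
mixed = [5.1, "3.5", None]
all_numeric = all(isinstance(v, (int, float)) for v in mixed)
print(all_numeric)       # False
```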
4. Tuple: tuple
Similar to a list, but immutable (cannot be changed).
Often used for coordinates and other fixed-size data.
ML Note: Use when you want fixed data that should not be changed (e.g., image shape (224, 224, 3)).
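A minimal sketch using the image shape from the note. It unpacks the tuple into named parts and shows that an attempt to modify it raises TypeError.

```python
image_shape = (224, 224, 3)

# Tuples can be unpacked into separate names.
height, width, channels = image_shape
print(height * width * channels)    # 150528

# Tuples are immutable: item assignment raises TypeError.
try:
    image_shape[0] = 256
    mutation_failed = False
except TypeError as error:
    mutation_failed = True
    print("Cannot modify a tuple:", error)
```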
5. Dictionary: dict
Holds key-value pairs.
Fast lookup, widely used for configurations and mappings.
ML Use:
Store model settings: {"learning_rate": 0.01}
Mapping labels: {"cat": 0, "dog": 1}
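A brief sketch of both uses. The "epochs" and "batch_size" keys are hypothetical additions for illustration; only "learning_rate" and the cat/dog mapping come from the text above.

```python
config = {"learning_rate": 0.01}
config["epochs"] = 10                        # hypothetical extra setting

label_map = {"cat": 0, "dog": 1}

# Look up a value by key.
print(config["learning_rate"])               # 0.01

# .get() returns a default instead of raising KeyError for a missing key.
batch_size = config.get("batch_size", 32)    # "batch_size" is not in config
print(batch_size)                            # 32

# Invert the mapping to turn predicted indices back into label names.
index_to_label = {index: name for name, index in label_map.items()}
print(index_to_label[1])                     # dog
```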
6. Set: set
Unordered, no duplicates.
Useful to remove duplicates or check membership.
ML Tip: Use set() to find unique classes in your target column:
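A minimal sketch, assuming the target column has already been read into a plain Python list (the values here are made up):

```python
targets = ["cat", "dog", "cat", "bird", "dog"]

# set() drops duplicates; the result has no guaranteed order.
unique_classes = set(targets)
print(len(unique_classes))         # 3

# Sort when a stable, readable order is needed.
print(sorted(unique_classes))      # ['bird', 'cat', 'dog']

# Membership checks on a set are fast.
print("cat" in unique_classes)     # True
```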
7. Boolean: bool
Used in:
Conditional statements
Controlling training loops
Evaluation (e.g., accuracy > 0.9)
ML Use:
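A minimal sketch of booleans controlling a training-style loop and summarizing an evaluation. The per-epoch accuracies and the 0.9 target are made-up values, and no real model is trained here.

```python
max_epochs = 5
target = 0.9
accuracies = [0.70, 0.85, 0.93, 0.95, 0.96]   # made-up per-epoch accuracies

epochs_run = 0
for epoch in range(max_epochs):
    epochs_run += 1
    reached_target = accuracies[epoch] > target   # comparison yields a bool
    if reached_target:
        break                                     # stop early once the target is met

print(epochs_run)        # 3

# bool is a subclass of int, so True counts as 1 when summed.
predictions_correct = [True, False, True, True]
print(sum(predictions_correct) / len(predictions_correct))   # 0.75
```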
8. Type Checking with type() and isinstance()
Tip: Always validate your data types before passing data to ML models!
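The sketch below shows the two functions side by side, followed by a small validation helper. The helper validate_features is hypothetical and written only for this example; it rejects bool explicitly because bool is a subclass of int and would otherwise pass a numeric check.

```python
value = 3.14

print(type(value))                        # <class 'float'>
print(isinstance(value, float))           # True
print(isinstance(value, (int, float)))    # True: a tuple checks several types at once

# isinstance() respects inheritance; an exact type comparison does not.
print(isinstance(True, int))              # True, because bool subclasses int
print(type(True) is int)                  # False


def validate_features(row):
    """Hypothetical helper: raise TypeError unless every item is an int or float."""
    for item in row:
        if isinstance(item, bool) or not isinstance(item, (int, float)):
            raise TypeError(f"Expected a number, got {type(item).__name__}: {item!r}")
    return True


print(validate_features([1, 2.5, 3]))     # True
```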
9. Type Casting (Conversion)
Real-world ML Example: CSV files often load data as str. You must convert to int or float:
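A minimal sketch of that conversion, assuming a row has already been read from a CSV file as a list of strings. The row values and the to_float helper are made up for illustration.

```python
raw_row = ["42", "3.14", "7"]      # values as they might arrive from a CSV file

sample_id = int(raw_row[0])        # "42"   -> 42
feature = float(raw_row[1])        # "3.14" -> 3.14
print(sample_id + 1)               # 43

# int() cannot parse a decimal string directly; int("3.14") raises ValueError.
# Converting through float first works, and int() then truncates toward zero.
print(int(float("3.14")))          # 3


def to_float(text, default=None):
    """Hypothetical helper: convert text to float, or return default if it is not a number."""
    try:
        return float(text)
    except ValueError:
        return default


print(to_float("2.5"))             # 2.5
print(to_float("abc"))             # None
```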
Summary Table
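| Type | Example | Common ML use |
| --- | --- | --- |
| int | 5 | ID, count, label |
| float | 3.14 | Feature value, weight |
| str | "cat" | Label, path, text |
| bool | True | Condition, flag |
| list | [1, 2] | Features, samples |
| tuple | (2, 3) | Shape, coordinates |
| dict | {"a": 1} | Configs, mappings |
| set | {"a", "b"} | Unique values |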
Example: Simple ML Data Representation
id: int
features: list of float
label: str
This is a common structure before converting to pandas DataFrame or NumPy array.
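One way to sketch this structure is a list of dictionaries, one per sample, using the id, features, and label fields described above. The sample values are made up.

```python
# Each sample is a dict with an int id, a list of float features, and a str label.
samples = [
    {"id": 1, "features": [5.1, 3.5, 1.4], "label": "cat"},
    {"id": 2, "features": [4.9, 3.0, 1.3], "label": "dog"},
]

# Split into a feature matrix (list of rows) and a list of labels.
X = [sample["features"] for sample in samples]
y = [sample["label"] for sample in samples]

print(X)    # [[5.1, 3.5, 1.4], [4.9, 3.0, 1.3]]
print(y)    # ['cat', 'dog']

# Check that every sample matches the expected types before converting further.
is_valid = all(
    isinstance(sample["id"], int)
    and all(isinstance(v, float) for v in sample["features"])
    and isinstance(sample["label"], str)
    for sample in samples
)
print(is_valid)    # True
```

From here, X could be passed to numpy.array and samples to pandas.DataFrame, assuming those libraries are installed.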
Keywords
data types, python data types, int, float, str, list, tuple, dict, set, bool, type conversion, type casting, isinstance, type, machine learning, data preprocessing, feature engineering, python basics, numeric types, sequence types, nerd cafe