Python Encyclopedia for Academics

Data Types

Nerd Cafe

In Python, every value has a type. A data type defines what kind of value something is and what you can do with it. Think of it like this:

  • A string holds text.

  • An int holds whole numbers.

  • A float holds decimal numbers.

  • A list holds a collection of items.

  • And so on...

Knowing data types is critical in machine learning because models expect specific kinds of data: numeric features for training, labels that often start out as strings, and structured containers such as lists and arrays.

Built-in Data Types in Python (Core Types)

1. Numeric Types

  • int: Integer (whole number)

  • float: Floating point (decimal)

  • complex: Complex number

2. Text Type

  • str: String

3. Sequence Types

  • list: Ordered, changeable, allows duplicates

  • tuple: Ordered, unchangeable (immutable)

  • range: Immutable sequence of integers, typically used in for loops

4. Mapping Type

  • dict: Key-value pairs

5. Set Types

  • set, frozenset

6. Boolean Type

  • bool: True or False

7. Binary Types

  • bytes, bytearray, memoryview
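As a quick check of the categories above, the following sketch builds one value of each built-in type and prints its type name. The names `examples` and `type_names` are illustrative only and do not come from the original page.

```python
# One illustrative value per built-in type listed above.
examples = [
    10,                      # int
    3.14,                    # float
    2 + 3j,                  # complex
    "text",                  # str
    [1, 2, 3],               # list
    (1, 2, 3),               # tuple
    range(3),                # range
    {"a": 1},                # dict
    {1, 2, 3},               # set
    frozenset({1, 2, 3}),    # frozenset
    True,                    # bool
    b"abc",                  # bytes
    bytearray(b"abc"),       # bytearray
    memoryview(b"abc"),      # memoryview
]

# type(value).__name__ gives the short name of each value's type.
type_names = [type(value).__name__ for value in examples]
for value, name in zip(examples, type_names):
    print(f"{name:<12} {value!r}")
```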

Practical Examples + Notes for ML

1. Numbers: int, float, complex

a = 10           # int
b = 3.14         # float
c = 2 + 3j       # complex

ML Note:

When feeding data to machine learning models:

  • Use int or float.

  • Avoid complex unless working with signal processing or advanced math.
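A small sketch of how the numeric types interact may help here. It assumes nothing beyond the built-in operators; the variable names `total`, `true_div`, and `floor_div` are made up for the example.

```python
a = 10        # int
b = 3.14      # float
c = 2 + 3j    # complex

total = a + b            # int + float gives a float
true_div = a / 4         # "/" always returns a float
floor_div = a // 4       # "//" between two ints returns an int

print(type(total).__name__, type(true_div).__name__, type(floor_div).__name__)
print(true_div, floor_div)   # 2.5 2
print(c.real, c.imag)        # 2.0 3.0 (both parts are floats)
```

This is one reason feature values are usually stored as float: division and averaging produce floats anyway.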

2. String: str

name = "Mr Nerd Cafe"

Strings are used to hold text like:

  • Category labels ("spam", "ham")

  • Column names in pandas

  • File paths ("data/train.csv")

ML Note: Use label encoding or one-hot encoding to convert strings into numbers for ML models.
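To make the two encodings concrete, here is a minimal pure-Python sketch. It is not how any particular library implements them; the names `classes`, `class_to_index`, `encoded`, and `one_hot` are assumptions chosen for illustration.

```python
labels = ["spam", "ham", "ham", "spam"]

# Label encoding: give each distinct string an integer (sorted for a stable order).
classes = sorted(set(labels))                      # ['ham', 'spam']
class_to_index = {name: i for i, name in enumerate(classes)}
encoded = [class_to_index[name] for name in labels]

# One-hot encoding: one column per class, with 1 marking the sample's class.
one_hot = [[1 if i == idx else 0 for i in range(len(classes))] for idx in encoded]

print(encoded)   # [1, 0, 0, 1]
print(one_hot)   # [[0, 1], [1, 0], [1, 0], [0, 1]]
```

In practice a library encoder would normally be used; the sketch only shows what the conversion does to the data.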

3. List: list

features = [5.1, 3.5, 1.4, 0.2]

  • Lists hold ordered items.

  • Can hold mixed types, but that’s discouraged in ML input.

ML Use Case:

  • Store a row of features.

  • Hold dataset samples before converting to NumPy or pandas.
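The use cases above could look roughly like this. The feature values are illustrative, and `rows`, `first_row`, and `first_column` are hypothetical names.

```python
# Each inner list is one sample (row) of four feature values.
rows = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
]
rows.append([6.2, 3.4, 5.4, 2.3])          # lists are mutable, so rows can be added

first_row = rows[0]                         # indexing returns one sample
first_column = [row[0] for row in rows]     # a comprehension extracts one feature

print(len(rows), first_row, first_column)
```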

4. Tuple: tuple

point = (3.5, 7.2)

  • Similar to list, but immutable (cannot be changed).

  • Often used for coordinates, fixed-size data.

ML Note: Use when you want fixed data that should not be changed (e.g., image shape (224, 224, 3)).
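The following sketch illustrates both points: unpacking a fixed-size tuple and what happens when code tries to modify it. The `changed` flag is only there to record the outcome.

```python
image_shape = (224, 224, 3)

# Unpacking gives each dimension its own name.
height, width, channels = image_shape

# Item assignment is rejected because tuples are immutable.
try:
    image_shape[0] = 256
    changed = True
except TypeError:
    changed = False

print(height, width, channels, changed)  # 224 224 3 False
```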

5. Dictionary: dict

student = {"name": "Mr Nerd Cafe", "age": 25, "score": 98.5}

  • Holds key-value pairs.

  • Fast lookup, widely used for configurations and mappings.

ML Use:

  • Store model settings: {"learning_rate": 0.01}

  • Mapping labels: {"cat": 0, "dog": 1}
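As a rough sketch of both uses, the example below reads a setting with a fallback and inverts a label mapping to decode predictions. The `batch_size` key, its fallback of 32, and the `predictions` list are assumptions made up for the example.

```python
config = {"learning_rate": 0.01}
label_to_index = {"cat": 0, "dog": 1}

# .get() returns a fallback instead of raising KeyError for a missing key.
batch_size = config.get("batch_size", 32)    # 32 is an arbitrary fallback

# Invert the mapping to turn predicted indices back into label strings.
index_to_label = {index: label for label, index in label_to_index.items()}

predictions = [1, 0, 1]                      # hypothetical model output
decoded = [index_to_label[i] for i in predictions]

print(batch_size, decoded)                   # 32 ['dog', 'cat', 'dog']
```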

6. Set: set

unique_labels = {"cat", "dog", "mouse"}

  • Unordered, no duplicates.

  • Useful to remove duplicates or check membership.

ML Tip: Use set() to find unique classes in your target column:

labels = ["cat", "dog", "dog", "cat", "mouse"]
print(set(labels))  # e.g. {'mouse', 'cat', 'dog'}; set order is not guaranteed
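Sets also support membership tests and differences, which can be handy for spotting classes that appear in one split but not another. In this sketch, `test_labels` is a hypothetical test split invented for illustration.

```python
train_labels = ["cat", "dog", "dog", "cat", "mouse"]
test_labels = ["dog", "bird", "cat"]          # hypothetical test split

train_classes = set(train_labels)
unseen = set(test_labels) - train_classes     # classes that appear only in the test split

print("mouse" in train_classes)               # True
print(sorted(train_classes))                  # ['cat', 'dog', 'mouse']
print(unseen)                                 # {'bird'}
```

Wrapping the set in sorted() gives a stable order for printing or for building a label mapping.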

7. Boolean: bool

is_valid = True
is_training = False

Used in:

  • Conditional statements

  • Controlling training loops

  • Evaluation (e.g., accuracy > 0.9)

ML Use:

accuracy = 0.97  # example value; normally computed during evaluation
if accuracy > 0.95:
    print("Excellent model!")
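Booleans also show up when computing a metric. Because True counts as 1 and False as 0, a list of comparisons can be summed directly. The labels and predictions below are made up for the sketch.

```python
y_true = ["cat", "dog", "dog", "cat"]     # made-up labels
y_pred = ["cat", "dog", "cat", "cat"]     # made-up predictions

correct = [t == p for t, p in zip(y_true, y_pred)]   # list of bool
accuracy = sum(correct) / len(correct)               # True counts as 1, False as 0
is_good = accuracy > 0.9                             # a comparison yields a bool

print(correct, accuracy, is_good)  # [True, True, False, True] 0.75 False
```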

8. Type Checking with type() and isinstance()

x = 42
print(type(x))             # <class 'int'>
print(isinstance(x, int))  # True

Tip: Always validate your data types before passing to ML models!
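One way such a validation could look is sketched below. `is_numeric_row` is a hypothetical helper, not a standard function. It rejects bool explicitly because bool is a subclass of int and would otherwise pass the check.

```python
def is_numeric_row(row):
    """Return True if every item is an int or float, excluding bool."""
    return all(
        isinstance(value, (int, float)) and not isinstance(value, bool)
        for value in row
    )

print(isinstance(True, int))                # True: bool is a subclass of int
print(is_numeric_row([5.1, 3, 1.4]))        # True
print(is_numeric_row([5.1, "3.5", 1.4]))    # False: contains a str
print(is_numeric_row([5.1, True, 1.4]))     # False: bool is rejected here
```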

9. Type Casting (Conversion)

x = "42"
x = int(x)  # Convert str to int

Real-world ML Example: Python's csv module reads every field as str, so numeric columns must be converted to int or float before use:

age = float("23.5")
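A minimal sketch of this conversion with the standard-library csv module follows. The CSV text, its column names, and its values are invented for the example, and io.StringIO stands in for a real file.

```python
import csv
import io

# In-memory stand-in for a CSV file; the columns and values are made up.
csv_text = "age,score,label\n23.5,98,cat\n31,75,dog\n"

rows = []
for record in csv.DictReader(io.StringIO(csv_text)):
    # Every field arrives as str, so numeric columns are converted explicitly.
    rows.append({
        "age": float(record["age"]),
        "score": int(record["score"]),
        "label": record["label"],
    })

print(rows[0])  # {'age': 23.5, 'score': 98, 'label': 'cat'}

# int() does not accept decimal text, so the failure is caught and reported.
try:
    int("23.5")
    conversion_failed = False
except ValueError:
    conversion_failed = True
print(conversion_failed)  # True
```

Converting with float() first is the safer default when a column may contain decimals.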

Summary Table

| Data Type | Example    | ML Usage              |
|-----------|------------|-----------------------|
| int       | 5          | ID, count, label      |
| float     | 3.14       | Feature value, weight |
| str       | "cat"      | Label, path, text     |
| bool      | True       | Condition, flag       |
| list      | [1, 2]     | Features, samples     |
| tuple     | (2, 3)     | Shape, coordinates    |
| dict      | {"a": 1}   | Configs, mappings     |
| set       | {"a", "b"} | Unique values         |

Example: Simple ML Data Representation

sample = {
    "id": 101,
    "features": [5.1, 3.5, 1.4, 0.2],
    "label": "setosa"
}

  • id: int

  • features: list of float

  • label: str

This is a common structure before converting to a pandas DataFrame or a NumPy array.
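Before that conversion, a list of such samples is typically split into a feature matrix and a label list. The sketch below does this in plain Python. The second sample and the names `samples`, `X`, `y`, and `ids` are assumptions added for illustration.

```python
samples = [
    {"id": 101, "features": [5.1, 3.5, 1.4, 0.2], "label": "setosa"},
    {"id": 102, "features": [7.0, 3.2, 4.7, 1.4], "label": "versicolor"},  # made-up second sample
]

X = [sample["features"] for sample in samples]   # nested list: one row per sample
y = [sample["label"] for sample in samples]      # one label per sample
ids = [sample["id"] for sample in samples]

print(X)    # [[5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4]]
print(y)    # ['setosa', 'versicolor']
print(ids)  # [101, 102]
```

Keeping every row the same length and every feature numeric makes the nested list straightforward to hand to an array or DataFrame constructor later.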

Keywords

data types, python data types, int, float, str, list, tuple, dict, set, bool, type conversion, type casting, isinstance, type, machine learning, data preprocessing, feature engineering, python basics, numeric types, sequence types, nerd cafe


Last updated 2 months ago