Green AI – Programming Language Impact on Energy Consumption

Study Aim

  • Investigate impact of programming languages on AI energy efficiency.
  • Main RQ: "How do programming languages affect AI energy consumption during training and inference?"

Experimental Design

  • Programming languages: 5 (C++, Java, Python, MATLAB, R).
  • Algorithms: 7 (KNN, SVC, AdaBoost, Decision Tree, Logistic Regression, Naive Bayes, Random Forest).
  • Datasets: 3 UCI sets (Iris: 150 instances, 4 features; Breast Cancer: 699 instances, 30 features; Wine Quality: 4.9k instances, 4 features).
  • Phases analyzed: Training vs. Inference (data split 80% train, 20% test).
  • Runs: 30 repetitions per configuration ⇒ 6.3k total executions.
  • Hardware: Apple M2, 8-core CPU, 10-core GPU, 8 GB RAM.
  • Energy metric: Total Joules (CPU + GPU + RAM) via CodeCarbon.
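
The run count above can be checked by enumerating the experimental grid. The sketch below assumes that training and inference are counted as separate executions for every language, algorithm and dataset combination; the source does not spell this out, but it is the reading under which the total comes to 6.3k. The variable names are illustrative.

```python
from itertools import product

# Factors of the experimental design, as listed in the summary.
LANGUAGES = ["C++", "Java", "Python", "MATLAB", "R"]
ALGORITHMS = [
    "KNN", "SVC", "AdaBoost", "Decision Tree",
    "Logistic Regression", "Naive Bayes", "Random Forest",
]
DATASETS = ["Iris", "Breast Cancer", "Wine Quality"]
PHASES = ["training", "inference"]  # assumption: each phase is its own execution
REPETITIONS = 30

# One configuration = (language, algorithm, dataset, phase).
configurations = list(product(LANGUAGES, ALGORITHMS, DATASETS, PHASES))
total_executions = len(configurations) * REPETITIONS

print(len(configurations), total_executions)  # 210 6300
```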

Key Training Results (RQ1.1)

  • Energy ranking (least to most):
    1. C++
    2. Java (≈ 4× C++)
    3. MATLAB (≈ 7× C++)
    4. Python (≈ 15× C++)
    5. R (≈ 37× C++)
  • Outliers heavily influence totals:
    • R Logistic Regression = 71.1% of R’s training energy.
    • Python SVC = 63.38% of Python’s training energy.
  • Java supplies the most energy-efficient implementation for 4/7 algorithms; C++ has the lowest cumulative energy.

Key Inference Results (RQ1.2)

  • Energy ranking (least to most):
    1. Java
    2. C++ (≈ 2× Java)
    3. Python (≈ 35× Java)
    4. R (≈ 39× Java)
    5. MATLAB (≈ 54× Java)
  • Single-algorithm impacts:
    • Python SVC = 64.8% of Python’s inference energy.
    • MATLAB Naive Bayes = 62.7% of MATLAB’s inference energy.
  • Java hosts 4/7 of the most efficient inference algorithms; C++ hosts the remaining 3.
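
To see how strongly a single outlier shapes the rankings in both phases, the reported multipliers can be combined with the outlier shares. The sketch below uses only figures from the training and inference results; the function name is illustrative, and because the multipliers are rounded the outputs are rough estimates, not results from the study. Each value is the energy of a language's remaining six algorithms, expressed in units of the baseline language's total for all seven (C++ for training, Java for inference).

```python
def multiplier_without_outlier(multiplier: float, outlier_share: float) -> float:
    """Energy of the non-outlier algorithms, in units of the baseline total.

    multiplier     -- language total relative to the baseline (e.g. 37 for R).
    outlier_share  -- fraction of the language total consumed by one algorithm.
    """
    if not 0.0 <= outlier_share < 1.0:
        raise ValueError("outlier_share must be in [0, 1)")
    return multiplier * (1.0 - outlier_share)


# (phase, language, outlier algorithm) -> (reported multiplier, reported share)
REPORTED = {
    ("training", "R", "Logistic Regression"): (37, 0.711),
    ("training", "Python", "SVC"): (15, 0.6338),
    ("inference", "Python", "SVC"): (35, 0.648),
    ("inference", "MATLAB", "Naive Bayes"): (54, 0.627),
}

for (phase, language, algorithm), (multiplier, share) in REPORTED.items():
    rest = multiplier_without_outlier(multiplier, share)
    print(f"{phase:9s} {language:6s} without {algorithm}: ~{rest:.1f}x baseline")
```

Under these rounded inputs, R's training energy falls from roughly 37× to about 10.7× the C++ total once Logistic Regression is set aside, and MATLAB's inference energy falls from roughly 54× to about 20.1× the Java total without Naive Bayes. The comparison is one-sided, since the baseline total still includes all seven algorithms.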

Core Insights

  • Compiled / semi-compiled languages (C++, Java) are markedly greener than interpreted ones (Python, MATLAB, R), with differences of up to 54×.
  • Efficiency ranking shifts between phases (C++ leads in training, Java in inference; MATLAB drops from 3rd to last); context (training vs. inference) matters.
  • No algorithm is intrinsically energy-greedy across all languages; implementation quality dominates.
  • Algorithm implementation can override language effects; e.g., C++ Decision Tree is 2nd worst despite overall C++ efficiency.

Practical Recommendations

  • Choose language by dominant phase:
    • C++/Java for sustained training or inference at scale.
    • MATLAB suitable for prototyping; avoid for large-scale inference.
  • Before refactoring, profile specific library implementation; swapping algorithm library may yield bigger gains than switching language.
  • Balance energy gains against development & maintenance cost; pre-compiled libraries in high-level languages can mitigate overhead.
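
The profiling advice above can be sketched as a small repetition harness. This is a minimal illustration, not the study's measurement setup: it defaults to wall-clock time from `time.perf_counter` as a stand-in, which is only a rough proxy for energy. The `meter` parameter is a hypothetical hook for any callable that returns a cumulative reading, so a real energy counter could be substituted. The two candidate functions are placeholders for competing library implementations of the same task.

```python
import statistics
import time
from typing import Callable, Dict


def measure_runs(
    workload: Callable[[], object],
    repetitions: int = 30,
    meter: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Run `workload` repeatedly and summarise the per-run meter deltas.

    `meter` must return a cumulative reading (seconds by default; Joules
    if an energy counter is plugged in instead).
    """
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    samples = []
    for _ in range(repetitions):
        start = meter()
        workload()
        samples.append(meter() - start)
    return {
        "runs": len(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


# Hypothetical candidates: two implementations of the same computation.
def builtin_sum() -> int:
    return sum(range(100_000))


def manual_sum() -> int:
    total = 0
    for i in range(100_000):
        total += i
    return total


for name, candidate in [("builtin", builtin_sum), ("manual", manual_sum)]:
    stats = measure_runs(candidate, repetitions=5)
    print(f"{name}: mean={stats['mean']:.6f} stdev={stats['stdev']:.6f}")
```

Repeating each candidate and reporting the spread alongside the mean mirrors the study's use of 30 repetitions per configuration, and helps separate a real difference between implementations from run-to-run noise.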

Threats to Validity (summary)

  • Limited datasets, algorithms, and single library per language; results not universal.
  • Measurements obtained on a single M2 machine; other hardware may differ.
  • Energy only, excludes carbon-intensity variability.

Future Work

  • Extend to more datasets, deep-learning workloads, additional libraries and hardware.
  • Compare multiple implementations within the same language.
  • Explore optimisation techniques that raise energy efficiency without abandoning current development stacks.

Carbon Footprint of Study

  • Experiment emitted ≈ 42.9 g CO₂ (≈ 0.39 km of electric-car travel).
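
The car-travel equivalence is a simple division, sketched below. The per-kilometre factor is not stated in the source; it is only the value implied by the two rounded figures above, so treat it as an approximation. Function names are illustrative.

```python
STUDY_EMISSIONS_G = 42.9   # reported CO2 emissions of the experiment, in grams
EQUIVALENT_KM = 0.39       # reported electric-car distance equivalent, in km


def implied_factor_g_per_km(emissions_g: float, distance_km: float) -> float:
    """Emission factor (g CO2 per km) implied by an emissions/distance pair."""
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return emissions_g / distance_km


def equivalent_km(emissions_g: float, factor_g_per_km: float) -> float:
    """Distance whose travel emissions equal `emissions_g` at the given factor."""
    if factor_g_per_km <= 0:
        raise ValueError("factor_g_per_km must be positive")
    return emissions_g / factor_g_per_km


factor = implied_factor_g_per_km(STUDY_EMISSIONS_G, EQUIVALENT_KM)
print(round(factor, 1))  # 110.0
```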