When NOT to Use Machine Learning: Candid advice from an MLE
As a Machine Learning Engineer, I’ve had to tell people NOT to use ML surprisingly often. Sounds counterintuitive for the role, doesn’t it? But while ML is powerful, it’s not a silver bullet. Using ML where it isn’t required or feasible wastes time, money, and resources. In this post, I’ll go through the scenarios where ML is the wrong tool.
1. When Convex Optimization Can Solve the Problem
This has been perhaps the most common case: the goal is constrained optimization, and people jump straight to ML. Convex optimization is a mathematical technique for problems with a convex objective and convex constraints. For such problems any local optimum is also a global one, so a solver can return a provably optimal solution. When a problem fits this form, these methods are typically faster, more reliable, and easier to interpret than machine learning models, and they need no training data.
What Is Convex Optimization?
Convex optimization means minimizing a convex objective (e.g., cost), or equivalently maximizing a concave one (e.g., efficiency), over a feasible set defined by convex constraints. Linear programs are the most familiar special case.
Examples Where Convex Optimization Has Been Used
- Resource Allocation
  - Problem: A manufacturing company wants to minimize production costs while meeting demand and staying within resource limits.
  - Solution: Decide how much of each product to produce by solving an optimization problem. If costs and resource usage are linear in the quantities produced, this is a linear program, which is convex.
- Portfolio Optimization
  - Problem: An investor wants to allocate capital across different stocks to maximize expected return while keeping risk (portfolio variance) below a threshold.
  - Solution: Apply the Markowitz mean-variance optimization model, a classic convex optimization problem.
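To make the resource allocation case concrete, here is a minimal sketch of a tiny linear program solved without any ML. Everything in it is an assumption for illustration: the two products P and Q, the costs, and the constraint values are made up. It uses brute-force vertex enumeration, which only works for toy problems with two variables and relies on the fact that a bounded linear program attains its optimum at a vertex of the feasible region.

```python
from itertools import combinations

# Hypothetical problem: choose quantities P and Q of two products.
# Minimize cost 5*P + 8*Q, with every constraint written as a*P + b*Q <= c.
COST = (5.0, 8.0)
CONSTRAINTS = [
    (-1.0, -1.0, -100.0),  # total demand: P + Q >= 100
    (0.0, -1.0, -30.0),    # contract minimum: Q >= 30
    (1.0, 2.0, 180.0),     # machine hours: P + 2*Q <= 180
    (-1.0, 0.0, 0.0),      # non-negativity: P >= 0
]


def solve_small_lp(cost, constraints, tol=1e-9):
    """Return (best_point, best_cost) by checking every vertex of the feasible region."""
    best_point, best_cost = None, float("inf")
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the intersection of the two boundary lines
        p = (c1 * b2 - c2 * b1) / det
        q = (a1 * c2 - a2 * c1) / det
        # Keep the intersection only if it satisfies every constraint
        if all(a * p + b * q <= c + tol for a, b, c in constraints):
            value = cost[0] * p + cost[1] * q
            if value < best_cost:
                best_point, best_cost = (p, q), value
    return best_point, best_cost


if __name__ == "__main__":
    point, value = solve_small_lp(COST, CONSTRAINTS)
    print(f"Produce P={point[0]:.0f}, Q={point[1]:.0f} at cost {value:.0f}")
```

With these made-up numbers the answer is 70 units of P and 30 of Q, at a cost of 590. For anything beyond a toy problem you would hand the same formulation to a real solver (SciPy's linprog or CVXPY, for example) rather than enumerate vertices, but the point stands: no training data, no model, and the result is provably optimal.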
2. When There Isn’t Enough Data
ML thrives on data. The more, the better. Without sufficient high-quality data, ML models are like chefs without ingredients.
Imagine you’re a startup building a personalized expense tracking app. If you only have 100 customer transactions, a model trained on them is unlikely to generalize, and you won’t have enough held-out data to tell whether it does. In this case, focus on gathering more data and understanding your customers before diving into ML.
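A rough way to see the problem: if you hold out 20 of those 100 transactions for testing, any accuracy you measure comes with a very wide margin of error. The sketch below computes an approximate 95% Wilson score interval for a measured accuracy. The 80% accuracy figure and the test-set sizes are assumptions picked for illustration, not results from a real app.

```python
import math


def wilson_interval(accuracy, n, z=1.96):
    """Approximate 95% Wilson score interval for an accuracy measured on n samples."""
    denom = 1 + z**2 / n
    center = (accuracy + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical: the model scores 80% on test sets of different sizes
    for n in (20, 200, 2000):
        low, high = wilson_interval(0.80, n)
        print(f"n={n:>4}: measured 0.80, plausible range {low:.2f} to {high:.2f}")
```

With 20 test samples, a measured 80% is consistent with a true accuracy anywhere from roughly 58% to 92%, so you cannot tell a useful model from a mediocre one. With 2000 samples the range shrinks to a couple of percentage points either side.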
3. When Class Distributions Are Identical and Indistinguishable
Machine learning thrives on differences: patterns, trends, or signals that distinguish one class or outcome from another. If the feature distributions of all classes are identical and even humans can’t tell them apart, ML is unlikely to provide meaningful results. With no distinguishing features, a model cannot separate the classes, no matter how sophisticated it is.
Simple trick: plot the distribution of each feature separately for each target class. If the distributions look the same across classes, both feature by feature and for pairs of features plotted jointly, sophisticated ML methods won’t help; you need more distinguishing features. For a numeric summary, the Mahalanobis distance between the class means (computed with a pooled covariance matrix) estimates how far apart the classes are. A value near zero means the class means are indistinguishable relative to the spread of the data.
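Here is one way the numeric check could look: a minimal sketch that computes the Mahalanobis distance between two class means using a pooled covariance matrix. It assumes exactly two numeric features so the 2x2 inverse can be written out by hand, and the data is synthetic (drawn from Gaussians I chose for illustration). Note that this only measures a difference in means; classes that differ in shape but not in mean would need a different check.

```python
import math
import random


def make_class(rng, n, mean_x, mean_y):
    """Draw n synthetic 2-feature samples (unit-variance Gaussians, illustrative only)."""
    return [(rng.gauss(mean_x, 1.0), rng.gauss(mean_y, 1.0)) for _ in range(n)]


def class_mean(rows):
    n = len(rows)
    return sum(x for x, _ in rows) / n, sum(y for _, y in rows) / n


def mahalanobis_between_classes(a, b):
    """Mahalanobis distance between the means of classes a and b (pooled covariance)."""
    sxx = sxy = syy = 0.0
    for rows in (a, b):
        mx, my = class_mean(rows)
        for x, y in rows:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(a) + len(b) - 2
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy**2
    (ax, ay), (bx, by) = class_mean(a), class_mean(b)
    dx, dy = ax - bx, ay - by
    # d^2 = delta^T S^-1 delta, with the 2x2 inverse written out by hand
    d_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d_squared)


if __name__ == "__main__":
    rng = random.Random(0)
    class_a = make_class(rng, 500, 0.0, 0.0)
    same_as_a = make_class(rng, 500, 0.0, 0.0)
    shifted = make_class(rng, 500, 3.0, 3.0)
    print("identical distributions:", round(mahalanobis_between_classes(class_a, same_as_a), 2))
    print("shifted distribution:   ", round(mahalanobis_between_classes(class_a, shifted), 2))
```

When both classes come from the same distribution the distance lands close to zero; when one class is shifted by three standard deviations in each feature it comes out around four. If your real classes look like the first case, adding model complexity will not help.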
Example Case: Fraud Detection with Uniform Patterns
Scenario:
A company wants to detect fraudulent transactions. However, the data for fraudulent and legitimate transactions is virtually identical—transaction amounts, locations, and times follow the same statistical patterns for both.
- Human Analysis: Even domain experts struggle to identify fraud by looking at transaction details.
- ML Outcome: A model trained on this data would fail to differentiate fraud from legitimate transactions because no meaningful signal exists.
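The scenario above can be simulated in a few lines. This is a sketch on synthetic data, not a real fraud dataset: both classes have two made-up numeric features, the classes are balanced (real fraud data rarely is), and the classifier is a deliberately simple nearest-centroid rule that I picked for brevity.

```python
import random


def make_points(rng, n, mean):
    """Synthetic 2-feature transactions: unit-variance Gaussians around `mean`."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]


def centroid(rows):
    n = len(rows)
    return sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n


def nearest_centroid_accuracy(legit, fraud, train_fraction=0.5):
    """Fit class centroids on the first part of each class, score on the rest."""
    cut_l = int(len(legit) * train_fraction)
    cut_f = int(len(fraud) * train_fraction)
    c_legit, c_fraud = centroid(legit[:cut_l]), centroid(fraud[:cut_f])

    def predict_is_fraud(row):
        d_legit = (row[0] - c_legit[0]) ** 2 + (row[1] - c_legit[1]) ** 2
        d_fraud = (row[0] - c_fraud[0]) ** 2 + (row[1] - c_fraud[1]) ** 2
        return d_fraud < d_legit

    correct = sum(not predict_is_fraud(r) for r in legit[cut_l:])
    correct += sum(predict_is_fraud(r) for r in fraud[cut_f:])
    return correct / ((len(legit) - cut_l) + (len(fraud) - cut_f))


if __name__ == "__main__":
    rng = random.Random(1)
    legit = make_points(rng, 2000, 0.0)
    fraud_identical = make_points(rng, 2000, 0.0)  # same distribution as legit
    fraud_shifted = make_points(rng, 2000, 3.0)    # clearly different distribution
    print("identical:", nearest_centroid_accuracy(legit, fraud_identical))
    print("shifted:  ", nearest_centroid_accuracy(legit, fraud_shifted))
```

On the identical distributions the held-out accuracy hovers around 50%, which is a coin flip with balanced classes. On the shifted data the same trivial classifier gets well above 90%. A fancier model would not change the first number, because there is nothing in the features to learn from.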
4. When It’s Too Expensive
AI can be costly—both in terms of development and ongoing maintenance. If the potential return on investment (ROI) doesn’t justify these costs, it’s better to look for simpler solutions.
For instance:
- Predicting customer behavior for a small store is likely overkill if it serves fewer than 100 customers a month. Personal relationships and simple tools can suffice.
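A back-of-the-envelope calculation is usually enough to settle this. The sketch below is plain arithmetic with hypothetical inputs: the benefit, build cost, and running cost are numbers I made up for a small store, so plug in your own estimates.

```python
def first_year_roi(annual_benefit, build_cost, annual_running_cost):
    """First-year ROI as a fraction: (benefit - total cost) / total cost."""
    total_cost = build_cost + annual_running_cost
    return (annual_benefit - total_cost) / total_cost


def payback_years(annual_benefit, build_cost, annual_running_cost):
    """Years until net yearly benefit covers the build cost, or None if it never does."""
    net_per_year = annual_benefit - annual_running_cost
    if net_per_year <= 0:
        return None
    return build_cost / net_per_year


if __name__ == "__main__":
    # Hypothetical small store: the model adds 2,000 a year in revenue,
    # costs 30,000 to build and 5,000 a year to run and maintain.
    print("first-year ROI:", round(first_year_roi(2_000, 30_000, 5_000), 2))
    print("payback (years):", payback_years(2_000, 30_000, 5_000))
```

With these assumed numbers the running cost alone exceeds the benefit, so the project never pays back. If the estimate looks like this, the simpler solution wins.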
5. When There’s No Clear Business Problem
Adopting AI because “everyone else is doing it” is a common pitfall. Without a specific, well-defined business problem, AI efforts are likely to flounder.
Ask yourself:
- What’s the exact problem AI is solving?
- How will success be measured?
If you can’t answer these questions, it’s not the right time for AI.
TL;DR
Machine learning is an incredible tool, but it’s not the answer to every problem. Before jumping into AI, evaluate whether it’s truly the right fit for your situation. Ask yourself these questions:
- Is there a simpler solution?
- Is it really an AI problem, or a plain optimization problem?
- Does the data contain a signal that separates the outcomes?
- Will it eventually lead to a return on investment?
Sometimes, the best decision a business leader can make is not using AI at all. By applying AI thoughtfully, you’ll ensure it delivers value instead of becoming an expensive experiment.
What do you think? Have you seen examples where AI was used unnecessarily or overcomplicated the solution? Share your thoughts in the comments!