Demystifying Algorithms: How Machines Learn From Data


After understanding supervised and unsupervised learning, I wanted to dig into the engine that drives machine learning: the algorithms themselves. Algorithms are the step-by-step instructions that help machines learn from data — and while they may sound intimidating at first, many are surprisingly intuitive once you understand the basic idea behind them.

One of the first algorithms I learned about was Linear Regression. It fits a straight line through the data points and uses that line to predict outcomes. The “best” line is usually the one that minimizes the sum of squared differences between the predicted and actual values. For example, if you’re trying to predict someone’s salary based on their years of experience, linear regression finds the slope and intercept of the line that best fits that relationship.
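To make the salary example concrete, here is a minimal sketch of least-squares fitting for a single feature in plain Python. The function name `fit_line` and the numbers are made up for illustration, and real data would not fall on a perfect line like this.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of experience -> salary (in thousands)
years = [1, 2, 3, 4, 5]
salary = [30, 35, 40, 45, 50]

slope, intercept = fit_line(years, salary)
predicted = slope * 6 + intercept  # predicted salary at 6 years of experience
```

With this toy data the salary rises by exactly 5 for every extra year, so the fitted line is `salary = 5 * years + 25`.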

Then there’s Logistic Regression. Despite the name, it’s used for classification problems, not regression: it outputs a probability between 0 and 1, which is then turned into a yes/no answer. It’s helpful when you’re trying to answer questions like “Is this email spam?” or “Will this customer churn?”
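Here is a rough sketch of how the spam question could be modeled with one feature, trained with plain gradient descent. Everything in it is an assumption for illustration: the feature (a count of suspicious words), the toy labels, the learning rate, and the helper names `train_logistic` and `predict_spam`. I center the feature around its mean so that simple gradient descent behaves well.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=1000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical data: suspicious-word count per email, 1 = spam
suspicious_words = [0, 1, 2, 3, 4, 5, 6, 7]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

# Center the feature (an illustrative preprocessing choice)
mean_x = sum(suspicious_words) / len(suspicious_words)
centered = [x - mean_x for x in suspicious_words]
w, b = train_logistic(centered, is_spam)


def predict_spam(count):
    """Return True when the estimated spam probability is at least 0.5."""
    return sigmoid(w * (count - mean_x) + b) >= 0.5
```

The model itself only produces a probability. The 0.5 cutoff in `predict_spam` is what turns it into a yes/no answer, and that cutoff is a choice you can move.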

I also explored Decision Trees, which I really liked because of their intuitive nature. A decision tree works like a flowchart: the algorithm keeps asking questions and splitting the data until it reaches a decision. For example, if you’re building a model to decide whether someone should get a loan, a decision tree might ask: “Is the income > ₹50,000?”, “Is the credit score good?”, and so on.
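A tree has to pick its questions somehow. One common way is to try candidate thresholds and keep the one that leaves the purest groups. The sketch below does this for a single question using Gini impurity, which is my assumption for the splitting criterion. The incomes, labels, and function names are hypothetical, and a full tree would repeat this step on each resulting group.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of positive labels
    return 2 * p * (1 - p)


def best_threshold(values, labels):
    """Try midpoints between sorted values; return (threshold, weighted impurity)."""
    best = (None, float("inf"))
    sorted_vals = sorted(set(values))
    for lo, hi in zip(sorted_vals, sorted_vals[1:]):
        threshold = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        # Weight each side's impurity by how many samples landed there
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: monthly income in rupees, 1 = loan approved
incomes = [20000, 30000, 40000, 60000, 70000, 80000]
approved = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(incomes, approved)
```

On this toy data the chosen question is “Is the income > 50,000?”, because that split separates the approved and rejected applicants completely.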

One of the coolest ones I came across was K-Nearest Neighbors (KNN). The idea is simple: to classify a data point, just look at the ‘k’ closest known data points around it and go with the majority. It reminded me of peer pressure — if your neighbors are mostly cats, you’re probably a cat too.
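The majority vote is short enough to sketch directly. This version assumes Euclidean distance and two made-up numeric features per animal. The name `knn_predict` and the data are purely illustrative.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Pair every training point's distance to the query with its label
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # most_common(1) returns a list holding the top (label, count) pair
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical data: two numeric features per animal
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (7.0, 6.5), (6.5, 7.0)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

near_cats = knn_predict(points, labels, (2.0, 2.0))
near_dogs = knn_predict(points, labels, (6.0, 7.0))
```

There is no training step here: the “model” is just the stored data, and all the work happens at prediction time. Choosing an odd `k` avoids tied votes when there are two classes.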

There are plenty of other algorithms, like Support Vector Machines (SVM), Naive Bayes, and Random Forests, each with its own strengths depending on the type of data and problem.

What’s becoming clear to me is that there’s no one-size-fits-all algorithm. Choosing the right one depends on things like the size and nature of the dataset, whether it’s a classification or regression problem, and how much interpretability you want.

Where I go from here:

I’m planning to actually implement a few of these from scratch using Python — not just relying on libraries like scikit-learn. That way, I can get a deeper understanding of what’s happening behind the scenes. I think it’ll really cement the concepts and give me more confidence as I continue this journey.


Aviral's Tech Blog


Self-taught full-stack web developer by passion, a mechanical engineer by qualification and an avocational musician.