“AI Will Build What You Ask. It Won’t Tell You You’re Wrong.”
A case for mastering the fundamentals in an age when there’s an AI tool for everything you need
Whenever a new technology takes up more space in the ecosystem, the same question arises: if this technology can do X for me, do I really need to study and understand the underlying concepts? Isn’t that redundant, a waste of time? In the engineering community, we pride ourselves on building, deploying, and fixing. At a time when all three of those stages can be handled by an LLM, our logically trained brains tell us to remove the redundancy and just let the LLM do it. Hey, if I get paid to have an LLM do my work, I’d think that’s great too. Why not just master the tools that have flooded the market and make my life easier? Here is where it gets tricky, though, and why I keep asking everyone around me to learn the fundamentals. I’ll build my case through examples of what is at stake when we skip them.
An LLM can generate a CNN, a transformer, even a novel architecture. What it cannot do is tell you whether you should have built one in the first place, whether your evaluation is honest, why your model silently degraded in production last Tuesday, or whether the metric you’re optimizing actually corresponds to the business outcome you want. Every one of those questions requires the fundamentals.
What fundamentals actually buy you (and tools never will)
Problem framing. The hardest decision in any ML project isn’t the architecture; it’s whether ML is the right tool at all, what the loss function should encode, what data to collect, what to optimize. People without fundamentals tend to overcomplicate things and make the wrong calls: they reach for ML when a heuristic (a simple rule-based system) would do, or for an LLM when classical ML would be 100x cheaper and more accurate. An LLM will cheerfully build whatever you ask for; it won’t tell you that you asked for the wrong thing. Ask it to build a candidate filtering system from the resumes you feed it, and it will build one without ever telling you that a simple rule-based check on your key filters like “work experience”, “education”, and “core skills” would work just as well and cost a lot less.
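To make that concrete, here is a minimal sketch of what such a rule-based filter could look like. Everything in it, the field names (`years_experience`, `education`, `skills`), the thresholds, and the skill list, is a made-up illustration, not a real system:

```python
# A hypothetical rule-based candidate filter: three explicit checks,
# no model, no training data, no inference cost.

REQUIRED_SKILLS = {"python", "sql"}
MIN_YEARS_EXPERIENCE = 3
ACCEPTED_DEGREES = {"bachelors", "masters", "phd"}

def passes_filter(candidate: dict) -> bool:
    """Return True only if the candidate meets every hard requirement."""
    has_experience = candidate.get("years_experience", 0) >= MIN_YEARS_EXPERIENCE
    has_education = candidate.get("education", "").lower() in ACCEPTED_DEGREES
    has_skills = REQUIRED_SKILLS.issubset(
        {skill.lower() for skill in candidate.get("skills", [])}
    )
    return has_experience and has_education and has_skills

candidates = [
    {"years_experience": 5, "education": "Masters", "skills": ["Python", "SQL", "Spark"]},
    {"years_experience": 1, "education": "Bachelors", "skills": ["Python"]},
]
shortlist = [c for c in candidates if passes_filter(c)]
print(len(shortlist))  # 1 -- transparent, auditable, essentially free to run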
Evaluation literacy. This is the single biggest separator of good ML practitioners: knowing why accuracy lies on imbalanced data, when offline metrics diverge from production behavior, how to construct a holdout that isn’t contaminated, what AUC actually measures, and when a benchmark is gameable (meaning on paper you get the job done, but you haven’t actually solved the real problem). Without this, you ship models that “test well”, of course, but fail silently in production.
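Here is a toy illustration of the first of those points, why accuracy lies on imbalanced data. The 1% positive rate and the constant-prediction “model” are assumptions chosen purely for the demo:

```python
# With a 99:1 class imbalance, a "model" that predicts the majority class
# for every input scores 99% accuracy while catching zero positives.

import numpy as np
from sklearn.metrics import accuracy_score, recall_score

y_true = np.zeros(10_000, dtype=int)
y_true[:100] = 1                  # 1% positive class, e.g. fraud cases

y_pred = np.zeros_like(y_true)    # always predict "not fraud"

print(accuracy_score(y_true, y_pred))  # 0.99 -- looks excellent on paper
print(recall_score(y_true, y_pred))    # 0.0  -- catches none of the fraud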
Failure mode reasoning. ML systems fail in subtle, non-obvious ways: label leakage, distribution shift, drift, calibration breakdown, prompt injection, retrieval poisoning. If you take away one line from this whole write-up on fundamentals, let it be this: you cannot debug what you don’t understand. The LLM will suggest five fixes; only fundamentals tell you which one applies to which scenario.
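And here is what the first of those failure modes, label leakage, can look like in code. The data is synthetic, and the leaked column is simulated as a noisy copy of the label, standing in for any feature that is only knowable after the outcome has happened (a refund timestamp when predicting refunds, say):

```python
# Label leakage in miniature: a feature that is effectively a copy of the
# label makes offline evaluation look spectacular, but that column won't
# exist at prediction time in production.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2_000
X = rng.normal(size=(n, 5))
y = (X[:, 0] + rng.normal(scale=2.0, size=n) > 0).astype(int)

leak = y + rng.normal(scale=0.1, size=n)   # "feature" known only post-outcome
X_leaky = np.column_stack([X, leak])

for name, features in [("honest", X), ("leaky", X_leaky)]:
    X_tr, X_te, y_tr, y_te = train_test_split(features, y, random_state=0)
    score = LogisticRegression(max_iter=1_000).fit(X_tr, y_tr).score(X_te, y_te)
    print(name, round(score, 3))
# The leaky model scores near-perfectly offline, then fails silently in
# production, where the leaked column is unavailable.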
Cost and architecture trade-offs. Knowing when a 50ms logistic regression on a CPU beats a billion-parameter LLM. When fine-tuning beats prompting. When RAG beats fine-tuning. When a distilled 7B model is the right answer over Claude or GPT. This is the difference between burning cash on the wrong approach and shipping a sustainable product.
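A rough back-of-envelope shows how wide that gap can be. Every number here is an assumption for illustration; the per-token price, tokens per request, and server cost are not quotes for any real provider:

```python
# Comparing per-day serving cost for the same classification workload:
# a logistic regression on one CPU box vs a hosted LLM, under assumed prices.

REQUESTS_PER_DAY = 1_000_000

# Logistic regression: amortized cost of one modest VM (assumed $10/day).
lr_cost_per_day = 10.0

# Hosted LLM: assumed $5 per million input tokens, ~500 tokens per request.
llm_price_per_token = 5.0 / 1_000_000
tokens_per_request = 500
llm_cost_per_day = llm_price_per_token * tokens_per_request * REQUESTS_PER_DAY

print(f"logistic regression: ${lr_cost_per_day:,.0f}/day")   # $10/day
print(f"hosted LLM:          ${llm_cost_per_day:,.0f}/day")  # $2,500/day
# Roughly 250x under these assumptions, before you even count the latency
# gap (tens of milliseconds vs seconds per request).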
And if I can say it louder for everyone at the back: LLMs ARE ML. Tokenization, embeddings, attention, context windows, sampling temperature, RAG, fine-tuning, evaluation harnesses, and I could go on; these are all deep learning fundamentals. The people getting the most out of LLMs and building real technical products are the ones who understand them to the core. Everyone else hits the ceiling fast.
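To pick just one item from that list: sampling temperature is nothing more than rescaling logits before the softmax, a plain deep learning fundamental. A minimal sketch, with made-up logit values:

```python
# Temperature rescales logits before softmax: below 1.0 the distribution
# sharpens toward the top token, above 1.0 it flattens toward uniform.

import numpy as np

def softmax_with_temperature(logits: np.ndarray, temperature: float) -> np.ndarray:
    scaled = logits / temperature
    exp = np.exp(scaled - scaled.max())   # subtract max for numerical stability
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.5, 0.1])  # raw scores over four candidate tokens

print(softmax_with_temperature(logits, 0.5))  # sharp: top token dominates
print(softmax_with_temperature(logits, 1.0))  # the plain softmax
print(softmax_with_temperature(logits, 2.0))  # flat: sampling gets more random
```

Once you see temperature as a one-line transform on a probability distribution, the knob stops being magic.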
Tools have a half-life of 18 months. Fundamentals have a half-life of decades. Linear algebra, probability, optimization, generalization, evaluation: these will be just as relevant in 2040 as they are today. LangChain probably won’t be relevant in 2027. Just as we all wanted to learn Hadoop 10 years ago until better things like Databricks came along, tools lose their value as a skill sooner rather than later. What remains are the fundamentals of distributed computing, data partitioning, and fault tolerance. You can move on to the “better” tools built on the same architecture because you know how the bricks hold the structure together.
As more systems become data-driven and probabilistic, the core engineering skill of the next 20 years is judgment under uncertainty about systems that fail silently. That skill is ML/DL fundamentals. It’s not optional, as some would like to think; in fact, it’s becoming the baseline for what it means to be a technical engineer at all.
