<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Puru's Substack]]></title><description><![CDATA[Welcome to purukathuria.com]]></description><link>https://www.purukathuria.com</link><image><url>https://substackcdn.com/image/fetch/$s_!7h7X!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c7d405a-21a6-4f48-a4fb-bdc8166d2096_144x144.png</url><title>Puru&apos;s Substack</title><link>https://www.purukathuria.com</link></image><generator>Substack</generator><lastBuildDate>Sun, 24 May 2026 19:47:47 GMT</lastBuildDate><atom:link href="https://www.purukathuria.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Puru Kathuria]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[puru@lexailabs.com]]></webMaster><itunes:owner><itunes:email><![CDATA[puru@lexailabs.com]]></itunes:email><itunes:name><![CDATA[Puru Kathuria]]></itunes:name></itunes:owner><itunes:author><![CDATA[Puru Kathuria]]></itunes:author><googleplay:owner><![CDATA[puru@lexailabs.com]]></googleplay:owner><googleplay:email><![CDATA[puru@lexailabs.com]]></googleplay:email><googleplay:author><![CDATA[Puru Kathuria]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[“AI Will Build What You Ask. 
It Won’t Tell You You’re Wrong.” ]]></title><description><![CDATA[A case for mastering the fundamentals in the age of AI Tools for everything you need]]></description><link>https://www.purukathuria.com/p/ai-will-build-what-you-ask-it-wont</link><guid isPermaLink="false">https://www.purukathuria.com/p/ai-will-build-what-you-ask-it-wont</guid><dc:creator><![CDATA[Puru Kathuria]]></dc:creator><pubDate>Tue, 28 Apr 2026 15:59:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7h7X!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c7d405a-21a6-4f48-a4fb-bdc8166d2096_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Whenever a new technology takes up more space in the ecosystem, the same question arises: if this technology can do X for me, why should I study and understand the underlying concept? It seems redundant and a waste of time. In the engineering community, we pride ourselves on building, deploying and fixing. At a time when all three of these stages can be handled by an LLM, our logically trained brains tell us to remove the redundancy and just let the LLM do it. Hey, if I got paid to have an LLM do my work, I&#8217;d think that was great too. Why not just master the tools that have flooded the market and make my life easier? Here is where it gets tricky though, and why I keep asking everyone around me to get into the fundamentals and learn them. I&#8217;ll build my case through examples of what is at stake when we think of skipping the fundamentals.</p><p>An LLM can generate a CNN, a transformer, even a novel architecture. What it cannot do is tell you whether you should have built one in the first place, whether your evaluation is honest, why your model silently degraded in production last Tuesday, or whether the metric you&#8217;re optimizing actually corresponds to the business outcome you want. 
Every one of those questions requires the fundamentals.</p><p><strong>What fundamentals actually buy you (and tools never will)</strong></p><p><em><strong>Problem framing.</strong></em> The hardest decision in any ML project isn&#8217;t the architecture; it is whether ML is the right tool, what the loss function should encode, what data to collect, and what to optimize. People without fundamentals often overcomplicate things and make the wrong decisions. For example, they reach for ML when a heuristic (a simple rule-based system) would do, or for an LLM when classical ML could be 100x cheaper and more accurate. An LLM will cheerfully build whatever you ask for; it won&#8217;t tell you that you asked for the wrong thing. For instance, it won&#8217;t push back if you ask it to build a candidate-filtering system based on the resumes you feed it. 
It will build it for you without telling you that a simple rule-based algorithm over your key filters, like &#8220;work experience&#8221;, &#8220;education&#8221; and &#8220;core skills&#8221;, would work just as well and be a lot cheaper for you.</p><p><em><strong>Evaluation literacy.</strong></em> This is the single biggest differentiator among ML practitioners. It means knowing why accuracy lies on imbalanced data, when offline metrics diverge from production behavior, how to construct a holdout that isn&#8217;t contaminated, what AUC actually measures, and when a benchmark is gameable (meaning that on paper you get the job done, but you haven&#8217;t solved the real problem). Without this, you ship models that &#8220;test well&#8221; but fail silently in production.</p><p><em><strong>Failure mode reasoning.</strong></em> ML systems fail in subtle, non-obvious ways: label leakage, distribution shift, drift, calibration breakdown, prompt injection, retrieval poisoning. If you take away one line from this whole write-up on fundamentals, let it be this: <em><strong>you cannot debug what you don&#8217;t understand.</strong></em> The LLM will suggest five fixes; only fundamentals tell you which one applies to which scenario.</p><p><em><strong>Cost and architecture trade-offs.</strong></em> Knowing when a 50ms logistic regression on a CPU beats a billion-parameter LLM. When fine-tuning beats prompting. When RAG beats fine-tuning. When a distilled 7B model is the right answer over Claude or GPT. This is the difference between burning cash on the wrong approach and shipping a sustainable product.</p><p><em><strong>And if I can say it louder for everyone at the back: LLMs ARE ML.</strong></em> Tokenization, embeddings, attention, context windows, sampling temperature, RAG, fine-tuning, evaluation harnesses, and I could go on: these are all deep learning fundamentals. 
The people getting the most out of LLMs and building real technical products are the ones who understand them to the very core. Everyone else hits a ceiling quite fast.</p><p>Tools have a half-life of roughly 18 months. Fundamentals have a half-life of decades. Linear algebra, probability, optimization, generalization, evaluation: these will be just as relevant in 2040 as they are today. LangChain probably won&#8217;t be relevant in 2027. Just as we all wanted to learn Hadoop 10 years ago until better things like Databricks came along, tools lose their value as a skill sooner rather than later. What remains are the fundamentals of distributed computing, data partitioning and fault tolerance. You can move on to the &#8220;better&#8221; tools built on the same architecture because you know how those bricks hold the structure together.</p><p>As more systems become data-driven and probabilistic, the core engineering skill of the next 20 years is <strong>judgment under uncertainty about systems that fail silently.</strong> That skill <em>is</em> ML/DL fundamentals. It is no longer optional, as one might think; in fact, it is becoming the baseline for what it means to be an engineer at all.</p>]]></content:encoded></item><item><title><![CDATA[What should we learn as an Engineer in the modern AI world ?]]></title><description><![CDATA[Mental Models & Philosophy to still be an elite engineer in this modern AI world.]]></description><link>https://www.purukathuria.com/p/what-should-we-learn-as-an-engineer</link><guid isPermaLink="false">https://www.purukathuria.com/p/what-should-we-learn-as-an-engineer</guid><dc:creator><![CDATA[Puru Kathuria]]></dc:creator><pubDate>Fri, 17 Apr 2026 07:40:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7h7X!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c7d405a-21a6-4f48-a4fb-bdc8166d2096_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Most software engineers are likely navigating AI wrong.</strong></p><p>The utility and spread of generative <em><strong>AI platforms &amp; tools</strong></em> are increasing day by day, and &#8220;<em>education on AI</em>&#8221; is increasingly being equated with knowledge of AI tools.</p><p>Can someone really be called an AI or ML engineer just because they know how to use AI tools?</p><p><strong>The tool-first narrative</strong> goes like this: since code generation has been democratized and absolutely anyone can now write code, turning it into a commodity, traditional computer science fundamentals, AI algorithms, and machine learning fundamentals no longer have as important a role to play.</p><p>However, the philosophies held by leaders in the field such as Demis Hassabis, Andrej Karpathy, Ilya Sutskever, Yann LeCun and Jeff Dean suggest that the opposite is true.</p><p>In this era of high-velocity, tool-flooded ecosystems and AI-assisted development, <em><strong>deep foundational knowledge is not a legacy requirement but a primary differentiator and a competitive advantage for an elite engineer.</strong></em></p><p><strong>History</strong></p><p>History has always been our guide, especially when we want to form an opinion on the shifts we have seen in engineering and technology. To understand the current rise of AI as a technology, one must place it in the context of the broader history of engineering, technology, and computer science. The progression of programming and programming tools has always been toward higher layers of abstraction, meaning hiding the &#8220;hows and whys&#8221; of the layers beneath so that you can focus on the outcome they produce. Every few years, we move up one more layer of abstraction in how we program. 
This has led to massive development in the technology space, but I am here to tell you not to discount the &#8220;hows and whys&#8221; so soon. Let&#8217;s look at some pivotal moments in computing and engineering history:</p><p><strong>1950s</strong></p><p>In the mid-20th century, programming was an exercise in &#8220;hand-to-hand combat&#8221; with hardware. The first generation of software development relied on machine code: direct manipulation of binary (1s and 0s) or hexadecimal representations of instructions and data. At this level, there was no distinction between the &#8220;tool&#8221; and the &#8220;fundamental.&#8221; Early programmers worked directly with machine code, manually specifying opcodes and memory addresses, and even computing jump targets by hand.</p><p>Because this process was highly complex and understandable only by experts, programming was often described as a kind of &#8220;priesthood&#8221;. Only a small cohort of programmers had both access to this kind of technology and the intellect to control it.</p><p>The second generation introduced assembly language, which used symbolic mnemonics (opcodes like MOV, ADD, XCHG) to represent machine instructions. While this made code more readable, it remained &#8220;machine-dependent&#8221; and required a deep understanding of processor architecture, registers, and memory alignment.</p><p><strong>Compilers</strong></p><p>The 1950s saw the birth of the first widely used high-level language, Fortran <em>(Formula Translation)</em>, which allowed engineers to write logic in &#8220;natural mathematical notation&#8221;. Engineers could write expressions like x = a + b * c, and the system would execute them. 
Thus a higher layer of abstraction was created.</p><p>Naturally, as human behaviour goes, the introduction of &#8220;<strong>compilers</strong>&#8221;, tools that automatically translated high-level logic into machine code, was met with skepticism from the established &#8220;priesthood of programmers.&#8221; The arguments used in 1958 &amp; 1959 are strikingly similar to those used by modern critics of AI-assisted coding:</p><ol><li><p><strong>Efficiency</strong>: Programmers argued that compiled code could never be as compact or as efficient as handwritten assembly. They also believed that a human brain would always beat the compiler&#8217;s optimization.</p></li><li><p><strong>Control</strong>: They feared that surrendering control to an automated translator would lead to bugs they couldn&#8217;t debug and systems they didn&#8217;t fully understand. (Sounds a bit familiar?)</p></li><li><p><strong>Trust</strong>: If the engineer did not &#8220;code to the metal,&#8221; how could they be sure the machine was executing the intent correctly?</p></li></ol><p>Despite this resistance, by 1958, more than half of the code running on IBM computers was reportedly generated by the Fortran compiler. Crucially, it did not succeed just because the tool existed; it worked because engineers came to trust it over time while keeping a <em><strong>solid understanding of the layer beneath it</strong></em>.</p><p><strong>What do we really learn from going through all of this history?</strong></p><p>We can conclude that the most successful engineers were not those who refused the &#8220;tool&#8221; (the compiler) but those who mastered the high-level logic while maintaining enough foundational knowledge to understand <em>how</em> the compiler optimized their intent. 
This transition did not destroy the need for fundamentals; it shifted the fundamental focus from low level concerns like &#8220;memory address management&#8221; to higher level thinking like &#8220;algorithmic logic and problem decomposition&#8221;.</p><h3><strong>Progression of Abstractions </strong></h3><p><strong>1940s&#8211;50s: Machine Code</strong></p><ol><li><p>Abstraction Level: Machine Code</p></li><li><p>Primary Interaction: Binary / Hex Strings</p></li><li><p>Fundamental Requirement: Hardware registers, memory addresses</p></li><li><p>Tooling: Hand-wiring and punch cards</p></li></ol><p><strong>1950s&#8211;60s: Assembly</strong></p><ol><li><p>Abstraction Level: Assembly</p></li><li><p>Primary Interaction: Opcode mnemonics</p></li><li><p>Fundamental Requirement: Processor architecture, instruction sets</p></li><li><p>Tooling: Symbolic assemblers</p></li></ol><p><strong>1960s&#8211;80s: High-Level Languages (C, Pascal)</strong></p><ol><li><p>Abstraction Level: High-Level Programming</p></li><li><p>Primary Interaction: Structured logic</p></li><li><p>Fundamental Requirement: Data structures, pointers, memory management</p></li><li><p>Tooling: Optimizing compilers</p></li></ol><p><strong>1990s&#8211;2010s: 4GL / Scripting (Python)</strong></p><ol><li><p>Abstraction Level: Higher-level abstractions</p></li><li><p>Primary Interaction: Object-oriented / functional paradigms</p></li><li><p>Fundamental Requirement: Abstraction patterns, libraries, APIs</p></li><li><p>Tooling: IDEs, debuggers, package managers</p></li></ol><p><strong>2020s+: AI-Assisted Programming</strong></p><ol><li><p>Abstraction Level: Intent-driven</p></li><li><p>Primary Interaction: Prompt-based</p></li><li><p>Fundamental Requirement: System design, ML, DL</p></li><li><p>Tooling: LLMs, copilots</p></li></ol><p>The fourth and fifth generations have brought us to a point where &#8220;coding&#8221; is increasingly a task of specifying intent. 
Engineers now describe the desired outcome, constraints, and context, leaving it to compilers, frameworks, and modern AI systems to plan the exact implementation. Coding is no longer about micromanaging instructions; it is about clearly expressing <em>what should happen</em> and <em>under what conditions</em>.</p><p>However, this shift does not eliminate the need for understanding; it raises the bar for it. Just as a Fortran programmer needed to understand the <em><strong>why</strong></em> behind their algorithms to write correct and efficient programs, the modern AI engineer must understand the <em><strong>why</strong></em> behind machine learning systems. Without that grounding, the outputs, whether code or logic, can become brittle, unreliable, or subtly incorrect.</p><p>AI systems can generate plausible solutions, but they do not inherently understand correctness or intent. As a result, engineers must be able to reason about failure modes, validate outputs, and recognize when the generated logic is flawed or hallucinated. In this world, specifying intent is powerful, but only when paired with the ability to critically evaluate what is produced.</p><h3>Final Pillars as you step into this new world of AI with &#8220;<em>AI Tools</em>&#8221; &amp; &#8220;<em>Fundamentals</em>&#8221;</h3><p><em><strong>Actionable Pillar 1: </strong></em></p><p>A few years from now, nobody will remember which model launched this week.</p><p>But they will remember the engineers who understood what was actually going on.</p><p>Early in your journey, you will feel an urge to keep up and to track every new release, every benchmark, every shiny model that claims to change everything. Resist that instinct. The interface will keep changing, but the underlying principles move slowly. 
The elite engineers who endure are the ones who understand those principles deeply: how tokens become meaning, how embeddings capture relationships, how optimization shapes intelligence, how architectures like transformers actually work, and how data and compute quietly dictate everything behind the scenes. Finally, remember that thinking, causal reasoning, world modelling and continual learning are still open problems that need breakthroughs.</p><p>Also, don&#8217;t make the mistake of thinking that theory alone will save you.</p><p>Modern AI is not a clean &amp; deterministic discipline. It is messy. It is empirical. It is built as much through experimentation as through understanding. You will often not know why one prompt works and another fails. You will build systems that behave correctly for reasons you cannot fully prove. And that&#8217;s okay, because real engineering lives at the intersection of theory and practice.</p><p>First principles will give you compression &amp; reliability in your decision making (a way to think clearly).</p><p>Experimentation (trial &amp; error) will give you truth (a way to know what actually works).</p><p>Over time, what you are really developing is <em><strong>opinionated taste</strong></em>.</p><p>And this becomes even more important as the paradigm shifts from traditional engineering to AI-tool-powered engineering.</p><p>We are no longer building with models as endpoints. We are building systems where models are just components. The real leverage now lies in how you stitch things together (retrieval systems, tools, memory, orchestration, evaluation loops). For traditional SWEs, software architecture is more important than ever.</p><p>Moore&#8217;s Law, the observation that the number of transistors on a chip doubles roughly every two years, has meant that compute keeps getting cheaper and more powerful. 
Hence, the pace at which new models and tools appear and chips get cheaper is likely to keep accelerating.</p><p>But the elite engineers who win are not the ones who wait for better tools. They &#8220;<em><strong>learn how to use the current models &amp; tools better than anyone else</strong></em>&#8221;.</p><p><em><strong>Actionable Pillar 2: </strong></em>Learn to build solid software on these abstractions.</p><p>You don&#8217;t always need to train models from scratch. You need to know when not to. You will call APIs, compose systems, and build products on top of layers that didn&#8217;t exist a few years ago. Just like software evolved from assembly to high-level languages to frameworks, AI is evolving from raw models to primitives to full systems.</p><p>Your job is not to go one layer lower or one layer higher blindly.</p><p>Your job is to choose the right level of abstraction.</p><p>And as you do this, expand your scope, because expanding it is now faster and easier than ever.</p><p>Do not stay confined to a narrow role: not as a frontend engineer, not as a backend engineer, not even as an &#8220;AI engineer&#8221; in the narrow sense. The highest-leverage engineers understand the full stack: user experience, backend systems, model behavior, and distributed architecture. They see the system, not just the component.</p><p>But there is one final discipline that will separate you from everyone else.</p><p><em><strong>Actionable Pillar 3: </strong></em>Evaluation</p><p>In this new world, building is easy. You can ship something in a day. But whether it actually works is the question to answer before pushing it to production.</p><p>Making it work reliably, consistently, and under real-world conditions is hard. This is where engineers will have more leverage than no-code vibe-gineers. 
</p><p>While studying agent reliability, we will also have to learn and advance the art of defining evaluation on multiple fronts (datasets, use cases, scale, functionality, load, and stress) and study failure modes.</p><blockquote><p>&#8220;I am a true artist if I am able to start from first principles, learn best known art, but reimplement it in a way that&#8217;s never been done before.&#8221;</p></blockquote>]]></content:encoded></item><item><title><![CDATA[What is the Purpose of your Job ? Why AI might not replace you. 
]]></title><description><![CDATA[History is our guide to introspect the idea: "Correlation (Rise of AI, loss of our job)"]]></description><link>https://www.purukathuria.com/p/what-is-the-purpose-of-your-job-why</link><guid isPermaLink="false">https://www.purukathuria.com/p/what-is-the-purpose-of-your-job-why</guid><dc:creator><![CDATA[Puru Kathuria]]></dc:creator><pubDate>Thu, 09 Apr 2026 11:42:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7h7X!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c7d405a-21a6-4f48-a4fb-bdc8166d2096_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I have always been curious about the rise of a new industry, artificial intelligence, and the fear of losing our jobs that comes with it. What is the correlation between them, and does this correlation also signal causation? I have heard <a href="https://en.wikipedia.org/wiki/Jensen_Huang">Jensen</a> talk about the idea, and it quietly resonates with my school of thought. </p><p>Let&#8217;s go back in time to one of the predictions from <a href="https://en.wikipedia.org/wiki/Geoffrey_Hinton">Geoffrey Hinton</a>, often called the Godfather of AI, whose work helped start the whole deep learning phenomenon. Hinton is an incredible researcher and a longtime professor at the University of Toronto, and he helped popularize the idea of backpropagation.<br>Backpropagation allows neural networks to learn. Hold on to that idea and let us circle back to how people used to write software. Historically, software was written by humans applying first-principles thinking. They described an algorithm, a step-by-step approach to implementing some functionality, and then they codified those steps or procedures. The resulting package of code was called software, much like a recipe book. Now, getting back to Geoff Hinton. 
If we think about AI, this invention of artificial intelligence and deep learning, we have to talk about neural networks. Neural networks have a lot of neurons, which are simply math units. You can think of them like a switchboard in which all of these math units are connected together. These math units take in some input, let&#8217;s say the image of a cat, and produce a guess at what the output might be. So these mathematical units that we described as neurons fire and wire together to guess the output: a cat. Suppose there are other signals too: one that guesses a dog, another that guesses an elephant, and maybe one for a tiger. All of these other signals are tuned down toward zero when the image shown is a cat. </p><p>Now, this switchboard of mathematical units is gigantic. The more information you give it, the bigger the switchboard has to be to process the information and guess the output accurately. <br><br>The first time this switchboard sees the image of a cat, it produces garbage output. 
You need to keep showing it a lot of examples of cats until it reaches a stage where it produces a meaningful output: a cat. What Geoff Hinton helped popularize is the idea of backpropagation. As you keep showing all of these examples of a cat, the output transitions from garbage to a good guess of a cat. Behind the scenes, these mathematical units are tuning their signals up and down in order to guess the cat correctly, and they are turning down the signals for dog, for elephant, for tiger. </p><p>That is the foundation of artificial intelligence: a piece of software that can learn from given examples. That is basically what we call machine learning, a machine that has the capability to learn. The process of learning that Geoff Hinton helped popularize is backpropagation. Deep learning is what we call it when we move on to large, complex switchboards, that is, neural networks with many layers. <br>One of the first big commercialized applications was image recognition, and one of the most important image recognition applications is radiology. Around 2016, Hinton predicted that within about five years deep learning would do better than radiologists, and that the world would no longer need them. <br>His reasoning was that AI would sweep the whole field. It turns out that AI has indeed swept the field, and many radiologists now use AI in some way. What is ironic, though, is that the number of radiologists has actually grown. <br><br>The prediction was that the entire profession would be wiped out, so why did we end up needing more radiologists? <br><br>The reason, if you think deeply, is that the purpose of a radiologist is to diagnose disease, not only to study images as a task. Studying the image is simply one task in service of diagnosing the disease. 
Now that we can study more images quickly and more precisely without ever making a mistake, you don&#8217;t get tired easily and you can study more images. You can study it in its 3D form instead of 2D, because the AI doesn&#8217;t care whether you have given it 3D pixels or 2D pixels. You could also study it in 4D so you can study images better, which a single radiologist could not easily do. <br><br>And so the number of tests that people are able to do has increased because they are able to serve more patients. The hospital has just gotten better with more clients, with better radiologists, with more patients. As a result, the overall productivity of the system has increased and produced better economics. When the hospital has better economics, they also tend to hire more radiologists. Because the sole purpose of the job of a radiologist is not only to study images, their purpose is to diagnose a disease. </p><p>And so the overarching question that we are leading up to is: what is the purpose of a job? What is the purpose of a doctor? What is the purpose of a lawyer? What is the purpose of a radiologist? Has the purpose really changed? <br><br>There is one more question that I would like to introspect. What if my car becomes self-driving? Will all chauffeurs be out of jobs? The answer probably is no, because for some people chauffeurs are a service. For some people, chauffeurs are a part of the hospitality experience. For some people, chauffeurs are there to protect the car owner. <br>So the Chauffeur that would lose their jobs would be a few, and many Chauffeur would also change their jobs. Now, the ones who are losing it, their sole purpose as a chauffeur was to drive the car for the car owner. </p><p>If our job is plain, simple automation, if our job is the task, if our job is only a precise, specific task, then the possibility of losing the job is very high. But if our job is more than a task, just like the shuffle, then we might not get replaced by AI. 
 </p><p>We are also seeing what Elon Musk is working on: machines that make machines, and robots. When that happens and everything goes into production, a whole new industry of technicians and people who manufacture these robots will rise. The productivity of the system will increase, and jobs that never existed before will appear. We are going to have a whole new industry of people taking care of these robots, both in manufacturing and in maintenance.</p><p>There would also be people who start companies that manufacture apparel for robots, because people would want their robot to look different from other people&#8217;s robots. This will give rise to yet another industry that never existed. </p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.purukathuria.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Puru's Substack! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Life as a Language Model]]></title><description><![CDATA[Generative Future Simulation]]></description><link>https://www.purukathuria.com/p/life-as-a-language-model</link><guid isPermaLink="false">https://www.purukathuria.com/p/life-as-a-language-model</guid><dc:creator><![CDATA[Puru Kathuria]]></dc:creator><pubDate>Wed, 10 Dec 2025 08:49:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7h7X!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c7d405a-21a6-4f48-a4fb-bdc8166d2096_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1><strong>Predicting Human Decisions Like Predicting the Next Token</strong></h1><h3><strong>A mental model for modeling human life using Transformers</strong></h3><p>One of the most powerful realizations in the last decade of AI is the idea that almost anything can be represented as a sequence of tokens &#8212; language, images, code, music, and even protein structures. Once you tokenize something into a sequence, you can train a Transformer to predict the next element of that sequence. 
That is the foundation of models like GPT.</p><p>Today, GPT predicts the next token in a sentence.</p><p>Tomorrow, could a model predict the next decision in a human life?</p><h2><strong>&#127757; All of human knowledge is text</strong></h2><p>The internet holds the collective intelligence of our species &#8212; papers, books, conversations, code, opinions, history, news, even emotional expression. We tokenize this massive corpus and train models to learn patterns of how thoughts unfold.</p><p>A Transformer receives tokens &#8594; embeds them into vectors &#8594; and through self-attention learns structure, causality, dependency, and intent. 
When it predicts the next token, it is essentially predicting how a human thought continues.</p><p>So here&#8217;s the jump:</p><blockquote><p>If text is predictable, and text captures human decisions, then decisions should also be predictable.</p></blockquote><div><hr></div><h2><strong>&#129504; The core idea</strong></h2><p>What if you take all the decisions a single human has ever made:</p><ul><li><p>Where did I study?</p></li><li><p>Which job did I choose?</p></li><li><p>Who did I meet?</p></li><li><p>What did I buy?</p></li><li><p>When did I change habits?</p></li><li><p>Which books did I read?</p></li><li><p>What content did I consume?</p></li><li><p>What goals did I set?</p></li></ul><p>and represent them as ordered tokens on the timeline of that life? Now treat this timeline exactly like a sequence of text tokens. </p><p>Every decision = a token</p><p>Every token has an embedding representing:</p><ul><li><p>context (why/when)</p></li><li><p>emotions</p></li><li><p>constraints</p></li><li><p>environment</p></li><li><p>past experiences</p></li><li><p>personality traits</p></li></ul><p>Feed that tokenized life sequence to a decoder-only transformer, and ask:</p><blockquote><p>Given the full sequence of this person&#8217;s life decisions so far, what is the most likely next decision?</p></blockquote><p>Autoregressively roll it forward:</p><p>Decision_1 &#8594; Decision_2 &#8594; Decision_3 &#8594; ... &#8594; Decision_N &#8594; predict Decision_(N+1)</p><p>You just built a generative model of a human life.</p><h2><strong>Applications</strong></h2><h3><strong>1. Personal Future Simulation</strong></h3><p>Simulate your tomorrow, next quarter, or entire decade based on the pattern of your choices.</p><h3><strong>2. Counterfactual Generators</strong></h3><p>Ask:</p><ul><li><p>What would my life look like if I accepted that job offer?</p></li><li><p>What if I moved cities?</p></li></ul><p>You modify the token at that decision point and regenerate the future from there.</p><h3><strong>3. 
Coaching &amp; Self-Awareness</strong></h3><p>A mirror that shows where your internal algorithm is leading you.</p><h3><strong>4. Behavioral Optimization</strong></h3><p>Detect loops and biases:</p><ul><li><p>impulsiveness</p></li><li><p>procrastination</p></li><li><p>risk aversion</p></li><li><p>patterns in relationships</p></li></ul><p>and propose alternative trajectories.</p><h2><strong>&#128302; The philosophical angle</strong></h2><p>Humans believe we are infinitely complex and unpredictable, yet we are predictable enough that companies infer our behavior through recommendation models.</p><p>But LLMs prove a deeper truth:</p><blockquote><p>Anything that is a sequence can be predicted. And a human life is nothing but a sequence of decisions.</p></blockquote><p>So the idea becomes:</p><p><strong>Treat life as a language model.</strong></p><p><strong>Treat decisions as tokens.</strong></p><p><strong>Treat the future as next-token prediction.</strong></p><h2><strong>&#128099; Final thought</strong></h2><p>We may soon reach a point where we can:</p><ul><li><p>Upload our history</p></li><li><p>Generate multiple future trajectories</p></li><li><p>Pick the most meaningful one</p></li></ul><p>Just as chess engines compute the best move, life engines could compute the best decision.</p><blockquote><p>The future of AI may not just be generating text.</p><p>It may be generating lives.</p></blockquote>]]></content:encoded></item><item><title><![CDATA[[Being Antifragile 01] Convexity in Life ]]></title><description><![CDATA[The Hidden Parallel Between Convex Functions in ML and Convexity in Life]]></description><link>https://www.purukathuria.com/p/being-antifragile-01-convexity-in</link><guid isPermaLink="false">https://www.purukathuria.com/p/being-antifragile-01-convexity-in</guid><dc:creator><![CDATA[Puru Kathuria]]></dc:creator><pubDate>Wed, 08 Oct 2025 16:08:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jUoz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Last night, while re-reading Nassim Nicholas Taleb&#8217;s Antifragile and connecting it with convex optimization problems, I had a strange moment of intellectual whiplash.</p><p>His idea is that some systems don&#8217;t just survive volatility but actually get better because of it. And right there, in between loss functions and gradient descent, I stumbled on a single word that connects both worlds: convexity.</p><p><strong>Calculus in Our Algorithms and in Our Lives</strong></p><p>That beautiful U-shaped curve we love in ML, the convex loss function, is our guarantee of stability: with a convex loss, every local minimum is also a global minimum.</p><p>It tells us:</p><ul><li><p>Small deviations don&#8217;t hurt much.</p></li><li><p>Moving in the right direction pays off disproportionately.</p></li></ul><p>Mathematically elegant. Emotionally&#8230; familiar?</p><p>Because Taleb&#8217;s &#8220;convex life&#8221; is built on the same logic.</p><p>When randomness hits you, the average outcome should improve. You either don&#8217;t lose much, or you gain a lot.</p><p><strong>The Fragile vs. The Convex Life</strong></p><p>The Fragile, Concave Life: smooth, predictable, but one big shock can wipe out years of progress.</p><p>The Antifragile, Convex Life: messy, volatile, but every small stress adds strength.</p><p>You take asymmetric risks, the kind that can quietly fail but might change everything if they succeed.</p><p><strong>Applying Convex Optimization to Everyday Life</strong></p><p><strong>Career:</strong> Most jobs can be linear. The side project is convex. Worst case, you learn. Best case, you take off disproportionately.</p><p><strong>Learning:</strong> Each concept compounds. Calculus didn&#8217;t just help me pass exams; it gave me a framework for taking asymmetric bets, in investing and in life.</p><p><strong>Relationships:</strong> Smooth ones can be fragile. Convex ones embrace honest stress, emerging stronger after each conflict.</p><p>Convexity, it turns out, isn&#8217;t just a mathematical property. 
It&#8217;s a way of structuring your existence.</p><p>Convex optimization makes algorithms stable.</p><p>Convex living makes humans antifragile.</p><p>And maybe, just maybe, that&#8217;s the most practical use of calculus we&#8217;ll ever find outside a classroom.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jUoz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jUoz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png 424w, https://substackcdn.com/image/fetch/$s_!jUoz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png 848w, https://substackcdn.com/image/fetch/$s_!jUoz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png 1272w, https://substackcdn.com/image/fetch/$s_!jUoz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jUoz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image of a convex function graph&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image of a convex function graph" title="Image of a convex function graph" srcset="https://substackcdn.com/image/fetch/$s_!jUoz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png 424w, https://substackcdn.com/image/fetch/$s_!jUoz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png 848w, https://substackcdn.com/image/fetch/$s_!jUoz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png 1272w, https://substackcdn.com/image/fetch/$s_!jUoz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c618669-1fdf-45fc-9b8a-cbaf45874f52_3998x3999.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p>]]></content:encoded></item><item><title><![CDATA[Coming soon]]></title><description><![CDATA[This is Puru&#39;s Substack.]]></description><link>https://www.purukathuria.com/p/coming-soon</link><guid isPermaLink="false">https://www.purukathuria.com/p/coming-soon</guid><dc:creator><![CDATA[Puru Kathuria]]></dc:creator><pubDate>Tue, 07 Oct 2025 09:08:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7h7X!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c7d405a-21a6-4f48-a4fb-bdc8166d2096_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is Puru&#39;s Substack.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.purukathuria.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.purukathuria.com/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item></channel></rss>