<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Attention on Deepak Baby</title><link>https://deepakbaby.in/tags/attention/</link><description>Recent content in Attention on Deepak Baby</description><generator>Hugo</generator><language>en</language><lastBuildDate>Mon, 01 Jun 2026 21:25:56 +0200</lastBuildDate><atom:link href="https://deepakbaby.in/tags/attention/index.xml" rel="self" type="application/rss+xml"/><item><title>Attention Mechanisms, Part 1: From Seq2Seq to Learned Alignment</title><link>https://deepakbaby.in/posts/attention-history-bahdanau-to-transformer/</link><pubDate>Mon, 01 Jun 2026 00:00:00 +0000</pubDate><guid>https://deepakbaby.in/posts/attention-history-bahdanau-to-transformer/</guid><description>&lt;p>Most of the interesting things we ask neural networks to do look the same from a distance: take a sequence in, generate a sequence out. Translate an English sentence into French -&amp;gt; a sequence of English words in, a sequence of French words out. Caption an image or transcribe a recording of speech -&amp;gt; a sequence of pixels or audio samples in, a sequence of words out. Summarize a paragraph or answer a question grounded in a story -&amp;gt; words in, words out.&lt;/p></description></item></channel></rss>