Amy Hanlonhttp://amygdalama.github.io/2015-08-17T00:00:00-04:00Video: How reviewing code makes me a better programmer!2015-08-17T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2015-08-17:video-how-reviewing-code-makes-me-a-better-programmer.html<p>Many developers see code review as a way to keep code compliant and to reduce bugs, but code review can be so much more than that!
In this talk, I'll share how reviewing code became one of the most valuable ways for me to become a better programmer.
I'll cover how you can learn more from code review, how you can improve your skills as a reviewer, and how to influence your colleagues to give you better code review.</p>
<p>Here's my talk about how reviewing code makes me a better programmer. You can also <a href="http://www.slideshare.net/AmyHanlon/how-reviewing-code-makes-me-a-better-programmer">view my slides</a>.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/MfKQGNUHaqs" frameborder="0" allowfullscreen></iframe>Death to the rubber stamp!2015-07-19T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2015-07-19:death-to-the-rubber-stamp.html<p>You work really hard on some code, submit a pull request, hoping for some feedback, and then, dun dun dunnnn...</p>
<blockquote>
<p>✨ lgtm! ✨</p>
</blockquote>
<iframe src="//giphy.com/embed/3h5pe45FM9qUM" width="480" height="267" frameBorder="0" style="max-width: 100%" class="giphy-embed" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe>
<p>Come on. We all know that all software is broken. No pull request is perfect -- there's always room for improvement. Yet so often we respond with an even shorter version of <em>"yeah, sure, whatever"</em>.</p>
<p>This response is not helpful for you as the reviewer, and it's not helpful for the author, and it's not helpful for your organization. So, let's say someone assigns you a pull request. How can you do something more useful than simply giving it a rubber stamp of approval?</p>
<h3 id="ask-lots-of-questions">Ask lots of questions!</h3>
<p>When I get a pull request, here are some questions I ask myself (and if the answer isn't clear, I ask the author).
These questions help find improvements in the patch, and they can also start meaningful discussions that help each of you learn more about how to think about code.</p>
<h3 id="meta">Meta</h3>
<ul>
<li>What problem is this pull request solving?</li>
<li>Is it solving the right problem?</li>
<li>Does it solve the problem?</li>
<li>What are other ways the problem could be solved? Why did the author choose this solution?</li>
</ul>
<h3 id="history">History</h3>
<ul>
<li>Do the commit messages accurately describe what problem is being solved?</li>
<li>Do the structure and history of the commits make sense? </li>
<li>Are there any surprises in any of the commits (e.g., things you wouldn't expect to be included in the commit based on the commit message)?</li>
</ul>
<h3 id="design">Design</h3>
<blockquote>
<p>Good programmers write code that humans can understand.
<em>- Martin Fowler</em></p>
</blockquote>
<ul>
<li>Is the code well-tested?</li>
<li>Is the code easy to read?</li>
<li>Are there any parts that are confusing?</li>
<li>Are the names informative?</li>
<li>Are there any functions that do things that surprise you?</li>
<li>Do the functions do only one thing?</li>
<li>Are the statements in each function all at the same level of abstraction?</li>
<li>Are there any other code smells?</li>
</ul>
<p>If the code is confusing or hard to read, that's a problem with the code, not you (the reader). Ask the author to clarify and to make it easier to read.</p>
<h3 id="details">Details</h3>
<ul>
<li>Does the code comply with your org's coding standards?</li>
<li>Could the tests ever fail?</li>
<li>Do you see any obvious bugs? (This one's hard!)</li>
<li>Does the patch touch any particularly volatile or high-risk parts of the codebase?</li>
<li>If the patch has bugs, what's the worst thing that could happen?</li>
</ul>
<p>What questions do you ask when reviewing code?</p>
<h3 id="related-reading">Related reading</h3>
<ul>
<li><a href="https://rachelbythebay.com/w/2012/03/10/review/">Code reviews with a rubber stamp</a></li>
<li><a href="https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/b0Lb_mXfp0Y">Please don't rubber stamp code reviews</a></li>
<li><a href="http://jvns.ca/blog/2014/06/13/asking-questions-is-a-superpower/">Asking questions is a superpower</a></li>
<li><a href="https://www.youtube.com/watch?v=hY14Er6JX2s">Your Brain's API: Giving and receiving technical help</a></li>
<li><a href="http://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/01323508820">Clean Code</a></li>
</ul>Watch it fail2015-06-27T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2015-06-27:watch-it-fail.html<p>A tenet of test-driven development is that, because you write your test first, you <em>watch it fail</em> before you write the code to make it pass.
Here are three examples of tests that pass even when they shouldn't. See if you can figure out why!</p>
<h3 id="1-willow">1. willow</h3>
<p>A colleague of mine found a bunch of tests in our test suite using <a href="https://docs.python.org/3/library/unittest.mock.html">mock</a> that were written like this:</p>
<div class="highlight"><pre><span class="k">class</span> <span class="nc">TestChangePrimaryEmail</span><span class="p">(</span><span class="n">unittest</span><span class="o">.</span><span class="n">TestCase</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">test_change_primary_email_sends_email_notification</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">user</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">setup_test_user</span><span class="p">()</span>
        <span class="n">old_email</span> <span class="o">=</span> <span class="n">user</span><span class="o">.</span><span class="n">email</span>
        <span class="n">new_email</span> <span class="o">=</span> <span class="s">'test@example.com'</span>
        <span class="k">with</span> <span class="n">patch</span><span class="o">.</span><span class="n">object</span><span class="p">(</span><span class="n">Emailer</span><span class="p">,</span> <span class="s">'send'</span><span class="p">)</span> <span class="k">as</span> <span class="n">mock_send</span><span class="p">:</span>
            <span class="n">user</span><span class="o">.</span><span class="n">change_primary_email</span><span class="p">(</span><span class="n">new_email</span><span class="p">)</span>
        <span class="n">mock_send</span><span class="o">.</span><span class="n">assert_has_call</span><span class="p">(</span><span class="n">old_email</span><span class="p">)</span>
</pre></div>
<p>This test would pass even if <code>user.change_primary_email</code> never sends an email! Why?</p>
<h3 id="2-lying-cat">2. lying cat</h3>
<p>Let's say we wanted to test cancelling a user, where a user object could look like this:</p>
<div class="highlight"><pre><span class="k">class</span> <span class="nc">User</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
    <span class="o">...</span>
    <span class="k">def</span> <span class="nf">is_cancelled</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">_cancelled</span>
    <span class="k">def</span> <span class="nf">cancel</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">_cancelled</span> <span class="o">=</span> <span class="bp">True</span>
</pre></div>
<p>And then our test looks like this:</p>
<div class="highlight"><pre><span class="k">class</span> <span class="nc">TestCancelUser</span><span class="p">(</span><span class="n">unittest</span><span class="o">.</span><span class="n">TestCase</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">test_cancel_user</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">user</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">setup_test_user</span><span class="p">()</span>
        <span class="n">user</span><span class="o">.</span><span class="n">cancel</span><span class="p">()</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">assertTrue</span><span class="p">(</span><span class="n">user</span><span class="o">.</span><span class="n">is_cancelled</span><span class="p">)</span>
</pre></div>
<p>This test would pass even if the user wasn't cancelled! Why?</p>
<h3 id="3-cady">3. cady</h3>
<p>Just this week we found a great bug:</p>
<ul>
<li>some users have special "post to wordpress" email addresses saved in their contact books</li>
<li>users can send invites to join Venmo to all of the email addresses in their contact book (who aren't already Venmo users)</li>
</ul>
<p>The combination of these two things could cause us to post an invite to join Venmo to the user's Wordpress blog. Oops.</p>
<p>A colleague fixed this bug by blacklisting emails with a Wordpress domain in our invite emailer and wrote a test for it like this:</p>
<div class="highlight"><pre><span class="k">class</span> <span class="nc">TestDontEmailWordpress</span><span class="p">(</span><span class="n">unittest</span><span class="o">.</span><span class="n">TestCase</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">test_inviting_contacts_skips_wordpress_emails</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">user</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">setup_test_user</span><span class="p">()</span>
        <span class="n">user_to_invite</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">setup_another_test_user</span><span class="p">()</span>
        <span class="n">user_to_invite</span><span class="o">.</span><span class="n">email</span> <span class="o">=</span> <span class="s">'blacklisted@wordpress.com'</span>
        <span class="n">user</span><span class="o">.</span><span class="n">add_to_contacts</span><span class="p">(</span><span class="n">user_to_invite</span><span class="o">.</span><span class="n">email</span><span class="p">)</span>
        <span class="k">with</span> <span class="n">patch</span><span class="o">.</span><span class="n">object</span><span class="p">(</span><span class="n">Emailer</span><span class="p">,</span> <span class="s">'send'</span><span class="p">)</span> <span class="k">as</span> <span class="n">mock_send</span><span class="p">:</span>
            <span class="n">spam_contact_book_with_invites</span><span class="p">(</span><span class="n">user</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">assertFalse</span><span class="p">(</span><span class="n">mock_send</span><span class="o">.</span><span class="n">called</span><span class="p">)</span>
</pre></div>
<p>This test would pass even if Wordpress emails weren't blacklisted! Why?</p>Module objects are global!2015-06-14T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2015-06-14:module-objects-are-global.html<p>One of my favorite kinds of bugs is when a test that seems entirely unrelated to a code change fails. I've trained myself to look for the common causes, which usually have to do with shared state left behind by a test that's missing a proper teardown. But this week, I had a new kind of failure, one to do with module objects and how Python's import mechanism works, which, if you didn't know, is also <a href="http://mathamy.com/import-accio-bootstrapping-python-grammar.html">one of my favorite things</a>.</p>
<p>I've been experimenting with <a href="https://github.com/joke2k/faker">faker</a>, a package for generating random phone numbers, email addresses, etc, for use in tests. My goal was to have a wide surface area of phone numbers used across tests, but also for each test to use the same phone number(s) every test run.</p>
<p>Using <code>faker</code> is pretty simple:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">faker</span> <span class="kn">import</span> <span class="n">Faker</span>
<span class="gp">>>> </span><span class="n">fake</span> <span class="o">=</span> <span class="n">Faker</span><span class="p">()</span>
<span class="gp">>>> </span><span class="n">fake</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">fake</span><span class="o">.</span><span class="n">phone_number</span><span class="p">()</span>
<span class="go">u'742-547-3459x52762'</span>
<span class="gp">>>> </span><span class="n">fake</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">fake</span><span class="o">.</span><span class="n">phone_number</span><span class="p">()</span>
<span class="go">u'742-547-3459x52762'</span>
</pre></div>
<p>Here, I'm seeding the <code>Faker</code> instance so we get the same phone number for each call. Having to do this in the setup for every test, though, seemed like a lot of tedious work, and probably easy to forget, so I wanted to see if I could build a nose plugin that would seed a <code>Faker</code> instance for me based on the hash of the test name.</p>
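<p>To make the "seed from the test name" idea concrete, here's a minimal sketch of just the seed derivation. This isn't the plugin itself: <code>seed_for_test</code> and the example test names are hypothetical, and I'm using <code>hashlib</code> as an assumption because on Python 3 the built-in <code>hash()</code> of a string changes between processes unless <code>PYTHONHASHSEED</code> is set.</p>

```python
import hashlib


def seed_for_test(test_name):
    """Derive a stable integer seed from a test name (hypothetical helper)."""
    # A cryptographic digest is identical on every run and every machine,
    # unlike hash() on str, which is randomized per process on Python 3.
    digest = hashlib.sha256(test_name.encode("utf-8")).hexdigest()
    # The first 8 hex digits give a 32-bit integer, plenty for a seed.
    return int(digest[:8], 16)


# Hypothetical test names, only for illustration.
seed_a = seed_for_test("tests.test_users.TestSignup.test_phone")
seed_a_again = seed_for_test("tests.test_users.TestSignup.test_phone")
seed_b = seed_for_test("tests.test_users.TestSignup.test_email")
```

<p>The same name always yields the same seed, and different names (almost always) yield different seeds, which is what gives each test its own repeatable phone numbers.</p>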
<p>This worked great... until I ran our entire test suite, and a test that <em>didn't</em> use <code>faker</code> mysteriously started failing due to an invalid phone number.</p>
<p>However, after a bit of digging, I found that the failing test <em>did</em> use the <code>random</code> module, generating a phone number like this:</p>
<div class="highlight"><pre><span class="o">>>></span> <span class="n">phone_number_digits</span> <span class="o">=</span> <span class="p">[</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">9</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="mi">10</span><span class="p">)]</span>
<span class="p">[</span><span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
</pre></div>
<p>I immediately recognize that this has some possibility to generate an invalid phone number (<code>999</code> isn't a valid area code). I try running the test suite again, assuming that there's a small chance that this test will fail, and maybe I just got unlucky. Nope, it failed a second time. No matter how many times I run the test suite, this test fails.</p>
<p>Hrm.</p>
<p>Does seeding <code>faker</code> also seed <code>random</code>? Let's test this out in our REPL:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">fake</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
<span class="go">85</span>
<span class="gp">>>> </span><span class="n">fake</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
<span class="go">85</span>
<span class="gp">>>> </span><span class="n">fake</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
<span class="go">85</span>
</pre></div>
<p>Aha! So <code>faker</code> seeds <code>random</code>. But how does that work? Time to look at the source for <code>faker</code> to see how <code>seed</code> works. <a href="https://github.com/joke2k/faker/blob/e036b29268d346000453211d6f3153e99bdc2fe6/faker/generator.py#L52">Here's</a> the relevant code:</p>
<div class="highlight"><pre><span class="k">def</span> <span class="nf">seed</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
    <span class="sd">"""Calls random.seed"""</span>
    <span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span>
</pre></div>
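<p>You can reproduce the effect with nothing but the standard library. In this sketch, <code>seed_like_faker</code> is a hypothetical stand-in for the method above (it is not <code>faker</code>'s code). The second half shows a private <code>random.Random</code> instance, which is one common way to keep seeding contained; I'm not claiming it's the fix the library chose.</p>

```python
import random


def seed_like_faker(seed=None):
    """Hypothetical stand-in: reseeds the shared module-level generator."""
    random.seed(seed)


# Code that only ever calls random.randint still sees the reseeded state.
seed_like_faker(0)
first = random.randint(0, 100)
seed_like_faker(0)
second = random.randint(0, 100)

# A private generator carries its own state, so reseeding the
# module-level one doesn't disturb it.
private = random.Random(42)
expected = random.Random(42).random()
seed_like_faker(0)
isolated = private.random()
```

<p>Here <code>first</code> and <code>second</code> always match, while <code>isolated</code> equals <code>expected</code> no matter how often the shared generator is reseeded.</p>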
<p>Here's a summary of what we know so far:</p>
<ol>
<li>module <code>a</code> imports module <code>b</code></li>
<li>module <code>b</code> seeds <code>random</code></li>
<li>as a result, <code>random</code> is <em>also</em> seeded in module <code>a</code></li>
</ol>
<p>This must mean that the <code>random</code> imported in <code>a</code> is the <em>same</em> module object as the <code>random</code> imported in <code>b</code>. So if this is the case, we can do things like add attributes to <code>random</code> in one module, and access them in another module. Let's try!</p>
<div class="highlight"><pre><span class="c"># a.py:</span>
<span class="kn">import</span> <span class="nn">random</span>
<span class="n">random</span><span class="o">.</span><span class="n">defined_in_a</span> <span class="o">=</span> <span class="s">"hi!"</span>
<span class="c"># b.py:</span>
<span class="kn">import</span> <span class="nn">a</span>  <span class="c"># runs a.py, which sets the attribute on random</span>
<span class="kn">import</span> <span class="nn">random</span>
<span class="k">print</span><span class="p">(</span><span class="n">random</span><span class="o">.</span><span class="n">defined_in_a</span><span class="p">)</span>
</pre></div>
<p>When we try running <code>b.py</code>, do we get an <code>AttributeError</code>? Or does this resolve and print <code>"hi!"</code>? Let's see:</p>
<div class="highlight"><pre><span class="gp">$</span> python b.py
<span class="go">hi!</span>
</pre></div>
<p>Neat. So module objects are global. For more on how this works, this documentation might be helpful:</p>
<ul>
<li><a href="https://docs.python.org/3/reference/import.html#the-import-system">the import system</a></li>
<li><a href="https://docs.python.org/3.4/library/sys.html#sys.modules">sys.modules</a></li>
</ul>
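<p>If you'd rather poke at it than read docs, here's a small sketch of the same idea in a single file. The alias <code>random_again</code> and the attribute <code>defined_here</code> are made up for the example; the second <code>import</code> stands in for what another module would do.</p>

```python
import sys
import random
import random as random_again  # a second import, as another module would do

# Every import of "random" hands back the one module object cached in sys.modules.
same_object = random is random_again is sys.modules["random"]

# So an attribute set through one name is visible through the other.
random.defined_here = "hi!"
seen_elsewhere = random_again.defined_here

# Clean up, since this really is shared, process-wide state.
del random.defined_here
```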
<p>If you're thinking that it's probably bad that seeding <code>faker</code> has side-effects outside of <code>faker</code>, you're right! <a href="https://github.com/joke2k/faker/issues/14">Here's</a> a ticket explaining why this is a problem and some possible solutions.</p>Things you can do other than scoffing at someone2015-05-16T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2015-05-16:scoffing.html<p>Things you can do other than scoffing at someone who says they use a technology you heard was horrible: </p>
<ul>
<li>ask them about it, maybe they know things you don't know</li>
<li>change the subject</li>
<li>say something nice, like "I like your hat"</li>
<li>don't do or say anything</li>
<li>excuse yourself from the conversation</li>
<li>use this moment to grow as a person</li>
<li>say "oh, how interesting" even though you don't think it's interesting, while secretly judging them because you are clearly smarter and more experienced and knowledgeable than they are</li>
<li>tell them what you really think, which is that you're clearly smarter and more experienced and knowledgeable than they are, because you read this one article on HN that said that this technology they use fucking sucks, and then, mid-sentence, have a rare moment of reflection in which you realize that your feelings of superiority aren't relevant to the conversation at hand, and that maybe you could <em>not</em> go around making other people feel bad about themselves in order to make yourself feel better all the time, and maybe this compulsion you have to assert dominance is actually due to your own self consciousness and fear</li>
</ul>PyCon Recording: Investigating Python Wats2015-04-18T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2015-04-18:pycon-recording-investigating-python-wats.html<p>Many of us have experienced a "wat" in Python - some behavior that totally mystifies us. Here is my PyCon talk on Investigating Python Wats, where we uncover some surprising implementation details of CPython, some unexpected consequences of mutability, and details of scope and name resolution.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/sH4XF6pKKmk" frameborder="0" allowfullscreen></iframe>AlterConf: A Conference to Emulate2014-10-05T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-10-05:alterconf.html<p>Yesterday I attended <a href="http://www.alterconf.com/">AlterConf</a>, a conference about diversity in the tech and gaming industries, hosted by <a href="http://www.ashedryden.com/">Ashe Dryden</a>.</p>
<p>The talks were enlightening, personal, and hopeful:</p>
<ul>
<li><a href="http://thebitchwhocodes.com/">Stacey Mulcahy</a> and <a href="https://twitter.com/cattsmall">Catt Small</a> on lessons they've learned running the <a href="http://codeliberation.org/">Code Liberation Foundation</a>, an organization that offers free game development workshops for women</li>
<li><a href="http://manamaneco.blogspot.com/">Manuel Marcano</a> on avoiding Native American stereotypes in games</li>
<li><a href="http://www.davidpeter.me/">David Peter</a> on what deafness is, his experiences being deaf, and what we can do to be more inclusive</li>
<li><a href="http://anewchallengerawaits.com/general/">Shawn Alexander Allen</a> on how underrepresented game designers have used Kickstarter to crowdfund their games</li>
<li><a href="http://rubywankenoobie.tumblr.com/">Stephanie Morillo</a> on her experiences learning to code while growing up in the Bronx and how lack of exposure is blocking entire communities from startups and the tech industry</li>
<li><a href="http://senongo.net/">Senongo Akpem</a> on the booming tech scene in Nigeria</li>
<li><a href="https://twitter.com/chrisalgoo">Chris Algoo</a> on lessons learned organizing diversity-focused game jams through <a href="http://www.brooklyngamery.com/">Brooklyn Gamery</a></li>
<li><a href="https://twitter.com/sinthetix">Aly Ferguson</a> on how gaming has a tremendous impact in mental, physical, and social rehabilitation</li>
<li><a href="http://arlduc.org/">Arlene Ducao</a> on her experiences with micro-discrimination at MIT Media Lab and the relationship between US startup culture and imperialism</li>
</ul>
<p>AlterConf has the most diverse set of speakers I've ever seen at a conference. This is not a coincidence. I'd like to emphasize a few details of AlterConf that made the environment particularly safe and welcoming, in no particular order:</p>
<ul>
<li>explicit focus on all aspects of diversity</li>
<li><a href="https://twitter.com/AlterConf/status/506929157688000512">all gender restrooms</a></li>
<li>preferred pronouns on nametags</li>
<li>taped-off mobility zones</li>
<li>ADA accessible, and near ADA accessible public transportation</li>
<li>real-time captioning (Lindsey Kuper wrote a great <a href="http://composition.al/blog/2014/05/31/your-next-conference-should-have-real-time-captioning/">post</a> about why !!Con provided this)</li>
<li>sign language interpreters</li>
<li>hyper-local (reduced travel costs for attendees and speakers)</li>
<li>sliding-scale tickets</li>
<li>speaker compensation</li>
<li>first-time speakers</li>
<li><a href="http://www.alterconf.com/code-of-conduct">Code of Conduct</a></li>
<li>food and <em>non-alcoholic</em> drinks</li>
<li>content/trigger warnings when appropriate</li>
<li>no forced networking</li>
<li>side rooms available to take breaks from being social</li>
<li>plenty of breaks between talks</li>
<li>optional partners to walk with you to your subway station, etc</li>
</ul>
<p>It's clear that the organizers put a <em>ton</em> of work into making the conference safe and accessible for everyone, and their work paid off in an exceptionally diverse set of speakers and attendees. This is what can happen when organizers shift their focus from providing social events and happy hours to making their conference safe and accessible.</p>Python Wats: Mutable Default Arguments2014-04-25T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-04-25:python-wats-mutable-default-arguments.html<p>Let's look at a common Python <a href="https://www.destroyallsoftware.com/talks/wat">wat</a> and try to figure out wat's actually happening!</p>
<p>We'll define a function, <code>foo</code>, which takes one argument, <code>l</code>, which has the default value of an empty list.</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">def</span> <span class="nf">foo</span><span class="p">(</span><span class="n">l</span><span class="o">=</span><span class="p">[]):</span>
<span class="gp">... </span>    <span class="n">l</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="s">'cat'</span><span class="p">)</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">l</span>
</pre></div>
<p>What happens when we call <code>foo</code> multiple times?</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">foo</span><span class="p">()</span>
<span class="go">['cat']</span>
<span class="gp">>>> </span><span class="n">foo</span><span class="p">()</span>
<span class="go">['cat', 'cat']</span>
</pre></div>
<p>Whoa! So mutating <code>l</code> actually mutates it for all future calls to the function. Weird.</p>
<p>This means that the <code>[]</code> object is <em>only created once</em>, and each time we call <code>foo</code> without an argument, <code>l</code> is referring to that same object. This may lead you to form a hypothesis: <code>l=[]</code> is kind of like a name-binding statement that executes only once, when the function is defined.</p>
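<p>The "only created once" part is easy to check directly. This is a minimal sketch (written so it runs on Python 3 as well) that compares object identity across calls; the names <code>first</code>, <code>second</code>, and <code>fresh</code> are mine. It only confirms that the list is shared, and says nothing yet about the name-binding hypothesis.</p>

```python
def foo(l=[]):
    l.append('cat')
    return l


first = foo()
second = foo()

# Both calls handed back the very same list object, not two equal lists.
shared = first is second

# Passing our own list bypasses the stored default entirely.
fresh = foo([])
```

<p>After this runs, <code>shared</code> is <code>True</code>, the shared list holds two <code>'cat'</code>s, and <code>fresh</code> holds one.</p>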
<p>But, if that hypothesis is true, then how should we expect the following function to behave?</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">def</span> <span class="nf">bar</span><span class="p">(</span><span class="n">l</span><span class="o">=</span><span class="p">[]):</span>
<span class="gp">... </span>    <span class="k">print</span> <span class="nb">locals</span><span class="p">()</span>
<span class="gp">... </span>    <span class="n">l</span> <span class="o">=</span> <span class="p">[</span><span class="s">'cat'</span><span class="p">]</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">l</span>
<span class="gp">...</span>
<span class="gp">>>> </span><span class="n">bar</span><span class="p">()</span>
<span class="go"># ?</span>
<span class="gp">>>> </span><span class="n">bar</span><span class="p">()</span>
<span class="go"># ?</span>
</pre></div>
<p>Well, if <code>l=[]</code> is <em>like a name-binding statement that executes only once</em> when the function is defined, then I would expect something like this sequence of events to happen, when we define <code>bar</code> and then call it twice:</p>
<ol>
<li><code>bar</code> is defined<ul>
<li>the name <code>l</code> is bound to the object <code>[]</code></li>
</ul>
</li>
<li><code>bar</code> is called the first time:<ul>
<li><code>locals()</code> should return <code>{l : []}</code></li>
<li>then we reassign <code>l</code> to <code>['cat']</code> within the scope of <code>bar</code></li>
<li><code>bar</code> should return <code>['cat']</code></li>
</ul>
</li>
<li><code>bar</code> is called again:<ul>
<li><code>l=[]</code> is not executed (based on our assumption)</li>
<li><code>locals()</code> should either return <code>{}</code> or <code>{l : ['cat']}</code>, depending on if the assignment of <code>l = ['cat']</code> persists after the function is called the first time</li>
<li><code>bar</code> should return <code>['cat']</code></li>
</ul>
</li>
</ol>
<p>What actually happens?</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">bar</span><span class="p">()</span>
<span class="go">{'l': []}</span>
<span class="go">['cat']</span>
<span class="gp">>>> </span><span class="n">bar</span><span class="p">()</span>
<span class="go">{'l': []}</span>
<span class="go">['cat']</span>
</pre></div>
<p>Hrm. This behavior reasonably leads us to believe that the assignment <code>l=[]</code> happens <em>each time we call the function <code>bar</code></em>. But in <code>foo</code>, <code>l=[]</code> can't be a statement that executes each time the function is called, or else we'd create a new <code>[]</code> each time.</p>
<p>If we assume <code>l=[]</code> executes like a name-binding statement, then it must execute either (1) only once when the function is defined, or (2) each time the function is called. In <code>foo</code>, it only executes once, but in <code>bar</code>, it executes every time we call <code>bar</code>. That just can't be. So our assumption that <code>l=[]</code> executes like a name-binding statement leads to a contradiction, and thus must be wrong!</p>
<p>Guess what, nerds! We kind of just did a <a href="http://en.wikipedia.org/wiki/Proof_by_contradiction">proof by contradiction</a>!</p>
<p>So then what really happens when we define default values for arguments? Let's see if we can figure out where the default values are stored.</p>
<p>My usual go-to for questions like this is Python internals whiz and Hacker School Facilitator <a href="http://akaptur.github.io/">Allison Kaptur</a>, but you can also find the answer with a bit of <a href="https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=python%20mutable%20default%20arguments">googling</a>.</p>
<p>So, without further ado, what actually happens when we define a default argument in Python 2.x is that the value of the argument gets stored inside the function's <code>func_defaults</code> attribute. (In 3.x, the values are stored in the <code>__defaults__</code> attribute.)</p>
<p>Let's look back at the <code>foo</code> function:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">def</span> <span class="nf">foo</span><span class="p">(</span><span class="n">l</span><span class="o">=</span><span class="p">[]):</span>
<span class="gp">... </span>    <span class="n">l</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="s">'cat'</span><span class="p">)</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">l</span>
</pre></div>
<p>In Python 2.x, we can access <code>foo</code>'s <code>func_defaults</code> like so:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">foo</span><span class="o">.</span><span class="n">func_defaults</span>
<span class="go">([],)</span>
<span class="gp">>>> </span><span class="n">foo</span><span class="p">()</span>
<span class="go">['cat']</span>
<span class="gp">>>> </span><span class="n">foo</span><span class="o">.</span><span class="n">func_defaults</span>
<span class="go">(['cat'],)</span>
</pre></div>
<p>Aha! So the actual object that is being stored as the default for <code>foo</code> is being modified when we call <code>foo</code>! For fun, let's see if we can mutate the default value from outside of the function:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">foo</span><span class="o">.</span><span class="n">func_defaults</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="s">'dragon'</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">foo</span><span class="o">.</span><span class="n">func_defaults</span>
<span class="go">(['cat', 'dragon'],)</span>
<span class="gp">>>> </span><span class="n">foo</span><span class="p">()</span>
<span class="go">['cat', 'dragon', 'cat']</span>
</pre></div>
<p>Eep! That was fun. So what's in the <code>func_defaults</code> of <code>bar</code>? Recall:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">def</span> <span class="nf">bar</span><span class="p">(</span><span class="n">l</span><span class="o">=</span><span class="p">[]):</span>
<span class="gp">... </span>    <span class="k">print</span> <span class="nb">locals</span><span class="p">()</span>
<span class="gp">... </span>    <span class="n">l</span> <span class="o">=</span> <span class="p">[</span><span class="s">'cat'</span><span class="p">]</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">l</span>
</pre></div>
<p>And then:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">bar</span><span class="o">.</span><span class="n">func_defaults</span>
<span class="go">([],)</span>
<span class="gp">>>> </span><span class="n">bar</span><span class="p">()</span>
<span class="go">{'l': []}</span>
<span class="go">['cat']</span>
<span class="gp">>>> </span><span class="n">bar</span><span class="o">.</span><span class="n">func_defaults</span>
<span class="go">([],)</span>
</pre></div>
<p>Okay! So since <code>bar</code> <em>reassigns</em> <code>l</code> to <code>['cat']</code>, it doesn't modify the object stored in <code>func_defaults</code>.</p>
<p>So what have we learned?</p>
<p>It appears as if the following happens when we define and call <code>bar</code>:</p>
<ol>
<li><code>bar</code> is defined<ul>
<li>the object <code>[]</code> is created and stored in the <code>func_defaults</code> tuple</li>
</ul>
</li>
<li><code>bar</code> is called the first time:<ul>
<li>since we didn't pass in a value for <code>l</code> as an argument, Python looks in the <code>func_defaults</code> for the value to bind to the name <code>l</code>, and grabs the <code>[]</code> object that we created when we defined <code>bar</code></li>
<li><code>locals()</code> returns <code>{'l': []}</code></li>
<li>we reassign <code>l</code> to <code>['cat']</code> within the scope of <code>bar</code>. Since this is a reassignment, this doesn't modify the <code>[]</code> object contained in <code>func_defaults</code>. Instead, <code>l</code> is just bound to a different object in memory.</li>
<li><code>['cat']</code> is returned</li>
</ul>
</li>
<li><code>bar</code> is called again:<ul>
<li>since we didn't modify the <code>[]</code> object the first time we called <code>bar</code>, the same series of events happens as in step 2!</li>
</ul>
</li>
</ol>
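<p>For anyone following along in Python 3, here's a rough sketch of the same experiment. It assumes the 3.x attribute name <code>__defaults__</code> mentioned above, and <code>shared</code> and <code>returned</code> are just names I made up for the demonstration.</p>

```python
def foo(l=[]):
    l.append('cat')   # mutates the one list object stored as the default
    return l

def bar(l=[]):
    l = ['cat']       # rebinds the local name; the stored default is untouched
    return l

shared = foo.__defaults__[0]   # the list created when foo was defined
returned = foo()
print(returned is shared)      # True: foo hands back the stored default itself
print(foo.__defaults__)        # (['cat'],)

bar()
print(bar.__defaults__)        # ([],)
```

<p>The <code>is</code> check is the interesting part: <code>foo</code> doesn't return a copy of its default, it returns the very object that lives in the defaults tuple.</p>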
<p>I should probably also mention something more useful: a common way of setting a default value to an empty list (and having it actually work as expected) is to do the following:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">def</span> <span class="nf">baz</span><span class="p">(</span><span class="n">l</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
<span class="gp">... </span>    <span class="k">if</span> <span class="n">l</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
<span class="gp">... </span>        <span class="n">l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="gp">... </span>    <span class="n">l</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="s">'cat'</span><span class="p">)</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">l</span>
</pre></div>
<p>When we call <code>baz</code> multiple times, it behaves the way we'd expect:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">baz</span><span class="p">()</span>
<span class="go">['cat']</span>
<span class="gp">>>> </span><span class="n">baz</span><span class="p">()</span>
<span class="go">['cat']</span>
</pre></div>
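<p>A related detail, sketched here for Python 3 with two hypothetical variants of <code>baz</code>: a truthiness test (<code>if not l</code>) and an identity test (<code>if l is None</code>) behave differently when the caller passes in an empty list of their own. The names <code>baz_truthy</code>, <code>baz_none</code>, <code>mine</code>, and <code>yours</code> are made up for this comparison.</p>

```python
def baz_truthy(l=None):
    if not l:          # also true for an empty list the caller passed in
        l = []
    l.append('cat')
    return l

def baz_none(l=None):
    if l is None:      # only true when no list was passed at all
        l = []
    l.append('cat')
    return l

mine = []
baz_truthy(mine)
print(mine)    # []: the caller's list was silently swapped for a new one

yours = []
baz_none(yours)
print(yours)   # ['cat']: the caller's list was the one that got appended to
```

<p>Neither variant shares state between calls, so both avoid the original wat. They only differ in what happens to a caller's empty list.</p>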
<p>Cool! Wat conquered. We should collect badges for all the wats we've battled.</p>
<p>Thanks to <a href="http://maryrosecook.com/">Mary Rose Cook</a> and <a href="http://akaptur.github.io/">Allison Kaptur</a>, who valiantly battled this wat with me.</p>Introducing Iron Maker Or Forger Or Something2014-04-19T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-04-19:introducing-iron-maker-or-forger-or-something.html<p>The end of <a href="https://www.hackerschool.com/">Hacker School</a> is quickly approaching and a thousand kittens are crying tears of sadness!</p>
<p><img alt="" src="/images/cat-crying.gif" /></p>
<p>In order to maintain momentum and Never Graduate, I'm challenging myself and some other Hacker Schoolers to complete a programming project once a week for two(?) months. </p>
<p>This challenge is inspired by Iron Blogger, which requires participants to blog once a week, and if you miss a week, you owe $5. Mike Walker, from the Fall 2013 batch of Hacker School, <a href="http://blog.lazerwalker.com/blog/2013/12/24/one-post-a-week-running-an-iron-blogger-challenge">wrote extensively</a> about Iron Blogger. </p>
<p>The difference with Iron Maker, or Iron Forger, or Iron Something-Clever-That-Will-Hopefully-Come-To-Me-In-The-Near-Future, is that we will be completing small, self-contained programming projects, rather than writing blog posts. </p>
<p>Specifically, I want to work on projects that take about 4-8 hours to complete, and center around making, for lack of a better word, a <em>product</em>. </p>
<h2 id="criteria">Criteria</h2>
<p>A few days ago, I <a href="https://twitter.com/amygdalama/status/456950130286292992">asked Twitter</a> for project ideas, but I hadn't yet figured out how to explicitly define criteria for what I was looking for. <a href="https://twitter.com/moss">Moss Collum</a> responded with an excellent <a href="http://makingcodespeak.com/2014/04/18/tiny-projects.html">blog post</a>, which includes some defining characteristics of good projects:</p>
<blockquote>
<ul>
<li><strong>Short</strong>: I can see interesting results within a few hours, and some results even sooner.</li>
<li><strong>End-to-end</strong>: The project produces real software with a user-interface (even if it’s a simple one like a command-line script).</li>
<li><strong>Expandable</strong>: Once I have some working code, it should be easy to think of new features.</li>
<li><strong>Variable</strong>: There should be room to change requirements over time in ways that break my assumptions and test my code’s ability to evolve</li>
<li><strong>Fun</strong>: The problem should be something I can care about enough to stay engaged.</li>
</ul>
</blockquote>
<h2 id="purpose">Purpose</h2>
<p>These projects are not intended to result in products that people will actually use. Instead, the intention is to get exposure to new tools and concepts and to practice making software design decisions. </p>
<h2 id="suggested-projects">Suggested Projects</h2>
<p>I'll be maintaining a <a href="https://github.com/amygdalama/programming-projects">list</a> of suggested projects on GitHub. I'm trying to keep the list short, manageable, and high-quality, so I'm only adding projects that I'm committed to working on, and then for each project I'll be adding details on how long it took and what parts were fun/challenging/easy/boring.</p>
<p>The repo also contains a list of resources with much more extensive project lists. If I'm missing any good resources, or if you've completed any small projects that have been particularly enlightening, please submit a pull request or create an issue! </p>Python Closures and Free Variables2014-04-10T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-04-10:python-closures-and-free-variables.html<p>Today, friends, we will continue to dissect functional programming concepts in Python. We're going to try to figure out what the hell is going on in this chunk of code:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">def</span> <span class="nf">make_contains_function</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="gp">... </span>    <span class="k">def</span> <span class="nf">contains</span><span class="p">(</span><span class="n">s</span><span class="p">):</span>
<span class="gp">... </span>        <span class="k">return</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">s</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">contains</span>
</pre></div>
<p>What happens when we pass <code>make_contains_function</code> a string?</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">make_contains_function</span><span class="p">(</span><span class="s">'a'</span><span class="p">)</span>
<span class="go"><function contains at 0x10a1e2cf8></span>
</pre></div>
<p>We get a function! Whoa. A function that returns a function. Cool. Let's assign this returned function a name and try to use it:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">contains_a</span> <span class="o">=</span> <span class="n">make_contains_function</span><span class="p">(</span><span class="s">'a'</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">contains_a</span>
<span class="go"><function contains at 0x10a1e2c80></span>
<span class="gp">>>> </span><span class="n">contains_a</span><span class="p">(</span><span class="s">'cat'</span><span class="p">)</span>
<span class="go">True</span>
<span class="gp">>>> </span><span class="n">contains_a</span><span class="p">(</span><span class="s">'bro'</span><span class="p">)</span>
<span class="go">False</span>
</pre></div>
<p>We can create a function called <code>contains_a</code> by calling the <code>make_contains_function</code> and passing the string <code>'a'</code> as a parameter. Then, when we pass <code>contains_a</code> a string, the function returns a boolean representing whether <code>'a'</code> is in the string or not.</p>
<p>Let's look at the original code again and try to understand what it does and why it works:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">def</span> <span class="nf">make_contains_function</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="gp">... </span>    <span class="k">def</span> <span class="nf">contains</span><span class="p">(</span><span class="n">s</span><span class="p">):</span>
<span class="gp">... </span>        <span class="k">return</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">s</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">contains</span>
</pre></div>
<p>First let's translate this to English. We're creating a function called <code>make_contains_function</code>, which takes one parameter, <code>x</code>. In the body of the <code>make_contains_function</code>, we create an inner function called <code>contains</code>, which takes one parameter, <code>s</code>. The inner function returns <code>x in s</code>, and then the outer function returns the inner function.</p>
<p>But how does <code>contains</code> have access to <code>x</code>? Shouldn't that throw a <code>NameError</code>? Here's my mental model for how Python looks up the value associated with a name of a variable, <code>x</code>:</p>
<ol>
<li>
<p>Check to see if <code>x</code> is in the <code>locals()</code> dictionary. If it is, then the value of <code>x</code> is the value associated with <code>x</code> in <code>locals()</code>. i.e.:</p>
<div class="highlight"><pre><span class="k">if</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">locals</span><span class="p">():</span>
    <span class="k">return</span> <span class="nb">locals</span><span class="p">()[</span><span class="n">x</span><span class="p">]</span>
</pre></div>
</li>
<li>
<p>Check to see if <code>x</code> is in the <code>globals()</code> dictionary. If it is, then the value of <code>x</code> is the value associated with <code>x</code> in <code>globals()</code>. i.e.:</p>
<div class="highlight"><pre><span class="k">elif</span> <span class="n">x</span> <span class="ow">in</span> <span class="nb">globals</span><span class="p">():</span>
    <span class="k">return</span> <span class="nb">globals</span><span class="p">()[</span><span class="n">x</span><span class="p">]</span>
</pre></div>
</li>
<li>
<p>Check to see if <code>x</code> is in the <code>__builtins__.__dict__</code> dictionary. If it is, then the value of <code>x</code> is the value associated with <code>x</code> in <code>__builtins__.__dict__</code>. i.e.:</p>
<div class="highlight"><pre><span class="k">elif</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">__builtins__</span><span class="o">.</span><span class="n">__dict__</span><span class="p">:</span>
    <span class="k">return</span> <span class="n">__builtins__</span><span class="o">.</span><span class="n">__dict__</span><span class="p">[</span><span class="n">x</span><span class="p">]</span>
</pre></div>
</li>
<li>
<p>Otherwise, throw a <code>NameError</code>.</p>
</li>
</ol>
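<p>Here's that lookup order written out as one runnable function, as a sketch in Python 3. To be clear, <code>lookup</code> is a hypothetical helper that takes the two namespaces as plain dictionaries; it's just this mental model in code, not how the interpreter actually resolves names. In Python 3 the module behind <code>__builtins__</code> is called <code>builtins</code>.</p>

```python
import builtins  # Python 3 name of the module behind __builtins__

def lookup(name, local_ns, global_ns):
    """Hypothetical helper: resolve a name the way the model above describes."""
    if name in local_ns:
        return local_ns[name]
    elif name in global_ns:
        return global_ns[name]
    elif name in vars(builtins):
        return vars(builtins)[name]
    raise NameError("name %r is not defined" % name)

print(lookup('s', {'s': 'cat'}, {}))   # cat
print(lookup('len', {}, {}) is len)    # True: found in the builtins
```

<p>Under this model, <code>lookup('x', {'s': 'cat'}, {})</code> raises a <code>NameError</code>, since <code>x</code> isn't in any of the three dictionaries.</p>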
<p>My mental model for how <code>locals()</code> works is that it returns all local variables, which are defined in the <em>most narrowly-defined</em> current scope. In the case of <code>x</code> in our example, the most narrowly-defined current scope is the function <code>contains</code>. Since <code>x</code> isn't assigned a value within the function <code>contains</code>, <code>locals()</code> won't contain a value for <code>x</code> (based on my mental model).</p>
<p>My model for how <code>globals()</code> works is that it returns the variables which are defined at the module level (i.e. variables which aren't defined within a scope like a function or a class). Since <code>x</code> is defined within a function, namely within the <code>make_contains_function</code>, it won't be included in the <code>globals()</code> dictionary either.</p>
<p><code>x</code> is pretty clearly not defined in <code>__builtins__.__dict__</code>, because it isn't defined in the <code>__builtin__</code> module. (It isn't one of the names that are automatically available any time you run Python.)</p>
<p>Poor <code>x</code>.</p>
<p>So is my mental model correct? If it is, we should be getting a <code>NameError</code> when we execute the <code>contains_a</code> function. Since we're not getting a <code>NameError</code>, something about my mental model must be inaccurate.</p>
<p>Shucks.</p>
<p>Let's try printing the <code>locals()</code> within each of the functions in our code block, to see where <code>x</code> is defined:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">def</span> <span class="nf">make_contains_function</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="gp">... </span>    <span class="k">print</span> <span class="s">"Inside make_contains_function"</span>
<span class="gp">... </span>    <span class="k">print</span> <span class="s">"locals(): "</span><span class="p">,</span> <span class="nb">locals</span><span class="p">()</span>
<span class="gp">... </span>    <span class="k">def</span> <span class="nf">contains</span><span class="p">(</span><span class="n">s</span><span class="p">):</span>
<span class="gp">... </span>        <span class="k">print</span> <span class="s">"Inside contains function"</span>
<span class="gp">... </span>        <span class="k">print</span> <span class="s">"locals(): "</span><span class="p">,</span> <span class="nb">locals</span><span class="p">()</span>
<span class="gp">... </span>        <span class="k">return</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">s</span>
<span class="gp">... </span>    <span class="k">return</span> <span class="n">contains</span>
</pre></div>
<p>If my mental model is correct, <code>x</code> should be returned by <code>locals()</code> within the <code>make_contains_function</code>, but not by <code>locals()</code> within the <code>contains</code> function. Let's put my model to the test!</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">contains_a</span> <span class="o">=</span> <span class="n">make_contains_function</span><span class="p">(</span><span class="s">'a'</span><span class="p">)</span>
<span class="go">Inside make_contains_function</span>
<span class="go">locals(): {'x': 'a'}</span>
<span class="gp">>>> </span><span class="n">contains_a</span><span class="p">(</span><span class="s">'cat'</span><span class="p">)</span>
<span class="go">Inside contains function</span>
<span class="go">locals(): {'x': 'a', 's': 'cat'}</span>
<span class="go">True</span>
</pre></div>
<p>Oh! So <code>x</code> is returned by <code>locals()</code> inside the <code>contains</code> function. That's why we don't get a <code>NameError</code> when we try using <code>x</code>. My mental model of how <code>locals()</code> works and what it returns must be wrong. Let's look at the <a href="https://docs.python.org/2/library/functions.html#locals">documentation</a> for <code>locals()</code>:</p>
<blockquote>
<p>Update and return a dictionary representing the current local symbol table. Free variables are returned by <code>locals()</code> when it is called in function blocks but not in class blocks.</p>
</blockquote>
<p>Hm. What is a "free variable"? Does that apply to our situation? I suspect it does. Either that or my definition of a local variable is wrong. Googling "python free variable" brings us to the trusty Python <a href="https://docs.python.org/2/reference/executionmodel.html">Execution Model</a> page, which I strongly believe every Python programmer should read and re-read often.</p>
<blockquote>
<p>When a name is used in a code block, it is resolved using the nearest enclosing scope. The set of all such scopes visible to a code block is called the block's <em>environment</em>.</p>
<p>If a name is bound in a block, it is a local variable of that block. If a name is bound at the module level, it is a global variable. (The variables of the module code block are local and global.) If a variable is used in a code block but not defined there, it is a <em>free variable</em>.</p>
</blockquote>
<p>Let's apply this information to our example, and list what we know:</p>
<ol>
<li>
<p><code>contains</code> is a function.</p>
</li>
<li>
<p><code>x</code> is a free variable in <code>contains</code>, because it is referenced in <code>contains</code> but isn't defined there.</p>
</li>
<li>
<p>Free variables are not local variables.</p>
</li>
<li>
<p>However, free variables are returned when calling <code>locals()</code> within a function block.</p>
</li>
</ol>
<p>Okay! When Python looks up the name <code>x</code>, it finds a value for it in the <code>locals()</code> dictionary, even though <code>x</code> isn't a local variable. My mental model wasn't <em>too</em> far off. I just need to adjust how I think about how <code>locals()</code> behaves within functions.</p>
<p>And, so that you understand the title of this post, and so that you can sound smart around other programmers, you should know that a function that uses a <em>free variable</em> is called a <em>closure</em>. So, in our example, <code>x</code> is a <em>free variable</em> and the function <code>contains</code> is a <em>closure</em>.</p>
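<p>If you want to poke at this yourself, function objects carry the evidence around with them. This is a small sketch for Python 3 using the <code>__code__.co_freevars</code> and <code>__closure__</code> attributes, which aren't covered elsewhere in this post:</p>

```python
def make_contains_function(x):
    def contains(s):
        return x in s
    return contains

contains_a = make_contains_function('a')

# Names that contains uses but doesn't define itself
print(contains_a.__code__.co_freevars)           # ('x',)

# The cell holding the value x had when contains was created
print(contains_a.__closure__[0].cell_contents)   # a
```

<p>So even after <code>make_contains_function</code> has returned, <code>contains_a</code> still holds on to the value of <code>x</code> it was created with.</p>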
<p>Credit to <a href="https://twitter.com/ballingt">Tom Ballinger</a> for the example code block and for introducing me to <a href="http://www.diveintopython3.net/">Dive Into Python 3</a>, an excellent read and the inspiration for this post.</p>Dissecting the Reduce Function2014-04-07T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-04-07:dissecting-the-reduce-function.html<p>Good morning, <a href="http://www.vice.com/columns/good-morning-sinners-with-warren-ellis">sinners</a>. Today we're going to figure out what the hell is going on inside a Python expression like:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="nb">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">a</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="p">[</span><span class="n">x</span><span class="p">],</span> <span class="n">things</span><span class="p">,</span> <span class="p">[])</span>
</pre></div>
<p>(What <code>things</code> is doesn't matter too much, as long as it's an iterable. We'll look at a more specific example in a bit.)</p>
<p>Mary Rose Cook <a href="http://maryrosecook.com/blog/post/a-practical-introduction-to-functional-programming">defines</a> the <code>reduce</code> function nicely:</p>
<blockquote>
<p>Reduce takes a function and a collection of items. It returns a value that is created by combining the items.</p>
</blockquote>
<p>Here, the function passed as a parameter to the <code>reduce</code> function is a <code>lambda</code> statement, which Mary also defines:</p>
<blockquote>
<p>[A <code>lambda</code> statement is] an anonymous, inlined function [...] The parameters of the <code>lambda</code> are defined to the left of the colon. The function body is defined to the right of the colon. The result of running the function body is (implicitly) returned.</p>
</blockquote>
<p>A caveat: our <code>reduce</code> function also takes in a third parameter, the empty list, which I'll discuss in detail later.</p>
<p>Another caveat: if you're using Python3, <code>reduce</code> is no longer a builtin function. To use it, you'll need to:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">functools</span> <span class="kn">import</span> <span class="nb">reduce</span>
</pre></div>
<p>So, how do we better understand what's happening inside this expression?</p>
<h2 id="a-mental-model">A Mental Model</h2>
<p>Let's look at a specific example of our expression and examine a mental model -- which may or may not be correct -- for what's happening to the values of <code>a</code> and <code>x</code> inside the expression.</p>
<p>I said earlier that <code>things</code> needs to be an iterable. So, presumably, <code>things</code> could be a string. Let's try:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">things</span> <span class="o">=</span> <span class="s">'012'</span>
<span class="gp">>>> </span><span class="nb">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">a</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="p">[</span><span class="n">x</span><span class="p">],</span> <span class="n">things</span><span class="p">,</span> <span class="p">[])</span>
<span class="go">['0', '1', '2']</span>
</pre></div>
<p>What's going on in this statement? My (not functional) mental model for how this works is:</p>
<p>For each <code>x</code> in <code>things</code>, convert <code>x</code> to a list and then append <code>[x]</code> to <code>a</code>, which starts as an empty list on the first iteration (this is because of the third parameter we passed to <code>reduce</code>, which I'll explain later). After each iteration, <code>a</code> grows one element longer (because we're adding whatever <code>[x]</code> is during the iteration to <code>a</code>). Return the result of the last iteration.</p>
<p>So, the first time we pass <code>a</code> and <code>x</code> through to the <code>lambda</code> statement:</p>
<div class="highlight"><pre><span class="n">x</span> <span class="o">==</span> <span class="n">things</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="o">==</span> <span class="s">'0'</span>
<span class="n">a</span> <span class="o">==</span> <span class="p">[]</span>
</pre></div>
<p>The <code>lambda</code> then implicitly returns the list given by:</p>
<div class="highlight"><pre><span class="n">a</span> <span class="o">+</span> <span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="o">==</span> <span class="p">[]</span> <span class="o">+</span> <span class="p">[</span><span class="n">things</span><span class="p">[</span><span class="mi">0</span><span class="p">]]</span>
<span class="o">==</span> <span class="p">[]</span> <span class="o">+</span> <span class="p">[</span><span class="s">'0'</span><span class="p">]</span>
<span class="o">==</span> <span class="p">[</span><span class="s">'0'</span><span class="p">]</span>
</pre></div>
<p>The second time we pass <code>a</code> and <code>x</code> to <code>lambda</code>, we have:</p>
<div class="highlight"><pre><span class="n">x</span> <span class="o">==</span> <span class="n">things</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="o">==</span> <span class="s">'1'</span>
<span class="c"># set `a` to the value the previous step implicitly returned</span>
<span class="n">a</span> <span class="o">==</span> <span class="p">[</span><span class="s">'0'</span><span class="p">]</span>
<span class="n">a</span> <span class="o">+</span> <span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="o">==</span> <span class="p">[</span><span class="s">'0'</span><span class="p">]</span> <span class="o">+</span> <span class="p">[</span><span class="s">'1'</span><span class="p">]</span>
<span class="o">==</span> <span class="p">[</span><span class="s">'0'</span><span class="p">,</span> <span class="s">'1'</span><span class="p">]</span>
</pre></div>
<p>For the third and final step, we have:</p>
<div class="highlight"><pre><span class="n">x</span> <span class="o">==</span> <span class="n">things</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span>
<span class="o">==</span> <span class="s">'2'</span>
<span class="n">a</span> <span class="o">==</span> <span class="p">[</span><span class="s">'0'</span><span class="p">,</span> <span class="s">'1'</span><span class="p">]</span>
<span class="n">a</span> <span class="o">+</span> <span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="o">==</span> <span class="p">[</span><span class="s">'0'</span><span class="p">,</span> <span class="s">'1'</span><span class="p">]</span> <span class="o">+</span> <span class="p">[</span><span class="s">'2'</span><span class="p">]</span>
<span class="o">==</span> <span class="p">[</span><span class="s">'0'</span><span class="p">,</span> <span class="s">'1'</span><span class="p">,</span> <span class="s">'2'</span><span class="p">]</span>
</pre></div>
<p>So why did we have to pass <code>[]</code> as the third parameter to <code>reduce</code>? Well, if <code>reduce</code> isn't given a third parameter, for the first iteration it sets <code>a = x</code>, and then jumps to the second iteration. So, in my mental model, the first iteration would look like:</p>
<div class="highlight"><pre><span class="n">x</span> <span class="o">==</span> <span class="n">things</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="o">==</span> <span class="s">'0'</span>
<span class="n">a</span> <span class="o">==</span> <span class="n">x</span>
<span class="o">==</span> <span class="s">'0'</span>
<span class="c"># no calculation of a + [x]</span>
<span class="c"># instead, implicitly returns the value of `a`, which is '0'</span>
</pre></div>
<p>And then in the second iteration:</p>
<div class="highlight"><pre><span class="n">x</span> <span class="o">==</span> <span class="n">things</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="o">==</span> <span class="s">'1'</span>
<span class="c"># set `a` to the value implicitly returned from the first iteration</span>
<span class="n">a</span> <span class="o">==</span> <span class="s">'0'</span>
<span class="n">a</span> <span class="o">+</span> <span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="o">==</span> <span class="s">'0'</span> <span class="o">+</span> <span class="p">[</span><span class="s">'1'</span><span class="p">]</span>
</pre></div>
<p>This results in an error because you can't concatenate a string and a list.</p>
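<p>To make the model concrete, here's a sketch of it as a plain loop, written for Python 3. <code>my_reduce</code> and <code>_MISSING</code> are names I made up; this follows the mental model above and isn't the real implementation of <code>reduce</code>.</p>

```python
from functools import reduce

_MISSING = object()  # sentinel meaning "no third parameter was passed"

def my_reduce(function, iterable, initializer=_MISSING):
    it = iter(iterable)
    if initializer is _MISSING:
        # No third parameter: the first item becomes `a` and we skip ahead
        try:
            a = next(it)
        except StopIteration:
            raise TypeError("reduce of an empty iterable with no initial value")
    else:
        a = initializer
    for x in it:
        a = function(a, x)   # the returned value becomes the next `a`
    return a

print(my_reduce(lambda a, x: a + [x], '012', []))   # ['0', '1', '2']
print(reduce(lambda a, x: a + [x], '012', []))      # ['0', '1', '2']
```

<p>Calling <code>my_reduce(lambda a, x: a + [x], '012')</code> without the <code>[]</code> fails with a <code>TypeError</code> on the first call to the <code>lambda</code>, because by then <code>a</code> is the string <code>'0'</code>.</p>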
<p>Is my mental model correct? We can tell that the model ultimately returns the same value as the expression, but how can we tell if these are really the values of <code>a</code> and <code>x</code> at each step?</p>
<h2 id="testing-the-model">Testing the Model</h2>
<p>We'll need to figure out some clever way of printing or storing the values of <code>a</code> and <code>x</code> at each step.</p>
<p>Let's recall the original expression:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="nb">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">a</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="p">[</span><span class="n">x</span><span class="p">],</span> <span class="n">things</span><span class="p">,</span> <span class="p">[])</span>
</pre></div>
<p>Here we append the value of <code>x</code> to <code>a</code>. Could we also append the value of <code>a</code> itself? Then maybe we could see what <code>a</code> is at each step. Let's try:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="nb">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">a</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="n">a</span><span class="p">,</span> <span class="s">'x'</span> <span class="p">:</span> <span class="n">x</span><span class="p">}],</span> <span class="n">things</span><span class="p">,</span> <span class="p">[])</span>
</pre></div>
<p>There's a lot going on in that expression, so let's break it up:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">things</span> <span class="o">=</span> <span class="s">'012'</span>
<span class="gp">>>> </span><span class="n">f</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">a</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="n">a</span><span class="p">,</span> <span class="s">'x'</span> <span class="p">:</span> <span class="n">x</span><span class="p">}]</span>
<span class="gp">>>> </span><span class="nb">reduce</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">things</span><span class="p">,</span> <span class="p">[])</span>
</pre></div>
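<p>As a cross-check, a different (and less clever) way to watch the values is to swap the <code>lambda</code> for a named function that records every call. This sketch is for Python 3 and goes back to the original <code>a + [x]</code> expression; <code>traced</code> and <code>calls</code> are hypothetical names.</p>

```python
from functools import reduce

calls = []  # one (a, x) pair per call, in order

def traced(a, x):
    calls.append((list(a), x))   # store a copy of a as it was at this step
    return a + [x]

result = reduce(traced, '012', [])

for a, x in calls:
    print(a, x)
# [] 0
# ['0'] 1
# ['0', '1'] 2
```

<p>If the mental model is right, these pairs should match the values of <code>a</code> and <code>x</code> in the three steps we worked out by hand.</p>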
<p>Let's apply the mental model to help us understand this example.</p>
<h4 id="step-1">Step 1:</h4>
<div class="highlight"><pre><span class="n">x</span> <span class="o">==</span> <span class="n">things</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="o">==</span> <span class="s">'0'</span>
<span class="n">a</span> <span class="o">==</span> <span class="p">[]</span>
<span class="c"># this step implicitly returns the value:</span>
<span class="n">a</span> <span class="o">+</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="n">a</span><span class="p">,</span> <span class="s">'x'</span><span class="p">:</span> <span class="n">x</span><span class="p">}]</span>
<span class="o">==</span> <span class="p">[]</span> <span class="o">+</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}]</span>
<span class="o">==</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}]</span>
</pre></div>
<h4 id="step-2">Step 2:</h4>
<div class="highlight"><pre><span class="n">x</span> <span class="o">==</span> <span class="n">things</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="o">==</span> <span class="s">'1'</span>
<span class="n">a</span> <span class="o">==</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}]</span>
<span class="n">a</span> <span class="o">+</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="n">a</span><span class="p">,</span> <span class="s">'x'</span><span class="p">:</span> <span class="n">x</span><span class="p">}]</span>
<span class="o">==</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}]</span> <span class="o">+</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">}]</span>
<span class="o">==</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">},</span> <span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">}]</span>
</pre></div>
<h4 id="step-3">Step 3:</h4>
<div class="highlight"><pre><span class="n">x</span> <span class="o">==</span> <span class="n">things</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span>
<span class="o">==</span> <span class="s">'2'</span>
<span class="n">a</span> <span class="o">==</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">},</span> <span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">}]</span>
<span class="n">a</span> <span class="o">+</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="n">a</span><span class="p">,</span> <span class="s">'x'</span><span class="p">:</span> <span class="n">x</span><span class="p">}]</span>
<span class="o">==</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">},</span> <span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">}]</span>
<span class="o">+</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">},</span> <span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'2'</span><span class="p">}]</span>
<span class="o">==</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">},</span>
<span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">},</span>
<span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">},</span> <span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'2'</span><span class="p">}]</span>
</pre></div>
<p>This is pretty complicated and difficult to read. You might want to write out the steps for yourself. I did (obviously). And I screwed it up the first few times.</p>
<p>Let's see if our result from step 3 is the same as what Python evaluates for our expression:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">answer</span> <span class="o">=</span> <span class="nb">reduce</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">things</span><span class="p">,</span> <span class="p">[])</span>
<span class="gp">>>> </span><span class="n">result</span> <span class="o">=</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">},</span>
<span class="gp">... </span><span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">},</span>
<span class="gp">... </span><span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">},</span> <span class="p">{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[{</span><span class="s">'a'</span> <span class="p">:</span> <span class="p">[],</span> <span class="s">'x'</span> <span class="p">:</span> <span class="s">'0'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">}],</span> <span class="s">'x'</span><span class="p">:</span> <span class="s">'2'</span><span class="p">}]</span>
<span class="gp">>>> </span><span class="n">answer</span> <span class="o">==</span> <span class="n">result</span>
<span class="go">True</span>
</pre></div>
<p>Victory! But what does this mean?</p>
<h2 id="wait-what-were-we-trying-to-do-again">Wait, what were we trying to do again?</h2>
<p>We've been attempting to understand what happens inside a <code>reduce</code> function. We developed a mental model for what happens to the values of <code>a</code> and <code>x</code> at each iteration, and we came up with a way to test to see if the mental model was accurate. And it was!</p>
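<p>One way to write that mental model down is as a plain loop. This is only a sketch in Python 3 (where <code>reduce</code> lives in <code>functools</code>); the names <code>traced_reduce</code> and <code>append_item</code> are made up for illustration, and the loop is my assumption about how <code>reduce</code> behaves, not its real implementation:</p>

```python
from functools import reduce  # in Python 3, reduce lives in functools


def traced_reduce(function, iterable, initial):
    """Behave like reduce, but also record (a, x) as seen at each step."""
    accumulator = initial
    steps = []
    for item in iterable:
        # Remember what the function is about to receive.
        steps.append((accumulator, item))
        accumulator = function(accumulator, item)
    return accumulator, steps


def append_item(a, x):
    # Same shape as the lambda in the post: build a new list each time.
    return a + [x]


result, steps = traced_reduce(append_item, "012", [])
for a, x in steps:
    print("a =", a, "x =", x)

# The hand-rolled loop should agree with the real reduce.
print(result == reduce(append_item, "012", []))
```

<p>Because <code>append_item</code> returns a new list rather than mutating <code>a</code>, each recorded step keeps the value <code>a</code> had at that moment.</p>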
<p>I think this might be one of those posts that is less about what I intended it to be about (dissecting the <code>reduce</code> function) and more about the approach I would take to accomplish what I intended it to be about. Meta, sinners.</p>After Six Months of Learning The Python, I Can Finally Print "Hello World!"2014-04-01T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-04-01:after-six-months-of-learning-the-python-i-can-finally-print-hello-world.html<p>I've been trying to learn how to write a function in the Python that prints two words, only two, "Hello World." I've been trying for six months. And today, friends, I've done it.</p>
<p>I've read so many places that you can make print statements in the Python like so:</p>
<div class="highlight"><pre><span class="k">print</span> <span class="s">"Hello World!"</span>
</pre></div>
<p>But where do you type this in? How do you tell the computer, the Python, the whatever, that you want it to take this sequence of characters, interpret it as code, and execute it?</p>
<p>Well, I hope you are sitting down, because I've found the answer: the <a href="https://docs.python.org/2/reference/simple_stmts.html#the-exec-statement"><code>exec</code></a> statement (or function if you're into the Python 3)! </p>
<p>Let's say you want the Python to execute the definition of a function like:</p>
<div class="highlight"><pre><span class="k">def</span> <span class="nf">foo</span><span class="p">():</span>
    <span class="k">print</span> <span class="s">"Hello World!"</span>
</pre></div>
<p>You can accomplish this by firing up the Python interpreter and typing:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">exec</span><span class="p">(</span><span class="s">"def foo():</span><span class="se">\n</span><span class="s"> print 'Hello World!'</span><span class="se">\n</span><span class="s">"</span><span class="p">)</span>
</pre></div>
<p><a href="https://docs.python.org/2/reference/simple_stmts.html#the-exec-statement"><code>exec</code></a> here takes in a string of the Python code and executes it! The <code>\n</code> and the whitespace between the <code>\n</code> and the <code>print</code> statement are very important! The Python needs those to understand where the function ends.</p>
<p>So now, we can see that <code>foo</code> exists and is a function that prints "Hello World!"</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">foo</span>
<span class="go"><function foo at 0x10b611230></span>
<span class="gp">>>> </span><span class="n">foo</span><span class="p">()</span>
<span class="go">Hello World!</span>
</pre></div>
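<p>If you're into the Python 3, here is a rough equivalent. It is a sketch, not a transcript: <code>exec</code> is a function there, the <code>namespace</code> dictionary is my own addition so we can see exactly where <code>foo</code> ends up, and the function returns the string instead of printing it so it is easy to check:</p>

```python
# Python 3 sketch: exec is a function, and we hand it an explicit namespace.
namespace = {}
exec("def foo():\n    return 'Hello World!'\n", namespace)

foo = namespace["foo"]  # the function object that exec defined
print(foo())
```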
<p><code>exec</code> can also take in <code>code</code> objects. We can make a <code>code</code> object by using the <a href="https://docs.python.org/2/library/functions.html#compile"><code>compile</code></a> function:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">c</span> <span class="o">=</span> <span class="nb">compile</span><span class="p">(</span><span class="s">"def bar():</span><span class="se">\n</span><span class="s"> print 'Hello World!'</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="s">''</span><span class="p">,</span> <span class="s">'exec'</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">c</span>
<span class="go"><code object <module> at 0x10b5f88b0, file "", line 1></span>
</pre></div>
<p><code>compile</code> takes in a string of code, a filename (we can just pass it the empty string), and a mode, which can be 'exec', 'eval', or 'single'.</p>
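<p>To get a feel for the three modes, here is a small Python 3 sketch. The filename <code>"&lt;example&gt;"</code> and the variable names are arbitrary choices of mine. As far as I can tell, 'eval' is for a single expression, 'exec' is for a block of statements, and 'single' is for one interactive statement, which echoes expression values the way the interpreter prompt does:</p>

```python
import contextlib
import io

# 'eval' mode: one expression, and eval() hands back its value.
expression_code = compile("1 + 2", "<example>", "eval")
value = eval(expression_code)

# 'exec' mode: statements; results land in the namespace we pass in.
statement_code = compile("total = 1 + 2", "<example>", "exec")
namespace = {}
exec(statement_code, namespace)

# 'single' mode: one interactive statement; an expression's value is
# echoed to standard output, so we capture that output here.
single_code = compile("1 + 2", "<example>", "single")
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(single_code, {})
echoed = buffer.getvalue()

print(value, namespace["total"], repr(echoed))
```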
<p>Let's pass <code>c</code> into <code>exec</code> to execute the code and define our <code>bar</code> function:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">exec</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">bar</span><span class="p">()</span>
<span class="go">Hello World!</span>
</pre></div>
<p>Yes! We did it again! This is a victorious day. </p>
<p>Interestingly enough, functions themselves have <code>code</code> objects assigned to them as attributes:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">bar</span><span class="o">.</span><span class="n">__code__</span>
<span class="go"><code object bar at 0x10b5f81b0, file "", line 1></span>
</pre></div>
<p>And we can overwrite these <code>code</code> objects with our own <code>code</code> objects! </p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">new_code</span> <span class="o">=</span> <span class="nb">compile</span><span class="p">(</span><span class="s">"print 'Hello, We Are Victorious Beings!'</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="s">''</span><span class="p">,</span> <span class="s">'exec'</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">new_code</span>
<span class="go"><code object <module> at 0x10b5f81b0, file "", line 1></span>
<span class="gp">>>> </span><span class="n">bar</span><span class="o">.</span><span class="n">__code__</span> <span class="o">=</span> <span class="n">new_code</span>
<span class="gp">>>> </span><span class="n">bar</span><span class="o">.</span><span class="n">__code__</span>
<span class="go"><code object <module> at 0x10b5f81b0, file "", line 1></span>
<span class="gp">>>> </span><span class="n">bar</span><span class="p">()</span>
<span class="go">Hello, We Are Victorious Beings!</span>
</pre></div>
<p>Neat! We don't need the <code>"def bar():"</code> part in the string we pass to <code>compile</code> because at this point, <code>bar</code> already exists and we're just overwriting the code in the body of the <code>bar</code> function.</p>
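<p>A tamer variation on the same trick, sketched for Python 3: instead of compiling a string, borrow the <code>code</code> object from another ordinary function. The function <code>victorious</code> is a made-up name, and both functions return their strings rather than printing them so the swap is easy to check:</p>

```python
def bar():
    return "Hello World!"


def victorious():
    return "Hello, We Are Victorious Beings!"


before = bar()
# bar keeps its own name, but from now on it runs victorious's code.
bar.__code__ = victorious.__code__
after = bar()

print(before, "->", after)
```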
<p>Share in the comments if you know of any other ways to print statements in the Python!</p>A Love Affair With Broken Things2014-03-31T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-03-31:a-love-affair-with-broken-things.html<p>I love broken things, unfinished things, breaking things, unfinishing things. Broken and unfinished things allow you to see the process in which they were created; their most intimate secrets are exposed.</p>
<h2 id="broken-statues">Broken Statues</h2>
<p>A week ago, inside <a href="http://www.metmuseum.org/en">The Metropolitan Museum of Art</a>, my love for broken things was realized. Just <em>look</em> at these. They're so <em>naked, vulnerable</em>.</p>
<p><img alt="some ladies, broken" src="/images/broken_ladies.JPG" title="broken ladies" />
<img alt="a face, broken" src="/images/broken_face.JPG" title="broken face" />
<img alt="a torso, broken" src="/images/broken_torso.JPG" title="broken torso" /></p>
<p>Okay so some of them are literally naked, but you get the point. When a statue is broken, you can sneak a peek inside! You get so many clues about how it was made! Is it hollow? What's it made out of? Is the material on the outside the same as the inside? Does it have a frame?</p>
<h2 id="breaking-code">Breaking Code</h2>
<p>While I can't bring myself to break art to get clues about the process of its creation, I <em>can</em> break code! Breaking code is free and doesn't hurt anyone! (As long as you keep it local...) I do this quite a bit as a method of learning - removing the pieces of code that you don't understand reveals the purpose of those pieces. It's like removing the arm of a statue to look inside.</p>
<p>Let's look at some code from Mary Rose Cook's functional programming <a href="http://maryrosecook.com/blog/post/a-practical-introduction-to-functional-programming">tutorial</a> (which is amazing, and you should absolutely read it and do the exercises and spend time understanding it completely if you're at all interested in functional programming). We won't understand the code at first (or at least <em>I</em> won't), but we'll take apart the pieces of the code in an attempt to better understand their purpose.</p>
<p>Mary aptly explains what Python's builtin <code>map</code> function does:</p>
<blockquote>
<p>Map takes a function and a collection of items. It makes a new, empty collection, runs the function on each item in the original collection and inserts each return value into the new collection. It returns the new collection.</p>
</blockquote>
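<p>Mary's description translates almost word for word into a loop. This is a sketch of the idea, not how Python actually implements <code>map</code>, and <code>my_map</code> is a name I made up:</p>

```python
def my_map(function, items):
    """A hand-rolled map that follows the description step by step."""
    results = []                        # make a new, empty collection
    for item in items:                  # visit each item in the original
        results.append(function(item))  # insert each return value
    return results                      # return the new collection


name_lengths = my_map(len, ["Mary", "Isla", "Sam"])
print(name_lengths)
```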
<p>Her first example for showing how <code>map</code> works is fairly straightforward:</p>
<blockquote>
<p>This is a simple map that takes a list of names and returns a list of the lengths of those names:</p>
<div class="highlight"><pre><span class="n">name_lengths</span> <span class="o">=</span> <span class="nb">map</span><span class="p">(</span><span class="nb">len</span><span class="p">,</span> <span class="p">[</span><span class="s">"Mary"</span><span class="p">,</span> <span class="s">"Isla"</span><span class="p">,</span> <span class="s">"Sam"</span><span class="p">])</span>
<span class="k">print</span> <span class="n">name_lengths</span>
<span class="c"># => [4, 4, 3]</span>
</pre></div>
</blockquote>
<p>In the second example of <code>map</code>, we see that Mary uses a <code>lambda</code> function:</p>
<blockquote>
<p>This is a map that squares every number in the passed collection:</p>
<div class="highlight"><pre><span class="n">squares</span> <span class="o">=</span> <span class="nb">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span> <span class="o">*</span> <span class="n">x</span><span class="p">,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
<span class="k">print</span> <span class="n">squares</span>
<span class="c"># => [0, 1, 4, 9, 16]</span>
</pre></div>
<p>This map doesn’t take a named function. It takes an anonymous, inlined function defined with lambda. The parameters of the lambda are defined to the left of the colon. The function body is defined to the right of the colon. The result of running the function body is (implicitly) returned.</p>
</blockquote>
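<p>To make "anonymous, inlined function" concrete, here is a sketch comparing the <code>lambda</code> with a named function that does the same job. It is written for Python 3, where <code>map</code> returns an iterator, so I wrap it in <code>list</code>; the name <code>square</code> is my own:</p>

```python
def square(x):
    # The explicit return here is what the lambda does implicitly.
    return x * x


numbers = [0, 1, 2, 3, 4]
named = list(map(square, numbers))
anonymous = list(map(lambda x: x * x, numbers))

print(named, anonymous)
```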
<p>But <em>why</em> does Mary use a <code>lambda</code> function here? Let's spend some time breaking this code and reconstructing it to understand why the <code>lambda</code> function is used.</p>
<p>First let's try removing the <code>lambda</code> and seeing what happens:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">squares</span> <span class="o">=</span> <span class="nb">map</span><span class="p">(</span><span class="n">x</span> <span class="o">*</span> <span class="n">x</span><span class="p">,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
<span class="gt">Traceback (most recent call last):</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>, in <span class="n"><module></span>
<span class="gr">NameError</span>: <span class="n">name 'x' is not defined</span>
</pre></div>
<p>Okay. <code>x</code> is not defined. That makes sense, because <code>x</code> isn't in our <code>locals</code> or our <code>globals</code> or our <code>builtins</code>. Remember that when Python sees the name of a variable, it looks in those three places for a definition of that variable. If Python doesn't find the variable in any of those places, it raises a <code>NameError</code>. <code>lambda</code> must temporarily add variables (here, <code>x</code>) to our namespace and then throw them away.</p>
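<p>We can poke at that guess a little. The sketch below (Python 3, with variable names of my own choosing) catches the <code>NameError</code>, then calls a <code>lambda</code> and checks whether <code>x</code> was left behind in our globals afterwards:</p>

```python
try:
    x * x  # there is no x in locals, globals, or builtins
except NameError as error:
    message = str(error)

square = lambda x: x * x  # x only exists while the lambda body runs
value = square(3)
leaked = "x" in globals()  # did calling the lambda leave an x behind?

print(message, value, leaked)
```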
<p>We removed the arm of the statue and a <code>NameError</code> was revealed. Cool. Now let's try naively reconstructing the statue.</p>
<p>What if we tried using the <code>**</code> operator? Can we pass something like <code>**2</code> as the function for <code>map</code>? Let's try:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">squares</span> <span class="o">=</span> <span class="nb">map</span><span class="p">(</span><span class="o">**</span><span class="mi">2</span><span class="p">,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>
<span class="n">squares</span> <span class="o">=</span> <span class="nb">map</span><span class="p">(</span><span class="o">**</span><span class="mi">2</span><span class="p">,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
<span class="o">^</span>
<span class="gr">SyntaxError</span>: <span class="n">invalid syntax</span>
</pre></div>
<p>This <code>SyntaxError</code> makes me think that <code>**</code> is part of a statement, defined in Python's Grammar file, and the way that we typed our code is in violation of the definition of that statement.</p>
<p>I am going to cheat a bit here. I am going to present something I found on the internet that helps us understand this <code>SyntaxError</code> without showing how I knew what to google to get the answer. At some point I'll write about, given a <code>SyntaxError</code>, how we can find the relevant rules defined in Python's Grammar, understand which rules we're violating, and adjust our code to obey. But not today.</p>
<p>So the short story is I did some research to figure out how Python's <code>**</code> operator is <a href="https://docs.python.org/2/reference/expressions.html#the-power-operator">defined</a>:</p>
<div class="highlight"><pre><span class="n">power</span> <span class="o">::=</span> <span class="n">primary</span> <span class="p">[</span><span class="s">"**"</span> <span class="n">u_expr</span><span class="p">]</span>
</pre></div>
<p>The important thing to note is that any time Python sees <code>**</code> in this context, it expects a thing called a <a href="https://docs.python.org/2/reference/expressions.html#primaries"><code>primary</code></a> to come before it and a thing called a <a href="https://docs.python.org/2/reference/expressions.html#unary-arithmetic-and-bitwise-operations"><code>u_expr</code></a> to come after it. We can tell we violated this rule without even understanding what a <code>primary</code> or a <code>u_expr</code> is. We tried typing <code>**2</code>, which doesn't include anything that could be interpreted as a <code>primary</code> before the <code>**</code>.</p>
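<p>Here is one way to check that rule without running anything: ask <code>compile</code> whether a string parses as an expression. This is a Python 3 sketch, and the helper <code>compiles</code> is a name I invented. It only tests the bare expressions <code>**2</code> and <code>x**2</code>, not the full <code>map(...)</code> call from above:</p>

```python
def compiles(source):
    """Return True if source parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


missing_left = compiles("**2")  # nothing before the ** operator
complete = compiles("x**2")     # a name on the left, a number on the right

print(missing_left, complete)
```

<p>Note that <code>x**2</code> compiles even though <code>x</code> isn't defined: parsing only cares about the shape of the code, and the <code>NameError</code> would only show up when the code runs.</p>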
<p>Okay. So we can't reconstruct Mary's function using <code>**</code> instead of <code>lambda</code>.</p>
<p>What else could we try instead of a lambda function? Is there a function already defined in Python that does the same thing as the operator <code>**</code> but in function syntax?</p>
<p>Let's <a href="https://www.google.com/search?q=python+power+operator+function&oq=python+power+operator+function&aqs=chrome..69i57.426j0j1&sourceid=chrome&espv=210&es_sm=91&ie=UTF-8">google</a> "python power operator function." We quickly discover that there's a builtin <code>pow</code> function that takes two parameters, <code>x</code> and <code>y</code>, and returns <code>x**y</code>. Cool! So <code>pow(x,2)</code> should return the same thing as <code>x**2</code>.</p>
<p>Does the <code>pow</code> function work in our <code>map</code> function? Let's try!</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">squares</span> <span class="o">=</span> <span class="nb">map</span><span class="p">(</span><span class="nb">pow</span><span class="p">,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
<span class="gt">Traceback (most recent call last):</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>, in <span class="n"><module></span>
<span class="gr">TypeError</span>: <span class="n">pow expected at least 2 arguments, got 1</span>
</pre></div>
<p>Oh, right. Derp. We need to pass <code>2</code> to <code>pow</code>, in addition to each element in our list. In the <a href="https://docs.python.org/2.7/library/functions.html#map">documentation</a> for <code>map</code>, we see that if the function takes two arguments, we need to pass it two iterables. So we could do something kind of dumb like:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">squares</span> <span class="o">=</span> <span class="nb">map</span><span class="p">(</span><span class="nb">pow</span><span class="p">,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">])</span>
<span class="gp">>>> </span><span class="n">squares</span>
<span class="go">[0, 1, 4, 9, 16]</span>
</pre></div>
<p>It works, but it's pretty ugly compared to the original:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">squares</span> <span class="o">=</span> <span class="nb">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span> <span class="o">*</span> <span class="n">x</span><span class="p">,</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span>
</pre></div>
<p>It seems silly to use a function, <code>pow</code>, that takes two arguments, when one of the arguments we pass it is always the same.</p>
<p>Ohhh.</p>
<p>Maybe that's why Mary used <code>lambda</code>! To create a function that works kind of like <code>pow</code> but just takes one argument!</p>
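<p>To see that idea side by side with the two-list version, here is a Python 3 sketch. The names are mine, and <code>itertools.repeat</code> is swapped in for the hand-typed list of twos, which works in Python 3 because <code>map</code> stops at the shortest iterable:</p>

```python
from itertools import repeat

numbers = [0, 1, 2, 3, 4]

# Wrap pow in a one-argument function that always squares.
square = lambda x: pow(x, 2)
with_lambda = list(map(square, numbers))

# The two-iterable approach, with repeat(2) supplying the endless twos.
with_repeat = list(map(pow, numbers, repeat(2)))

print(with_lambda, with_repeat)
```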
<p>So we broke the statue, attempted to reconstruct it, and then wound up with something way uglier than the original. And thus, through breaking Mary's code, her design decisions were revealed! And now we have a better understanding of why the original process was used!</p>
<p>Breaking things is fucking rad.</p>
<p><strong>Disclaimer:</strong> This post is not intended to show the most pythonic way of squaring a list of integers. Instead, it is intended to show that we can discover how and why a code block works by exploring what happens when we remove chunks of it.</p>What's the deal with __builtins__ vs __builtin__2014-03-23T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-03-23:whats-the-deal-with-builtins-vs-builtin.html<p>Seriously, what's the difference? When you first fire up the Python interpreter, <code>__builtins__</code> is in your namespace for free:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="nb">globals</span><span class="p">()</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span>
<span class="go">['__builtins__', '__name__', '__doc__', '__package__']</span>
<span class="gp">>>> </span><span class="n">__builtins__</span>
<span class="go"><module '__builtin__' (built-in)></span>
</pre></div>
<p>But it appears to be the <code>__builtin__</code> module (singular)! If we import <code>__builtin__</code> and compare the two:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">__builtin__</span>
<span class="gp">>>> </span><span class="n">__builtin__</span> <span class="ow">is</span> <span class="n">__builtins__</span>
<span class="go">True</span>
</pre></div>
<p>Hrm. So they are both names that point to the same object, the module <code>__builtin__</code>. Weird. Why does Python do this? Do they always behave the same?</p>
<p>I read on <a href="http://stackoverflow.com/questions/11181519/python-whats-the-difference-between-builtin-and-builtins">StackOverflow</a> that</p>
<blockquote>
<p>By default, when in the <code>__main__</code> module, <code>__builtins__</code> is the built-in module <code>__builtin__</code> (note: no 's'); when in any other module, <code>__builtins__</code> is an alias for the dictionary of the <code>__builtin__</code> module itself.</p>
</blockquote>
<p>What. What does that mean.</p>
<p>This talk of the "<code>__main__</code> module" and "any other module" reminds me of a sequence of words that I've known for quite a while, but haven't completely grokked:</p>
<blockquote>
<p>We can access the name of the current module with the builtin variable <code>__name__</code>.</p>
</blockquote>
<p>You're probably familiar with the related canonical statement:</p>
<div class="highlight"><pre><span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span>
    <span class="n">main</span><span class="p">()</span>
</pre></div>
<p>But what does "current module" mean? What does the <code>__name__</code> variable look like when it does not equal <code>__main__</code>?</p>
<p>I happen to know, because I've obsessively read about the <code>import</code> statement, another sequence of words:</p>
<blockquote>
<p>Any code executed as a result of an <code>import</code> isn't executed in the <code>__main__</code> module.</p>
</blockquote>
<p>Let's use these bits of knowledge to observe the behavior of <code>__builtins__</code> both inside and outside of the <code>__main__</code> module. We can also check out the <code>__name__</code> variable while we're at it.</p>
<p>First, let's make a script, <code>a.py</code>, which will allow us to observe the behavior of <code>__builtin__</code>, <code>__builtins__</code>, and <code>__name__</code>.</p>
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">__builtin__</span>
<span class="k">print</span> <span class="s">"In a"</span>
<span class="k">print</span> <span class="s">"__name__ is:"</span><span class="p">,</span> <span class="n">__name__</span>
<span class="k">print</span> <span class="s">"__builtin__ is __builtins__:"</span><span class="p">,</span> <span class="n">__builtin__</span> <span class="ow">is</span> <span class="n">__builtins__</span>
<span class="k">print</span> <span class="s">"type(__builtin__):"</span><span class="p">,</span> <span class="nb">type</span><span class="p">(</span><span class="n">__builtin__</span><span class="p">)</span>
<span class="k">print</span> <span class="s">"type(__builtins__):"</span><span class="p">,</span> <span class="nb">type</span><span class="p">(</span><span class="n">__builtins__</span><span class="p">)</span>
</pre></div>
<p>Let's see what happens when we execute <code>a.py</code>:</p>
<div class="highlight"><pre><span class="gp">$</span> python a.py
<span class="go">In a</span>
<span class="go">__name__ is: __main__</span>
<span class="go">__builtin__ is __builtins__: True</span>
<span class="go">type(__builtin__): <type 'module'></span>
<span class="go">type(__builtins__): <type 'module'></span>
</pre></div>
<p>Okay. So we're in the <code>__main__</code> module, and in here <code>__builtin__</code> is pointing to the same module object as <code>__builtins__</code>.</p>
<p>What happens if we <code>import a</code> in another script? The code in <code>a</code> will execute, but it won't be executed within the <code>__main__</code> module. Instead, it'll be executed within the <code>a</code> module. Let's write another script, <code>b.py</code>, to find out what happens to <code>__builtins__</code> outside of <code>__main__</code>:</p>
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">__builtin__</span>
<span class="k">print</span> <span class="s">"In b, before importing a"</span>
<span class="c"># the output from this should be the same as when we ran</span>
<span class="c"># $ python a.py</span>
<span class="k">print</span> <span class="s">"__name__ is:"</span><span class="p">,</span> <span class="n">__name__</span>
<span class="k">print</span> <span class="s">"__builtin__ is __builtins__:"</span><span class="p">,</span> <span class="n">__builtin__</span> <span class="ow">is</span> <span class="n">__builtins__</span>
<span class="k">print</span> <span class="s">"type(__builtin__):"</span><span class="p">,</span> <span class="nb">type</span><span class="p">(</span><span class="n">__builtin__</span><span class="p">)</span>
<span class="k">print</span> <span class="s">"type(__builtins__):"</span><span class="p">,</span> <span class="nb">type</span><span class="p">(</span><span class="n">__builtins__</span><span class="p">)</span>
<span class="k">print</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span>
<span class="kn">import</span> <span class="nn">a</span>
<span class="c"># code from a will execute here</span>
</pre></div>
<p>Let's see what happens when we run <code>b.py</code>:</p>
<div class="highlight"><pre><span class="gp">$</span> python b.py
<span class="go">In b, before importing a</span>
<span class="go">__name__ is: __main__</span>
<span class="go">__builtin__ is __builtins__: True</span>
<span class="go">type(__builtin__): <type 'module'></span>
<span class="go">type(__builtins__): <type 'module'></span>
<span class="go">In a</span>
<span class="go">__name__ is: a</span>
<span class="go">__builtin__ is __builtins__: False</span>
<span class="go">type(__builtin__): <type 'module'></span>
<span class="go">type(__builtins__): <type 'dict'></span>
</pre></div>
<p>Aha. So when we're outside the context of the <code>__main__</code> module, <code>__name__</code> is just equal to the name of the module where code is currently being executed. That seems logical. And outside of <code>__main__</code>, <code>__builtins__</code> is a dict, rather than a module.</p>
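<p>For the curious, here's a minimal sketch of the same <code>__name__</code> experiment squeezed into a single file. It assumes Python 3 (everything above ran under Python 2), and the <code>SOURCE</code> string, the <code>seen_name</code> variable, and the temporary <code>a.py</code> are placeholders I made up for illustration. It runs one file as a script with <code>runpy.run_path</code> and then imports that same file with <code>importlib</code>:</p>

```python
import importlib.util
import os
import runpy
import tempfile

# Hypothetical one-line module: it just records the name it was run under.
SOURCE = "seen_name = __name__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.py")
    with open(path, "w") as handle:
        handle.write(SOURCE)

    # Roughly what `python a.py` does: execute the file as __main__.
    as_script = runpy.run_path(path, run_name="__main__")

    # Roughly what `import a` does: execute the file as the module "a".
    spec = importlib.util.spec_from_file_location("a", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print("run as a script:", as_script["seen_name"])
print("imported:", module.seen_name)
```

<p>The same source sees a different <code>__name__</code> depending on how it got executed, which is all the <code>if __name__ == '__main__'</code> idiom relies on.</p>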
<p>We were told earlier that, outside the context of <code>__main__</code>, <em>"<code>__builtins__</code> is an alias for the dictionary of the <code>__builtin__</code> module"</em>. I think that means that <code>__builtins__ is __builtin__.__dict__</code>. Let's see if my hypothesis is true, by adding another line to the bottom of our <code>a.py</code> file:</p>
<div class="highlight"><pre><span class="k">print</span> <span class="s">"__builtins__ is __builtin__.__dict__"</span><span class="p">,</span> <span class="n">__builtins__</span> <span class="ow">is</span> <span class="n">__builtin__</span><span class="o">.</span><span class="n">__dict__</span>
</pre></div>
<p>Running <code>b.py</code> again, we get:</p>
<div class="highlight"><pre><span class="gp">$</span> python b.py
<span class="go">In b, before importing a</span>
<span class="go">__name__ is: __main__</span>
<span class="go">__builtin__ is __builtins__: True</span>
<span class="go">type(__builtin__): <type 'module'></span>
<span class="go">type(__builtins__): <type 'module'></span>
<span class="go">In a</span>
<span class="go">__name__ is: a</span>
<span class="go">__builtin__ is __builtins__: False</span>
<span class="go">type(__builtin__): <type 'module'></span>
<span class="go">type(__builtins__): <type 'dict'></span>
<span class="go">__builtins__ is __builtin__.__dict__ True</span>
</pre></div>
<p>Yes! My hypothesis was correct. Okay. So now I get why using <code>__builtin__</code> is better than <code>__builtins__</code>:</p>
<p><strong>The type, and thus behavior, of <code>__builtins__</code> changes based on the context of where it's being executed, while the type and behavior of <code>__builtin__</code> is constant. Rad.</strong></p>
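<p>If you're on Python 3, the module was renamed from <code>__builtin__</code> to <code>builtins</code>, but <code>__builtins__</code> can still show up as either a module or a dict. Here's a rough sketch, assuming Python 3 and CPython, of one way to cope with both shapes. The helper <code>builtins_namespace</code> is hypothetical (my own name, not anything from the standard library), and the <code>exec</code> call is just a handy way to get a namespace whose <code>__builtins__</code> was filled in automatically:</p>

```python
import builtins  # Python 3's name for Python 2's __builtin__ module


def builtins_namespace(value):
    """Return the dict of builtin names, given either the module or its dict.

    Hypothetical helper: it papers over the module-vs-dict difference.
    """
    if isinstance(value, dict):
        return value
    return vars(value)  # vars(module) is the same object as module.__dict__


# exec() adds a __builtins__ key to a globals dict that lacks one.
namespace = {}
exec("answer = len('accio')", namespace)

found = namespace["__builtins__"]
print("type of __builtins__ in the exec'd namespace:", type(found).__name__)
print("len is reachable either way:", builtins_namespace(found)["len"] is len)
```

<p>Reaching for <code>builtins</code> (or <code>__builtin__</code> on Python 2) directly avoids needing a helper like this at all, which is the point of the bold sentence above.</p>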
<p>Thanks, stranger who <a href="https://github.com/amygdalama/nagini/issues/1">suggested</a> I look into this, for the learning opportunity. And thanks, always, to Allison Kaptur, for exploring this topic with me.</p>
<p>The code for this blog post is on <a href="https://github.com/amygdalama/builtins">GitHub</a>, of course.</p>Replacing import with accio: A Dive into Bootstrapping and Python's Grammar2014-03-14T00:00:00-04:00Amy Hanlontag:amygdalama.github.io,2014-03-14:import-accio-bootstrapping-python-grammar.html<p>At <a href="https://www.hackerschool.com/">Hacker School</a>, I've been building an alternate universe Python by overwriting builtin functions and statements with Harry Potter spells. This is a thing you can do at Hacker School!</p>
<p>Although this project started as a joke, I've quickly descended so deeply into Python internals that I've, with the guidance of the fabulous Hacker School facilitator <a href="http://akaptur.github.io/">Allison Kaptur</a>, made edits to the CPython source code, and compiled a Python to compile a Python. All to replace the <code>import</code> statement with <code>accio</code>.</p>
<p>But before we get into compiling the Harry Potter Python I lovingly call Nagini, let's first talk about some Python internals basics, with spells as examples, of course.</p>
<h1 id="overwriting-builtin-functions">Overwriting Builtin Functions</h1>
<p>Python builtin functions are stored in a module called <code>__builtin__</code>, which is automatically imported on startup and is available in the interpreter under the name <code>__builtins__</code>.</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="nb">dir</span><span class="p">(</span><span class="n">__builtins__</span><span class="p">)</span>
<span class="go">['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException', 'BufferError', 'BytesWarning', 'DeprecationWarning', 'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False', 'FloatingPointError', 'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError', 'ImportWarning', 'IndentationError', 'IndexError', 'KeyError', 'KeyboardInterrupt', 'LookupError', 'MemoryError', 'NameError', 'None', 'NotImplemented', 'NotImplementedError', 'OSError', 'OverflowError', 'PendingDeprecationWarning', 'ReferenceError', 'RuntimeError', 'RuntimeWarning', 'StandardError', 'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'True', 'TypeError', 'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning', 'ValueError', 'Warning', 'ZeroDivisionError', '_', '__debug__', '__doc__', '__import__', '__name__', '__package__', 'abs', 'all', 'any', 'apply', 'basestring', 'bin', 'bool', 'buffer', 'bytearray', 'bytes', 'callable', 'chr', 'classmethod', 'cmp', 'coerce', 'compile', 'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'execfile', 'exit', 'file', 'filter', 'float', 'format', 'frozenset', 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', 'input', 'int', 'intern', 'isinstance', 'issubclass', 'iter', 'len', 'license', 'list', 'locals', 'long', 'map', 'max', 'memoryview', 'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property', 'quit', 'range', 'raw_input', 'reduce', 'reload', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'unichr', 'unicode', 'vars', 'xrange', 'zip']</span>
</pre></div>
<p>Overwriting Python builtins is surprisingly easy!</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">wingardium_leviosa</span> <span class="o">=</span> <span class="n">__builtins__</span><span class="o">.</span><span class="n">float</span>
<span class="gp">>>> </span><span class="k">del</span> <span class="n">__builtins__</span><span class="o">.</span><span class="n">float</span>
<span class="gp">>>> </span><span class="nb">float</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="gt">Traceback (most recent call last):</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>, in <span class="n"><module></span>
<span class="gr">NameError</span>: <span class="n">name 'float' is not defined</span>
<span class="gp">>>> </span><span class="n">wingardium_leviosa</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="go">3.0</span>
</pre></div>
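<p>Here's a sketch of the same trick as a script rather than a REPL session. It assumes Python 3, where the module is spelled <code>builtins</code>, and it puts <code>float</code> back in a <code>finally</code> block so the rest of the process isn't left without it. The variable names <code>message</code> and <code>levitated</code> are mine:</p>

```python
import builtins

wingardium_leviosa = builtins.float  # keep a reference under the spell's name
del builtins.float                   # remove the original name from builtins

try:
    try:
        float(3)                     # the name lookup now fails
    except NameError as error:
        message = str(error)
    levitated = wingardium_leviosa(3)  # the object itself still works
finally:
    builtins.float = wingardium_leviosa  # restore the muggle spelling

print(message)
print(levitated)
```

<p>Deleting the name only removes one way of reaching the object. The <code>float</code> type itself stays alive for as long as something, like <code>wingardium_leviosa</code>, still refers to it.</p>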
<p>However, overwriting <code>import</code> is not so easy. Let's try:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">accio</span> <span class="o">=</span> <span class="kn">import</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>
<span class="n">accio</span> <span class="o">=</span> <span class="n">import</span>
<span class="o">^</span>
<span class="gr">SyntaxError</span>: <span class="n">invalid syntax</span>
</pre></div>
<p>Python is expecting the name of a module after <code>import</code>, and thus it raises a <code>SyntaxError</code>. This is an effect of <code>import x</code> being a <em>statement</em>, rather than an <em>expression</em>.</p>
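<p>One way to see the statement-versus-expression distinction for yourself is to ask Python's own parser. This is a small sketch using the standard <code>ast</code> and <code>keyword</code> modules, assuming Python 3. The variable names are mine:</p>

```python
import ast
import keyword

# Parse a simple import and look at the node the parser produced.
node = ast.parse("import sys").body[0]
is_statement = isinstance(node, ast.stmt)
is_expression = isinstance(node, ast.expr)

# `import` is a reserved word, so it can never appear where a value is expected.
import_is_keyword = keyword.iskeyword("import")

try:
    compile("accio = import", "<example>", "exec")
    assignment_compiles = True
except SyntaxError:
    assignment_compiles = False

print(type(node).__name__, is_statement, is_expression)
print(import_is_keyword, assignment_compiles)
```

<p>The parser hands back an <code>Import</code> node, which is a kind of statement, and the assignment never makes it past compilation.</p>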
<p>Hm. I remember seeing the function <code>__import__</code> listed when we ran <code>dir(__builtins__)</code>. Maybe we can overwrite that instead:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">accio</span> <span class="o">=</span> <span class="n">__builtins__</span><span class="o">.</span><span class="n">__import__</span>
<span class="gp">>>> </span><span class="n">accio</span> <span class="n">sys</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>
<span class="n">accio</span> <span class="n">sys</span>
<span class="o">^</span>
<span class="gr">SyntaxError</span>: <span class="n">invalid syntax</span>
<span class="go"># :(</span>
</pre></div>
<p>What if we tried calling <code>accio</code> like a function?</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">accio</span><span class="p">(</span><span class="n">sys</span><span class="p">)</span>
<span class="gt">Traceback (most recent call last):</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>, in <span class="n"><module></span>
<span class="gr">NameError</span>: <span class="n">name 'sys' is not defined</span>
</pre></div>
<p>Maybe we need to pass 'sys' as a string?</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">accio</span><span class="p">(</span><span class="s">'sys'</span><span class="p">)</span>
<span class="go"><module 'sys' (built-in)></span>
<span class="go"># Ooh!</span>
<span class="gp">>>> </span><span class="n">sys</span> <span class="o">=</span> <span class="n">accio</span><span class="p">(</span><span class="s">'sys'</span><span class="p">)</span>
<span class="gp">>>> </span><span class="n">sys</span>
<span class="go"><module 'sys' (built-in)></span>
</pre></div>
<p>Aha. So the statement <code>import x</code> probably does something like:</p>
<ol>
<li>call the <code>__import__</code> function on <code>x</code>: <code>__builtins__.__import__('x')</code></li>
<li>assign the name <code>x</code> to the module returned by <code>__import__</code></li>
</ol>
<p>And <code>import sys</code> is like shorthand for the command:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="n">sys</span> <span class="o">=</span> <span class="n">__builtins__</span><span class="o">.</span><span class="n">__import__</span><span class="p">(</span><span class="s">'sys'</span><span class="p">)</span>
</pre></div>
<p>(Here I'm only describing simple <code>import</code> statements, but more complex statements like <code>from x import y, z</code> work similarly.)</p>
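<p>To make "work similarly" a bit more concrete, here's a hedged sketch of how a few import statements map onto <code>__import__</code> calls. It assumes Python 3, and the <code>os</code> examples are my own choice. The wrinkle worth knowing is that <code>__import__</code> returns the top-level package for a dotted name unless you pass a non-empty <code>fromlist</code>:</p>

```python
import importlib

accio = __import__  # the function form of the spell

# import sys
sys = accio("sys")

# import os.path  (binds the top-level name `os`, not `os.path`)
top = accio("os.path")

# from os import sep
_temp = accio("os", globals(), locals(), ["sep"], 0)
sep = _temp.sep

# importlib.import_module returns the leaf module instead of the top-level one.
leaf = importlib.import_module("os.path")

print(sys.__name__, top.__name__)
print(leaf is top.path, sep == top.sep)
```

<p>If all you want is "give me the module with this dotted name", <code>importlib.import_module</code> is usually the friendlier function to call.</p>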
<p>So we have a way to add <code>accio</code> as a function, but not as a statement. I'm unsatisfied.</p>
<p>For fun, can we delete import?</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="k">del</span> <span class="kn">import</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>
<span class="k">del</span> <span class="n">import</span>
<span class="o">^</span>
<span class="gr">SyntaxError</span>: <span class="n">invalid syntax</span>
<span class="gp">>>> </span><span class="k">del</span> <span class="n">__builtins__</span><span class="o">.</span><span class="n">__import__</span>
<span class="gp">>>> </span><span class="kn">import</span> <span class="nn">os</span>
<span class="gt">Traceback (most recent call last):</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>, in <span class="n"><module></span>
<span class="gr">ImportError</span>: <span class="n">__import__ not found</span>
</pre></div>
<p>Kind of! Although I want <code>import os</code> to be a <code>SyntaxError</code> rather than an <code>ImportError</code> because clearly <code>import</code> is the wrong thing to type and the user should know to type <code>accio</code> instead.</p>
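<p>That <code>ImportError</code> hints that the <code>import</code> statement looks up <code>__import__</code> in the builtins every time it runs. Here's a sketch that leans on that: instead of deleting <code>__import__</code>, it swaps in a wrapper that records what gets summoned. It assumes Python 3 and CPython, and <code>accio</code>, <code>summoned</code>, and <code>original_import</code> are names I made up:</p>

```python
import builtins

original_import = builtins.__import__
summoned = []


def accio(name, *args, **kwargs):
    """Record the requested module name, then defer to the real __import__."""
    summoned.append(name)
    return original_import(name, *args, **kwargs)


builtins.__import__ = accio
try:
    import json  # the statement now goes through our wrapper
finally:
    builtins.__import__ = original_import  # always put the original back

print("json" in summoned)
```

<p>This still isn't a new statement, though. The keyword is baked into the grammar, and no amount of fiddling with builtins will change that.</p>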
<p>So, to completely overwrite <code>import</code> with <code>accio</code>, we'll need to learn where Python defines statements.</p>
<h1 id="grammar">Grammar</h1>
<p>Eli Bendersky wrote a great <a href="http://eli.thegreenplace.net/2010/06/30/python-internals-adding-a-new-statement-to-python/">blog post</a> about adding an <code>until</code> statement to Python. Since we want to <em>replace</em> a statement, rather than add one, our method will be a bit different.</p>
<p>Regardless, it looks like the place to start for changing Python's statements is in the <code>Grammar</code> file in the Python <a href="http://docs.python.org/devguide/setup.html">source code</a>. <strong>Python source code!</strong> Isn't this <em>fun?!</em></p>
<p>Python's source code is stored in a Mercurial repository, so first we'll have to install Mercurial.</p>
<div class="highlight"><pre><span class="nv">$ </span>brew install mercurial
</pre></div>
<p>Then we can clone CPython (like <code>git clone</code>):</p>
<div class="highlight"><pre><span class="nv">$ </span>hg clone http://hg.python.org/cpython
</pre></div>
<p>This will take a whole minute. Grab a coffee.</p>
<p>In the Python Mercurial repo, different versions of Python have different branches. By default we're on a Python 3 branch. I'm still running Python 2 on my machine, so let's check out version 2.7:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd </span>cpython
<span class="nv">$ </span>hg checkout 2.7
</pre></div>
<p>Now let's <a href="http://docs.python.org/devguide/setup.html">compile CPython</a> and see if it works!</p>
<div class="highlight"><pre><span class="nv">$ </span>./configure --with-pydebug
<span class="nv">$ </span>make -s -j2
</pre></div>
<p>I get a warning message saying some modules were unable to be built, but I am unstoppable. We are unstoppable. Let's continue.</p>
<p>It seems like the place to start is in the file <code>Grammar/Grammar</code>, so let's start poking around there. <a href="http://docs.python.org/2/reference/grammar.html">This</a> is what it looks like. Searching for 'import' brings us to lines 52-60:</p>
<div class="highlight"><pre><span class="n">import_stmt</span><span class="p">:</span> <span class="n">import_name</span> <span class="o">|</span> <span class="n">import_from</span>
<span class="n">import_name</span><span class="p">:</span> <span class="s">'import'</span> <span class="n">dotted_as_names</span>
<span class="n">import_from</span><span class="p">:</span> <span class="p">(</span><span class="s">'from'</span> <span class="p">(</span><span class="s">'.'</span><span class="o">*</span> <span class="n">dotted_name</span> <span class="o">|</span> <span class="s">'.'</span><span class="o">+</span><span class="p">)</span>
<span class="s">'import'</span> <span class="p">(</span><span class="s">'*'</span> <span class="o">|</span> <span class="s">'('</span> <span class="n">import_as_names</span> <span class="s">')'</span> <span class="o">|</span> <span class="n">import_as_names</span><span class="p">))</span>
<span class="n">import_as_name</span><span class="p">:</span> <span class="n">NAME</span> <span class="p">[</span><span class="s">'as'</span> <span class="n">NAME</span><span class="p">]</span>
<span class="n">dotted_as_name</span><span class="p">:</span> <span class="n">dotted_name</span> <span class="p">[</span><span class="s">'as'</span> <span class="n">NAME</span><span class="p">]</span>
<span class="n">import_as_names</span><span class="p">:</span> <span class="n">import_as_name</span> <span class="p">(</span><span class="s">','</span> <span class="n">import_as_name</span><span class="p">)</span><span class="o">*</span> <span class="p">[</span><span class="s">','</span><span class="p">]</span>
<span class="n">dotted_as_names</span><span class="p">:</span> <span class="n">dotted_as_name</span> <span class="p">(</span><span class="s">','</span> <span class="n">dotted_as_name</span><span class="p">)</span><span class="o">*</span>
<span class="n">dotted_name</span><span class="p">:</span> <span class="n">NAME</span> <span class="p">(</span><span class="s">'.'</span> <span class="n">NAME</span><span class="p">)</span><span class="o">*</span>
</pre></div>
<p>Cool! We can kind of understand what's going on here just from reading. It looks like an <code>import_stmt</code> is either an <code>import_name</code> or an <code>import_from</code> which have the format <code>import x</code> and <code>from x import y</code>, respectively. What happens if we just change 'import' to 'accio' in lines 53 and 55? Let's try it. After making the change and saving the <code>Grammar</code> file, type the following command to compile:</p>
<div class="highlight"><pre><span class="nv">$ </span>make -s -j2
</pre></div>
<p>Ach. If only it were that easy. This throws an error:</p>
<div class="highlight"><pre><span class="gt">Traceback (most recent call last):</span>
File <span class="nb">"/Users/amyhanlon/projects/nagini/cpython/Lib/runpy.py"</span>, line <span class="m">151</span>, in <span class="n">_run_module_as_main</span>
<span class="n">mod_name</span><span class="p">,</span> <span class="n">loader</span><span class="p">,</span> <span class="n">code</span><span class="p">,</span> <span class="n">fname</span> <span class="o">=</span> <span class="n">_get_module_details</span><span class="p">(</span><span class="n">mod_name</span><span class="p">)</span>
File <span class="nb">"/Users/amyhanlon/projects/nagini/cpython/Lib/runpy.py"</span>, line <span class="m">113</span>, in <span class="n">_get_module_details</span>
<span class="n">code</span> <span class="o">=</span> <span class="n">loader</span><span class="o">.</span><span class="n">get_code</span><span class="p">(</span><span class="n">mod_name</span><span class="p">)</span>
File <span class="nb">"/Users/amyhanlon/projects/nagini/cpython/Lib/pkgutil.py"</span>, line <span class="m">283</span>, in <span class="n">get_code</span>
<span class="bp">self</span><span class="o">.</span><span class="n">code</span> <span class="o">=</span> <span class="nb">compile</span><span class="p">(</span><span class="n">source</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">filename</span><span class="p">,</span> <span class="s">'exec'</span><span class="p">)</span>
File <span class="nb">"/Users/amyhanlon/projects/nagini/cpython/Lib/sysconfig.py"</span>, line <span class="m">4</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="o">^</span>
<span class="gr">SyntaxError</span>: <span class="n">invalid syntax</span>
</pre></div>
<p>This error occurs while trying to execute a Python script! Compiling CPython requires running Python scripts! Interesting. Maybe at this point we remember that Python is <a href="http://en.wikipedia.org/wiki/Bootstrapping_(compilers)">bootstrapped</a>. We look back at the <a href="http://docs.python.org/devguide/setup.html">Python Developer's Guide</a> and we find that <em>"Vast areas of CPython are written completely in Python: as of this writing, CPython contains slightly more Python code than C."</em></p>
<p>So then we wonder - when CPython is compiling, does it execute Python scripts with the Python that's currently being compiled? Or does it use another already-compiled muggle Python, like our environment Python? If it uses the Python that's currently being compiled, we'll need to change these .py scripts to say <code>accio</code> instead of <code>import</code>. Otherwise, what do we do? Our muggle Python only understands <code>import</code> and not <code>accio</code>...</p>
<p>Let's look into one of the .py scripts within <code>Lib</code> to investigate. Here's the first line of the <code>Lib/keyword.py</code> script:</p>
<div class="highlight"><pre><span class="c">#! /usr/bin/env python</span>
</pre></div>
<p>Aha! This script is executed via our environment Python! Our environment Python only understands <code>import</code>. So <code>keyword.py</code> needs to have <code>import</code> and not <code>accio</code>. However, since we got a <code>SyntaxError</code> on an <code>import</code> statement, that must mean that at least sometimes during the process of compiling we're required to use <code>accio</code> instead of <code>import</code>. Hrm... Any ideas?</p>
<h1 id="yo-dawg-i-heard-you-like-pythons">Yo Dawg, I Heard You Like Pythons</h1>
<p>What if we did something crazy like compiled an intermediary Python that understands <em>both</em> <code>accio</code> <em>and</em> <code>import</code>, and used <em>that</em> Python to compile <em>another</em> Python that only understands <code>accio</code>? (Full credit for this idea goes to <a href="http://akaptur.github.io/">Allison Kaptur</a>.)</p>
<p>So, for our intermediary Python we'll need to edit the <code>Grammar</code> file like so:</p>
<div class="highlight"><pre><span class="n">import_name</span><span class="p">:</span> <span class="s">'import'</span> <span class="n">dotted_as_names</span> <span class="o">|</span> <span class="s">'accio'</span> <span class="n">dotted_as_names</span>
<span class="n">import_from</span><span class="p">:</span> <span class="p">((</span><span class="s">'from'</span> <span class="p">(</span><span class="s">'.'</span><span class="o">*</span> <span class="n">dotted_name</span> <span class="o">|</span> <span class="s">'.'</span><span class="o">+</span><span class="p">)</span>
<span class="s">'import'</span> <span class="p">(</span><span class="s">'*'</span> <span class="o">|</span> <span class="s">'('</span> <span class="n">import_as_names</span> <span class="s">')'</span> <span class="o">|</span> <span class="n">import_as_names</span><span class="p">))</span> <span class="o">|</span>
<span class="p">(</span><span class="s">'from'</span> <span class="p">(</span><span class="s">'.'</span><span class="o">*</span> <span class="n">dotted_name</span> <span class="o">|</span> <span class="s">'.'</span><span class="o">+</span><span class="p">)</span>
<span class="s">'accio'</span> <span class="p">(</span><span class="s">'*'</span> <span class="o">|</span> <span class="s">'('</span> <span class="n">import_as_names</span> <span class="s">')'</span> <span class="o">|</span> <span class="n">import_as_names</span><span class="p">)))</span>
</pre></div>
<p>Thus this Python should understand both <code>import</code> and <code>accio</code>. Let's compile.</p>
<div class="highlight"><pre><span class="nv">$ </span>make -s -j2
</pre></div>
<p>Eep! No errors! Just the warning about missing modules that we also received before we made any changes! Now we need to prepend to our $PATH so that this Python will become our environment Python (but only for this terminal session). That way this intermediary Python will be used to compile our final Python. Let's make a symlink to the <code>python.exe</code> that was created when we ran <code>make</code> (on OS X the build names the binary <code>python.exe</code> so it doesn't clash with the <code>Python/</code> directory on a case-insensitive filesystem), and then add the directory containing that symlink to our $PATH:</p>
<div class="highlight"><pre><span class="nv">$ </span>mkdir bin
<span class="nv">$ </span><span class="nb">cd </span>bin
<span class="nv">$ </span>ln -s ../python.exe python
<span class="nv">$ </span><span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>:<span class="nv">$PATH</span>
</pre></div>
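<p>If you want to convince yourself that prepending to $PATH really changes which <code>python</code> gets picked up, here's a small sketch using <code>shutil.which</code>. It assumes Python 3 on a POSIX system, and the fake <code>python</code> script in a temporary directory is a stand-in I invented for the symlink above:</p>

```python
import os
import shutil
import stat
import tempfile

with tempfile.TemporaryDirectory() as bin_dir:
    # Stand-in for bin/python: any executable file with the right name will do.
    fake_python = os.path.join(bin_dir, "python")
    with open(fake_python, "w") as handle:
        handle.write("#!/bin/sh\necho 'intermediary python'\n")
    os.chmod(fake_python, os.stat(fake_python).st_mode | stat.S_IXUSR)

    # Same idea as PATH=`pwd`:$PATH, with our directory searched first.
    search_path = bin_dir + os.pathsep + os.environ.get("PATH", "")
    resolved = shutil.which("python", path=search_path)
    found_ours = resolved is not None and os.path.samefile(resolved, fake_python)

print(found_ours)
```

<p>Directories are searched left to right, so whatever sits at the front of $PATH wins.</p>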
<p>Now we'll need to duplicate this entire <code>cpython</code> directory and make our final Python:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd</span> ../..
<span class="nv">$ </span>cp -r cpython nagini-python
<span class="nv">$ </span><span class="nb">cd </span>nagini-python
</pre></div>
<p>We want to change the <code>Grammar</code> file for this Python to only allow <code>accio</code>:</p>
<div class="highlight"><pre><span class="n">import_name</span><span class="p">:</span> <span class="s">'accio'</span> <span class="n">dotted_as_names</span>
<span class="n">import_from</span><span class="p">:</span> <span class="p">(</span><span class="s">'from'</span> <span class="p">(</span><span class="s">'.'</span><span class="o">*</span> <span class="n">dotted_name</span> <span class="o">|</span> <span class="s">'.'</span><span class="o">+</span><span class="p">)</span>
<span class="s">'accio'</span> <span class="p">(</span><span class="s">'*'</span> <span class="o">|</span> <span class="s">'('</span> <span class="n">import_as_names</span> <span class="s">')'</span> <span class="o">|</span> <span class="n">import_as_names</span><span class="p">))</span>
</pre></div>
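<p>And then we want to replace every instance of <code>import</code> in every .py file with <code>accio</code>. We'll use a blackbox bash command to accomplish that (the <code>sed -i ''</code> and <code>[[:&lt;:]]</code> bits are BSD <code>sed</code> syntax, which is what ships with OS X):</p>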
<div class="highlight"><pre><span class="nv">$ </span><span class="k">for </span>i in <span class="sb">`</span>find . -name <span class="s1">'*.py'</span><span class="sb">`</span>; <span class="k">do </span>sed -i <span class="s1">''</span> <span class="s1">'s/[[:<:]]import[[:>:]]/accio/g'</span> <span class="nv">$i</span>; <span class="k">done</span>
</pre></div>
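<p>If the <code>sed</code> incantation is too much of a black box, here's roughly the same substitution sketched in Python 3. This is my own stand-in and not what I actually ran. <code>spellify</code> is a made-up name, and <code>\b</code> plays the role of <code>[[:&lt;:]]</code> and <code>[[:&gt;:]]</code> by matching only at word boundaries:</p>

```python
import re

# Whole-word match: underscores count as word characters, so __import__ is safe.
IMPORT_WORD = re.compile(r"\bimport\b")


def spellify(source):
    """Replace every standalone `import` in a chunk of source with `accio`."""
    return IMPORT_WORD.sub("accio", source)


print(spellify("import sys"))
print(spellify("from os import path"))
print(spellify("mod = __import__('sys')  # important"))
```

<p>Like the <code>sed</code> version, this is a blunt instrument: it also rewrites the word <code>import</code> inside strings and comments, which is fine for a joke Python but not something to point at code you care about.</p>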
<p>Now we just need to compile this new Python!</p>
<div class="highlight"><pre><span class="nv">$ </span>make -s -j2
</pre></div>
<p>Let's make a symlink to this Python...</p>
<div class="highlight"><pre><span class="nv">$ </span>mkdir bin
<span class="nv">$ </span><span class="nb">cd </span>bin
<span class="nv">$ </span>ln -s ../python.exe python
<span class="nv">$ </span><span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>:<span class="nv">$PATH</span>
<span class="nv">$ </span>python
</pre></div>
<p>And fire it up...</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">sys</span>
File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="o">^</span>
<span class="gr">SyntaxError</span>: <span class="n">invalid syntax</span>
<span class="gp">>>> </span><span class="n">accio</span> <span class="n">sys</span>
<span class="gp">>>> </span><span class="n">sys</span><span class="o">.</span><span class="n">modules</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span>
<span class="go">['copy_reg', 'sre_compile', '_sre', 'encodings', 'site', '__builtin__', 'sysconfig', '__main__', 'encodings.encodings', 'abc', 'posixpath', '_weakrefset', 'errno', 'encodings.codecs', 'sre_constants', 're', '_abcoll', 'types', '_codecs', 'encodings.__builtin__', '_warnings', 'genericpath', 'stat', 'zipimport', '_sysconfigdata', 'warnings', 'UserDict', 'encodings.ascii', 'sys', '_osx_support', 'codecs', 'os.path', 'sitecustomize', 'signal', 'traceback', 'linecache', 'posix', 'encodings.aliases', 'exceptions', 'sre_parse', 'os', '_weakref']</span>
</pre></div>
<p>HOLY SHIT IT WORKS!</p>
<h1 id="fin">Fin</h1>
<p>That's it. We just compiled two Pythons and fooled around with source code for the sake of a joke. Grab yourselves a beer, friends. Victory.</p>
<p>My super messy and not-really-prepared-for-the-general-public GitHub <a href="https://github.com/amygdalama/nagini">repo</a> contains both versions of Python, for reference.</p>GitHub Pages Publication with Git Hooks2014-03-08T00:00:00-05:00Amy Hanlontag:amygdalama.github.io,2014-03-08:github-pages-publication-git-hooks.html<p>As I <a href="http://mathamy.com/migrating-to-github-pages-using-pelican.html">wrote</a> at length a couple weeks ago, this blog is hosted on <a href="http://pages.github.com/">GitHub Pages</a> and generated by <a href="http://docs.getpelican.com/en/3.3.0/">Pelican</a>. Generally the integration between the two is quite blissful, except for managing two separate repositories - <a href="https://github.com/amygdalama/blog-source">blog-source</a> for my blog's source content, configuration files, and theme, and <a href="https://github.com/amygdalama/amygdalama.github.io">amygdalama.github.io</a> for my Pelican-generated site.</p>
<p>Lately I've been pushing to these two separate repositories manually, so my workflow looks something like:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd </span>blog/ <span class="c"># root directory for my blog</span>
<span class="nv">$ </span>make devserver <span class="c"># automatically re-generates site and hosts site locally</span>
<span class="c"># change something in content or settings</span>
<span class="nv">$ </span>git add --all
<span class="nv">$ </span>git commit -m <span class="s2">"committing blog source"</span>
<span class="nv">$ </span>git push origin master <span class="c"># pushes to my blog-source repo on GitHub</span>
<span class="nv">$ </span><span class="nb">cd </span>output/ <span class="c"># pelican-generated output folder</span>
<span class="nv">$ </span>git commit -am <span class="s2">"committing pelican-generated site content"</span>
<span class="nv">$ </span>git push origin master <span class="c"># pushes to my amygdalama.github.io repo</span>
</pre></div>
<p>But no more! I have figured out how to automatically push any changes committed in my root blog directory to my blog-source repo and then automatically add, commit, and push changes to my GitHub Pages repo. Now my workflow looks like:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd </span>blog/
<span class="nv">$ </span>make devserver
<span class="c"># change something in content or settings</span>
<span class="nv">$ </span>git add --all
<span class="nv">$ </span>git commit -m <span class="s2">"committing blog source"</span>
<span class="c"># everything else happens automatically! no more typing! yay!</span>
</pre></div>
<p>The key is to use <a href="http://githooks.com/">Git Hooks</a>. For a particular .git repo, you can add an executable file in <code>.git/hooks</code> which will automatically execute before or after an event like <code>commit</code>. The available types of hooks can be found <a href="http://githooks.com/">here</a>.</p>
<p>For this specific automation task, I used a post-commit hook. To do this, first create the file <code>.git/hooks/post-commit</code>:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd </span>blog/
<span class="nv">$ </span>subl .git/hooks/post-commit
</pre></div>
<p>Add these lines to the file:</p>
<div class="highlight"><pre><span class="c">#!/bin/bash</span>
git push origin master
<span class="nb">cd </span>output/
git add --all <span class="c"># sub-optimally will add all even if you didn't add all to blog-source</span>
git commit -m <span class="s2">"automatic commit"</span> <span class="c"># add whatever commit message you want</span>
git push origin master
</pre></div>
<p>Then, to make the file executable, type:</p>
<div class="highlight"><pre><span class="nv">$ </span>chmod +x .git/hooks/post-commit
</pre></div>
<p>And that's it! Joyful automation!</p>
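<p>Hooks don't have to be bash, by the way: any executable file works. Here's a rough sketch of the same post-commit steps in Python. It is an illustration, not the hook shown above; the names <code>HOOK_VARS</code>, <code>clean_env</code>, and <code>publish</code> are made up for this example. It assumes the same <code>output/</code> folder and <code>origin master</code> remotes, drops the repository-locating variables that git exports to hooks before running commands in the second repo, and skips the commit when nothing in <code>output/</code> changed:</p>
<div class="highlight"><pre>import os
import subprocess

# Variables git exports to hooks so commands can find the current repository.
# They are dropped here so git commands run in output/ find that repo instead.
HOOK_VARS = ("GIT_DIR", "GIT_WORK_TREE", "GIT_INDEX_FILE")


def clean_env(environ):
    """Return a copy of environ without the hook's repository variables."""
    return {key: value for key, value in environ.items() if key not in HOOK_VARS}


def publish(run=subprocess.call, output_dir="output"):
    """Push the source repo, then add, commit, and push the generated site.

    Returns True if a commit was made in output_dir, False if it was unchanged.
    """
    env = clean_env(os.environ)
    run(["git", "push", "origin", "master"], env=env)
    run(["git", "add", "--all"], cwd=output_dir, env=env)
    # `git diff --cached --quiet` exits with 0 when nothing is staged
    if run(["git", "diff", "--cached", "--quiet"], cwd=output_dir, env=env) == 0:
        return False
    run(["git", "commit", "-m", "automatic commit"], cwd=output_dir, env=env)
    run(["git", "push", "origin", "master"], cwd=output_dir, env=env)
    return True
</pre></div>
<p>To use something like this as the actual hook, the file would also need a <code>#!/usr/bin/env python</code> first line and a final <code>publish()</code> call. Passing a different <code>run</code> function lets you try the logic without touching any real repos.</p>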
<h2 id="edit">Edit</h2>
<p><em>2014-03-14</em>: Hacker Schooler Matthew Avant has a <a href="http://www.mavant.com/4.html">superior</a> method using a pre-push hook (rather than a post-commit hook). I'm using his method now :)</p>How Should I Learn Programming?2014-02-23T00:00:00-05:00Amy Hanlontag:amygdalama.github.io,2014-02-23:how-should-i-learn-programming.html<p>I rarely hear the question <em>"How should I learn programming?"</em> from individuals who are just beginning to write code. Instead I hear way more specific variants like:</p>
<ul>
<li>What are the best sites for learning programming?</li>
<li>What's a good free online course...?</li>
<li>What's a good textbook...?</li>
</ul>
<p>These are not very helpful questions to ask! Programming is not a subject in which you can just take a class or read a book and then "get it".</p>
<h2 id="tutorials-are-boring">Tutorials Are Boring</h2>
<p>Tutorials/books/online classes are good for learning the basics of a language (syntax, control flow like for-loops and if-statements, data types like strings, arrays, etc). You want to use tutorials to give you enough of a framework to know what problems can be solved through programming and what to google when you run into an issue or a bug. </p>
<p>At some point when you're taking an online class or going through a tutorial, you'll probably find that you no longer find it intellectually stimulating. This is okay! Tutorials are basically monkey-see, monkey-do. They <em>aren't</em> intellectually stimulating (at least usually). When you get to this point, it is okay to quit the tutorial! Instead you should work on something that will satisfy your curiosity. You should write your own programs!</p>
<p>I only really started learning <em>how to program</em>, i.e. how to solve a problem by myself from start to finish by writing code, when I strayed from tutorials. Instead I would think of interesting problems to solve (or find <a href="https://www.kaggle.com/competitions">some</a> <a href="https://projecteuler.net/">online</a>) and then struggle through creating solutions. This approach is similar to actual coding in the wild! However, it can be very scary for new programmers! </p>
<h2 id="communities-are-inspiring">Communities Are Inspiring</h2>
<p>Being surrounded by other humans who are learning programming and working on really interesting projects is an essential part of becoming a better programmer! I am <em>extremely</em> fortunate to be a part of the epitome of this type of community at <a href="https://www.hackerschool.com/">Hacker School</a>. Being part of a group of programmers dedicated to learning and making cool things is quite inspiring, and you can probably find a group like this in your city, too! Many cities have <a href="http://www.meetup.com/">Meetup</a> groups related to programming. Join those! Go to hackathons! Meet friends that are better programmers than you! Get real humans to read your code and give you suggestions on how to improve! Pair program! These things, however, are terrifying! They are not easy! You will be exposing your weaknesses and lack of knowledge to other human beings! Oh dear! </p>
<h2 id="the-fear-you-will-let-it-pass-through-you">The Fear, You Will Let It Pass Through You</h2>
<p><img src="http://amygdalama.github.io/images/paul_fear.jpg"></p>
<p>This fear is totally reasonable! And there are so many more reasons to be afraid! Most of us are quite accustomed to learning in a very sequential way, in which we start with Chapter 1, and then once we finish Chapter 12, everyone assumes we know what we're talking about. Learning to code is very different from this, and foreign things can be scary! Also, working on solving problems on our own can be scary because maybe you aren't smart enough! And working on solving problems with others is scary because you might not be good at communicating your ideas! Or you might not be as good at programming as they are! Or maybe you picked the wrong programming language! Or maybe you're afraid of not putting in enough effort to really get anywhere! Or of working so much you get burnt out! There are many things to be afraid of. In fact, on the first day of Hacker School we made a list of these things:</p>
<p><img alt="Hacker Schoolers Are Nervous" src="http://amygdalama.github.io/images/hs-nervous.jpeg" /></p>
<p>See! Other people are afraid! Fear is normal!</p>
<p>It is essential to become aware of your fears. List them. Have someone read the list. You will probably giggle together. They are probably afraid, too.</p>
<h2 id="fuck-tutorials-write-a-program-send-it-to-another-human">Fuck tutorials. Write a program. Send it to another human.</h2>
<p>You can send it to me! Here's my email: amyehanlon@gmail.com <3</p>
<p>A huge shoutout to the fabulous <a href="http://melchua.com/">Mel Chua</a> for talking through this post with me and giving me a better vocabulary and framework for understanding learning.</p>Migrating to GitHub Pages using Pelican2014-02-22T00:00:00-05:00Amy Hanlontag:amygdalama.github.io,2014-02-22:migrating-to-github-pages-using-pelican.html<p>Over the past week I've been dog-paddling through the ocean of misery that is migrating a blog from one host (WordPress) to another (<a href="http://pages.github.com/">GitHub Pages</a>) and attempting to learn enough CSS and <a href="http://jinja.pocoo.org/">Jinja</a> to handle setting up my site using <a href="http://docs.getpelican.com/en/3.3.0/">Pelican</a>. I have no experience with CSS! And my HTML experience is limited to injecting angst into my MySpace profile! And I became aware of Jinja and Pelican's existence about a week ago! So obviously I've drowned myself in 1.5 bottles of my neighborhood liquor store's 2-bottles-of-wine-for-$10 special.</p>
<p>The great part about this whole process is that with Pelican, I can write my blog posts and pages in <a href="http://daringfireball.net/projects/markdown/">Markdown</a> (about which I also knew little until last week, but it's <em>wonderfully easy to learn</em>.) I am so tired of wrangling with WordPress's built-in editor trying to get my code blocks and in-line code to format correctly. Markdown is a blissful alternative.</p>
<p>There's a plethora of material online on Pelican and GitHub Pages, but it is fairly disconnected and presumes a certain level of front-end development experience, of which I have none. Hopefully this post can help others make this transition with less misery.</p>
<h2 id="github-pages-setup">GitHub Pages Setup</h2>
<ol>
<li>Create a GitHub repo following the <a href="http://pages.github.com/">GitHub Pages instructions</a> (the first step only!)</li>
</ol>
<p><em>A note on GitHub Pages:</em> I believe your HTML files (particularly your index.html file) must be in the <em>main directory</em> of your git repo for this to work. This will be important later. More detail is given in the <strong>Posting to GitHub</strong> section.</p>
<h2 id="pelican-setup">Pelican Setup</h2>
<ol>
<li>
<p>Install necessary <a href="http://docs.getpelican.com/en/3.1.1/getting_started.html#installing-pelican">packages</a></p>
</li>
<li>
<p>Run Pelican <a href="http://docs.getpelican.com/en/3.1.1/getting_started.html#kickstart-a-blog">quickstart</a></p>
<p>This will ask you lots of questions that probably seem foreign. These questions will set up some configuration files that you can later edit with your preferred <a href="http://docs.getpelican.com/en/3.1.1/settings.html">settings</a>. As an example, here's how I answered:</p>
<div class="highlight"><pre><span class="gp">$</span> pelican-quickstart
<span class="go">Where do you want to create your new web site? [.]</span>
<span class="go">What will be the title of this web site?</span>
<span class="gp">></span> Amy Hanlon
<span class="go">Who will be the author of this web site?</span>
<span class="gp">></span> Amy Hanlon
<span class="go">What will be the default language of this web site? [en]</span>
<span class="go">Do you want to specify a URL prefix? e.g., http://example.com (Y/n)</span>
<span class="gp">></span> y
<span class="go">What is your URL prefix? (see above example; no trailing slash)</span>
<span class="gp">></span> http://amygdalama.github.io
<span class="go">Do you want to enable article pagination? (Y/n)</span>
<span class="gp">></span> y
<span class="go">How many articles per page do you want? [10]</span>
<span class="go">Do you want to generate a Fabfile/Makefile to automate generation and publishing? (Y/n)</span>
<span class="gp">></span> y
<span class="go">Do you want an auto-reload & simpleHTTP script to assist with theme and site development? (Y/n)</span>
<span class="gp">></span> y
<span class="go">Do you want to upload your website using FTP? (y/N)</span>
<span class="gp">></span> n
<span class="go">Do you want to upload your website using SSH? (y/N)</span>
<span class="gp">></span> n
<span class="go">Do you want to upload your website using Dropbox? (y/N)</span>
<span class="gp">></span> n
<span class="go">Do you want to upload your website using S3? (y/N)</span>
<span class="gp">></span> n
<span class="go">Do you want to upload your website using Rackspace Cloud Files? (y/N)</span>
<span class="gp">></span> n
</pre></div>
<p>Now if you type the <code>tree</code> command within your blog's main directory, you should see:</p>
<div class="highlight"><pre><span class="nv">$ </span>tree
.
├── Makefile
├── content
├── develop_server.sh
├── fabfile.py
├── output
├── pelicanconf.py
└── publishconf.py
</pre></div>
<p>If you don't have <code>tree</code>, you should! It's neat. <code>brew install tree</code>. If you're on OSX and don't have <a href="http://brew.sh/">Homebrew</a>, you should! It's neat.</p>
<p>I'll briefly explain each of these files/directories:</p>
<ul>
<li>
<p><code>Makefile</code> tells the command <code>make</code> what to do. This file defines commands like <code>make devserver</code>. More information on <code>make</code> can be found <a href="http://www.gnu.org/software/make/manual/make.html">here</a>. I'll cover more on how to use this command for developing your site in the <strong>Generating Your Site</strong> section.</p>
</li>
<li>
<p><code>content</code> is the directory that should house all of your Markdown files. Pelican assumes that your articles/blog posts will be inside this directory. Additionally, there are some special directories you should create within <code>content</code>:</p>
<div class="highlight"><pre><span class="nv">$ </span>mkdir content/pages
<span class="nv">$ </span>mkdir content/images
</pre></div>
<p>Pelican by default is configured to know that your pages (i.e. static pages like About Me, Contact, etc) are found within this <code>pages</code> directory and that images are found within the <code>images</code> directory.</p>
</li>
<li>
<p><code>develop_server.sh</code> is a bash script that I believe handles serving your site locally during development (i.e. it serves your site at <a href="http://localhost:8000">http://localhost:8000</a>).</p>
</li>
<li>
<p><code>fabfile.py</code> is a configuration file for <a href="http://docs.fabfile.org/en/1.8/">Fabric</a> which allows you to generate your site using the <code>fab</code> command. You'll need to <code>pip install fabric</code> if you want to use it. Alternatively you can just use <code>make</code>.</p>
</li>
<li>
<p><code>output</code> is, by default, where Pelican will store your HTML files when you run <code>pelican content</code>. This can cause issues which I describe in the section <strong>Posting to GitHub</strong>.</p>
</li>
<li>
<p><code>pelicanconf.py</code> houses your Pelican configuration <a href="http://docs.getpelican.com/en/3.3.0/settings.html">settings</a>.</p>
</li>
<li>
<p><code>publishconf.py</code> is like <code>pelicanconf.py</code> in that it houses Pelican configuration settings, but is not intended to be used for local development. The reasoning behind having two separate files is described in <a href="http://stackoverflow.com/a/20845195">this Stack Overflow answer</a>.</p>
</li>
</ul>
</li>
</ol>
<h2 id="exporting-existing-content">Exporting Existing Content</h2>
<p>This section assumes you have existing content on a WordPress blog. Pelican also has an importer for Dotclear and RSS/Atom feeds. You can skip this section if you don't have existing content living elsewhere that you want to port to your site on GitHub Pages.</p>
<ol>
<li><a href="http://en.blog.wordpress.com/2006/06/12/xml-import-export/">Export WordPress content to XML</a></li>
<li><a href="http://docs.getpelican.com/en/3.1.1/importer.html">Imperfectly convert the XML to Markdown using Pelican</a></li>
<li>Manually export your images from your WordPress Media Library (I know. This sucks.) Move these images to <code>content/images</code>.</li>
<li>Manually edit the Markdown output (your code blocks, links, embedded images will likely need editing).</li>
<li>Move your Markdown files to the <code>content</code> directory within your website's main directory. Content intended to be static pages (i.e. About Me, Contact, etc) should go in the <code>content/pages</code> directory. Articles/blog posts should go in the <code>content</code> directory.</li>
</ol>
<h2 id="pelican-themes">Pelican Themes</h2>
<ol>
<li>
<p>Clone the available <a href="https://github.com/getpelican/pelican-themes">Pelican Themes</a> into your blog's main directory.</p>
<div class="highlight"><pre><span class="nv">$ </span>git clone https://github.com/getpelican/pelican-themes
</pre></div>
</li>
<li>
<p>Choose a theme you'd like to use. Pelican by default comes with the notmyidea and simple themes. Most other themes have a sample image in the pelican-themes repo to help you decide.</p>
</li>
<li>
<p>After you've chosen a theme, set the THEME variable in your <code>pelicanconf.py</code> file to the absolute or relative path to the theme. For example, I'm using the subtle theme and added this line to my <code>pelicanconf.py</code> file:</p>
<div class="highlight"><pre><span class="n">THEME</span> <span class="o">=</span> <span class="s">"pelican-themes/subtle"</span>
</pre></div>
<p>This method is better than using <code>pelican-themes</code> as described <a href="http://docs.getpelican.com/en/3.3.0/pelican-themes.html">here</a>, because it ensures that the Pelican HTML output will reflect any changes you make to the theme (without having to re-install the theme by running the <code>pelican-themes</code> command).</p>
</li>
</ol>
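<p>For orientation, here's roughly how the quickstart answers above and the <code>THEME</code> line might sit together in <code>pelicanconf.py</code>. This is a hand-written sketch, not a copy of the generated file: the exact set of settings quickstart writes varies by Pelican version, and leaving <code>SITEURL</code> empty for local development is an assumption.</p>
<div class="highlight"><pre># pelicanconf.py (illustrative sketch; values taken from the quickstart answers above)
AUTHOR = "Amy Hanlon"
SITENAME = "Amy Hanlon"
SITEURL = ""  # assumed empty for local development; publishconf.py would carry the real URL
DEFAULT_LANG = "en"
DEFAULT_PAGINATION = 10

# Relative path from the blog's main directory to the cloned theme
THEME = "pelican-themes/subtle"
</pre></div>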
<h2 id="customization">Customization</h2>
<p>All elements of your theme are customizable! You can change attributes of text like font, size, color, and more in the <code>main.css</code> file found in your theme's directory. For example, I've made many edits to the file <code>pelican-themes/subtle/static/css/main.css</code>.</p>
<p>Similarly, you can change layouts of your pages (like what shows up in your site nav menu) by exploring the HTML files in the <code>templates</code> folder within your theme. There will usually be a <code>base.html</code> file (or something similar) that provides the foundation for things like your header and site nav menu that will apply to every page.</p>
<p>There should also be HTML files that serve as templates for specific types of pages. For example, <code>article.html</code> defines the basic structure for your articles/blog posts. If you want to change the metadata that displays above article content, you should look there.</p>
<p>If you see something on your website that you want to change, and you're not sure where to look in your theme's CSS/HTML files, right click on the element in the browser and go to "Inspect Element". This will show you where in the HTML the element is (on the left) and what parts of the CSS file define its style (on the right). You can adjust things here in the browser to test out different fonts, colors, etc, but changes you make to the code in your browser will not be reflected in your source files.</p>
<h2 id="generating-your-site">Generating Your Site</h2>
<p>Once you have Markdown files in your <code>content</code> folder, navigate to your blog's main directory and run:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd </span>blog
<span class="nv">$ </span>make devserver
</pre></div>
<p><code>make devserver</code> does a number of things: first it runs the <code>pelican</code> command on your <code>content</code> folder to generate HTML for your site using the theme you specify in your <code>pelicanconf.py</code> file, and then it serves your site locally at <a href="http://localhost:8000">http://localhost:8000</a>. <code>make devserver</code> will also automatically regenerate your site (i.e. run <code>pelican</code> on <code>content</code>) every time you save a change to a content, configuration, or theme file! Just refresh the page in your browser, and you should immediately see the changes. If this doesn't work, it's probably due to the settings you have in your configuration files (<code>pelicanconf.py</code>, <code>Makefile</code>, and/or <code>develop_server.sh</code>).</p>
<h2 id="posting-to-github">Posting to GitHub</h2>
<p>Recall that you need a repository on GitHub named <em>username.github.io</em> (this will be the remote repository for your blog), and that your HTML files need to be in this repository's main directory (not within a subdirectory).</p>
<p>It's intuitive to initialize a local repository for your blog within your blog's main directory, because in addition to posting the HTML, you'd also like to back up your content Markdown files, configuration files, and customized theme. This is a reasonable desire!</p>
<p>However, if you do this, GitHub won't generate your site! It isn't smart enough to know that the HTML files it needs to serve are actually contained within the <code>output</code> folder (recall that Pelican by default saves the HTML it generates in this folder).</p>
<p>The best solution I've come up with so far (and please email me if you know of a better solution!) is to create two separate repositories - one inside the <code>output</code> directory where Pelican generates your HTML (this repo should have <em>username.github.io</em> on GitHub as a remote), and another in your blog's main directory with your source Markdown files (in <code>content</code>), theme, and configuration files (this repo should have a different remote on GitHub).</p>
<p>In the terminal, move to the <code>output</code> directory, and initialize a git repo. Add a remote pointing to the repo you created on GitHub (called <em>username.github.io</em>), add all the files you want to commit, commit, and push changes to the remote repository.</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd </span>output
<span class="nv">$ </span>git init
<span class="nv">$ </span>git remote add origin https://github.com/username/username.github.io.git
<span class="nv">$ </span>git add --all
<span class="nv">$ </span>git commit -m <span class="s2">"commit message"</span>
<span class="nv">$ </span>git push origin master
</pre></div>
<p>If you use this method, you'll want to change the following setting to <code>False</code> in your <code>publishconf.py</code> file:</p>
<div class="highlight"><pre><span class="n">DELETE_OUTPUT_DIRECTORY</span> <span class="o">=</span> <span class="bp">False</span>
</pre></div>
<p>Otherwise if you use the <code>publishconf.py</code> file as your settings file when running the <code>pelican</code> command, you'll delete your git repo!</p>
<p>Similarly, don't use the <code>make clean</code> command! If you poke around the <code>Makefile</code>, you'll see that <code>make clean</code> runs <code>rm -rf output</code> which will delete all files (including your git repo) in your output folder.</p>
<p>If you accidentally delete the repo in your output folder, it's not a <em>huge</em> deal (I've done it like 5 times playing with different commands and settings). Just clone your remote <em>username.github.io</em> repo into a new, empty <code>output</code> folder, re-generate your site with any changes you've made since your last push to the remote, and then commit and push the changes to the remote:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd </span>blog
<span class="nv">$ </span>git clone https://github.com/username/username.github.io.git output
<span class="nv">$ </span>pelican content
<span class="nv">$ </span><span class="nb">cd </span>output
<span class="nv">$ </span>git add --all
<span class="nv">$ </span>git commit -m <span class="s2">"commit message"</span>
<span class="nv">$ </span>git push origin master
</pre></div>
<p>You'll also need to set up another repository for your source content, configuration files, and theme, which is annoying. I added a .gitignore to this repo to ignore the files in the output folder, but that isn't necessary.</p>
<p>Within about 10 minutes of pushing your changes, your site should be up and running! (Later changes should be reflected on your site almost instantaneously.)</p>
<h2 id="custom-domain-setup">Custom Domain Setup</h2>
<p>If you have your own domain name that you'd like to use instead of <em>username.github.io</em>, you'll need to follow <a href="https://help.github.com/articles/setting-up-a-custom-domain-with-pages">these instructions</a>.</p>
<h2 id="fin">Fin</h2>
<p>Feel free to poke around my blog's <a href="https://github.com/amygdalama/blog-source">GitHub</a> <a href="https://github.com/amygdalama/amygdalama.github.io">repos</a> (beware: there are unpublished draft posts in there). My configuration files in particular might be useful to you.</p>
<p>If any of you Hacker Schoolers have trouble migrating your blog, I'd be happy to help!</p>ipython and pandas are whoa!2014-02-14T15:52:00-05:00Amy Hanlontag:amygdalama.github.io,2014-02-14:ipython-and-pandas-are-whoa.html<p>In my first five months of using Python, I would create .py files and
then execute them in the terminal, like the loyal follower of <a href="http://learnpythonthehardway.org/book/">LPTHW</a>
that I am. While it's certainly necessary to be able to write Python
scripts and execute them from the terminal, I've learned that this
workflow isn't the best way to explore data in Python.</p>
<p>I initially converted to Python from R, and have since missed the
interactiveness of working within RStudio (you don't have to re-run your
entire code every time you make a change and want to view the results)
and how R allows you to so quickly get to your data with data frames
built straight from text files. Enter: <a href="http://ipython.org/">ipython</a> and <a href="http://pandas.pydata.org/">pandas</a>. (Go
to their respective sites for installation instructions.)</p>
<p>Julia Evans has a fantastic <a href="https://github.com/jvns/pandas-cookbook">tutorial</a> on how to use pandas in an
ipython notebook. Here, I'd like to explore the differences between
doing data analysis using ipython notebook and pandas versus writing .py
scripts using mainly standard Python packages and executing via the
terminal.</p>
<p>Let's use the classic iris dataset for some simple analysis. First we
need to do some setup. Create a directory for our project (mine is
stored within my Projects directory that I keep in my Home directory.)
In the terminal:</p>
<div class="highlight"><pre><span class="nv">$ </span>mkdir ~/Projects/ipython-pandas-whoa
<span class="nv">$ </span><span class="nb">cd</span> ~/Projects/ipython-pandas-whoa
</pre></div>
<p>Then create a Python script and open it with a text editor (I use
Sublime Text 2, which I've <a href="http://www.sublimetext.com/docs/2/osx_command_line.html">configured</a> to open files with the command
<code>subl</code>).</p>
<div class="highlight"><pre><span class="nv">$ </span>touch barbaric-script.py
<span class="nv">$ </span>subl barbaric-script.py
</pre></div>
<p>This should open up a blank .py file in Sublime. If you don't have
<code>subl</code> set up, you can simply open up the blank barbaric-script.py file
we created with the <code>touch</code> command in your text editor of choice.</p>
<p>In the barbaric-script.py file, let's import the necessary packages.
We'll be using a couple of standard Python packages plus matplotlib.pyplot.
If you don't have matplotlib installed, you can find installation
instructions <a href="http://matplotlib.org/users/installing.html">here</a>.</p>
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">csv</span>
<span class="kn">import</span> <span class="nn">pprint</span>
<span class="kn">import</span> <span class="nn">urllib2</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
</pre></div>
<p>To get the iris data, we'll use the urllib2 and csv modules from Python's
standard library and then do some basic formatting. Let's process the dataset as
a list of dictionaries.</p>
<div class="highlight"><pre><span class="n">url</span> <span class="o">=</span> <span class="s">"http://mlr.cs.umass.edu/ml/machine-learning-databases/iris/iris.data"</span>
<span class="c"># Open data from URL as a file-like object</span>
<span class="n">f</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">urlopen</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
<span class="c"># Create an empty list to store our data</span>
<span class="n">parsed_data</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c"># Read file</span>
<span class="n">raw_data</span> <span class="o">=</span> <span class="n">csv</span><span class="o">.</span><span class="n">reader</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
<span class="c"># Define our headers since the file doesn't contain explicit headers</span>
<span class="c"># I found these headers from looking at the documentation at</span>
<span class="c"># http://mlr.cs.umass.edu/ml/machine-learning-databases/iris/iris.names</span>
<span class="n">headers</span> <span class="o">=</span> <span class="p">[</span><span class="s">'Sepal Length'</span><span class="p">,</span> <span class="s">'Sepal Width'</span><span class="p">,</span> <span class="s">'Petal Length'</span><span class="p">,</span> <span class="s">'Petal Width'</span><span class="p">,</span> <span class="s">'Class'</span>
<span class="p">]</span>
<span class="c"># Iterate through the rows in the file, and create a dictionary for each row</span>
<span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">raw_data</span><span class="p">:</span>
    <span class="c"># Dictionaries should have headers -> row</span>
    <span class="n">parsed_data</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="nb">dict</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">headers</span><span class="p">,</span> <span class="n">row</span><span class="p">)))</span>
<span class="c"># Delete the last row of parsed_data because it's blank</span>
<span class="n">parsed_data</span> <span class="o">=</span> <span class="n">parsed_data</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
<span class="c"># Let's see what parsed_data looks like</span>
<span class="n">pprint</span><span class="o">.</span><span class="n">pprint</span><span class="p">(</span><span class="n">parsed_data</span><span class="p">[:</span><span class="mi">3</span><span class="p">])</span>
</pre></div>
<p>What you should see when you execute barbaric-script.py:</p>
<div class="highlight"><pre><span class="gp">$</span> python barbaric-script.py
<span class="go">[{'Class': 'Iris-setosa',</span>
<span class="go">  'Petal Length': '1.4',</span>
<span class="go">  'Petal Width': '0.2',</span>
<span class="go">  'Sepal Length': '5.1',</span>
<span class="go">  'Sepal Width': '3.5'},</span>
<span class="go"> {'Class': 'Iris-setosa',</span>
<span class="go">  'Petal Length': '1.4',</span>
<span class="go">  'Petal Width': '0.2',</span>
<span class="go">  'Sepal Length': '4.9',</span>
<span class="go">  'Sepal Width': '3.0'},</span>
<span class="go"> {'Class': 'Iris-setosa',</span>
<span class="go">  'Petal Length': '1.3',</span>
<span class="go">  'Petal Width': '0.2',</span>
<span class="go">  'Sepal Length': '4.7',</span>
<span class="go">  'Sepal Width': '3.2'}]</span>
</pre></div>
<p>Great! So we have our data formatted nicely. We have a list with
elements corresponding to each row in the file, and the elements in the
list are dictionaries. The keys in the dictionaries are the column
headers, and the values are the values associated with the respective
column header for that particular row. For example, the first
row/observation (printed as the first dictionary out of the three listed
above) is in the class <code>Iris-setosa</code>, has a Petal Length of <code>1.4</code>, a Petal
Width of <code>0.2</code>, a Sepal Length of <code>5.1</code>, and a Sepal Width of <code>3.5</code>. Let's do
some plotting to explore our data. We can make a histogram! Add the
following lines to your barbaric-script.py file:</p>
<div class="highlight"><pre><span class="c"># Let's create a list of the Sepal Lengths</span>
<span class="c"># I'm calling float on the entries because otherwise they're stored as strings</span>
<span class="n">sepal_lengths</span> <span class="o">=</span> <span class="p">[</span><span class="nb">float</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="s">'Sepal Length'</span><span class="p">])</span> <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">parsed_data</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">hist</span><span class="p">(</span><span class="n">sepal_lengths</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'#348ABD'</span><span class="p">,</span> <span class="n">edgecolor</span><span class="o">=</span><span class="s">'none'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s">'Sepal Length'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s">'Count'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</pre></div>
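<p>As an aside, the index-based comprehension above can also be written by looping over the rows directly. This is only a sketch: the three-row <code>parsed_data</code> below is a stand-in copied from the printed output, not the full parsed file.</p>

```python
# Stand-in for the parsed file: the first three rows printed earlier.
parsed_data = [
    {'Class': 'Iris-setosa', 'Petal Length': '1.4', 'Petal Width': '0.2',
     'Sepal Length': '5.1', 'Sepal Width': '3.5'},
    {'Class': 'Iris-setosa', 'Petal Length': '1.4', 'Petal Width': '0.2',
     'Sepal Length': '4.9', 'Sepal Width': '3.0'},
    {'Class': 'Iris-setosa', 'Petal Length': '1.3', 'Petal Width': '0.2',
     'Sepal Length': '4.7', 'Sepal Width': '3.2'},
]

# Iterate over the row dictionaries themselves instead of over indices.
# float() is still needed because the parser stored every value as a string.
sepal_lengths = [float(row['Sepal Length']) for row in parsed_data]
print(sepal_lengths)
```

<p>Either form gives <code>plt.hist</code> the same list of floats, so which one you use is a matter of taste.</p>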
<p>Let's try executing our script again.</p>
<div class="highlight"><pre><span class="nv">$ </span>python barbaric-script.py
</pre></div>
<p>A histogram should pop up in a separate window:</p>
<p><img alt="alt text" src="http://amygdalama.github.io/images/barbaric.png" /></p>
<p>That's a lot of code (18 lines)! And we didn't even do anything
complicated - just parsed our data and plotted a histogram. Also, notice
that each time you tried to fix a bug or add a feature (like plotting
the histogram), you had to execute the entire script again, rather than
just the piece where you fixed the bug. That doesn't matter too much
since our data is fairly small and the script doesn't require much time
to execute, but it would be pretty annoying if our script took longer to
run.</p>
<p>Now let's try doing the same thing, but using pandas in an ipython
notebook. Use the terminal to run ipython notebook:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">cd</span> ~/Projects/ipython-pandas-whoa
<span class="nv">$ </span>ipython notebook
</pre></div>
<p>This will set up a local server on your computer, which will serve the
ipython notebooks that are in your working directory (right now you
don't have any) to http://127.0.0.1:8888. This should automatically open
up in your browser.</p>
<p>In the tab that opens in your browser, click "New Notebook". A new
ipython notebook will open up in another tab. A new cell will also open
up. Cells are places where you can write and execute code. This is what
a cell looks like:</p>
<p><img alt="alt text" src="http://amygdalama.github.io/images/ipythoncell.png" /></p>
<p>To allow matplotlib to post plots within your notebook, rather than
opening up a separate window (like when we executed our .py script),
type the following into the cell:</p>
<div class="highlight"><pre><span class="o">%</span><span class="n">matplotlib</span> <span class="n">inline</span>
</pre></div>
<p>Then to execute the cell, press shift+enter (which will execute the cell
your cursor is currently in and then automatically either move to the
next cell, if one exists, or open up a new cell). Once you press
shift+enter, your notebook should look like:</p>
<p><img alt="alt text" src="http://amygdalama.github.io/images/ipythoncell.png" /></p>
<p>In the next cell, import the packages we'll be using, namely
matplotlib.pyplot and pandas:</p>
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span>
</pre></div>
<p>Press shift+enter again to execute the code (I'm going to stop telling
you to do this, but just assume that after each block of code you should
press shift+enter. If you get confused about what your notebook should
look like, look at my <a href="http://nbviewer.ipython.org/github/amygdalama/ipython-pandas-whoa/blob/master/ipython-pandas-whoa.ipynb">example notebook</a>.)</p>
<p>To process the iris data, we can use pandas' built-in <code>read_csv</code>
parser. We can get to an easy-to-use data frame within three lines of
code!</p>
<div class="highlight"><pre><span class="n">url</span> <span class="o">=</span> <span class="s">"http://mlr.cs.umass.edu/ml/machine-learning-databases/iris/iris.data"</span>
<span class="c"># Define our headers since the url doesn't contain explicit headers</span>
<span class="c"># I found these headers from looking at the documentation at</span>
<span class="c"># http://mlr.cs.umass.edu/ml/machine-learning-databases/iris/iris.names</span>
<span class="n">headers</span> <span class="o">=</span> <span class="p">[</span><span class="s">'Sepal Length'</span><span class="p">,</span> <span class="s">'Sepal Width'</span><span class="p">,</span> <span class="s">'Petal Length'</span><span class="p">,</span> <span class="s">'Petal Width'</span><span class="p">,</span> <span class="s">'Class'</span><span class="p">]</span>
<span class="n">iris</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">header</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">names</span><span class="o">=</span><span class="n">headers</span><span class="p">)</span>
</pre></div>
<p>Let's see what the data looks like.</p>
<div class="highlight"><pre><span class="n">iris</span><span class="p">[:</span><span class="mi">3</span><span class="p">]</span>
</pre></div>
<p>You should see a beautifully formatted table! This is a pandas data
frame, which is similar to a data frame in R.</p>
<p><img alt="alt text" src="http://amygdalama.github.io/images/table.png" /></p>
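<p>If you want to poke at <code>read_csv</code> without a network connection, here is a small sketch that feeds it an in-memory file instead of the url. The three rows are copied from the output earlier in this post; the name <code>iris_sample</code> is just something I made up for illustration.</p>

```python
import io

import pandas as pd

# Three rows in the same headerless, comma-separated layout as iris.data.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)
headers = ['Sepal Length', 'Sepal Width', 'Petal Length', 'Petal Width', 'Class']

# header=None says the data has no header row; names supplies the column labels.
iris_sample = pd.read_csv(sample, header=None, names=headers)

# Unlike the hand-rolled parser, the numeric columns arrive as floats, not strings.
print(iris_sample.dtypes)
```

<p>This is why the pandas version doesn't need the explicit <code>float</code> conversion from the script.</p>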
<p>Now let's plot a histogram for the Sepal Length column.</p>
<div class="highlight"><pre><span class="c"># I use two brackets around 'Sepal Length' to force pandas to make this</span>
<span class="c"># a data frame rather than just a series, which is like a numpy array.</span>
<span class="c"># The brackets here aren't necessary, but they make printing sepal_lengths</span>
<span class="c"># prettier and makes it easier for us to combine sepal_lengths with other</span>
<span class="c"># data.</span>
<span class="n">sepal_lengths</span> <span class="o">=</span> <span class="n">iris</span><span class="p">[[</span><span class="s">'Sepal Length'</span><span class="p">]]</span>
<span class="c"># Make the plot pretty! (This option exists only in older pandas releases.)</span>
<span class="n">pd</span><span class="o">.</span><span class="n">set_option</span><span class="p">(</span><span class="s">'display.mpl_style'</span><span class="p">,</span> <span class="s">'default'</span><span class="p">)</span>
<span class="n">sepal_lengths</span><span class="o">.</span><span class="n">hist</span><span class="p">()</span>
</pre></div>
<p><img alt="alt text" src="http://amygdalama.github.io/images/quite_elegant.png" /></p>
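<p>To make the one-bracket versus two-bracket difference from the comments concrete, here is a minimal sketch. The tiny frame is a hypothetical stand-in for the iris data, not the real thing.</p>

```python
import pandas as pd

# A tiny hypothetical frame standing in for the iris data.
iris_sample = pd.DataFrame({
    'Sepal Length': [5.1, 4.9, 4.7],
    'Sepal Width': [3.5, 3.0, 3.2],
})

as_series = iris_sample['Sepal Length']    # single brackets: a Series
as_frame = iris_sample[['Sepal Length']]   # double brackets: a one-column DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)
```

<p>Both objects have a <code>hist</code> method, so the histogram works either way.</p>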
<p>By this point it should be fairly obvious that ipython and pandas are
both awesome for data analysis! They make Python a much more serious
contender as a tool for data analysis by giving you quick and easy
access to your data. In this simple example, the barbaric-script.py was
18 lines of code (disregarding comments, etc) and comparatively the
ipython notebook with pandas was only 10!</p>
<p>For further reading, I definitely encourage you to check out Julia's
<a href="https://github.com/jvns/pandas-cookbook">pandas cookbook</a> (presented via ipython notebooks), on which
incidentally I'll be collaborating with her next week (eeeep)!</p>
<p>My code for this post can be found on my <a href="https://github.com/amygdalama/ipython-pandas-whoa">GitHub</a>, and the ipython
notebook can be easily viewed on <a href="http://nbviewer.ipython.org/github/amygdalama/ipython-pandas-whoa/blob/master/ipython-pandas-whoa.ipynb">NBViewer</a>.</p>Hacker School Day 2: Goals! Or, A Grasp For Sanity2014-02-12T10:02:00-05:00Amy Hanlontag:amygdalama.github.io,2014-02-12:hacker-school-day-2-goals.html<p>I have to admit day 1 of Hacker School was fairly overwhelming for me. With so many amazing people to talk to and interesting projects to
collaborate on I felt a bit like:</p>
<p><img alt="alt text" src="http://amygdalama.github.io/images/allthetennisballs.gif" /></p>
<p><em>(Credit to Hacker School alumnus Alex Beaulne for finding this perfect
gif.)</em></p>
<p>In an attempt to regain a bit of sanity, yesterday I prioritized setting
goals for the batch and creating a semi-public to-do list. I also began
maintaining a list of learnings about how I can better approach Hacker
School and learning in general. My to-do list, goals, and learnings can
be found on my <a href="https://github.com/amygdalama/hacker-school-progress">GitHub</a>, which I'll be updating at least daily.</p>
<p>In order to determine my goals, I answered the following questions, to
ensure that what I set out to accomplish aligned with my values and my
motivations for attending Hacker School. These lists are ranked somewhat
by importance, but probably aren't perfectly ordered.</p>
<h3 id="what-do-i-want-to-get-out-of-hacker-school">What do I want to get out of Hacker School?</h3>
<ul>
<li>Greater confidence, honesty, self-awareness, bravery</li>
<li>Gain expertise in Python, Data Analysis, Machine Learning (currently
beginner-/intermediate-level)</li>
<li>Join a community of programmers/math nerds who will help motivate me
to achieve greatness</li>
<li>Frankly, a job</li>
</ul>
<h3 id="what-do-i-want-to-accomplish-at-hacker-school">What do I want to accomplish at Hacker School?</h3>
<ul>
<li>One large (1-month) project which
requires pulling data from many sources and applying a recommender
system, probably to recommend an optimal brunch spot for me, because
brunch is awesome. This idea is inspired by <a href="http://www.hilarymason.com/presentations-2/in-search-of-the-optimal-cheeseburger/">Hilary Mason</a></li>
<li>A plethora of varying, small (1-day) analysis and machine learning
projects</li>
<li>Blog posts galore (3-5 per week) with a focus on writing
tutorials/explaining benefits of various packages</li>
<li>A successful pull request, contributing to an open source project or
an existing tutorial</li>
<li>Create a Python package (maybe with functions based on Jaynes'
<em>Probability Theory: The Logic of Science</em>)</li>
<li>Give a talk! (eeeep!)</li>
<li>Revamped website possibly using Pelican/GitHub pages (nonessential,
not a priority)</li>
<li>Murdered out GitHub account (a pleasant side effect of all of the
above)</li>
</ul>I moved cross-country for Hacker School and I'm only slightly convinced this is Real Life2014-02-10T23:23:00-05:00Amy Hanlontag:amygdalama.github.io,2014-02-10:i-moved-cross-country-for-hacker-school-and-im-only-slightly-convinced-this-is-real-life.html<p>Or, Hacker School: Day 1.</p>
<p>Holy Shit! I have been fairly silent over the past month because I got
accepted to <a href="https://www.hackerschool.com/">Hacker School</a> and had to figure out the logistics of
quitting my job, moving cross-country, and living in the most expensive
city in the States without a proper job for at least 3 months.</p>
<p>So allow me to re-iterate: Holy Shit! And somehow I did it and I'm alive
but not entirely certain it's not all just a dream.</p>
<p>Today was the first day of a possibly insane but probably brilliant
adventure. Hacker School is self-described as a writers' retreat for
programmers. I've heard other Hacker Schoolers struggle to explain it as
a day spa for programmers, a hippie commune programming cult (okay, that
one was me), not a school, not necessarily for <a href="http://amygdalama.github.io/images/hackers-movie.jpg">these</a> kinds of
Hackers, and more. The Hacker School founders and facilitators have
written extensively about what Hacker School is and is not, so dig into
their blog if you're curious. It's difficult to define, and they do a
much better job than I do.</p>
<p>For the first day, I dedicated a good hour to getting my computer set up
properly (if you've seen my <a href="http://mathamy.com/2013/12/27/wget-sanity-part-2-im-an-idiot-and-decided-to-switch-to-a-brewed-python-and-reinstall-the-scientific-stack-at-10pm/">wget</a> posts, you'll know this
is something I've struggled with in the past). Then I had my first
attempt at pair programming in which I explained a few cool things about
<a href="http://scikit-learn.org/">scikit-learn</a> to another Hacker Schooler who had never used it
before. In turn she showed me some regression modeling in R.</p>
<p>I also got to play with HS's resident Apple II. One of the other Hacker
Schoolers wrote a simple for loop to print out "DONG" 100 times. Why
would you do anything else?</p>
<p><img alt="alt text" src="http://amygdalama.github.io/images/appleiidong.jpg" /></p>
<p>I am supremely honored and excited as Hell to be at Hacker School. For
the next three months, I will be surrounded by intellectually curious
nerds who are more thoughtful and self-aware than most people I know. I
hope I am brave and honest enough to confront the gaps in my knowledge,
ask questions, seek help, and ultimately get the most out of what will
be some of the most formative months of my life. Which reminds me of
the Bene Gesserit litany:</p>
<blockquote>
<p><em>I must not fear. Fear is the mind-killer. Fear is the little-death
that brings total obliteration. I will face my fear. I will permit it
to pass over me and through me. And when it has gone past me I will
turn to see fear's path. Where the fear has gone there will be
nothing... Only I will remain.</em></p>
</blockquote>wget sanity part 2: I'm an idiot and decided to switch to a brewed python (and reinstall the scientific stack) at 10pm2013-12-27T01:08:00-05:00Amy Hanlontag:amygdalama.github.io,2013-12-27:wget-sanity-part-2-im-an-idiot-and-decided-to-switch-to-a-brewed-python-and-reinstall-the-scientific-stack-at-10pm.html<p>For the past four months I've been using Anaconda's Python distribution
on my Macbook, which has been great (except for <a href="http://mathamy.com/2013/12/02/homebrew-path-pythonpath/">Part 1</a> of what has
now become a series), until I wanted to play with virtualenv. Apparently
Anaconda does not work well with virtualenv and suggests using its own
conda virtual environments instead, and I don't take well to The Man telling
me I can't do something I want to do. So of course my reaction was to
<code>rm -rf /anaconda</code>.</p>
<p>Now I'm building my Python stack back from the ground up, and I've
decided to try out a <a href="https://github.com/Homebrew/homebrew/wiki/Homebrew-and-Python">brewed</a> Python. I've been following <a href="http://www.lowindata.com/2013/installing-scientific-python-on-mac-os-x/">these</a>
very elegant instructions to set up numpy, scipy, etc, on OSX, and it
worked flawlessly! Until matplotlib! I had all of the dependencies
(freetype, which I installed via Homebrew, zlib, and libpng), but I kept
getting thrown this error:</p>
<div class="highlight"><pre>fatal error: 'freetype/config/ftheader.h' file not found
#include &lt;freetype/config/ftheader.h&gt;
</pre></div>
<p>I finally found <a href="http://stackoverflow.com/questions/1477144/compile-matplotlib-for-python-on-snow-leopard">this</a> StackOverflow discussion with the suggestion to
type this into the terminal (for brewed python):</p>
<div class="highlight"><pre><span class="nv">$ </span>ln -s /usr/local/include/freetype2/ /usr/include/freetype
</pre></div>
<p>And then <code>sudo pip install matplotlib</code> worked! I don't rigorously
understand this magic, but based on my research this creates a symbolic
link from <code>/usr/include/freetype</code> to <code>/usr/local/include/freetype2</code>
(the brewed freetype). I'm guessing that matplotlib by default was
looking for freetype in <code>/usr/include/freetype</code>, but it wasn't there
since Homebrew installs everything in <code>/usr/local</code>. So, creating the
symbolic link allowed matplotlib to find freetype. In moments like these
I'm like 'Yer a wizard, Harry.'</p>Using PyMC to Analyze A/B Testing Data2013-12-24T22:23:00-05:00Amy Hanlontag:amygdalama.github.io,2013-12-24:using-pymc-to-analyze-ab-testing-data.html<p>In Chapter 2 of <a href="https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers">Bayesian Methods for Hackers</a>, there's an example of
Bayesian analysis of an A/B test using simulated data. I decided to play
around with this analysis method with real A/B landing page test data
from one of my clients.</p>
<p>This method uses PyMC to estimate the real conversion rate for each page
and Matplotlib to visually interpret the results.</p>
<p>First, I import the relevant packages:</p>
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">pymc</span> <span class="kn">as</span> <span class="nn">pm</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
</pre></div>
<p>My client ran a landing page test with the following results:</p>
<div class="highlight"><pre><span class="n">clicks_A</span> <span class="o">=</span> <span class="mi">1135</span>
<span class="n">orders_A</span> <span class="o">=</span> <span class="mi">5</span>
<span class="n">clicks_B</span> <span class="o">=</span> <span class="mi">1149</span>
<span class="n">orders_B</span> <span class="o">=</span> <span class="mi">17</span>
</pre></div>
<p>The observed conversion rates are 0.44% and 1.48% for pages A and B,
respectively, but I'd like to be confident that the true conversion rate
of page B is higher than page A.</p>
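<p>The observed rates are just orders divided by clicks. A quick sketch to check the arithmetic, reusing the four numbers above:</p>

```python
clicks_A, orders_A = 1135, 5
clicks_B, orders_B = 1149, 17

# float() avoids integer division if this is run under Python 2.
rate_A = orders_A / float(clicks_A)
rate_B = orders_B / float(clicks_B)

print("Page A: %.2f%%" % (rate_A * 100))
print("Page B: %.2f%%" % (rate_B * 100))
```

<p>With only 5 and 17 orders, these point estimates are noisy, which is the reason for estimating a full distribution for each rate rather than comparing the two numbers directly.</p>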
<p>To format this data for the analysis, I create a numpy array for each
page with 1s representing orders and 0s representing clicks without an
order:</p>
<div class="highlight"><pre><span class="n">data_A</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">r_</span><span class="p">[[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">clicks_A</span> <span class="o">-</span> <span class="n">orders_A</span><span class="p">),</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">*</span> <span class="n">orders_A</span><span class="p">]</span>
<span class="n">data_B</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">r_</span><span class="p">[[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">clicks_B</span> <span class="o">-</span> <span class="n">orders_B</span><span class="p">),</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">*</span> <span class="n">orders_B</span><span class="p">]</span>
</pre></div>
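<p>If <code>np.r_</code> looks cryptic, here is a rough plain-Python sketch of the same encoding for page A, using a list instead of a numpy array. The name <code>data_A_list</code> is mine, not from the book.</p>

```python
clicks_A, orders_A = 1135, 5

# Same encoding as data_A, with a plain list instead of a numpy array:
# a 0 for every click without an order, followed by a 1 for every order.
data_A_list = [0] * (clicks_A - orders_A) + [1] * orders_A

print(len(data_A_list))  # one entry per click
print(sum(data_A_list))  # the 1s add up to the number of orders
```

<p>The mean of this list is the observed conversion rate, so each entry can be treated as one Bernoulli trial.</p>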
<p>Next I assign distributions to my prior beliefs of <code>p_A</code> and <code>p_B</code>, the unknown, true conversion rates. I
assume, for simplicity, that the distributions are uniform (i.e. I have
no prior knowledge of what <code>p_A</code> and <code>p_B</code> are).
Note: the rest of the code in this blog post is taken from <a href="https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers">the book</a>.</p>
<div class="highlight"><pre><span class="n">p_A</span> <span class="o">=</span> <span class="n">pm</span><span class="o">.</span><span class="n">Uniform</span><span class="p">(</span><span class="s">'p_A'</span><span class="p">,</span> <span class="n">lower</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">upper</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">p_B</span> <span class="o">=</span> <span class="n">pm</span><span class="o">.</span><span class="n">Uniform</span><span class="p">(</span><span class="s">'p_B'</span><span class="p">,</span> <span class="n">lower</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">upper</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</pre></div>
<p>Since I want to estimate the difference in true conversion rates, I need
to define a variable <code>delta</code>, which equals <code>p_B - p_A</code>. Since, if I know
both <code>p_A</code> and <code>p_B</code>, I can calculate <code>delta</code>, it's a deterministic
variable. In PyMC, deterministic variables are created using a function
with a <code>pymc.deterministic</code> decorator:</p>
<div class="highlight"><pre><span class="nd">@pm.deterministic</span>
<span class="k">def</span> <span class="nf">delta</span><span class="p">(</span><span class="n">p_A</span><span class="o">=</span><span class="n">p_A</span><span class="p">,</span> <span class="n">p_B</span><span class="o">=</span><span class="n">p_B</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">p_B</span> <span class="o">-</span> <span class="n">p_A</span>
</pre></div>
<p>Next I add the observed data to PyMC variables and run an inference
algorithm (I don't understand what this code is actually doing yet - an
explanation is coming up in Chapter 3):</p>
<div class="highlight"><pre><span class="n">obs_A</span> <span class="o">=</span> <span class="n">pm</span><span class="o">.</span><span class="n">Bernoulli</span><span class="p">(</span><span class="s">"obs_A"</span><span class="p">,</span>
<span class="n">p_A</span><span class="p">,</span> <span class="n">value</span> <span class="o">=</span> <span class="n">data_A</span><span class="p">,</span> <span class="n">observed</span> <span class="o">=</span> <span class="bp">True</span><span class="p">)</span>
<span class="n">obs_B</span> <span class="o">=</span> <span class="n">pm</span><span class="o">.</span><span class="n">Bernoulli</span><span class="p">(</span><span class="s">"obs_B"</span><span class="p">,</span> <span class="n">p_B</span><span class="p">,</span> <span class="n">value</span> <span class="o">=</span> <span class="n">data_B</span><span class="p">,</span> <span class="n">observed</span> <span class="o">=</span> <span class="bp">True</span><span class="p">)</span>
<span class="n">mcmc</span> <span class="o">=</span> <span class="n">pm</span><span class="o">.</span><span class="n">MCMC</span><span class="p">([</span><span class="n">p_A</span><span class="p">,</span> <span class="n">p_B</span><span class="p">,</span> <span class="n">delta</span><span class="p">,</span> <span class="n">obs_A</span><span class="p">,</span> <span class="n">obs_B</span><span class="p">])</span>
<span class="n">mcmc</span><span class="o">.</span><span class="n">sample</span><span class="p">(</span><span class="mi">20000</span><span class="p">,</span> <span class="mi">1000</span><span class="p">)</span>
</pre></div>
<p>Then I plot the posterior distributions for the three unknowns:</p>
<div class="highlight"><pre><span class="n">p_A_samples</span> <span class="o">=</span> <span class="n">mcmc</span><span class="o">.</span><span class="n">trace</span><span class="p">(</span><span class="s">"p_A"</span><span class="p">)[:]</span>
<span class="n">p_B_samples</span> <span class="o">=</span> <span class="n">mcmc</span><span class="o">.</span><span class="n">trace</span><span class="p">(</span><span class="s">"p_B"</span><span class="p">)[:]</span>
<span class="n">delta_samples</span> <span class="o">=</span> <span class="n">mcmc</span><span class="o">.</span><span class="n">trace</span><span class="p">(</span><span class="s">"delta"</span><span class="p">)[:]</span>
<span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">311</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="o">.</span><span class="mo">035</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">hist</span><span class="p">(</span><span class="n">p_A_samples</span><span class="p">,</span> <span class="n">histtype</span><span class="o">=</span><span class="s">'stepfilled'</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.85</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="s">"posterior of $p_A$"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"#A60628"</span><span class="p">,</span> <span class="n">normed</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">edgecolor</span><span class="o">=</span> <span class="s">"none"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="s">"upper right"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s">"Posterior distributions of $p_A$, $p_B$, and delta unknowns"</span><span class="p">)</span>
<span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">312</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="o">.</span><span class="mo">035</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">hist</span><span class="p">(</span><span class="n">p_B_samples</span><span class="p">,</span> <span class="n">histtype</span><span class="o">=</span><span class="s">'stepfilled'</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.85</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="s">"posterior of $p_B$"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"#467821"</span><span class="p">,</span> <span class="n">normed</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">edgecolor</span> <span class="o">=</span> <span class="s">"none"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="s">"upper right"</span><span class="p">)</span>
<span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">313</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylim</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">120</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">hist</span><span class="p">(</span><span class="n">delta_samples</span><span class="p">,</span> <span class="n">histtype</span><span class="o">=</span><span class="s">'stepfilled'</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.85</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="s">"posterior of $p_B$ - $p_A$"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"#7A68A6"</span><span class="p">,</span><span class="n">normed</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="n">edgecolor</span> <span class="o">=</span> <span class="s">"none"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="s">"upper right"</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">vlines</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">120</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"black"</span><span class="p">,</span> <span class="n">alpha</span> <span class="o">=</span> <span class="o">.</span><span class="mi">5</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</pre></div>
<p><img alt="alt text" src="http://amygdalama.github.io/images/pymc_posteriors.png" /></p>
<p>I can also compute the probability that the true conversion rate of page
A, <code>p_A</code>, is better than the true conversion rate of page
B, <code>p_B</code>:</p>
<div class="highlight"><pre><span class="k">print</span> <span class="s">"Probability page A is BETTER than page B: </span><span class="si">%.3f</span><span class="s">"</span> <span class="o">%</span> \
    <span class="p">(</span><span class="n">delta_samples</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">()</span>
<span class="k">print</span> <span class="s">"Probability page A is WORSE than page B: </span><span class="si">%.3f</span><span class="s">"</span> <span class="o">%</span> \
    <span class="p">(</span><span class="n">delta_samples</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">mean</span><span class="p">()</span>
</pre></div>
<p>This should print out:</p>
<div class="highlight"><pre><span class="go">Probability page A is BETTER than page B: 0.006</span>
<span class="go">Probability page A is WORSE than page B: 0.994</span>
</pre></div>
<p>It's very safe to say (as long as our data was collected properly) that
page B is better than page A, and these results come very intuitively
from looking at the graphs.</p>
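<p>As a sanity check that doesn't need PyMC, here is a rough sketch using a different technique from the MCMC above. With a uniform prior and Bernoulli observations, the posterior of each conversion rate is a Beta distribution, so it can be sampled directly with the standard library. The helper name <code>posterior_sample</code> and the seed are my own choices, not something from the book.</p>

```python
import random

clicks_A, orders_A = 1135, 5
clicks_B, orders_B = 1149, 17

random.seed(0)  # arbitrary seed, only so reruns give the same number


def posterior_sample(clicks, orders):
    # Uniform(0, 1) prior + Bernoulli observations gives a
    # Beta(1 + orders, 1 + clicks - orders) posterior.
    return random.betavariate(1 + orders, 1 + clicks - orders)


n_draws = 20000
# Count the draws in which page B's sampled rate beats page A's.
b_wins = sum(
    posterior_sample(clicks_B, orders_B) > posterior_sample(clicks_A, orders_A)
    for _ in range(n_draws)
)
prob_B_better = b_wins / float(n_draws)
print("Probability page A is WORSE than page B: %.3f" % prob_B_better)
```

<p>The number won't match the MCMC output digit for digit, since both are estimates from random draws, but it should land in the same neighborhood.</p>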
<p>The full code can be found on my <a href="https://github.com/amygdalama/tutorials/blob/master/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/MySourceFiles/Chapter2/ab-real-data.py">GitHub</a>.</p>wget Sanity after Installing Homebrew and F*cking Up PATH/PYTHONPATH2013-12-02T21:48:00-05:00Amy Hanlontag:amygdalama.github.io,2013-12-02:homebrew-path-pythonpath.html<p>About a week ago, I installed <a href="http://brew.sh/">Homebrew</a> on my OSX, mostly because I wanted to use Unix's <code>wget</code> command (It's like Unix's <a href="http://en.wikipedia.org/wiki/List_of_spells_in_Harry_Potter">accio</a>! <code>wget horcrux</code>. <code>wget firebolt</code>.)</p>
<p>Unfortunately, doing so installed a new version of python on my machine and switched the <code>PATH</code> and <code>PYTHONPATH</code> away from my <a href="https://store.continuum.io/cshop/anaconda/">Anaconda</a> version of python (which is the version I use and that houses all of my beloved installed packages). After hours of
struggle, I figured out how to solve this problem.</p>
<p>First, to see if you have this problem, type in the terminal:</p>
<div class="highlight"><pre><span class="nv">$ </span><span class="nb">type</span> -a python
</pre></div>
<p>Which will give you output looking something like this:</p>
<div class="highlight"><pre><span class="gp">$</span> <span class="nb">type</span> -a python
<span class="go">python is /usr/local/bin/python</span>
<span class="go">python is //anaconda/bin/python</span>
<span class="go">python is /Library/Frameworks/Python.framework/Versions/2.7/bin/python</span>
<span class="go">python is //anaconda/bin/python</span>
<span class="go">python is //anaconda/bin/python</span>
<span class="go">python is /usr/bin/python</span>
<span class="go">python is /usr/local/bin/python</span>
</pre></div>
<p>This list shows every <code>python</code> your shell can find, in the order it
finds them. The python that you actually want executed when you type
<code>python</code> into the terminal should be at the top of the list. In my
situation, <code>//anaconda/bin/python</code> is my preferred python. To get this
version to the top of the list, you need to edit your <code>PATH</code>, which lists
the directories your shell searches, in order, when you type a command like <code>python</code> in the terminal.</p>
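<p>If it helps to see that lookup spelled out, here's a rough sketch of the <code>PATH</code> part of what <code>type -a</code> does. The function name <code>find_on_path</code> is something I made up for illustration, and the real shell also knows about aliases, functions, and builtins, which this ignores:</p>

```python
import os


def find_on_path(name, path=None):
    """Return every executable called `name` on the search path, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        # An empty PATH entry means "the current directory".
        candidate = os.path.join(directory or ".", name)
        # Keep regular files that the current user is allowed to execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


# The first line printed is the one the shell would actually run.
for location in find_on_path("python"):
    print(location)
```

<p>The shell runs the first match it finds, so the order of the directories in <code>PATH</code> is what matters.</p>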
<p>Similarly, you'll probably need to update your <code>PYTHONPATH</code>, which tells
python where to look for modules to import. To see where python is
currently looking for modules, run <code>python</code> in your terminal and type the
following commands:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">sys</span>
<span class="gp">>>> </span><span class="n">sys</span><span class="o">.</span><span class="n">path</span>
</pre></div>
<p>As output, python will print out a list of the directories where it
looks for modules:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">sys</span>
<span class="gp">>>> </span><span class="n">sys</span><span class="o">.</span><span class="n">path</span>
<span class="go">['',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/setuptools-1.3-py2.7.egg',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip-1.4.1-py2.7.egg',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages', '/Users/amyhanlon',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python27.zip',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-darwin',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac/lib-scriptpackages',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-old',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload',</span>
<span class="go">'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages',</span>
<span class="go">'/Library/Python/2.7/site-packages']</span>
</pre></div>
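<p>To see how <code>PYTHONPATH</code> feeds into that list, here's a small sketch that starts a fresh interpreter with <code>PYTHONPATH</code> set and prints the <code>sys.path</code> it ends up with. The helper name <code>child_sys_path</code> and the temporary directory are stand-ins I picked for the example, not anything from my actual setup:</p>

```python
import os
import subprocess
import sys
import tempfile


def child_sys_path(pythonpath):
    """Start a fresh interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
    )
    return output.decode().splitlines()


# Hypothetical stand-in for a directory full of packages.
extra_dir = tempfile.mkdtemp()

for entry in child_sys_path(extra_dir):
    print(repr(entry))
```

<p>The directory from <code>PYTHONPATH</code> should show up ahead of the interpreter's own default directories, which is why a wrong <code>PYTHONPATH</code> can shadow the packages you actually want.</p>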
<p>To fix this problem, open the hidden file <code>.bash_profile</code> in your home directory (<code>~</code> is shorthand for your home directory).</p>
<div class="highlight"><pre><span class="nv">$ </span>subl ~/.bash_profile
</pre></div>
<p>Here I'm using the <code>subl</code> command, which opens files in Sublime Text. Any text editor will do.</p>
<p>The file contents should look something like this:</p>
<div class="highlight"><pre><span class="c"># Setting PATH for Python 2.7</span>
<span class="c"># The original version is saved in .bash_profile.pysave</span>
<span class="nv">PATH</span><span class="o">=</span><span class="s2">"/Library/Frameworks/Python.framework/Versions/2.7/bin:${PATH}"</span>
<span class="nb">export </span>PATH
<span class="c"># added by Anaconda 1.7.0 installer</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="s2">"//anaconda/bin:$PATH"</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span>/usr/local/bin:<span class="nv">$PATH</span>
<span class="nb">export </span><span class="nv">PYTHONPATH</span><span class="o">=</span><span class="s2">"/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages:$PYTHONPATH"</span>
</pre></div>
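<p>Every <code>PATH</code> line in the file above puts a directory in front of whatever <code>PATH</code> already held, so the last line to run ends up first. Here's a tiny simulation of that, using a made-up starting value of <code>/usr/bin</code> and a made-up helper called <code>prepend</code>:</p>

```python
def prepend(path_entries, directory):
    """Mimic PATH="directory:$PATH": the new directory goes to the front."""
    return [directory] + path_entries


# Hypothetical starting value; the real one comes from the system defaults.
path = ["/usr/bin"]

# The same order of prepends as the three PATH lines in the file above.
for directory in [
    "/Library/Frameworks/Python.framework/Versions/2.7/bin",
    "//anaconda/bin",
    "/usr/local/bin",
]:
    path = prepend(path, directory)

print(":".join(path))
```

<p>The last directory prepended, <code>/usr/local/bin</code>, lands at the front, which is how Homebrew's python ended up ahead of Anaconda's.</p>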
<p>Each of these lines puts a directory at the front of <code>PATH</code> or <code>PYTHONPATH</code>, so the last line to run wins. The last <code>PATH=</code> and <code>PYTHONPATH=</code> lines should therefore point to the Anaconda location of python, and in this case they don't. To fix this, comment out or delete the lines that incorrectly assign the <code>PATH</code> and <code>PYTHONPATH</code> variables, and
add lines that assign the correct values. After making edits, my file looks like this:</p>
<div class="highlight"><pre><span class="c"># Setting PATH for Python 2.7</span>
<span class="c"># The original version is saved in .bash_profile.pysave</span>
<span class="c"># PATH="/Library/Frameworks/Python.framework/Versions/2.7/bin:${PATH}"</span>
<span class="nv">PATH</span><span class="o">=</span><span class="s2">"//anaconda/bin:${PATH}"</span>
<span class="nb">export </span>PATH
<span class="c"># added by Anaconda 1.7.0 installer</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="s2">"//anaconda/bin:$PATH"</span>
<span class="c"># export PATH=/usr/local/bin:$PATH</span>
<span class="c"># export PYTHONPATH="/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages:$PYTHONPATH"</span>
<span class="nb">export </span><span class="nv">PYTHONPATH</span><span class="o">=</span><span class="s2">"//anaconda/lib/python2.7/site-packages:$PYTHONPATH"</span>
</pre></div>
<p>Save the file. Then, in the terminal (after exiting python), source the
file, and see if you've fixed the <code>PATH</code> and <code>PYTHONPATH</code>:</p>
<div class="highlight"><pre><span class="gp">$</span> <span class="nb">source</span> ~/.bash_profile
<span class="gp">$</span> <span class="nb">type</span> -a python
<span class="go">python is //anaconda/bin/python</span>
<span class="go">python is //anaconda/bin/python</span>
<span class="go">python is /usr/local/bin/python</span>
<span class="go">python is //anaconda/bin/python</span>
<span class="go">python is /Library/Frameworks/Python.framework/Versions/2.7/bin/python</span>
<span class="go">python is //anaconda/bin/python</span>
<span class="go">python is //anaconda/bin/python</span>
<span class="go">python is /usr/bin/python</span>
<span class="go">python is /usr/local/bin/python</span>
</pre></div>
<p>Now you should see the Anaconda python (or whichever python you prefer) at
the top of the list in the output.</p>
<p>Check your <code>PYTHONPATH</code> in the python interpreter by using the same method as before:</p>
<div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">sys</span>
<span class="gp">>>> </span><span class="n">sys</span><span class="o">.</span><span class="n">path</span>
<span class="go">['', '//anaconda/lib/python2.7/site-packages/gspread-0.1.0-py2.7.egg',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages/fuzzywuzzy-0.2-py2.7.egg',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages/inflect-0.2.4-py2.7.egg',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages/beautifulsoup4-4.3.2-py2.7.egg',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages/pyicloud-0.3.0-py2.7.egg',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages/foursquare-20130707-py2.7.egg',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages/poster-0.8.1-py2.7.egg',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages', '/Users/amyhanlon',</span>
<span class="go">'//anaconda/lib/python27.zip', '//anaconda/lib/python2.7',</span>
<span class="go">'//anaconda/lib/python2.7/plat-darwin',</span>
<span class="go">'//anaconda/lib/python2.7/plat-mac',</span>
<span class="go">'//anaconda/lib/python2.7/plat-mac/lib-scriptpackages',</span>
<span class="go">'//anaconda/lib/python2.7/lib-tk', '//anaconda/lib/python2.7/lib-old',</span>
<span class="go">'//anaconda/lib/python2.7/lib-dynload',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages/PIL',</span>
<span class="go">'//anaconda/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']</span>
</pre></div>
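<p>As an extra cross-check from inside python, you can ask the interpreter where it lives and where it loaded a module from. This is only a sketch: <code>lives_under</code> is a helper I invented for the example, and what it prints depends entirely on your own install:</p>

```python
import os
import sys


def lives_under(path, prefix):
    """True if `path` is inside the directory tree rooted at `prefix`."""
    path = os.path.realpath(path)
    prefix = os.path.realpath(prefix)
    return path == prefix or path.startswith(prefix + os.sep)


print(sys.executable)  # the interpreter that is actually running
print(os.__file__)     # where the standard library's os module was loaded from
print(lives_under(os.__file__, sys.prefix))
```

<p>If the first two paths point into the install you expect, the terminal and the interpreter agree with each other.</p>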
<p>Now the terminal is using my Anaconda version of python and is looking
in the Anaconda library for my packages! Voilà!</p>Adventures in Miami: Bullish Conference 20132013-12-02T13:48:00-05:00Amy Hanlontag:amygdalama.github.io,2013-12-02:adventures-in-miami-bullish-conference-2013.html<p><img alt="alt text" src="http://amygdalama.github.io/images/bullicorns.jpg" /></p>
<p><em>(Photo by Julie Lavoie)</em></p>
<p>Adjusting to Real Life after a weekend in South Beach, filled with
mojito-flavored space popsicles, champagne, Cuban food, unicorn puzzles,
and badass, ambitious, feminist women is like coming down from some
fantastic drug-induced hallucination. This morning I was unpleasantly
surprised that there wasn't a sunkissed cabana man waiting to serve me
delicious iced coffee, and the absence of poolside complimentary wine
hour at 5pm will only deepen my wounds. (And yes! Cabana men are
real! Unfortunately for me they are all either teenagers or gay or both,
and even more unfortunately none agreed to move to Austin to volunteer
as my butler.)</p>
<p>Fortunately, we Bullicorns are well-equipped for aggressively
confronting this cabana man-less Real World, because alongside
<a href="http://www.thegloss.com/2012/03/21/career/bullish-life-achieve-goals-and-glory-by-recreating-like-a-total-fcking-badass-126/">recreating like a total fucking badass</a>, <a href="http://www.bullishconference.com/">BullCon</a> was packed with
motivating workshops. Topics included negotiation with Ji Eun (Jamie)
Lee, who frequently speaks at women's conferences on the topic; planning
your 2014 and the top 10 principles of bullishness with Jen Dziura,
organizer of the conference and writer of the <em>Bullish</em> column; time
management with Laura Vanderkam, author of <em>What the Most Successful
People Do Before Breakfast</em>; all the tools to do all the things (and
the power of being your own assistant, at least if you don't have one)
with Haley Mlotek, publisher of <em>Worn</em> magazine and Jen's virtual
assistant; and pitching yourself and your ideas (without selling out)
with Jennifer Wright, Editor-in-chief at the <em>New York Observer</em>.
Write-ups on each of the workshops can be found on the <a href="http://www.getbullish.com/tag/bullcon2013/">Bullish</a> blog.</p>
<p>BullCon was the perfect balance of gentlewomanly recreation and
productivity, and I'm excited to see the progress each of us makes in
the next year. See you ladies in 2014 (and on the internet)!</p>Visualizing Social Circles Using Facebook Data2013-11-26T23:07:00-05:00Amy Hanlontag:amygdalama.github.io,2013-11-26:visualizing-social-circles-using-facebook-data.html<p>Intrigued by David Smith's <a href="http://blog.revolutionanalytics.com/2013/11/how-to-analyze-you-facebook-friends-network-with-r.html">Facebook friends network analysis</a> using
the Rfacebook package, I decided to try it out on my own group of
friends, to see if my social circles are clique-y à la <a href="http://amygdalama.github.io/images/nshs.jpg">North Shore High
School</a>.</p>
<p><img alt="alt text" src="http://amygdalama.github.io/images/fb-network.png" /></p>
<p>It turns out the social circles in my network were visualized almost
perfectly! The smallest group to the far right of the graph contains
family members, the large group at the top contains current coworkers,
and the large group at the bottom contains friends in the Austin cyclist
scene. Other clusters are friends from high school, friends from
college, and bar friends. The most isolated group is my family. The
groups with the most connections to each other are my current coworkers,
cyclists, and bar friends, which makes sense, because they are roughly
in the same age group and live (or have lived) in Austin, a fairly
connected city. Now I'm wondering how to get the data to see if Austin
really is more connected than other cities or if it's just in our
collective imagination.</p>
<p>Thanks to David for the <a href="http://blog.revolutionanalytics.com/2013/11/how-to-analyze-you-facebook-friends-network-with-r.html">tutorial</a> on
graphing social networks and to Julianhi for his <a href="http://thinktostart.wordpress.com/2013/11/19/analyzing-facebook-with-r/">tutorial</a> on setting
up Rfacebook.</p>