PyData London 2016 write-up

Last weekend I was at the PyData London conference for three Pythonic days. Firstly, thanks to the organiser, volunteers, speakers, sponsors and everyone who has contributed in a way or another to make the event a great success.

This year I had the opportunity to contribute as member of the review committee, which means I had a glimpse at the behind-the-scenes and I know how many great proposals we had. With three days and three to four tracks running in parallel, there is room for a lot of Pythonic parley, yet unfortunately many good proposals had to be turned down due to time/space constraints. The programme turned out to be great nevertheless.

The three days were really intense so there is just too much to say, but I’ll try to summarise some of the take-home messages.

Tutorials: delivering a tutorial is difficult. Everything that could go wrong, will go wrong (big screen that goes bananas for 10 minutes, flaky Internet connection so a conda install takes ages, you mention it). Jupyter notebook makes life better, but I strongly feel for the speakers, so a big thank you for taking the time to prepare some quality material.

Topics of interest: some topics seem to capture most of the attention this year, in particular there was a lot of interest around data pipelines, deep learning and Bayesian stats. Unsurprising?

Keynotes: following the recent news on the LIGO project, Prof. Andreas Freise gave an introduction to gravitational waves, lasers, the latest achievements in physics and other cool things far beyond my understanding. Something I could understand and relate to is his way to describe how he needs to write code to carry on his job, but writing code is not his main job. This is true for many academics and researchers without a software engineering background, who were also the main audience of my talk on building data pipelines (luckily enough, scheduled right after the keynote in the same room).

The second keynote, given by Tetiana Ivanova, was about the beginning of her journey in Data Science without formal education. Some of the suggestions were sensible, in fact I recently shared some of the same ideas in a short talk to UCL students and post-docs who want to move to industry.

The third and last keynote was given by Travis Oliphant: CEO of Continuum Analytics, author of NumPy, creator of SciPy, Pythonista since the late 1990’s. His talk was about scaling up and scaling out the PyData stack. Things to watch out for: Numba and Dask. Really exciting stuff going on!

My talk: I presented “Building Data Pipelines in Python”, with a focus on the need to bring R&D and Engineering together, and how basic engineering principles can be beneficial even if your job is not all about writing code. After presenting a very similar talk at PyCon Italy, I found the audience in London to be a bit more on the academic side than I initially thought, which was perfect for my engineering rants. After the usual first few minutes of feeling awkward when speaking publicly, I started my discussion on unit testing and asked how many in the audience write unit tests regularly. Random guy from the audience: “What’s a unit test?”. Thank you kind stranger, you lifted my spirit and the rest of the talk was a breeze.

The slides of my talk are on my speakerdeck.

Last year it took several months to get the videos out, this year only one day! So this is the video of my talk: https://www.youtube.com/watch?v=7NzH1Gx8-4E

I had some interesting questions after the talk and I also had some nice conversations the day after. Apparently, I raised some interest on Luigi, in fact a few people told me how they really had to attend the other talk about using Luigi in production, deliverd by Pete Owlett from Deliveroo, after listening to mine (the room was overflowing so I couldn’t even get close!). There was also some genuine interest on unit testing, and a very interesting question was how to apply it when working with Jupyter notebooks.

Lighting talks: apparently, saving your Jupyter notebooks on git is an issue that is taken very seriously by the community. In fact, three speakers came up with different solutions for the same problem.

Organisation: hat off to the organisers and everyone involved, and see you at the PyData London meetup!

Get in touch if you also have a write up of the event:

@MarcoBonzanini

PyData London Meetup 2015-02-03

A quick update with my impressions on the last PyData London meet-up I attended this evening.

I missed the past couple of meet-ups because of some clash on my personal schedule, so this was my first time in the new venue, still close to Old Street, hosted by Lyst. A nice big room to accomodate many many people (more than 200 probably?), and a nice selection of craft beers.

We started with the usual initial community-related announcements, including the impressive achievement of the group: 1,000+ members on the meetup.com dedicated page! Well done guys!

The core topic of the evening was data visualisation. For this reason, the main candidate for Python Module Of The Month was (… drum roll …) matplotlib :-D

The first talk, “Thinking about data visualisation”, was given by Andy Kirk from Visualising Data. Thinking was an important keyword of the presentation: after an initial comparison between talent (e.g. artistic, creative) vs thinking, Andy went on discussing different aspects of thinking and how our thinking can provide more value to the visualisation we are proposing. Long story short, you don’t have to be an artist to create useful visualisation, assuming you take the context and the overall story into account.

The second talk, “Lies, damned lies and dataviz”, was given by Andrew Clegg from Etsy. Andrew started the presentation with some argument in support of providing good dataviz, and then he continued with a great sequence of examples of bad dataviz, discussing how some of them simply confuse the user without providing any additional insight, while others really trick the user into wrong conclusions. Whether these are damned lies or just bad design choices… well that’s another story.

Both presentations were really great, and both of them technology-agnostic, which is something I enjoyed for some reason.

During the final lightning-and-community-talks, the interesting news was the announcement of a new O’Reilly book on data visualisation with Python and Javascript (link to come).

Overall a very nice evening, congratulations to all the organisers.

PyData London meetup.com group:
http://www.meetup.com/PyData-London-Meetup/