PyCon UK 2016 write-up

Last week I had a long weekend at PyCon UK 2016 in Cardiff, and it was a fantastic experience! Great talks, great friends/colleagues and lots of ideas.

On Monday 19th, the last day of the conference, my friend Miguel and I ran a tutorial/workshop on Natural Language Processing in Python (the GitHub repo contains the Jupyter notebooks we used as well as some slides for an introduction).

Our NLP tutorial

Since I’ve already mentioned it, I’ll start from the end :)

The tutorial was tailored for NLP beginners and, as I mentioned explicitly at the very beginning, I wasn’t there to impress the experts. Rather, the whole point was to get the attendees a bit curious about Natural Language Processing, and to show them what you can do with a few lines of Python.

Overall, I think we were quite lucky, as we had the perfect audience: the right number of people (20 or so) with a bit of Python knowledge but not much NLP knowledge.

We only had some minor hiccups with the installation process, which is something we’re going to work on to make it smoother and more beginner-friendly. In particular the things I’d like to improve are:

  • add some testing / pre-flight checks, e.g. “how do I know that the environment is set up correctly?” (Miguel has already added this)
  • support for Windows: I’m quite useless at troubleshooting Windows issues, but a couple of attendees had trouble with the installation process; maybe some virtual machine setup would help
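To give an idea of what a pre-flight check could look like, here is a minimal sketch. It is not the check that Miguel added to the repo: the package list, the function names and the minimum Python version are all assumptions made for illustration.

```python
import importlib.util
import sys

# Hypothetical list of top-level package names; replace it with the
# tutorial's real requirements. Standard library modules are used here
# only so that the sketch runs anywhere.
REQUIRED_PACKAGES = ["json", "collections"]


def find_missing(packages):
    """Return the top-level names in `packages` that cannot be imported."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def preflight(packages=REQUIRED_PACKAGES, min_python=(3, 4)):
    """Print a short report and return True if the environment looks usable."""
    ok = True
    if sys.version_info < min_python:
        print("Python {}.{} or newer is needed".format(*min_python))
        ok = False
    for name in find_missing(packages):
        print("Missing package: {}".format(name))
        ok = False
    if ok:
        print("Environment looks good")
    return ok


if __name__ == "__main__":
    preflight()
```

Attendees could run a script like this once before the session; any "Missing package" line tells them what to install before the tutorial starts.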

I also think that having the material available in advance, so the attendees can start setting up the environment, is very helpful. Most of them were quite engaged and I received a couple of “bug reports” on the fly, and even a pull request that improved the installation process (thanks!).

Last but not least, I was also happy to give out a copy of my book (Mastering Social Media Mining with Python) that I had with me (the raffle was implemented on the spot through random.choice(), and the book went to Paivi from Django Girls).
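For the curious, the raffle was roughly along these lines. This is a reconstruction rather than the exact code typed on the day, and the names in the list are made up.

```python
import random

# Hypothetical attendee names, for illustration only.
attendees = ["Alice", "Bob", "Carol", "Dave"]

# random.choice() picks one element uniformly at random
# from a non-empty sequence.
winner = random.choice(attendees)
print("The book goes to:", winner)
```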

I’ll give a shorter version of this tutorial at PyCon Ireland later this year, so if you’re around, I’ll see you there :)

Unfortunately, the tutorials were not recorded so there is no video on-line, but the slides are in the GitHub repo so please dig in and send feedback if you have any.

The Open Day

Thursday 15th was “day zero” of the conference, hosted at Cardiff University. The ticket was free, although there was limited capacity. The day was aimed at introducing a new audience to Python and PyCon. We didn’t see much Python code on that day, as the talks were mainly for newcomers, yet we had a lot of food for thought. This is a great way to introduce more people to Python and to show them how the community is friendly and happy to get more beginners on board.

Teachers, Kids and Education

One of the main themes of the conference was Education. Friday 16th, the first day of the main event, was labelled “Teachers Day”, while Saturday 17th was “Kids Day”. The effort to make CS education more accessible for kids was very clear, and some of the initiatives were really spot-on. In particular, some of the kids were able to hack a small project together in a very short time, and they delivered a “show and tell” session at the end of the second day. I found their creativity very impressive, as well as the fact that they stood in front of a crowd of 500+ developers to show what they had been working on during the day.

Community in the Broader Sense

Another aspect that became quite clear is the strength of the Python Community. Some representatives of PyCon Poland, PyCon Switzerland and Django Europe introduced their upcoming events. Some attendees with fewer financial means (including e.g. students from India) were given the opportunity to attend through some form of financial support.

Representatives from PyCon Namibia and PyCon Zimbabwe were also attending and they discussed some of the challenges they are facing while building a local community in their countries.

In particular, the work Jessica from PyNAM is carrying out with young learners is extremely inspiring and deserves more visibility (link to the video of her talk).

Accessibility for Everybody

One feature that I had never experienced at a conference before was the speech-to-text transcription. During the talks, the speech-to-text team were very busy writing down what the speakers were saying in real time. While this is sometimes considered an accessibility feature which might benefit only deaf users, it turned out that live captions are extremely beneficial for everybody. Firstly, not all the non-deaf attendees have perfect hearing. Secondly, not everybody is a native English speaker (both speakers and audience), so a word might be missed, or an accent might cause some confusion. Lastly, not every attendee is paying full attention to every talk for the whole talk: sometimes towards the end of the day, you just switch off for a moment and the live captions allow you to catch up.

Providing some accessibility feature turned out to be beneficial for everybody.

Shout out to the Organisers

Organising such a big event (500+ attendees) is not an easy task, so all the people who have worked hard to make this conference happen deserve a big round of applause. Not naming names here, but if you’ve been involved, thanks!

Being Interviewed about NLP

This was a bit random, in a very pleasant way. On Saturday, Miguel, Lev from RaRe Technologies and I spent some time with Kate Jarmul, who by the way had just introduced her book on data wrangling, and also delivered a tutorial on the topic. The conversation was about our views, in the broader sense, on NLP / Text Analytics, how we got into this field, how we see this field evolving and so on. Apparently, this was an interview with some experts of the field, for a piece she’s writing for the O’Reilly blog (I should put an amazed emoticon here).

Using Python for …

The breadth of the topics discussed during the conference was really amazing. I think events of this kind are a great way to see what people are working on and how the tools we use every day are used by other people.

I’m not going to name any talk in particular, because there are too many good talks that deserve to be mentioned.

In terms of topics, some fields that are well covered by Python are:

  • Data Science (and related topics like data cleaning, NLP and machine learning)
  • Web development (with Django and so many interesting libraries)
  • electronics and robotics (with Raspberry Pi, micro:bit, MicroPython etc)
  • you name it :)

I’m probably not saying anything new here, but it was nice to see it first-hand and step outside my data-sciency comfort zone.


Thanks to everybody who contributed to this event, and see you in Cardiff for PyCon UK 2017!

PyData London 2016 write-up

Last weekend I was at the PyData London conference for three Pythonic days. Firstly, thanks to the organisers, volunteers, speakers, sponsors and everyone who has contributed in one way or another to make the event a great success.

This year I had the opportunity to contribute as a member of the review committee, which means I had a glimpse behind the scenes and I know how many great proposals we had. With three days and three to four tracks running in parallel, there is room for a lot of Pythonic parley, yet unfortunately many good proposals had to be turned down due to time/space constraints. The programme turned out to be great nevertheless.

The three days were really intense so there is just too much to say, but I’ll try to summarise some of the take-home messages.

Tutorials: delivering a tutorial is difficult. Everything that can go wrong will go wrong (big screen that goes bananas for 10 minutes, flaky Internet connection so a conda install takes ages, you name it). Jupyter notebook makes life better, but I strongly feel for the speakers, so a big thank you for taking the time to prepare some quality material.

Topics of interest: some topics seemed to capture most of the attention this year. In particular, there was a lot of interest around data pipelines, deep learning and Bayesian stats. Unsurprising?

Keynotes: following the recent news on the LIGO project, Prof. Andreas Freise gave an introduction to gravitational waves, lasers, the latest achievements in physics and other cool things far beyond my understanding. Something I could understand and relate to is the way he described how he needs to write code to do his job, even though writing code is not his main job. This is true for many academics and researchers without a software engineering background, who were also the main audience of my talk on building data pipelines (luckily enough, scheduled right after the keynote in the same room).

The second keynote, given by Tetiana Ivanova, was about the beginning of her journey in Data Science without formal education. Some of the suggestions were sensible; in fact, I recently shared some of the same ideas in a short talk to UCL students and post-docs who want to move to industry.

The third and last keynote was given by Travis Oliphant: CEO of Continuum Analytics, author of NumPy, creator of SciPy, Pythonista since the late 1990s. His talk was about scaling up and scaling out the PyData stack. Things to watch out for: Numba and Dask. Really exciting stuff going on!

My talk: I presented “Building Data Pipelines in Python”, with a focus on the need to bring R&D and Engineering together, and how basic engineering principles can be beneficial even if your job is not all about writing code. After presenting a very similar talk at PyCon Italy, I found the audience in London to be a bit more on the academic side than I initially thought, which was perfect for my engineering rants. After the usual first few minutes of feeling awkward when speaking publicly, I started my discussion on unit testing and asked how many in the audience write unit tests regularly. Random guy from the audience: “What’s a unit test?”. Thank you kind stranger, you lifted my spirit and the rest of the talk was a breeze.
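For the kind stranger, and anyone else wondering: a unit test is a small piece of code that checks one function against a known input and expected output. Here is a minimal sketch using the standard library's unittest module. The normalise() helper is hypothetical, invented only to have something to test; it is not taken from the talk.

```python
import unittest


def normalise(text):
    """Lowercase a string and collapse runs of whitespace (hypothetical helper)."""
    return " ".join(text.lower().split())


class NormaliseTest(unittest.TestCase):
    """Each test method checks one small, specific behaviour."""

    def test_lowercases_and_collapses_spaces(self):
        self.assertEqual(normalise("  Hello   World "), "hello world")

    def test_empty_string_stays_empty(self):
        self.assertEqual(normalise(""), "")
```

Saved in a file such as test_normalise.py (the file name is an assumption), the tests can be run with python -m unittest from the same directory.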

The slides of my talk are on my speakerdeck.

Last year it took several months to get the videos out, this year only one day! So this is the video of my talk:

I had some interesting questions after the talk and I also had some nice conversations the day after. Apparently, I raised some interest in Luigi; in fact, a few people told me that after listening to my talk they really had to attend the other talk about using Luigi in production, delivered by Pete Owlett from Deliveroo (the room was overflowing so I couldn’t even get close!). There was also some genuine interest in unit testing, and a very interesting question was how to apply it when working with Jupyter notebooks.

Lightning talks: apparently, saving your Jupyter notebooks on git is an issue that is taken very seriously by the community. In fact, three speakers came up with different solutions for the same problem.

Organisation: hats off to the organisers and everyone involved, and see you at the PyData London meetup!
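I won't try to reproduce the three solutions here. Just to illustrate why the problem exists: a notebook file is JSON that mixes code with its outputs, so every re-run changes the file. One common approach, sketched below under my own assumptions (the strip_outputs() name and the tiny example notebook are made up), is to remove the outputs before committing.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny hand-written notebook in the nbformat 4 layout, for illustration.
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
            "source": ["print('hi')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 0,
}

print(json.dumps(strip_outputs(example), indent=1, sort_keys=True))
```

The trade-off is that the committed notebook no longer shows results, so a reader has to re-run it to see them.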

Get in touch if you also have a write up of the event:


PyCon Italia / PyData Italy 2016 Write-Up

Last week I travelled to Florence to attend PyCon Sette, the seventh edition of the Italian Python Conference, born 10 years ago and held annually (with three editions of EuroPython in between).

First off, I have something to admit: as this was my first time at PyCon Italia, clearly I didn’t know what I was missing. Having been overly busy with work and side projects, I’ll take this as the perfect excuse to resume the blog.


The city doesn’t need much introduction: it’s simply one of the most beautiful cities in the world. I hadn’t been there for a few years, but things don’t seem to be very different from a tourist’s point of view. The craft beer scene is booming, but at the same time culinary traditions are well preserved. Both of these are big thumbs-up for me. The best random moment of my trip: getting lost in the back streets of the old city centre, and then finding a dodgy hole-in-the-wall place that sells incredible focaccia and panini.

The Conference

PyCon Sette can be summarised as three intense days of Python, with more than 500 attendees. The first day was opened by Alex Martelli with a keynote about exception handling in Python 2 vs Python 3. Apart from the keynotes, at any given time we had between 4 and 6 parallel sessions of talks or trainings. I decided to stick to the PyData track for the whole time, although the other tracks also featured some interesting talks. Some of the tracks were related to a particular sub-community, with PyData and DjangoVillage having a strong presence, but Odoo, DjangoGirls and the Italian Postgres User Group are also worth mentioning.

I’ve listened to many interesting talks. Off the top of my head, a few to remember: the talk about Internet of Things by Stefano Terna of (also winners of the start-up contest), the one about deployment of scikit-learn models in the cloud by Alex Casalboni and an interesting one about Functional Programming and Dask by Holger Peters.

Overall, hats off to the organisers. In particular, I had some conversations with Valerio Maggio who is the founder of PyData Italy. We exchanged some opinions about the conference and the community in the broader sense. Hopefully the interest around Data Science in Italy will keep rising, so maybe several local events throughout the year will be held, rather than having just one big national event per year.

My Talk

On Saturday, I gave a talk on Building Data Pipelines in Python. I wrote about building data pipelines with Luigi before, but this talk gave me the opportunity to look at the bigger picture. The general message was that Research and Engineering are different disciplines, but we (data-sciency and researchy people) can benefit from trying to meet in the middle. In particular, good engineering practices can help the less engineering-oriented researchers in their day-to-day mundane tasks. After opening the discussion on the overall topic, I had a brief moment of ranting about unit testing (or the lack of testing culture in some academic circles), then I introduced Luigi as a workflow manager to build pipelines in Python, and I closed with an overview of logging (described by Alex Martelli in his keynote as something that scares people off, at least initially) and a consideration about using good engineering practices in research.
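Since logging can look scary at first, here is a minimal sketch of how little is needed to get started with the standard library's logging module. The logger name and the clean_records() step are hypothetical, made up for this example rather than taken from the slides.

```python
import logging

logger = logging.getLogger("pipeline")  # hypothetical logger name


def clean_records(records):
    """Drop empty records and log how many were kept (hypothetical step)."""
    kept = [r for r in records if r]
    logger.info("Kept %d of %d records", len(kept), len(records))
    if len(kept) < len(records):
        logger.warning("Dropped %d empty records", len(records) - len(kept))
    return kept


# Configure logging once, at the entry point of the script.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
clean_records(["a", "", "b"])
```

Compared to scattering print() calls around, the messages carry a timestamp and a severity level, and the verbosity can be changed in one place.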

The talk was addressed to beginners and to the less engineering-savvy PyData users, so expert software engineers probably didn’t benefit much from it. I had a good response anyway, with several people coming up after the talk for a chat. All in all, if at least one researcher looks into testing or decides to try one of the workflow managers I mentioned, I’d say I’ve reached my goal.

The slides of my talk are on my speakerdeck (videos will be on-line soon).

See you next year in Florence!