In this talk from DevRelCon London, Microsoft’s Tania Allard describes how she has helped a diverse group of people get their start with open source through mentored events at conferences.
Tania: My name is Tania Allard. I’m a developer advocate at Microsoft and I’m going to be talking about a project that I’ve been brewing for about three years now. It basically started as a thing where I thought, “Oh, I’m just going to do it,” and it has evolved over time. I share the URL of everything that I produce. It is openly licensed, so you can use it, reuse it, and adapt it as you want. If anyone needs to access these slides because of accessibility needs or something, you can access them now.
To give you a bit of context about what I’m going to be talking about: first, I want to focus on why we need mentored sprints, why we should start initiatives like this when there are already a lot of hackathons. People are organizing hackathons all the time. There are conference sprints and internally organized sprints. Then, how we actually came to narrow down what the sprints were about and find a format that works. But saying that it works is very ambiguous, so I’m going to try to distill what has worked, where we succeeded, where we failed, and how we’re actually measuring ‘this works’ and the impact of these sprints. Finally, I want to talk about the future of this project and how we can scale it, or how we’re planning to scale it, which is a crucial part of our community ecosystems.
I’m going to give a tiny bit of background for you to understand where a lot of the things that I will be talking about are coming from. In my introduction, it was already mentioned that I work in scientific computing and machine learning, but I’ve also been doing a lot of high performance computing and research for many years now. I have a research background and I work, therefore, with open source communities. I work with people who are very heavily involved in R, Julia and Fortran because those are the communities where I’ve traditionally been active, and these are the people that I identify with and that I serve. One of the big, big problems that all of these communities share is that they’re very, very big but they are not very diverse. High performance computing especially is not diverse at all. It’s an old community, so we’ve been struggling to get new voices in and to elevate and amplify the voices of the newer community and the broader, diverse community of people around us. I’m by no means going to try to be as eloquent or as entertaining as Don Goodman is, but I do want to focus on some of the issues that we have in open source.
This is, again, to set context on why this initiative started. I focus a lot on open source, and open source is an initiative from the community for the community. It’s intended to serve all people, and it’s all about democratizing knowledge, making things more accessible and using tools to dismantle current systems that are oppressing certain communities. Unfortunately, in the last few years and decades, this has not been the main focus. The main focus has been on code, so code has become the primary object of the open source movement. This leads to other, bigger problems, and one of the biggest problems that we have in our communities is the fallacy of meritocracy. One definition that is very simple, and that I like when I talk about meritocracy, is ‘a society governed by people selected according to merit’.
If we put it in an open source context: if you’re a maintainer or a developer of an open source project and folks contribute to your project by sending code over or opening a pull request, then in this meritocratic society you’ll only evaluate this based on the merit and the correctness of the code, whether it follows your style, whether it’s well written, testable and tested, without taking into consideration any of the background of the person that has contributed the code. You’re basically erasing their gender, their background, their origin and the challenges and barriers that they had to overcome. This ties in very much with what Don was talking about, that we are removing the human part of open source. These two problems in open source go hand in hand. It also leads to the privilege dichotomy, because so many people talk about how contributing to open source is a privilege, and they always refer to, “Well, for you to be able to spend time, you need to be in a good position where you don’t have additional caring responsibilities, so when you’ve finished your work you can go home and spend a few more hours coding.” Or, “You are lucky that your company pays you to contribute to open source tools.” And things like that.
But what happens when these privileges are non-existent? There is a very, very small number of people maintaining open source software who are making huge sacrifices, because they are already taking care of a community, they’re already taking care of a project. They’re taking time that they could be spending with family or with friends to keep this afloat and make it sustainable. They’re also losing earning potential because they are not being compensated for the additional emotional labor that they invest in our communities. Having this privilege dichotomy also reinforces the fallacy of meritocracy and, therefore, the focus on code, so this is a really cyclic problem that we’re experiencing in our community.
If we refer again to the community, there are many, many groups and many circles of people that interact in it, but I’m going to focus on three primary subsets of them. The first one is the people that are traditionally considered creators or knowledge drivers. In this area, we normally tend to think of maintainers, developers, people that are leading research in a certain area. Once their software or their products are out there with a license attached, they are basically allowing the entirety of the world to use and build on those tools. But the problem is that we then have another set of people using those tools to satisfy someone’s needs. Whatever those needs are, or whatever interests are being served, depends a great deal on who’s making this decision. Then probably the biggest circle, or the biggest layer of people in the community, is those that are open source consumers and open source objects, and I’m going to give you an example.
Nowadays, AI is everywhere, but we have a very small number of people actually leading research and creating the libraries that power AI and machine learning. Then we have corporations that are deciding to put AI on our phones, in our cars, everywhere, even our microwaves. But then all of the other people, who are not traditionally considered to be knowledge creators or drivers of decision making, are still being objects because we’re still collecting their data. We’re still processing their data. We’re still making decisions for them based on these AI systems.
This poses a very, very big problem when we have very little representation of our entire community. What happens when a subset of our community that shares a lot of demographic, background, privilege or sacrifice characteristics are the ones creating the knowledge and deciding whose interests we’re going to be serving? All of the other people, who are different or belong to a different group, are often relegated to consuming and being affected, probably in negative ways. One of the reasons this has been happening a lot is that open source has to fight a lot of fights.
Usually, we have to fight for sustainability. It’s not easy when you are relying on voluntary work, when you’re trying to get a critical mass to maintain a project that is being used by seven million people. Funding and compensation are not equitable. It’s very, very hard to justify the impact of an open source project because it’s not cited enough. It’s very, very difficult to create a dependency graph and quantify the direct impact. Getting big pools of funding, and funding to compensate those that are spending their time and their effort, is very, very hard. I’m not going to get into agency; that’s another big problem.
But then, it is only once projects start getting to sustainability that they start focusing on diversity. It becomes a secondary or tertiary objective, whereas in a perfect world diversity and sustainability would go together and would be a primary goal. In my community, this is reflected greatly in how much representation we have in leadership positions in open source projects. My friend Anthony Scopatz led this analysis. It’s mostly around gender, or how much gender diversity we have among the core maintainers or in the governance bodies of big projects. It turns out that we’re doing little about it. It’s not surprising. The problem is that all these projects have been in that sustainability pothole for so long and have not been supporting the inclusion of very diverse voices.
This brought me to: how can we start? How can we start bringing in and amplifying those voices? A critical part is how you onboard them, how you approach them. If you are a regular contributor to open source, or if you’ve tried to contribute to open source, you know that your first interaction can completely determine whether you’re going to engage later. Say you go by yourself to a sprint and decide, I’m going to contribute to, let’s say, Pandas, and you struggle a lot to get all the dependencies installed on your computer. Then you go to the issue tracker and spend four hours trying to find one issue that you can work on because it’s labeled as ‘easy’ or ‘first time contributors’, and you read through the issue and you have no idea what’s going on. If you start facing all of these barriers early on, you’re less likely to stick around and contribute. These mentored sprints really started with all of these people in mind, those that have never had the opportunity to contribute to open source, regardless of the reasons why they never had this chance.
Finding a time and location for this was critical. In the Python world, something that is very common is that we have the PyCon conferences where you have the mentoring, the talks, and then, after the main conference is finished, people stay for two or three days to contribute and sprint on the projects that they love and use. It’s a great opportunity because maintainers also use these conferences as their annual gathering to share ideas, improve the roadmap and work on issues together. But it’s so hard. Put yourself in the place of someone that has never contributed to open source. How can I go to my employer and say, well, instead of going to a conference for five days, I’m going to be there for nine days because I want to see if I can contribute some code or do some coding? It’s very, very hard. Or you have to take some of your vacation quota, but you also have to pay for your lodging and accommodation, so it becomes very, very challenging, especially if you get past those hurdles and say, well, I’m going to stay here for two more days, and it turns out that the first day is all dedicated to installing the dependencies and finding one issue that you could work on. Finding a time and a location where these individuals could get support was critical. Then we also had to focus on the people that were going to participate.
Saying, I’m going to create mentored sprints for diverse beginners, is very problematic because it brings a lot of problems. But the focus was on people from traditionally underrepresented communities or underrepresented groups in our community, because they are also the least likely to have received formal mentorship before. They’re also the people that are less likely to ask questions in a group of experienced developers or experienced contributors, because we always have the pressure of having to demonstrate that we’re good enough. We have to try twice or thrice as hard to earn our place in a community. But then, saying that we’re going to start a mentorship program is also very challenging, especially when you’re working with underrepresented communities. Saying, I’m going to start a mentorship program and we’re going to serve these groups, without acknowledging that there is going to be bias, is going to be very, very harmful. Not only do you have to be prepared to mentor those that are willing to contribute, but also those that are going to be guiding them, mentoring them and onboarding them. The mentors in these sprints were core contributors, core developers or very experienced people who were very familiar with the code bases, and we really wanted them to be able to take some of the learnings and processes back to their projects and aim for sustainability and diversity. Hopefully, that will also help them to improve their onboarding processes. In this case, it’s a mentor-the-mentor scheme where everyone is mentored, everyone is supported, everyone is guided, because we want everyone to be successful. For this to happen, we have three basic pillars, or three bases.
The first is establishing a mentorship. When you start things like this, which are short sprints or short events, you know that some people are only going to be there for the four hours or for the day that it takes. But some others might take that mentorship further, and that’s something that we really wanted to encourage by having a very small ratio of attendees to mentors so that they can interact. Also, when you’re doing sprints, it’s very easy to go and measure how many pull requests were submitted during that sprint.
But that is not reflective of all of the tasks and activities that are involved in open source. Something that we really focused on, and that we really guided mentors through, was bringing a diverse set of issues and a diverse set of tasks that people can contribute to. That could be anything from code, having some issues really thought through and prepared that were very low hanging fruit but would help people to start familiarizing themselves with the code base. But we also encouraged them to bring other things. Let’s say you want to improve the logo for your project: bring that as an option as well. Do you have a tutorial that nobody has seen before and you need someone to do a dry run? Bring that over. Are you releasing a new feature of your library? Do you need an in-depth discussion for a grant that you’re writing to direct funds? Bring that over as well. So the more diverse the tasks involved, the more diverse the people we could attract, leveraging their interests and skills. By doing so, we also provided a safe environment where everyone can thrive regardless of their background or their experience. It’s been very rewarding seeing this kind of feedback coming through. Someone commented, one of the reasons I’ve never contributed to open source before is because I thought I was not good enough.
But by having this space, I’ve seen that that is not true, that I can do things. This is a tweet from the mentored sprint at PyCon US, and I was super, super happy because we had folks from Adafruit and CircuitPython, and they had some tutorials that they were going to release to the community and they wanted someone to do a dry run. We had two tables of people heavy on hardware, filing issues, recommending new paths and new parts for tutorials to be added, and they were having a blast. So if we had limited ourselves to considering pull requests, all of these contributions, which can make a huge difference in onboarding your community and your users, would have been lost. Getting to this structure has been really hard. Believe me, I’ve worked with a lot of community people, a lot of friends, people that have been organizing hackathons for a really long time.
The first iteration was when I was leading the Python course for Code First: Girls in Manchester in Sheffield and a friend was doing Leeds, and it just happened to coincide with Hacktoberfest. We said, “Oh, wouldn’t it be great if we could get these people that had never coded before in their life to start contributing to open source projects?” We were doing that as part of the course, and most of the folks completed the Hacktoberfest challenge, and some people had only been coding or programming for four weeks by that time. That was incredible. So then the next iteration was using the PyLadies umbrella, an organization that I love and work a lot with, to start organizing a formal sprint at PyCon UK. This was good but not as good as it could have been, because although I love PyLadies, when you say PyLadies, that already has a label, and a lot of folks will not feel comfortable identifying themselves as a PyLady. Again, although it’s inclusive and it’s meant to support, mentor and encourage women and those who identify themselves as such, it also poses a lot of problems. So that was one of the issues that we had, and some of the feedback. Also, a lot of folks were again having the problem of finding issues to contribute to and work on. But something very, very nice happened that I was not anticipating: folks that were maintainers of some libraries showed up on the day and they were like, “I know that you’re running this sprint and I maintain this library. Can I just hang around here? Can I help? Can I just mentor?” and I was like, “Okay, yeah, cool, fine. Remember, we have the code of conduct, we have this stuff, blah, blah, blah, and you’re here to help, we are here for the people.” It was incredible to see the effect that it had, having those folks that were so familiar with the code base and were so willing to help. That completely changed the atmosphere.
So that’s how we started this idea of bring the mentors, bring the people in the room.
I started pretty much bothering people in Manchester, as you may have seen, to start organizing this again at Hacktoberfest, bring their members, and grow it out. That eventually led to this being part of the PyCon US Hatchery program. The Hatchery program is an initiative to find other events that could eventually become staple events of the PyCon conferences. It was very scary because PyCon US is the biggest Python conference, and we opened a call. We thought, probably three people are going to show up, we’re probably not going to have a lot of projects interested. We did a lot of advertising. We got a lot of people signing up, and we asked those that were signing up what their interests were, what their area of expertise was, and whether they had any projects or anything that they wanted to contribute to. That helped us a lot to target the specific projects or specific maintainers who were going to be at PyCon and that we needed to take part. We ended up with 13 different mentoring projects in this program, going from GNU Mailman to the PyPI Warehouse to Volume and TensorFlow. We had a very, very big range of projects and we had over a hundred people signed up. We had to turn people away at the entrance because we physically didn’t have enough space. That was a very, very rewarding experience, which led to opportunities to rerun this at scientific computing conferences like SciPy and EuroSciPy, and that’s where a lot of the maintainers started talking to me and were like, well, we have this problem. All of our core contributors and maintainers are white cis men. How do you get all of these people to come here, and how can we onboard them for the long term? So that gave me another idea, and that’s what I’m going to be talking about in the final remarks.
Now, Hacktoberfest has become a staple event where we run these mentored sprints, and conferences as well, because they give us places where people meet together with very consistent timing, so you can plan ahead. But what’s next? This is something that’s interesting: what’s next? We’re going to carry on with a standalone project for Hacktoberfest and conferences, and we’re starting a pilot program where we’re going to focus on specific projects. These are some of the conferences that we’re working on.
But for us to be able to tackle this, we have to follow up with mentors, understand what has helped them, and start scaling up by creating ‘mentored sprints in a box’ so that anyone is able to run their own mentored sprints, because I can’t go to every conference. My co-organizers can’t go to every conference either, and it would be much better if we could empower anyone to run similar initiatives at their conferences, in their communities, for their people. I’m aiming for sustainability. For this, it’s critical for us to start partnering with organizations like NumFOCUS, eLife, the Software Sustainability Institute and the Python Software Foundation to establish programs where those that are onboarded through short-term mentorship can eventually become part of a longer-term mentorship, and where we can start funding and compensating people appropriately, so that sustainability and diversity become one single umbrella for all of the projects. This is what we’re going to be focusing on in 2020. I would love for this to go beyond my communities. I would love for this to become a staple at all PyCons and at many, many conferences, and I also want to invite you to think about how this would look for your community, how you could integrate this in your community and how we can make this better for you. So thank you very much, and if you want to talk about this, or anything else, please reach out.